Re: [Mesa-dev] [RFC] i965/dbg: Expose cases hitting a presumably dead optimization

2016-03-12 Thread Pohjolainen, Topi
On Sat, Mar 12, 2016 at 08:44:54AM -0800, Jason Ekstrand wrote:
>On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi"
><topi.pohjolai...@intel.com> wrote:
>>
>> On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote:
>> >On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen
>> ><topi.pohjolai...@intel.com> wrote:
>> >
>> >  The logic iterates over param[] which contains pointers to
>> >  uniform storage set during linking (see
>> >  link_assign_uniform_locations()).
>> >  The pointers are unique and it should be impossible to find
>> >  matching entries.
>> >  I couldn't find any regressions with test system. In addition
>> >  I tried several benchmarks on HSW and none hit this.
>> >  I'm hoping to remove this optimization attempt. This is the only
>> >  bit that depends on knowing about the actual storage during
>> >  compilation. All the rest deal with just relative push and pull
>> >  locations once the actual filling of pull_param[] is moved
>> >  outside the compiler just as param[]. (Filling pull_param is
>> >  based on the pull locations and doesn't need to be inside the
>> >  compiler).
>> >  Any thoughts?
>> >
>> >I'm not 100% sure what you're trying to do, but I have a branch that
>> >may be of interest:
>> >
>> >https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniforms
>> >The branch enables support for pushing small uniform arrays.  Among
>> >other things, it redoes the way we do push constants and gets rid of
>> >some of the data tracking in the backend compiler.  The big reason why
>> >I haven't tried too hard to get it merged is because it regresses Sandy
>> >Bridge just a bit.  I know I've seen and fixed the bug before in an
>> >alternate attempt, but I don't remember how.
>> >I'm going to be refreshing it soon because we need indirect push
>> >constants for the Vulkan driver.  (The branch is already merged into
>> >the Vulkan branch.)
>>
>> I'd like to stop filling param[] before compilation. This is really not
>> needed by the compiler as it deals with pull and push constant locations,
>> i.e., positions in the push and pull files. Actual uniform values and their
>> location in the uniform storage are not needed until actual pipeline upload.
>>
>> My plan is to move the iteration over the core uniform storage to pipeline
>> upload time. We can fill push and pull buffers directly without the need of
>> storing pointers to param[] in the middle. Not only does this make things
>> simpler and more flexible in my mind, it also gives us the possibility to
>> upload floats with 16-bit precision instead of 32 bits. Current upload logic
>> only gets pointers to 32-bit values without knowing if they should be
>> represented with 16 bits, let alone whether the values are floats or integers
>> to begin with.
> 
>Right. Kristian and I have talked about some related things that we
>need for pipeline caching and the Vulkan driver.  In Vulkan, they
>aren't actual pointers at all but are, instead, offsets into a push
>constant block.  Fortunately, the back-end compiler never dereferences
>them so you can shove whatever you want in there and it's OK. We've
>talked about turning the pull and push params into just a set of
>integers that means whatever the api and state setup code want.  One of
>the problems with pointers is that you can't easily put them into an
>on-disk shader cache (which we have for Vulkan).
> 
>When you talk about 16 or 64-bit values, what is your intention?  Are
>64-bit values still going to take up two slots or are they now one
>64-bit slot?  Are there two 16-bit values per slot or just one?  Are
>16-bit uniforms converted before they get uploaded or consumed directly
>by the shader?  I'm still a little confused as to what problem you're
>trying to solve.

I see the 16-bit float question as two-fold. First, the uniform storage always
represents them as normal 32-bit floats for the GL API to work correctly
(even if they are marked as low/mediump, I don't think the API for setting and
querying them is allowed to operate with reduced precision. On the other hand,
such conversion back and forth in the core GL API doesn't sound appealing
at all just from an implementation point of view).
Therefore, after the compiler has chosen to represent a particular uniform
with reduced precision and set the operand types accordingly, the upload logic
has to convert the 32-bit float into the equivalent 16-bit value before uploading.
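
The conversion step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name f32_to_f16 is hypothetical, it is not the i965 upload code, and it assumes IEEE 754 binary32/binary16 encodings with round-to-nearest-even.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not Mesa code): convert an IEEE 754 binary32 value
 * to binary16 bits, rounding to nearest even. */
static uint16_t
f32_to_f16(float f)
{
   uint32_t x;
   memcpy(&x, &f, sizeof(x));              /* reinterpret the float bits */

   uint32_t sign = (x >> 16) & 0x8000u;
   uint32_t exp = (x >> 23) & 0xffu;
   uint32_t man = x & 0x7fffffu;

   if (exp == 0xff)                        /* infinity or NaN */
      return (uint16_t)(sign | 0x7c00u | (man ? 0x200u : 0u));

   int32_t e = (int32_t)exp - 127 + 15;    /* rebias the exponent */
   if (e >= 31)
      return (uint16_t)(sign | 0x7c00u);   /* too large: becomes infinity */

   if (e <= 0) {
      if (e < -10)
         return (uint16_t)sign;            /* too small: becomes signed zero */
      man |= 0x800000u;                    /* make the leading one explicit */
      uint32_t shift = (uint32_t)(14 - e); /* 14..24 bits are dropped */
      uint32_t half = man >> shift;        /* subnormal half mantissa */
      uint32_t rem = man & ((1u << shift) - 1u);
      uint32_t halfway = 1u << (shift - 1u);
      if (rem > halfway || (rem == halfway && (half & 1u)))
         half++;
      return (uint16_t)(sign | half);
   }

   uint32_t half = ((uint32_t)e << 10) | (man >> 13);
   uint32_t rem = man & 0x1fffu;
   if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
      half++;                              /* a carry into the exponent is fine */
   return (uint16_t)(sign | half);
}
```

With something like this, the upload path would read the 32-bit value from uniform storage and write the 16-bit result into the push or pull buffer; how two such values share a slot is the separate packing question.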

Second is the question on how to pack the 16-bit values. I'm seeing this as
second step 

Re: [Mesa-dev] [PATCH] softpipe: fix anisotropic filtering crash

2016-03-12 Thread Eduardo Lima Mitev

On 03/13/2016 06:53 AM, srol...@vmware.com wrote:

From: Roland Scheidegger 

The filt_args->offset wasn't assigned but was always used later leading
to a crash (as far as I can tell, texel offsets don't actually make much
sense with anisotropic filtering, but because there's no explicit setting
if offsets are enabled there the array is always accessed).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481

CC: 
---
  src/gallium/drivers/softpipe/sp_tex_sample.c | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c 
b/src/gallium/drivers/softpipe/sp_tex_sample.c
index e3e28a3..5e3d47b 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2209,6 +2209,7 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
const float t[TGSI_QUAD_SIZE],
const float p[TGSI_QUAD_SIZE],
const uint faces[TGSI_QUAD_SIZE],
+  const int8_t *offset,
unsigned level,
const float dudx, const float dvdx,
const float dudy, const float dvdy,
@@ -2268,6 +2269,8 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
 /* F *= formScale; */ /* no need to scale F as we don't use it below here 
*/

 args.level = level;
+   args.offset = offset;
+
 for (j = 0; j < TGSI_QUAD_SIZE; j++) {
/* Heckbert MS thesis, p. 59; scan over the bounding box of the ellipse
 * and incrementally update the value of Ax^2+Bxy*Cy^2; when this
@@ -2431,6 +2434,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view 
*sp_sview,
 const float dvdy = (t[QUAD_TOP_LEFT] - t[QUAD_BOTTOM_LEFT]) * t_to_v;
 struct img_filter_args args;

+   args.offset = filt_args->offset;
+
 if (filt_args->control == TGSI_SAMPLER_LOD_BIAS ||
 filt_args->control == TGSI_SAMPLER_LOD_NONE ||
 /* XXX FIXME */
@@ -2495,6 +2500,11 @@ mip_filter_linear_aniso(const struct sp_sampler_view 
*sp_sview,
   args.p = p[j];
   args.level = psview->u.tex.last_level;
   args.face_id = filt_args->faces[j];
+ /*
+  * XXX: we overwrote any linear filter with nearest, so this
+  * isn't right (albeit if last level is 1x1 and no border it
+  * will work just the same).
+  */
   min_filter(sp_sview, sp_samp, &args, &rgba[0][j]);
}


The patch looks right, but this comment seems unrelated to it. If that's 
the case, then perhaps it should be moved out to a patch of its own. 
Other than that:


Reviewed-by: Eduardo Lima Mitev 

Thanks.

Eduardo


 }
@@ -2503,8 +2513,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view 
*sp_sview,
 * seem to be worth the extra running time.
 */
img_filter_2d_ewa(sp_sview, sp_samp, min_filter, mag_filter,
-s, t, p, filt_args->faces, level0,
-dudx, dvdx, dudy, dvdy, rgba);
+s, t, p, filt_args->faces, filt_args->offset,
+level0, dudx, dvdx, dudy, dvdy, rgba);
 }

 if (DEBUG_TEX) {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] softpipe: fix anisotropic filtering crash

2016-03-12 Thread sroland
From: Roland Scheidegger 

The filt_args->offset wasn't assigned but was always used later leading
to a crash (as far as I can tell, texel offsets don't actually make much
sense with anisotropic filtering, but because there's no explicit setting
if offsets are enabled there the array is always accessed).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481

CC: 
---
 src/gallium/drivers/softpipe/sp_tex_sample.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c 
b/src/gallium/drivers/softpipe/sp_tex_sample.c
index e3e28a3..5e3d47b 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2209,6 +2209,7 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
   const float t[TGSI_QUAD_SIZE],
   const float p[TGSI_QUAD_SIZE],
   const uint faces[TGSI_QUAD_SIZE],
+  const int8_t *offset,
   unsigned level,
   const float dudx, const float dvdx,
   const float dudy, const float dvdy,
@@ -2268,6 +2269,8 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
/* F *= formScale; */ /* no need to scale F as we don't use it below here */
 
args.level = level;
+   args.offset = offset;
+
for (j = 0; j < TGSI_QUAD_SIZE; j++) {
   /* Heckbert MS thesis, p. 59; scan over the bounding box of the ellipse
* and incrementally update the value of Ax^2+Bxy*Cy^2; when this
@@ -2431,6 +2434,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view 
*sp_sview,
const float dvdy = (t[QUAD_TOP_LEFT] - t[QUAD_BOTTOM_LEFT]) * t_to_v;
struct img_filter_args args;
 
+   args.offset = filt_args->offset;
+
if (filt_args->control == TGSI_SAMPLER_LOD_BIAS ||
filt_args->control == TGSI_SAMPLER_LOD_NONE ||
/* XXX FIXME */
@@ -2495,6 +2500,11 @@ mip_filter_linear_aniso(const struct sp_sampler_view 
*sp_sview,
  args.p = p[j];
  args.level = psview->u.tex.last_level;
  args.face_id = filt_args->faces[j];
+ /*
+  * XXX: we overwrote any linear filter with nearest, so this
+  * isn't right (albeit if last level is 1x1 and no border it
+  * will work just the same).
+  */
  min_filter(sp_sview, sp_samp, &args, &rgba[0][j]);
   }
}
@@ -2503,8 +2513,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view 
*sp_sview,
* seem to be worth the extra running time.
*/
   img_filter_2d_ewa(sp_sview, sp_samp, min_filter, mag_filter,
-s, t, p, filt_args->faces, level0,
-dudx, dvdx, dudy, dvdy, rgba);
+s, t, p, filt_args->faces, filt_args->offset,
+level0, dudx, dvdx, dudy, dvdy, rgba);
}
 
if (DEBUG_TEX) {
-- 
2.1.4



[Mesa-dev] [Bug 94503] OpenCL segfaults during compilation

2016-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94503

--- Comment #3 from Tyson Whitehead  ---
Created attachment 122262
  --> https://bugs.freedesktop.org/attachment.cgi?id=122262&action=edit
Simplified kernel that causes other (different) compiler segfault

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 94503] OpenCL segfaults during compilation

2016-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94503

--- Comment #2 from Tyson Whitehead  ---
Thanks for the heads-up Matt.

I rebuilt the Debian package of mesa 11.2.0-rc3 against the Debian package of
llvm 3.9~svn262954 and am pleased to say the simplified kernel I provided also
now compiles for me.

Unfortunately the full set of my OpenCL code is still causing a segfault. 
Pruning code reveals it is a different kernel though, and the backtrace is
entirely different too, so progress is being made!

I'm attaching a simplified version of this next kernel function.  I would
appreciate it if you could give it a go on your setup and see if it is
segfaulting for you as well.

Program received signal SIGSEGV, Segmentation fault.
0x73bad2a0 in (anonymous namespace)::JoinVals::pruneValues
(this=this@entry=0x7fffb8a0, 
Other=..., EndPoints=..., changeInstrs=changeInstrs@entry=false)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2388

#0  0x73bad2a0 in (anonymous namespace)::JoinVals::pruneValues
(this=this@entry=0x7fffb8a0, 
Other=..., EndPoints=..., changeInstrs=changeInstrs@entry=false)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2388
#1  0x73bb38da in (anonymous
namespace)::RegisterCoalescer::joinSubRegRanges (this=0x2386a50, 
this=0x2386a50, CP=..., LaneMask=8, RRange=..., LRange=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2569
#2  (anonymous namespace)::RegisterCoalescer::mergeSubRangeInto
(this=this@entry=0x2386a50, LI=..., 
ToMerge=..., LaneMask=8, CP=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2622
#3  0x73bb4a31 in (anonymous
namespace)::RegisterCoalescer::joinVirtRegs (
this=this@entry=0x2386a50, CP=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2688
#4  0x73bb54a0 in (anonymous
namespace)::RegisterCoalescer::joinIntervals (CP=..., 
this=0x2386a50)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2734
#5  (anonymous namespace)::RegisterCoalescer::joinCopy (Again=, CopyMI=0xb991b0, 
this=0x2386a50)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:1449
#6  (anonymous namespace)::RegisterCoalescer::copyCoalesceWorkList
(this=this@entry=0x2386a50, 
CurrList=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2805
#7  0x73bb70bb in (anonymous
namespace)::RegisterCoalescer::coalesceLocals (
this=this@entry=0x2386a50)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2930
#8  0x73bb7da8 in (anonymous
namespace)::RegisterCoalescer::joinAllIntervals (this=0x2386a50)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2956
#9  (anonymous namespace)::RegisterCoalescer::runOnMachineFunction
(this=0x2386a50, fn=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:3006
#10 0x739cc752 in llvm::FPPassManager::runOnFunction (this=0x238d260,
F=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1550
#11 0x739cca8b in llvm::FPPassManager::runOnModule (this=0x238d260,
M=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1571
#12 0x739cc3cf in (anonymous namespace)::MPPassManager::runOnModule
(M=..., this=0x238cfd0)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1627
#13 llvm::legacy::PassManagerImpl::run (this=0xa96580, M=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1730
#14 0x739cc569 in llvm::legacy::PassManager::run
(this=this@entry=0x7fffc6a0, M=...)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1761
#15 0x74507ef7 in LLVMTargetMachineEmit (T=T@entry=0x239f8a0,
M=M@entry=0xb42a60, OS=..., 
codegen=codegen@entry=LLVMObjectFile,
ErrorMessage=ErrorMessage@entry=0x7fffc948)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/Target/TargetMachineC.cpp:206
#16 0x74508219 in LLVMTargetMachineEmitToMemoryBuffer
(T=T@entry=0x239f8a0, M=M@entry=0xb42a60, 
codegen=codegen@entry=LLVMObjectFile,
ErrorMessage=ErrorMessage@entry=0x7fffc948, 
OutMemBuf=OutMemBuf@entry=0x7fffcae8)
at
/tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/Target/TargetMachineC.cpp:230
#17 0x76605584 in (anonymous namespace)::emit_code
(tm=tm@entry=0x239f8a0, 
mod=mod@entry=0xb42a60, file_type=file_type@entry=LLVMObjectFile, 
out_buffer=out_buffer@entry=0x7fffcae8, 
r_log="test2.c:36:31: warning: double precision constant requires

[Mesa-dev] [PATCH 2/3] nv50/ir: avoid folding mul + add if the mul has a dnz

2016-03-12 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 6192c06..66e7b2e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1635,11 +1635,10 @@ AlgebraicOpt::tryADDToMADOrSAD(Instruction *add, 
operation toOp)
if (src->getUniqueInsn() && src->getUniqueInsn()->bb != add->bb)
   return false;
 
-   if (src->getInsn()->saturate)
+   if (src->getInsn()->saturate || src->getInsn()->postFactor ||
+   src->getInsn()->dnz)
   return false;
 
-   if (src->getInsn()->postFactor)
-  return false;
if (toOp == OP_SAD) {
   ImmediateValue imm;
   if (!src->getInsn()->src(2).getImmediate(imm))
-- 
2.4.10



[Mesa-dev] [PATCH 3/3] nv50, nvc0: handle SQRT lowering inside the driver

2016-03-12 Thread Ilia Mirkin
First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to
find out whether the input is less than 0). Secondly the current
approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced
instead of inf.

Instead we switch to the less accurate rcp(rsq(x)) method - this behaves
nicely for all valid inputs. We still don't do this for DSQRT since the
RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson
steps right now. Eventually we should have a separate library function
for DSQRT that does it more precisely (and perhaps move this lowering to
the post-opt phase).

This fixes a number of dEQP precision tests that were expecting better
behavior for infinite inputs.
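
As a rough illustration of the difference between the two lowerings, here is a small C sketch. It is not the nv50/nvc0 code: model_rsq and model_rcp are hypothetical software stand-ins for the hardware ops, assumed to follow IEEE-style special cases (rsq(+0) = +inf, rsq(+inf) = +0), and only non-negative inputs are considered.

```c
#include <math.h>    /* INFINITY macro only; no libm functions are called */
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware RSQ, valid for x >= 0. */
static float
model_rsq(float x)
{
   if (x == 0.0f)
      return INFINITY;
   if (x == INFINITY)
      return 0.0f;

   uint32_t bits;
   memcpy(&bits, &x, sizeof(bits));
   bits = 0x5f3759dfu - (bits >> 1);      /* well-known initial guess */
   float y;
   memcpy(&y, &bits, sizeof(y));
   for (int i = 0; i < 4; i++)
      y = y * (1.5f - 0.5f * x * y * y);  /* Newton-Raphson refinement */
   return y;
}

/* Hypothetical stand-in for a hardware RCP. */
static float
model_rcp(float x)
{
   return 1.0f / x;
}

/* Old approach: sqrt(x) = x * rsq(x). Gives inf * 0 = NaN for x = inf. */
static float
sqrt_mul(float x)
{
   return x * model_rsq(x);
}

/* New approach: sqrt(x) = rcp(rsq(x)). Gives rcp(0) = inf for x = inf. */
static float
sqrt_rcp(float x)
{
   return model_rcp(model_rsq(x));
}
```

In this model sqrt_mul() also yields NaN for x = 0 (0 * inf), while sqrt_rcp() yields 0, which is the kind of special case the old nvc0 lowering handled with a predicate.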

Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_build_util.cpp |  6 +++-
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  2 ++
 .../nouveau/codegen/nv50_ir_lowering_nv50.cpp  |  7 ++---
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 32 +++---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |  2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  2 +-
 6 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
index f58cf97..84ebfdb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
@@ -585,6 +585,7 @@ BuildUtil::split64BitOpPostRA(Function *fn, Instruction *i,
  return NULL;
   srcNr = 2;
   break;
+   case OP_SELP: srcNr = 3; break;
default:
   // TODO when needed
   return NULL;
@@ -601,7 +602,10 @@ BuildUtil::split64BitOpPostRA(Function *fn, Instruction *i,
 
for (int s = 0; s < srcNr; ++s) {
   if (lo->getSrc(s)->reg.size < 8) {
- hi->setSrc(s, zero);
+ if (s == 2)
+hi->setSrc(s, lo->getSrc(s));
+ else
+hi->setSrc(s, zero);
   } else {
  if (lo->getSrc(s)->refCount() > 1)
 lo->setSrc(s, cloneShallow(fn, lo->getSrc(s)));
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index b06d86a..d284446 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -616,6 +616,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
 
NV50_IR_OPCODE_CASE(RCP, RCP);
NV50_IR_OPCODE_CASE(RSQ, RSQ);
+   NV50_IR_OPCODE_CASE(SQRT, SQRT);
 
NV50_IR_OPCODE_CASE(MUL, MUL);
NV50_IR_OPCODE_CASE(ADD, ADD);
@@ -2689,6 +2690,7 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
case TGSI_OPCODE_FLR:
case TGSI_OPCODE_TRUNC:
case TGSI_OPCODE_RCP:
+   case TGSI_OPCODE_SQRT:
case TGSI_OPCODE_IABS:
case TGSI_OPCODE_INEG:
case TGSI_OPCODE_NOT:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index 8752b0c..12c5f69 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
@@ -1203,10 +1203,9 @@ NV50LoweringPreSSA::handleDIV(Instruction *i)
 bool
 NV50LoweringPreSSA::handleSQRT(Instruction *i)
 {
-   Instruction *rsq = bld.mkOp1(OP_RSQ, TYPE_F32,
-bld.getSSA(), i->getSrc(0));
-   i->op = OP_MUL;
-   i->setSrc(1, rsq->getDef(0));
+   bld.setPosition(i, true);
+   i->op = OP_RSQ;
+   bld.mkOp1(OP_RCP, i->dType, i->getDef(0), i->getDef(0));
 
return true;
 }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index d181f15..29b77c9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1778,22 +1778,22 @@ NVC0LoweringPass::handleMOD(Instruction *i)
 bool
 NVC0LoweringPass::handleSQRT(Instruction *i)
 {
-   Value *pred = bld.getSSA(1, FILE_PREDICATE);
-   Value *zero = bld.getSSA();
-   Instruction *rsq;
-
-   bld.mkOp1(OP_MOV, TYPE_U32, zero, bld.mkImm(0));
-   if (i->dType == TYPE_F64)
-  zero = bld.mkOp2v(OP_MERGE, TYPE_U64, bld.getSSA(8), zero, zero);
-   bld.mkCmp(OP_SET, CC_LE, i->dType, pred, i->dType, i->getSrc(0), zero);
-   bld.mkOp1(OP_MOV, i->dType, i->getDef(0), zero)->setPredicate(CC_P, pred);
-   rsq = bld.mkOp1(OP_RSQ, i->dType,
-   bld.getSSA(typeSizeof(i->dType)), i->getSrc(0));
-   rsq->setPredicate(CC_NOT_P, pred);
-   i->op = OP_MUL;
-   i->setSrc(1, rsq->getDef(0));
-   i->setPredicate(CC_NOT_P, pred);
-
+   if (i->dType == TYPE_F64) {
+  Value *pred = bld.getSSA(1, FILE_PREDICATE);
+  Value *zero = bld.loadImm(NULL, 0.0d);
+  Value *dst = bld.getSSA(8);
+  Instruction *mov, *rsq;
+ 

[Mesa-dev] [PATCH 1/3] nvc0: fix blit triangle size to fully cover FB's > 8192x8192

2016-03-12 Thread Ilia Mirkin
The idea is that a single triangle will cover the whole area being
drawn, allowing the blit shader to do its work. However the max fb size
is 16384x16384, which means that the triangle we draw needs to be twice
that in order to cover the whole area fully. Increase the size of the
triangle to 32768x32768.

This fixes a number of dEQP tests that were failing because a blit was
involved which would miss some of the resulting texture.
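
The geometry can be checked with a few lines of C. This is a sketch under the assumption that the blit triangle is a right triangle with vertices (0,0), (leg,0) and (0,leg), as the vertex data in the diff suggests; the function names are made up for the example.

```c
#include <stdbool.h>

/* A point is inside the right triangle (0,0), (leg,0), (0,leg)
 * exactly when x >= 0, y >= 0 and x + y <= leg. */
static bool
tri_covers_point(float leg, float x, float y)
{
   return x >= 0.0f && y >= 0.0f && x + y <= leg;
}

/* The triangle is convex, so it covers the square [0,side] x [0,side]
 * when it covers the square's corners. (0,0) is always covered. */
static bool
tri_covers_square(float leg, float side)
{
   return tri_covers_point(leg, side, 0.0f) &&
          tri_covers_point(leg, 0.0f, side) &&
          tri_covers_point(leg, side, side);
}
```

With leg = 16384 the far corner (side, side) is only covered up to side = 8192, which matches the subject line; leg = 32768 covers the full 16384x16384 maximum.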

Signed-off-by: Ilia Mirkin 
Cc: "11.1 11.2" 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index ccfc9e2..f2ad4bf 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -1215,8 +1215,8 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct 
pipe_blit_info *info)
x0 = (float)info->src.box.x - x_range * (float)info->dst.box.x;
y0 = (float)info->src.box.y - y_range * (float)info->dst.box.y;
 
-   x1 = x0 + 16384.0f * x_range;
-   y1 = y0 + 16384.0f * y_range;
+   x1 = x0 + 32768.0f * x_range;
+   y1 = y0 + 32768.0f * y_range;
 
x0 *= (float)(1 << nv50_miptree(src)->ms_x);
x1 *= (float)(1 << nv50_miptree(src)->ms_x);
@@ -1327,14 +1327,14 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct 
pipe_blit_info *info)
   *(vbuf++) = fui(y0);
   *(vbuf++) = fui(z);
 
-  *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_x);
+  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_x);
   *(vbuf++) = fui(0.0f);
   *(vbuf++) = fui(x1);
   *(vbuf++) = fui(y0);
   *(vbuf++) = fui(z);
 
   *(vbuf++) = fui(0.0f);
-  *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_y);
+  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_y);
   *(vbuf++) = fui(x0);
   *(vbuf++) = fui(y1);
   *(vbuf++) = fui(z);
-- 
2.4.10



Re: [Mesa-dev] Mesa include guard style. (Was: [PATCH] i965/cfg: Remove redundant #pragma once.)

2016-03-12 Thread Ian Romanick
On 03/11/2016 03:46 PM, Eric Anholt wrote:
> Ian Romanick  writes:
> 
>> On 03/10/2016 05:53 PM, Francisco Jerez wrote:
>>> Iago Toral  writes:
>>>
>>>> On Wed, 2016-03-09 at 19:04 -0800, Francisco Jerez wrote:
>>>>> Matt Turner  writes:
>>>>>
>>>>>> On Wed, Mar 9, 2016 at 1:37 PM, Francisco Jerez wrote:
>>>>>>> Iago Toral  writes:
>>>>>>>
>>>>>>>> On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
>>>>>>>>> brw_cfg.h already has include guards, remove the "#pragma once" which
>>>>>>>>> is redundant and non-standard.
>>>>>>>>
>>>>>>>> FWIW, I think using both #pragma once and include guards is a way to
>>>>>>>> keep portability while still getting the performance advantage of
>>>>>>>> #pragma once where it is supported.
>>>>>>>>
>>>>>>> It's highly unlikely to make any significant difference on any
>>>>>>> reasonably modern compiler.  I cannot measure any change in compilation
>>>>>>> time locally from my cleanup.
>>>>>>>
>>>>>>>> Also it seems that we do the same thing in many other files...
>>>>>>>>
>>>>>>> Really?  I'm not aware of any other file where we use both.
>>>>>>
>>>>>> There are quite a few in glsl/
>>>>>
>>>>> Heh, apparently you're right.  Anyway it seems rather pointless to use
>>>>> '#pragma once' in a bunch of scattered header files with the expectation
>>>>> to gain some speed, the improvement from a single header file is so
>>>>> minuscule (if it will make any difference at all on a modern compiler
>>>>> and compilation workload, which I doubt) that we would have to use it
>>>>> universally in order to have the chance to measure any improvement.
>>>>>
>>>>> Can we please just decide for one of the include guard styles and use it
>>>>> consistently?  Given that the majority of header files in the Mesa
>>>>> codebase use old-school define guards, that it's the only standard
>>>>> option, that it has well-defined semantics in presence of file copies
>>>>> and hardlinks, and that the performance argument against it is rather
>>>>> dubious (although I definitely find '#pragma once' prettier and more
>>>>> concise), I'd vote for using preprocessor define guards universally.
>>>>>
>>>>> What do other people think?
>>>>
>>>> I think we have to use define guards necessarily since #pragma once is
>>>> not standard even if it has wide support. So the question is whether we
>>>> want to use only define guards or define guards plus #pragma once. I am
>>>> fine with doing only define guards as you propose.
>>>
>>> *Shrug* I have the impression that the only real advantage of '#pragma
>>> once' is that you no longer need to do the ifndef/define dance, so I
>>> don't think I can see much benefit in doing both.
>>
>> Several compilers will cache the file name where '#pragma once' occurs
>> and never read that file again.  A #include of a file previously seen
>> with '#pragma once' becomes a no-op.  Since the file is never read, the
>> compiler avoids all the I/O and the parsing.  That is true of MSVC and,
>> I thought, some versions of GCC.  As Iago points out, some compilers
>> ignore the #pragma altogether.  Since Mesa supports (or does it?) some
>> of these compilers, we have to have the ifdef/define/endif guards.
> 
> Compilers have noticed that ifdef/define/endif is a thing and optimized
> it, anyway.
> 
> https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html

That's cool!  I don't think GCC did that when I looked into this in
2010.  It sounds like the #pragma actually breaks the GCC optimization,
so let's get rid of them all.
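
For reference, the plain define-guard style under discussion looks like the sketch below. The header name, macro name and contents are made up for the example. As far as I understand the GCC page linked above, the multiple-include optimization only applies when the #ifndef/#endif pair wraps the whole file, with nothing but comments and whitespace outside it.

```c
/* example_widget.h: hypothetical header using only a define guard */
#ifndef EXAMPLE_WIDGET_H
#define EXAMPLE_WIDGET_H

struct example_widget {
   int id;
};

/* Return the identifier stored in the widget. */
static inline int
example_widget_id(const struct example_widget *w)
{
   return w->id;
}

#endif /* EXAMPLE_WIDGET_H */
```

Including such a header a second time leaves nothing to parse once EXAMPLE_WIDGET_H is defined, and a preprocessor that recognizes the pattern can skip reopening the file.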






Re: [Mesa-dev] [PATCH] radeonsi: avoid crash when a sampler state is bound for a buffer texture

2016-03-12 Thread Ilia Mirkin
On Fri, Mar 11, 2016 at 11:17 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Sampler states don't really make sense with buffer textures, but the PBO
> upload code sets one because apparently nouveau needs it. It would be
> nice to work that out at some point, but in any case being defensive
> here is a good idea.

Sampler states are set in regular GL as well if you have a regular
buffer texture too, no?

>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 9aa4877..f5ad113 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -324,6 +324,7 @@ static void si_bind_sampler_states(struct pipe_context 
> *ctx, unsigned shader,
>  */
> if (samplers->views.views[i] &&
> samplers->views.views[i]->texture &&
> +   samplers->views.views[i]->texture->target != PIPE_BUFFER 
> &&
> ((struct 
> r600_texture*)samplers->views.views[i]->texture)->fmask.size)
> continue;
>
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] vc4: Add a helper for NIR->QIR control flow function node

2016-03-12 Thread Rhys Kidd
Templated implementation at present until the recently landed
NIR function support is plumbed through.

Signed-off-by: Rhys Kidd 
---
 src/gallium/drivers/vc4/vc4_program.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 4b625a2..b026013 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1686,6 +1686,13 @@ ntq_emit_block(struct vc4_compile *c, nir_block *block)
 }
 
 static void
+ntq_emit_function(struct vc4_compile *c, nir_function_impl *func)
+{
+fprintf(stderr, "FUNCTIONS not handled.\n");
+abort();
+}
+
+static void
 ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list)
 {
 foreach_list_typed(nir_cf_node, node, node, list) {
@@ -1699,6 +1706,10 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list 
*list)
 ntq_emit_if(c, nir_cf_node_as_if(node));
 break;
 
+case nir_cf_node_function:
+ntq_emit_function(c, nir_cf_node_as_function(node));
+break;
+
 default:
 fprintf(stderr, "Unknown NIR node type\n");
 abort();
-- 
2.5.0



[Mesa-dev] [PATCH 3/3] vc4: Add NIR->QIR control flow graph loops

2016-03-12 Thread Rhys Kidd
Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04

Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested

No piglit regressions.

Signed-off-by: Rhys Kidd 
---
 src/gallium/drivers/vc4/vc4_program.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index b026013..82dfdbe 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1685,6 +1685,14 @@ ntq_emit_block(struct vc4_compile *c, nir_block *block)
 }
 }
 
+static void ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list);
+
+static void
+ntq_emit_loop(struct vc4_compile *c, nir_loop *nloop)
+{
+ntq_emit_cf_list(c, &nloop->body);
+}
+
 static void
 ntq_emit_function(struct vc4_compile *c, nir_function_impl *func)
 {
@@ -1697,7 +1705,6 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list 
*list)
 {
 foreach_list_typed(nir_cf_node, node, node, list) {
 switch (node->type) {
-/* case nir_cf_node_loop: */
 case nir_cf_node_block:
 ntq_emit_block(c, nir_cf_node_as_block(node));
 break;
@@ -1706,6 +1713,10 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list 
*list)
 ntq_emit_if(c, nir_cf_node_as_if(node));
 break;
 
+case nir_cf_node_loop:
+ntq_emit_loop(c, nir_cf_node_as_loop(node));
+break;
+
 case nir_cf_node_function:
 ntq_emit_function(c, nir_cf_node_as_function(node));
 break;
-- 
2.5.0
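
For readers following the series, the recursive walk that ntq_emit_cf_list() performs can be sketched outside of Mesa. This is only an illustration: the cf_node struct, the cf_type enum and count_blocks() are hypothetical stand-ins, not NIR or vc4 types. Like the patch above, the loop case simply visits the body once and does not model the back edge.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for NIR control-flow nodes. */
enum cf_type { CF_BLOCK, CF_IF, CF_LOOP };

struct cf_node {
    enum cf_type type;
    struct cf_node *next;      /* next sibling in the same list */
    struct cf_node *then_list; /* used by CF_IF only */
    struct cf_node *else_list; /* used by CF_IF only */
    struct cf_node *body;      /* used by CF_LOOP only */
};

/* Walk a control-flow list recursively and count the basic blocks seen.
 * Returns -1 if an unknown node type is encountered. */
static int
count_blocks(const struct cf_node *list)
{
    int total = 0;

    for (const struct cf_node *n = list; n != NULL; n = n->next) {
        int a, b;

        switch (n->type) {
        case CF_BLOCK:
            total += 1;
            break;
        case CF_IF:
            a = count_blocks(n->then_list);
            b = count_blocks(n->else_list);
            if (a < 0 || b < 0)
                return -1;
            total += a + b;
            break;
        case CF_LOOP:
            /* Body is visited once; no back edge is modelled. */
            a = count_blocks(n->body);
            if (a < 0)
                return -1;
            total += a;
            break;
        default:
            return -1;
        }
    }
    return total;
}
```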



[Mesa-dev] [PATCH 1/3] vc4: Add better debug of NIR->QIR control flow graph failure

2016-03-12 Thread Rhys Kidd
Ensure NIR control flow graph nodes that are unhandled in QIR
are reported with sufficient verbosity to aid debugging.

This improves piglit outputs, amongst other tools.

There are no other remaining uses of assert(0) as a blunt tool
within vc4.

Signed-off-by: Rhys Kidd 
---
 src/gallium/drivers/vc4/vc4_program.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 5c91c02..4b625a2 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1700,7 +1700,8 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list 
*list)
 break;
 
 default:
-assert(0);
+fprintf(stderr, "Unknown NIR node type\n");
+abort();
 }
 }
 }
-- 
2.5.0
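
As background for the change above: in standard C, assert() expands to nothing when NDEBUG is defined, so assert(0) in a default case is silent in such builds, while fprintf() plus abort() always fires. A minimal sketch of the pattern, using hypothetical names (node_kind, node_kind_name, emit_node) that are not part of vc4:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node kinds; NODE_OTHER stands for anything unhandled. */
enum node_kind { NODE_BLOCK, NODE_IF, NODE_OTHER };

/* Returns a printable name for handled kinds, NULL for unhandled ones. */
static const char *
node_kind_name(enum node_kind kind)
{
    switch (kind) {
    case NODE_BLOCK:
        return "block";
    case NODE_IF:
        return "if";
    default:
        return NULL;
    }
}

static void
emit_node(enum node_kind kind)
{
    const char *name = node_kind_name(kind);

    if (name == NULL) {
        /* Unlike assert(0), this still reports and aborts with NDEBUG. */
        fprintf(stderr, "Unknown node type %d\n", (int)kind);
        abort();
    }
    printf("emit %s\n", name);
}
```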



[Mesa-dev] [PATCH 0/3] vc4: Rework NIR control flow graph handling

2016-03-12 Thread Rhys Kidd
Short patchset to go some way towards improving the handling of NIR control flow
graphs in vc4.

Whilst in no way completely addressing the known issues this improves piglit
output, provides better internal handlers for loop and function nir_cf_node
types and creates a cleaner base upon which to build.

Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04

Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested

No piglit regressions.

Importantly, the specific piglit fixes reported were from a debug build of Mesa.

At present there are known vc4 problems exposed by piglit with release builds. I
hope to work on resolving these shortly.

Nonetheless a full piglit run was also done with release builds and I confirm no
regressions were seen.

Rhys Kidd (3):
  vc4: Add better debug of NIR->QIR control flow graph failure
  vc4: Add a helper for NIR->QIR control flow function node
  vc4: Add NIR->QIR control flow graph loops

 src/gallium/drivers/vc4/vc4_program.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

-- 
2.5.0



Re: [Mesa-dev] [PATCH] mesa: Fix error condition for 1d array texture

2016-03-12 Thread Jason Ekstrand
On Mar 11, 2016 12:33 PM, "Alejandro Piñeiro"  wrote:
>
> On 11/03/16 20:15, Anuj Phogat wrote:
> > yoffset is also applicable to 1d array textures.
> >
> > Signed-off-by: Anuj Phogat 
> > ---
> > I don't know if it fixes any test, but it looked incorrect to me.
>
> No test was fixed by a piglit all.py run (also no regressions). I didn't
> test with a deqp run.

There are very few tests for glGetTexImage.  Not hitting one doesn't mean
much.

> In any case, I also agree that the change seems to make sense.
>
> >
> >  src/mesa/main/texgetimage.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
> > index 06bc8f1..dc21551 100644
> > --- a/src/mesa/main/texgetimage.c
> > +++ b/src/mesa/main/texgetimage.c
> > @@ -1046,7 +1046,7 @@ dimensions_error_check(struct gl_context *ctx,
> >  "%s(xoffset = %d)", caller, xoffset);
> >  return true;
> >   }
> > - if (target != GL_TEXTURE_1D && target != GL_TEXTURE_1D_ARRAY) {
> > + if (target != GL_TEXTURE_1D) {
> >  if (yoffset % bh != 0) {
> > _mesa_error(ctx, GL_INVALID_VALUE,
> > "%s(yoffset = %d)", caller, yoffset);

I don't think this is correct.  The check is for compressed textures to
ensure that the texture coordinates are a multiple of the block size of the
texture.  I'm not sure what the rules are for 1-D array compressed textures
(if they even exist) but I'm pretty sure the compression doesn't cross
slices.  If anything, we probably want to take the check below that looks
at height and pull it into the if too.
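
To make the kind of check under discussion concrete, here is a minimal sketch of a block-alignment test. The helper name offset_is_block_aligned() is hypothetical and is not the Mesa function; it only shows the "offset must be a multiple of the block dimension" rule and takes no position on how 1D array targets should be handled.

```c
#include <stdbool.h>

/* True when 'offset' is a multiple of the compressed block dimension.
 * A block dimension of 0 is treated as invalid (assumption for the sketch). */
static bool
offset_is_block_aligned(int offset, unsigned block_dim)
{
    return block_dim != 0 && offset % (int)block_dim == 0;
}
```

With a 4x4 block format, for example, an offset of 8 would pass and an offset of 6 would be rejected.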


[Mesa-dev] ARB_shading_language_include

2016-03-12 Thread Karol Herbst
Hi all,

the game "Divinity: Original Sin - Enhanced Edition" uses
ARB_shading_language_include whenever it detects a non-Catalyst driver on Linux.

Apitraces from the game running on Catalyst show that the shaders are simply
included within the game engine, and they replay fine with all Mesa drivers as
long as calls like

"glShaderSource(shader = 216, count = 1, string =
[6BB0788BA6DFF7F4204CCFE5139E8AE6], length = [-1])"

are ignored.

So there are two issues:

1. The game just uses ARB_shading_language_include without checking if it's
actually there.
I have a WIP branch here:
https://github.com/karolherbst/mesa/commits/ARB_shading_language_include
That branch contains everything needed to run the game, but also hacks
around the glShaderSource calls I mentioned above.
The big question is now: would a proper implementation be accepted in Mesa,
even when only one game actually requires it?

2. glShaderSource calls invalidate the compile status of shaders, and linking
fails.
I have _no_ idea what the spec says about this, but the game actually creates
shaders, compiles them, links them, uses them, then makes those glShaderSource
calls, and links them again. Mesa fails with a linking error indicating that a
shader is uncompiled (because glShaderSource marks a shader as uncompiled).
So what is going on there?

Many Thanks

Karol
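
The sequence described in issue 2 can be written down as a toy model. This is not Mesa code and says nothing about what the spec requires; the model_* names are hypothetical and only mirror the behaviour reported above, where setting new source marks a shader as uncompiled and a later link then fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy shader object: just the source pointer and a compile flag. */
struct model_shader {
    const char *source;
    bool compiled;
};

/* Setting new source drops the compiled state (the behaviour reported). */
static void
model_shader_source(struct model_shader *sh, const char *src)
{
    sh->source = src;
    sh->compiled = false;
}

/* "Compiling" succeeds whenever some source has been set. */
static bool
model_compile(struct model_shader *sh)
{
    sh->compiled = (sh->source != NULL);
    return sh->compiled;
}

/* Linking fails if any attached shader is not compiled. */
static bool
model_link(const struct model_shader *shaders, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (!shaders[i].compiled)
            return false;
    }
    return true;
}
```

In this model, source, compile, link succeeds, while a second model_shader_source() call followed directly by model_link() fails, which is the pattern the game triggers.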


Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-12 Thread tournier.elie
I found on the PCC website that it was imported into the OpenBSD and NetBSD
systems, so the license should be compatible.
I think I will use it as a base for the add, multiply, absolute value, negate,
convert to/from single precision, and comparison functions.

Tomorrow, I will write a draft of my GSoC proposal in which I will
summarize everything.
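
As a small illustration of the simplest functions on that list, negate and absolute value only touch the sign bit of an IEEE 754 binary64 value held in a 64-bit integer. This is a sketch under that assumption; the soft_f64_* names are hypothetical and are not taken from PCC or any other existing implementation.

```c
#include <stdint.h>

/* Bit 63 is the sign bit of an IEEE 754 binary64 value. */
#define SOFT_F64_SIGN_BIT UINT64_C(0x8000000000000000)

/* Negate: flip the sign bit, leave exponent and mantissa untouched. */
static uint64_t
soft_f64_neg(uint64_t a)
{
    return a ^ SOFT_F64_SIGN_BIT;
}

/* Absolute value: clear the sign bit. */
static uint64_t
soft_f64_abs(uint64_t a)
{
    return a & ~SOFT_F64_SIGN_BIT;
}
```

Add, multiply and the conversions need real work on the exponent and mantissa, which is where an existing implementation would help.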

2016-03-11 22:00 GMT+01:00 Ian Romanick :

> On 03/10/2016 03:09 PM, Dylan Baker wrote:
> > Quoting Marek Olšák (2016-03-10 06:57:57)
> >> On Thu, Mar 10, 2016 at 3:30 PM, tournier.elie 
> wrote:
> >>> First, thank you all for your answers.
> >>>
> >>> So if I summarize what was said, we need
> >>> Ian:
> >>>  - add
> >>>  - negate
> >>>  - absolute value
> >>>  - multiply
> >>>  - reciprocal
> >>>  - convert to single precision
> >>>  - convert from single precision
> >>> Roland:
> >>>  - sqrt
> >>>  - comparison (< / == / >)
> >>>  - floor/ceil
> >>> I will contact Pat Brown (his name appears in the contact field in [1]) to
> >>> know if we need the functions below to implement gpu_shader_fp64.
> >>>  - pow
> >>>  - exp
> >>>  - log
> >>>
> >>> About the license
> >>>
> >>> Like I mentioned in the project description, there are quite a few
> >>> existing C implementations of these functions.  Finding one of those
> >>> that you can understand and that has a compatible license is probably
> >>> the best place to start.
> >>>
> >>> Main Mesa code is under MIT license.
> >>> If I chose to use a GNU GPL license file like Linux kernel [3], my
> code must
> >>> be under GNU GPL and probably all the project too. Am I right?
> >>>
> >>> [1] https://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt
> >>> [2] http://www.mesa3d.org/license.html
> >>> [3]
> >>>
> https://github.com/torvalds/linux/blob/097f70b3c4d84ffccca15195bdfde3a37c0a7c0f/arch/arm/nwfpe/softfloat.c
> >>
> >> You can't use GNU GPL for this project.
> >>
> >> The kernel as a whole is licensed under GNU GPL, but some source files
> >> aren't. The file you linked doesn't mention GNU GPL. Somebody needs to
> >> verify that the file you linked can be legally re-licensed under the
> >> MIT license. If not, I think you have to forget the contents of the
> >> file immediately, but I'm not a lawyer.
> >>
> >> Marek
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> > Most BSD style licenses are legally compatible, as long as none of the
> > developers object. One of the BSD kernels should have a softfloat
> > implementation that would be license compatible.
>
> Yes, and there are a couple C compilers that have compatible licenses.
> Portable C Compiler (PCC) being one.  LLVM might also support some
> devices that lack floating-point hardware.
>
>
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>
>
>


Re: [Mesa-dev] [PATCH] radeonsi: avoid crash when a sampler state is bound for a buffer texture

2016-03-12 Thread Laurent Carlier
Le vendredi 11 mars 2016, 11:17:21 CET Nicolai Hähnle a écrit :
> From: Nicolai Hähnle 
> 
> Sampler states don't really make sense with buffer textures, but the PBO
> upload code sets one because apparently nouveau needs it. It would be
> nice to work that out at some point, but in any case being defensive
> here is a good idea.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
> b/src/gallium/drivers/radeonsi/si_descriptors.c index 9aa4877..f5ad113
> 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -324,6 +324,7 @@ static void si_bind_sampler_states(struct pipe_context
> *ctx, unsigned shader, */
>   if (samplers->views.views[i] &&
>   samplers->views.views[i]->texture &&
> + samplers->views.views[i]->texture->target != PIPE_BUFFER &&
>   ((struct
> r600_texture*)samplers->views.views[i]->texture)->fmask.size) continue;

That fixed bug 94284, thanks

-- 
Laurent Carlier
http://www.archlinux.org



Re: [Mesa-dev] [PATCH 1/4] glcpp: Implicitly resolve version after the first non-space/hash token.

2016-03-12 Thread Jon Turney

On 10/03/2016 19:26, Kenneth Graunke wrote:

On Wednesday, March 9, 2016 3:18:50 PM PST Jon Turney wrote:

On 05/03/2016 03:33, Kenneth Graunke wrote:

We resolved the implicit version directive when processing control lines,
such as #ifdef, to ensure any built-in macros exist.  However, we failed
to resolve it when handling ordinary text.

[...]

diff --git a/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected

new file mode 100644
index 000..2872090
--- /dev/null
+++ b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
@@ -0,0 +1,3 @@
+0:1(3): preprocessor error: #version must appear on the first line
+
+


This last test fails in glcpp-test-cr-lf for me (See attached).

Can you just confirm that it passes for you, before I start looking into
why it might fail just for me...?


Sorry about that.  I had just run glcpp-test, but not glcpp-test-cr-lf.

It turns out that our handling of hash followed by newline was not
counting lines correctly, so it was returning either line 3 or line 4
based on the line terminator characters.  0:1(3) in the test was wrong;
it should have actually been 0:2(1).

Iago just reviewed my patch to fix this, so I've pushed it.  Hopefully
master should work for you now.
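
For anyone wondering how the line number can depend on the terminator, here is a minimal sketch of terminator-independent line counting. It is not the glcpp lexer; count_line_terminators() is a hypothetical helper that treats CR LF, lone LF and lone CR each as a single line ending, so the same text gives the same count in both test variants.

```c
#include <stddef.h>

/* Count line endings, treating "\r\n" as one terminator rather than two. */
static unsigned
count_line_terminators(const char *text)
{
    unsigned lines = 0;

    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '\r') {
            /* Consume the LF of a CR LF pair so it is not counted twice. */
            if (text[i + 1] == '\n')
                i++;
            lines++;
        } else if (text[i] == '\n') {
            lines++;
        }
    }
    return lines;
}
```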


Yes, that's fixed. Thanks!



Re: [Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Jacek Konieczny
On 2016-03-12 19:21, Jason Ekstrand wrote:
>> > Haswell should still work just fine if
>> > you're on a 4.4 kernel, but we really should make it detect the command
>> > parser version and do something intelligent.
>>
>> I am confused now… Should it 'work just fine' without this hack on 4.4,
>> or is the remark about the 'fixed' version?
>>
>> Because:
>>
>> $ uname -r
>> 4.4.4-1
> 
> Yeah, we've had the same confusion on Nanley's laptop.  Still trying to
> get it all sorted out.  What distro are you using?

PLD Linux. Quite niche, I could have built my own kernel as well.
Though, I can provide any specific information about that build, if that
may help.

Jacek


Re: [Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Jason Ekstrand
On Mar 12, 2016 9:11 AM, "Jacek Konieczny"  wrote:
>
> On 2016-03-12 17:58, Jason Ekstrand wrote:
> > There is a bug report that's tracking this regression:
> > https://bugs.freedesktop.org/show_bug.cgi?id=94468
> >
> > In the meantime, a workaround is to comment out:
> > genX(cmd_buffer_config_l3)(cmd_buffer, false);
> > in src/intel/vulkan/genX_cmd_buffer.c.
> >
> >
> > I just pushed a hack patch that does exactly that for you on gen7.
> > Hopefully, Jordan can get the command parser version stuff figured out
> > soon.  Until then, we'll just disable it to get haswell at least sort-of
> > working.
>
> Thanks! It works fine now.
>
> Though, from the commit:
>
> > Haswell should still work just fine if
> > you're on a 4.4 kernel, but we really should make it detect the command
> > parser version and do something intelligent.
>
> I am confused now… Should it 'work just fine' without this hack on 4.4,
> or is the remark about the 'fixed' version?
>
> Because:
>
> $ uname -r
> 4.4.4-1

Yeah, we've had the same confusion on Nanley's laptop.  Still trying to get
it all sorted out.  What distro are you using?
--Jason


Re: [Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Jacek Konieczny
On 2016-03-12 17:58, Jason Ekstrand wrote:
> There is a bug report that's tracking this regression:
> https://bugs.freedesktop.org/show_bug.cgi?id=94468
> 
> In the meantime, a workaround is to comment out:
> genX(cmd_buffer_config_l3)(cmd_buffer, false);
> in src/intel/vulkan/genX_cmd_buffer.c.
> 
> 
> I just pushed a hack patch that does exactly that for you on gen7.
> Hopefully, Jordan can get the command parser version stuff figured out
> soon.  Until then, we'll just disable it to get haswell at least sort-of
> working.

Thanks! It works fine now.

Though, from the commit:

> Haswell should still work just fine if
> you're on a 4.4 kernel, but we really should make it detect the command
> parser version and do something intelligent.

I am confused now… Should it 'work just fine' without this hack on 4.4,
or is the remark about the 'fixed' version?

Because:

$ uname -r
4.4.4-1

Jacek


Re: [Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Jason Ekstrand
On Sat, Mar 12, 2016 at 8:29 AM, Nanley Chery  wrote:

> On Sat, Mar 12, 2016 at 12:20:26PM +0100, Jacek Konieczny wrote:
> > On 2016-03-12 11:59, Jacek Konieczny wrote:
> > > Hi,
> > >
> > > I have been playing with Vulkan API and using the Mesa Intel Vulkan
> > > driver from the 'vulkan' branch.
> > >
> > > A recent driver upgrade has broken my previously working code, causing
> > > massive flickering and graphical artifacts.
> > >
> > > git bisect has shown that this is the breaking change:
> > >
> > > commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
> > > Author: Nanley Chery 
> > > Date:   Fri Mar 4 11:43:19 2016 -0800
> > >
> > > anv/meta: Minimize height of images used for copies
> >
> > I am sorry. It seems I have been using 'git bisect' wrong.
> >
> > This is the breaking change:
> >
> > commit 248ab61740c4082517424f5aa94b2f4e7b210d76 (HEAD)
> > Author: Jason Ekstrand 
> > Date:   Tue Mar 8 17:10:05 2016 -0800
> >
> > anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer
> >
> > And this seems to make more sense.
> >
>
> There is a bug report that's tracking this regression:
> https://bugs.freedesktop.org/show_bug.cgi?id=94468
>
> In the meantime, a workaround is to comment out:
> genX(cmd_buffer_config_l3)(cmd_buffer, false);
> in src/intel/vulkan/genX_cmd_buffer.c.
>

I just pushed a hack patch that does exactly that for you on gen7.
Hopefully, Jordan can get the command parser version stuff figured out
soon.  Until then, we'll just disable it to get haswell at least sort-of
working.
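
The "detect the command parser version and do something intelligent" idea could look roughly like the gate below. This is only a sketch: should_configure_l3() is a hypothetical name, the version numbers are placeholders since the required version is not stated in this thread, and treating non-gen7 hardware as always fine is an assumption.

```c
#include <stdbool.h>

/* Decide whether to emit the L3 configuration.
 * 'required_version' is a placeholder; the real threshold is not known here. */
static bool
should_configure_l3(int gen, int cmd_parser_version, int required_version)
{
    /* Assumption: only gen7 depends on the kernel command parser. */
    if (gen != 7)
        return true;

    return cmd_parser_version >= required_version;
}
```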


Re: [Mesa-dev] [RFC] i965/dbg: Expose cases hitting a presumably dead optimization

2016-03-12 Thread Jason Ekstrand
On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" 
wrote:
>
> On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote:
> >On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen
> ><[1]topi.pohjolai...@intel.com> wrote:
> >
> >  The logic iterates over param[] which contains pointers to
> >  uniform storage set during linking (see
> >  link_assign_uniform_locations()).
> >  The pointers are unique and it should be impossible to find
> >  matching entries.
> >  I couldn't find any regressions with test system. In addition
> >  I tried several benchmarks on HSW and none hit this.
> >  I'm hoping to remove this optimization attempt. This is the only
> >  bit that depends on knowing about the actual storage during
> >  compilation. All the rest deal with just relative push and pull
> >  locations once the actual filling of pull_param[] is moved
> >  outside the compiler just as param[]. (Filling pull_param is
> >  based on the pull locations and doesn't need to be inside the
> >  compiler).
> >  Any thoughts?
> >
> >I'm not 100% sure what you're trying to do, but I have a branch that
> >may be of interest:
> >[2] https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniforms
> >The branch enables support for pushing small uniform arrays.  Among
> >other things, it redoes the way we do push constants and gets rid of
> >some of the data tracking in the backend compiler.  The big reason why
> >I haven't tried too hard to get it merged is because it regresses Sandy
> >Bridge just a bit.  I know I've seen and fixed the bug before in an
> >alternate attempt, but I don't remember how.
> >I'm going to be refreshing it soon because we need indirect push
> >constants for the Vulkan driver.  (The branch is already merged into
> >the Vulkan branch.)
>
> I'd like to stop filling param[] before compilation. This is really not
> needed by the compiler as it deals with pull and push constant locations,
> i.e., positions in the push and pull files. Actual uniform values and their
> location in the uniform storage are not needed until actual pipeline upload.
>
> My plan is to move the iteration over the core uniform storage to pipeline
> upload time. We can fill push and pull buffers directly without the need of
> storing pointers to param[] in the middle. Not only does this make things
> simpler and more flexible in my mind, it also gives us the possibility to
> upload floats with 16-bit precision instead of 32 bits. The current upload
> logic only gets pointers to 32-bit values without knowing if they should be
> represented with 16 bits, let alone whether the values are floats or
> integers to begin with.

Right. Kristian and I have talked about some related things that we need
for pipeline caching and the Vulkan driver.  In Vulkan, they aren't actual
pointers at all but are, instead, offsets into a push constant block.
Fortunately, the back-end compiler never dereferences them so you can shove
whatever you want in there and it's OK. We've talked about turning the pull
and push params into just a set of integers that means whatever the api and
state setup code want.  One of the problems with pointers is that you can't
easily put them into an on-disk shader cache (which we have for Vulkan).
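
To illustrate the "set of integers" idea, here is a minimal sketch in which each param entry is an index into a constant block rather than a pointer, and the push buffer is filled at upload time. The fill_push_buffer() name and the choice of 32-bit (dword) indices are assumptions made for the example, not the i965 or anv layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill 'push' from 'const_block' using integer offsets instead of pointers.
 * Each entry in 'param_offsets' is a dword index into 'const_block'. */
static void
fill_push_buffer(uint32_t *push, const uint32_t *param_offsets,
                 size_t count, const uint32_t *const_block)
{
    for (size_t i = 0; i < count; i++)
        push[i] = const_block[param_offsets[i]];
}
```

Because the offsets are plain integers, the array can be written to disk as is, which is the property pointers lack.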

When you talk about 16 or 64-bit values, what is your intention?  Are
64-bit values still going to take up two slots or are they now one 64-bit
slot?  Are there two 16-bit values per slot or just one?  Are 16-bit
uniforms converted before they get uploaded or consumed directly by the
shader?  I'm still a little confused as to what problem you're trying to
solve.

One thing to think about as you work on this is that Vulkan doesn't have
individual uniforms but instead has a block of explicit push constants.  In
the shader, the push constants are specified with explicit offsets into
that block similar to a UBO.  The result is that it's very difficult for
the state setup code to know what size anything is.  Just chopping the push
constant space into 32-bit hunks that the compiler is free to rearrange is
terribly convenient.  We could use 16 or 8-bit chunks just as easily but
having some be 32-bit, others 64-bit, and others 16-bit has the potential
to get very painful.

Food for thought.  Maybe I'm completely missing what you're trying to do.
--Jason


Re: [Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types

2016-03-12 Thread Connor Abbott
On Fri, Mar 11, 2016 at 2:33 AM, Samuel Iglesias Gonsálvez
 wrote:
> On 11/03/16 01:08, Jason Ekstrand wrote:
>> On Thu, Mar 10, 2016 at 4:00 PM, Connor Abbott
>>  wrote:
>>
>>> On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez
>>>  wrote:
>>>> From: Jason Ekstrand
>>>>
>>>> v2: Fix size/type mask to properly handle 8-bit types.
>>>>
>>>> Signed-off-by: Juan A. Suarez Romero
>>>> ---
>>>>  src/compiler/nir/nir.h | 17 -
>>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>>> index cccb3a4..659e98c 100644
>>>> --- a/src/compiler/nir/nir.h
>>>> +++ b/src/compiler/nir/nir.h
>>>> @@ -605,9 +605,24 @@ typedef enum {
>>>>     nir_type_float,
>>>>     nir_type_int,
>>>>     nir_type_uint,
>>>> -   nir_type_bool
>>>> +   nir_type_bool,
>>>> +   nir_type_bool32 =    32 | nir_type_bool,
>>>> +   nir_type_int8 =      8  | nir_type_int,
>>>> +   nir_type_int16 =     16 | nir_type_int,
>>>> +   nir_type_int32 =     32 | nir_type_int,
>>>> +   nir_type_int64 =     64 | nir_type_int,
>>>> +   nir_type_uint8 =     8  | nir_type_uint,
>>>> +   nir_type_uint16 =    16 | nir_type_uint,
>>>> +   nir_type_uint32 =    32 | nir_type_uint,
>>>> +   nir_type_uint64 =    64 | nir_type_uint,
>>>> +   nir_type_float16 =   16 | nir_type_float,
>>>> +   nir_type_float32 =   32 | nir_type_float,
>>>> +   nir_type_float64 =   64 | nir_type_float,
>>>>  } nir_alu_type;
>>>>
>>>> +#define NIR_ALU_TYPE_SIZE_MASK 0xfff8
>>>> +#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007
>>>
>>> So I'm not really the one to be reviewing this series (after all,
>>> I wrote most of it :) ) but one thing that I never quite liked,
>>> and didn't get around to fixing, is how we use these raw
>>> constants all over the place. Perhaps we could make things more
>>> readable by adding nir_get_sized_type(), nir_get_unsized_type(),
>>> and nir_type_size() helpers and then use those instead of
>>> or-ing/and-ing things together everywhere.
>>>
>>
>> Agreed.
>>
>>
>
> Agreed. We saw it too but, as this is used a lot in the fp64 patches,
> we were thinking of applying one patch at the end of the fp64 series adding
> those helper functions (maybe just macros like NIR_GET_UNSIZED_TYPE and
> NIR_GET_TYPE_SIZE) and adapting the users of the mask.

I should probably mention, in general we tend to prefer inline
functions over macros where possible since it's clearer what their
argument types and return type are and they tend to integrate better
with gdb.
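
To show what I mean, the helpers could be written as inline functions roughly like this. It is a standalone sketch: the sketch_* names and enum values are hypothetical stand-ins rather than the real nir.h definitions, and it assumes the base type lives in the low three bits with the bit size or-ed in above them.

```c
/* Hypothetical stand-ins for the base and sized ALU types. */
enum sketch_alu_type {
    SKETCH_TYPE_FLOAT = 1,
    SKETCH_TYPE_INT = 2,
    SKETCH_TYPE_UINT = 3,
    SKETCH_TYPE_BOOL = 4,
    SKETCH_TYPE_INT16 = 16 | SKETCH_TYPE_INT,
    SKETCH_TYPE_FLOAT32 = 32 | SKETCH_TYPE_FLOAT,
    SKETCH_TYPE_UINT64 = 64 | SKETCH_TYPE_UINT
};

/* Assumption: the base type occupies the low three bits. */
#define SKETCH_BASE_TYPE_MASK 0x7u

/* Bit size of a sized type, or 0 for an unsized one. */
static inline unsigned
sketch_type_size(unsigned type)
{
    return type & ~SKETCH_BASE_TYPE_MASK;
}

/* Strip the size, leaving only the base type. */
static inline unsigned
sketch_unsized_type(unsigned type)
{
    return type & SKETCH_BASE_TYPE_MASK;
}

/* Combine a base type with a bit size. */
static inline unsigned
sketch_sized_type(unsigned base, unsigned bits)
{
    return (base & SKETCH_BASE_TYPE_MASK) | bits;
}
```

The argument and return types are visible in the signature, and the functions can be called from a debugger, which the equivalent macros would not give us.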

>
> However, we can add them here and modify the rest of fp64 patches if
> you prefer it.
>
> Sam
>
>>>
>>>> +
>>>>  typedef enum {
>>>>     NIR_OP_IS_COMMUTATIVE = (1 << 0),
>>>>     NIR_OP_IS_ASSOCIATIVE = (1 << 1),
>>>> --
>>>> 2.7.0
>>>
>>


Re: [Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Nanley Chery
On Sat, Mar 12, 2016 at 12:20:26PM +0100, Jacek Konieczny wrote:
> On 2016-03-12 11:59, Jacek Konieczny wrote:
> > Hi,
> > 
> > I have been playing with Vulkan API and using the Mesa Intel Vulkan
> > driver from the 'vulkan' branch.
> > 
> > A recent driver upgrade has broken my previously working code, causing
> > massive flickering and graphical artifacts.
> > 
> > git bisect has shown that this is the breaking change:
> > 
> > commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
> > Author: Nanley Chery 
> > Date:   Fri Mar 4 11:43:19 2016 -0800
> > 
> > anv/meta: Minimize height of images used for copies
> 
> I am sorry. It seems I have been using 'git bisect' wrong.
> 
> This is the breaking change:
> 
> commit 248ab61740c4082517424f5aa94b2f4e7b210d76 (HEAD)
> Author: Jason Ekstrand 
> Date:   Tue Mar 8 17:10:05 2016 -0800
> 
> anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer
> 
> And this seems to make more sense.
> 

There is a bug report that's tracking this regression:
https://bugs.freedesktop.org/show_bug.cgi?id=94468

In the meantime, a workaround is to comment out:
genX(cmd_buffer_config_l3)(cmd_buffer, false);
in src/intel/vulkan/genX_cmd_buffer.c.

Regards,
Nanley


Re: [Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Jacek Konieczny
On 2016-03-12 11:59, Jacek Konieczny wrote:
> Hi,
> 
> I have been playing with Vulkan API and using the Mesa Intel Vulkan
> driver from the 'vulkan' branch.
> 
> A recent driver upgrade has broken my previously working code, causing
> massive flickering and graphical artifacts.
> 
> git bisect has shown that this is the breaking change:
> 
> commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
> Author: Nanley Chery 
> Date:   Fri Mar 4 11:43:19 2016 -0800
> 
> anv/meta: Minimize height of images used for copies

I am sorry. It seems I have been using 'git bisect' wrong.

This is the breaking change:

commit 248ab61740c4082517424f5aa94b2f4e7b210d76 (HEAD)
Author: Jason Ekstrand 
Date:   Tue Mar 8 17:10:05 2016 -0800

anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer

And this seems to make more sense.

Jacek


[Mesa-dev] vulkan: regression on Haswell

2016-03-12 Thread Jacek Konieczny
Hi,

I have been playing with Vulkan API and using the Mesa Intel Vulkan
driver from the 'vulkan' branch.

A recent driver upgrade has broken my previously working code, causing
massive flickering and graphical artifacts.

git bisect has shown that this is the breaking change:

commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
Author: Nanley Chery 
Date:   Fri Mar 4 11:43:19 2016 -0800

anv/meta: Minimize height of images used for copies

It might be, that my code is broken, but it worked correctly before this
change and has been checked with validation layers and Valgrind.

Jacek


Re: [Mesa-dev] [PATCH] swrast: Delete the unused 'slice' parameter

2016-03-12 Thread Alejandro Piñeiro
On 12/03/16 00:16, Anuj Phogat wrote:
> Signed-off-by: Anuj Phogat 

Any reason to not just move the slice assert at line 243 as part of the
checks of check_map_teximage?

> ---
>  src/mesa/swrast/s_texture.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
> index 9ccd0e3..6ea7b6c 100644
> --- a/src/mesa/swrast/s_texture.c
> +++ b/src/mesa/swrast/s_texture.c
> @@ -178,7 +178,7 @@ _swrast_free_texture_image_buffer(struct gl_context *ctx,
>   */
>  static void
>  check_map_teximage(const struct gl_texture_image *texImage,
> -   GLuint slice, GLuint x, GLuint y, GLuint w, GLuint h)
> +   GLuint x, GLuint y, GLuint w, GLuint h)
>  {
>  
> if (texImage->TexObject->Target == GL_TEXTURE_1D)
> @@ -216,7 +216,7 @@ _swrast_map_teximage(struct gl_context *ctx,
> GLint stride, texelSize;
> GLuint bw, bh;
>  
> -   check_map_teximage(texImage, slice, x, y, w, h);
> +   check_map_teximage(texImage, x, y, w, h);
>  
> if (!swImage->Buffer) {
>/* Either glTexImage was called with a NULL  argument or
