date:20170316

On Fri, 2017-03-17 at 16:26 +1100, Edward O'Callaghan wrote:
> We memset number of elements without multiplication by the
> element size.
> 
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/mesa/main/formatquery.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
> index 598d34d..50d7c31 100644
> --- a/src/mesa/main/formatquery.c
> +++ b/src/mesa/main/formatquery.c
> @@ -1564,7 +1564,7 @@ _mesa_GetInternalformati64v(GLenum target, GLenum 
> internalformat,
>  * no pname can return a negative value, we fill params32 with negative
>  * values as reference values, that can be used to know what copy-back to
>  * params */
> -   memset(params32, -1, 16);
> +   memset(params32, -1, 16*sizeof(GLint));

sizeof(params32) ?

>  
> /* For GL_MAX_COMBINED_DIMENSIONS we need to get back 2 32-bit integers,
>  * and at the same time we only need 2. So for that pname, we call the

-- 
Jan Vesely 
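
For readers following the review comment above, here is a minimal sketch of why the byte count matters. It assumes a local 16-element int array standing in for params32; the helper name elements_set and the macro NUM_PARAMS are made up for illustration and are not part of formatquery.c. Note that sizeof(params32) only yields the full size when params32 is a real array in the current scope, not a pointer.

```c
#include <stddef.h>
#include <string.h>

#define NUM_PARAMS 16 /* element count taken from the patch */

/* Fill the first `nbytes` bytes of a zeroed 16-int array with 0xFF and
 * report how many whole elements ended up as -1.
 * `nbytes` must not exceed sizeof(int) * NUM_PARAMS. */
static size_t elements_set(size_t nbytes)
{
    int params32[NUM_PARAMS] = {0};
    size_t i, count = 0;

    /* memset() counts bytes, not elements. */
    memset(params32, -1, nbytes);

    for (i = 0; i < NUM_PARAMS; i++) {
        if (params32[i] == -1)
            count++;
    }
    return count;
}
```

Calling elements_set(NUM_PARAMS) mirrors the pre-patch call and only covers the first 16 bytes, while elements_set(NUM_PARAMS * sizeof(int)) covers the whole array, which is what both the patch and the sizeof(params32) suggestion achieve.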

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa/main: Fix memset in formatquery.c

2017-03-16 Thread Edward O'Callaghan

We memset number of elements without multiplication by the
element size.

Signed-off-by: Edward O'Callaghan 
---
 src/mesa/main/formatquery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
index 598d34d..50d7c31 100644
--- a/src/mesa/main/formatquery.c
+++ b/src/mesa/main/formatquery.c
@@ -1564,7 +1564,7 @@ _mesa_GetInternalformati64v(GLenum target, GLenum 
internalformat,
 * no pname can return a negative value, we fill params32 with negative
 * values as reference values, that can be used to know what copy-back to
 * params */
-   memset(params32, -1, 16);
+   memset(params32, -1, 16*sizeof(GLint));
 
/* For GL_MAX_COMBINED_DIMENSIONS we need to get back 2 32-bit integers,
 * and at the same time we only need 2. So for that pname, we call the
-- 
2.9.3


Re: [Mesa-dev] [PATCH 6/7] gallium/radeon: formalize that create_batch_query doesn't need pipe_context

2017-03-16 Thread Timothy Arceri


4, 5 & 6 are:

Reviewed-by: Timothy Arceri 


Re: [Mesa-dev] [PATCH 3/7] radeonsi: compile all TGSI compute shaders asynchronously

2017-03-16 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

[Mesa-dev] [PATCH] radv/meta: fix image clears for r4g4 format.

From: Dave Airlie 

This just uses an 8-bit clear and packs the values.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_meta_clear.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index c07775f..6583d64 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1175,6 +1175,14 @@ radv_cmd_clear_image(struct radv_cmd_buffer *cmd_buffer,
       internal_clear_value.color.uint32[0] = value;
    }
 
+   if (format == VK_FORMAT_R4G4_UNORM_PACK8) {
+      uint8_t r, g;
+      format = VK_FORMAT_R8_UINT;
+      r = float_to_ubyte(clear_value->color.float32[0]) >> 4;
+      g = float_to_ubyte(clear_value->color.float32[1]) >> 4;
+      internal_clear_value.color.uint32[0] = (r << 4) | (g & 0xf);
+   }
+
    for (uint32_t r = 0; r < range_count; r++) {
       const VkImageSubresourceRange *range = &ranges[r];
       for (uint32_t l = 0; l < radv_get_levelCount(image, range); ++l) {
-- 
2.9.3
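
As a standalone illustration of the packing done in the patch above, the sketch below converts two normalized floats to 4-bit channels and packs them into one byte, red in the high nibble and green in the low nibble. The helper unorm_float_to_ubyte is a hypothetical stand-in for Mesa's float_to_ubyte() and may not round exactly the same way; pack_r4g4 is likewise a made-up name.

```c
#include <stdint.h>

/* Hypothetical stand-in for Mesa's float_to_ubyte(): clamp to [0, 1]
 * and scale to 0..255 with rounding to nearest. */
static uint8_t unorm_float_to_ubyte(float f)
{
    if (!(f > 0.0f))
        return 0; /* also catches NaN */
    if (f >= 1.0f)
        return 255;
    return (uint8_t)(f * 255.0f + 0.5f);
}

/* Pack red and green into one byte: keep the top 4 bits of each 8-bit
 * channel, red in bits 7..4 and green in bits 3..0. */
static uint8_t pack_r4g4(float red, float green)
{
    uint8_t r = unorm_float_to_ubyte(red) >> 4;
    uint8_t g = unorm_float_to_ubyte(green) >> 4;

    return (uint8_t)((r << 4) | (g & 0xf));
}
```

The packed byte is what the patch then writes through an R8_UINT clear, so the clear path never needs to understand the R4G4 layout itself.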


Re: [Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

On Thu, 2017-03-16 at 18:07 -0700, Francisco Jerez wrote:
> Jan Vesely  writes:
> 
> > On Thu, 2017-03-16 at 17:22 -0700, Francisco Jerez wrote:
> > > Jan Vesely  writes:
> > > 
> > > > On Thu, 2017-03-16 at 15:24 -0700, Francisco Jerez wrote:
> > > > > Jan Vesely  writes:
> > > > > 
> > > > > > v2: buffers are created with one reference.
> > > > > > v3: add pipe_resource reference to mapping object
> > > > > > 
> > > > > 
> > > > > Mapping objects are supposed to be short-lived, they're logically part
> > > > > of the parent resource object so they shouldn't ever out-live it.  
> > > > > What
> > > > > is this useful for?
> > > > 
> > > > currently they can outlive the underlying pipe_resource. pipe_resource
> > > > is destroyed in root_resource destructor, while the list of mappings is
> > > > destroyed after resource destructor.
> > > 
> > > Right.  I guess the problem is that the pipe_transfer object associated
> > > to the clover::mapping object holds a pointer to the backing
> > > pipe_resource object but it fails to increment its reference count?  I
> > > guess that's the reason why v2 didn't help?
> > 
> > yes, though the pointer is hidden somewhere. I thought pxfer->resource
> > might be it, but it's not, and digging deeper into the structure didn't
> > sound like a good idea to me.
> > 
> 
> What is pxfer->resource about in that case?

That's a good question, so I did a bit of digging. On EG, global
buffers are shadowed in the global buffer memory pool, so mapping uses
the memory pool's pipe_resource. I'm still not 100% sure where exactly
unmapping the global pool accesses the original buffer's pipe_resource,
but I don't think it matters that much. pxfer->resource is not required
to be equal to resource->pipe. We need to guarantee that resource->pipe
outlives all mappings, either explicitly or by grabbing a reference.

Jan
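
To make the "grab a reference" option concrete, here is a small sketch in plain C. It is only loosely modeled on pipe_resource_reference(); the struct layouts, the destroyed_flag field and all function names are assumptions made for this example and do not match gallium's or clover's real types.

```c
#include <stdlib.h>

/* Toy reference-counted resource. destroyed_flag lets the caller
 * observe when the object is freed. */
struct resource {
    int refcount;
    int *destroyed_flag;
};

static struct resource *resource_create(int *destroyed_flag)
{
    struct resource *r = malloc(sizeof(*r));

    if (!r)
        return NULL;
    r->refcount = 1; /* created with one reference, as in v2 */
    r->destroyed_flag = destroyed_flag;
    *destroyed_flag = 0;
    return r;
}

/* Point *dst at src, taking a reference on src and dropping the one
 * previously held through *dst. Frees the old object at count zero. */
static void resource_reference(struct resource **dst, struct resource *src)
{
    struct resource *old = *dst;

    if (src)
        src->refcount++;
    if (old && --old->refcount == 0) {
        *old->destroyed_flag = 1;
        free(old);
    }
    *dst = src;
}

/* A mapping that keeps its backing resource alive. */
struct mapping {
    struct resource *res;
};

static void mapping_init(struct mapping *m, struct resource *r)
{
    m->res = NULL;
    resource_reference(&m->res, r);
}

static void mapping_fini(struct mapping *m)
{
    resource_reference(&m->res, NULL);
}

/* The owner drops its reference while a mapping is still live.
 * Returns 1 if the resource survived until the mapping was released
 * and was destroyed right after. */
static int resource_outlives_owner(void)
{
    int destroyed = 0;
    int alive_while_mapped;
    struct mapping m;
    struct resource *owner = resource_create(&destroyed);

    if (!owner)
        return 0;
    mapping_init(&m, owner);
    resource_reference(&owner, NULL); /* owner goes away first */
    alive_while_mapped = !destroyed;
    mapping_fini(&m);
    return alive_while_mapped && destroyed;
}
```

The alternative discussed in the thread, clearing the mapping list before the resource is destroyed, needs no counting at all but relies on destruction order instead.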


> 
> > > 
> > > > this is arguably an application bug. the piglit test does not call
> > > > clUnmapMemObject(), but it'd be nice to not access freed memory.
> > > > 
> > > > Vedran's alternative to clear the list before destroying pipe_resource
> > > > works as well (assert that the list is empty in resource destructor
> > > > would help spot the issue).
> > > > 
> > > 
> > > Assuming that pipe_transfers are supposed *not* to hold a reference to
> > > the underlying pipe_resource, which implies that the caller must
> > > guarantee it will never outlive its backing resource, it sounds like the
> > > minimal solution would be to have clover::mapping make the same
> > > assumptions.  You could probably achieve that in one line of code by
> > > clearing the mapping list from the clover::resource destructor as you
> > > suggest above.
> > 
> > I'd say the interface would be nicer if pipe_transfers did hold a
> > reference (or at least a mapping count to assert on), but I have no
> > plans to go that route.
> > the problem is a bit more complicated by the fact that pipe_resource is
> > handled by root_resource, while the list of mappings is private to
> > parent class resource.
> > 
> > Vedran's patch is here:
> > https://lists.freedesktop.org/archives/mesa-dev/2017-March/147092.html
> > 
> > I thought that using references would be nicer, as it looked useful for
> > device shared buffers, but that no longer applies.
> > 
> > Jan
> > 
> > > 
> > > > Jan
> > > > 
> > > > > 
> > > > > > CC: "17.0 13.0" 
> > > > > > 
> > > > > > Signed-off-by: Jan Vesely 
> > > > > > ---
> > > > > >  src/gallium/state_trackers/clover/core/resource.cpp | 11 
> > > > > > ---
> > > > > >  src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
> > > > > >  2 files changed, 12 insertions(+), 6 deletions(-)
> > > > > > 
> > > > > > diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
> > > > > > b/src/gallium/state_trackers/clover/core/resource.cpp
> > > > > > index 06fd3f6..83e3c26 100644
> > > > > > --- a/src/gallium/state_trackers/clover/core/resource.cpp
> > > > > > +++ b/src/gallium/state_trackers/clover/core/resource.cpp
> > > > > > @@ -25,6 +25,7 @@
> > > > > >  #include "pipe/p_screen.h"
> > > > > >  #include "util/u_sampler.h"
> > > > > >  #include "util/u_format.h"
> > > > > > +#include "util/u_inlines.h"
> > > > > >  
> > > > > >  using namespace clover;
> > > > > >  
> > > > > > @@ -176,7 +177,7 @@ root_resource::root_resource(clover::device 
> > > > > > &dev, memory_obj &obj,
> > > > > >  }
> > > > > >  
> > > > > >  root_resource::~root_resource() {
> > > > > > -   device().pipe->resource_destroy(device().pipe, pipe);
> > > > > > +   pipe_resource_reference(&this->pipe, NULL);
> > > > > >  }
> > > > > >  
> > > > > >  sub_resource::sub_resource(resource &r, const vector &offset) :
> > > > > > @@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource 
> > > > > > &r,
> > > > > >pxfer = NULL;
> > > > > >throw error(CL_OUT_OF_RESOURCES);
> > > > > > }
> > > > > > +   pipe_resource_reference(&res, r.pipe);
> > > > > >  }
> > > > > >  
> > >

[Mesa-dev] [PATCH] spirv: Implement IsInf using an integer comparison

Since we already do fabs on the one source, we're guaranteed to get
positive infinity if we get any infinity at all.  Since +inf only has
one IEEE 754 representation, we can use an integer comparison and avoid
all of the ordered/unordered issues.

Cc: Dave Airlie 
---
 src/compiler/spirv/vtn_alu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 0738fe0..9e4beed 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -447,7 +447,7 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpIsInf:
-  val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
+  val->ssa->def = nir_ieq(&b->nb, nir_fabs(&b->nb, src[0]),
   nir_imm_float(&b->nb, INFINITY));
   break;
 
-- 
2.5.0.400.gff86faf
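
The idea in the commit message can be sketched outside of NIR as follows, assuming IEEE 754 binary32 floats. Clearing the sign bit plays the role of fabs, and the function names float_bits and is_inf_bits are made up for this example.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bytes as a 32-bit integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof(u));
    return u;
}

/* IsInf as an integer comparison: after dropping the sign bit, only
 * +inf has the bit pattern 0x7f800000. NaN has a non-zero mantissa,
 * so it can never compare equal, and no ordered/unordered question
 * arises because no floating-point compare is involved. */
static int is_inf_bits(float x)
{
    uint32_t abs_bits = float_bits(x) & 0x7fffffffu;

    return abs_bits == float_bits(INFINITY);
}
```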


Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

Quoting Marek Olšák (2017-03-16 18:53:59)
> On Fri, Mar 17, 2017 at 12:11 AM, Dylan Baker  wrote:
> > Quoting Marek Olšák (2017-03-16 15:36:26)
> >> Is there a way not to use ninja with meson, because ninja redirects
> >> all stderr output from gcc to stdout, which breaks many development
> >> environments that expect errors in stderr?
> >>
> >> I'm basically saying that if ninja can't keep gcc errors in stderr, I
> >> wouldn't like any project that I might be involved in to require ninja
> >> for building.
> >>
> >> Marek
> >
> > There is no way to use another backend on Linux, and meson will not support
> > Make. Ninja is a big part of the appeal here, since it is faster than make 
> > is.
> > Are there particular tools you know don't work with ninja? It seems like in 
> > the
> > 7+ years since ninja came out that someone would have fixed the tools, or 
> > that
> > some stream redirection could be used to fix the problem, "ninja 1>&2"?
> 
> I actually read some thread about it and the conclusion seemed to be
> that ninja developers don't care. I have no other option than to
> believe that ninja was made for automated build bots, not for
> development.
> 
> Some editors expect that errors and only errors go to stderr and all
> other garbage info goes to stdout. This is something I can't change.
> 
> Marek

And I found this: 
https://groups.google.com/forum/#!topic/ninja-build/4syh2jzXWcI

Which leads me to believe that they would be responsive to a patch, the core
team just doesn't have a use for it. There is in fact a patch already written
(linked in that thread), that just needs someone to clean it up and propose it
for merge.

Dylan



Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

Quoting Jason Ekstrand (2017-03-16 19:03:15)
> On March 16, 2017 5:41:24 PM Emil Velikov  wrote:
> > And meson is not a thing on neither BSD(s), Solaris (and derivatives) nor 
> > Android :-\
> 
> I have trouble bringing myself to care.  The BSDs need to stop using 10 
> year old compilers.  It can be made to work on Solaris and BSD if someone 
> bothered to put a little work into it.  Besides, given that large chunks of 
> GNOME are switching they're going to have to make it work some day soon 
> anyway.

I decided to check the ports on my FreeBSD box, and it has meson. In fact:
FreeBSD: https://svnweb.freebsd.org/ports/head/devel/meson/
NetBSD: http://pkgsrc.se/wip/py-meson
OpenBSD: http://ports.su/devel/meson

The only OS I couldn't find a package for is OpenIndiana (the closest thing to
Solaris I could come up with), but there is ninja for Solaris, and meson itself
is pure Python installable via pip, so even there it's not impossible.

Dylan



[Mesa-dev] [PATCH 1/2] radv: flush f32->f16 conversion denormals to zero. (v2)

From: Dave Airlie 

SPIR-V defines the f32->f16 operation as flushing denormals to 0,
this compares the class using amd class opcode.

Thanks to Matt Arsenault for figuring it out.

This fix is VI+ only, add a TODO for SI/CIK.

This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 32 
 src/amd/common/sid.h| 13 +
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 15974a7..6c28e98 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1022,6 +1022,33 @@ static LLVMValueRef emit_b2f(struct nir_to_llvm_context 
*ctx,
return LLVMBuildAnd(ctx->builder, src0, LLVMBuildBitCast(ctx->builder, 
LLVMConstReal(ctx->f32, 1.0), ctx->i32, ""), "");
 }
 
+static LLVMValueRef emit_f2f16(struct nir_to_llvm_context *ctx,
+                               LLVMValueRef src0)
+{
+   LLVMValueRef result;
+   LLVMValueRef cond;
+
+   src0 = to_float(ctx, src0);
+   result = LLVMBuildFPTrunc(ctx->builder, src0, ctx->f16, "");
+
+   /* TODO SI/CIK options here */
+   if (ctx->options->chip_class >= VI) {
+      LLVMValueRef args[2];
+      /* Check if the result is a denormal - and flush to 0 if so. */
+      args[0] = result;
+      args[1] = LLVMConstInt(ctx->i32, N_SUBNORMAL | P_SUBNORMAL, false);
+      cond = ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.class.f16", ctx->i1, args, 2, AC_FUNC_ATTR_READNONE);
+   }
+
+   /* need to convert back up to f32 */
+   result = LLVMBuildFPExt(ctx->builder, result, ctx->f32, "");
+
+   if (ctx->options->chip_class >= VI)
+      result = LLVMBuildSelect(ctx->builder, cond, ctx->f32zero, result, "");
+
+   return result;
+}
+
 static LLVMValueRef emit_umul_high(struct nir_to_llvm_context *ctx,
   LLVMValueRef src0, LLVMValueRef src1)
 {
@@ -1520,10 +1547,7 @@ static void visit_alu(struct nir_to_llvm_context *ctx, 
nir_alu_instr *instr)
result = emit_b2f(ctx, src[0]);
break;
case nir_op_fquantize2f16:
-   src[0] = to_float(ctx, src[0]);
-   result = LLVMBuildFPTrunc(ctx->builder, src[0], ctx->f16, "");
-   /* need to convert back up to f32 */
-   result = LLVMBuildFPExt(ctx->builder, result, ctx->f32, "");
+   result = emit_f2f16(ctx, src[0]);
break;
case nir_op_umul_high:
result = emit_umul_high(ctx, src[0], src[1]);
diff --git a/src/amd/common/sid.h b/src/amd/common/sid.h
index 7789add..fa20690 100644
--- a/src/amd/common/sid.h
+++ b/src/amd/common/sid.h
@@ -9063,5 +9063,18 @@
 #define CIK_SDMA_PACKET_SRBM_WRITE  0xe
 #define CIK_SDMA_COPY_MAX_SIZE      0x3fffe0
 
+enum amd_cmp_class_flags {
+   S_NAN = 1 << 0,        // Signaling NaN
+   Q_NAN = 1 << 1,        // Quiet NaN
+   N_INFINITY = 1 << 2,   // Negative infinity
+   N_NORMAL = 1 << 3,     // Negative normal
+   N_SUBNORMAL = 1 << 4,  // Negative subnormal
+   N_ZERO = 1 << 5,       // Negative zero
+   P_ZERO = 1 << 6,       // Positive zero
+   P_SUBNORMAL = 1 << 7,  // Positive subnormal
+   P_NORMAL = 1 << 8,     // Positive normal
+   P_INFINITY = 1 << 9    // Positive infinity
+};
+
 #endif /* _SID_H */
 
-- 
2.9.3
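
The class-mask test used by the patch can be sketched on the CPU as follows. This is an illustration only: it classifies a 32-bit float by inspecting its bits, whereas the patch asks the hardware to classify the truncated f16 value through llvm.amdgcn.class.f16. The enum values are copied from the patch; classify_f32 and flush_subnormal_to_zero are hypothetical names, and the quiet/signaling split assumes the IEEE 754-2008 convention that the top mantissa bit marks a quiet NaN.

```c
#include <stdint.h>
#include <string.h>

enum amd_cmp_class_flags {
    S_NAN = 1 << 0,       /* Signaling NaN */
    Q_NAN = 1 << 1,       /* Quiet NaN */
    N_INFINITY = 1 << 2,  /* Negative infinity */
    N_NORMAL = 1 << 3,    /* Negative normal */
    N_SUBNORMAL = 1 << 4, /* Negative subnormal */
    N_ZERO = 1 << 5,      /* Negative zero */
    P_ZERO = 1 << 6,      /* Positive zero */
    P_SUBNORMAL = 1 << 7, /* Positive subnormal */
    P_NORMAL = 1 << 8,    /* Positive normal */
    P_INFINITY = 1 << 9   /* Positive infinity */
};

/* Return exactly one class flag for a binary32 value. */
static unsigned classify_f32(float x)
{
    uint32_t bits, exponent, mantissa;
    int negative;

    memcpy(&bits, &x, sizeof(bits));
    negative = (int)(bits >> 31);
    exponent = (bits >> 23) & 0xffu;
    mantissa = bits & 0x7fffffu;

    if (exponent == 0xffu) {
        if (mantissa == 0)
            return negative ? N_INFINITY : P_INFINITY;
        return (mantissa & 0x400000u) ? Q_NAN : S_NAN;
    }
    if (exponent == 0) {
        if (mantissa == 0)
            return negative ? N_ZERO : P_ZERO;
        return negative ? N_SUBNORMAL : P_SUBNORMAL;
    }
    return negative ? N_NORMAL : P_NORMAL;
}

/* Same shape as the select in the patch: if the value's class is in
 * the mask, produce +0.0, otherwise pass the value through. */
static float flush_subnormal_to_zero(float x)
{
    unsigned mask = N_SUBNORMAL | P_SUBNORMAL;

    return (classify_f32(x) & mask) ? 0.0f : x;
}
```

Because each value belongs to exactly one class, OR-ing flags into a mask gives a cheap "is it any of these" test, which is how the patch catches both negative and positive subnormals with a single intrinsic call.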


[Mesa-dev] [PATCH 2/2] radv/ac: canonicalize the output for 32-bit float min/max.

From: Dave Airlie 

This fixes:
dEQP-VK.glsl.builtin.precision.min.*
dEQP-VK.glsl.builtin.precision.max.*
dEQP-VK.glsl.builtin.precision.clamp.*

As we weren't flushing the denorms as SPIR-V requires.

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 6c28e98..776208a 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1456,10 +1456,18 @@ static void visit_alu(struct nir_to_llvm_context *ctx, 
nir_alu_instr *instr)
case nir_op_fmax:
result = emit_intrin_2f_param(ctx, "llvm.maxnum",
  to_float_type(ctx, def_type), 
src[0], src[1]);
+   if (instr->dest.dest.ssa.bit_size == 32)
+   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
+ to_float_type(ctx, 
def_type),
+ result);
break;
case nir_op_fmin:
result = emit_intrin_2f_param(ctx, "llvm.minnum",
  to_float_type(ctx, def_type), 
src[0], src[1]);
+   if (instr->dest.dest.ssa.bit_size == 32)
+   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
+ to_float_type(ctx, 
def_type),
+ result);
break;
case nir_op_ffma:
result = emit_intrin_3f_param(ctx, "llvm.fma",
-- 
2.9.3


Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

2017-03-16 Thread Roland Scheidegger

Am 17.03.2017 um 04:33 schrieb Roland Scheidegger:
> Am 17.03.2017 um 02:29 schrieb Dave Airlie:
>> On 17 March 2017 at 11:09, Jason Ekstrand  wrote:
>>> On March 16, 2017 5:04:37 PM Dave Airlie  wrote:
>>>
>>>> From: Dave Airlie 
>>>>
>>>> In order to get isinf(NaN) correct, at least radv can't
>>>> use an unordered equals which feq has to be for us, this
>>>> passes isinf to the backend and let's it sort it out as it
>>>> pleases.
>>>
>>>
>>> I think comparisons are something that were going to need to sort out better
>>> in general.  SPIR-V's rules are stricter than GL (at least the way we
>>> interpret it).  Could you please be more specific about the issue?
>>
>> IsInf(NaN) unordered appears to end up at true, when the spec for isinf
>> says it should be false.
>>
>> well SPIR-V has the unorder and ordered stuff for OpenCL kernels, just
>> not sure what want in NIR in this area. If I default to using ordered 
>> compares
>> for NIR I get isnan and funord fails last I tried.
> 
> FWIW tgsi uses ordered for all comparisons except not equal.
> Ok well maybe not all drivers do, but that's what I'd consider what it
> should be (radeonsi does that too at a quick glance). Because, well, you
> guessed it, d3d10... GL of course won't mandate anything with its super
> relaxed nan rules. But this pretty much matches what you get with
> ordinary comparison operators in c, so my uneducated guess is it would
> be a good match for nir too...

Forgot to mention, doing that should fix isnan, as that uses not equal.
But if spir-v needs all comparisons both in ordered and unordered form
you're screwed either way if you don't have them in nir.

Roland


Re: [Mesa-dev] [PATCH] radv/ac: canonicalize the output for 32-bit float min/max.

Doh missent two patches, will rebase and resend.

Dave.
>
> This fixes:
> dEQP-VK.glsl.builtin.precision.min.*
> dEQP-VK.glsl.builtin.precision.max.*
> dEQP-VK.glsl.builtin.precision.clamp.*
>
> As weren't flushing the denorms as SPIR-V required.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index bb53386..122df7f 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -1449,16 +1449,18 @@ static void visit_alu(struct nir_to_llvm_context 
> *ctx, nir_alu_instr *instr)
> case nir_op_fmax:
> result = emit_intrin_2f_param(ctx, "llvm.maxnum",
>   to_float_type(ctx, def_type), 
> src[0], src[1]);
> -   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
> - to_float_type(ctx, def_type),
> - result);
> +   if (instr->dest.dest.ssa.bit_size == 32)
> +   result = emit_intrin_1f_param(ctx, 
> "llvm.canonicalize",
> + to_float_type(ctx, 
> def_type),
> + result);
> break;
> case nir_op_fmin:
> result = emit_intrin_2f_param(ctx, "llvm.minnum",
>   to_float_type(ctx, def_type), 
> src[0], src[1]);
> -   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
> - to_float_type(ctx, def_type),
> - result);
> +   if (instr->dest.dest.ssa.bit_size == 32)
> +   result = emit_intrin_1f_param(ctx, 
> "llvm.canonicalize",
> + to_float_type(ctx, 
> def_type),
> + result);
> break;
> case nir_op_ffma:
> result = emit_intrin_3f_param(ctx, "llvm.fma",
> --
> 2.9.3
>

Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

On Thu, Mar 16, 2017 at 8:33 PM, Roland Scheidegger 
wrote:

> Am 17.03.2017 um 02:29 schrieb Dave Airlie:
> > On 17 March 2017 at 11:09, Jason Ekstrand  wrote:
> >> On March 16, 2017 5:04:37 PM Dave Airlie  wrote:
> >>
> >>> From: Dave Airlie 
> >>>
> >>> In order to get isinf(NaN) correct, at least radv can't
> >>> use an unordered equals which feq has to be for us, this
> >>> passes isinf to the backend and let's it sort it out as it
> >>> pleases.
> >>
> >>
> >> I think comparisons are something that were going to need to sort out
> better
> >> in general.  SPIR-V's rules are stricter than GL (at least the way we
> >> interpret it).  Could you please be more specific about the issue?
> >
> > IsInf(NaN) unordered appears to end up at true, when the spec for isinf
> > says it should be false.
> >
> > well SPIR-V has the unorder and ordered stuff for OpenCL kernels, just
> > not sure what want in NIR in this area. If I default to using ordered
> compares
> > for NIR I get isnan and funord fails last I tried.
>
> FWIW tgsi uses ordered for all comparisons except not equal.
> Ok well maybe not all drivers do, but that's what I'd consider what it
> should be (radeonsi does that too at a quick glance). Because, well, you
> guessed it, d3d10... GL of course won't mandate anything with its super
> relaxed nan rules. But this pretty much matches what you get with
> ordinary comparison operators in c, so my uneducated guess is it would
> be a good match for nir too...
>

That was more-or-less my plan.

On Thu, Mar 16, 2017 at 6:32 PM, Dave Airlie  wrote:

> > Another option would be to make this lower_isinf and add a quick lowering
> > line to nir_opt_algebraic.  That's more idiomatic for nir.
>
> If I do that though won't that mean I have to set lower_isinf for all
> current NIR
> users?
>

No, just us.  We're the only ones that consume SPIR-V. :-)

[Mesa-dev] [PATCH] radv/ac: canonicalize the output for 32-bit float min/max.

From: Dave Airlie 

This fixes:
dEQP-VK.glsl.builtin.precision.min.*
dEQP-VK.glsl.builtin.precision.max.*
dEQP-VK.glsl.builtin.precision.clamp.*

As we weren't flushing the denorms as SPIR-V requires.

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index bb53386..122df7f 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1449,16 +1449,18 @@ static void visit_alu(struct nir_to_llvm_context *ctx, 
nir_alu_instr *instr)
case nir_op_fmax:
result = emit_intrin_2f_param(ctx, "llvm.maxnum",
  to_float_type(ctx, def_type), 
src[0], src[1]);
-   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
- to_float_type(ctx, def_type),
- result);
+   if (instr->dest.dest.ssa.bit_size == 32)
+   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
+ to_float_type(ctx, 
def_type),
+ result);
break;
case nir_op_fmin:
result = emit_intrin_2f_param(ctx, "llvm.minnum",
  to_float_type(ctx, def_type), 
src[0], src[1]);
-   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
- to_float_type(ctx, def_type),
- result);
+   if (instr->dest.dest.ssa.bit_size == 32)
+   result = emit_intrin_1f_param(ctx, "llvm.canonicalize",
+ to_float_type(ctx, 
def_type),
+ result);
break;
case nir_op_ffma:
result = emit_intrin_3f_param(ctx, "llvm.fma",
-- 
2.9.3


Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

2017-03-16 Thread Roland Scheidegger

Am 17.03.2017 um 02:29 schrieb Dave Airlie:
> On 17 March 2017 at 11:09, Jason Ekstrand  wrote:
>> On March 16, 2017 5:04:37 PM Dave Airlie  wrote:
>>
>>> From: Dave Airlie 
>>>
>>> In order to get isinf(NaN) correct, at least radv can't
>>> use an unordered equals which feq has to be for us, this
>>> passes isinf to the backend and let's it sort it out as it
>>> pleases.
>>
>>
>> I think comparisons are something that were going to need to sort out better
>> in general.  SPIR-V's rules are stricter than GL (at least the way we
>> interpret it).  Could you please be more specific about the issue?
> 
> IsInf(NaN) unordered appears to end up at true, when the spec for isinf
> says it should be false.
> 
> well SPIR-V has the unorder and ordered stuff for OpenCL kernels, just
> not sure what want in NIR in this area. If I default to using ordered compares
> for NIR I get isnan and funord fails last I tried.

FWIW tgsi uses ordered for all comparisons except not equal.
Ok well maybe not all drivers do, but that's what I'd consider what it
should be (radeonsi does that too at a quick glance). Because, well, you
guessed it, d3d10... GL of course won't mandate anything with its super
relaxed nan rules. But this pretty much matches what you get with
ordinary comparison operators in C, so my uneducated guess is it would
be a good match for nir too...


Roland
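
For reference, the difference between ordered and unordered comparisons can be shown with plain C operators, which behave as described above: == is ordered (false when either operand is NaN) and != is the one unordered operator. The function names below are made up for this sketch, and it assumes the compiler is not run with fast-math style options that drop NaN handling.

```c
#include <math.h>

/* Ordered equality: false whenever either operand is NaN. */
static int ordered_eq(float a, float b)
{
    return a == b;
}

/* Unordered equality: true when the operands are equal OR when either
 * one is NaN (neither a < b nor a > b holds in that case). */
static int unordered_eq(float a, float b)
{
    return !(a < b || a > b);
}

/* IsInf built on each flavour of equality. */
static int is_inf_ordered(float x)
{
    return ordered_eq(x, INFINITY) || ordered_eq(x, -INFINITY);
}

static int is_inf_unordered(float x)
{
    return unordered_eq(x, INFINITY) || unordered_eq(x, -INFINITY);
}

/* isnan expressed through "not equal", the unordered operator. */
static int is_nan_via_ne(float x)
{
    return x != x;
}
```

With these definitions is_inf_unordered(NAN) comes out true, which is the IsInf(NaN) failure Dave describes, while is_inf_ordered(NAN) is false as the spec requires.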


Re: [Mesa-dev] [PATCH] radv: flush f32->f16 conversion denormals to zero.

2017-03-16 Thread Matt Arsenault


> On Mar 16, 2017, at 20:02, Dave Airlie  wrote:
> 
> From: Dave Airlie 
> 
> SPIR-V defines the f32->f16 operation as flushing denormals to 0,
> this compares the class using amd class opcode.
> 
> Thanks to Matt Arsenault for figuring it out.
> 
> This fixes:
> dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero
> 
> Signed-off-by: Dave Airlie 
> ---
> src/amd/common/ac_nir_to_llvm.c |  9 -
> src/amd/common/sid.h| 13 +
> 2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 77e3a85..ac80677 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -1027,11 +1027,18 @@ static LLVMValueRef emit_f2f16(struct 
> nir_to_llvm_context *ctx,
> {
>   LLVMValueRef result;
>   LLVMValueRef cond;
> + LLVMValueRef args[2];
>   src0 = to_float(ctx, src0);
>   result = LLVMBuildFPTrunc(ctx->builder, src0, ctx->f16, "");
> - result = ac_build_intrinsic(&ctx->ac, "llvm.canonicalize.f16", 
> ctx->f16, &result, 1, AC_FUNC_ATTR_READNONE);
> + LLVMValueRef mask = LLVMConstInt(ctx->i32, N_SUBNORMAL | P_SUBNORMAL, 
> false);
> + 

I don’t think you need the canonicalize here. This will also only work on VI+ 
which supports f16 instructions

-Matt

[Mesa-dev] [PATCH] radv: flush f32->f16 conversion denormals to zero.

From: Dave Airlie 

SPIR-V defines the f32->f16 operation as flushing denormals to 0,
this compares the class using amd class opcode.

Thanks to Matt Arsenault for figuring it out.

This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c |  9 -
 src/amd/common/sid.h| 13 +
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 77e3a85..ac80677 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1027,11 +1027,18 @@ static LLVMValueRef emit_f2f16(struct 
nir_to_llvm_context *ctx,
 {
LLVMValueRef result;
LLVMValueRef cond;
+   LLVMValueRef args[2];
src0 = to_float(ctx, src0);
result = LLVMBuildFPTrunc(ctx->builder, src0, ctx->f16, "");
-   result = ac_build_intrinsic(&ctx->ac, "llvm.canonicalize.f16", 
ctx->f16, &result, 1, AC_FUNC_ATTR_READNONE);
+   LLVMValueRef mask = LLVMConstInt(ctx->i32, N_SUBNORMAL | P_SUBNORMAL, 
false);
+   
+   args[0] = result;
+   args[1] = mask;
+   cond = ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.class.f16", ctx->i1, 
args, 2, AC_FUNC_ATTR_READNONE);
+
/* need to convert back up to f32 */
result = LLVMBuildFPExt(ctx->builder, result, ctx->f32, "");
+   result = LLVMBuildSelect(ctx->builder, cond, ctx->f32zero, result, "");
return result;
 }
 
diff --git a/src/amd/common/sid.h b/src/amd/common/sid.h
index 7789add..fa20690 100644
--- a/src/amd/common/sid.h
+++ b/src/amd/common/sid.h
@@ -9063,5 +9063,18 @@
 #define CIK_SDMA_PACKET_SRBM_WRITE  0xe
 #define CIK_SDMA_COPY_MAX_SIZE      0x3fffe0
 
+enum amd_cmp_class_flags {
+   S_NAN = 1 << 0,        // Signaling NaN
+   Q_NAN = 1 << 1,        // Quiet NaN
+   N_INFINITY = 1 << 2,   // Negative infinity
+   N_NORMAL = 1 << 3,     // Negative normal
+   N_SUBNORMAL = 1 << 4,  // Negative subnormal
+   N_ZERO = 1 << 5,       // Negative zero
+   P_ZERO = 1 << 6,       // Positive zero
+   P_SUBNORMAL = 1 << 7,  // Positive subnormal
+   P_NORMAL = 1 << 8,     // Positive normal
+   P_INFINITY = 1 << 9    // Positive infinity
+};
+
 #endif /* _SID_H */
 
-- 
2.9.3


Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

2017-03-16 Thread Brian Paul

On Thu, Mar 16, 2017 at 8:03 PM, Jason Ekstrand 
wrote:

> On March 16, 2017 5:41:24 PM Emil Velikov 
> wrote:
>
>> On 17 March 2017 at 00:21, Dylan Baker  wrote:
>>
>>> Hi Emil,
>>>
>>> Quoting Emil Velikov (2017-03-16 16:35:33)
>>>
>>>> While I can see you're impressed by Meson, I would kindly urge you to
>>>> not use it here. As you look closely you can see that one could
>>>> trivially improve the times, yet the biggest thing is that most of the
>>>> code in libdrm must go ;-)

>>>
>>> Perhaps I wasn't clear enough, I don't really expect this to land ever.
>>> I sent
>>> it out more because I'd written it and it works and is a useful
>>> demonstration of
>>> meson+ninja performance. Obviously 20 seconds -> 5 seconds isn't a huge
>>> deal :);
>>> but in a larger project, consider that a 4x speedup would be 4 minutes
>>> to 1
>>> minute, and that is a huge difference in time.
>>>
>> You are still failing to see past your usecase. As said before - if
>> there's any need to improve things say so.
>> Note that you simply cannot apply the 1000x speedup in any situation.
>>
>
> Yes, you can't just linearly apply any scaling factor.  However, when you
> build mesa on a machine with a decent number of threads (I think our build
> machine for the CI system has 32 threads), autotools+make is slow as mud.
> Also, there's very little you can do to speed up configure since it's a
> pile of shell and perl that inherently runs single-threaded and is fairly
> complex due to mesa's complicated dependencies.
>
>>>> As the port is not 1:1 wrt the autoconf one, the performance numbers
>>>> above are comparing apples to oranges.
>>>
>>> I fail to see what I'm missing from meson that would have an effect on
>>> the times I reported. There are some files that are installed by autoconf
>>> that I didn't bother to install with meson (because I don't expect this
>>> to land). Since I didn't time installs, I don't see how it isn't an
>>> apples to apples comparison.
>>
>> You already (explicitly) mentioned some differences. Admittedly not a
>> deal breaker.
>>
>>> I understand that libdrm is a pessimal case for recursive-make since most
>>> sub folders contain < 5 C files, However, even if you were to flatten
>>> the make files meson+ninja would still be faster when you consider that
>>> meson configures and builds faster than autotools configures.
>>
>> That's correct. If is so concerned - they should slim down the
>> configure.ac ;-)
>>
>
> There are real limits as to what you can do there.
>
>>>> If you/others are unhappy with the build times of libdrm - poke me on
>>>> IRC. I will give you some easy tips on how to improve those.
>>>>
>>>> You have some good python knowledge - I would kindly urge you to
>>>> improve/rewrite the slow and/or hacky python scripts we have in mesa.
>>>> This is a topic that was mentioned multiple times, and a part where
>>>> everyone will be glad to see some progress.
>>>>
>>>> Thanks
>>>> Emil
>>>
>>> The real goal here is to do mesa (in case I didn't make that clear
>>> either), and the advantage for mesa is not just performance, it's that
>>> meson supports visual studio on windows; which means that we could
>>> hopefully not just get faster builds, but also replace both autotools
>>> and scons with a single build system.
>>
>> Yes that was more than clear. Yet it won't fly, I'm afraid.
>>
>> The VMWare people like their SCons,
>>
>
> How much?  I would really rather the VMWare people speak on behalf of
> VMWare.  Meson is the single best shot we've ever had for replacing both
> with one build system.  I'm sure the VMware people would like to have a
> build system that gets maintained by the community as a whole.
>

Sure, I'd like to see one build system instead of two.  Meson supports
Windows so that's good.  But the big issue is our automated build system.
Replacing SCons with Meson could be a big deal in that context.  It would
at least involve pulling Meson into our toolchain and rewriting a bunch of
Python code to grok Meson.  I'd have to go off and investigate that to even
see if it's a possibility.  Maybe next week.

-Brian


>
>> and Meson is not a thing on neither BSD(s), Solaris (and derivatives) nor
>> Android :-\
>>
>
> I have trouble bringing myself to care.  The BSDs need to stop using 10
> year old compilers.  It can be made to work on Solaris and BSD if someone
> bothered to put a little work into it.  Besides, given that large chunks of
> GNOME are switching they're going to have to make it work some day soon
> anyway.
>
> Android is a bit unfortunate.  Mesa is one of the few projects that lets
> the Android people carry their build system in-tree and I would like to
> keep that going if it were practical.  Dylan and I have talked about this a
> decent bit and one potential solution is to see if the meson people would
> accept an Android back-end.  Then we would be down to a single build system
> (wouldn't that be nice).
>
> If there's s

[Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it. (v2)

From: Dave Airlie 

In order to get isinf(NaN) correct, at least radv can't
use an unordered equals, which feq has to be for us. This
passes isinf to the backend and lets it sort it out as it
pleases. This turns lowering on for i965 only, as it's the
only SPIR-V consumer other than radv.

v2: use lower_isinf and add algebraic lowering for it (Jason)
Signed-off-by: Dave Airlie 
---
 src/compiler/nir/nir.h                | 1 +
 src/compiler/nir/nir_opcodes.py       | 3 +++
 src/compiler/nir/nir_opt_algebraic.py | 3 +++
 src/compiler/spirv/vtn_alu.c          | 3 +--
 src/intel/compiler/brw_compiler.c     | 1 +
 5 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 57b8be3..a43ad24 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1777,6 +1777,7 @@ typedef struct nir_shader_compiler_options {
bool lower_bitfield_insert;
bool lower_uadd_carry;
bool lower_usub_borrow;
+   bool lower_isinf;
/** lowers fneg and ineg to fsub and isub. */
bool lower_negate;
/** lowers fsub and isub to fadd+fneg and iadd+ineg. */
diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index 52868d5..f6287f2 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -203,6 +203,9 @@ unop("fquantize2f16", tfloat, "(fabs(src0) < ldexpf(1.0, -14)) ? copysignf(0.0f,
 unop("fsin", tfloat, "bit_size == 64 ? sin(src0) : sinf(src0)")
 unop("fcos", tfloat, "bit_size == 64 ? cos(src0) : cosf(src0)")
 
+# isinf.
+
+unop_convert("isinf", tbool, tfloat, "isinf(src0)")
 
 # Partial derivatives.
 
diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index 49c1460..61bf551 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -305,6 +305,9 @@ optimizations = [
(('fabs', ('b2f', a)), ('b2f', a)),
(('iabs', ('b2i', a)), ('b2i', a)),
 
+   # isinf
+   (('isinf', a), ('feq', ('fabs', a), float('inf')), 'options->lower_isinf'),
+
# Packing and then unpacking does nothing
(('unpack_64_2x32_split_x', ('pack_64_2x32_split', a, b)), a),
(('unpack_64_2x32_split_y', ('pack_64_2x32_split', a, b)), b),
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 0738fe0..d3d51d1 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -447,8 +447,7 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpIsInf:
-  val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
-  nir_imm_float(&b->nb, INFINITY));
+  val->ssa->def = nir_isinf(&b->nb, src[0]);
   break;
 
case SpvOpFUnordEqual:
diff --git a/src/intel/compiler/brw_compiler.c b/src/intel/compiler/brw_compiler.c
index cd9473f..abdf53d 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -40,6 +40,7 @@
.lower_uadd_carry = true,  \
.lower_usub_borrow = true, \
.lower_fdiv = true,\
+   .lower_isinf = true,   \
.lower_flrp64 = true,  \
.native_integers = true,   \
.use_interpolated_input_intrinsics = true, \
-- 
2.9.3


[Mesa-dev] [PATCH 2/2] radv/ac: emit isinf using ordered equal (v2)

From: Dave Airlie 

This fixes:
dEQP-VK.glsl.builtin.function.common.isinf.*

v2: update to lower_isinf.

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 9a6b952..77e3a85 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1181,6 +1181,17 @@ static LLVMValueRef emit_ddxy(struct nir_to_llvm_context *ctx,
return result;
 }
 
+static LLVMValueRef emit_isinf(struct nir_to_llvm_context *ctx,
+  nir_alu_instr *instr,
+  LLVMValueRef src0)
+{
+   LLVMTypeRef def_type = get_def_type(ctx, &instr->dest.dest.ssa);
+   src0 = emit_intrin_1f_param(ctx, "llvm.fabs",
+  to_float_type(ctx, def_type), src0);
+   src0 = emit_float_cmp(ctx, LLVMRealOEQ, src0, LLVMConstReal(ctx->f32, INFINITY));
+   return src0;
+}
+
 /*
  * this takes an I,J coordinate pair,
  * and works out the X and Y derivatives.
@@ -1544,6 +1555,9 @@ static void visit_alu(struct nir_to_llvm_context *ctx, nir_alu_instr *instr)
case nir_op_fddy_coarse:
result = emit_ddxy(ctx, instr->op, src[0]);
break;
+   case nir_op_isinf:
+   result = emit_isinf(ctx, instr, src[0]);
+   break;
default:
fprintf(stderr, "Unknown NIR alu instr: ");
nir_print_instr(&instr->instr, stderr);
-- 
2.9.3


Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson


On March 16, 2017 5:41:24 PM Emil Velikov  wrote:

> On 17 March 2017 at 00:21, Dylan Baker  wrote:
>> Hi Emil,
>>
>> Quoting Emil Velikov (2017-03-16 16:35:33)
>>> While I can see you're impressed by Meson, I would kindly urge you to
>>> not use it here. As you look closely you can see that one could
>>> trivially improve the times, yet the biggest thing is that most of the
>>> code in libdrm must go ;-)
>>
>> Perhaps I wasn't clear enough, I don't really expect this to land ever.
>> I sent it out more because I'd written it and it works and is a useful
>> demonstration of meson+ninja performance. Obviously 20 seconds -> 5
>> seconds isn't a huge deal :); but in a larger project, consider that a
>> 4x speedup would be 4 minutes to 1 minute, and that is a huge difference
>> in time.
>
> You are still failing to see past your usecase. As said before - if
> there's any need to improve things say so.
> Note that you simply cannot apply the 1000x speedup in any situation.


Yes, you can't just linearly apply any scaling factor.  However, when you 
build mesa on a machine with a decent number of threads (I think our build 
machine for the CI system has 32 threads), autotools+make is slow as mud.  
Also, there's very little you can do to speed up configure since it's a 
pile of shell and perl that inherently runs single-threaded and is fairly 
complex due to mesa's complicated dependencies.



>>> As the port is not 1:1 wrt the autoconf one, the performance numbers
>>> above are comparing apples to oranges.
>>
>> I fail to see what I'm missing from meson that would have an effect on
>> the times I reported. There are some files that are installed by autoconf
>> that I didn't bother to install with meson (because I don't expect this
>> to land). Since I didn't time installs, I don't see how it isn't an
>> apples to apples comparison.
>
> You already (explicitly) mentioned some differences. Admittedly not a
> deal breaker.
>
>> I understand that libdrm is a pessimal case for recursive-make since most
>> sub folders contain < 5 C files, However, even if you were to flatten
>> the make files meson+ninja would still be faster when you consider that
>> meson configures and builds faster than autotools configures.
>
> That's correct. If is so concerned - they should slim down the
> configure.ac ;-)


There are real limits as to what you can do there.


>>> If you/others are unhappy with the build times of libdrm - poke me on
>>> IRC. I will give you some easy tips on how to improve those.
>>>
>>> You have some good python knowledge - I would kindly urge you to
>>> improve/rewrite the slow and/or hacky python scripts we have in mesa.
>>> This is a topic that was mentioned multiple times, and a part where
>>> everyone will be glad to see some progress.
>>>
>>> Thanks
>>> Emil
>>
>> The real goal here is to do mesa (in case I didn't make that clear
>> either), and the advantage for mesa is not just performance, it's that
>> meson supports visual studio on windows; which means that we could
>> hopefully not just get faster builds, but also replace both autotools
>> and scons with a single build system.
>
> Yes that was more than clear. Yet it won't fly, I'm afraid.
>
> The VMWare people like their SCons,


How much?  I would really rather the VMWare people speak on behalf of 
VMWare.  Meson is the single best shot we've ever had for replacing both 
with one build system.  I'm sure the VMware people would like to have a 
build system that gets maintained by the community as a whole.


> and Meson is not a thing on neither BSD(s), Solaris (and derivatives) nor
> Android :-\


I have trouble bringing myself to care.  The BSDs need to stop using 10 
year old compilers.  It can be made to work on Solaris and BSD if someone 
bothered to put a little work into it.  Besides, given that large chunks of 
GNOME are switching they're going to have to make it work some day soon anyway.


Android is a bit unfortunate.  Mesa is one of the few projects that lets 
the Android people carry their build system in-tree and I would like to 
keep that going if it were practical.  Dylan and I have talked about this a 
decent bit and one potential solution is to see if the meson people would 
accept an Android back-end.  Then we would be down to a single build system 
(wouldn't that be nice).



> If there's something "slow" say what/where and we can improve upon
> things. You seems to be rewriting $world because someone sold you that
> A is the holy grail.


I don't think that's fair.  No, Meson is not the holy grail but it is the 
closest anyone has yet been able to come to a viable autotools replacement.


Speed is only one aspect to this.  Unifying the Linux and windows builds is 
also a significant advantage.  Also, autotools is objectively terrible and 
having a build system that's modifiable by mere humans without the need for 
hours of poring over documentation only to find that you did it wrong 
anyway is a definite plus.




[Mesa-dev] [Bug 100223] marshal_generated.c:38:10: fatal error: 'X11/Xlib-xcb.h' file not found

2017-03-16 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=100223

Timothy Arceri  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Timothy Arceri  ---
Should be fixed by:

commit c81c563fbbb06b6f1dd06ed62f252ed28d45be5a
Author: Emil Velikov 
Date:   Thu Mar 16 11:05:23 2017 +

mapi: remove Xlib/xcb include in gl_marshal.py

The only use of the header is to provide the _X_INLINE macro. We already
require (and provide where needed) 'inline', plus it's used in the file
already.

So replace the macro and drop the include. This fixes the build on
platforms which lack the header - from X-less Linuxes to Androids.

Fixes: 05dd4a1104e ("glapi: Generate GL API marshalling code from the
XML.")
Reported-by: Vinson Lee 
Reviewed-by: Timothy Arceri 
Reviewed-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100223
Signed-off-by: Emil Velikov 

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

2017-03-16 Thread Marek Olšák

On Fri, Mar 17, 2017 at 12:11 AM, Dylan Baker  wrote:
> Quoting Marek Olšák (2017-03-16 15:36:26)
>> Is there a way not to use ninja with meson, because ninja redirects
>> all stderr output from gcc to stdout, which breaks many development
>> environments that expect errors in stderr?
>>
>> I'm basically saying that if ninja can't keep gcc errors in stderr, I
>> wouldn't like any project that I might be involved in to require ninja
>> for building.
>>
>> Marek
>
> There is no way to use another backend on Linux, and meson will not support
> Make. Ninja is a big part of the appeal here, since it is faster than make is.
> Are there particular tools you know don't work with ninja? It seems like in
> the 7+ years since ninja came out that someone would have fixed the tools, or
> that some stream redirection could be used to fix the problem, "ninja 1>&2"?

I actually read some thread about it and the conclusion seemed to be
that ninja developers don't care. I have no other option than to
believe that ninja was made for automated build bots, not for
development.

Some editors expect that errors and only errors go to stderr and all
other garbage info goes to stdout. This is something I can't change.

Marek

Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

> Another option would be to make this lower_isinf and add a quick lowering
> line to nir_opt_algebraic.  That's more idiomatic for nir.

If I do that though won't that mean I have to set lower_isinf for all
current NIR users?

Dave.

Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

On 17 March 2017 at 11:09, Jason Ekstrand  wrote:
> On March 16, 2017 5:04:37 PM Dave Airlie  wrote:
>
>> From: Dave Airlie 
>>
>> In order to get isinf(NaN) correct, at least radv can't
>> use an unordered equals which feq has to be for us, this
>> passes isinf to the backend and let's it sort it out as it
>> pleases.
>
>
> I think comparisons are something that we're going to need to sort out better
> in general.  SPIR-V's rules are stricter than GL (at least the way we
> interpret it).  Could you please be more specific about the issue?

IsInf(NaN) with an unordered compare appears to end up as true, when the
spec for isinf says it should be false.

Well, SPIR-V has the unordered and ordered stuff for OpenCL kernels; I'm
just not sure what we want in NIR in this area. If I default to using
ordered compares for NIR, isnan and funord fail, last I tried.

Dave.
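
The difference described above can be reproduced in plain C. This is only a sketch: feq_ordered() and feq_unordered() are hypothetical stand-ins for the two comparison flavours and are not NIR or LLVM functions. An ordered equality is false whenever either operand is NaN, while an unordered equality is true in that case, so an isinf built as "fabs(x) == INFINITY" only gives the right answer for NaN with the ordered form.

```c
#include <math.h>
#include <stdbool.h>

/* Ordered equal: false if either operand is NaN. */
static bool
feq_ordered(float a, float b)
{
   return !isnan(a) && !isnan(b) && a == b;
}

/* Unordered equal: true if either operand is NaN, or if they compare equal. */
static bool
feq_unordered(float a, float b)
{
   return isnan(a) || isnan(b) || a == b;
}

/* isinf lowered as feq(fabs(x), inf), using each comparison flavour. */
static bool
isinf_via_ordered(float x)
{
   return feq_ordered(fabsf(x), INFINITY);
}

static bool
isinf_via_unordered(float x)
{
   return feq_unordered(fabsf(x), INFINITY);
}
```

Both variants agree for finite and infinite inputs. They differ only for NaN, where the unordered variant wrongly reports an infinity.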

[Mesa-dev] [Bug 100236] Undefined symbols for architecture x86_64: "typeinfo for llvm::RTDyldMemoryManager"

2017-03-16 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=100236

--- Comment #3 from Michel Dänzer  ---
What does LLVM_CXXFLAGS contain in config.log before and after that commit?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

2017-03-16 Thread Francisco Jerez

Jan Vesely  writes:

> On Thu, 2017-03-16 at 17:22 -0700, Francisco Jerez wrote:
>> Jan Vesely  writes:
>> 
>> > On Thu, 2017-03-16 at 15:24 -0700, Francisco Jerez wrote:
>> > > Jan Vesely  writes:
>> > > 
>> > > > v2: buffers are created with one reference.
>> > > > v3: add pipe_resource reference to mapping object
>> > > > 
>> > > 
>> > > Mapping objects are supposed to be short-lived, they're logically part
>> > > of the parent resource object so they shouldn't ever out-live it.  What
>> > > is this useful for?
>> > 
>> > currently they can outlive the underlying pipe_resource. pipe_resource
>> > is destroyed in root_resource destructor, while the list of mappings is
>> > destroyed after resource destructor.
>> 
>> Right.  I guess the problem is that the pipe_transfer object associated
>> to the clover::mapping object holds a pointer to the backing
>> pipe_resource object but it fails to increment its reference count?  I
>> guess that's the reason why v2 didn't help?
>
> yes, though the pointer is hidden somewhere. I thought pxfer->resource
> might be it, but it's not, and digging deeper into the structure didn't
> sound like a good idea to me.
>

What is pxfer->resource about in that case?

>> 
>> > this is arguably an application bug. the piglit test does not call
>> > clUnmapMemObject(), but it'd be nice to not access freed memory.
>> > 
>> > Vedran's alternative to clear the list before destroying pipe_resource
>> > works as well (assert that the list is empty in resource destructor
>> > would help spot the issue).
>> > 
>> 
>> Assuming that pipe_transfers are supposed *not* to hold a reference to
>> the underlying pipe_resource, which implies that the caller must
>> guarantee it will never outlive its backing resource, it sounds like the
>> minimal solution would be to have clover::mapping make the same
>> assumptions.  You could probably achieve that in one line of code by
>> clearing the mapping list from the clover::resource destructor as you
>> suggest above.
>
> I'd say the interface would be nicer if pipe_transfers did hold a
> reference (or at least a mapping count to assert on), but I have no
> plans to go that route.
> the problem is a bit more complicated by the fact that pipe_resource is
> handled by root_resource, while the list of mappings is private to
> parent class resource.
>
> Vedran's patch is here:
> https://lists.freedesktop.org/archives/mesa-dev/2017-March/147092.html
>
> I thought that using references would be nicer, as it looked useful for
> device shared buffers, but that no longer applies.
>
> Jan
>
>> 
>> > Jan
>> > 
>> > > 
>> > > > CC: "17.0 13.0" 
>> > > > 
>> > > > Signed-off-by: Jan Vesely 
>> > > > ---
>> > > >  src/gallium/state_trackers/clover/core/resource.cpp | 11 ---
>> > > >  src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
>> > > >  2 files changed, 12 insertions(+), 6 deletions(-)
>> > > > 
>> > > > diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
>> > > > b/src/gallium/state_trackers/clover/core/resource.cpp
>> > > > index 06fd3f6..83e3c26 100644
>> > > > --- a/src/gallium/state_trackers/clover/core/resource.cpp
>> > > > +++ b/src/gallium/state_trackers/clover/core/resource.cpp
>> > > > @@ -25,6 +25,7 @@
>> > > >  #include "pipe/p_screen.h"
>> > > >  #include "util/u_sampler.h"
>> > > >  #include "util/u_format.h"
>> > > > +#include "util/u_inlines.h"
>> > > >  
>> > > >  using namespace clover;
>> > > >  
>> > > > @@ -176,7 +177,7 @@ root_resource::root_resource(clover::device &dev, 
>> > > > memory_obj &obj,
>> > > >  }
>> > > >  
>> > > >  root_resource::~root_resource() {
>> > > > -   device().pipe->resource_destroy(device().pipe, pipe);
>> > > > +   pipe_resource_reference(&this->pipe, NULL);
>> > > >  }
>> > > >  
>> > > >  sub_resource::sub_resource(resource &r, const vector &offset) :
>> > > > @@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource &r,
>> > > >pxfer = NULL;
>> > > >throw error(CL_OUT_OF_RESOURCES);
>> > > > }
>> > > > +   pipe_resource_reference(&res, r.pipe);
>> > > >  }
>> > > >  
>> > > >  mapping::mapping(mapping &&m) :
>> > > > -   pctx(m.pctx), pxfer(m.pxfer), p(m.p) {
>> > > > +   pctx(m.pctx), pxfer(m.pxfer), res(m.res), p(m.p) {
>> > > > m.pctx = NULL;
>> > > > m.pxfer = NULL;
>> > > > +   m.res = NULL;
>> > > > m.p = NULL;
>> > > >  }
>> > > >  
>> > > >  mapping::~mapping() {
>> > > > if (pxfer) {
>> > > >pctx->transfer_unmap(pctx, pxfer);
>> > > > }
>> > > > +   pipe_resource_reference(&res, NULL);
>> > > >  }
>> > > >  
>> > > > @@ -222,5 +226,6 @@ mapping::operator=(mapping m) {
>> > > > std::swap(pctx, m.pctx);
>> > > > std::swap(pxfer, m.pxfer);
>> > > > +   std::swap(res, m.res);
>> > > > std::swap(p, m.p);
>> > > > return *this;
>> > > >  }
>> > > > diff --git a/src/gallium/state_trackers/clover/core/resource.hpp 
>> > > > b/src/gallium/state_trackers/clover/core/resource.hpp
>> > > > index 9993dcb..c
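
The lifetime problem discussed in this thread (a mapping outliving the resource it points into) can be illustrated with a simplified reference-counting sketch. It only mirrors the "pass NULL to drop a reference" idiom that appears in the quoted patch. struct resource, resource_create() and resource_reference() are hypothetical names for illustration and are not the Gallium API.

```c
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a backing resource. */
struct resource {
   int refcount;
};

/* Counts destroyed resources so the lifetime can be observed. */
static int destroyed_count;

/* New resources start with one reference, owned by the creator. */
static struct resource *
resource_create(void)
{
   struct resource *res = malloc(sizeof(*res));
   if (res)
      res->refcount = 1;
   return res;
}

/* Make *dst point at src, taking a reference on src and dropping the
 * reference previously held through *dst.  Passing src == NULL simply
 * releases the reference; the object is freed when the count hits zero. */
static void
resource_reference(struct resource **dst, struct resource *src)
{
   struct resource *old = *dst;

   if (old == src)
      return;
   if (src)
      src->refcount++;
   if (old && --old->refcount == 0) {
      free(old);
      destroyed_count++;
   }
   *dst = src;
}
```

With this scheme a mapping that holds its own reference keeps the resource alive even if the owner releases it first. The alternative raised in the thread, clearing the mapping list before destroying the resource, avoids the extra reference by enforcing the destruction order instead.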

Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.


On March 16, 2017 5:04:37 PM Dave Airlie  wrote:


> From: Dave Airlie 
>
> In order to get isinf(NaN) correct, at least radv can't
> use an unordered equals which feq has to be for us, this
> passes isinf to the backend and let's it sort it out as it
> pleases.


I think comparisons are something that we're going to need to sort out 
better in general.  SPIR-V's rules are stricter than GL (at least the way 
we interpret it).  Could you please be more specific about the issue?



> Signed-off-by: Dave Airlie 
> ---
>  src/compiler/nir/nir.h          | 1 +
>  src/compiler/nir/nir_opcodes.py | 2 +-
>  src/compiler/spirv/vtn_alu.c    | 7 +--
>  3 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 57b8be3..bcdca4b 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -1777,6 +1777,7 @@ typedef struct nir_shader_compiler_options {
> bool lower_bitfield_insert;
> bool lower_uadd_carry;
> bool lower_usub_borrow;
> +   bool use_isinf;


Another option would be to make this lower_isinf and add a quick lowering 
line to nir_opt_algebraic.  That's more idiomatic for nir.



> /** lowers fneg and ineg to fsub and isub. */
> bool lower_negate;
> /** lowers fsub and isub to fadd+fneg and iadd+ineg. */
> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
> index 52868d5..7387208 100644
> --- a/src/compiler/nir/nir_opcodes.py
> +++ b/src/compiler/nir/nir_opcodes.py
> @@ -203,7 +203,7 @@ unop("fquantize2f16", tfloat, "(fabs(src0) < ldexpf(1.0, -14)) ? copysignf(0.0f,
>
>  unop("fsin", tfloat, "bit_size == 64 ? sin(src0) : sinf(src0)")
>  unop("fcos", tfloat, "bit_size == 64 ? cos(src0) : cosf(src0)")
>
> -
> +unop_convert("isinf", tbool, tfloat, "isinf(src0)")


Please keep the space before the comment below.


>  # Partial derivatives.
>
>
> diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
> index 0738fe0..79f51e7 100644
> --- a/src/compiler/spirv/vtn_alu.c
> +++ b/src/compiler/spirv/vtn_alu.c
> @@ -447,8 +447,11 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
>    break;
>
> case SpvOpIsInf:
> -  val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
> -  nir_imm_float(&b->nb, INFINITY));
> +  if (b->shader->options->use_isinf)
> + val->ssa->def = nir_isinf(&b->nb, src[0]);
> +  else
> + val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
> + nir_imm_float(&b->nb, INFINITY));
>    break;
>
> case SpvOpFUnordEqual:
> --
> 2.9.3





[Mesa-dev] [PATCH] i965: bounds checks while concatenating sysfs paths

2017-03-16 Thread Robert Bragg

This adds some missing return value checks for all uses of snprintf in
brw_performance_query.c. This also switches a use of strncpy + strncat
for snprintf for consistency and to avoid the chance of the strncpy
leaving an unterminated string in the dest buffer if the src is too
long.

This issue with strncpy was picked up by Coverity.

CID: 1402201
Signed-off-by: Robert Bragg 
---
 src/mesa/drivers/dri/i965/brw_performance_query.c | 43 +--
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_performance_query.c b/src/mesa/drivers/dri/i965/brw_performance_query.c
index 4052117ea5..2e04e091d2 100644
--- a/src/mesa/drivers/dri/i965/brw_performance_query.c
+++ b/src/mesa/drivers/dri/i965/brw_performance_query.c
@@ -1508,9 +1508,13 @@ enumerate_sysfs_metrics(struct brw_context *brw, const char *sysfs_dev_dir)
char buf[256];
DIR *metricsdir = NULL;
struct dirent *metric_entry;
+   int len;
 
-   strncpy(buf, sysfs_dev_dir, sizeof(buf));
-   strncat(buf, "/metrics", sizeof(buf));
+   len = snprintf(buf, sizeof(buf), "%s/metrics", sysfs_dev_dir);
+   if (len < 0 || len >= sizeof(buf)) {
+  DBG("Failed to concatenate path to sysfs metrics/ directory\n");
+  return;
+   }
 
metricsdir = opendir(buf);
if (!metricsdir) {
@@ -1533,8 +1537,12 @@ enumerate_sysfs_metrics(struct brw_context *brw, const char *sysfs_dev_dir)
  struct brw_perf_query_info *query;
  uint64_t id;
 
- snprintf(buf, sizeof(buf), "%s/metrics/%s/id",
-  sysfs_dev_dir, metric_entry->d_name);
+ len = snprintf(buf, sizeof(buf), "%s/metrics/%s/id",
+sysfs_dev_dir, metric_entry->d_name);
+ if (len < 0 || len >= sizeof(buf)) {
+DBG("Failed to concatenate path to sysfs metric id file\n");
+continue;
+ }
 
  if (!read_file_uint64(buf, &id)) {
 DBG("Failed to read metric set id from %s: %m", buf);
@@ -1561,8 +1569,13 @@ read_sysfs_drm_device_file_uint64(struct brw_context *brw,
   uint64_t *value)
 {
char buf[512];
+   int len;
 
-   snprintf(buf, sizeof(buf), "%s/%s", sysfs_dev_dir, file);
+   len = snprintf(buf, sizeof(buf), "%s/%s", sysfs_dev_dir, file);
+   if (len < 0 || len >= sizeof(buf)) {
+  DBG("Failed to concatenate sys filename to read u64 from\n");
+  return false;
+   }
 
return read_file_uint64(buf, value);
 }
@@ -1620,6 +1633,7 @@ get_sysfs_dev_dir(struct brw_context *brw,
int min, maj;
DIR *drmdir;
struct dirent *drm_entry;
+   int len;
 
assert(path_buf);
assert(path_buf_len);
@@ -1638,8 +1652,12 @@ get_sysfs_dev_dir(struct brw_context *brw,
   return false;
}
 
-   snprintf(path_buf, path_buf_len,
-"/sys/dev/char/%d:%d/device/drm", maj, min);
+   len = snprintf(path_buf, path_buf_len,
+  "/sys/dev/char/%d:%d/device/drm", maj, min);
+   if (len < 0 || len >= path_buf_len) {
+  DBG("Failed to concatenate sysfs path to drm device\n");
+  return false;
+   }
 
drmdir = opendir(path_buf);
if (!drmdir) {
@@ -1652,11 +1670,14 @@ get_sysfs_dev_dir(struct brw_context *brw,
drm_entry->d_type == DT_LNK) &&
   strncmp(drm_entry->d_name, "card", 4) == 0)
   {
- snprintf(path_buf, path_buf_len,
-  "/sys/dev/char/%d:%d/device/drm/%s",
-  maj, min, drm_entry->d_name);
+ len = snprintf(path_buf, path_buf_len,
+"/sys/dev/char/%d:%d/device/drm/%s",
+maj, min, drm_entry->d_name);
  closedir(drmdir);
- return true;
+ if (len < 0 || len >= path_buf_len)
+return false;
+ else
+return true;
   }
}
 
-- 
2.12.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
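
The truncation check used throughout the patch above can be sketched in isolation. join_path() is a hypothetical helper written for illustration and does not appear in brw_performance_query.c. It relies only on the standard snprintf contract: a negative return value signals an output error, and a return value greater than or equal to the buffer size means the result was truncated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "dir/name" into buf.
 * Returns false on an output error or if the result did not fit. */
static bool
join_path(char *buf, size_t buf_len, const char *dir, const char *name)
{
   int len = snprintf(buf, buf_len, "%s/%s", dir, name);

   /* len excludes the terminating NUL, so len == buf_len is already
    * a truncation.  The cast is safe because len >= 0 is checked first. */
   return len >= 0 && (size_t)len < buf_len;
}
```

Checking the sign before casting avoids the mixed signed/unsigned comparison that arises when an int result is compared directly against sizeof(buf).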

Re: [Mesa-dev] MESA and KOTOR

2017-03-16 Thread Federico Dossena

I managed to fix the patch and apply it to mesa master, but I'm getting
the same result as with my stub. The crash is still the same, in
glu32.dll, I wonder if the GLU that you guys have in your repo will work
any better. I tried to crosscompile it but without luck, any instructions?

Still, I want to thank all of you for helping me out on this one. I've
been fixing old games for years but this one has always been my nemesis.

On 2017-03-16 19:04, Brian Paul wrote:

Patch for implementing WGL_ARB_make_current_read attached. I can’t test it at
the moment since I’m not near my Windows development environment. Let me know
what you find.

-Brian

On Mar 15, 2017, at 12:26 PM, Federico Dossena wrote:

That's good, can't wait to see your implementation.

I have tried to simply return wglMakeCurrent(hReadDC,hglrc); but then I get a
crash in gluBuild2DMipmaps (not mesa, glu32.dll). According to the
specification, it should work, or at least draw some glitches.
Looking at the parameters passed by the game to wglMakeContextCurrentARB, I see
that hReadDC and hDrawDC are the same so I guess they intended to use it as a
replacement for wglMakeCurrent, but still, it's not working. So, does
wglMakeContextCurrentARB do something else in addition to that?

On 2017-03-15 15:50, Jose Fonseca wrote:

VMware maintains a Windows OpenGL driver based off Mesa source.

We typically open source most of our modifications, but these haven't been yet
open sourced. No particular reason I believe. We've been just busy with other
stuff.

The simplest shim would be to invoke wglMakeCurrent from
wglMakeContextCurrentARB, ignoring the extra arg.

Jose

On 15/03/17 14:31, Federico Dossena wrote:

Where can I find that implementation?

Also, is there an alternative to that function? As in a snippet of code
that does the same thing and can be used to create a "shim"?
It's so old, I can barely find documentation about it...

On March 15, 2017 2:42:35 PM GMT+01:00, Jose Fonseca
wrote:

It looks like wglMakeContextCurrentARB too has been implemented
internally but not yet crossported.

It's far from trivial (especially because Microsoft ICD interface never
was designed to allow implementations to provide alternative
implementations of functions like wglMakeCurrent or wglCreateContext)
though in the way you're using it, it's less important, as the original
opengl32.dll is never used.

I don't know how much effort / time it takes to crossport this and other
outstanding patches to master, but my guess is that it would be more
effective to wait a bit.

Jose

On 15/03/17 06:35, Federico Dossena wrote:

I have created a simple stub for wglMakeContextCurrentARB in stw_wgl.c
and stw_getprocaddress.c. It simply returns TRUE, but the good thing is
that now the game no longer crashes because the function is missing!
However I get a divide by zero in glu32.dll, presumably because the stub
doesn't do jack.
I tried returning FALSE but the game has no fallback, it just ignores
the return values and assumes that everything is fine.

From what I've seen, there is no need to override the system's
opengl32.dll like you did for wglCreateContext/wglDeleteContext, so it
shouldn't be too tricky to implement the function. However, I can't seem
to find any real documentation about what it's supposed to do. I found
this at
https://www.khronos.org/registry/OpenGL/extensions/ARB/WGL_ARB_make_current_read.txt
but it's pretty vague:

The function wglMakeContextCurrentARB associates the context <hglrc>
with the device <hDrawDC> for draws and the device <hReadDC> for
reads. All subsequent OpenGL calls made by the calling thread are
drawn on the device identified by <hDrawDC> and read on the device
identified by <hReadDC>.

How do I do that? Do I have to copy the frame buffer? Or just the
pointer? Or am I completely off road?

Thanks for helping me out ;)

On 2017-03-14 03:44, Brian Paul wrote:

Looks like my KOTOR patch never made it to master. I'm attaching it
below so you can try it. I should commit it to master. In any case, let
me know if it helps.

-Brian

On 03/13/2017 10:55 AM, Federico Dossena wrote:

Hi Jose, thanks for replying, I've seen your name inside many files in
mesa ;)

I have tried mesa master (previously I was using 17.0.1) but it still
crashes for the same null pointer.

Re: [Mesa-dev] [PATCH] configure.ac: Use POSIX word boundary regex.

2017-03-16 Thread Jan Beich

Vinson Lee  writes:

> -# Use \> (marks the end of the word)
> +# Use [[:>:]] (marks the end of the word)

[[:>:]] is "an extension, compatible with but not specified by POSIX 1003.2".
GNU libc doesn't support it.

$ echo 'foot foo bar' | sed -E 's/foo[[:>:]]//g'
sed: -e expression #1, char 15: Invalid character class name

$ sed --version | sed 1q
sed (GNU sed) 4.4

$ ldd --version
ldd (Debian GLIBC 2.24-9) 2.24

>  echo " `$1`" | sed -E \
>  -e 's/[[[:space:]]]+-m[[^[:space:]]]*//g' \
> --e 's/[[[:space:]]]+-DNDEBUG\>//g' \
> +-e 's/[[[:space:]]]+-DNDEBUG[[[:>:]]]//g' \

Try matching some whitespace after the word as a workaround e.g.,

  -e 's/[[[:space:]]]+-DNDEBUG($|[[[:space:]]])/\1/g'
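
As a rough illustration of the workaround above, here is the same extended regular expression exercised through POSIX regcomp/regexec. In configure.ac the brackets are doubled because m4 treats them as quote characters, so the expression sed actually sees is the one below. The helper name has_ndebug_flag is made up for this sketch; only the pattern comes from the message.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `flags` contains -DNDEBUG as a complete word, i.e.
 * preceded by whitespace and followed by whitespace or end of string.
 * This replaces the non-portable word-boundary operators (\> and
 * [[:>:]]) with an explicit match on what follows the word. */
static bool
has_ndebug_flag(const char *flags)
{
   regex_t re;
   bool found;

   if (regcomp(&re, "[[:space:]]+-DNDEBUG($|[[:space:]])",
               REG_EXTENDED | REG_NOSUB) != 0)
      return false;

   found = regexec(&re, flags, 0, NULL, 0) == 0;
   regfree(&re);
   return found;
}
```

In the sed form, the `\1` in the replacement puts the trailing whitespace (or nothing, at end of line) back, so the following flag stays separated from the previous one.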

Re: [Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

On Thu, 2017-03-16 at 17:22 -0700, Francisco Jerez wrote:
> Jan Vesely  writes:
> 
> > On Thu, 2017-03-16 at 15:24 -0700, Francisco Jerez wrote:
> > > Jan Vesely  writes:
> > > 
> > > > v2: buffers are created with one reference.
> > > > v3: add pipe_resource reference to mapping object
> > > > 
> > > 
> > > Mapping objects are supposed to be short-lived, they're logically part
> > > of the parent resource object so they shouldn't ever out-live it.  What
> > > is this useful for?
> > 
> > currently they can outlive the underlying pipe_resource. pipe_resource
> > is destroyed in root_resource destructor, while the list of mappings is
> > destroyed after resource destructor.
> 
> Right.  I guess the problem is that the pipe_transfer object associated
> to the clover::mapping object holds a pointer to the backing
> pipe_resource object but it fails to increment its reference count?  I
> guess that's the reason why v2 didn't help?

yes, though the pointer is hidden somewhere. I thought pxfer->resource
might be it, but it's not, and digging deeper into the structure didn't
sound like a good idea to me.

> 
> > this is arguably an application bug. the piglit test does not call
> > clUnmapMemObject(), but it'd be nice to not access freed memory.
> > 
> > Vedran's alternative to clear the list before destroying pipe_resource
> > works as well (assert that the list is empty in resource destructor
> > would help spot the issue).
> > 
> 
> Assuming that pipe_transfers are supposed *not* to hold a reference to
> the underlying pipe_resource, which implies that the caller must
> guarantee it will never outlive its backing resource, it sounds like the
> minimal solution would be to have clover::mapping make the same
> assumptions.  You could probably achieve that in one line of code by
> clearing the mapping list from the clover::resource destructor as you
> suggest above.

I'd say the interface would be nicer if pipe_transfers did hold a
reference (or at least a mapping count to assert on), but I have no
plans to go that route.
the problem is a bit more complicated by the fact that pipe_resource is
handled by root_resource, while the list of mappings is private to
parent class resource.

Vedran's patch is here:
https://lists.freedesktop.org/archives/mesa-dev/2017-March/147092.html

I thought that using references would be nicer, as it looked useful for
device shared buffers, but that no longer applies.

Jan
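
For readers unfamiliar with the idiom the v3 patch relies on, a minimal sketch of pointer-style reference counting follows. It is a simplified stand-in written for this example: res_sketch, res_create, res_reference and live_resources are hypothetical names, and the real pipe_resource_reference helper in Gallium differs in detail (atomics, driver destroy callback).

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a reference-counted resource. */
struct res_sketch {
   int refcount;
};

static int live_resources;   /* bookkeeping for the example only */

static struct res_sketch *
res_create(void)
{
   struct res_sketch *r = calloc(1, sizeof(*r));
   if (r) {
      r->refcount = 1;       /* created with one reference, as in v2 */
      live_resources++;
   }
   return r;
}

/* Point *dst at src: take a reference on src, drop the reference held
 * through *dst, and free the old object once its count reaches zero. */
static void
res_reference(struct res_sketch **dst, struct res_sketch *src)
{
   struct res_sketch *old = *dst;

   if (old == src)
      return;
   if (src)
      src->refcount++;
   if (old && --old->refcount == 0) {
      free(old);
      live_resources--;
   }
   *dst = src;
}
```

With this pattern a mapping that holds its own reference keeps the resource alive after the owner drops its reference, which is the behaviour v3 was aiming for. The alternative discussed above avoids the extra reference by destroying mappings first.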

> 
> > Jan
> > 
> > > 
> > > > CC: "17.0 13.0" 
> > > > 
> > > > Signed-off-by: Jan Vesely 
> > > > ---
> > > >  src/gallium/state_trackers/clover/core/resource.cpp | 11 ---
> > > >  src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
> > > >  2 files changed, 12 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
> > > > b/src/gallium/state_trackers/clover/core/resource.cpp
> > > > index 06fd3f6..83e3c26 100644
> > > > --- a/src/gallium/state_trackers/clover/core/resource.cpp
> > > > +++ b/src/gallium/state_trackers/clover/core/resource.cpp
> > > > @@ -25,6 +25,7 @@
> > > >  #include "pipe/p_screen.h"
> > > >  #include "util/u_sampler.h"
> > > >  #include "util/u_format.h"
> > > > +#include "util/u_inlines.h"
> > > >  
> > > >  using namespace clover;
> > > >  
> > > > @@ -176,7 +177,7 @@ root_resource::root_resource(clover::device &dev, 
> > > > memory_obj &obj,
> > > >  }
> > > >  
> > > >  root_resource::~root_resource() {
> > > > -   device().pipe->resource_destroy(device().pipe, pipe);
> > > > +   pipe_resource_reference(&this->pipe, NULL);
> > > >  }
> > > >  
> > > >  sub_resource::sub_resource(resource &r, const vector &offset) :
> > > > @@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource &r,
> > > >pxfer = NULL;
> > > >throw error(CL_OUT_OF_RESOURCES);
> > > > }
> > > > +   pipe_resource_reference(&res, r.pipe);
> > > >  }
> > > >  
> > > >  mapping::mapping(mapping &&m) :
> > > > -   pctx(m.pctx), pxfer(m.pxfer), p(m.p) {
> > > > +   pctx(m.pctx), pxfer(m.pxfer), res(m.res), p(m.p) {
> > > > m.pctx = NULL;
> > > > m.pxfer = NULL;
> > > > +   m.res = NULL;
> > > > m.p = NULL;
> > > >  }
> > > >  
> > > >  mapping::~mapping() {
> > > > if (pxfer) {
> > > >pctx->transfer_unmap(pctx, pxfer);
> > > > }
> > > > +   pipe_resource_reference(&res, NULL);
> > > >  }
> > > >  
> > > > @@ -222,5 +226,6 @@ mapping::operator=(mapping m) {
> > > > std::swap(pctx, m.pctx);
> > > > std::swap(pxfer, m.pxfer);
> > > > +   std::swap(res, m.res);
> > > > std::swap(p, m.p);
> > > > return *this;
> > > >  }
> > > > diff --git a/src/gallium/state_trackers/clover/core/resource.hpp 
> > > > b/src/gallium/state_trackers/clover/core/resource.hpp
> > > > index 9993dcb..cea9617 100644
> > > > --- a/src/gallium/state_trackers/clover/core/resource.hpp
> > > > +++ b/src/gallium/state_trackers/clover/core/resource.hpp
> > > > @@ -123,9 +123,10 @@ namespace clover {
> > > >

Re: [Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

2017-03-16 Thread Bas Nieuwenhuizen

On Fri, Mar 17, 2017 at 1:04 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> In order to get isinf(NaN) correct, at least radv can't
> use an unordered equals which feq has to be for us, this

Why do we have to use an unordered equal normally? SPIR-V has both
ordered and unordered compares.  I can't find anything in the vulkan
SPIR-V environment that relaxes that.
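
To make the ordered/unordered distinction concrete, here is a small sketch in plain C. The function names are invented for the example and this is not NIR or radv code. An ordered compare is false whenever an operand is NaN, while an unordered compare is true in that case, so an isinf built on an unordered equal wrongly reports NaN as infinite.

```c
#include <math.h>
#include <stdbool.h>

/* isinf built from an ORDERED equal: false if either operand is NaN.
 * C's == on floats is an ordered comparison. */
static bool
isinf_via_ordered_eq(float x)
{
   return fabsf(x) == INFINITY;
}

/* isinf built from an UNORDERED equal: true if the operands are equal
 * OR if either one is NaN. */
static bool
isinf_via_unordered_eq(float x)
{
   float a = fabsf(x);
   return isunordered(a, INFINITY) || a == INFINITY;
}
```

Both versions agree on finite values and on +/-infinity; they differ only for NaN input.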

> passes isinf to the backend and let's it sort it out as it
> pleases.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/compiler/nir/nir.h  | 1 +
>  src/compiler/nir/nir_opcodes.py | 2 +-
>  src/compiler/spirv/vtn_alu.c| 7 +--
>  3 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 57b8be3..bcdca4b 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -1777,6 +1777,7 @@ typedef struct nir_shader_compiler_options {
> bool lower_bitfield_insert;
> bool lower_uadd_carry;
> bool lower_usub_borrow;
> +   bool use_isinf;
> /** lowers fneg and ineg to fsub and isub. */
> bool lower_negate;
> /** lowers fsub and isub to fadd+fneg and iadd+ineg. */
> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
> index 52868d5..7387208 100644
> --- a/src/compiler/nir/nir_opcodes.py
> +++ b/src/compiler/nir/nir_opcodes.py
> @@ -203,7 +203,7 @@ unop("fquantize2f16", tfloat, "(fabs(src0) < ldexpf(1.0, 
> -14)) ? copysignf(0.0f,
>  unop("fsin", tfloat, "bit_size == 64 ? sin(src0) : sinf(src0)")
>  unop("fcos", tfloat, "bit_size == 64 ? cos(src0) : cosf(src0)")
>
> -
> +unop_convert("isinf", tbool, tfloat, "isinf(src0)")
>  # Partial derivatives.
>
>
> diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
> index 0738fe0..79f51e7 100644
> --- a/src/compiler/spirv/vtn_alu.c
> +++ b/src/compiler/spirv/vtn_alu.c
> @@ -447,8 +447,11 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
>break;
>
> case SpvOpIsInf:
> -  val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
> -  nir_imm_float(&b->nb, INFINITY));
> +  if (b->shader->options->use_isinf)
> + val->ssa->def = nir_isinf(&b->nb, src[0]);
> +  else
> + val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
> + nir_imm_float(&b->nb, INFINITY));
>break;
>
> case SpvOpFUnordEqual:
> --
> 2.9.3
>

Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

2017-03-16 Thread Emil Velikov

On 17 March 2017 at 00:21, Dylan Baker  wrote:
> Hi Emil,
>
> Quoting Emil Velikov (2017-03-16 16:35:33)
>> While I can see you're impressed by Meson, I would kindly urge you to
>> not use it here. As you look closely you can see that one could
>> trivially improve the times, yet the biggest thing is that most of the
>> code in libdrm must go ;-)
>
> Perhaps I wasn't clear enough, I don't really expect this to land ever. I sent
> it out more because I'd written it and it works and is a useful demonstration
> of meson+ninja performance. Obviously 20 seconds -> 5 seconds isn't a huge
> deal :); but in a larger project, consider that a 4x speedup would be 4
> minutes to 1 minute, and that is a huge difference in time.
>
You are still failing to see past your usecase. As said before - if
there's any need to improve things say so.
Note that you simply cannot apply the 1000x speedup in any situation.

>>
>> As the port is not 1:1 wrt the autoconf one, the performance numbers
>> above are comparing apples to oranges.
>
> I fail to see what I'm missing from meson that would have an effect on the
> times I reported. There are some files that are installed by autoconf that I
> didn't bother to install with meson (because I don't expect this to land).
> Since I didn't time installs, I don't see how it isn't an apples to apples
> comparison.
>
You already (explicitly) mentioned some differences. Admittedly not a
deal breaker.

> I understand that libdrm is a pessimal case for recursive-make since most
> sub folders contain < 5 C files, However, even if you were to flatten the make
> files meson+ninja would still be faster when you consider that meson
> configures and builds faster than autotools configures.
>
That's correct. If anyone is that concerned, they should slim down configure.ac ;-)

>> If you/others are unhappy with the build times of libdrm - poke me on
>> IRC. I will give you some easy tips on how to improve those.
>>
>> You have some good python knowledge - I would kindly urge you to
>> improve/rewrite the slow and/or hacky python scripts we have in mesa.
>> This is a topic that was mentioned multiple times, and a part where
>> everyone will be glad to see some progress.
>>
>> Thanks
>> Emil
>
> The real goal here is to do mesa (in case I didn't make that clear either),
> and the advantage for mesa is not just performance, it's that meson supports
> visual studio on windows; which means that we could hopefully not just get
> faster builds, but also replace both autotools and scons with a single build
> system.
>
Yes that was more than clear. Yet it won't fly, I'm afraid.

The VMware people like their SCons, and Meson is not a thing on the
BSDs, Solaris (and derivatives) or Android :-\
If there's something "slow", say what/where and we can improve
things. You seem to be rewriting $world because someone sold you that
A is the holy grail.

I'll repeat my earlier request - your python skills/knowledge will be
greatly appreciated in existing parts of Mesa.
Speaking of which - your last work doesn't seem to have landed. What's
blocking it?

Thanks
Emil

Re: [Mesa-dev] [PATCH v2 24/25] intel/vulkan: Get rid of recursive make

2017-03-16 Thread Grazvydas Ignotas

On Thu, Mar 9, 2017 at 9:07 PM, Emil Velikov  wrote:
> From: Jason Ekstrand 
>
> v2 [Emil Velikov]
>  - Various fixes and initial stab at the Android build.
>  - Keep the generation rules/EXTRA_DIST outside the conditional

This has broken anv build for me, because I don't have vulkan.h
anywhere in my system except mesa:
make[4]: Entering directory '/home/notaz/src/radeon/mesa/src/intel'
  CC   vulkan/vulkan_libvulkan_intel_la-anv_gem.lo
In file included from vulkan/anv_private.h:66:0,
 from vulkan/anv_gem.c:31:
/opt/xorg/include/vulkan/vulkan_intel.h:27:20: fatal error: vulkan.h:
No such file or directory
compilation terminated.

I hope that wasn't intentional?

Gražvydas

Re: [Mesa-dev] [PATCH] i965: Select pipeline and emit state base address in Gen8+ HiZ ops.

On Wednesday, March 8, 2017 10:27:20 AM PDT Nanley Chery wrote:
> On Wed, Mar 08, 2017 at 10:07:12AM -0800, Nanley Chery wrote:
> > On Wed, Mar 08, 2017 at 02:17:59AM -0800, Kenneth Graunke wrote:
> > > On Thursday, March 2, 2017 4:36:08 PM PST Nanley Chery wrote:
> > > > On Mon, Feb 06, 2017 at 03:55:49PM -0800, Kenneth Graunke wrote:
> > > > > If a HiZ op is the first thing in the batch, we should make sure
> > > > > to select the render pipeline and emit state base address before
> > > > > proceeding.
> > > > > 
> > > > > I believe 3DSTATE_WM_HZ_OP creates 3DPRIMITIVEs internally, and
> > > > > dispatching those on the GPGPU pipeline seems a bit sketchy.  I'm
> > > > 
> > > > Yes, it does seem like we currently allow HZ_OPs within a GPGPU
> > > > pipeline. This patch should fix that problem.
> > > > 
> > > > > not actually sure that STATE_BASE_ADDRESS is necessary, as the
> > > > > depth related commands use graphics addresses, not ones relative
> > > > > to the base address...but we're likely to set it as part of the
> > > > > next operation anyway, so we should just do it right away.
> > > > > 
> > > > 
> > > > I agree, re-emitting STATE_BASE_ADDRESS doesn't seem necessary. I think
> > > > we should drop this part of the patch and add it back in later if we get
> > > > some data that it's necessary. Leaving it there may be distracting to
> > > > some readers and the BDW PRM warns that it's an expensive command:
> > > > 
> > > > Execution of this command causes a full pipeline flush, thus its
> > > > use should be minimized for higher performance.
> > > 
> > > I think it should be basically free, actually.  We track a boolean,
> > > brw->batch.state_base_address_emitted, to avoid emitting it multiple
> > > times per batch.
> > > 
> > > Let's say the first thing in a fresh batch is a HiZ op, followed by
> > > normal drawing.  Previously, we'd do:
> > > 
> > > 1. HiZ op commands
> > > 2. STATE_BASE_ADDRESS (triggered by normal rendering upload)
> > > 3. rest of normal drawing commands
> > > 
> > > Now we'd do:
> > > 
> > > 1. STATE_BASE_ADDRESS (triggered by HiZ op)
> > > 2. HiZ op commands
> > > 3. normal drawing commands (second SBA is skipped)
> > > 
> > > In other words...we're just moving it a bit earlier.  I suppose there
> > > could be a batch containing only HiZ ops, at which point we'd pay for
> > > a single STATE_BASE_ADDRESS...but that seems really unlikely.
> > > 
> > 
> > Sorry for not stating it up front, but the special case you've mentioned
> > is exactly what I'd like not to hurt unnecessarily.
> > 

Why?  We really think there are going to be batches with only
3DSTATE_WM_HZ_OP and no normal rendering or BLORP?  It sounds
really hypothetical to me.

> Correct me if I'm wrong, but after thinking about it some more, it seems
> that performance wouldn't suffer by emitting the SBA since the pipeline
> was already flushed at the end of the preceding batch. It may also
> improve since the pipelined HiZ op will likely be followed by other
> pipelined commands. I'm not totally confident in my understanding on
> pipeline flushes by the way. Is this why you'd like to emit the SBA here?
> I think it's fine to leave it if we expound on the rationale.

Performance is not a motivation for this patch.  Having the GPU do
work without a pipeline selected or state base addresses in place seems
potentially dangerous.  I was hoping it would help with GPU hangs.  I'm
not certain that it does, and it might be safe to skip this, but it
seems like a lot of mental gymnastics to prove that it's safe for very
little upside.

I think you're right, though - doing the non-pipelined commands at the
top may actually be better than kicking off work, stalling, and kicking
off more work.  *shrug*
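
The "emit at most once per batch" behaviour discussed above can be sketched as follows. Only the flag name state_base_address_emitted comes from the thread; the struct, the counter and the function names are placeholders invented for this example and do not match the i965 sources.

```c
#include <stdbool.h>

/* Placeholder for the per-batch state; only the flag is from the thread. */
struct batch_sketch {
   bool state_base_address_emitted;
   unsigned sba_packets;            /* counts emitted STATE_BASE_ADDRESS */
};

/* Emit STATE_BASE_ADDRESS unless this batch already contains one. */
static void
upload_state_base_address(struct batch_sketch *batch)
{
   if (batch->state_base_address_emitted)
      return;
   batch->sba_packets++;            /* stands in for writing the packet */
   batch->state_base_address_emitted = true;
}

/* With the patch, a HiZ op makes sure the address is set up first... */
static void
hiz_op(struct batch_sketch *batch)
{
   upload_state_base_address(batch);
   /* ... HiZ commands would follow here ... */
}

/* ...so the later draw finds the flag set and skips the second emit. */
static void
normal_draw(struct batch_sketch *batch)
{
   upload_state_base_address(batch);
   /* ... drawing commands would follow here ... */
}
```

Under these assumptions the packet count per batch stays at one whether the HiZ op or the draw comes first; only its position in the batch moves.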

> -Nanley
> 
> > > > > Cc: "17.0" 
> > > > > Signed-off-by: Kenneth Graunke 
> > > > > ---
> > > > >  src/mesa/drivers/dri/i965/gen8_depth_state.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > > 
> > > > > diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
> > > > > b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > > index a7e61354fd5..620b32df8bb 100644
> > > > > --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > > +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > > @@ -404,6 +404,9 @@ gen8_hiz_exec(struct brw_context *brw, struct 
> > > > > intel_mipmap_tree *mt,
> > > > > if (op == BLORP_HIZ_OP_NONE)
> > > > >return;
> > > > >  
> > > > 
> > > > It would be helpful if you included the rationale here as a code
> > > > comment. Something like the first two sentences of your commit message
> > > > should work.
> > > 
> > > I can do that.
> > > 
> > > > > +   brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
> > > > 
> > > > According to Vol07 of the BDW+ PRMs,
> > > > 
> > > > The previously active pipeline needs to be flushed via the
> > > > MI_FLUSH command immediately before switching to a different
> > > > pipe

Re: [Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

2017-03-16 Thread Francisco Jerez

Jan Vesely  writes:

> On Thu, 2017-03-16 at 15:24 -0700, Francisco Jerez wrote:
>> Jan Vesely  writes:
>> 
>> > v2: buffers are created with one reference.
>> > v3: add pipe_resource reference to mapping object
>> > 
>> 
>> Mapping objects are supposed to be short-lived, they're logically part
>> of the parent resource object so they shouldn't ever out-live it.  What
>> is this useful for?
>
> currently they can outlive the underlying pipe_resource. pipe_resource
> is destroyed in root_resource destructor, while the list of mappings is
> destroyed after resource destructor.

Right.  I guess the problem is that the pipe_transfer object associated
to the clover::mapping object holds a pointer to the backing
pipe_resource object but it fails to increment its reference count?  I
guess that's the reason why v2 didn't help?

> this is arguably an application bug. the piglit test does not call
> clUnmapMemObject(), but it'd be nice to not access freed memory.
>
> Vedran's alternative to clear the list before destroying pipe_resource
> works as well (assert that the list is empty in resource destructor
> would help spot the issue).
>

Assuming that pipe_transfers are supposed *not* to hold a reference to
the underlying pipe_resource, which implies that the caller must
guarantee it will never outlive its backing resource, it sounds like the
minimal solution would be to have clover::mapping make the same
assumptions.  You could probably achieve that in one line of code by
clearing the mapping list from the clover::resource destructor as you
suggest above.

> Jan
>
>> 
>> > CC: "17.0 13.0" 
>> > 
>> > Signed-off-by: Jan Vesely 
>> > ---
>> >  src/gallium/state_trackers/clover/core/resource.cpp | 11 ---
>> >  src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
>> >  2 files changed, 12 insertions(+), 6 deletions(-)
>> > 
>> > diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
>> > b/src/gallium/state_trackers/clover/core/resource.cpp
>> > index 06fd3f6..83e3c26 100644
>> > --- a/src/gallium/state_trackers/clover/core/resource.cpp
>> > +++ b/src/gallium/state_trackers/clover/core/resource.cpp
>> > @@ -25,6 +25,7 @@
>> >  #include "pipe/p_screen.h"
>> >  #include "util/u_sampler.h"
>> >  #include "util/u_format.h"
>> > +#include "util/u_inlines.h"
>> >  
>> >  using namespace clover;
>> >  
>> > @@ -176,7 +177,7 @@ root_resource::root_resource(clover::device &dev, 
>> > memory_obj &obj,
>> >  }
>> >  
>> >  root_resource::~root_resource() {
>> > -   device().pipe->resource_destroy(device().pipe, pipe);
>> > +   pipe_resource_reference(&this->pipe, NULL);
>> >  }
>> >  
>> >  sub_resource::sub_resource(resource &r, const vector &offset) :
>> > @@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource &r,
>> >pxfer = NULL;
>> >throw error(CL_OUT_OF_RESOURCES);
>> > }
>> > +   pipe_resource_reference(&res, r.pipe);
>> >  }
>> >  
>> >  mapping::mapping(mapping &&m) :
>> > -   pctx(m.pctx), pxfer(m.pxfer), p(m.p) {
>> > +   pctx(m.pctx), pxfer(m.pxfer), res(m.res), p(m.p) {
>> > m.pctx = NULL;
>> > m.pxfer = NULL;
>> > +   m.res = NULL;
>> > m.p = NULL;
>> >  }
>> >  
>> >  mapping::~mapping() {
>> > if (pxfer) {
>> >pctx->transfer_unmap(pctx, pxfer);
>> > }
>> > +   pipe_resource_reference(&res, NULL);
>> >  }
>> >  
>> > @@ -222,5 +226,6 @@ mapping::operator=(mapping m) {
>> > std::swap(pctx, m.pctx);
>> > std::swap(pxfer, m.pxfer);
>> > +   std::swap(res, m.res);
>> > std::swap(p, m.p);
>> > return *this;
>> >  }
>> > diff --git a/src/gallium/state_trackers/clover/core/resource.hpp 
>> > b/src/gallium/state_trackers/clover/core/resource.hpp
>> > index 9993dcb..cea9617 100644
>> > --- a/src/gallium/state_trackers/clover/core/resource.hpp
>> > +++ b/src/gallium/state_trackers/clover/core/resource.hpp
>> > @@ -123,9 +123,10 @@ namespace clover {
>> >}
>> >  
>> > private:
>> > -  pipe_context *pctx;
>> > -  pipe_transfer *pxfer;
>> > -  void *p;
>> > +  pipe_context *pctx = NULL;
>> > +  pipe_transfer *pxfer = NULL;
>> > +  pipe_resource *res = NULL;
>> > +  void *p = NULL;
>> > };
>> >  }
>> >  
>> > -- 
>> > 2.9.3



Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

Hi Emil,

Quoting Emil Velikov (2017-03-16 16:35:33)
> While I can see you're impressed by Meson, I would kindly urge you to
> not use it here. As you look closely you can see that one could
> trivially improve the times, yet the biggest thing is that most of the
> code in libdrm must go ;-)

Perhaps I wasn't clear enough, I don't really expect this to land ever. I sent
it out more because I'd written it and it works and is a useful demonstration of
meson+ninja performance. Obviously 20 seconds -> 5 seconds isn't a huge deal :);
but in a larger project, consider that a 4x speedup would be 4 minutes to 1
minute, and that is a huge difference in time.

> 
> As the port is not 1:1 wrt the autoconf one, the performance numbers
> above are comparing apples to oranges.

I fail to see what I'm missing from meson that would have an effect on the
times I reported. There are some files that are installed by autoconf that I
didn't bother to install with meson (because I don't expect this to land). Since
I didn't time installs, I don't see how it isn't an apples to apples comparison.

I understand that libdrm is a pessimal case for recursive-make since most
sub folders contain < 5 C files, However, even if you were to flatten the make
files meson+ninja would still be faster when you consider that meson
configures and builds faster than autotools configures.

> If you/others are unhappy with the build times of libdrm - poke me on
> IRC. I will give you some easy tips on how to improve those.
> 
> You have some good python knowledge - I would kindly urge you to
> improve/rewrite the slow and/or hacky python scripts we have in mesa.
> This is a topic that was mentioned multiple times, and a part where
> everyone will be glad to see some progress.
> 
> Thanks
> Emil

The real goal here is to do mesa (in case I didn't make that clear either), and
the advantage for mesa is not just performance, it's that meson supports visual
studio on windows; which means that we could hopefully not just get faster
builds, but also replace both autotools and scons with a single build system.

Dylan



[Mesa-dev] [PATCH] i965: Remove pointless NULL check from Gen6 primitive counting code.

We create the BO when creating a transform feedback object, and only
destroy it when deleting that object.  So it won't be NULL.

CID: 1401410
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 132f0696e35..a6115746692 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -286,9 +286,10 @@ brw_save_primitives_written_counters(struct brw_context *brw,
const struct gl_context *ctx = &brw->ctx;
const int streams = ctx->Const.MaxVertexStreams;
 
+   assert(obj->prim_count_bo != NULL);
+
/* Check if there's enough space for a new pair of four values. */
-   if (obj->prim_count_bo != NULL &&
-   obj->prim_count_buffer_index + 2 * streams >= 4096 / sizeof(uint64_t)) {
+   if (obj->prim_count_buffer_index + 2 * streams >= 4096 / sizeof(uint64_t)) {
   /* Gather up the results so far and release the BO. */
   tally_prims_generated(brw, obj);
}
-- 
2.12.0
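
For illustration, the capacity check from the patch above can be isolated as below. The struct and function names are invented for this sketch; the 4096-byte size, the 2 * streams term and the assert come from the quoted diff. With 8-byte counters the buffer holds 4096 / 8 = 512 slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PRIM_COUNT_BO_SIZE 4096   /* bytes, as in the quoted check */

/* Stand-in for the transform feedback object; not the i965 struct. */
struct xfb_obj_sketch {
   void *prim_count_bo;
   unsigned prim_count_buffer_index;
};

/* True when writing another 2 * streams counters would run past the
 * end of the buffer, i.e. when the results must be tallied first. */
static bool
prim_count_bo_is_full(const struct xfb_obj_sketch *obj, unsigned streams)
{
   /* The BO lives as long as the object, so NULL would be a bug. */
   assert(obj->prim_count_bo != NULL);

   return obj->prim_count_buffer_index + 2 * streams >=
          PRIM_COUNT_BO_SIZE / sizeof(uint64_t);
}
```

For example, with 4 streams the check trips once the index reaches 504, since 504 + 8 equals the 512 available slots.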


Re: [Mesa-dev] [PATCH 2/4] glsl: add new IR lower pass for sqrt(abs()) and inversesqrt(abs())




On 03/17/2017 01:17 AM, Kenneth Graunke wrote:
> On Thursday, March 16, 2017 5:06:55 PM PDT Samuel Pitoiset wrote:
>> Looks easier to do that at lowering time and mostly because
>> builtin_builder is a singleton class without access to any
>> states/constants.
>>
>> Signed-off-by: Samuel Pitoiset
>> ---
>>  src/compiler/Makefile.sources   |  1 +
>>  src/compiler/glsl/ir_optimization.h |  2 +
>>  src/compiler/glsl/lower_sqrt.cpp| 77 +
>>  src/compiler/glsl/test_optpass.cpp  |  2 +
>>  4 files changed, 82 insertions(+)
>>  create mode 100644 src/compiler/glsl/lower_sqrt.cpp
>
> Why not put this in lower_instructions.cpp?  That's sort of the
> catch-all pass for tiny bits of expression lowering.

Because it looked hacky at first glance, but I can do that, yes.





Re: [Mesa-dev] [PATCH] util/disk_cache: pass predicate functions file stats directly (v3)

2017-03-16 Thread Grazvydas Ignotas

On Thu, Mar 16, 2017 at 7:36 PM, Alan Swanson  wrote:
> Since switching to LRU eviction the only user of these predicate
> functions now resolves directory entry stats itself so pass them
> directly saving calling fstat and strlen twice (and the
> expensive strlen is skipped entirely if access time is newer).
>
> v2: Update for empty cache dir detection changes
> v3: Fix passing string length to predicate with the +1 for NULL
> termination and also pass sb as pointer
> ---
> That +1 was embarrassingly careless. Thanks for catching.

Now it doesn't seem to compile, sent some wrong version?

Gražvydas

>
>  src/util/disk_cache.c | 55 
> ---
>  1 file changed, 21 insertions(+), 34 deletions(-)
>
> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
> index e015e56f5e..c2ed58047a 100644
> --- a/src/util/disk_cache.c
> +++ b/src/util/disk_cache.c
> @@ -481,8 +481,9 @@ make_cache_file_directory(struct disk_cache *cache, const 
> cache_key key)
>   */
>  static char *
>  choose_lru_file_matching(const char *dir_path,
> - bool (*predicate)(const struct dirent *,
> -   const char *dir_path))
> + bool (*predicate)(const char *dir_path,
> +   const struct stat *,
> +   const char *, const size_t))
>  {
> DIR *dir;
> struct dirent *entry;
> @@ -498,17 +499,19 @@ choose_lru_file_matching(const char *dir_path,
>entry = readdir(dir);
>if (entry == NULL)
>   break;
> -  if (!predicate(entry, dir_path))
> - continue;
>
>struct stat sb;
>if (fstatat(dirfd(dir), entry->d_name, &sb, 0) == 0) {
>   if (!lru_atime || (sb.st_atime < lru_atime)) {
> -size_t len = strlen(entry->d_name) + 1;
> -char *tmp = realloc(lru_name, len);
> +size_t len = strlen(entry->d_name);
> +
> +if (!predicate(dir_path, sb, entry->d_name, len))
> +   continue;
> +
> +char *tmp = realloc(lru_name, len + 1);
>  if (tmp) {
> lru_name = tmp;
> -   memcpy(lru_name, entry->d_name, len);
> +   memcpy(lru_name, entry->d_name, len + 1);
> lru_atime = sb.st_atime;
>  }
>   }
> @@ -533,21 +536,13 @@ choose_lru_file_matching(const char *dir_path,
>   * ".tmp"
>   */
>  static bool
> -is_regular_non_tmp_file(const struct dirent *entry, const char *path)
> +is_regular_non_tmp_file(const char *path, const struct stat *sb,
> +const char *d_name, const size_t len)
>  {
> -   char *filename;
> -   if (asprintf(&filename, "%s/%s", path, entry->d_name) == -1)
> -  return false;
> -
> -   struct stat sb;
> -   int res = stat(filename, &sb);
> -   free(filename);
> -
> -   if (res == -1 || !S_ISREG(sb.st_mode))
> +   if (!S_ISREG(sb->st_mode))
>return false;
>
> -   size_t len = strlen (entry->d_name);
> -   if (len >= 4 && strcmp(&entry->d_name[len-4], ".tmp") == 0)
> +   if (len >= 4 && strcmp(&d_name[len-4], ".tmp") == 0)
>return false;
>
> return true;
> @@ -579,29 +574,21 @@ unlink_lru_file_from_directory(const char *path)
>   * special name of ".."). We also return false if the dir is empty.
>   */
>  static bool
> -is_two_character_sub_directory(const struct dirent *entry, const char *path)
> +is_two_character_sub_directory(const char *path, const struct stat *sb,
> +   const char *d_name, const size_t len)
>  {
> -   char *subdir;
> -   if (asprintf(&subdir, "%s/%s", path, entry->d_name) == -1)
> +   if (!S_ISDIR(sb->st_mode))
>return false;
>
> -   struct stat sb;
> -   int res = stat(subdir, &sb);
> -   if (res == -1 || !S_ISDIR(sb.st_mode)) {
> -  free(subdir);
> +   if (len != 2)
>return false;
> -   }
>
> -   if (strlen(entry->d_name) != 2) {
> -  free(subdir);
> +   if (strcmp(d_name, "..") == 0)
>return false;
> -   }
>
> -   if (strcmp(entry->d_name, "..") == 0) {
> -  free(subdir);
> +   char *subdir;
> +   if (asprintf(&subdir, "%s/%s", path, d_name) == -1)
>return false;
> -   }
> -
> DIR *dir = opendir(subdir);
> free(subdir);
>
> --
> 2.11.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl: add new IR lower pass for sqrt(abs()) and inversesqrt(abs())

On Thursday, March 16, 2017 5:06:55 PM PDT Samuel Pitoiset wrote:
> It looks easier to do this at lowering time, mostly because
> builtin_builder is a singleton class without access to any
> states/constants.
> 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/compiler/Makefile.sources   |  1 +
>  src/compiler/glsl/ir_optimization.h |  2 +
>  src/compiler/glsl/lower_sqrt.cpp| 77 +
>  src/compiler/glsl/test_optpass.cpp  |  2 +
>  4 files changed, 82 insertions(+)
>  create mode 100644 src/compiler/glsl/lower_sqrt.cpp

Why not put this in lower_instructions.cpp?  That's sort of the
catch-all pass for tiny bits of expression lowering.


[Mesa-dev] [PATCH 2/4] glsl: add new IR lower pass for sqrt(abs()) and inversesqrt(abs())

It looks easier to do this at lowering time, mostly because
builtin_builder is a singleton class without access to any
states/constants.

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/Makefile.sources   |  1 +
 src/compiler/glsl/ir_optimization.h |  2 +
 src/compiler/glsl/lower_sqrt.cpp| 77 +
 src/compiler/glsl/test_optpass.cpp  |  2 +
 4 files changed, 82 insertions(+)
 create mode 100644 src/compiler/glsl/lower_sqrt.cpp

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 2455d4eb5a..9cc2d9be7a 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -112,6 +112,7 @@ LIBGLSL_FILES = \
glsl/lower_output_reads.cpp \
glsl/lower_shared_reference.cpp \
glsl/lower_ubo_reference.cpp \
+   glsl/lower_sqrt.cpp \
glsl/opt_algebraic.cpp \
glsl/opt_array_splitting.cpp \
glsl/opt_conditional_discard.cpp \
diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 67a7514c7d..073956bba5 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -173,3 +173,5 @@ compare_index_block(exec_list *instructions, ir_variable 
*index,
 
 bool lower_64bit_integer_instructions(exec_list *instructions,
   unsigned what_to_lower);
+
+bool lower_sqrt(exec_list *instructions);
diff --git a/src/compiler/glsl/lower_sqrt.cpp b/src/compiler/glsl/lower_sqrt.cpp
new file mode 100644
index 00..5e827f81bb
--- /dev/null
+++ b/src/compiler/glsl/lower_sqrt.cpp
@@ -0,0 +1,77 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file lower_sqrt.cpp
+ * IR lower pass to compute absolute value before sqrt() and inversesqrt() in
+ * order to follow D3D9 behaviour for (buggy) apps that request it.
+ *
+ * \author Samuel Pitoiset 
+ */
+
+#include "ir.h"
+#include "ir_rvalue_visitor.h"
+#include "ir_builder.h"
+
+using namespace ir_builder;
+
+namespace {
+
+class lower_sqrt_visitor : public ir_rvalue_visitor {
+public:
+   lower_sqrt_visitor() : progress(false)
+   {
+  /* empty */
+   }
+
+   void handle_rvalue(ir_rvalue **rvalue)
+   {
+  if (!*rvalue)
+ return;
+
+  ir_expression *ir = (*rvalue)->as_expression();
+  if (!ir)
+ return;
+
+  if (ir->operation != ir_unop_sqrt && ir->operation != ir_unop_rsq)
+ return;
+
+  *rvalue = expr(ir->operation, abs(ir->operands[0]));
+
+  this->progress = true;
+   }
+
+   bool progress;
+};
+
+}
+
+bool
+lower_sqrt(exec_list *instructions)
+{
+   lower_sqrt_visitor v;
+
+   visit_list_elements(&v, instructions);
+
+   return v.progress;
+}
diff --git a/src/compiler/glsl/test_optpass.cpp 
b/src/compiler/glsl/test_optpass.cpp
index c6e97888f6..ea145f95f8 100644
--- a/src/compiler/glsl/test_optpass.cpp
+++ b/src/compiler/glsl/test_optpass.cpp
@@ -121,6 +121,8 @@ do_optimization(struct exec_list *ir, const char 
*optimization,
   return lower_instructions(ir, int_0);
} else if (strcmp(optimization, "lower_noise") == 0) {
   return lower_noise(ir);
+   } else if (strcmp(optimization, "lower_sqrt") == 0) {
+  return lower_sqrt(ir);
} else if (sscanf(optimization, "lower_variable_index_to_cond_assign "
  "( %d , %d , %d , %d ) ", &int_0, &int_1, &int_2,
  &int_3) == 4) {
-- 
2.12.0


[Mesa-dev] [PATCH 4/4] drirc: add force_glsl_abs_sqrt() for "Spec Ops: The Line"

The game was ported from D3D9 and expects sqrt() to compute the
absolute value, as explained in the spec.

This gets rid of the NaN values as well as the black squares
with RadeonSI.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97338
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/drivers/dri/common/drirc | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index 494e9e1509..23d09fabb1 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -120,5 +120,13 @@ TODO: document the other workarounds.
 
 
 
+
+
+
+
+
+
+
+
 
 
-- 
2.12.0


[Mesa-dev] [PATCH 3/4] st/glsl_to_tgsi: enable lower_sqrt() conditionally

It relies on the force_glsl_abs_sqrt driconf option.

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0757d141fc..e22c77bf3a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6982,6 +6982,10 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
  lower_discard(ir);
   }
 
+  if (ctx->Const.ForceGLSLAbsSqrt) {
+ lower_sqrt(ir);
+  }
+
   if (ctx->Const.GLSLOptimizeConservatively) {
  /* Do it once and repeat only if there's unsupported control flow. */
  do {
-- 
2.12.0


[Mesa-dev] [PATCH 1/4] driconf: add force_glsl_abs_sqrt option

This allows forcing the computation of the absolute value for sqrt()
and inversesqrt() in order to follow D3D9 behaviour for buggy
apps that rely on it.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/include/state_tracker/st_api.h  | 1 +
 src/gallium/state_trackers/dri/dri_screen.c | 3 +++
 src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 +
 src/mesa/drivers/dri/i965/brw_context.c | 3 +++
 src/mesa/drivers/dri/i965/intel_screen.c| 1 +
 src/mesa/main/mtypes.h  | 6 ++
 src/mesa/state_tracker/st_extensions.c  | 2 ++
 7 files changed, 21 insertions(+)

diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index a9997744cd..868181d168 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -248,6 +248,7 @@ struct st_config_options
boolean allow_glsl_extension_directive_midshader;
boolean allow_higher_compat_version;
boolean glsl_zero_init;
+   boolean force_glsl_abs_sqrt;
unsigned char config_options_sha1[20];
 };
 
diff --git a/src/gallium/state_trackers/dri/dri_screen.c 
b/src/gallium/state_trackers/dri/dri_screen.c
index 9b37dff677..55b1752f4a 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -71,6 +71,7 @@ const __DRIconfigOptionsExtension gallium_config_options = {
  DRI_CONF_FORCE_GLSL_VERSION(0)
  DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false")
  DRI_CONF_ALLOW_HIGHER_COMPAT_VERSION("false")
+ DRI_CONF_FORCE_GLSL_ABS_SQRT("false")
   DRI_CONF_SECTION_END
 
   DRI_CONF_SECTION_MISCELLANEOUS
@@ -105,6 +106,8 @@ dri_fill_st_options(struct dri_screen *screen)
options->allow_higher_compat_version =
   driQueryOptionb(optionCache, "allow_higher_compat_version");
options->glsl_zero_init = driQueryOptionb(optionCache, "glsl_zero_init");
+   options->force_glsl_abs_sqrt =
+  driQueryOptionb(optionCache, "force_glsl_abs_sqrt");
 
driComputeOptionsSha1(optionCache, options->config_options_sha1);
 }
diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h 
b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index f200093177..1ed8a9b4f9 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -120,6 +120,11 @@ DRI_CONF_OPT_BEGIN_B(allow_higher_compat_version, def) \
 DRI_CONF_DESC(en,gettext("Allow a higher compat profile (version 3.1+) for apps that request it")) \
 DRI_CONF_OPT_END
 
+#define DRI_CONF_FORCE_GLSL_ABS_SQRT(def) \
+DRI_CONF_OPT_BEGIN_B(force_glsl_abs_sqrt, def) \
+DRI_CONF_DESC(en,gettext("Force computing the absolute value for sqrt() and inversesqrt()")) \
+DRI_CONF_OPT_END
+
 
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 32cfb2efe4..f584fab180 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -922,6 +922,9 @@ brw_process_driconf_options(struct brw_context *brw)
ctx->Const.AllowHigherCompatVersion =
   driQueryOptionb(options, "allow_higher_compat_version");
 
+   ctx->Const.ForceGLSLAbsSqrt =
+  driQueryOptionb(options, "force_glsl_abs_sqrt");
+
ctx->Const.GLSLZeroInit = driQueryOptionb(options, "glsl_zero_init");
 
brw->dual_color_blend_by_location =
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 10dab2317e..e5fa0cc17a 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -81,6 +81,7 @@ DRI_CONF_BEGIN
   DRI_CONF_DUAL_COLOR_BLEND_BY_LOCATION("false")
   DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false")
   DRI_CONF_ALLOW_HIGHER_COMPAT_VERSION("false")
+  DRI_CONF_FORCE_GLSL_ABS_SQRT("false")
 
   DRI_CONF_OPT_BEGIN_B(shader_precompile, "true")
 DRI_CONF_DESC(en, "Perform code generation at shader link time.")
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index e53d5762f9..01c656e64e 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3556,6 +3556,12 @@ struct gl_constants
GLboolean AllowHigherCompatVersion;
 
/**
+* Force computing the absolute value for sqrt() and inversesqrt() to follow
+* D3D9 when apps rely on this behaviour.
+*/
+   GLboolean ForceGLSLAbsSqrt;
+
+   /**
 * Force uninitialized variables to default to zero.
 */
GLboolean GLSLZeroInit;
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 0dc2580a88..16f86856a3 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -881,6 +881,8 @@ void st_init_extensions(struct pipe_screen *screen,
 
consts->AllowHigherCompatVersion = options->allow_higher_compat_version;
 
+   consts->ForceGLSLAbsSqrt = options->force_glsl_abs_s

Re: [Mesa-dev] [PATCH] i965: avoid using a GNU make pattern rule

2017-03-16 Thread Emil Velikov

On 16 March 2017 at 23:53, Robert Bragg  wrote:
> On Thu, Mar 16, 2017 at 1:50 PM, Emil Velikov  
> wrote:
>> On 16 March 2017 at 02:49, Jonathan Gray  wrote:
>>> % pattern rules are a GNU extension.  As there is only one file here
>>> avoid patterns and globbing entirely to fix the build on non-GNU make.
>>>
>>> Signed-off-by: Jonathan Gray 
>>> ---
>>>  src/mesa/drivers/dri/i965/Makefile.am | 8 
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
>>> b/src/mesa/drivers/dri/i965/Makefile.am
>>> index a83e3a6fa1..fee1ccbbf5 100644
>>> --- a/src/mesa/drivers/dri/i965/Makefile.am
>>> +++ b/src/mesa/drivers/dri/i965/Makefile.am
>>> @@ -92,8 +92,8 @@ EXTRA_DIST = \
>>>  # .c and .h files in one go so we don't hit problems with parallel
>>>  # make and multiple invocations of the same script trying to write
>>>  # to the same files.
>>> -brw_oa_%.h: brw_oa_%.xml brw_oa.py Makefile.am
>>> -   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>>> --header=$(builddir)/brw_oa_$(*).h --chipset="$(*)" 
>>> $(srcdir)/brw_oa_$(*).xml
>>> -brw_oa_%.c: brw_oa_%.xml brw_oa.py Makefile.am
>>> -   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>>> --code=$(builddir)/brw_oa_$(*).c --chipset="$(*)" $(srcdir)/brw_oa_$(*).xml
>> 
>> Hmm that is not even remotely what was reviewed or suggested ... strange.
>> 
>
> Sorry, I didn't intend to change this under the radar, but I guess
> you didn't see this:
> https://lists.freedesktop.org/archives/mesa-dev/2017-March/147200.html
>
Dropping the r-b and/or poking me on IRC would have been appreciated.
It's odd that it was completely rewritten, as we seemingly reached
consensus on IRC :-(

>>
>>> +brw_oa_hsw.h: $(srcdir)/brw_oa_hsw.xml
>>> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>>> --header=$(builddir)/brw_oa_hsw.h --chipset=hsw $(srcdir)/brw_oa_hsw.xml
>>> +brw_oa_hsw.c: $(srcdir)/brw_oa_hsw.xml
>>> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>>> --code=$(builddir)/brw_oa_hsw.c --chipset=hsw $(srcdir)/brw_oa_hsw.xml
>>>
>> Thank you Jonathan.
>>
>> We might need a generic rule as other generations are covered and/or
>> even move the lot to another location.
>> All that in due time, for now I'll add the missing brw_oa.py
>> dependency and push this.
>
> Just to mention; this change just caught me out when rebasing my gen 8
> / 9 changes and so I'm wondering about the best way to deal with this,
> ideally without copying seven times for hsw, bdw, chv, sklgt2, sklgt3,
> sklgt4, bxt + more later. For now I'm copy & pasting, and maybe it'll
> just be easiest to stick with that.
>
Yes, I'd stick with that for now.

We should be able to use suffix rules - I believe Jonathan can share
some ideas. Alternatively I'll take a look with a fresh mind.

> I wouldn't guess the i915-perf kernel interface has been ported to any
> other OS besides Linux, so maybe these can be annexed as Linux
> specific somehow to avoid tripping up other OS builds.
>
This is more hacky than you'd imagine. Plus keeping it portable is not
that hard/complex right ?

> It looks like there are a few other files in Mesa that use patterns,
> such as ./src/mapi/glapi/gen/Makefile.am - I wonder how come that
> hasn't proven to be a problem.
>
Do not give [almost] anything in src/mapi as an example, please.

-Emil

[Mesa-dev] [PATCH 1/2] nir: add an isinf opcode, and an option to use it.

From: Dave Airlie 

To get isinf(NaN) correct, radv at least can't use an unordered
equals, which is what feq has to be for us. This passes isinf
through to the backend and lets it sort it out as it pleases.

Signed-off-by: Dave Airlie 
---
 src/compiler/nir/nir.h  | 1 +
 src/compiler/nir/nir_opcodes.py | 2 +-
 src/compiler/spirv/vtn_alu.c| 7 +--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 57b8be3..bcdca4b 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1777,6 +1777,7 @@ typedef struct nir_shader_compiler_options {
bool lower_bitfield_insert;
bool lower_uadd_carry;
bool lower_usub_borrow;
+   bool use_isinf;
/** lowers fneg and ineg to fsub and isub. */
bool lower_negate;
/** lowers fsub and isub to fadd+fneg and iadd+ineg. */
diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index 52868d5..7387208 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -203,7 +203,7 @@ unop("fquantize2f16", tfloat, "(fabs(src0) < ldexpf(1.0, 
-14)) ? copysignf(0.0f,
 unop("fsin", tfloat, "bit_size == 64 ? sin(src0) : sinf(src0)")
 unop("fcos", tfloat, "bit_size == 64 ? cos(src0) : cosf(src0)")
 
-
+unop_convert("isinf", tbool, tfloat, "isinf(src0)")
 # Partial derivatives.
 
 
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 0738fe0..79f51e7 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -447,8 +447,11 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpIsInf:
-  val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
-  nir_imm_float(&b->nb, INFINITY));
+  if (b->shader->options->use_isinf)
+ val->ssa->def = nir_isinf(&b->nb, src[0]);
+  else
+ val->ssa->def = nir_feq(&b->nb, nir_fabs(&b->nb, src[0]),
+ nir_imm_float(&b->nb, INFINITY));
   break;
 
case SpvOpFUnordEqual:
-- 
2.9.3


[Mesa-dev] [PATCH 2/2] radv/ac: emit isinf using ordered equal

From: Dave Airlie 

This fixes:
dEQP-VK.glsl.builtin.function.common.isinf.*

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 14 ++
 src/amd/vulkan/radv_pipeline.c  |  1 +
 2 files changed, 15 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 2b41c51..15974a7 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1168,6 +1168,17 @@ static LLVMValueRef emit_ddxy(struct nir_to_llvm_context 
*ctx,
return result;
 }
 
+static LLVMValueRef emit_isinf(struct nir_to_llvm_context *ctx,
+  nir_alu_instr *instr,
+  LLVMValueRef src0)
+{
+   LLVMTypeRef def_type = get_def_type(ctx, &instr->dest.dest.ssa);
+   src0 = emit_intrin_1f_param(ctx, "llvm.fabs",
+  to_float_type(ctx, def_type), src0);
+   src0 = emit_float_cmp(ctx, LLVMRealOEQ, src0, LLVMConstReal(ctx->f32, INFINITY));
+   return src0;
+}
+
 /*
  * this takes an I,J coordinate pair,
  * and works out the X and Y derivatives.
@@ -1534,6 +1545,9 @@ static void visit_alu(struct nir_to_llvm_context *ctx, 
nir_alu_instr *instr)
case nir_op_fddy_coarse:
result = emit_ddxy(ctx, instr->op, src[0]);
break;
+   case nir_op_isinf:
+   result = emit_isinf(ctx, instr, src[0]);
+   break;
default:
fprintf(stderr, "Unknown NIR alu instr: ");
nir_print_instr(&instr->instr, stderr);
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index ce228df..cc15ef8 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -47,6 +47,7 @@ void radv_shader_variant_destroy(struct radv_device *device,
 static const struct nir_shader_compiler_options nir_options = {
.vertex_id_zero_based = true,
.lower_scmp = true,
+   .use_isinf = true,
.lower_flrp32 = true,
.lower_fsat = true,
.lower_pack_snorm_2x16 = true,
-- 
2.9.3
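
Why the ordered comparison matters for isinf(NaN) can be shown with a small
sketch in plain C. It assumes IEEE 754 semantics: an ordered equal is false
whenever an operand is NaN, while an unordered equal is true in that case.
The function names below are hypothetical and only model the two lowerings;
they are not NIR or LLVM API.

```c
#include <math.h>
#include <stdbool.h>

/* Ordered equal: false whenever either operand is NaN. C's == on floats
 * already behaves this way.
 */
static bool
ordered_eq(float a, float b)
{
   return a == b;
}

/* Unordered equal: true when the operands are equal or either one is NaN. */
static bool
unordered_eq(float a, float b)
{
   return isnan(a) || isnan(b) || a == b;
}

/* isinf built on an ordered compare: NaN is correctly rejected. */
static bool
isinf_ordered(float x)
{
   return ordered_eq(fabsf(x), INFINITY);
}

/* isinf built on an unordered compare: NaN is wrongly reported as infinite. */
static bool
isinf_unordered(float x)
{
   return unordered_eq(fabsf(x), INFINITY);
}
```

With these definitions isinf_ordered(NAN) is false, as the test suite
expects, while isinf_unordered(NAN) is true.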


[Mesa-dev] [PATCH] st/dri: wait for thread to finish before unbinding context

2017-03-16 Thread Timothy Arceri

Fixes a bunch of piglit crashes that hit an assert() when trying
to delete the framebuffer. The assert() was triggered because
WinSysDrawBuffer was set to NULL before glDeleteFramebuffers()
was called.
---
 src/gallium/state_trackers/dri/dri_context.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri_context.c 
b/src/gallium/state_trackers/dri/dri_context.c
index 91d2d1f..92d7984 100644
--- a/src/gallium/state_trackers/dri/dri_context.c
+++ b/src/gallium/state_trackers/dri/dri_context.c
@@ -199,20 +199,23 @@ dri_destroy_context(__DRIcontext * cPriv)
 GLboolean
 dri_unbind_context(__DRIcontext * cPriv)
 {
/* dri_util.c ensures cPriv is not null */
struct dri_screen *screen = dri_screen(cPriv->driScreenPriv);
struct dri_context *ctx = dri_context(cPriv);
struct st_api *stapi = screen->st_api;
 
if (--ctx->bind_count == 0) {
   if (ctx->st == ctx->stapi->get_current(ctx->stapi)) {
+ if (ctx->st->thread_finish)
+ctx->st->thread_finish(ctx->st);
+
  /* For conformance, unbind is supposed to flush the context.
   * However, if we do it here we might end up flushing a partially
   * destroyed context. Instead, we flush in dri_make_current and
   * in dri_destroy_context which should cover all the cases.
   */
  stapi->make_current(stapi, NULL, NULL, NULL);
   }
}
 
return GL_TRUE;
-- 
2.9.3


Re: [Mesa-dev] [PATCH] i965: avoid using a GNU make pattern rule

2017-03-16 Thread Robert Bragg

On Thu, Mar 16, 2017 at 1:50 PM, Emil Velikov  wrote:
> On 16 March 2017 at 02:49, Jonathan Gray  wrote:
>> % pattern rules are a GNU extension.  As there is only one file here
>> avoid patterns and globbing entirely to fix the build on non-GNU make.
>>
>> Signed-off-by: Jonathan Gray 
>> ---
>>  src/mesa/drivers/dri/i965/Makefile.am | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
>> b/src/mesa/drivers/dri/i965/Makefile.am
>> index a83e3a6fa1..fee1ccbbf5 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.am
>> +++ b/src/mesa/drivers/dri/i965/Makefile.am
>> @@ -92,8 +92,8 @@ EXTRA_DIST = \
>>  # .c and .h files in one go so we don't hit problems with parallel
>>  # make and multiple invocations of the same script trying to write
>>  # to the same files.
>> -brw_oa_%.h: brw_oa_%.xml brw_oa.py Makefile.am
>> -   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>> --header=$(builddir)/brw_oa_$(*).h --chipset="$(*)" $(srcdir)/brw_oa_$(*).xml
>> -brw_oa_%.c: brw_oa_%.xml brw_oa.py Makefile.am
>> -   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>> --code=$(builddir)/brw_oa_$(*).c --chipset="$(*)" $(srcdir)/brw_oa_$(*).xml
> 
> Hmm that is not even remotely what was reviewed or suggested ... strange.
> 

Sorry, I didn't intend to change this under the radar, but I guess
you didn't see this:
https://lists.freedesktop.org/archives/mesa-dev/2017-March/147200.html

>
>> +brw_oa_hsw.h: $(srcdir)/brw_oa_hsw.xml
>> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>> --header=$(builddir)/brw_oa_hsw.h --chipset=hsw $(srcdir)/brw_oa_hsw.xml
>> +brw_oa_hsw.c: $(srcdir)/brw_oa_hsw.xml
>> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py 
>> --code=$(builddir)/brw_oa_hsw.c --chipset=hsw $(srcdir)/brw_oa_hsw.xml
>>
> Thank you Jonathan.
>
> We might need a generic rule as other generations are covered and/or
> even move the lot to another location.
> All that in due time, for now I'll add the missing brw_oa.py
> dependency and push this.

Just to mention; this change just caught me out when rebasing my gen 8
/ 9 changes and so I'm wondering about the best way to deal with this,
ideally without copying seven times for hsw, bdw, chv, sklgt2, sklgt3,
sklgt4, bxt + more later. For now I'm copy & pasting, and maybe it'll
just be easiest to stick with that.

I wouldn't guess the i915-perf kernel interface has been ported to any
other OS besides Linux, so maybe these can be annexed as Linux
specific somehow to avoid tripping up other OS builds.

It looks like there are a few other files in Mesa that use patterns,
such as ./src/mapi/glapi/gen/Makefile.am - I wonder how come that
hasn't proven to be a problem.

btw the recent change here removed Makefile.am from the list of
dependencies, which I think was reasonable to have originally (and
consistent with what automake normally does for rules it generates).

Regards,
- Robert

>
> Thanks
> Emil

Re: [Mesa-dev] [PATCH 6/7] intel: Fix requests for exact surface row pitch

On Wed 15 Mar 2017, Jason Ekstrand wrote:
> They should probably assert that isl_surf_init succeeds instead.

Good point. I'll fix that.

> 
> On Mon, Mar 13, 2017 at 3:28 PM, Chad Versace 
> wrote:
> 
> > All callers of isl_surf_init() that set 'min_row_pitch' wanted to
> > request an *exact* row pitch, as evidenced by nearby asserts, but isl
> > lacked API for doing so. Now that isl has an API for that, update the
> > code to use it.
> >
> > Reviewed-by: Nanley Chery 
> > Reviewed-by: Anuj Phogat 
> > ---
> >  src/intel/blorp/blorp_blit.c | 3 +--
> >  src/intel/vulkan/anv_blorp.c | 3 +--
> >  src/intel/vulkan/anv_image.c | 2 +-
> >  3 files changed, 3 insertions(+), 5 deletions(-)

Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

2017-03-16 Thread Emil Velikov

Hi Dylan,

On 16 March 2017 at 21:25, Dylan Baker  wrote:
> Why bother, and why would we want this?
>
> First it's written in python, which means the potential developer base
> is massive. And it provides a recursive view for humans, but a
> non-recursive view for the system. This is the best of both worlds,
> humans can organize the build system in a way that makes sense, and the
> machine gets a non-recursive build system. It also uses ninja rather
> than make, and ninja is faster than make inherently. Meson also has a
> simpler syntax than autotools or cmake; it's not Turing complete by
> design, nor does it expose python, again by design. This allows meson
> itself to be reimplemented in another language if python becomes a
> dead-end or a bottleneck. It also makes it much easier to understand
> what the build system is doing.
>
> What's different about using meson?
>
> Well, apart from faster builds and less magic in the build system? The
> configure flags are different, it uses -Doption=value, more like cmake
> than the --enable or --with flags of autotools, although oddly it uses
> --prefix and friends when calling meson, but not with mesonconf, there's
> a bug opened on this. Meson also doesn't support in-tree builds at all;
> all builds are done out of tree. It also doesn't provide a "make dist"
> target, fortunately there's this awesome tool called git, and it
> provides a "git archive" command that does much the same thing. Did I
> mention it's fast?
>
> Here are the performance numbers I see on a 2 core 4 thread SKL, without
> initial configuration, and building out of tree (using zsh):
>
> For meson the command line is:
> time (meson build -Dmanpages=true && ninja -C build)
>
> For autotools the command line is:
> time (mkdir build && cd build && ../autotools && make -j5 -l4)
>
> meson (cold ccache): 13.37s user 1.74s system 255% cpu  5.907 total
> autotools (cold ccache): 26.50s user 1.71s system 129% cpu 21.835 total
> meson (hot ccache):   2.13s user 0.39s system 154% cpu  1.633 total
> autotools (hot ccache):  13.93s user 0.73s system 102% cpu 14.259 total
>
> That's ~4x faster for a cold build and ~10x faster for a hot build.
>
> For a make clean && make style build with a hot cache:
> meson: 4.64s user 0.33s system 334% cpu 1.486 total
> autotools: 7.93s user 0.32s system 167% cpu 4.920 total
>
> Why bother with libdrm?
>
> It's a simple build system that could be completely (or mostly
> completely) ported in a very short time, and could serve as a tech
> demo for the advantages of using meson to garner feedback for embarking
> on a larger project, like mesa (which is what I'm planning to work on
> next).
>
> tl;dr
>
> I wrote this as practice for porting Mesa, and figured I might as well
> send it out since I wrote it.
>
> It is very likely that neither of these large patches will show up on the
> mailing list, but this is available at my github:
> https://github.com/dcbaker/libdrm wip/meson
>
While I can see you're impressed by Meson, I would kindly urge you to
not use it here. As you look closely you can see that one could
trivially improve the times, yet the biggest thing is that most of the
code in libdrm must go ;-)

As the port is not 1:1 wrt the autoconf one, the performance numbers
above are comparing apples to oranges.

If you/others are unhappy with the build times of libdrm - poke me on
IRC. I will give you some easy tips on how to improve those.

You have some good python knowledge - I would kindly urge you to
improve/rewrite the slow and/or hacky python scripts we have in mesa.
This is a topic that was mentioned multiple times, and a part where
everyone will be glad to see some progress.

Thanks
Emil

Re: [Mesa-dev] [PATCH 4/7] isl: Validate the calculated row pitch

On Wed 15 Mar 2017, Jason Ekstrand wrote:
> Fun story: This will implicitly handle the (invalid) case of trying to
> create a MCS for a 16xMSAA surface that's more than 8k wide. :-)  We may
> want to keep the check in init_mcs for clarity and because it's in the docs
> but the extra validation is nice.

Nice. I like it when a concise, general case unexpectedly supersedes
seemingly unrelated special cases. Let's do more of that in isl :)

Leaving the validation in get_mcs_surf() is ok with me. The extra code
and comment helps clarify things. But if someone does
delete it, I'm ok with that too.

Re: [Mesa-dev] [PATCH 4/7] isl: Validate the calculated row pitch

On Wed 15 Mar 2017, Jason Ekstrand wrote:
> On Wed, Mar 15, 2017 at 3:34 PM, Nanley Chery  wrote:
> 
> > On Mon, Mar 13, 2017 at 03:28:01PM -0700, Chad Versace wrote:
> > > Validate that it fits in RENDER_SURFACE_STATE::SurfacePitch or, if it's
> > > an aux surface, AuxiliarySurfacePitch.
> > > ---
> > >  src/intel/isl/isl.c | 35 +--
> > >  1 file changed, 29 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > > index 784566749b4..405f5b917fe 100644
> > > --- a/src/intel/isl/isl.c
> > > +++ b/src/intel/isl/isl.c
> > > @@ -1089,18 +1089,39 @@ isl_calc_min_row_pitch(const struct isl_device
> > *dev,
> > > }
> > >  }
> > >
> > > -static uint32_t
> > > +static bool
> > >  isl_calc_row_pitch(const struct isl_device *dev,
> > > const struct isl_surf_init_info *surf_info,
> > > const struct isl_tile_info *tile_info,
> > > enum isl_dim_layout dim_layout,
> > > -   const struct isl_extent2d *phys_slice0_sa)
> > > +   const struct isl_extent2d *phys_slice0_sa,
> > > +   uint32_t *out_row_pitch)
> > >  {
> > > const uint32_t alignment =
> > >isl_calc_row_pitch_alignment(surf_info, tile_info);
> > >
> > > -   return isl_calc_min_row_pitch(dev, surf_info, tile_info,
> > phys_slice0_sa,
> > > - alignment);
> > > +   const uint32_t row_pitch =
> > > +  isl_calc_min_row_pitch(dev, surf_info, tile_info, phys_slice0_sa,
> > > + alignment);
> > > +
> > > +   const uint32_t row_pitch_tiles = row_pitch /
> > tile_info->phys_extent_B.width;
> > > +
> > > +   /* Check that the pitch fits in RENDER_SURFACE_STATE::SurfacePitch
> > or
> > > +* AuxiliarySurfacePitch.
> > > +*/
> > > +   if (dim_layout == ISL_DIM_LAYOUT_GEN9_1D) {
> > > +  /* SurfacePitch is ignored for this layout.
> > > +   * FINISHME: How to validate row pitch for ISL_DIM_LAYOUT_GEN9_1D?
> > > +   */
> > > +   } else if (isl_tiling_is_aux(tile_info->tiling)) {
> > > +  if (row_pitch_tiles > (1 << 9))
> > > + return false;
> > > +   } else if (row_pitch > (1 << 17)) {
> >
> > For SKL at least, shouldn't this be 1 << 18 ?
> >
> 
> I concur

Me too.

Another problem is the max pitch differs for depth buffers.

I'll inspect the genxml more closely, and resend this patch with fixups.
And I'll use the surface usage mask when determining the max pitch.
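
To make the limit under discussion concrete, here is a minimal sketch of
the range check. It assumes the pitch field stores (pitch - 1), so a field
that is N bits wide can encode pitches up to 1 << N. The helper name
pitch_fits_field() is hypothetical and not part of isl, and the field
widths passed to it are assumptions until they are checked against genxml.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of isl.  Returns true if row_pitch can be
 * encoded in a hardware field that is field_bits wide, assuming the field
 * stores (pitch - 1).  That encoding is an assumption of this sketch.
 */
static bool
pitch_fits_field(uint32_t row_pitch, unsigned field_bits)
{
   assert(field_bits > 0 && field_bits < 32);

   /* A pitch of zero cannot be expressed as (pitch - 1). */
   if (row_pitch == 0)
      return false;

   return row_pitch <= (UINT32_C(1) << field_bits);
}
```

With an 18-bit field this accepts pitches up to and including 1 << 18,
which is the bound suggested for SKL in the review above. A depth buffer
would pass a different width, chosen from the surface usage.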

Re: [Mesa-dev] [PATCH 3/7] isl: Add func isl_tiling_is_aux()

On Thu, Mar 16, 2017 at 4:17 PM, Chad Versace 
wrote:

> On Wed 15 Mar 2017, Jason Ekstrand wrote:
> > On Mon, Mar 13, 2017 at 3:28 PM, Chad Versace 
> > wrote:
> >
> > > ---
> > >  src/intel/isl/isl.h | 9 +
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > > index 9d92906ca71..b79793b0c93 100644
> > > --- a/src/intel/isl/isl.h
> > > +++ b/src/intel/isl/isl.h
> > > @@ -473,6 +473,9 @@ typedef uint32_t isl_tiling_flags_t;
> > >  /** The Skylake BSpec refers to Yf and Ys as "standard tiling
> formats". */
> > >  #define ISL_TILING_STD_Y_MASK (ISL_TILING_Yf_BIT | \
> > > ISL_TILING_Ys_BIT)
> > > +
> > > +#define ISL_TILING_AUX_MASK   (ISL_TILING_HIZ_BIT | \
> > > +   ISL_TILING_CCS_BIT)
> > >
> >
> > What about MCS?
>
> Right. This is bad code.
>
> How about I test against ISL_SURF_USAGE_{HIZ,MCS,CCS_D,CCS_E} instead?
> In the next patch, where it matters?
>

Sure.  Either works so long as you include all the AUX bits.

--Jason
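
As a rough illustration of testing against usage bits rather than tiling
bits, the sketch below collects every aux usage in a single mask. The
EXAMPLE_* names and bit values are stand-ins invented for this example;
the real ISL_SURF_USAGE_*_BIT definitions are in isl.h and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration only.  The real
 * ISL_SURF_USAGE_*_BIT definitions live in isl.h and may differ.
 */
#define EXAMPLE_USAGE_TEXTURE_BIT (1u << 0)
#define EXAMPLE_USAGE_HIZ_BIT     (1u << 1)
#define EXAMPLE_USAGE_MCS_BIT     (1u << 2)
#define EXAMPLE_USAGE_CCS_D_BIT   (1u << 3)
#define EXAMPLE_USAGE_CCS_E_BIT   (1u << 4)

#define EXAMPLE_USAGE_ALL_MASK    ((1u << 5) - 1)

/* Keeping the aux usages in one mask keeps the list in a single place. */
#define EXAMPLE_USAGE_AUX_MASK    (EXAMPLE_USAGE_HIZ_BIT | \
                                   EXAMPLE_USAGE_MCS_BIT | \
                                   EXAMPLE_USAGE_CCS_D_BIT | \
                                   EXAMPLE_USAGE_CCS_E_BIT)

static bool
example_usage_is_aux(uint32_t usage)
{
   /* Reject bits that this sketch does not define. */
   assert((usage & ~EXAMPLE_USAGE_ALL_MASK) == 0);

   return (usage & EXAMPLE_USAGE_AUX_MASK) != 0;
}
```

The point is only that MCS sits in the same mask as HiZ and CCS, so a
caller cannot pick up some aux surfaces and miss others.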

Re: [Mesa-dev] [PATCH 3/7] isl: Add func isl_tiling_is_aux()

On Wed 15 Mar 2017, Jason Ekstrand wrote:
> On Mon, Mar 13, 2017 at 3:28 PM, Chad Versace 
> wrote:
> 
> > ---
> >  src/intel/isl/isl.h | 9 +
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > index 9d92906ca71..b79793b0c93 100644
> > --- a/src/intel/isl/isl.h
> > +++ b/src/intel/isl/isl.h
> > @@ -473,6 +473,9 @@ typedef uint32_t isl_tiling_flags_t;
> >  /** The Skylake BSpec refers to Yf and Ys as "standard tiling formats". */
> >  #define ISL_TILING_STD_Y_MASK (ISL_TILING_Yf_BIT | \
> > ISL_TILING_Ys_BIT)
> > +
> > +#define ISL_TILING_AUX_MASK   (ISL_TILING_HIZ_BIT | \
> > +   ISL_TILING_CCS_BIT)
> >
> 
> What about MCS?

Right. This is bad code.

How about I test against ISL_SURF_USAGE_{HIZ,MCS,CCS_D,CCS_E} instead?
In the next patch, where it matters?

Re: [Mesa-dev] [PATCH 00/23] nir/i965: Return progress from NIR passes

On Thu, Mar 16, 2017 at 2:17 PM, Matt Turner  wrote:

> I started to add support to NIR for something like INTEL_DEBUG=optimizer,
> but
> then realized that a bunch of NIR passes didn't even return progress.
>
> After fixing that, I realized that a bunch of NIR passes didn't preserve
> metadata.
>
> Sigh.
>

:(

I think I may have seen a few more as I was reviewing but they're not
passes we're calling.

I also found an issue in patch 19 so I think that one needs a bit of work.

1-18 and 20-23 are

Reviewed-by: Jason Ekstrand 

I can't say I read them super-carefully so it's somewhere between an ack
and a review (probably closer to the review side) but I didn't see anything
amiss other than the issue on 19.

--Jason

> I have not tested radv, freedreno, or vc4, but I think that since I did not
> replace their OPT_V/NIR_PASS_V everything should work just the same, and
> allow
> them to transition at their leisure.
>

Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

Quoting Marek Olšák (2017-03-16 15:36:26)
> Is there a way not to use ninja with meson, because ninja redirects
> all stderr output from gcc to stdout, which breaks many development
> environments that expect errors in stderr?
> 
> I'm basically saying that if ninja can't keep gcc errors in stderr, I
> wouldn't like any project that I might be involved in to require ninja
> for building.
> 
> Marek

There is no way to use another backend on Linux, and meson will not support
Make. Ninja is a big part of the appeal here, since it is faster than make is.
Are there particular tools you know don't work with ninja? It seems like in the
7+ years since ninja came out that someone would have fixed the tools, or that
some stream redirection could be used to fix the problem, "ninja 1>&2"?

Dylan



Re: [Mesa-dev] [PATCH 19/23] nir: Return progress from nir_convert_from_ssa().

On Thu, Mar 16, 2017 at 2:18 PM, Matt Turner  wrote:

> ---
>  src/compiler/nir/nir.h  |  2 +-
>  src/compiler/nir/nir_from_ssa.c | 21 +++--
>  2 files changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index db47699..0a127cd 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2587,7 +2587,7 @@ void nir_convert_loop_to_lcssa(nir_loop *loop);
>   * registers.  If false, convert all values (even those not involved in a
> phi
>   * node) to registers.
>   */
> -void nir_convert_from_ssa(nir_shader *shader, bool phi_webs_only);
> +bool nir_convert_from_ssa(nir_shader *shader, bool phi_webs_only);
>
>  bool nir_lower_phis_to_regs_block(nir_block *block);
>  bool nir_lower_ssa_defs_to_regs_block(nir_block *block);
> diff --git a/src/compiler/nir/nir_from_ssa.c b/src/compiler/nir/nir_from_
> ssa.c
> index d2646c6..07fcceb 100644
> --- a/src/compiler/nir/nir_from_ssa.c
> +++ b/src/compiler/nir/nir_from_ssa.c
> @@ -524,18 +524,21 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
>  static bool
>  resolve_registers_block(nir_block *block, struct from_ssa_state *state)
>  {
> +   bool progress = false;
> +
> nir_foreach_instr_safe(instr, block) {
>state->instr = instr;
> -  nir_foreach_ssa_def(instr, rewrite_ssa_def, state);
> +  progress |= nir_foreach_ssa_def(instr, rewrite_ssa_def, state);
>

We can't trust the return value here.  rewrite_ssa_def *always* returns
true regardless of whether or not it works.  (It kind-of has to thanks to
the way NIR's function pointer based iteration functions work.)  I think we
need to have a progress boolean in the state structure for this pass.
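
A minimal sketch of that idea, using made-up stand-ins instead of the real
NIR types: the callback keeps returning true so that iteration continues,
and reports whether it changed anything through the state structure. All
example_* names are hypothetical, and the iterator's contract (the callback
returns false to stop) is an assumption modelled on the behaviour described
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the pass state; only the shape matters. */
struct example_state {
   bool progress;   /* set by the callback whenever it changes something */
};

typedef bool (*example_def_cb)(int *def, void *state);

/* Function-pointer based iterator: the callback returns false to stop
 * early, and the iterator returns false only if it was stopped.
 */
static bool
example_foreach_def(int *defs, size_t count, example_def_cb cb, void *state)
{
   for (size_t i = 0; i < count; i++) {
      if (!cb(&defs[i], state))
         return false;
   }
   return true;
}

/* Rewrites negative values to zero.  Always returns true so that the
 * iteration continues, which is why its return value says nothing about
 * progress.
 */
static bool
example_rewrite_def(int *def, void *void_state)
{
   struct example_state *state = void_state;
   assert(state != NULL);

   if (*def < 0) {
      *def = 0;
      state->progress = true;
   }
   return true;
}

static bool
example_resolve(int *defs, size_t count)
{
   struct example_state state = { .progress = false };

   /* The iterator's return value is deliberately ignored here. */
   example_foreach_def(defs, count, example_rewrite_def, &state);

   return state.progress;
}
```

Here example_resolve() returns true only when at least one value was
actually rewritten, regardless of what the iterator itself returns.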


>
>if (instr->type == nir_instr_type_phi) {
>   nir_instr_remove(instr);
>   ralloc_steal(state->dead_ctx, instr);
> + progress = true;
>}
> }
> state->instr = NULL;
>
> -   return true;
> +   return progress;
>  }
>
>  static void
> @@ -756,10 +759,11 @@ resolve_parallel_copies_block(nir_block *block,
> struct from_ssa_state *state)
> return true;
>  }
>
> -static void
> +static bool
>  nir_convert_from_ssa_impl(nir_function_impl *impl, bool phi_webs_only)
>  {
> struct from_ssa_state state;
> +   bool progress = false;
>
> nir_builder_init(&state.builder, impl);
> state.dead_ctx = ralloc_context(NULL);
> @@ -791,7 +795,7 @@ nir_convert_from_ssa_impl(nir_function_impl *impl,
> bool phi_webs_only)
> }
>
> nir_foreach_block(block, impl) {
> -  resolve_registers_block(block, &state);
> +  progress |= resolve_registers_block(block, &state);
> }
>
> nir_foreach_block(block, impl) {
> @@ -804,15 +808,20 @@ nir_convert_from_ssa_impl(nir_function_impl *impl,
> bool phi_webs_only)
> /* Clean up dead instructions and the hash tables */
> _mesa_hash_table_destroy(state.merge_node_table, NULL);
> ralloc_free(state.dead_ctx);
> +   return progress;
>  }
>
> -void
> +bool
>  nir_convert_from_ssa(nir_shader *shader, bool phi_webs_only)
>  {
> +   bool progress = false;
> +
> nir_foreach_function(function, shader) {
>if (function->impl)
> - nir_convert_from_ssa_impl(function->impl, phi_webs_only);
> + progress |= nir_convert_from_ssa_impl(function->impl,
> phi_webs_only);
> }
> +
> +   return progress;
>  }
>
>
> --
> 2.10.2
>

Re: [Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

On Thu, 2017-03-16 at 15:24 -0700, Francisco Jerez wrote:
> Jan Vesely  writes:
> 
> > v2: buffers are created with one reference.
> > v3: add pipe_resource reference to mapping object
> > 
> 
> Mapping objects are supposed to be short-lived, they're logically part
> of the parent resource object so they shouldn't ever out-live it.  What
> is this useful for?

currently they can outlive the underlying pipe_resource. pipe_resource
is destroyed in root_resource destructor, while the list of mappings is
destroyed after resource destructor.
this is arguably an application bug. the piglit test does not call
clUnmapMemObject(), but it'd be nice to not access freed memory.

Vedran's alternative of clearing the list before destroying the
pipe_resource works as well (asserting that the list is empty in the
resource destructor would help spot the issue).
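
For readers unfamiliar with the reference pattern, here is a small
single-threaded sketch of how a second reference keeps the object alive
after the owner drops its own. The example_resource type and both
functions are hypothetical; they only imitate the calling convention of
pipe_resource_reference(&ptr, NULL) used in the patch and are not the
gallium implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted object, standing in for pipe_resource. */
struct example_resource {
   int refcount;
};

/* New objects start with one reference, owned by the creator. */
static struct example_resource *
example_resource_create(void)
{
   struct example_resource *res = calloc(1, sizeof(*res));
   if (res)
      res->refcount = 1;
   return res;
}

/* Make *dst point at src.  Takes a reference on src (if any), drops the
 * reference previously held through *dst (if any), and frees the old
 * object once its count reaches zero.  Not thread-safe.
 */
static void
example_resource_reference(struct example_resource **dst,
                           struct example_resource *src)
{
   struct example_resource *old = *dst;

   if (old == src)
      return;

   if (src)
      src->refcount++;

   if (old) {
      assert(old->refcount > 0);
      if (--old->refcount == 0)
         free(old);
   }

   *dst = src;
}
```

In this model the mapping takes its own reference when it is created and
releases it in its destructor, so the object is freed by whichever of the
two goes away last.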

Jan

> 
> > CC: "17.0 13.0" 
> > 
> > Signed-off-by: Jan Vesely 
> > ---
> >  src/gallium/state_trackers/clover/core/resource.cpp | 11 ---
> >  src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
> >  2 files changed, 12 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
> > b/src/gallium/state_trackers/clover/core/resource.cpp
> > index 06fd3f6..83e3c26 100644
> > --- a/src/gallium/state_trackers/clover/core/resource.cpp
> > +++ b/src/gallium/state_trackers/clover/core/resource.cpp
> > @@ -25,6 +25,7 @@
> >  #include "pipe/p_screen.h"
> >  #include "util/u_sampler.h"
> >  #include "util/u_format.h"
> > +#include "util/u_inlines.h"
> >  
> >  using namespace clover;
> >  
> > @@ -176,7 +177,7 @@ root_resource::root_resource(clover::device &dev, 
> > memory_obj &obj,
> >  }
> >  
> >  root_resource::~root_resource() {
> > -   device().pipe->resource_destroy(device().pipe, pipe);
> > +   pipe_resource_reference(&this->pipe, NULL);
> >  }
> >  
> >  sub_resource::sub_resource(resource &r, const vector &offset) :
> > @@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource &r,
> >pxfer = NULL;
> >throw error(CL_OUT_OF_RESOURCES);
> > }
> > +   pipe_resource_reference(&res, r.pipe);
> >  }
> >  
> >  mapping::mapping(mapping &&m) :
> > -   pctx(m.pctx), pxfer(m.pxfer), p(m.p) {
> > +   pctx(m.pctx), pxfer(m.pxfer), res(m.res), p(m.p) {
> > m.pctx = NULL;
> > m.pxfer = NULL;
> > +   m.res = NULL;
> > m.p = NULL;
> >  }
> >  
> >  mapping::~mapping() {
> > if (pxfer) {
> >pctx->transfer_unmap(pctx, pxfer);
> > }
> > +   pipe_resource_reference(&res, NULL);
> >  }
> >  
> > @@ -222,5 +226,6 @@ mapping::operator=(mapping m) {
> > std::swap(pctx, m.pctx);
> > std::swap(pxfer, m.pxfer);
> > +   std::swap(res, m.res);
> > std::swap(p, m.p);
> > return *this;
> >  }
> > diff --git a/src/gallium/state_trackers/clover/core/resource.hpp 
> > b/src/gallium/state_trackers/clover/core/resource.hpp
> > index 9993dcb..cea9617 100644
> > --- a/src/gallium/state_trackers/clover/core/resource.hpp
> > +++ b/src/gallium/state_trackers/clover/core/resource.hpp
> > @@ -123,9 +123,10 @@ namespace clover {
> >}
> >  
> > private:
> > -  pipe_context *pctx;
> > -  pipe_transfer *pxfer;
> > -  void *p;
> > +  pipe_context *pctx = NULL;
> > +  pipe_transfer *pxfer = NULL;
> > +  pipe_resource *res = NULL;
> > +  void *p = NULL;
> > };
> >  }
> >  
> > -- 
> > 2.9.3



[Mesa-dev] [PATCH] configure.ac: Use POSIX word boundary regex.

2017-03-16 Thread Vinson Lee

Fixes: fe56c745b8cb ("Convert sed(1) syntax to be compatible with FreeBSD and 
OpenBSD")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100236
Signed-off-by: Vinson Lee 
---
 configure.ac | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/configure.ac b/configure.ac
index 8c9d756f294e..684c0e6fcaed 100644
--- a/configure.ac
+++ b/configure.ac
@@ -907,18 +907,18 @@ llvm_add_target() {
 # Call this inside ` ` to get the return value.
 # $1 is the llvm-config command with arguments.
 strip_unwanted_llvm_flags() {
-# Use \> (marks the end of the word)
+# Use [[:>:]] (marks the end of the word)
 echo " `$1`" | sed -E \
 -e 's/[[[:space:]]]+-m[[^[:space:]]]*//g' \
--e 's/[[[:space:]]]+-DNDEBUG\>//g' \
+-e 's/[[[:space:]]]+-DNDEBUG[[[:>:]]]//g' \
 -e 's/[[[:space:]]]+-D_GNU_SOURCE\>//g' \
--e 's/[[[:space:]]]+-pedantic\>//g' \
+-e 's/[[[:space:]]]+-pedantic[[[:>:]]]//g' \
 -e 's/[[[:space:]]]+-W[[^[:space:]]]*//g' \
 -e 's/[[[:space:]]]+-O[[^[:space:]]]*//g' \
 -e 's/[[[:space:]]]+-g[[^[:space:]]]*//g' \
--e 's/-fno-rtti\>/-Fno-rtti/g' \
+-e 's/-fno-rtti[[[:>:]]]/-Fno-rtti/g' \
 -e 's/[[[:space:]]]+-f[[^[:space:]]]*//g' \
--e 's/-Fno-rtti\>/-fno-rtti/g' \
+-e 's/-Fno-rtti[[[:>:]]]/-fno-rtti/g' \
 -e 's/^[[[:space:]]]//'
 }
 
-- 
2.12.0


Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

2017-03-16 Thread Marek Olšák

Is there a way not to use ninja with meson, because ninja redirects
all stderr output from gcc to stdout, which breaks many development
environments that expect errors in stderr?

I'm basically saying that if ninja can't keep gcc errors in stderr, I
wouldn't like any project that I might be involved in to require ninja
for building.

Marek

On Thu, Mar 16, 2017 at 10:25 PM, Dylan Baker  wrote:
> Why bother, and why would we want this?
>
> First it's written in python, which means the potential developer base
> is massive. And it provides a recursive view for humans, but a
> non-recursive view for the system. This is the best of both worlds,
> humans can organize the build system in a way that makes sense, and the
> machine gets a non-recursive build system. It also uses ninja rather
> than make, and ninja is inherently faster than make. Meson also has a
> simpler syntax than autotools or cmake; it is not Turing complete by
> design, nor does it expose python, again by design. This allows meson
> itself to be reimplemented in another language if python becomes a
> dead-end or a bottle-neck. It also makes it much easier to understand
> what the build system is doing.
>
> What's different about using meson?
>
> Well, apart from faster builds and less magic in the build system? The
> configure flags are different, it uses -D<option>=<value> more like cmake
> than the --enable or --with flags of autotools, although oddly it uses
> --prefix and friends when calling meson, but not with mesonconf, there's
> a bug opened on this. Meson also doesn't support in-tree builds at all;
> all builds are done out of tree. It also doesn't provide a "make dist"
> target, fortunately there's this awesome tool called git, and it
> provides a "git archive" command that does much the same thing. Did I
> mention it's fast?
>
> Here are the performance numbers I see on a 2 core 4 thread SKL, without
> initial configuration, and building out of tree (using zsh):
>
> For meson the command line is:
> time (meson build -Dmanpages=true && ninja -C build)
>
> For autotools the command line is:
> time (mkdir build && cd build && ../autotools && make -j5 -l4)
>
> meson (cold ccache): 13.37s user 1.74s system 255% cpu  5.907 total
> autotools (cold ccache): 26.50s user 1.71s system 129% cpu 21.835 total
> meson (hot ccache):   2.13s user 0.39s system 154% cpu  1.633 total
> autotools (hot ccache):  13.93s user 0.73s system 102% cpu 14.259 total
>
> That's ~4x faster for a cold build and ~10x faster for a hot build.
>
> For a make clean && make style build with a hot cache:
> meson: 4.64s user 0.33s system 334% cpu 1.486 total
> autotools: 7.93s user 0.32s system 167% cpu 4.920 total
>
> Why bother with libdrm?
>
> It's a simple build system that could be completely (or mostly
> completely) ported in a very short time, and could serve as a tech
> demo for the advantages of using meson to garner feedback for embarking
> on a larger project, like mesa (which is what I'm planning to work on
> next).
>
> tl;dr
>
> I wrote this as practice for porting Mesa, and figured I might as well
> send it out since I wrote it.
>
> It is very likely that neither of these large patches will show up on the
> mailing list, but this is available at my github:
> https://github.com/dcbaker/libdrm wip/meson
>
> Dylan Baker (2):
>   Port build system to meson
>   remove autotools build
>
>  .editorconfig|   2 +-
>  .gitignore   |  82 +-
>  Makefile.am  | 144 +
>  Makefile.sources |  41 +--
>  README   |  21 +-
>  amdgpu/Makefile.am   |  47 +---
>  amdgpu/Makefile.sources  |  15 +-
>  amdgpu/libdrm_amdgpu.pc.in   |  11 +-
>  amdgpu/meson.build   |  57 +++-
>  autogen.sh   |  20 +-
>  configure.ac | 568 +
>  etnaviv/Makefile.am  |  26 +-
>  etnaviv/Makefile.sources |  12 +-
>  etnaviv/libdrm_etnaviv.pc.in |  11 +-
>  etnaviv/meson.build  |  56 +++-
>  exynos/Makefile.am   |  27 +--
>  exynos/libdrm_exynos.pc.in   |  11 +-
>  exynos/meson.build   |  52 +++-
>  freedreno/Makefile.am|  30 +--
>  freedreno/Makefile.sources   |  26 +-
>  freedreno/libdrm_freedreno.pc.in |  11 +-
>  freedreno/meson.build|  72 -
>  include/drm/meson.build  |  48 +++-
>  intel/Makefile.am|  73 +
>  intel/Makefile.sources   |  15 +-
>  intel/intel_bufmgr_gem.c |   8 +-
>  intel/libdrm_intel.pc.in |  11 +-
>  intel/meson.build|  55 +++-
>  libdrm.pc.in |  10 +-
>  libkms/Makefile.am   |  43 +--
>  libkms/Makefile.sources  |  23 +-
>  libkms/libkms.pc.in  |  11 +-
>  libkms/meson.build

Re: [Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

2017-03-16 Thread Francisco Jerez

Jan Vesely  writes:

> v2: buffers are created with one reference.
> v3: add pipe_resource reference to mapping object
>

Mapping objects are supposed to be short-lived, they're logically part
of the parent resource object so they shouldn't ever out-live it.  What
is this useful for?

> CC: "17.0 13.0" 
>
> Signed-off-by: Jan Vesely 
> ---
>  src/gallium/state_trackers/clover/core/resource.cpp | 11 ---
>  src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
>  2 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
> b/src/gallium/state_trackers/clover/core/resource.cpp
> index 06fd3f6..83e3c26 100644
> --- a/src/gallium/state_trackers/clover/core/resource.cpp
> +++ b/src/gallium/state_trackers/clover/core/resource.cpp
> @@ -25,6 +25,7 @@
>  #include "pipe/p_screen.h"
>  #include "util/u_sampler.h"
>  #include "util/u_format.h"
> +#include "util/u_inlines.h"
>  
>  using namespace clover;
>  
> @@ -176,7 +177,7 @@ root_resource::root_resource(clover::device &dev, 
> memory_obj &obj,
>  }
>  
>  root_resource::~root_resource() {
> -   device().pipe->resource_destroy(device().pipe, pipe);
> +   pipe_resource_reference(&this->pipe, NULL);
>  }
>  
>  sub_resource::sub_resource(resource &r, const vector &offset) :
> @@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource &r,
>pxfer = NULL;
>throw error(CL_OUT_OF_RESOURCES);
> }
> +   pipe_resource_reference(&res, r.pipe);
>  }
>  
>  mapping::mapping(mapping &&m) :
> -   pctx(m.pctx), pxfer(m.pxfer), p(m.p) {
> +   pctx(m.pctx), pxfer(m.pxfer), res(m.res), p(m.p) {
> m.pctx = NULL;
> m.pxfer = NULL;
> +   m.res = NULL;
> m.p = NULL;
>  }
>  
>  mapping::~mapping() {
> if (pxfer) {
>pctx->transfer_unmap(pctx, pxfer);
> }
> +   pipe_resource_reference(&res, NULL);
>  }
>  
> @@ -222,5 +226,6 @@ mapping::operator=(mapping m) {
> std::swap(pctx, m.pctx);
> std::swap(pxfer, m.pxfer);
> +   std::swap(res, m.res);
> std::swap(p, m.p);
> return *this;
>  }
> diff --git a/src/gallium/state_trackers/clover/core/resource.hpp 
> b/src/gallium/state_trackers/clover/core/resource.hpp
> index 9993dcb..cea9617 100644
> --- a/src/gallium/state_trackers/clover/core/resource.hpp
> +++ b/src/gallium/state_trackers/clover/core/resource.hpp
> @@ -123,9 +123,10 @@ namespace clover {
>}
>  
> private:
> -  pipe_context *pctx;
> -  pipe_transfer *pxfer;
> -  void *p;
> +  pipe_context *pctx = NULL;
> +  pipe_transfer *pxfer = NULL;
> +  pipe_resource *res = NULL;
> +  void *p = NULL;
> };
>  }
>  
> -- 
> 2.9.3



Re: [Mesa-dev] [PATCH 04/23] nir: Return progress from nir_lower_vars_to_ssa().

On Thu, Mar 16, 2017 at 2:18 PM, Matt Turner  wrote:

> ---
>  src/compiler/nir/nir.h   | 2 +-
>  src/compiler/nir/nir_lower_vars_to_ssa.c | 8 ++--
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 2dedb45..acbe91c 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2390,7 +2390,7 @@ bool nir_is_per_vertex_io(nir_variable *var,
> gl_shader_stage stage);
>  void nir_lower_io_types(nir_shader *shader);
>  void nir_lower_regs_to_ssa_impl(nir_function_impl *impl);
>  void nir_lower_regs_to_ssa(nir_shader *shader);
> -void nir_lower_vars_to_ssa(nir_shader *shader);
> +bool nir_lower_vars_to_ssa(nir_shader *shader);
>
>  bool nir_remove_dead_variables(nir_shader *shader, nir_variable_mode
> modes);
>  bool nir_lower_constant_initializers(nir_shader *shader,
> diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c
> b/src/compiler/nir/nir_lower_vars_to_ssa.c
> index 37a786c..e5a12eb 100644
> --- a/src/compiler/nir/nir_lower_vars_to_ssa.c
> +++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
> @@ -737,11 +737,15 @@ nir_lower_vars_to_ssa_impl(nir_function_impl *impl)
> return progress;
>  }
>
> -void
> +bool
>  nir_lower_vars_to_ssa(nir_shader *shader)
>  {
> +   bool progress = false;
> +
> nir_foreach_function(function, shader) {
>if (function->impl)
> - nir_lower_vars_to_ssa_impl(function->impl);
> + progress |= nir_lower_vars_to_ssa_impl(function->impl);
>

Already had a concept of progress and just didn't wire it up?  Good work
me. :(


> }
> +
> +   return progress;
>  }
> --
> 2.10.2
>

Re: [Mesa-dev] [PATCH 1/4] i965: Fall back to GL 4.2 on Haswell if the kernel isn't new enough.

2017-03-16 Thread Ilia Mirkin

On Thu, Mar 16, 2017 at 6:12 PM, Kenneth Graunke  wrote:
> In commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6 I enabled GL 4.5
> on Haswell...but failed to check if we could do indirect compute
> shader dispatch...and query buffer objects.
>
> Indirect compute shader dispatch requires command parser version 5
> (kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in
> Linux v4.4).  On earlier kernels we would have disabled
> ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.
>
> Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG,
> which means command parser version 7 (Linux v4.8).  On earlier kernels
> we would have disabled ARB_query_buffer_object, which is a mandatory
> part of OpenGL 4.3+.

According to my notes, ARB_qbo is part of GL 4.4. Can you double-check?

>
> The new version support looks like:
>
> - Kernel 4.1 and older => OpenGL 3.3
> - Kernel 4.2-4.7   => OpenGL 4.2
> - Kernel 4.8+  => OpenGL 4.5
>
> Cc: "17.0" 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index 81cb0deabb3..225a387fa90 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1554,8 +1554,13 @@ set_max_gl_versions(struct intel_screen *screen)
>dri_screen->max_gl_es2_version = has_astc ? 32 : 31;
>break;
> case 7:
> -  dri_screen->max_gl_core_version = screen->devinfo.is_haswell &&
> - can_do_pipelined_register_writes(screen) ? 45 : 33;
> +  dri_screen->max_gl_core_version = 33;
> +  if (screen->devinfo.is_haswell &&
> +  can_do_pipelined_register_writes(screen) &&
> +  can_do_mi_math_and_lrr(screen)) {
> + dri_screen->max_gl_core_version =
> +can_do_compute_dispatch(screen) ? 45 : 42;
> +  }
>dri_screen->max_gl_compat_version = 30;
>dri_screen->max_gl_es1_version = 11;
>dri_screen->max_gl_es2_version = screen->devinfo.is_haswell ? 31 : 30;
> --
> 2.12.0
>

Re: [Mesa-dev] [PATCH] i965: Select pipeline and emit state base address in Gen8+ HiZ ops.

2017-03-16 Thread Nanley Chery

On Wed, Mar 08, 2017 at 10:27:20AM -0800, Nanley Chery wrote:
> On Wed, Mar 08, 2017 at 10:07:12AM -0800, Nanley Chery wrote:
> > On Wed, Mar 08, 2017 at 02:17:59AM -0800, Kenneth Graunke wrote:
> > > On Thursday, March 2, 2017 4:36:08 PM PST Nanley Chery wrote:
> > > > On Mon, Feb 06, 2017 at 03:55:49PM -0800, Kenneth Graunke wrote:
> > > > > If a HiZ op is the first thing in the batch, we should make sure
> > > > > to select the render pipeline and emit state base address before
> > > > > proceeding.
> > > > > 
> > > > > I believe 3DSTATE_WM_HZ_OP creates 3DPRIMITIVEs internally, and
> > > > > dispatching those on the GPGPU pipeline seems a bit sketchy.  I'm
> > > > 
> > > > Yes, it does seem like we currently allow HZ_OPs within a GPGPU
> > > > pipeline. This patch should fix that problem.
> > > > 
> > > > > not actually sure that STATE_BASE_ADDRESS is necessary, as the
> > > > > depth related commands use graphics addresses, not ones relative
> > > > > to the base address...but we're likely to set it as part of the
> > > > > next operation anyway, so we should just do it right away.
> > > > > 
> > > > 
> > > > I agree, re-emitting STATE_BASE_ADDRESS doesn't seem necessary. I think
> > > > we should drop this part of the patch and add it back in later if we get
> > > > some data that it's necessary. Leaving it there may be distracting to
> > > > some readers and the BDW PRM warns that it's an expensive command:
> > > > 
> > > > Execution of this command causes a full pipeline flush, thus its
> > > > use should be minimized for higher performance.
> > > 
> > > I think it should be basically free, actually.  We track a boolean,
> > > brw->batch.state_base_address_emitted, to avoid emitting it multiple
> > > times per batch.
> > > 
> > > Let's say the first thing in a fresh batch is a HiZ op, followed by
> > > normal drawing.  Previously, we'd do:
> > > 
> > > 1. HiZ op commands
> > > 2. STATE_BASE_ADDRESS (triggered by normal rendering upload)
> > > 3. rest of normal drawing commands
> > > 
> > > Now we'd do:
> > > 
> > > 1. STATE_BASE_ADDRESS (triggered by HiZ op)
> > > 2. HiZ op commands
> > > 3. normal drawing commands (second SBA is skipped)
> > > 
> > > In other words...we're just moving it a bit earlier.  I suppose there
> > > could be a batch containing only HiZ ops, at which point we'd pay for
> > > a single STATE_BASE_ADDRESS...but that seems really unlikely.
> > > 
> > 
> > Sorry for not stating it up front, but the special case you've mentioned
> > is exactly what I'd like not to hurt unnecessarily.
> > 
> 
> Correct me if I'm wrong, but after thinking about it some more, it seems
> that performance wouldn't suffer by emitting the SBA since the pipeline
> was already flushed at the end of the preceding batch. It may also
> improve since the pipelined HiZ op will likely be followed by other
> pipelined commands. I'm not totally confident in my understanding on
> pipeline flushes by the way. Is this why you'd like to emit the SBA here?
> I think it's fine to leave it if we expound on the rationale.
> 

Thinking about it some more, we probably don't want to make separate
performance and bug-fix changes in the same patch. I have another
comment at the bottom of this email.
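
For reference, the emit-once tracking described in the quoted text can be
modelled roughly as below. This is a toy model, not driver code: the
example_batch structure and the example_* functions are hypothetical, and
each "command" is reduced to a counter increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a batch; only the emit-once flag and a command counter. */
struct example_batch {
   bool state_base_address_emitted;
   unsigned commands_emitted;
};

/* Emits STATE_BASE_ADDRESS at most once per batch. */
static void
example_emit_sba(struct example_batch *batch)
{
   assert(batch != NULL);

   if (batch->state_base_address_emitted)
      return;

   batch->commands_emitted++;
   batch->state_base_address_emitted = true;
}

/* A HiZ op that makes sure the SBA is emitted before its own commands. */
static void
example_hiz_op(struct example_batch *batch)
{
   example_emit_sba(batch);
   batch->commands_emitted++;
}

/* Normal drawing also requests the SBA, which is skipped if already done. */
static void
example_draw(struct example_batch *batch)
{
   example_emit_sba(batch);
   batch->commands_emitted++;
}
```

A batch with a HiZ op followed by drawing emits the SBA once either way;
only a batch that contains nothing but HiZ ops pays for an SBA it would
otherwise have avoided.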

> -Nanley
> 
> > > > > Cc: "17.0" 
> > > > > Signed-off-by: Kenneth Graunke 
> > > > > ---
> > > > >  src/mesa/drivers/dri/i965/gen8_depth_state.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > > 
> > > > > diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
> > > > > b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > > index a7e61354fd5..620b32df8bb 100644
> > > > > --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > > +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > > @@ -404,6 +404,9 @@ gen8_hiz_exec(struct brw_context *brw, struct 
> > > > > intel_mipmap_tree *mt,
> > > > > if (op == BLORP_HIZ_OP_NONE)
> > > > >return;
> > > > >  
> > > > 
> > > > It would be helpful if you included the rationale here as a code
> > > > comment. Something like the first two sentences of your commit message
> > > > should work.
> > > 
> > > I can do that.
> > > 
> > > > > +   brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
> > > > 
> > > > According to Vol07 of the BDW+ PRMs,
> > > > 
> > > > The previously active pipeline needs to be flushed via the
> > > > MI_FLUSH command immediately before switching to a different
> > > > pipeline via use of the PIPELINE_SELECT command.
> > > > 
> > > > However it doesn't look like MI_FLUSH is present after HSW. So there
> > > > shouldn't be any additional work to do here.
> > > 
> > > Flushes are definitely required when switching the pipeline, but I
> > > believe that brw_emit_select_pipeline() does that work.
> > > 
> > > FWIW, MI_FLUSH was replaced by PIPE_CONTROL many generations ago.
> > > I believe the validation team stopped testing MI_FLUSH on Sandybridge.
> > > 
> > 
> > Thanks for l

[Mesa-dev] [PATCH 1/4] i965: Fall back to GL 4.2 on Haswell if the kernel isn't new enough.

In commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6 I enabled GL 4.5
on Haswell...but failed to check if we could do indirect compute
shader dispatch...and query buffer objects.

Indirect compute shader dispatch requires command parser version 5
(kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in
Linux v4.4).  On earlier kernels we would have disabled
ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.

Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG,
which means command parser version 7 (Linux v4.8).  On earlier kernels
we would have disabled ARB_query_buffer_object, which is a mandatory
part of OpenGL 4.3+.

The new version support looks like:

- Kernel 4.1 and older => OpenGL 3.3
- Kernel 4.2-4.7   => OpenGL 4.2
- Kernel 4.8+  => OpenGL 4.5

Cc: "17.0" 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 81cb0deabb3..225a387fa90 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1554,8 +1554,13 @@ set_max_gl_versions(struct intel_screen *screen)
   dri_screen->max_gl_es2_version = has_astc ? 32 : 31;
   break;
case 7:
-  dri_screen->max_gl_core_version = screen->devinfo.is_haswell &&
- can_do_pipelined_register_writes(screen) ? 45 : 33;
+  dri_screen->max_gl_core_version = 33;
+  if (screen->devinfo.is_haswell &&
+  can_do_pipelined_register_writes(screen) &&
+  can_do_mi_math_and_lrr(screen)) {
+ dri_screen->max_gl_core_version =
+can_do_compute_dispatch(screen) ? 45 : 42;
+  }
   dri_screen->max_gl_compat_version = 30;
   dri_screen->max_gl_es1_version = 11;
   dri_screen->max_gl_es2_version = screen->devinfo.is_haswell ? 31 : 30;
-- 
2.12.0
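
The version selection in the hunk above can be restated as a small standalone function. This is only a sketch: `gen7_max_gl_core_version` and its boolean parameters are hypothetical stand-ins for `screen->devinfo.is_haswell` and the three `can_do_*()` probes, and are not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical restatement of the "case 7" branch of set_max_gl_versions().
 * Each boolean stands in for one of the driver's capability probes. */
static int
gen7_max_gl_core_version(bool is_haswell,
                         bool pipelined_register_writes,
                         bool mi_math_and_lrr,
                         bool compute_dispatch)
{
   /* Default for all Gen7 parts, and for Haswell on kernels that are
    * too old to allow register writes or MI_MATH. */
   int version = 33;

   if (is_haswell && pipelined_register_writes && mi_math_and_lrr) {
      /* Indirect compute dispatch decides between 4.5 and the 4.2 fallback. */
      version = compute_dispatch ? 45 : 42;
   }

   return version;
}
```

Under these assumptions, a Haswell with every probe succeeding reports 45, one that only lacks compute dispatch reports 42, and everything else stays at 33.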


[Mesa-dev] [PATCH 4/4] i965: Skip register write detection when possible.

Detecting register write support by trial and error introduces a
stall at screen creation time, which it would be nice to avoid.
Certain command parser versions guarantee this will work (see the
giant comment in intelInitScreen2 below, or a few commits ago):

- Ivybridge: version >= 1 (kernel v3.16)
- Baytrail:  version >= 2 (kernel v3.19)
- Haswell:   version >= 7 (kernel v4.8)

For simplicity, we don't bother with version 1 in this patch.

This assumes that the user hasn't disabled aliasing PPGTT via a kernel
command line parameter.  Don't do that - you're only breaking things.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 94df3cd8b0d..449be83e9aa 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1361,13 +1361,19 @@ err:
 static bool
 intel_detect_pipelined_so(struct intel_screen *screen)
 {
+   const struct gen_device_info *devinfo = &screen->devinfo;
+
/* Supposedly, Broadwell just works. */
-   if (screen->devinfo.gen >= 8)
+   if (devinfo->gen >= 8)
   return true;
 
-   if (screen->devinfo.gen <= 6)
+   if (devinfo->gen <= 6)
   return false;
 
+   /* See the big explanation about command parser versions below */
+   if (screen->cmd_parser_version >= (devinfo->is_haswell ? 7 : 2))
+  return true;
+
/* We use SO_WRITE_OFFSET0 since you're supposed to write it (unlike the
 * statistics registers), and we already reset it to zero before using it.
 */
-- 
2.12.0
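
The early-out order added to intel_detect_pipelined_so() can be sketched as a pure function. The enum and the function name `pipelined_so_shortcut` are hypothetical; the real function falls through to an actual register write test where this sketch returns SO_MUST_PROBE.

```c
#include <stdbool.h>

/* Hypothetical three-way answer: known yes, known no, or "go probe". */
enum so_detect { SO_NO, SO_YES, SO_MUST_PROBE };

static enum so_detect
pipelined_so_shortcut(int gen, bool is_haswell, int cmd_parser_version)
{
   /* Broadwell and later are assumed to just work. */
   if (gen >= 8)
      return SO_YES;

   /* Sandybridge and earlier never allow it. */
   if (gen <= 6)
      return SO_NO;

   /* Gen7: Haswell needs parser version 7, Ivybridge/Baytrail need 2. */
   if (cmd_parser_version >= (is_haswell ? 7 : 2))
      return SO_YES;

   /* Otherwise the stall-inducing trial write is still required. */
   return SO_MUST_PROBE;
}
```

Note that this only works if the stored version is the effective one, i.e. it has already been zeroed when register writes are known not to work.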


[Mesa-dev] [PATCH 3/4] i965: Set screen->cmd_parser_version to 0 if we can't write registers.

If we can't write registers, then the effective command parser version
is 0 - it may exist, but it's not usefully enabling anything.

See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8)
where the kernel starts doing this for us.  This makes us do more or
less the same thing on older kernels.

This should preserve a bit of sanity by allowing us to perform a
screen->cmd_parser_version >= N check to determine that we really can
use the features promised by command parser version N.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index d952062e27b..94df3cd8b0d 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1843,8 +1843,18 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen)
 *   means that we can no longer use it as an indicator of the
 *   age of the kernel.
 */
-   if (intel_detect_pipelined_so(screen))
+   if (intel_get_param(screen, I915_PARAM_CMD_PARSER_VERSION,
+   &screen->cmd_parser_version) < 0) {
+  /* Command parser does not exist - getparam is unrecognized */
+  screen->cmd_parser_version = 0;
+   }
+
+   if (!intel_detect_pipelined_so(screen)) {
+  /* We can't do anything, so the effective version is 0. */
+  screen->cmd_parser_version = 0;
+   } else {
   screen->kernel_features |= KERNEL_ALLOWS_SOL_OFFSET_WRITES;
+   }
 
const char *force_msaa = getenv("INTEL_FORCE_MSAA");
if (force_msaa) {
@@ -1877,11 +1887,6 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen)
  (ret != -1 || errno != EINVAL);
}
 
-   if (intel_get_param(screen, I915_PARAM_CMD_PARSER_VERSION,
-   &screen->cmd_parser_version) < 0) {
-  screen->cmd_parser_version = 0;
-   }
-
if (devinfo->gen >= 8 || screen->cmd_parser_version >= 2)
   screen->kernel_features |= KERNEL_ALLOWS_PREDICATE_WRITES;
 
-- 
2.12.0
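
A minimal sketch of the "effective version" rule this patch introduces, assuming the getparam return code and the outcome of the register write test are passed in as plain values. The function name and parameters are hypothetical and only mirror the logic of the hunk above.

```c
#include <stdbool.h>

/* Hypothetical helper: collapse the reported command parser version to 0
 * whenever it cannot actually be relied upon. */
static int
effective_cmd_parser_version(int getparam_ret,
                             int reported_version,
                             bool can_write_registers)
{
   /* getparam unrecognized: the command parser does not exist. */
   if (getparam_ret < 0)
      return 0;

   /* The parser exists but is not usefully enabling anything. */
   if (!can_write_registers)
      return 0;

   return reported_version;
}
```

With this in place, a later check such as `version >= 2` only passes when the kernel both reports that version and lets register writes through.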


[Mesa-dev] [PATCH 2/4] i965: Document the sad story of the kernel command parser.

This should help us figure out the complexities of which kernel
versions we need to get various features on various platforms.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 97 
 1 file changed, 97 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 225a387fa90..d952062e27b 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1746,6 +1746,103 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen)
   screen->subslice_total = 1 << (devinfo->gt - 1);
}
 
+   /* Gen7-7.5 kernel requirements / command parser saga:
+*
+* - pre-v3.16:
+*   Haswell and Baytrail cannot use any privileged batchbuffer features.
+*
+*   Ivybridge has aliasing PPGTT on by default, which accidentally marks
+*   all batches secure, allowing them to use any feature with no checking.
+*   This is effectively equivalent to a command parser version of
+*   \infinity - everything is possible.
+*
+*   The command parser does not exist, and querying the version will
+*   return -EINVAL.
+*
+* - v3.16:
+*   The kernel enables the command parser by default, for systems with
+*   aliasing PPGTT enabled (Ivybridge and Haswell).  However, the
+*   hardware checker is still enabled, so Haswell and Baytrail cannot
+*   do anything.
+*
+*   Ivybridge goes from "everything is possible" to "only what the
+*   command parser allows" (if the user boots with i915.cmd_parser=0,
+*   then everything is possible again).  We can only safely use features
+*   allowed by the supported command parser version.
+*
+*   Annoyingly, I915_PARAM_CMD_PARSER_VERSION reports the static version
+*   implemented by the kernel, even if it's turned off.  So, checking
+*   for version > 0 does not mean that you can write registers.  We have
+*   to try it and see.  The version does, however, indicate the age of
+*   the kernel.
+*
+*   Instead of matching the hardware checker's behavior of converting
+*   privileged commands to MI_NOOP, it makes execbuf2 start returning
+*   -EINVAL, making it dangerous to try and use privileged features.
+*
+*   Effective command parser versions:
+*   - Haswell:   0 (reporting 1, writes don't work)
+*   - Baytrail:  0 (reporting 1, writes don't work)
+*   - Ivybridge: 1 (enabled) or infinite (disabled)
+*
+* - v3.17:
+*   Baytrail aliasing PPGTT is enabled, making it like Ivybridge:
+*   effectively version 1 (enabled) or infinite (disabled).
+*
+* - v3.19: f1f55cc0556031c8ee3fe99dae7251e78b9b653b
+*   Command parser v2 supports predicate writes.
+*
+*   - Haswell:   0 (reporting 1, writes don't work)
+*   - Baytrail:  2 (enabled) or infinite (disabled)
+*   - Ivybridge: 2 (enabled) or infinite (disabled)
+*
+*   So version >= 2 is enough to know that Ivybridge and Baytrail
+*   will work.  Haswell still can't do anything.
+*
+* - v4.0: Version 3 happened.  Largely not relevant.
+*
+* - v4.1: 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b
+*   L3 config registers are properly saved and restored as part
+*   of the hardware context.  We can approximately detect this point
+*   in time by checking if I915_PARAM_REVISION is recognized - it
+*   landed in a later commit, but in the same release cycle.
+*
+* - v4.2: 245054a1fe33c06ad233e0d58a27ec7b64db9284
+*   Command parser finally gains secure batch promotion.  On Haswell,
+*   the hardware checker gets disabled, which finally allows it to do
+*   privileged commands.
+*
+*   I915_PARAM_CMD_PARSER_VERSION reports 3.  Effective versions:
+*   - Haswell:   3 (enabled) or 0 (disabled)
+*   - Baytrail:  3 (enabled) or infinite (disabled)
+*   - Ivybridge: 3 (enabled) or infinite (disabled)
+*
+*   Unfortunately, detecting this point in time is tricky, because
+*   no version bump happened when this important change occurred.
+*   On Haswell, if we can write any register, then the kernel is at
+*   least this new, and we can start trusting the version number.
+*
+* - v4.4: 2bbe6bbb0dc94fd4ce287bdac9e1bd184e23057b and
+*   Command parser reaches version 4, allowing access to Haswell
+*   atomic scratch and chicken3 registers.  If version >= 4, we know
+*   the kernel is new enough to support privileged features on all
+*   hardware.  However, the user might have disabled it...and the
+*   kernel will still report version 4.  So we still have to guess
+*   and check.
+*
+* - v4.4: 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8
+*   Command parser v5 whitelists indirect compute shader dispatch
+*   registers, needed for OpenGL 4.3 and later.
+*
+* - v4.8:
+*   Command pars

Re: [Mesa-dev] [PATCH 2/2] radv: add external memory support.

2017-03-16 Thread Bas Nieuwenhuizen

On Wed, Mar 15, 2017 at 1:25 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This adds support for exporting 2D images, to an
> opaque fd.
>
> This implements the:
> VK_KHX_external_memory_capabilities
> VK_KHX_external_memory
> VK_KHX_external_memory_fd
>
> extensions.
>
> These are used by SteamVR, we should work with anv
> to decide if we should ship these under an env
> var or something.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/vulkan/radv_device.c   |  85 ++-
>  src/amd/vulkan/radv_entrypoints_gen.py |   3 +
>  src/amd/vulkan/radv_formats.c  | 104 
> -
>  3 files changed, 178 insertions(+), 14 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index d1fd58d..7266d0a 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -84,6 +84,18 @@ static const VkExtensionProperties instance_extensions[] = 
> {
> .specVersion = 5,
> },
>  #endif
> +   {
> +   .extensionName = 
> VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME,
> +   .specVersion = 1,
> +   },

Why is this in here? Even if it turns out we want this, can we put it
in a separate patch?


> +   {
> +   .extensionName = 
> VK_KHX_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME,
> +   .specVersion = 1,
> +   },
> +   {
> +   .extensionName = 
> VK_KHX_EXTERNAL_SEMAPHORE_CAPABILITIES_EXTENSION_NAME,
> +   .specVersion = 1,
> +   },
>  };
>
>  static const VkExtensionProperties common_device_extensions[] = {
> @@ -115,6 +127,18 @@ static const VkExtensionProperties 
> common_device_extensions[] = {
> .extensionName = VK_NV_DEDICATED_ALLOCATION_EXTENSION_NAME,
> .specVersion = 1,
> },
> +   {
> +   .extensionName = 
> VK_KHX_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME,
> +   .specVersion = 1,
> +   },
> +   {
> +   .extensionName = VK_KHX_EXTERNAL_MEMORY_EXTENSION_NAME,
> +   .specVersion = 1,
> +   },
> +   {
> +   .extensionName = VK_KHX_EXTERNAL_MEMORY_FD_EXTENSION_NAME,
> +   .specVersion = 1,
> +   },
>  };
>
>  static VkResult
> @@ -255,7 +279,6 @@ radv_physical_device_finish(struct radv_physical_device 
> *device)
> close(device->local_fd);
>  }
>
> -
>  static void *
>  default_alloc_func(void *pUserData, size_t size, size_t align,
> VkSystemAllocationScope allocationScope)
> @@ -1694,7 +1717,7 @@ VkResult radv_AllocateMemory(
> VkResult result;
> enum radeon_bo_domain domain;
> uint32_t flags = 0;
> -   const VkDedicatedAllocationMemoryAllocateInfoNV *dedicate_info = NULL;
> +
> assert(pAllocateInfo->sType == 
> VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO);
>
> if (pAllocateInfo->allocationSize == 0) {
> @@ -1703,15 +1726,10 @@ VkResult radv_AllocateMemory(
> return VK_SUCCESS;
> }
>
> -   vk_foreach_struct(ext, pAllocateInfo->pNext) {
> -   switch (ext->sType) {
> -   case 
> VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_MEMORY_ALLOCATE_INFO_NV:
> -   dedicate_info = (const 
> VkDedicatedAllocationMemoryAllocateInfoNV *)ext;
> -   break;
> -   default:
> -   break;
> -   }
> -   }
> +   const VkImportMemoryFdInfoKHX *import_info =
> +   vk_find_struct_const(pAllocateInfo->pNext, 
> IMPORT_MEMORY_FD_INFO_KHX);
> +   const VkDedicatedAllocationMemoryAllocateInfoNV *dedicate_info =
> +   vk_find_struct_const(pAllocateInfo->pNext, 
> DEDICATED_ALLOCATION_MEMORY_ALLOCATE_INFO_NV);
>
> mem = vk_alloc2(&device->alloc, pAllocator, sizeof(*mem), 8,
>   VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> @@ -1726,6 +1744,17 @@ VkResult radv_AllocateMemory(
> mem->buffer = NULL;
> }
>
> +   if (import_info) {
> +   assert(import_info->handleType ==
> +  VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHX);
> +   mem->bo = device->ws->buffer_from_fd(device->ws, 
> import_info->fd,
> +NULL, NULL);
> +   if (!mem->bo)
> +   goto fail;
> +   else
> +   goto out_success;
> +   }
> +
> uint64_t alloc_size = align_u64(pAllocateInfo->allocationSize, 4096);
> if (pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_GTT_WRITE_COMBINE 
> ||
> pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_GTT_CACHED)
> @@ -1749,7 +1778,7 @@ VkResult radv_AllocateMemory(
> goto fail;
> }
> mem->type_index = pAllocateInfo->memoryTypeIndex;
> -
> +out_success:

maybe put this a bit earlier, so we i
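
For readers unfamiliar with the pNext lookup that replaces the vk_foreach_struct loop in the quoted patch, here is a rough sketch of the idea. The struct `base_in` and the function `find_struct_const` are simplified, hypothetical stand-ins, not the actual Vulkan types or the vk_find_struct_const macro.

```c
#include <stddef.h>

/* Hypothetical stand-in for the common header that every extension
 * structure in a pNext chain starts with. */
struct base_in {
   int sType;
   const void *pNext;
};

/* Walk the chain and return the first structure with a matching sType,
 * or NULL if the chain does not contain one. */
static const void *
find_struct_const(const void *start, int sType)
{
   for (const struct base_in *s = start; s != NULL; s = s->pNext) {
      if (s->sType == sType)
         return s;
   }
   return NULL;
}
```

A NULL result simply means the caller did not chain that extension structure, which is why the patch can test `if (import_info)` directly.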

[Mesa-dev] [PATCH v3 1/1] clover: use pipe_resource references

v2: buffers are created with one reference.
v3: add pipe_resource reference to mapping object

CC: "17.0 13.0" 

Signed-off-by: Jan Vesely 
---
 src/gallium/state_trackers/clover/core/resource.cpp | 11 ---
 src/gallium/state_trackers/clover/core/resource.hpp |  7 ---
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/resource.cpp 
b/src/gallium/state_trackers/clover/core/resource.cpp
index 06fd3f6..83e3c26 100644
--- a/src/gallium/state_trackers/clover/core/resource.cpp
+++ b/src/gallium/state_trackers/clover/core/resource.cpp
@@ -25,6 +25,7 @@
 #include "pipe/p_screen.h"
 #include "util/u_sampler.h"
 #include "util/u_format.h"
+#include "util/u_inlines.h"
 
 using namespace clover;
 
@@ -176,7 +177,7 @@ root_resource::root_resource(clover::device &dev, 
memory_obj &obj,
 }
 
 root_resource::~root_resource() {
-   device().pipe->resource_destroy(device().pipe, pipe);
+   pipe_resource_reference(&this->pipe, NULL);
 }
 
 sub_resource::sub_resource(resource &r, const vector &offset) :
@@ -202,18 +203,21 @@ mapping::mapping(command_queue &q, resource &r,
   pxfer = NULL;
   throw error(CL_OUT_OF_RESOURCES);
}
+   pipe_resource_reference(&res, r.pipe);
 }
 
 mapping::mapping(mapping &&m) :
-   pctx(m.pctx), pxfer(m.pxfer), p(m.p) {
+   pctx(m.pctx), pxfer(m.pxfer), res(m.res), p(m.p) {
m.pctx = NULL;
m.pxfer = NULL;
+   m.res = NULL;
m.p = NULL;
 }
 
 mapping::~mapping() {
if (pxfer) {
   pctx->transfer_unmap(pctx, pxfer);
}
+   pipe_resource_reference(&res, NULL);
 }
 
@@ -222,5 +226,6 @@ mapping::operator=(mapping m) {
std::swap(pctx, m.pctx);
std::swap(pxfer, m.pxfer);
+   std::swap(res, m.res);
std::swap(p, m.p);
return *this;
 }
diff --git a/src/gallium/state_trackers/clover/core/resource.hpp 
b/src/gallium/state_trackers/clover/core/resource.hpp
index 9993dcb..cea9617 100644
--- a/src/gallium/state_trackers/clover/core/resource.hpp
+++ b/src/gallium/state_trackers/clover/core/resource.hpp
@@ -123,9 +123,10 @@ namespace clover {
   }
 
private:
-  pipe_context *pctx;
-  pipe_transfer *pxfer;
-  void *p;
+  pipe_context *pctx = NULL;
+  pipe_transfer *pxfer = NULL;
+  pipe_resource *res = NULL;
+  void *p = NULL;
};
 }
 
-- 
2.9.3
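
The reference handling this patch relies on can be illustrated with a toy model in plain C. `toy_resource` and `toy_resource_reference` are hypothetical; they only imitate the "set pointer, adjust counts, destroy at zero" behaviour that pipe_resource_reference() is used for here, and the destroy step is reduced to setting a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; a real resource would be freed by the
 * driver instead of being flagged. */
struct toy_resource {
   int refcount;
   bool destroyed;
};

/* Make *dst point at src, taking a reference on src and dropping the one
 * held on the previous target. Passing src == NULL just drops it. */
static void
toy_resource_reference(struct toy_resource **dst, struct toy_resource *src)
{
   struct toy_resource *old = *dst;

   if (old == src)
      return;

   if (src)
      src->refcount++;

   if (old && --old->refcount == 0)
      old->destroyed = true;

   *dst = src;
}
```

In this model a mapping that holds its own reference keeps the resource alive even after the root object has dropped its reference, which is the situation the v3 change addresses.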


Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

Quoting Ilia Mirkin (2017-03-16 14:32:09)
> On Thu, Mar 16, 2017 at 5:25 PM, Dylan Baker  wrote:
> > Why bother, and why would we want this? 
> >
> > First it's written in python, which means the potential developer base
> > is massive. And it provides a recursive view for humans, but a
> > non-recursive view for the system. This is the best of both worlds,
> > humans can organize the build system in a way that makes sense, and the
> > machine gets a non-recursive build system. It also uses ninja rather
> > than make, and ninja is faster than make inherently. Meson also has a
> > simpler syntax than autotools or cmake: it's not Turing Complete by
> > design, nor does it expose python, again, by design. This allows meson
> > itself to be reimplemented in a another language if python becomes a
> > dead-end or a bottle-neck. It also makes it much easier to understand
> > what the build system is doing.
> >
> > What's different about using meson?
> >
> > Well, apart from faster builds and less magic in the build system? The
> > configure flags are different, it uses -D= more like cmake
> > than the --enable or --with flags of autotools, although oddly it uses
> > --prefix and friends when calling meson, but not with mesonconf, there's
> > a bug opened on this. Meson also doesn't support in-tree builds at all;
> > all builds are done out of tree. It also doesn't provide a "make dist"
> > target, fortunately there's this awesome tool called git, and it
> > provides a "git archive" command that does much the same thing. Did I
> > mention it's fast?
> >
> > Here are the performance numbers I see on a 2 core 4 thread SKL, without
> > initial configuration, and building out of tree (using zsh):
> >
> > For meson the command line is:
> > time (meson build -Dmanpages=true && ninja -C build)
> >
> > For autotools the command line is:
> > time (mdkir build && cd build && ../autotools && make -j5 -l4)
> 
> Probably mkdir...

derp, yeah.

> 
> >
> > meson (cold ccache): 13.37s user 1.74s system 255% cpu  5.907 total
> > autotools (cold ccache): 26.50s user 1.71s system 129% cpu 21.835 total
> > meson (hot ccache):   2.13s user 0.39s system 154% cpu  1.633 total
> > autotools (hot ccache):  13.93s user 0.73s system 102% cpu 14.259 total
> >
> > That's ~4x faster for a cold build and ~10x faster for a hot build.
> >
> > For a make clean && make style build with a hot cache:
> > meson: 4.64s user 0.33s system 334% cpu 1.486 total
> > autotools: 7.93s user 0.32s system 167% cpu 4.920 total
> >
> > Why bother with libdrm?
> >
> > It's a simple build system that could be completely (or mostly
> > completely) ported in a very short time, and could serve as a tech
> > demo for the advantages of using meson to garner feedback for embarking
> > on a larger project, like mesa (which is what I'm planning to work on
> > next).
> >
> > tl;dr
> >
> > I wrote this as practice for porting Mesa, and figured I might as well
> > send it out since I wrote it.
> >
> > It is very likely that neither of these large patches will show up on the
> > mailing list, but this is available at my github:
> > https://github.com/dcbaker/libdrm wip/meson
> 
> I haven't looked at meson or your patches in detail, but autotools
> supports 2 very important use-cases very well:
> 
> 1. ./configure --help
> 2. Cross-compilation with minimal requirement from the project being built
> 
> Can you comment on how these are handled in meson?
> 
> Cheers,
> 
>   -ilia

1. mesonconf  provides much the same thing. You can also read the
meson_options.txt file, which is generally pretty short. I haven't added
descriptions to the options in this patch.

2. you write a small ini style configuration file, something like:
[binaries]
c = '/usr/bin/aarch64-linux-gnu-gcc'
ar = '/usr/bin/aarch64-linux-gnu-gcc-ar'
strip = '/usr/bin/aarch64-linux-gnu-strip'
pkg-config = '/usr/bin/aarch64-linux-gnu-pkgconfig'

Then you just configure with:
meson build --cross-file cross_file.txt

then just ninja like normal

There's a more detailed walkthrough here:
https://github.com/mesonbuild/meson/wiki/Cross-compilation

I was able to cross compile the arm libraries for aarch64 using basically the
above configuration (I had to write a wrapper script for pkg-config to set a
couple of environment variables and install an archlinux aarch64 chroot,
because arch). Of course, I don't have access to any arm machines that I could
test with.

Dylan



[Mesa-dev] [PATCH v2 16/16] anv: Turn on inherited queries

It all just works since it's just a hardware register so we might as
well turn it on.
---
 src/intel/vulkan/anv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 8d4d243..f158d77 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -470,7 +470,7 @@ void anv_GetPhysicalDeviceFeatures(
   .shaderInt16  = false,
   .shaderResourceMinLod = false,
   .variableMultisampleRate  = false,
-  .inheritedQueries = false,
+  .inheritedQueries = true,
};
 
/* We can't do image stores in vec4 shaders */
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 13/16] anv/pipeline: Enable clipper statistics

---
 src/intel/vulkan/genX_pipeline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index a6ec3b6..bb3e203 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -995,6 +995,7 @@ emit_3dstate_clip(struct anv_pipeline *pipeline,
(void) wm_prog_data;
anv_batch_emit(&pipeline->batch, GENX(3DSTATE_CLIP), clip) {
   clip.ClipEnable   = true;
+  clip.StatisticsEnable = true;
   clip.EarlyCullEnable  = true;
   clip.APIMode  = APIMODE_D3D,
   clip.ViewportXYClipTestEnable = true;
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 15/16] anv: Implement pipeline statistics queries

From: Ilia Mirkin 

In the end, pipeline statistics queries look a lot like occlusion
queries only with between 1 and 11 begin/end pairs being generated
instead of just the one.

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/TODO  |   1 -
 src/intel/vulkan/anv_device.c  |   2 +-
 src/intel/vulkan/anv_private.h |   3 +
 src/intel/vulkan/genX_query.c  | 232 +++--
 4 files changed, 226 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/TODO b/src/intel/vulkan/TODO
index 5366774..b4da05d 100644
--- a/src/intel/vulkan/TODO
+++ b/src/intel/vulkan/TODO
@@ -3,7 +3,6 @@ Intel Vulkan ToDo
 
 Missing Features:
  - Investigate CTS failures on HSW
- - Pipeline statistics queries
  - Sparse memory
 
 Performance:
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d8eafb9..8d4d243 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -451,7 +451,7 @@ void anv_GetPhysicalDeviceFeatures(
   .textureCompressionASTC_LDR   = pdevice->info.gen >= 9, /* 
FINISHME CHV */
   .textureCompressionBC = true,
   .occlusionQueryPrecise= true,
-  .pipelineStatisticsQuery  = false,
+  .pipelineStatisticsQuery  = true,
   .fragmentStoresAndAtomics = true,
   .shaderTessellationAndGeometryPointSize   = true,
   .shaderImageGatherExtended= true,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 4f0c5b9..da1a34f 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1984,8 +1984,11 @@ struct anv_render_pass {
struct anv_subpass   subpasses[0];
 };
 
+#define ANV_PIPELINE_STATISTICS_MASK 0x07ff
+
 struct anv_query_pool {
VkQueryType  type;
+   VkQueryPipelineStatisticFlagspipeline_statistics;
/** Stride between slots, in bytes */
uint32_t stride;
/** Number of slots in this query pool */
diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index dd5ae80..2bbca66 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -51,6 +51,7 @@ VkResult genX(CreateQueryPool)(
 */
uint32_t uint64s_per_slot = 1;
 
+   VkQueryPipelineStatisticFlags pipeline_statistics = 0;
switch (pCreateInfo->queryType) {
case VK_QUERY_TYPE_OCCLUSION:
   /* Occlusion queries have two values: begin and end. */
@@ -61,7 +62,15 @@ VkResult genX(CreateQueryPool)(
   uint64s_per_slot += 1;
   break;
case VK_QUERY_TYPE_PIPELINE_STATISTICS:
-  return VK_ERROR_INCOMPATIBLE_DRIVER;
+  pipeline_statistics = pCreateInfo->pipelineStatistics;
+  /* We're going to trust this field implicitly so we need to ensure that
+   * no unhandled extension bits leak in.
+   */
+  pipeline_statistics &= ANV_PIPELINE_STATISTICS_MASK;
+
+  /* Statistics queries have a min and max for every statistic */
+  uint64s_per_slot += 2 * _mesa_bitcount(pipeline_statistics);
+  break;
default:
   assert(!"Invalid query type");
}
@@ -72,6 +81,7 @@ VkResult genX(CreateQueryPool)(
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
pool->type = pCreateInfo->queryType;
+   pool->pipeline_statistics = pipeline_statistics;
pool->stride = uint64s_per_slot * sizeof(uint64_t);
pool->slots = pCreateInfo->queryCount;
 
@@ -137,6 +147,7 @@ VkResult genX(GetQueryPoolResults)(
int ret;
 
assert(pool->type == VK_QUERY_TYPE_OCCLUSION ||
+  pool->type == VK_QUERY_TYPE_PIPELINE_STATISTICS ||
   pool->type == VK_QUERY_TYPE_TIMESTAMP);
 
if (pData == NULL)
@@ -184,8 +195,27 @@ VkResult genX(GetQueryPoolResults)(
 cpu_write_query_result(pData, flags, 0, slot[2] - slot[1]);
 break;
  }
- case VK_QUERY_TYPE_PIPELINE_STATISTICS:
-unreachable("pipeline stats not supported");
+
+ case VK_QUERY_TYPE_PIPELINE_STATISTICS: {
+uint32_t statistics = pool->pipeline_statistics;
+uint32_t idx = 0;
+while (statistics) {
+   uint32_t stat = u_bit_scan(&statistics);
+   uint64_t result = slot[idx * 2 + 2] - slot[idx * 2 + 1];
+
+   /* WaDividePSInvocationCountBy4:HSW,BDW */
+   if ((device->info.gen == 8 || device->info.is_haswell) &&
+   (1 << stat) == 
VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT)
+  result >>= 2;
+
+   cpu_write_query_result(pData, flags, idx, result);
+
+   idx++;
+}
+assert(idx == _mesa_bitcount(pool->pipeline_statistics));
+break;
+ }
+
  case VK_QUERY_TYPE_TIMESTAMP: {
 cpu_write_query_result(pData, flags, 0, slot[1]);
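
The CPU readback loop for pipeline statistics in the hunk above can be sketched outside the driver as follows. This assumes, as the slot indexing in the patch suggests, that slot[0] is an availability word followed by one begin/end pair per enabled statistic. The function name, the `divide_ps_by_4` flag and the output array are hypothetical; the constant is meant to match the Vulkan value of VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to equal VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT. */
#define FS_INVOCATIONS_BIT 0x00000080u

/* Hypothetical helper: walk the enabled statistics from the lowest bit up,
 * write end - begin for each into out[], and return how many were written. */
static uint32_t
read_pipeline_statistics(const uint64_t *slot, uint32_t enabled_mask,
                         bool divide_ps_by_4, uint64_t *out)
{
   uint32_t idx = 0;

   for (uint32_t stat = 0; stat < 32; stat++) {
      uint32_t bit = UINT32_C(1) << stat;
      if (!(enabled_mask & bit))
         continue;

      /* Pair idx lives at slot[2*idx + 1] (begin) and slot[2*idx + 2] (end). */
      uint64_t result = slot[idx * 2 + 2] - slot[idx * 2 + 1];

      /* Mirrors WaDividePSInvocationCountBy4 for HSW/BDW. */
      if (divide_ps_by_4 && bit == FS_INVOCATIONS_BIT)
         result >>= 2;

      out[idx++] = result;
   }

   return idx;
}
```

The returned count equals the number of set bits in the mask, which is what the assert at the end of the loop in the patch checks.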

[Mesa-dev] [PATCH v2 14/16] anv: Disable VF statistics for blorp and SOL memcpy

In order to get accurate statistics, we need to disable statistics for
blits, clears, and the surface state memcpy at the top of each secondary
command buffer.  There are two possible approaches to this:

 1) Disable before the blit/memcpy and re-enable afterwards

 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
part of pipeline state and then just disable statistics before
blits and memcpy operations.

Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
really matter which path we take.  We choose the second option as it's
more consistent with the way the rest of the statistics are enabled and
disabled.
---
 src/intel/vulkan/genX_blorp_exec.c | 5 +
 src/intel/vulkan/genX_gpu_memcpy.c | 4 
 src/intel/vulkan/genX_pipeline.c   | 9 +
 src/intel/vulkan/genX_state.c  | 3 ---
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index c1499fb..74c2679 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -169,6 +169,11 @@ genX(blorp_exec)(struct blorp_batch *batch,
 */
genX(cmd_buffer_enable_pma_fix)(cmd_buffer, false);
 
+   /* Disable VF statistics */
+   blorp_emit(batch, GENX(3DSTATE_VF_STATISTICS), vf) {
+  vf.StatisticsEnable = false;
+   }
+
blorp_exec(batch, params);
 
cmd_buffer->state.vb_dirty = ~0;
diff --git a/src/intel/vulkan/genX_gpu_memcpy.c 
b/src/intel/vulkan/genX_gpu_memcpy.c
index eb11c2f..3cbc723 100644
--- a/src/intel/vulkan/genX_gpu_memcpy.c
+++ b/src/intel/vulkan/genX_gpu_memcpy.c
@@ -218,6 +218,10 @@ genX(cmd_buffer_gpu_memcpy)(struct anv_cmd_buffer 
*cmd_buffer,
}
 #endif
 
+   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_VF_STATISTICS), vf) {
+  vf.StatisticsEnable = false;
+   }
+
anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
   prim.VertexAccessType = SEQUENTIAL;
   prim.PrimitiveTopologyType= _3DPRIM_POINTLIST;
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index bb3e203..5e6e609 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -1564,6 +1564,14 @@ emit_3dstate_vf_topology(struct anv_pipeline *pipeline)
 #endif
 
 static void
+emit_3dstate_vf_statistics(struct anv_pipeline *pipeline)
+{
+   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_VF_STATISTICS), vfs) {
+  vfs.StatisticsEnable = true;
+   }
+}
+
+static void
 compute_kill_pixel(struct anv_pipeline *pipeline,
const VkPipelineMultisampleStateCreateInfo *ms_info,
const struct anv_subpass *subpass)
@@ -1669,6 +1677,7 @@ genX(graphics_pipeline_create)(
emit_3dstate_ps_extra(pipeline, subpass);
emit_3dstate_vf_topology(pipeline);
 #endif
+   emit_3dstate_vf_statistics(pipeline);
 
*pPipeline = anv_pipeline_to_handle(pipeline);
 
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index fd8f8ac..bf1217b 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -52,9 +52,6 @@ genX(init_device_state)(struct anv_device *device)
   ps.PipelineSelection = _3D;
}
 
-   anv_batch_emit(&batch, GENX(3DSTATE_VF_STATISTICS), vfs)
-  vfs.StatisticsEnable = true;
-
anv_batch_emit(&batch, GENX(3DSTATE_AA_LINE_PARAMETERS), aa);
 
anv_batch_emit(&batch, GENX(3DSTATE_DRAWING_RECTANGLE), rect) {
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 12/16] genxml: s/Clipper Statistics Enable/Statistics Enable/

It's in 3DSTATE_CLIP, so it doesn't really need the extra detail.  This
matches what we do for VS, FS, etc.
---
 src/intel/genxml/gen6.xml  | 2 +-
 src/intel/genxml/gen7.xml  | 2 +-
 src/intel/genxml/gen75.xml | 2 +-
 src/intel/genxml/gen8.xml  | 2 +-
 src/intel/genxml/gen9.xml  | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 8a7eee0..5a7547d 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -803,7 +803,7 @@
 
 
 
-
+
 
 
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 8f6c341..284070c 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -924,7 +924,7 @@
   
   
 
-
+
 
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 43def02..a46352b 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -1043,7 +1043,7 @@
   
   
 
-
+
 
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 0fcf242..7e7dbc8 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -1123,7 +1123,7 @@
 
 
 
-
+
 
 
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index cd219e3..4d27f19 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -1175,7 +1175,7 @@
 
 
 
-
+
 
 
 
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 10/16] anv/query: Break GPU query calculation into a helper

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/genX_query.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index b8d4c55..6c26e6a 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -416,6 +416,22 @@ store_query_result(struct anv_batch *batch, uint32_t reg,
}
 }
 
+static void
+compute_query_result(struct anv_batch *batch, uint32_t dst_reg,
+ struct anv_bo *bo, uint32_t offset)
+{
+   emit_load_alu_reg_u64(batch, CS_GPR(0), bo, offset);
+   emit_load_alu_reg_u64(batch, CS_GPR(1), bo, offset + 8);
+
+   /* FIXME: We need to clamp the result for 32 bit. */
+
+   uint32_t *dw = anv_batch_emitn(batch, 5, GENX(MI_MATH));
+   dw[1] = alu(OPCODE_LOAD, OPERAND_SRCA, OPERAND_R1);
+   dw[2] = alu(OPCODE_LOAD, OPERAND_SRCB, OPERAND_R0);
+   dw[3] = alu(OPCODE_SUB, 0, 0);
+   dw[4] = alu(OPCODE_STORE, dst_reg, OPERAND_ACCU);
+}
+
 void genX(CmdCopyQueryPoolResults)(
 VkCommandBuffer commandBuffer,
 VkQueryPool queryPool,
@@ -444,18 +460,8 @@ void genX(CmdCopyQueryPoolResults)(
   slot_offset = (firstQuery + i) * pool->stride;
   switch (pool->type) {
   case VK_QUERY_TYPE_OCCLUSION:
- emit_load_alu_reg_u64(&cmd_buffer->batch,
-   CS_GPR(0), &pool->bo, slot_offset + 8);
- emit_load_alu_reg_u64(&cmd_buffer->batch,
-   CS_GPR(1), &pool->bo, slot_offset + 16);
-
- /* FIXME: We need to clamp the result for 32 bit. */
-
- uint32_t *dw = anv_batch_emitn(&cmd_buffer->batch, 5, GENX(MI_MATH));
- dw[1] = alu(OPCODE_LOAD, OPERAND_SRCA, OPERAND_R1);
- dw[2] = alu(OPCODE_LOAD, OPERAND_SRCB, OPERAND_R0);
- dw[3] = alu(OPCODE_SUB, 0, 0);
- dw[4] = alu(OPCODE_STORE, OPERAND_R2, OPERAND_ACCU);
+ compute_query_result(&cmd_buffer->batch, OPERAND_R2,
+  &pool->bo, slot_offset + 8);
  break;
 
   case VK_QUERY_TYPE_TIMESTAMP:
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 09/16] genxml: Add pipeline statistics registers on gen7+

Reviewed-By: Lionel Landwerlin 
---
 src/intel/genxml/gen7.xml  | 44 
 src/intel/genxml/gen75.xml | 44 
 src/intel/genxml/gen8.xml  | 44 
 src/intel/genxml/gen9.xml  | 44 
 4 files changed, 176 insertions(+)

diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 8219d64..8f6c341 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2501,6 +2501,50 @@
 
   
 
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 8e65c59..43def02 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2909,6 +2909,50 @@
 
   
 
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 1628237..0fcf242 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3175,6 +3175,50 @@
 
   
 
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 6849669..cd219e3 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3446,6 +3446,50 @@
 
   
 
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
   
 
 
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 11/16] anv/query: Rework store_query_result

The new version is a nice GPU parallel to cpu_write_query_result and it
nicely handles things like dealing with 32 vs. 64-bit offsets in the
destination buffer.

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/genX_query.c | 39 ---
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 6c26e6a..dd5ae80 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -400,18 +400,31 @@ emit_load_alu_reg_u64(struct anv_batch *batch, uint32_t reg,
 }
 
 static void
-store_query_result(struct anv_batch *batch, uint32_t reg,
-   struct anv_bo *bo, uint32_t offset, VkQueryResultFlags flags)
+gpu_write_query_result(struct anv_batch *batch,
+   struct anv_buffer *dst_buffer, uint32_t dst_offset,
+   VkQueryResultFlags flags,
+   uint32_t value_index, uint32_t reg)
 {
+   if (flags & VK_QUERY_RESULT_64_BIT)
+  dst_offset += value_index * 8;
+   else
+  dst_offset += value_index * 4;
+
anv_batch_emit(batch, GENX(MI_STORE_REGISTER_MEM), srm) {
   srm.RegisterAddress  = reg;
-  srm.MemoryAddress= (struct anv_address) { bo, offset };
+  srm.MemoryAddress= (struct anv_address) {
+ .bo = dst_buffer->bo,
+ .offset = dst_buffer->offset + dst_offset,
+  };
}
 
if (flags & VK_QUERY_RESULT_64_BIT) {
   anv_batch_emit(batch, GENX(MI_STORE_REGISTER_MEM), srm) {
  srm.RegisterAddress  = reg + 4;
- srm.MemoryAddress= (struct anv_address) { bo, offset + 4 };
+ srm.MemoryAddress= (struct anv_address) {
+.bo = dst_buffer->bo,
+.offset = dst_buffer->offset + dst_offset + 4,
+ };
   }
}
 }
@@ -454,7 +467,6 @@ void genX(CmdCopyQueryPoolResults)(
   }
}
 
-   dst_offset = buffer->offset + destOffset;
for (uint32_t i = 0; i < queryCount; i++) {
 
   slot_offset = (firstQuery + i) * pool->stride;
@@ -462,32 +474,29 @@ void genX(CmdCopyQueryPoolResults)(
   case VK_QUERY_TYPE_OCCLUSION:
  compute_query_result(&cmd_buffer->batch, OPERAND_R2,
   &pool->bo, slot_offset + 8);
+ gpu_write_query_result(&cmd_buffer->batch, buffer, destOffset,
+flags, 0, CS_GPR(2));
  break;
 
   case VK_QUERY_TYPE_TIMESTAMP:
  emit_load_alu_reg_u64(&cmd_buffer->batch,
CS_GPR(2), &pool->bo, slot_offset + 8);
+ gpu_write_query_result(&cmd_buffer->batch, buffer, destOffset,
+flags, 0, CS_GPR(2));
  break;
 
   default:
  unreachable("unhandled query type");
   }
 
-  store_query_result(&cmd_buffer->batch,
- CS_GPR(2), buffer->bo, dst_offset, flags);
-
   if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT) {
  emit_load_alu_reg_u64(&cmd_buffer->batch, CS_GPR(0),
&pool->bo, slot_offset);
- if (flags & VK_QUERY_RESULT_64_BIT)
-store_query_result(&cmd_buffer->batch,
-   CS_GPR(0), buffer->bo, dst_offset + 8, flags);
- else
-store_query_result(&cmd_buffer->batch,
-   CS_GPR(0), buffer->bo, dst_offset + 4, flags);
+ gpu_write_query_result(&cmd_buffer->batch, buffer, destOffset,
+flags, 1, CS_GPR(0));
   }
 
-  dst_offset += destStride;
+  destOffset += destStride;
}
 }
 
-- 
2.5.0.400.gff86faf
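
Note on the patch above: the destination offset arithmetic in
gpu_write_query_result can be sketched on its own.  This is only a
minimal illustration, not the driver code: query_value_offset is a
hypothetical name, and a plain bool stands in for testing
VK_QUERY_RESULT_64_BIT in the flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of value number value_index inside one destination entry.
 * Values are packed as 8-byte or 4-byte elements depending on whether
 * the caller asked for 64-bit results.  (Hypothetical helper.)
 */
static uint32_t
query_value_offset(uint32_t dst_offset, bool is_64bit, uint32_t value_index)
{
   return dst_offset + value_index * (is_64bit ? 8u : 4u);
}
```

With this shape, the availability value is simply value_index 1, so the
caller no longer needs separate "+ 8" and "+ 4" branches.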


[Mesa-dev] [PATCH v2 08/16] anv/query: Add a helper for writing a query pool result

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/genX_query.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 2d8f352..b8d4c55 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -108,6 +108,19 @@ void genX(DestroyQueryPool)(
vk_free2(&device->alloc, pAllocator, pool);
 }
 
+static void
+cpu_write_query_result(void *dst_slot, VkQueryResultFlags flags,
+   uint32_t value_index, uint64_t result)
+{
+   if (flags & VK_QUERY_RESULT_64_BIT) {
+  uint64_t *dst64 = dst_slot;
+  dst64[value_index] = result;
+   } else {
+  uint32_t *dst32 = dst_slot;
+  dst32[value_index] = result;
+   }
+}
+
 VkResult genX(GetQueryPoolResults)(
 VkDevice _device,
 VkQueryPool queryPool,
@@ -121,7 +134,6 @@ VkResult genX(GetQueryPoolResults)(
ANV_FROM_HANDLE(anv_device, device, _device);
ANV_FROM_HANDLE(anv_query_pool, pool, queryPool);
int64_t timeout = INT64_MAX;
-   uint64_t result;
int ret;
 
assert(pool->type == VK_QUERY_TYPE_OCCLUSION ||
@@ -169,13 +181,13 @@ VkResult genX(GetQueryPoolResults)(
   if (write_results) {
  switch (pool->type) {
  case VK_QUERY_TYPE_OCCLUSION: {
-result = slot[2] - slot[1];
+cpu_write_query_result(pData, flags, 0, slot[2] - slot[1]);
 break;
  }
  case VK_QUERY_TYPE_PIPELINE_STATISTICS:
 unreachable("pipeline stats not supported");
  case VK_QUERY_TYPE_TIMESTAMP: {
-result = slot[1];
+cpu_write_query_result(pData, flags, 0, slot[1]);
 break;
  }
  default:
@@ -185,19 +197,8 @@ VkResult genX(GetQueryPoolResults)(
  status = VK_NOT_READY;
   }
 
-  if (flags & VK_QUERY_RESULT_64_BIT) {
- uint64_t *dst = pData;
- if (write_results)
-dst[0] = result;
- if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
-dst[1] = available;
-  } else {
- uint32_t *dst = pData;
- if (write_results)
-dst[0] = result;
- if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
-dst[1] = available;
-  }
+  if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
+ cpu_write_query_result(pData, flags, 1, available);
 
   pData += stride;
   if (pData >= data_end)
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 05/16] anv/query: Let 32-bit values wrap

From the Vulkan 1.0.39 Specification:

   "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a
   32-bit value, the value may either wrap or saturate."

So we can either clamp or wrap.  Wrapping is both easier and what the
user gets if they use vkCmdCopyQueryPoolResults and we should be
consistent.  We could make vkCmdCopyQueryPoolResults clamp but it's
annoying and ends up burning extra batch for something the spec clearly
doesn't require.

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/genX_query.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 2429386..a311b4b 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -181,8 +181,6 @@ VkResult genX(GetQueryPoolResults)(
 dst[1] = slot[firstQuery + i].available;
   } else {
  uint32_t *dst = pData;
- if (result > UINT32_MAX)
-result = UINT32_MAX;
  if (write_results)
 dst[0] = result;
  if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
-- 
2.5.0.400.gff86faf
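
Note on the patch above: the two behaviours the spec allows can be
compared side by side.  This is a small sketch with hypothetical helper
names, not code from the driver.  The first function is what a plain
32-bit store does after this patch; the second is what the removed
UINT32_MAX check did.

```c
#include <stdint.h>

/* Wrap: keep only the low 32 bits of the 64-bit result. */
static uint32_t
result_wrap_u32(uint64_t result)
{
   return (uint32_t)result;
}

/* Saturate: clamp anything above UINT32_MAX to UINT32_MAX. */
static uint32_t
result_saturate_u32(uint64_t result)
{
   return result > UINT32_MAX ? UINT32_MAX : (uint32_t)result;
}
```

For a result of 0x100000005 the wrapping version yields 5 and the
saturating version yields UINT32_MAX; both are permitted by the quoted
spec text.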


[Mesa-dev] [PATCH v2 07/16] anv/query: Use a variable-length slot size

Not all queries are the same.  Even the two queries we support today
require a different amount of data per slot.  Once we introduce pipeline
statistics queries, the size will vary wildly.

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/anv_private.h |  9 +++-
 src/intel/vulkan/genX_query.c  | 52 --
 2 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 795fd24..4f0c5b9 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1984,14 +1984,11 @@ struct anv_render_pass {
struct anv_subpass   subpasses[0];
 };
 
-struct anv_query_pool_slot {
-   uint64_t available;
-   uint64_t begin;
-   uint64_t end;
-};
-
 struct anv_query_pool {
VkQueryType  type;
+   /** Stride between slots, in bytes */
+   uint32_t stride;
+   /** Number of slots in this query pool */
uint32_t slots;
 struct anv_bo bo;
 };
diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 9338209..2d8f352 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -41,14 +41,24 @@ VkResult genX(CreateQueryPool)(
ANV_FROM_HANDLE(anv_device, device, _device);
struct anv_query_pool *pool;
VkResult result;
-   uint32_t slot_size;
-   uint64_t size;
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO);
 
+   /* Query pool slots are made up of some number of 64-bit values packed
+* tightly together.  The first 64-bit value is always the "available" bit
+* which is 0 when the query is unavailable and 1 when it is available.
+* The 64-bit values that follow are determined by the type of query.
+*/
+   uint32_t uint64s_per_slot = 1;
+
switch (pCreateInfo->queryType) {
case VK_QUERY_TYPE_OCCLUSION:
+  /* Occlusion queries have two values: begin and end. */
+  uint64s_per_slot += 2;
+  break;
case VK_QUERY_TYPE_TIMESTAMP:
+  /* Timestamps just have the one timestamp value */
+  uint64s_per_slot += 1;
   break;
case VK_QUERY_TYPE_PIPELINE_STATISTICS:
   return VK_ERROR_INCOMPATIBLE_DRIVER;
@@ -56,16 +66,16 @@ VkResult genX(CreateQueryPool)(
   assert(!"Invalid query type");
}
 
-   slot_size = sizeof(struct anv_query_pool_slot);
pool = vk_alloc2(&device->alloc, pAllocator, sizeof(*pool), 8,
  VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (pool == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
pool->type = pCreateInfo->queryType;
+   pool->stride = uint64s_per_slot * sizeof(uint64_t);
pool->slots = pCreateInfo->queryCount;
 
-   size = pCreateInfo->queryCount * slot_size;
+   uint64_t size = pool->slots * pool->stride;
result = anv_bo_init_new(&pool->bo, device, size);
if (result != VK_SUCCESS)
   goto fail;
@@ -130,18 +140,20 @@ VkResult genX(GetQueryPoolResults)(
}
 
void *data_end = pData + dataSize;
-   struct anv_query_pool_slot *slot = pool->bo.map;
 
if (!device->info.has_llc) {
-  uint64_t offset = firstQuery * sizeof(*slot);
-  uint64_t size = queryCount * sizeof(*slot);
+  uint64_t offset = firstQuery * pool->stride;
+  uint64_t size = queryCount * pool->stride;
   anv_invalidate_range(pool->bo.map + offset,
MIN2(size, pool->bo.size - offset));
}
 
VkResult status = VK_SUCCESS;
for (uint32_t i = 0; i < queryCount; i++) {
-  bool available = slot[firstQuery + i].available;
+  uint64_t *slot = pool->bo.map + (firstQuery + i) * pool->stride;
+
+  /* Availability is always at the start of the slot */
+  bool available = slot[0];
 
   /* From the Vulkan 1.0.42 spec:
*
@@ -157,13 +169,13 @@ VkResult genX(GetQueryPoolResults)(
   if (write_results) {
  switch (pool->type) {
  case VK_QUERY_TYPE_OCCLUSION: {
-result = slot[firstQuery + i].end - slot[firstQuery + i].begin;
+result = slot[2] - slot[1];
 break;
  }
  case VK_QUERY_TYPE_PIPELINE_STATISTICS:
 unreachable("pipeline stats not supported");
  case VK_QUERY_TYPE_TIMESTAMP: {
-result = slot[firstQuery + i].begin;
+result = slot[1];
 break;
  }
  default:
@@ -178,13 +190,13 @@ VkResult genX(GetQueryPoolResults)(
  if (write_results)
 dst[0] = result;
  if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
-dst[1] = slot[firstQuery + i].available;
+dst[1] = available;
   } else {
  uint32_t *dst = pData;
  if (write_results)
 dst[0] = result;
  if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
-dst[1] = slot[firstQuery + i].available
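
Note on the patch above: the per-type slot size can be sketched in
isolation.  The enum and function names below are hypothetical
stand-ins (the driver switches on VkQueryType), but the counts follow
the patch: one 64-bit availability value, plus two values for occlusion
queries or one for timestamps.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two query types supported so far. */
enum query_kind { QUERY_OCCLUSION, QUERY_TIMESTAMP };

/* Stride between slots, in bytes. */
static uint32_t
slot_stride_bytes(enum query_kind kind)
{
   uint32_t uint64s_per_slot = 1;   /* availability comes first */

   switch (kind) {
   case QUERY_OCCLUSION:
      uint64s_per_slot += 2;        /* begin and end */
      break;
   case QUERY_TIMESTAMP:
      uint64s_per_slot += 1;        /* the timestamp itself */
      break;
   }

   return uint64s_per_slot * (uint32_t)sizeof(uint64_t);
}

/* Byte offset of a query's slot inside the pool buffer. */
static uint64_t
slot_offset_bytes(uint32_t query, uint32_t stride)
{
   return (uint64_t)query * stride;
}
```

An occlusion pool therefore uses 24 bytes per slot and a timestamp pool
16, rather than both paying for the old fixed three-value struct.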

[Mesa-dev] [PATCH v2 06/16] anv/query: Move the available bits to the front

We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.

Reviewed-By: Lionel Landwerlin 
---
 src/intel/vulkan/anv_private.h |  2 +-
 src/intel/vulkan/genX_query.c  | 43 +-
 2 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 7682bfc..795fd24 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1985,9 +1985,9 @@ struct anv_render_pass {
 };
 
 struct anv_query_pool_slot {
+   uint64_t available;
uint64_t begin;
uint64_t end;
-   uint64_t available;
 };
 
 struct anv_query_pool {
diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index a311b4b..9338209 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -232,21 +232,12 @@ void genX(CmdResetQueryPool)(
ANV_FROM_HANDLE(anv_query_pool, pool, queryPool);
 
for (uint32_t i = 0; i < queryCount; i++) {
-  switch (pool->type) {
-  case VK_QUERY_TYPE_OCCLUSION:
-  case VK_QUERY_TYPE_TIMESTAMP: {
- anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdm) {
-sdm.Address = (struct anv_address) {
-   .bo = &pool->bo,
-   .offset = (firstQuery + i) * sizeof(struct anv_query_pool_slot) +
- offsetof(struct anv_query_pool_slot, available),
-};
-sdm.ImmediateData = 0;
- }
- break;
-  }
-  default:
- assert(!"Invalid query type");
+  anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdm) {
+ sdm.Address = (struct anv_address) {
+.bo = &pool->bo,
+.offset = (firstQuery + i) * sizeof(struct anv_query_pool_slot),
+ };
+ sdm.ImmediateData = 0;
   }
}
 }
@@ -277,7 +268,7 @@ void genX(CmdBeginQuery)(
switch (pool->type) {
case VK_QUERY_TYPE_OCCLUSION:
   emit_ps_depth_count(cmd_buffer, &pool->bo,
-  query * sizeof(struct anv_query_pool_slot));
+  query * sizeof(struct anv_query_pool_slot) + 8);
   break;
 
case VK_QUERY_TYPE_PIPELINE_STATISTICS:
@@ -297,10 +288,10 @@ void genX(CmdEndQuery)(
switch (pool->type) {
case VK_QUERY_TYPE_OCCLUSION:
   emit_ps_depth_count(cmd_buffer, &pool->bo,
-  query * sizeof(struct anv_query_pool_slot) + 8);
+  query * sizeof(struct anv_query_pool_slot) + 16);
 
   emit_query_availability(cmd_buffer, &pool->bo,
-  query * sizeof(struct anv_query_pool_slot) + 16);
+  query * sizeof(struct anv_query_pool_slot));
   break;
 
case VK_QUERY_TYPE_PIPELINE_STATISTICS:
@@ -327,11 +318,11 @@ void genX(CmdWriteTimestamp)(
case VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT:
   anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM), srm) {
  srm.RegisterAddress  = TIMESTAMP;
- srm.MemoryAddress= (struct anv_address) { &pool->bo, offset };
+ srm.MemoryAddress= (struct anv_address) { &pool->bo, offset + 8 };
   }
   anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM), srm) {
  srm.RegisterAddress  = TIMESTAMP + 4;
- srm.MemoryAddress= (struct anv_address) { &pool->bo, offset + 4 };
+ srm.MemoryAddress= (struct anv_address) { &pool->bo, offset + 12 };
   }
   break;
 
@@ -340,7 +331,7 @@ void genX(CmdWriteTimestamp)(
   anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
  pc.DestinationAddressType  = DAT_PPGTT;
  pc.PostSyncOperation   = WriteTimestamp;
- pc.Address = (struct anv_address) { &pool->bo, offset };
+ pc.Address = (struct anv_address) { &pool->bo, offset + 8 };
 
  if (GEN_GEN == 9 && cmd_buffer->device->info.gt == 4)
 pc.CommandStreamerStallEnable = true;
@@ -348,7 +339,7 @@ void genX(CmdWriteTimestamp)(
   break;
}
 
-   emit_query_availability(cmd_buffer, &pool->bo, offset + 16);
+   emit_query_availability(cmd_buffer, &pool->bo, offset);
 }
 
 #if GEN_GEN > 7 || GEN_IS_HASWELL
@@ -445,9 +436,9 @@ void genX(CmdCopyQueryPoolResults)(
   switch (pool->type) {
   case VK_QUERY_TYPE_OCCLUSION:
  emit_load_alu_reg_u64(&cmd_buffer->batch,
-   CS_GPR(0), &pool->bo, slot_offset);
+   CS_GPR(0), &pool->bo, slot_offset + 8);
  emit_load_alu_reg_u64(&cmd_buffer->batch,
-   CS_GPR(1), &pool->bo, slot_offset + 8);
+   CS_GPR(1), &pool->bo, slot_offset + 16);
 
  /* FIXME: We need to clamp the result for 32 bit. */
 
@@ -460,7 +451,7 @@ void genX(CmdCopyQueryPoolResults)(
 
   case VK_QUERY_TYPE_TIMESTAMP:
  emit_load_alu
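
Note on the patch above: a CPU-side analogy may help show why keeping
availability at the front is convenient.  The driver does the reset on
the GPU with MI_STORE_DATA_IMM; the function below is only a
hypothetical sketch that clears the same values in a host buffer.
Because availability sits at offset 0 of every slot, the loop needs the
slot stride but nothing else about the query type.

```c
#include <stdint.h>
#include <string.h>

/* Clear the availability value of queries [first, first + count).
 * pool points at the start of the pool buffer and stride is the slot
 * size in bytes.  (Hypothetical helper, not part of the driver.)
 */
static void
reset_availability(uint8_t *pool, uint32_t stride,
                   uint32_t first, uint32_t count)
{
   const uint64_t zero = 0;

   for (uint32_t i = 0; i < count; i++)
      memcpy(pool + (size_t)(first + i) * stride, &zero, sizeof(zero));
}
```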

[Mesa-dev] [PATCH v2 03/16] anv/GetQueryPoolResults: Actually implement the spec

The Vulkan spec is fairly clear about when we should and should not
write query pool results.  We're also supposed to return VK_NOT_READY if
VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries
which are not yet finished.  This fixes rendering corruptions on The
Talos Principle where geometry flickers in and out due to bogus query
results being returned by the driver.  These issues are most noticeable
on Sky Lake GT4 when running on "ultra" settings.

Reviewed-By: Lionel Landwerlin 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182
Cc: "17.0 13.0" 
---
 src/intel/vulkan/genX_query.c | 52 ++-
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 72ac2cb..b5955d3 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -139,32 +139,52 @@ VkResult genX(GetQueryPoolResults)(
MIN2(size, pool->bo.size - offset));
}
 
+   VkResult status = VK_SUCCESS;
for (uint32_t i = 0; i < queryCount; i++) {
-  switch (pool->type) {
-  case VK_QUERY_TYPE_OCCLUSION: {
- result = slot[firstQuery + i].end - slot[firstQuery + i].begin;
- break;
-  }
-  case VK_QUERY_TYPE_PIPELINE_STATISTICS:
- unreachable("pipeline stats not supported");
-  case VK_QUERY_TYPE_TIMESTAMP: {
- result = slot[firstQuery + i].begin;
- break;
-  }
-  default:
- unreachable("invalid pool type");
+  bool available = slot[firstQuery + i].available;
+
+  /* From the Vulkan 1.0.42 spec:
+   *
+   *"If VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are
+   *both not set then no result values are written to pData for
+   *queries that are in the unavailable state at the time of the call,
+   *and vkGetQueryPoolResults returns VK_NOT_READY. However,
+   *availability state is still written to pData for those queries if
+   *VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set."
+   */
+  bool write_results = available || (flags & VK_QUERY_RESULT_PARTIAL_BIT);
+
+  if (write_results) {
+ switch (pool->type) {
+ case VK_QUERY_TYPE_OCCLUSION: {
+result = slot[firstQuery + i].end - slot[firstQuery + i].begin;
+break;
+ }
+ case VK_QUERY_TYPE_PIPELINE_STATISTICS:
+unreachable("pipeline stats not supported");
+ case VK_QUERY_TYPE_TIMESTAMP: {
+result = slot[firstQuery + i].begin;
+break;
+ }
+ default:
+unreachable("invalid pool type");
+ }
+  } else {
+ status = VK_NOT_READY;
   }
 
   if (flags & VK_QUERY_RESULT_64_BIT) {
  uint64_t *dst = pData;
- dst[0] = result;
+ if (write_results)
+dst[0] = result;
  if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
 dst[1] = slot[firstQuery + i].available;
   } else {
  uint32_t *dst = pData;
  if (result > UINT32_MAX)
 result = UINT32_MAX;
- dst[0] = result;
+ if (write_results)
+dst[0] = result;
  if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
 dst[1] = slot[firstQuery + i].available;
   }
@@ -174,7 +194,7 @@ VkResult genX(GetQueryPoolResults)(
  break;
}
 
-   return VK_SUCCESS;
+   return status;
 }
 
 static void
-- 
2.5.0.400.gff86faf
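
Note on the patch above: the per-query decision can be written as a
small pure function.  This is a sketch under assumptions: the enum and
struct are hypothetical stand-ins for VK_SUCCESS / VK_NOT_READY, and
partial_bit stands for VK_QUERY_RESULT_PARTIAL_BIT being set in flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for VK_SUCCESS and VK_NOT_READY. */
enum query_status { QUERY_SUCCESS, QUERY_NOT_READY };

struct query_copy_decision {
   bool write_result;          /* should the result value be written? */
   enum query_status status;   /* what this query contributes */
};

/* Decide what to do for one query, following the quoted spec text:
 * results are written only if the query is available or partial
 * results were requested; otherwise the query reports "not ready".
 */
static struct query_copy_decision
decide_query_copy(bool available, bool partial_bit)
{
   struct query_copy_decision d;

   d.write_result = available || partial_bit;
   d.status = d.write_result ? QUERY_SUCCESS : QUERY_NOT_READY;
   return d;
}
```

In the patch the status is accumulated over the whole loop, so a single
unavailable query is enough for the call to return VK_NOT_READY, and
the availability value is still written when it was requested.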


[Mesa-dev] [PATCH v2 04/16] genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field

This is way more convenient than having two separate dword fields.

Reviewed-By: Lionel Landwerlin 
---
 src/intel/genxml/gen6.xml | 3 +--
 src/intel/genxml/gen7.xml | 3 +--
 src/intel/genxml/gen75.xml| 3 +--
 src/intel/genxml/gen8.xml | 3 +--
 src/intel/genxml/gen9.xml | 3 +--
 src/intel/vulkan/genX_query.c | 3 +--
 6 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 6a9b090..8a7eee0 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -1805,8 +1805,7 @@
 
 
 
-
-
+
   
 
   
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 7368b5b..8219d64 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2316,8 +2316,7 @@
 
 
 
-
-
+
   
 
   
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index ed82236..8e65c59 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2709,8 +2709,7 @@
 
 
 
-
-
+
   
 
   
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 32ed764..1628237 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -2980,8 +2980,7 @@
 
 
 
-
-
+
   
 
   
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index ec29d13..6849669 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3255,8 +3255,7 @@
 
 
 
-
-
+
   
 
   
diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index b5955d3..2429386 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -243,8 +243,7 @@ void genX(CmdResetQueryPool)(
 .offset = (firstQuery + i) * sizeof(struct anv_query_pool_slot) +
  offsetof(struct anv_query_pool_slot, available),
 };
-sdm.DataDWord0 = 0;
-sdm.DataDWord1 = 0;
+sdm.ImmediateData = 0;
  }
  break;
   }
-- 
2.5.0.400.gff86faf
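
Note on the patch above: the convenience is that callers set one 64-bit
value instead of splitting it by hand.  The split itself looks roughly
like the sketch below.  split_imm64 is a hypothetical helper, and the
low-dword-first ordering is an assumption based on the old DataDWord0 /
DataDWord1 field names, not something taken from the generated pack
code.

```c
#include <stdint.h>

/* Split a 64-bit immediate into the two dwords that the old
 * DataDWord0 / DataDWord1 fields carried (assumed: low dword first).
 */
static void
split_imm64(uint64_t imm, uint32_t *dw_lo, uint32_t *dw_hi)
{
   *dw_lo = (uint32_t)(imm & 0xffffffffu);
   *dw_hi = (uint32_t)(imm >> 32);
}
```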


[Mesa-dev] [PATCH v2 00/16] Queries!

This v2 of my earlier queries series fixes the bugs found by reviewers and
the CTS.  It now passes 100% of the CTS tests.  New "features" include:

 - Clip statistics are now enabled
 - VF statistics are disabled for blits and gpu memcpy operations (other
   pipeline statistics were already disabled) so pipeline statistics
   queries only include drawing from the client.
 - No longer asserting when you try to make a pipeline statistics query
   pool (I did my dev work in release mode)
 - Inherited queries are now enabled

Ilia Mirkin (1):
  anv: Implement pipeline statistics queries

Jason Ekstrand (15):
  anv/query: Fix the location of timestamp availability
  anv/query: Invalidate the correct range
  anv/GetQueryPoolResults: Actually implement the spec
  genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field
  anv/query: Let 32-bit values wrap
  anv/query: Move the available bits to the front
  anv/query: Use a variable-length slot size
  anv/query: Add a helper for writing a query pool result
  genxml: Add pipeline statistics registers on gen7+
  anv/query: Break GPU query calculation into a helper
  anv/query: Rework store_query_result
  genxml: s/Clipper Statistics Enable/Statistics Enable/
  anv/pipeline: Enable clipper statistics
  anv: Disable VF statistics for blorp and SOL memcpy
  anv: Turn on inherited queries

 src/intel/genxml/gen6.xml  |   5 +-
 src/intel/genxml/gen7.xml  |  49 -
 src/intel/genxml/gen75.xml |  49 -
 src/intel/genxml/gen8.xml  |  49 -
 src/intel/genxml/gen9.xml  |  49 -
 src/intel/vulkan/TODO  |   1 -
 src/intel/vulkan/anv_device.c  |   4 +-
 src/intel/vulkan/anv_private.h |  10 +-
 src/intel/vulkan/genX_blorp_exec.c |   5 +
 src/intel/vulkan/genX_gpu_memcpy.c |   4 +
 src/intel/vulkan/genX_pipeline.c   |  10 +
 src/intel/vulkan/genX_query.c  | 436 +
 src/intel/vulkan/genX_state.c  |   3 -
 13 files changed, 554 insertions(+), 120 deletions(-)

-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 02/16] anv/query: Invalidate the correct range

Reviewed-By: Lionel Landwerlin 
Cc: "17.0 13.0" 
---
 src/intel/vulkan/genX_query.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 4e6638a..72ac2cb 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -132,8 +132,12 @@ VkResult genX(GetQueryPoolResults)(
void *data_end = pData + dataSize;
struct anv_query_pool_slot *slot = pool->bo.map;
 
-   if (!device->info.has_llc)
-  anv_invalidate_range(slot, MIN2(queryCount * sizeof(*slot), pool->bo.size));
+   if (!device->info.has_llc) {
+  uint64_t offset = firstQuery * sizeof(*slot);
+  uint64_t size = queryCount * sizeof(*slot);
+  anv_invalidate_range(pool->bo.map + offset,
+   MIN2(size, pool->bo.size - offset));
+   }
 
for (uint32_t i = 0; i < queryCount; i++) {
   switch (pool->type) {
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 01/16] anv/query: Fix the location of timestamp availability

Reviewed-By: Lionel Landwerlin 
Cc: "17.0 13.0" 
---
 src/intel/vulkan/genX_query.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
index 830f867..4e6638a 100644
--- a/src/intel/vulkan/genX_query.c
+++ b/src/intel/vulkan/genX_query.c
@@ -327,7 +327,7 @@ void genX(CmdWriteTimestamp)(
   break;
}
 
-   emit_query_availability(cmd_buffer, &pool->bo, query + 16);
+   emit_query_availability(cmd_buffer, &pool->bo, offset + 16);
 }
 
 #if GEN_GEN > 7 || GEN_IS_HASWELL
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH 02/23] nir: Fix misspellings.

Reviewed-by: Dylan Baker 

Quoting Matt Turner (2017-03-16 14:17:59)
> ---
>  src/compiler/nir/nir.h   | 2 +-
>  src/compiler/nir/nir_from_ssa.c  | 6 +++---
>  src/compiler/nir/nir_lower_returns.c | 2 +-
>  src/compiler/nir/nir_move_vec_src_uses_to_dest.c | 4 ++--
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index f904d93..2dedb45 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -1907,7 +1907,7 @@ nir_register *nir_local_reg_create(nir_function_impl *impl);
>  
>  void nir_reg_remove(nir_register *reg);
>  
> -/** Adds a variable to the appropreate list in nir_shader */
> +/** Adds a variable to the appropriate list in nir_shader */
>  void nir_shader_add_variable(nir_shader *shader, nir_variable *var);
>  
>  static inline void
> diff --git a/src/compiler/nir/nir_from_ssa.c b/src/compiler/nir/nir_from_ssa.c
> index 27e94f8..fdfbf98 100644
> --- a/src/compiler/nir/nir_from_ssa.c
> +++ b/src/compiler/nir/nir_from_ssa.c
> @@ -63,7 +63,7 @@ ssa_def_dominates(nir_ssa_def *a, nir_ssa_def *b)
>  
>  /* The following data structure, which I have named merge_set is a way of
>   * representing a set registers of non-interfering registers.  This is
> - * based on the concept of a "dominence forest" presented in "Fast Copy
> + * based on the concept of a "dominance forest" presented in "Fast Copy
>   * Coalescing and Live-Range Identification" by Budimlic et. al. but the
>   * implementation concept is taken from  "Revisiting Out-of-SSA Translation
>   * for Correctness, Code Quality, and Efficiency" by Boissinot et. al..
> @@ -71,8 +71,8 @@ ssa_def_dominates(nir_ssa_def *a, nir_ssa_def *b)
>   * Each SSA definition is associated with a merge_node and the association
>   * is represented by a combination of a hash table and the "def" parameter
>   * in the merge_node structure.  The merge_set stores a linked list of
> - * merge_nodes in dominence order of the ssa definitions.  (Since the
> - * liveness analysis pass indexes the SSA values in dominence order for us,
> + * merge_nodes in dominance order of the ssa definitions.  (Since the
> + * liveness analysis pass indexes the SSA values in dominance order for us,
>   * this is an easy thing to keep up.)  It is assumed that no pair of the
>   * nodes in a given set interfere.  Merging two sets or checking for
>   * interference can be done in a single linear-time merge-sort walk of the
> diff --git a/src/compiler/nir/nir_lower_returns.c b/src/compiler/nir/nir_lower_returns.c
> index 33490b2..423192a 100644
> --- a/src/compiler/nir/nir_lower_returns.c
> +++ b/src/compiler/nir/nir_lower_returns.c
> @@ -113,7 +113,7 @@ lower_returns_in_if(nir_if *if_stmt, struct lower_returns_state *state)
>  * returns inside of the body of the if.  If we're in a loop, then these
>  * were lowered to breaks which automatically skip to the end of the
>  * loop so we don't have to do anything.  If we're not in a loop, then
> -* all we know is that the return flag is set appropreately and that the
> +* all we know is that the return flag is set appropriately and that the
>  * recursive calls ensured that nothing gets executed *inside* the if
>  * after a return.  In order to ensure nothing outside gets executed
>  * after a return, we need to predicate everything following on the
> diff --git a/src/compiler/nir/nir_move_vec_src_uses_to_dest.c b/src/compiler/nir/nir_move_vec_src_uses_to_dest.c
> index 5ad17b8..29ebf92 100644
> --- a/src/compiler/nir/nir_move_vec_src_uses_to_dest.c
> +++ b/src/compiler/nir/nir_move_vec_src_uses_to_dest.c
> @@ -114,10 +114,10 @@ move_vec_src_uses_to_dest_block(nir_block *block)
>  if (vec->src[j].src.ssa != vec->src[i].src.ssa)
> continue;
>  
> -/* Mark the given chanle as having been handled */
> +/* Mark the given channel as having been handled */
>  srcs_remaining &= ~(1 << j);
>  
> -/* Mark the appropreate channel as coming from src j */
> +/* Mark the appropriate channel as coming from src j */
>  swizzle[vec->src[j].swizzle[0]] = j;
>   }
>  
> -- 
> 2.10.2
> 



Re: [Mesa-dev] [PATCH 03/23] nir: Fix syntax.

Reviewed-by: Dylan Baker 

Quoting Matt Turner (2017-03-16 14:18:00)
> et is not an abbreviation.
> ---
>  src/compiler/nir/nir_from_ssa.c  | 10 +-
>  src/compiler/nir/nir_lower_vars_to_ssa.c |  2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/src/compiler/nir/nir_from_ssa.c b/src/compiler/nir/nir_from_ssa.c
> index fdfbf98..d2646c6 100644
> --- a/src/compiler/nir/nir_from_ssa.c
> +++ b/src/compiler/nir/nir_from_ssa.c
> @@ -32,7 +32,7 @@
>  /*
>   * This file implements an out-of-SSA pass as described in "Revisiting
>   * Out-of-SSA Translation for Correctness, Code Quality, and Efficiency" by
> - * Boissinot et. al.
> + * Boissinot et al.
>   */
>  
>  struct from_ssa_state {
> @@ -64,9 +64,9 @@ ssa_def_dominates(nir_ssa_def *a, nir_ssa_def *b)
>  /* The following data structure, which I have named merge_set is a way of
>   * representing a set registers of non-interfering registers.  This is
>   * based on the concept of a "dominance forest" presented in "Fast Copy
> - * Coalescing and Live-Range Identification" by Budimlic et. al. but the
> + * Coalescing and Live-Range Identification" by Budimlic et al. but the
>   * implementation concept is taken from  "Revisiting Out-of-SSA Translation
> - * for Correctness, Code Quality, and Efficiency" by Boissinot et. al..
> + * for Correctness, Code Quality, and Efficiency" by Boissinot et al.
>   *
>   * Each SSA definition is associated with a merge_node and the association
>   * is represented by a combination of a hash table and the "def" parameter
> @@ -177,7 +177,7 @@ merge_merge_sets(merge_set *a, merge_set *b)
>   *
>   * This is an implementation of Algorithm 2 in "Revisiting Out-of-SSA
>   * Translation for Correctness, Code Quality, and Efficiency" by
> - * Boissinot et. al.
> + * Boissinot et al.
>   */
>  static bool
>  merge_sets_interfere(merge_set *a, merge_set *b)
> @@ -561,7 +561,7 @@ emit_copy(nir_builder *b, nir_src src, nir_src dest_src)
>  /* Resolves a single parallel copy operation into a sequence of movs
>   *
>   * This is based on Algorithm 1 from "Revisiting Out-of-SSA Translation for
> - * Correctness, Code Quality, and Efficiency" by Boissinot et. al..
> + * Correctness, Code Quality, and Efficiency" by Boissinot et al.
>   * However, I never got the algorithm to work as written, so this version
>   * is slightly modified.
>   *
> diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c 
> b/src/compiler/nir/nir_lower_vars_to_ssa.c
> index 4ea5ea5..37a786c 100644
> --- a/src/compiler/nir/nir_lower_vars_to_ssa.c
> +++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
> @@ -475,7 +475,7 @@ lower_copies_to_load_store(struct deref_node *node,
>   *
>   * This algorithm is very similar to the one outlined in "Efficiently
>   * Computing Static Single Assignment Form and the Control Dependence
> - * Graph" by Cytron et. al.  The primary difference is that we only put one
> + * Graph" by Cytron et al.  The primary difference is that we only put one
>   * SSA def on the stack per block.
>   */
>  static bool
> -- 
> 2.10.2
> 



Re: [Mesa-dev] [RFC libdrm 0/2] Replace the build system with meson

2017-03-16 Thread Ilia Mirkin

On Thu, Mar 16, 2017 at 5:25 PM, Dylan Baker wrote:
> Why bother, and why would we want this?
>
> First, it's written in Python, which means the potential developer base
> is massive. It also provides a recursive view for humans but a
> non-recursive view for the system. This is the best of both worlds:
> humans can organize the build system in a way that makes sense, and the
> machine gets a non-recursive build system. It also uses ninja rather
> than make, and ninja is inherently faster than make. Meson also has a
> simpler syntax than autotools or cmake: it's not Turing complete by
> design, nor does it expose Python, again by design. This allows meson
> itself to be reimplemented in another language if Python becomes a
> dead end or a bottleneck. It also makes it much easier to understand
> what the build system is doing.
>
> What's different about using meson?
>
> Well, apart from faster builds and less magic in the build system? The
> configure flags are different: it uses -D<option>=<value>, more like cmake
> than the --enable or --with flags of autotools, although oddly it uses
> --prefix and friends when calling meson but not with mesonconf; there's
> a bug open on this. Meson also doesn't support in-tree builds at all;
> all builds are done out of tree. It also doesn't provide a "make dist"
> target; fortunately there's this awesome tool called git, and it
> provides a "git archive" command that does much the same thing. Did I
> mention it's fast?
>
> Here are the performance numbers I see on a 2 core 4 thread SKL, without
> initial configuration, and building out of tree (using zsh):
>
> For meson the command line is:
> time (meson build -Dmanpages=true && ninja -C build)
>
> For autotools the command line is:
> time (mdkir build && cd build && ../autotools && make -j5 -l4)

Probably mkdir...

>
> meson (cold ccache): 13.37s user 1.74s system 255% cpu  5.907 total
> autotools (cold ccache): 26.50s user 1.71s system 129% cpu 21.835 total
> meson (hot ccache):   2.13s user 0.39s system 154% cpu  1.633 total
> autotools (hot ccache):  13.93s user 0.73s system 102% cpu 14.259 total
>
> That's ~4x faster for a cold build and ~10x faster for a hot build.
>
> For a make clean && make style build with a hot cache:
> meson: 4.64s user 0.33s system 334% cpu 1.486 total
> autotools: 7.93s user 0.32s system 167% cpu 4.920 total
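
As a rough sanity check on the ratios quoted above, here is a small Python
sketch that recomputes them from the "total" wall-clock times. Only the
timings come from the message; the dictionary layout and the speedup()
helper are names made up for illustration.

```python
# Wall-clock ("total") times in seconds, copied from the figures quoted above.
timings = {
    "cold ccache": {"meson": 5.907, "autotools": 21.835},
    "hot ccache": {"meson": 1.633, "autotools": 14.259},
    "clean rebuild, hot cache": {"meson": 1.486, "autotools": 4.920},
}


def speedup(autotools_s: float, meson_s: float) -> float:
    """Return how many times faster the meson run finished (hypothetical helper)."""
    return autotools_s / meson_s


# Ratio of autotools wall-clock time to meson wall-clock time per scenario.
ratios = {
    name: speedup(t["autotools"], t["meson"]) for name, t in timings.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

By wall-clock time this gives about 3.7x for the cold build, 8.7x for the
hot build and 3.3x for the clean rebuild, so "~4x" and "~10x" are generous
roundings.
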
>
> Why bother with libdrm?
>
> It's a simple build system that could be completely (or mostly
> completely) ported in a very short time, and could serve as a tech
> demo for the advantages of using meson to garner feedback for embarking
> on a larger project, like mesa (which is what I'm planning to work on
> next).
>
> tl;dr
>
> I wrote this as practice for porting Mesa, and figured I might as well
> send it out since I wrote it.
>
> It is very likely that neither of these large patches will show up on the
> mailing list, but this is available at my github:
> https://github.com/dcbaker/libdrm wip/meson

I haven't looked at meson or your patches in detail, but autotools
supports 2 very important use-cases very well:

1. ./configure --help
2. Cross-compilation with minimal requirement from the project being built

Can you comment on how these are handled in meson?

Cheers,

  -ilia

Re: [Mesa-dev] [PATCH 01/23] nir: Stop using apostrophes to pluralize.