Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Tomasz Figa
Hi Yogesh,

On Wed, Jul 12, 2017 at 2:18 AM, Marathe, Yogesh wrote:
>> -----Original Message-----
>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf
>> Of Tomasz Figa
>> Sent: Tuesday, July 11, 2017 9:59 PM
>> To: Marathe, Yogesh
>> Cc: Gao, Shuo; Liu, Zhiquan; Kondapally, Kalyan; Chad Versace;
>> Eric Engestrom; Emil Velikov; Wu, Zhongmin; Kenneth Graunke; Rob Clark;
>> Widawsky, Benjamin; ML mesa-dev; Kristian H . Kristensen; Timothy Arceri
>> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965:
>> Queue the buffer with a sync fence for Android OS
>>
>> Hi Yogesh,
>>
>> On Wed, Jul 12, 2017 at 1:09 AM, Marathe, Yogesh
>>  wrote:
>> > Hello Figa, Few caveats on that approach
>>
>> (I'm Tomasz, by the way)
>
> My bad Tomasz.

No problem. :)

>
>> >> Queue the buffer with a sync fence for Android OS
>> >>
>> >> Now for real, sorry guys... (Seriously gmail why you do this to me.)
>> >>
>> >> -chuanbo.w...@intel.com (bounces)
>> >> +"Gao, Shuo"  (forgot to add originally, sorry)
>> >>
>> >> On Wed, Jul 12, 2017 at 12:23 AM, Tomasz Figa wrote:
>> >> > Hi Zhongmin,
>> >> >
>> >> > On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin wrote:
>> >> >> By the way,
>> >> >>
>> >> >> For cancelBuffer, sorry, I forgot about that function; thanks for
>> >> >> pointing it out. It should also pass the same fence fd as
>> >> >> queueBuffer.
>> >> >>
>> >> >> And Yogesh, you mentioned gallium; is it another platform supported
>> >> >> by mesa? I am sorry, I have no idea about this. Could you please help
>> >> >> to check this?
>> >> >>
>> >> >> I think we can work with the mesa team on an acceptable fix which
>> >> >> can meet the requirements of Android without breaking other platforms.
>> >> >
>> >> > One thing needs clarifying here. Release fences from EGL are _not_
>> >> > a requirement. It is an optional feature. Android compliance suites
>> >> > pass fully without Android sync fence support in Mesa at all.
>> >> >
>> >> > Other than that, it's been taking long enough and I agree that we
>> >> > should finally wire both acquire and release fence support in EGL
>> >> > and related drivers. Otherwise we can forget about getting good
>> >> > user experience on Android.
>> >> >
>> >> > On a technical side, the EGL change needs to take into account that
>> >> > not all drivers support fences and so it needs to have a fallback
>> >> > to old behavior for those which don't.
>> >> >
>> >> > Other than that, correct me if I'm wrong, but could we just use the
>> >> > DRI2 fence extension instead of adding some custom callbacks? I can
>> >> > see that a normal client request to create a sync fence would end
>> >> > up calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
>> >> > Could we do the same?
>> >
>> > Maybe here we need to look at the complete sequence eglCreateSyncKHR ->
>> > _eglCreateSync -> dri2_create_sync, as eglCreateSyncKHR seems to be the
>> > entry point and it's doing attrib/type checks before reaching
>> > dri2_create_sync(). Also, dri2_create_sync is static and can't be called
>> > directly, so there needs to be an entry point / interface.
>> >
>>
>> I think you misunderstood my suggestion. I didn't mean dri2_create_sync(),
>> but rather using the DRI2 fence extension directly, just as
>> dri2_create_sync() does. You can access dri2_egl_display from Android EGL
>> code and in fact it already uses other extensions like this.
>
> Sorry, I'm still searching around. To try this out, can you please specify
> which functions you meant by the DRI2 fence extension? An example within EGL
> code would help. Please note we ultimately need to call these from
> platform_android.c.

I meant using dri2_dpy->fence->create_fence_fd() directly if it is
available (i.e. dri2_dpy->fence and dri2_dpy->fence->create_fence_fd
are non-NULL), or falling back to -1 if not.
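
A minimal sketch of that fallback, assuming simplified stand-in types:
`struct fence_ext` and `struct display` below are hypothetical placeholders
for the DRI2 fence extension and dri2_egl_display, not the real Mesa
definitions, and the callback signature is reduced to a single context
argument.

```c
#include <stddef.h>

/* Hypothetical stand-in for the DRI2 fence extension vtable. */
struct fence_ext {
   /* Returns a sync fence fd, or -1 on failure. */
   int (*create_fence_fd)(void *ctx);
};

/* Hypothetical stand-in for dri2_egl_display. */
struct display {
   const struct fence_ext *fence; /* NULL if the driver lacks the extension */
};

/* Return a fence fd to pass along with the buffer, or -1 when the driver
 * cannot provide one, which keeps the old no-fence behavior. */
static int
get_queue_fence_fd(const struct display *dpy, void *ctx)
{
   if (dpy->fence && dpy->fence->create_fence_fd)
      return dpy->fence->create_fence_fd(ctx);
   return -1;
}
```

The caller would not need to know whether the driver supports fences; it
would simply hand the returned value on, -1 included.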

Best regards,
Tomasz

>
>>
>> Best regards,
>> Tomasz
>>
>> >> >
>> >> > [1]
>> >> > https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
>> >> >
>> >> > + Kristian, Chad and Dominik who have been looking into sync fence
>> >> > integration with our EGL drivers.
>> >> >
>> >> > Best regards,
>> >> > Tomasz
>> >> >
>> >> >>
>> >> >> -Original Message-
>> >> >> From: Wu, Zhongmin
>> >> >> Sent: Tuesday, July 11, 2017 8:40
>> >> >> To: 'Emil Velikov' ; Marathe, Yogesh
>> >> >> 
>> >> >> Cc: Widawsky, 

Re: [Mesa-dev] [PATCH] anv: Round u_vector element sizes to a power of two

2017-07-11 Thread Jason Ekstrand
On Tue, Jul 11, 2017 at 8:18 PM, Kenneth Graunke wrote:

> On Tuesday, July 11, 2017 7:09:32 PM PDT Jason Ekstrand wrote:
> > This fixes 32-bit builds of the driver.
> >
> > Fixes: 08413a81b93dc537fb0c34327ad162f07e8c3427
> > Cc: Mark Janes 
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/intel/vulkan/anv_batch_chain.c | 15 +--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_batch_chain.c
> b/src/intel/vulkan/anv_batch_chain.c
> > index 47fee739..c895848 100644
> > --- a/src/intel/vulkan/anv_batch_chain.c
> > +++ b/src/intel/vulkan/anv_batch_chain.c
> > @@ -672,6 +672,15 @@ anv_cmd_buffer_new_binding_table_block(struct
> anv_cmd_buffer *cmd_buffer)
> > return VK_SUCCESS;
> >  }
> >
> > +static inline uint32_t
> > +round_up_to_power_of_two(uint32_t value)
> > +{
> > +   if (value <= 1)
> > +  return value;
> > +
> > +   return 1 << (32 - __builtin_clz(value - 1));
> > +}
>
> Would be nice to have this in a src/util header, instead of
> src/gallium/auxiliary/util/u_math.h's util_next_power_of_two
> and src/mesa/main/imports.h's _mesa_next_pow_two_32, and again in anv.
>

We already pull in u_math.h (Thanks, Eric.) but I didn't find
next_power_of_two when I went looking for it.  I'll use
util_next_power_of_two instead.


> But moving all that around is annoying, and you know CLZ works here,
> and it's not that much code, so *shrug*
>
> Reviewed-by: Kenneth Graunke 
>

Thanks!


> > +
> >  VkResult
> >  anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
> >  {
> > @@ -706,9 +715,11 @@ anv_cmd_buffer_init_batch_bo_chain(struct
> anv_cmd_buffer *cmd_buffer)
> >
> > *(struct anv_batch_bo **)u_vector_add(&cmd_buffer->seen_bbos) = batch_bo;
> >
> > +   /* u_vector requires power-of-two size elements */
> > +   uint32_t pow2_state_size =
> > +  round_up_to_power_of_two(sizeof(struct anv_state));
> > success = u_vector_init(&cmd_buffer->bt_block_states,
> > -   sizeof(struct anv_state),
> > -   8 * sizeof(struct anv_state));
> > +   pow2_state_size, 8 * pow2_state_size);
> > if (!success)
> >goto fail_seen_bbos;
> >
> >
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Round u_vector element sizes to a power of two

2017-07-11 Thread Kenneth Graunke
On Tuesday, July 11, 2017 7:09:32 PM PDT Jason Ekstrand wrote:
> This fixes 32-bit builds of the driver.
> 
> Fixes: 08413a81b93dc537fb0c34327ad162f07e8c3427
> Cc: Mark Janes 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/vulkan/anv_batch_chain.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_batch_chain.c 
> b/src/intel/vulkan/anv_batch_chain.c
> index 47fee739..c895848 100644
> --- a/src/intel/vulkan/anv_batch_chain.c
> +++ b/src/intel/vulkan/anv_batch_chain.c
> @@ -672,6 +672,15 @@ anv_cmd_buffer_new_binding_table_block(struct 
> anv_cmd_buffer *cmd_buffer)
> return VK_SUCCESS;
>  }
>  
> +static inline uint32_t
> +round_up_to_power_of_two(uint32_t value)
> +{
> +   if (value <= 1)
> +  return value;
> +
> +   return 1 << (32 - __builtin_clz(value - 1));
> +}

Would be nice to have this in a src/util header, instead of
src/gallium/auxiliary/util/u_math.h's util_next_power_of_two
and src/mesa/main/imports.h's _mesa_next_pow_two_32, and again in anv.

But moving all that around is annoying, and you know CLZ works here,
and it's not that much code, so *shrug*

Reviewed-by: Kenneth Graunke 

> +
>  VkResult
>  anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
>  {
> @@ -706,9 +715,11 @@ anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer 
> *cmd_buffer)
>  
> *(struct anv_batch_bo **)u_vector_add(&cmd_buffer->seen_bbos) = batch_bo;
>
> +   /* u_vector requires power-of-two size elements */
> +   uint32_t pow2_state_size =
> +  round_up_to_power_of_two(sizeof(struct anv_state));
> success = u_vector_init(&cmd_buffer->bt_block_states,
> -   sizeof(struct anv_state),
> -   8 * sizeof(struct anv_state));
> +   pow2_state_size, 8 * pow2_state_size);
> if (!success)
>goto fail_seen_bbos;
>  
> 





Re: [Mesa-dev] [PATCH] anv: Round u_vector element sizes to a power of two

2017-07-11 Thread Jason Ekstrand
On Tue, Jul 11, 2017 at 7:09 PM, Jason Ekstrand wrote:

> This fixes 32-bit builds of the driver.
>

I just updated the commit message to be more descriptive:

This fixes 32-bit builds of the driver.  Commit 08413a81b93dc537fb0c3
changed things so that we now put struct anv_states in the u_vector for
binding tables.  On 64-bit builds, sizeof(struct anv_state) is a power
of two but it isn't on 32-bit builds.
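
To illustrate the rounding the patch relies on, here is a standalone sketch
of the same computation. It differs from the patch in using an unsigned
constant for the shift, and it assumes a GCC/Clang-style __builtin_clz and
inputs no larger than 2^31. The 12-byte and 16-byte sizes mentioned
afterwards are illustrative values, not measured sizes of struct anv_state.

```c
#include <stdint.h>

/* Round value up to the next power of two; 0 and 1 are returned unchanged.
 * Assumes value <= 2^31 so the shift stays below 32 bits. */
static inline uint32_t
round_up_to_power_of_two(uint32_t value)
{
   if (value <= 1)
      return value;

   /* __builtin_clz counts leading zero bits; it is undefined for 0,
    * which the early return above rules out. */
   return UINT32_C(1) << (32 - __builtin_clz(value - 1));
}
```

With an illustrative 12-byte element the helper yields 16, while a 16-byte
element is left as is.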



> Fixes: 08413a81b93dc537fb0c34327ad162f07e8c3427
> Cc: Mark Janes 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/vulkan/anv_batch_chain.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_batch_chain.c
> b/src/intel/vulkan/anv_batch_chain.c
> index 47fee739..c895848 100644
> --- a/src/intel/vulkan/anv_batch_chain.c
> +++ b/src/intel/vulkan/anv_batch_chain.c
> @@ -672,6 +672,15 @@ anv_cmd_buffer_new_binding_table_block(struct
> anv_cmd_buffer *cmd_buffer)
> return VK_SUCCESS;
>  }
>
> +static inline uint32_t
> +round_up_to_power_of_two(uint32_t value)
> +{
> +   if (value <= 1)
> +  return value;
> +
> +   return 1 << (32 - __builtin_clz(value - 1));
> +}
> +
>  VkResult
>  anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
>  {
> @@ -706,9 +715,11 @@ anv_cmd_buffer_init_batch_bo_chain(struct
> anv_cmd_buffer *cmd_buffer)
>
> *(struct anv_batch_bo **)u_vector_add(&cmd_buffer->seen_bbos) = batch_bo;
>
> +   /* u_vector requires power-of-two size elements */
> +   uint32_t pow2_state_size =
> +  round_up_to_power_of_two(sizeof(struct anv_state));
> success = u_vector_init(&cmd_buffer->bt_block_states,
> -   sizeof(struct anv_state),
> -   8 * sizeof(struct anv_state));
> +   pow2_state_size, 8 * pow2_state_size);
> if (!success)
>goto fail_seen_bbos;
>
> --
> 2.5.0.400.gff86faf
>
>


Re: [Mesa-dev] [PATCH 08/19] mesa/st: implement memory objects as a backend for texture storage

2017-07-11 Thread Andres Rodriguez



On 2017-07-11 02:31 PM, Marek Olšák wrote:

On Sat, Jul 1, 2017 at 1:03 AM, Andres Rodriguez  wrote:

From: Dave Airlie 

Instead of allocating memory to back a texture, use the provided memory
object.

Signed-off-by: Andres Rodriguez 
---
  src/mesa/main/mtypes.h |   2 +
  src/mesa/state_tracker/st_cb_texture.c | 123 +
  src/mesa/state_tracker/st_extensions.c |   5 ++
  3 files changed, 130 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 8dcc1a8..463f444 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4125,6 +4125,8 @@ struct gl_extensions
 GLboolean EXT_framebuffer_sRGB;
 GLboolean EXT_gpu_program_parameters;
 GLboolean EXT_gpu_shader4;
+   GLboolean EXT_memory_object;
+   GLboolean EXT_memory_object_fd;
 GLboolean EXT_packed_float;
 GLboolean EXT_pixel_buffer_object;
 GLboolean EXT_point_parameters;
diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index 9798321..063f6d6 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -53,6 +53,7 @@
  #include "state_tracker/st_cb_flush.h"
  #include "state_tracker/st_cb_texture.h"
  #include "state_tracker/st_cb_bufferobjects.h"
+#include "state_tracker/st_cb_memoryobjects.h"
  #include "state_tracker/st_format.h"
  #include "state_tracker/st_pbo.h"
  #include "state_tracker/st_texture.h"
@@ -2890,6 +2891,125 @@ st_TexParameter(struct gl_context *ctx,
 }
  }

+/**
+ * Allocate a new pipe_resource object
+ * width0, height0, depth0 are the dimensions of the level 0 image
+ * (the highest resolution).  last_level indicates how many mipmap levels
+ * to allocate storage for.  For non-mipmapped textures, this will be zero.
+ */
+static struct pipe_resource *
+st_texture_create_memory(struct st_context *st,

> st_texture_create_from_memory?

Fixing both of these. Dan reported a regression with this latest
patchset and steamvr on FramebufferTexture(). I'll resend once I get
that fixed as well.


Regards,
Andres

> Marek




[Mesa-dev] [PATCH] anv: Round u_vector element sizes to a power of two

2017-07-11 Thread Jason Ekstrand
This fixes 32-bit builds of the driver.

Fixes: 08413a81b93dc537fb0c34327ad162f07e8c3427
Cc: Mark Janes 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/vulkan/anv_batch_chain.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 47fee739..c895848 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -672,6 +672,15 @@ anv_cmd_buffer_new_binding_table_block(struct 
anv_cmd_buffer *cmd_buffer)
return VK_SUCCESS;
 }
 
+static inline uint32_t
+round_up_to_power_of_two(uint32_t value)
+{
+   if (value <= 1)
+  return value;
+
+   return 1 << (32 - __builtin_clz(value - 1));
+}
+
 VkResult
 anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
 {
@@ -706,9 +715,11 @@ anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 
*(struct anv_batch_bo **)u_vector_add(&cmd_buffer->seen_bbos) = batch_bo;
 
+   /* u_vector requires power-of-two size elements */
+   uint32_t pow2_state_size =
+  round_up_to_power_of_two(sizeof(struct anv_state));
success = u_vector_init(&cmd_buffer->bt_block_states,
-   sizeof(struct anv_state),
-   8 * sizeof(struct anv_state));
+   pow2_state_size, 8 * pow2_state_size);
if (!success)
   goto fail_seen_bbos;
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH] freedreno/ir3: fix load_front_face conversion

2017-07-11 Thread Ilia Mirkin
The comments are correct - we get -1 and 0. However by adding 1, we
convert this into 0,1. This mostly works for conditionals, but when
negated, this will yield the wrong result. Instead just negate the
values (as they are backwards -- -1 means back instead of front).

Fixes tests/shaders/glsl-fs-frontfacing-not.shader_test and
dEQP-GLES3.functional.shaders.builtin_variable.frontfacing on A530.
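
The difference can be checked with plain integer arithmetic. The sketch
below models the hardware value as a 32-bit integer (-1 for back, 0 for
front) and NIR booleans as 0 or ~0; the function names are illustrative and
not part of ir3.

```c
#include <stdint.h>

/* Old conversion: add 1, giving 0 for back and 1 for front. */
static int32_t front_face_add(int32_t hw) { return hw + 1; }

/* New conversion: bitwise NOT, giving 0 for back and ~0 for front. */
static int32_t front_face_not(int32_t hw) { return ~hw; }

/* Bitwise negation of the boolean, standing in for how a shader's
 * !gl_FrontFacing behaves in this model. */
static int32_t bool_negate(int32_t b) { return ~b; }
```

For a front-facing fragment (hw = 0), negating the old result gives ~1 = -2,
which is still non-zero and so reads as true; negating the new result gives
0, as expected.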

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index ba1c64ee37c..764aeb49f1a 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1546,14 +1546,11 @@ emit_intrinsic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
ctx->frag_face = create_input(b, 0);
ctx->frag_face->regs[0]->flags |= IR3_REG_HALF;
}
-   /* for fragface, we always get -1 or 0, but that is inverse
-* of what nir expects (where ~0 is true).  Unfortunately
-* trying to widen from half to full in add.s seems to do a
-* non-sign-extending widen (resulting in something that
-* gets interpreted as float Inf??)
+   /* for fragface, we get -1 for back and 0 for front. However this is
+* the inverse of what nir expects (where ~0 is true).
 */
dst[0] = ir3_COV(b, ctx->frag_face, TYPE_S16, TYPE_S32);
-   dst[0] = ir3_ADD_S(b, dst[0], 0, create_immed(b, 1), 0);
+   dst[0] = ir3_NOT_B(b, dst[0], 0);
break;
case nir_intrinsic_load_local_invocation_id:
if (!ctx->local_invocation_id) {
-- 
2.13.0



Re: [Mesa-dev] [PATCH 09/20] nir: Add system values from ARB_shader_ballot

2017-07-11 Thread Connor Abbott
On Tue, Jul 11, 2017 at 6:02 PM, Matt Turner  wrote:
> On Mon, Jul 10, 2017 at 4:05 PM, Connor Abbott  wrote:
>> On Mon, Jul 10, 2017 at 3:50 PM, Matt Turner  wrote:
>>> On Mon, Jul 10, 2017 at 1:10 PM, Connor Abbott  wrote:
 On Thu, Jul 6, 2017 at 4:48 PM, Matt Turner  wrote:
> We already had a channel_num system value, which I'm renaming to
> subgroup_invocation to match the rest of the new system values.
>
> Note that while ballotARB(true) will return zeros in the high 32-bits on
> systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
> variables do not consider whether channels are enabled. See issue (1) of
> ARB_shader_ballot.
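
For reference, the five masks can be computed with ordinary 64-bit shifts.
This sketch mirrors the shift expressions in the patch's lowering code; the
struct and function names are illustrative only, and the invocation index is
assumed to be below 64.

```c
#include <stdint.h>

struct subgroup_masks {
   uint64_t eq, ge, gt, le, lt;
};

/* Compute the masks for invocation index n (0 <= n < 64). The masks do
 * not consider which channels are enabled. */
static struct subgroup_masks
subgroup_masks_for(unsigned n)
{
   struct subgroup_masks m;
   m.eq = UINT64_C(1) << n;   /* only bit n */
   m.ge = ~UINT64_C(0) << n;  /* bits n and above */
   m.gt = ~UINT64_C(1) << n;  /* bits above n */
   m.le = ~m.gt;              /* bits n and below */
   m.lt = ~m.ge;              /* bits below n */
   return m;
}
```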
> ---
>  src/compiler/nir/nir.c |  4 
>  src/compiler/nir/nir_intrinsics.h  |  8 +++-
>  src/compiler/nir/nir_lower_system_values.c | 28 
> 
>  src/intel/compiler/brw_fs_nir.cpp  |  2 +-
>  src/intel/compiler/brw_nir_intrinsics.c|  4 ++--
>  5 files changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index 491b908396..9827e129ca 100644
> --- a/src/compiler/nir/nir.c
> +++ b/src/compiler/nir/nir.c
> @@ -1908,6 +1908,10 @@ nir_intrinsic_from_system_value(gl_system_value 
> val)
>return nir_intrinsic_load_helper_invocation;
> case SYSTEM_VALUE_VIEW_INDEX:
>return nir_intrinsic_load_view_index;
> +   case SYSTEM_VALUE_SUBGROUP_SIZE:
> +  return nir_intrinsic_load_subgroup_size;
> +   case SYSTEM_VALUE_SUBGROUP_INVOCATION:
> +  return nir_intrinsic_load_subgroup_invocation;
> default:
>unreachable("system value does not directly correspond to 
> intrinsic");
> }
> diff --git a/src/compiler/nir/nir_intrinsics.h 
> b/src/compiler/nir/nir_intrinsics.h
> index 6c6ba4cf59..96ecfbc338 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -344,10 +344,16 @@ SYSTEM_VALUE(work_group_id, 3, 0, xx, xx, xx)
>  SYSTEM_VALUE(user_clip_plane, 4, 1, UCP_ID, xx, xx)
>  SYSTEM_VALUE(num_work_groups, 3, 0, xx, xx, xx)
>  SYSTEM_VALUE(helper_invocation, 1, 0, xx, xx, xx)
> -SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_size, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_le_mask, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(subgroup_lt_mask, 1, 0, xx, xx, xx)
>
>  /* Blend constant color values.  Float values are clamped. */
>  SYSTEM_VALUE(blend_const_color_r_float, 1, 0, xx, xx, xx)
> diff --git a/src/compiler/nir/nir_lower_system_values.c 
> b/src/compiler/nir/nir_lower_system_values.c
> index 810100a081..faf0c3c9da 100644
> --- a/src/compiler/nir/nir_lower_system_values.c
> +++ b/src/compiler/nir/nir_lower_system_values.c
> @@ -116,6 +116,34 @@ convert_block(nir_block *block, nir_builder *b)
> nir_load_base_instance(b));
>   break;
>
> +  case SYSTEM_VALUE_SUBGROUP_EQ_MASK:
> +  case SYSTEM_VALUE_SUBGROUP_GE_MASK:
> +  case SYSTEM_VALUE_SUBGROUP_GT_MASK:
> +  case SYSTEM_VALUE_SUBGROUP_LE_MASK:
> +  case SYSTEM_VALUE_SUBGROUP_LT_MASK: {
> + nir_ssa_def *count = nir_load_subgroup_invocation(b);
> +
> + switch (var->data.location) {
> + case SYSTEM_VALUE_SUBGROUP_EQ_MASK:
> +sysval = nir_ishl(b, nir_imm_int64(b, 1ull), count);
> +break;
> + case SYSTEM_VALUE_SUBGROUP_GE_MASK:
> +sysval = nir_ishl(b, nir_imm_int64(b, ~0ull), count);
> +break;
> + case SYSTEM_VALUE_SUBGROUP_GT_MASK:
> +sysval = nir_ishl(b, nir_imm_int64(b, ~1ull), count);
> +break;
> + case SYSTEM_VALUE_SUBGROUP_LE_MASK:
> +sysval = nir_inot(b, nir_ishl(b, nir_imm_int64(b, ~1ull), 
> count));
> +break;
> + case SYSTEM_VALUE_SUBGROUP_LT_MASK:
> +sysval = nir_inot(b, nir_ishl(b, nir_imm_int64(b, ~0ull), 
> count));
> +break;
> + default:
> +unreachable("you seriously can't tell this is unreachable?");
> + }
> +  }
> +

 While this is fine to do for both Intel and AMD, Nvidia actually has
 special system values for 

[Mesa-dev] [PATCH 3/3] swr: Add path to draw directly from client memory without copy.

2017-07-11 Thread Bruce Cherniak
If the size of a client memory copy reaches the threshold, don't copy. The
draw accesses the user buffer directly and then blocks until it finishes.
This is faster and more efficient than queuing many large client draws.

Applications that use large draws from client arrays benefit from this.
VMD is an example.

The threshold for this path defaults to 32KB.  This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.
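
The threshold logic amounts to the following sketch. It uses standard
getenv/strtol rather than Mesa's debug_get_num_option, and the helper names
read_client_copy_limit and should_draw_in_place are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_CLIENT_COPY_LIMIT 32768 /* 32KB default from the patch */

/* Read SWR_CLIENT_COPY_LIMIT, keeping the default when the variable is
 * unset, unparsable, or not positive. */
static uint32_t
read_client_copy_limit(void)
{
   const char *env = getenv("SWR_CLIENT_COPY_LIMIT");
   if (env) {
      char *end;
      long value = strtol(env, &end, 10);
      if (end != env && value > 0 && value <= UINT32_MAX)
         return (uint32_t)value;
   }
   return DEFAULT_CLIENT_COPY_LIMIT;
}

/* Draws at or above the limit skip the scratch copy and block instead. */
static int
should_draw_in_place(uint32_t size, uint32_t limit)
{
   return size >= limit;
}
```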
---
 src/gallium/drivers/swr/swr_context.h  |  1 +
 src/gallium/drivers/swr/swr_draw.cpp   |  9 +
 src/gallium/drivers/swr/swr_screen.cpp | 10 +
 src/gallium/drivers/swr/swr_screen.h   |  2 ++
 src/gallium/drivers/swr/swr_state.cpp  | 37 --
 5 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_context.h 
b/src/gallium/drivers/swr/swr_context.h
index 3ff4bf3e2f..ab3057af96 100644
--- a/src/gallium/drivers/swr/swr_context.h
+++ b/src/gallium/drivers/swr/swr_context.h
@@ -51,6 +51,7 @@
 #define SWR_NEW_FRAMEBUFFER (1 << 15)
 #define SWR_NEW_CLIP (1 << 16)
 #define SWR_NEW_SO (1 << 17)
+#define SWR_LARGE_CLIENT_DRAW (1<<18) // Indicates client draw will block
 
 namespace std
 {
diff --git a/src/gallium/drivers/swr/swr_draw.cpp 
b/src/gallium/drivers/swr/swr_draw.cpp
index f26b8e873c..cbd1558624 100644
--- a/src/gallium/drivers/swr/swr_draw.cpp
+++ b/src/gallium/drivers/swr/swr_draw.cpp
@@ -188,6 +188,15 @@ swr_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
info->instance_count,
info->start,
info->start_instance);
+
+   /* On large client-buffer draw, we used client buffer directly, without
+* copy.  Block until draw is finished.
+* VMD is an example application that benefits from this. */
+   if (ctx->dirty & SWR_LARGE_CLIENT_DRAW) {
+  struct swr_screen *screen = swr_screen(pipe->screen);
+  swr_fence_submit(ctx, screen->flush_fence);
+  swr_fence_finish(pipe->screen, NULL, screen->flush_fence, 0);
+   }
 }
 
 
diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 9b3897ce6b..8be09697e6 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -1066,6 +1066,16 @@ swr_destroy_screen(struct pipe_screen *p_screen)
 static void
 swr_validate_env_options(struct swr_screen *screen)
 {
+   /* The client_copy_limit sets a maximum on the amount of user-buffer memory
+* copied to scratch space on a draw.  Past this, the draw will access
+* user-buffer directly and then block.  This is faster than queuing many
+* large client draws. */
+   screen->client_copy_limit = 32768;
+   int client_copy_limit =
+  debug_get_num_option("SWR_CLIENT_COPY_LIMIT", 32768);
+   if (client_copy_limit > 0)
+  screen->client_copy_limit = client_copy_limit;
+
/* XXX msaa under development, disable by default for now */
screen->msaa_max_count = 0; /* was SWR_MAX_NUM_MULTISAMPLES; */
 
diff --git a/src/gallium/drivers/swr/swr_screen.h 
b/src/gallium/drivers/swr/swr_screen.h
index dc1bb47f02..6d6d1cb87d 100644
--- a/src/gallium/drivers/swr/swr_screen.h
+++ b/src/gallium/drivers/swr/swr_screen.h
@@ -43,8 +43,10 @@ struct swr_screen {
 
struct sw_winsys *winsys;
 
+   /* Configurable environment settings */
boolean msaa_force_enable;
uint8_t msaa_max_count;
+   uint32_t client_copy_limit;
 
HANDLE hJitMgr;
 };
diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index 45c9c213e5..6c406a37ec 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1267,12 +1267,20 @@ swr_update_derived(struct pipe_context *pipe,
 partial_inbounds = 0;
 min_vertex_index = info.min_index;
 
-/* Copy only needed vertices to scratch space */
 size = AlignUp(size, 4);
-const void *ptr = (const uint8_t *) vb->buffer.user + base;
-ptr = (uint8_t *)swr_copy_to_scratch_space(
-   ctx, &ctx->scratch->vertex_buffer, ptr, size);
-p_data = (const uint8_t *)ptr - base;
+/* If size of client memory copy is too large, don't copy. The
+ * draw will access user-buffer directly and then block.  This is
+ * faster than queuing many large client draws. */
+if (size >= screen->client_copy_limit) {
+   post_update_dirty_flags |= SWR_LARGE_CLIENT_DRAW;
+   p_data = (const uint8_t *) vb->buffer.user;
+} else {
+   /* Copy only needed vertices to scratch space */
+   const void *ptr = (const uint8_t *) vb->buffer.user + base;
+   ptr = (uint8_t *)swr_copy_to_scratch_space(
+ ctx, &ctx->scratch->vertex_buffer, ptr, size);
+   p_data = (const uint8_t *)ptr - base;
+}
  }
 
  swrVertexBuffers[i] = {0};
@@ -1311,12 

[Mesa-dev] [PATCH 0/3] swr: Optimize large draws from client arrays.

2017-07-11 Thread Bruce Cherniak
If the size of a client memory copy reaches the threshold, don't copy. The
draw accesses the user buffer directly and then blocks until it finishes.
This is faster and more efficient than queuing many large client draws.

Applications that use large draws from client arrays benefit from this.
VMD is an example.

The threshold for this path defaults to 32KB.  This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.

Bruce Cherniak (3):
  swr: Remove hard-coded constant and "todo" comment.
  swr: Move environment config options into separate function.
  swr: Add path to draw directly from client memory without copy.

 src/gallium/drivers/swr/swr_context.h   |  1 +
 src/gallium/drivers/swr/swr_draw.cpp|  9 +
 src/gallium/drivers/swr/swr_scratch.cpp |  3 +-
 src/gallium/drivers/swr/swr_screen.cpp  | 70 +
 src/gallium/drivers/swr/swr_screen.h|  2 +
 src/gallium/drivers/swr/swr_state.cpp   | 37 +++--
 6 files changed, 84 insertions(+), 38 deletions(-)

-- 
2.11.0



[Mesa-dev] [PATCH 2/3] swr: Move environment config options into separate function.

2017-07-11 Thread Bruce Cherniak
Moved reading of environment config options out of
swr_create_screen_internal, into a separate swr_validate_env_options.
This is to keep from cluttering create_screen.
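
The validation that moves into the new function boils down to a range and
power-of-two check. A standalone sketch, with MAX_SAMPLES as a placeholder
for SWR_MAX_NUM_MULTISAMPLES (its value here is an assumption) and a local
power-of-two test standing in for util_is_power_of_two:

```c
#define MAX_SAMPLES 16 /* placeholder; assumed value */

/* True when v is positive and has exactly one bit set. */
static int
is_power_of_two(int v)
{
   return v > 0 && (v & (v - 1)) == 0;
}

/* Return the requested sample count if it is a power of two in
 * [1, MAX_SAMPLES]; otherwise return 0, which disables msaa. */
static int
validate_msaa_max_count(int requested)
{
   if (requested < 0 || requested > MAX_SAMPLES ||
       !is_power_of_two(requested))
      return 0;
   return requested;
}
```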
---
 src/gallium/drivers/swr/swr_screen.cpp | 60 +++---
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 53b5dadec9..9b3897ce6b 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -1062,6 +1062,39 @@ swr_destroy_screen(struct pipe_screen *p_screen)
FREE(screen);
 }
 
+
+static void
+swr_validate_env_options(struct swr_screen *screen)
+{
+   /* XXX msaa under development, disable by default for now */
+   screen->msaa_max_count = 0; /* was SWR_MAX_NUM_MULTISAMPLES; */
+
+   /* validate env override values, within range and power of 2 */
+   int msaa_max_count = debug_get_num_option("SWR_MSAA_MAX_COUNT", 0);
+   if (msaa_max_count) {
+  if ((msaa_max_count < 0) || (msaa_max_count > SWR_MAX_NUM_MULTISAMPLES)
+|| !util_is_power_of_two(msaa_max_count)) {
+ fprintf(stderr, "SWR_MSAA_MAX_COUNT invalid: %d\n", msaa_max_count);
+ fprintf(stderr, "must be power of 2 between 1 and %d" \
+ " (or 0 to disable msaa)\n",
+   SWR_MAX_NUM_MULTISAMPLES);
+ msaa_max_count = 0;
+  }
+
+  fprintf(stderr, "SWR_MSAA_MAX_COUNT: %d\n", msaa_max_count);
+  if (!msaa_max_count)
+ fprintf(stderr, "(msaa disabled)\n");
+
+  screen->msaa_max_count = msaa_max_count;
+   }
+
+   screen->msaa_force_enable = debug_get_bool_option(
+ "SWR_MSAA_FORCE_ENABLE", false);
+   if (screen->msaa_force_enable)
+  fprintf(stderr, "SWR_MSAA_FORCE_ENABLE: true\n");
+}
+
+
 PUBLIC
 struct pipe_screen *
 swr_create_screen_internal(struct sw_winsys *winsys)
@@ -1099,32 +1132,7 @@ swr_create_screen_internal(struct sw_winsys *winsys)
 
util_format_s3tc_init();
 
-   /* XXX msaa under development, disable by default for now */
-   screen->msaa_max_count = 0; /* was SWR_MAX_NUM_MULTISAMPLES; */
-
-   /* validate env override values, within range and power of 2 */
-   int msaa_max_count = debug_get_num_option("SWR_MSAA_MAX_COUNT", 0);
-   if (msaa_max_count) {
-  if ((msaa_max_count < 0) || (msaa_max_count > SWR_MAX_NUM_MULTISAMPLES)
-|| !util_is_power_of_two(msaa_max_count)) {
- fprintf(stderr, "SWR_MSAA_MAX_COUNT invalid: %d\n", msaa_max_count);
- fprintf(stderr, "must be power of 2 between 1 and %d" \
- " (or 0 to disable msaa)\n",
-   SWR_MAX_NUM_MULTISAMPLES);
- msaa_max_count = 0;
-  }
-
-  fprintf(stderr, "SWR_MSAA_MAX_COUNT: %d\n", msaa_max_count);
-  if (!msaa_max_count)
- fprintf(stderr, "(msaa disabled)\n");
-
-  screen->msaa_max_count = msaa_max_count;
-   }
-
-   screen->msaa_force_enable = debug_get_bool_option(
- "SWR_MSAA_FORCE_ENABLE", false);
-   if (screen->msaa_force_enable)
-  fprintf(stderr, "SWR_MSAA_FORCE_ENABLE: true\n");
+   swr_validate_env_options(screen);
 
    return &screen->base;
 }
-- 
2.11.0



[Mesa-dev] [PATCH 1/3] swr: Remove hard-coded constant and "todo" comment.

2017-07-11 Thread Bruce Cherniak
Removed the hard-coded constant in favor of a #define.  Also removed the
TODO comment; the constant value doesn't need an environment-configurable
option.
---
 src/gallium/drivers/swr/swr_scratch.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_scratch.cpp 
b/src/gallium/drivers/swr/swr_scratch.cpp
index db095dea7e..ea49bbefba 100644
--- a/src/gallium/drivers/swr/swr_scratch.cpp
+++ b/src/gallium/drivers/swr/swr_scratch.cpp
@@ -28,6 +28,7 @@
 #include "swr_fence_work.h"
 #include "api.h"
 
+#define SCRATCH_SINGLE_ALLOCATION_LIMIT 2048
 
 void *
 swr_copy_to_scratch_space(struct swr_context *ctx,
@@ -39,7 +40,7 @@ swr_copy_to_scratch_space(struct swr_context *ctx,
assert(space);
assert(size);
 
-   if (size >= 2048) { /* XXX TODO create KNOB_ for this */
+   if (size >= SCRATCH_SINGLE_ALLOCATION_LIMIT) {
   /* Use per draw SwrAllocDrawContextMemory for larger copies */
   ptr = SwrAllocDrawContextMemory(ctx->swrContext, size, 4);
} else {
-- 
2.11.0



Re: [Mesa-dev] [PATCH 09/20] nir: Add system values from ARB_shader_ballot

2017-07-11 Thread Matt Turner
On Mon, Jul 10, 2017 at 4:05 PM, Connor Abbott  wrote:
> On Mon, Jul 10, 2017 at 3:50 PM, Matt Turner  wrote:
>> On Mon, Jul 10, 2017 at 1:10 PM, Connor Abbott  wrote:
>>> On Thu, Jul 6, 2017 at 4:48 PM, Matt Turner  wrote:
 We already had a channel_num system value, which I'm renaming to
 subgroup_invocation to match the rest of the new system values.

 Note that while ballotARB(true) will return zeros in the high 32-bits on
 systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
 variables do not consider whether channels are enabled. See issue (1) of
 ARB_shader_ballot.
 ---
  src/compiler/nir/nir.c |  4 
  src/compiler/nir/nir_intrinsics.h  |  8 +++-
  src/compiler/nir/nir_lower_system_values.c | 28 
 
  src/intel/compiler/brw_fs_nir.cpp  |  2 +-
  src/intel/compiler/brw_nir_intrinsics.c|  4 ++--
  5 files changed, 42 insertions(+), 4 deletions(-)

 diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
 index 491b908396..9827e129ca 100644
 --- a/src/compiler/nir/nir.c
 +++ b/src/compiler/nir/nir.c
 @@ -1908,6 +1908,10 @@ nir_intrinsic_from_system_value(gl_system_value val)
return nir_intrinsic_load_helper_invocation;
 case SYSTEM_VALUE_VIEW_INDEX:
return nir_intrinsic_load_view_index;
 +   case SYSTEM_VALUE_SUBGROUP_SIZE:
 +  return nir_intrinsic_load_subgroup_size;
 +   case SYSTEM_VALUE_SUBGROUP_INVOCATION:
 +  return nir_intrinsic_load_subgroup_invocation;
 default:
unreachable("system value does not directly correspond to 
 intrinsic");
 }
 diff --git a/src/compiler/nir/nir_intrinsics.h 
 b/src/compiler/nir/nir_intrinsics.h
 index 6c6ba4cf59..96ecfbc338 100644
 --- a/src/compiler/nir/nir_intrinsics.h
 +++ b/src/compiler/nir/nir_intrinsics.h
 @@ -344,10 +344,16 @@ SYSTEM_VALUE(work_group_id, 3, 0, xx, xx, xx)
  SYSTEM_VALUE(user_clip_plane, 4, 1, UCP_ID, xx, xx)
  SYSTEM_VALUE(num_work_groups, 3, 0, xx, xx, xx)
  SYSTEM_VALUE(helper_invocation, 1, 0, xx, xx, xx)
 -SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_size, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_le_mask, 1, 0, xx, xx, xx)
 +SYSTEM_VALUE(subgroup_lt_mask, 1, 0, xx, xx, xx)

  /* Blend constant color values.  Float values are clamped. */
  SYSTEM_VALUE(blend_const_color_r_float, 1, 0, xx, xx, xx)
 diff --git a/src/compiler/nir/nir_lower_system_values.c 
 b/src/compiler/nir/nir_lower_system_values.c
 index 810100a081..faf0c3c9da 100644
 --- a/src/compiler/nir/nir_lower_system_values.c
 +++ b/src/compiler/nir/nir_lower_system_values.c
 @@ -116,6 +116,34 @@ convert_block(nir_block *block, nir_builder *b)
 nir_load_base_instance(b));
   break;

 +  case SYSTEM_VALUE_SUBGROUP_EQ_MASK:
 +  case SYSTEM_VALUE_SUBGROUP_GE_MASK:
 +  case SYSTEM_VALUE_SUBGROUP_GT_MASK:
 +  case SYSTEM_VALUE_SUBGROUP_LE_MASK:
 +  case SYSTEM_VALUE_SUBGROUP_LT_MASK: {
 + nir_ssa_def *count = nir_load_subgroup_invocation(b);
 +
 + switch (var->data.location) {
 + case SYSTEM_VALUE_SUBGROUP_EQ_MASK:
 +sysval = nir_ishl(b, nir_imm_int64(b, 1ull), count);
 +break;
 + case SYSTEM_VALUE_SUBGROUP_GE_MASK:
 +sysval = nir_ishl(b, nir_imm_int64(b, ~0ull), count);
 +break;
 + case SYSTEM_VALUE_SUBGROUP_GT_MASK:
 +sysval = nir_ishl(b, nir_imm_int64(b, ~1ull), count);
 +break;
 + case SYSTEM_VALUE_SUBGROUP_LE_MASK:
 +sysval = nir_inot(b, nir_ishl(b, nir_imm_int64(b, ~1ull), 
 count));
 +break;
 + case SYSTEM_VALUE_SUBGROUP_LT_MASK:
 +sysval = nir_inot(b, nir_ishl(b, nir_imm_int64(b, ~0ull), 
 count));
 +break;
 + default:
 +unreachable("you seriously can't tell this is unreachable?");
 + }
 +  }
 +
>>>
>>> While this fine to do for both Intel and AMD, Nvidia actually has
>>> special system values for these, and AMD has special instructions for
>>> bitCount(foo & gl_SubGroupLtMask), so I think we should have actual
>>
>> So, just add this to the above switch statement?
>>
>>if 
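
The shift-based lowering in the quoted patch can be checked on the CPU with plain 64-bit integers. This is only a sketch: the helper names are made up for illustration and are not NIR API, and it assumes the invocation index is below 64, since a larger shift count is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU-side mirrors of the five mask computations in the
 * quoted lowering code. "invocation" is the subgroup invocation index. */
static inline uint64_t subgroup_eq_mask(unsigned invocation)
{
   assert(invocation < 64); /* shifting by >= 64 would be undefined */
   return UINT64_C(1) << invocation;
}

static inline uint64_t subgroup_ge_mask(unsigned invocation)
{
   assert(invocation < 64);
   return ~UINT64_C(0) << invocation;
}

static inline uint64_t subgroup_gt_mask(unsigned invocation)
{
   assert(invocation < 64);
   return ~UINT64_C(1) << invocation;
}

static inline uint64_t subgroup_le_mask(unsigned invocation)
{
   /* Complement of "greater than". */
   return ~subgroup_gt_mask(invocation);
}

static inline uint64_t subgroup_lt_mask(unsigned invocation)
{
   /* Complement of "greater than or equal". */
   return ~subgroup_ge_mask(invocation);
}
```

For invocation 3 this gives eq = 0x8, lt = 0x7 and le = 0xF, while ge and gt keep every higher bit set, including the high 32 bits, which matches the commit message's note that the masks do not consider whether channels are enabled.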

Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Wu, Zhongmin
Thanks very much to Emil and Tomasz for the suggestions.

The out fence needed for queueBuffer comes from flushing the batch buffer.

Currently we may flush the batch buffer in the following places:

1. Create sync ===> flush the batch buffer and get an out fence from the driver
2. glFlush()   ===> flush the batch buffer (no out fence requested from the driver)
3. swapBuffers ===> flush the batch buffer (no out fence requested from the driver;
   the buffer is queued with fd -1)
...


So, in my opinion, we don't need help from other paths to get the out fence in
swapBuffers. We just need to save the out fence from the most recent batch
buffer flush and pass it to Android with queueBuffer on each swapBuffers.

(For example, if glFlush() is called just before swapBuffers, the flush in
swapBuffers returns immediately because the batch buffer is already empty. In
that case we can use the out fence obtained in glFlush(), whereas the create
sync API may still not give us a fence.)

But you are right, the approach in the patch needs to change; the patch is
ill-considered. As Emil said, cancelBuffer is missed.
We can't save the latest fence fd in the context object, because the context
has already been reset by the time cancelBuffer is called. It is also not good
to add the callback to __DRI2flushExtensionRec. We do need a better design for
this.
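
To make the "remember the last out fence" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the struct and function names are not Mesa or Android API, and it assumes the cache lives somewhere that outlives the context (for example per surface), since the context is already reset when cancelBuffer runs.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical holder for the most recent out-fence fd. */
struct fence_cache {
   int last_fence_fd; /* -1 means "no fence available" */
};

static void fence_cache_init(struct fence_cache *c)
{
   assert(c != NULL);
   c->last_fence_fd = -1;
}

/* Called after a batch flush that produced a fence. Takes ownership of fd
 * and closes the previously cached fence, if any. */
static void fence_cache_update(struct fence_cache *c, int fd)
{
   assert(c != NULL);
   if (c->last_fence_fd >= 0)
      close(c->last_fence_fd);
   c->last_fence_fd = fd;
}

/* Hands the cached fd to the caller (for queueBuffer or cancelBuffer) and
 * empties the cache. Returns -1 when nothing is cached, which matches the
 * current "queue with fd -1" behaviour. */
static int fence_cache_take(struct fence_cache *c)
{
   assert(c != NULL);
   int fd = c->last_fence_fd;
   c->last_fence_fd = -1;
   return fd;
}
```

The ownership rule is the point of the sketch: exactly one party owns the fd at any time, so neither the queueBuffer path nor the cancelBuffer path can leak it or close it twice.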

-Original Message-
From: Emil Velikov [mailto:emil.l.veli...@gmail.com] 
Sent: Wednesday, July 12, 2017 1:41 
To: Tomasz Figa 
Cc: Wu, Zhongmin ; Marathe, Yogesh 
; Widawsky, Benjamin ; 
Liu, Zhiquan ; Eric Engestrom ; Rob 
Clark ; Kenneth Graunke ; 
Kondapally, Kalyan ; ML mesa-dev 
; Timothy Arceri ; 
Kristian H . Kristensen ; Chad Versace 
; db...@chromium.org
Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
Queue the buffer with a sync fence for Android OS

On 11 July 2017 at 16:23, Tomasz Figa  wrote:
> Hi Zhongmin,
>
> On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin  wrote:
>> By the way,
>>
>> For cancelBuffer, sorry I forget such function, thanks for notice. It should 
>> also pass the same fence fd as the queuebuffer.
>>
>> And Yogesh, you mentioned the gallium,   is it another platform supported by 
>> mesa ?  I am sorry I have no idea about this,  could you please help to 
>> check this ?
>>
>> I think we can co-work with mesa team to work out an acceptable fix which 
>> can meet the requirement of Android without any break on other platforms.
>
> One thing needs clarifying here. Release fences from EGL are _not_ a 
> requirement. It is an optional feature. Android compliance suites pass 
> fully without Android sync fence support in Mesa at all.
>
> Other than that, it's been taking long enough and I agree that we 
> should finally wire both acquire and release fence support in EGL and 
> related drivers. Otherwise we can forget about getting good user 
> experience on Android.
>
Right, I'm not trying to say otherwise.

The strange part, IMHO, is that now flatland has a hard requirement on both 
fences, where the [developer-side of the] documentation does not say anything 
about this.
This sounds a bit backwards. I believe documentation update is in order?

FWIW I was under the impression that EGL_ANDROID_native_fence_sync can be used 
in flatland. Although as Rob mentioned... not sure if the extension is 
available since the EGL meta seems to block/strip it out.


> On a technical side, the EGL change needs to take into account that 
> not all drivers support fences and so it needs to have a fallback to 
> old behavior for those which don't.
>

> Other than that, correct me if I'm wrong, but could we just use the
> DRI2 fence extension instead of adding some custom callbacks? I can 
> see that a normal client request to create a sync fence would end up 
> calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
> Could we do the same?
>
Reusing existing API would be ideal.

If not, Zhongmin/Yogesh please note:
 - when extending the interface, the version number must be bumped
 - user should check the version and the function pointer prior to use, falling 
back to the old scheme
 - get_retrive_fd [barring the typo - retrieve], should have at least the fd 
ownership documented
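
A minimal sketch of the version and function-pointer check described in the list above. The extension struct, its field names and the version number are all invented for illustration and do not correspond to an existing DRI interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extension table. The hook was (hypothetically) added in
 * version 2. Ownership: the caller owns the returned fd and must close it. */
struct example_fence_extension {
   int version;
   int (*get_fence_fd)(void *ctx);
};

#define EXAMPLE_FENCE_FD_MIN_VERSION 2

/* Stand-in driver hook so the sketch can be exercised without a driver. */
static int example_stub_fence_fd(void *ctx)
{
   (void)ctx;
   return 42;
}

/* Returns a fence fd, or -1 to tell the caller to use the old scheme. */
static int example_get_out_fence(const struct example_fence_extension *ext,
                                 void *ctx)
{
   if (ext == NULL ||
       ext->version < EXAMPLE_FENCE_FD_MIN_VERSION ||
       ext->get_fence_fd == NULL)
      return -1;

   int fd = ext->get_fence_fd(ctx);
   assert(fd >= -1); /* contract: a valid fd, or -1 for "no fence" */
   return fd;
}
```

Checking both the version and the pointer matters: a loader may be built against a newer header than the driver it loads, so neither check alone is sufficient.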

Thanks
Emil


Re: [Mesa-dev] [PATCH v3 16/16] anv: Predicate fast-clear resolves

2017-07-11 Thread Jason Ekstrand
On Tue, Jul 11, 2017 at 3:34 PM, Nanley Chery  wrote:

> On Mon, Jul 10, 2017 at 04:00:23PM -0700, Jason Ekstrand wrote:
> > On Wed, Jun 28, 2017 at 2:14 PM, Nanley Chery 
> wrote:
> >
> > > Image layouts only let us know that an image *may* be fast-cleared. For
> > > this reason we can end up with redundant resolves. Testing has shown
> > > that such resolves can measurably hurt performance and that predicating
> > > them can avoid the penalty.
> > >
> > > Signed-off-by: Nanley Chery 
> > > ---
> > >  src/intel/vulkan/anv_blorp.c   |  3 +-
> > >  src/intel/vulkan/anv_private.h | 13 --
> > >  src/intel/vulkan/genX_cmd_buffer.c | 87
> ++
> > > ++--
> > >  3 files changed, 95 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_blorp.c
> b/src/intel/vulkan/anv_blorp.c
> > > index 35317ba6be..d06d7e2cc3 100644
> > > --- a/src/intel/vulkan/anv_blorp.c
> > > +++ b/src/intel/vulkan/anv_blorp.c
> > > @@ -1619,7 +1619,8 @@ anv_ccs_resolve(struct anv_cmd_buffer * const
> > > cmd_buffer,
> > >return;
> > >
> > > struct blorp_batch batch;
> > > -   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
> > > +   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
> > > +BLORP_BATCH_PREDICATE_ENABLE);
> > >
> > > struct blorp_surf surf;
> > > get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT,
> > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > > private.h
> > > index be1623f3c3..951cf50842 100644
> > > --- a/src/intel/vulkan/anv_private.h
> > > +++ b/src/intel/vulkan/anv_private.h
> > > @@ -2118,11 +2118,16 @@ anv_fast_clear_state_entry_size(const struct
> > > anv_device *device)
> > >  {
> > > assert(device);
> > > /* Entry contents:
> > > -*   +--+
> > > -*   | clear value dword(s) |
> > > -*   +--+
> > > +*   ++
> > > +*   | clear value dword(s) | needs resolve dword |
> > > +*   ++
> > >  */
> > > -   return device->isl_dev.ss.clear_value_size;
> > > +
> > > +   /* Ensure that the needs resolve dword is in fact dword-aligned to
> > > enable
> > > +* GPU memcpy operations.
> > > +*/
> > > +   assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > > +   return device->isl_dev.ss.clear_value_size + 4;
> > >  }
> > >
> > >  /* Returns true if a HiZ-enabled depth buffer can be sampled from. */
> > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > > b/src/intel/vulkan/genX_cmd_buffer.c
> > > index 62a2f22782..65d9c92783 100644
> > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > @@ -421,6 +421,59 @@ get_fast_clear_state_entry_offset(const struct
> > > anv_device *device,
> > > return offset;
> > >  }
> > >
> > > +#define MI_PREDICATE_SRC0  0x2400
> > > +#define MI_PREDICATE_SRC1  0x2408
> > > +
> > > +enum ccs_resolve_state {
> > > +   CCS_RESOLVE_NOT_NEEDED,
> > > +   CCS_RESOLVE_NEEDED,
> > >
> >
> > Are these two values sufficient?  Do we ever have a scenario where we do
> a
> > partial resolve and then a full resolve?  Do we need to be able to track
> > that?
> >
> >
>
> Yes, they are. We don't currently have such a scenario. This may come up
> later if we start temporarily enabling CCS_E, but I can't think of what
> sequence of events would trigger that to happen.
>
> > > +   CCS_RESOLVE_STARTING,
> > > +};
> > > +
> > > +/* Manages the state of an color image subresource to ensure resolves
> are
> > > + * performed properly.
> > > + */
> > > +static void
> > > +genX(set_resolve_state)(struct anv_cmd_buffer *cmd_buffer,
> > > +const struct anv_image *image,
> > > +unsigned level,
> > > +enum ccs_resolve_state state)
> > > +{
> > > +   assert(cmd_buffer && image);
> > > +   assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
> > > +   assert(level < anv_image_aux_levels(image));
> > > +
> > > +   const uint32_t resolve_flag_offset =
> > > +  get_fast_clear_state_entry_offset(cmd_buffer->device, image,
> > > level) +
> > > +  cmd_buffer->device->isl_dev.ss.clear_value_size;
> > > +
> > > +   if (state != CCS_RESOLVE_STARTING) {
> > > +  assert(state == CCS_RESOLVE_NEEDED || state ==
> > > CCS_RESOLVE_NOT_NEEDED);
> > > +  /* The HW docs say that there is no way to guarantee the
> completion
> > > of
> > > +   * the following command. We use it nevertheless because it
> shows no
> > > +   * issues in testing is currently being used in the GL driver.
> > > +   */
> > > +  anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
> > > + sdi.Address = (struct anv_address) { image->bo,
> > > resolve_flag_offset };
> > > + sdi.ImmediateData = state 
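
The entry layout from the quoted anv_private.h hunk (clear value dwords followed by one "needs resolve" dword) can be sketched as plain offset arithmetic. The function names are illustrative, and the assumption that entries for successive levels are packed back to back from a base offset is mine, not something stated in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

#define NEEDS_RESOLVE_DWORD_SIZE 4u

/* Size of one fast-clear state entry: clear value plus one flag dword. */
static uint32_t fast_clear_entry_size(uint32_t clear_value_size)
{
   /* The flag must stay dword-aligned so GPU memcpy operations work. */
   assert(clear_value_size % 4 == 0);
   return clear_value_size + NEEDS_RESOLVE_DWORD_SIZE;
}

/* Offset of the "needs resolve" dword for a miplevel, assuming entries are
 * laid out consecutively starting at "base" (an assumption of this sketch). */
static uint32_t resolve_flag_offset(uint32_t base, uint32_t clear_value_size,
                                    uint32_t level)
{
   return base + level * fast_clear_entry_size(clear_value_size) +
          clear_value_size;
}
```

With a 16-byte clear value each entry is 20 bytes, so the flag for level 0 sits 16 bytes past the base and the flag for level 2 sits 56 bytes past it.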

Re: [Mesa-dev] [PATCH v3 11/16] anv/cmd_buffer: Move aux_usage assignment up

2017-07-11 Thread Jason Ekstrand
On Tue, Jul 11, 2017 at 4:31 PM, Nanley Chery  wrote:

> On Mon, Jul 10, 2017 at 02:36:16PM -0700, Jason Ekstrand wrote:
> > On Wed, Jun 28, 2017 at 2:14 PM, Nanley Chery 
> wrote:
> >
> > > For readability, bring the assignment of CCS closer to the assignment
> of
> > > NONE and MCS.
> > >
> > > Signed-off-by: Nanley Chery 
> > > ---
> > >  src/intel/vulkan/genX_cmd_buffer.c | 62
> ++
> > > 
> > >  1 file changed, 30 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > > b/src/intel/vulkan/genX_cmd_buffer.c
> > > index 49ad41edbd..1aa79c8e7b 100644
> > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > @@ -253,6 +253,36 @@ color_attachment_compute_aux_usage(struct
> anv_device
> > > * device,
> > >att_state->input_aux_usage = ISL_AUX_USAGE_MCS;
> > >att_state->fast_clear = false;
> > >return;
> > > +   } else if (iview->image->aux_usage == ISL_AUX_USAGE_CCS_E) {
> > > +  att_state->aux_usage = ISL_AUX_USAGE_CCS_E;
> > > +  att_state->input_aux_usage = ISL_AUX_USAGE_CCS_E;
> > >
> >
> > I'm not sure if this actually improves readability as-is.  The no aux
> case
> > and MCS cases are early returns.  Maybe what we want is something like
> this:
> >
> > if (aux_surface.isl.size == 0) {
> >/* set to none */
> >return;
> > }
> >
> > switch (aux_usage) {
> > case MCS:
> > case CCS_E:
> >aux_state->aux_usage = iview->image->aux_usage;
> >aux_state->input_aux_usage = iview->image->aux_usage;
> >break;
> >
> > case NONE:
> >assert(samples == 1);
> >/* stuff below */
> >break;
> >
> > default:
> >unreachable();
> > }
> >
> > /* Now we determine whether or not we want to fast-clear */
> >
> > if (samples > 1) {
> >perf_debug();
> >fast_clear = false;
> >return;
> > }
> >
> > /* Other fast clear determination. */
> >
> > Incidentally, it may be cleaner in the long run if we split this into two
> > functions: compute_ccs_usage and compute_mcs_usage.
> >
> > Just thoughts BTW.  I'm not 100% sure how to make this the most readable.
> >
> >
>
> I'm personally not finding the alternative block above more readable.


I'm not entirely convinced either. :-)


> To
> overcome this disagreement, I don't mind doing any of the following:
>
> 1. Omit the rationale in the commit message.
> 2. Drop the patch from this series.
>

I don't really care that much.  I'd go for either 2. or just leave it as
is.  Either is fine with me.  I think it'll be much more clear what to do
once we have MCS fast clears in there.  What I wrote above was me trying to
imagine what it would look like with MCS fast clears and then delete the
extra MCS bits.  It'll work better once someone is writing code in a
computer rather than my brain. :-)

--Jason


> Thoughts?
>
> -Nanley
>
> > > +   } else {
> > > +  att_state->aux_usage = ISL_AUX_USAGE_CCS_D;
> > > +  /* From the Sky Lake PRM, RENDER_SURFACE_STATE::
> > > AuxiliarySurfaceMode:
> > > +   *
> > > +   *"If Number of Multisamples is MULTISAMPLECOUNT_1,
> AUX_CCS_D
> > > +   *setting is only allowed if Surface Format supported for
> Fast
> > > +   *Clear. In addition, if the surface is bound to the
> sampling
> > > +   *engine, Surface Format must be supported for Render Target
> > > +   *Compression for surfaces bound to the sampling engine."
> > > +   *
> > > +   * In other words, we can only sample from a fast-cleared image
> if
> > > it
> > > +   * also supports color compression.
> > > +   */
> > > +  if (isl_format_supports_ccs_e(&device->info, iview->isl.format)) {
> > > + /* TODO: Consider using a heuristic to determine if
> temporarily
> > > enabling
> > > +  * CCS_E for this image view would be beneficial.
> > > +  *
> > > +  * While fast-clear resolves and partial resolves are fairly
> > > cheap in the
> > > +  * case where you render to most of the pixels, full resolves
> > > are not
> > > +  * because they potentially involve reading and writing the
> > > entire
> > > +  * framebuffer.  If we can't texture with CCS_E, we should
> leave
> > > it off and
> > > +  * limit ourselves to fast clears.
> > > +  */
> > > + att_state->input_aux_usage = ISL_AUX_USAGE_CCS_D;
> > > +  } else {
> > > + att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
> > > +  }
> > > }
> > >
> > > assert(iview->image->aux_surface.isl.usage &
> ISL_SURF_USAGE_CCS_BIT);
> > > @@ -315,38 +345,6 @@ color_attachment_compute_aux_usage(struct
> anv_device
> > > * device,
> > > } else {
> > >att_state->fast_clear = false;
> > > }
> > > -
> > > -   /**
> > > -* TODO: Consider using a heuristic to determine if temporarily
> > > enabling
> > > -* CCS_E for this image 

[Mesa-dev] [PATCH 08/11] anv/image: Use vk_zalloc instead of an explicit memset

2017-07-11 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_image.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index fe5544e..7c4655d 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -264,12 +264,11 @@ anv_image_create(VkDevice _device,
anv_assert(pCreateInfo->extent.height > 0);
anv_assert(pCreateInfo->extent.depth > 0);
 
-   image = vk_alloc2(&device->alloc, alloc, sizeof(*image), 8,
-  VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+   image = vk_zalloc2(&device->alloc, alloc, sizeof(*image), 8,
+   VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!image)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   memset(image, 0, sizeof(*image));
image->type = pCreateInfo->imageType;
image->extent = pCreateInfo->extent;
image->vk_format = pCreateInfo->format;
-- 
2.5.0.400.gff86faf
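
For readers unfamiliar with the zalloc pattern, a zeroing wrapper around a callback allocator can be sketched as below. The callback type and names are hypothetical and much simpler than the real Vulkan allocation callbacks; the stand-in allocator deliberately returns dirty memory so the zeroing is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocation callback, loosely shaped like a user-supplied
 * allocator hook. */
typedef void *(*example_alloc_fn)(void *user_data, size_t size, size_t align);

/* Stand-in allocator that hands back non-zero memory on purpose.
 * It ignores "align" and relies on malloc's default alignment. */
static void *example_dirty_alloc(void *user_data, size_t size, size_t align)
{
   (void)user_data;
   (void)align;
   void *mem = malloc(size);
   if (mem != NULL)
      memset(mem, 0xAA, size);
   return mem;
}

/* Allocate through the callback and zero the result, so callers do not
 * need a separate memset after every allocation. */
static void *example_zalloc(example_alloc_fn alloc, void *user_data,
                            size_t size, size_t align)
{
   assert(alloc != NULL);
   void *mem = alloc(user_data, size, align);
   if (mem == NULL)
      return NULL;
   memset(mem, 0, size);
   return mem;
}
```

Folding the memset into the allocation helper keeps the NULL check and the zeroing in one place, which is the same simplification the patch makes at the call site.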



[Mesa-dev] [PATCH 11/11] anv: Add a new anv_surface_state struct

2017-07-11 Thread Jason Ekstrand
This struct represents a full surface state including the addresses of
the referenced main and auxiliary surfaces (if any).  This makes
relocation setup substantially simpler and allows us to move 100% of the
surface state setup logic into anv_image where it belongs.  Before, we
were manually fishing data out of surface states when emitting
relocations so we knew how to offset aux address.  It's best to keep all
of the surface state emit logic together.  This also gets us closer, at
least cosmetically, to a world of no relocations where addresses are
placed in surface states up-front.
---
 src/intel/vulkan/anv_blorp.c   |  4 +-
 src/intel/vulkan/anv_image.c   | 68 +---
 src/intel/vulkan/anv_private.h | 35 ++-
 src/intel/vulkan/genX_cmd_buffer.c | 91 ++
 4 files changed, 101 insertions(+), 97 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 459d57e..16f7f42 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -994,7 +994,7 @@ clear_color_attachment(struct anv_cmd_buffer *cmd_buffer,
 
uint32_t binding_table;
VkResult result =
-  binding_table_for_surface_state(cmd_buffer, att_state->color_rt_state,
+  binding_table_for_surface_state(cmd_buffer, att_state->color.state,
   &binding_table);
if (result != VK_SUCCESS)
   return;
@@ -1606,7 +1606,7 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
cmd_buffer->state.pending_pipe_bits |=
   ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
 
-   anv_ccs_resolve(cmd_buffer, att_state->color_rt_state, image,
+   anv_ccs_resolve(cmd_buffer, att_state->color.state, image,
iview->isl.base_level, fb->layers, resolve_op);
 
cmd_buffer->state.pending_pipe_bits |=
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 82c4718..1036821 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -563,7 +563,7 @@ anv_image_view_fill_state(struct anv_device *device,
   enum isl_aux_usage aux_usage,
   const union isl_color_value *clear_color,
   enum anv_image_view_state_flags flags,
-  struct anv_state *state,
+  struct anv_surface_state *state_inout,
   struct brw_image_param *image_param_out)
 {
const struct anv_surface *surface =
@@ -584,6 +584,10 @@ anv_image_view_fill_state(struct anv_device *device,
if (!clear_color)
   clear_color = &default_clear_color;
 
+   const uint64_t address = iview->image->offset + surface->offset;
+   const uint64_t aux_address = (aux_usage == ISL_AUX_USAGE_NONE) ?
+  0 : iview->image->offset + iview->image->aux_surface.offset;
+
if (view_usage == ISL_SURF_USAGE_STORAGE_BIT &&
!(flags & ANV_IMAGE_VIEW_STATE_STORAGE_WRITE_ONLY) &&
!isl_has_matching_typed_storage_image_format(&device->info,
@@ -593,11 +597,14 @@ anv_image_view_fill_state(struct anv_device *device,
* the shader.
*/
   assert(aux_usage == ISL_AUX_USAGE_NONE);
-  isl_buffer_fill_state(&device->isl_dev, state->map,
+  isl_buffer_fill_state(&device->isl_dev, state_inout->state.map,
+.address = address,
 .size = surface->isl.size,
 .format = ISL_FORMAT_RAW,
 .stride = 1,
 .mocs = device->default_mocs);
+  state_inout->address = address;
+  state_inout->aux_address = 0;
} else {
   if (view_usage == ISL_SURF_USAGE_STORAGE_BIT &&
   !(flags & ANV_IMAGE_VIEW_STATE_STORAGE_WRITE_ONLY)) {
@@ -610,16 +617,32 @@ anv_image_view_fill_state(struct anv_device *device,
   view.format);
   }
 
-  isl_surf_fill_state(&device->isl_dev, state->map,
+  isl_surf_fill_state(&device->isl_dev, state_inout->state.map,
   .surf = &surface->isl,
   .view = &view,
+  .address = address,
   .clear_color = *clear_color,
   .aux_surf = &iview->image->aux_surface.isl,
   .aux_usage = aux_usage,
+  .aux_address = aux_address,
   .mocs = device->default_mocs);
+  state_inout->address = address;
+  if (device->info.gen >= 8) {
+ state_inout->aux_address = aux_address;
+  } else {
+ /* On gen7 and prior, the bottom 12 bits of the MCS base address are
+  * used to store other information.  This should be ok, however,
+  * because surface buffer addresses are always 4K page aligned.
+  */
+ uint32_t *aux_addr_dw = state_inout->state.map +
+ device->isl_dev.ss.aux_addr_offset;
+ 
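
The comment in the hunk above relies on 4K alignment leaving the bottom 12 bits of the aux address dword free for other fields. A small sketch of that packing follows; the helper names and the 32-bit dword width are assumptions made for illustration, and this is not the code the truncated patch goes on to add.

```c
#include <assert.h>
#include <stdint.h>

/* Low 12 bits of the dword carry non-address information. */
#define AUX_DWORD_LOW_BITS_MASK UINT32_C(0xfff)

/* Combine a 4K-aligned address with the extra low-bit fields. */
static uint32_t pack_aux_dword(uint32_t aux_address, uint32_t low_bits)
{
   assert((aux_address & AUX_DWORD_LOW_BITS_MASK) == 0); /* 4K aligned */
   assert((low_bits & ~AUX_DWORD_LOW_BITS_MASK) == 0);   /* fits in 12 bits */
   return aux_address | low_bits;
}

/* Recover the two halves from a packed dword. */
static uint32_t aux_dword_address(uint32_t dw)
{
   return dw & ~AUX_DWORD_LOW_BITS_MASK;
}

static uint32_t aux_dword_low_bits(uint32_t dw)
{
   return dw & AUX_DWORD_LOW_BITS_MASK;
}
```

Because the address contributes nothing to the low 12 bits, OR-ing (or adding) the two parts is lossless, and both can be recovered by masking.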

[Mesa-dev] [PATCH 09/11] anv/image: zalloc image views

2017-07-11 Thread Jason Ekstrand
This allows us to avoid some extra zeroing.
---
 src/intel/vulkan/anv_image.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 7c4655d..af50ebd 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -567,7 +567,7 @@ anv_CreateImageView(VkDevice _device,
ANV_FROM_HANDLE(anv_image, image, pCreateInfo->image);
struct anv_image_view *iview;
 
-   iview = vk_alloc2(&device->alloc, pAllocator, sizeof(*iview), 8,
+   iview = vk_zalloc2(&device->alloc, pAllocator, sizeof(*iview), 8,
   VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (iview == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
@@ -694,9 +694,6 @@ anv_CreateImageView(VkDevice _device,
 
   anv_state_flush(device, iview->optimal_sampler_surface_state);
   anv_state_flush(device, iview->general_sampler_surface_state);
-   } else {
-  iview->optimal_sampler_surface_state.alloc_size = 0;
-  iview->general_sampler_surface_state.alloc_size = 0;
}
 
/* NOTE: This one needs to go last since it may stomp isl_view.format */
@@ -747,9 +744,6 @@ anv_CreateImageView(VkDevice _device,
 
   anv_state_flush(device, iview->storage_surface_state);
   anv_state_flush(device, iview->writeonly_storage_surface_state);
-   } else {
-  iview->storage_surface_state.alloc_size = 0;
-  iview->writeonly_storage_surface_state.alloc_size = 0;
}
 
*pView = anv_image_view_to_handle(iview);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 04/11] anv/image: Add INPUT_ATTACHMENT to the list of required usages

2017-07-11 Thread Jason Ekstrand
From the Vulkan 1.0.53 spec VU for vkCreateImageView:

"image must have been created with a usage value containing at least
one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/vulkan/anv_image.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index a4cf6f8..f555db8 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -580,6 +580,7 @@ anv_CreateImageView(VkDevice _device,
assert(image->usage & (VK_IMAGE_USAGE_SAMPLED_BIT |
   VK_IMAGE_USAGE_STORAGE_BIT |
   VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
+  VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT |
   VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT));
 
switch (image->type) {
-- 
2.5.0.400.gff86faf
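
The valid-usage rule quoted in the commit message boils down to a bitmask test. The sketch below copies the numeric values of the relevant VkImageUsageFlagBits into a local enum so it builds without the Vulkan headers; the enum and function names are local to this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the VkImageUsageFlagBits values used by the check. */
enum {
   USAGE_SAMPLED_BIT                  = 0x00000004,
   USAGE_STORAGE_BIT                  = 0x00000008,
   USAGE_COLOR_ATTACHMENT_BIT         = 0x00000010,
   USAGE_DEPTH_STENCIL_ATTACHMENT_BIT = 0x00000020,
   USAGE_INPUT_ATTACHMENT_BIT         = 0x00000080,
};

#define IMAGE_VIEW_USAGE_MASK                                   \
   (USAGE_SAMPLED_BIT | USAGE_STORAGE_BIT |                     \
    USAGE_COLOR_ATTACHMENT_BIT | USAGE_INPUT_ATTACHMENT_BIT |   \
    USAGE_DEPTH_STENCIL_ATTACHMENT_BIT)

/* True when the image usage contains at least one bit that permits
 * creating an image view. */
static bool usage_allows_image_view(uint32_t usage)
{
   return (usage & IMAGE_VIEW_USAGE_MASK) != 0;
}

/* Debug-build check in the same spirit as the assert in the patch. */
static void require_image_view_usage(uint32_t usage)
{
   assert(usage_allows_image_view(usage));
}
```

An image created only for use as an input attachment (0x80) passes the check with the new bit in the mask, whereas before the fix it would have tripped the assert.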



[Mesa-dev] [PATCH 10/11] anv/image: Break surface state fill logic into a helper

2017-07-11 Thread Jason Ekstrand
This gives us a single centralized place where we take an image view and
use it to fill out a surface state.
---
 src/intel/vulkan/anv_image.c   | 171 +
 src/intel/vulkan/anv_private.h |  14 +++
 src/intel/vulkan/genX_cmd_buffer.c |  37 +++-
 3 files changed, 125 insertions(+), 97 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index af50ebd..82c4718 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -556,6 +556,77 @@ remap_swizzle(VkComponentSwizzle swizzle, VkComponentSwizzle component,
}
 }
 
+void
+anv_image_view_fill_state(struct anv_device *device,
+  const struct anv_image_view *iview,
+  isl_surf_usage_flags_t view_usage,
+  enum isl_aux_usage aux_usage,
+  const union isl_color_value *clear_color,
+  enum anv_image_view_state_flags flags,
+  struct anv_state *state,
+  struct brw_image_param *image_param_out)
+{
+   const struct anv_surface *surface =
+  anv_image_get_surface_for_aspect_mask(iview->image, iview->aspect_mask);
+
+   struct isl_view view = iview->isl;
+   view.usage |= view_usage;
+
+   if (view_usage == ISL_SURF_USAGE_RENDER_TARGET_BIT)
+  view.swizzle = anv_swizzle_for_render(view.swizzle);
+
+   /* If this is a HiZ buffer we can sample from with a programmable clear
+* value (SKL+), define the clear value to the optimal constant.
+*/
+   union isl_color_value default_clear_color = { .u32 = { 0, } };
+   if (device->info.gen >= 9 && aux_usage == ISL_AUX_USAGE_HIZ)
+  default_clear_color.f32[0] = ANV_HZ_FC_VAL;
+   if (!clear_color)
+  clear_color = &default_clear_color;
+
+   if (view_usage == ISL_SURF_USAGE_STORAGE_BIT &&
+   !(flags & ANV_IMAGE_VIEW_STATE_STORAGE_WRITE_ONLY) &&
+   !isl_has_matching_typed_storage_image_format(&device->info,
+view.format)) {
+  /* In this case, we are a writeable storage buffer which needs to be
+   * lowered to linear. All tiling and offset calculations will be done in
+   * the shader.
+   */
+  assert(aux_usage == ISL_AUX_USAGE_NONE);
+  isl_buffer_fill_state(&device->isl_dev, state->map,
+.size = surface->isl.size,
+.format = ISL_FORMAT_RAW,
+.stride = 1,
+.mocs = device->default_mocs);
+   } else {
+  if (view_usage == ISL_SURF_USAGE_STORAGE_BIT &&
+  !(flags & ANV_IMAGE_VIEW_STATE_STORAGE_WRITE_ONLY)) {
+ /* Typed surface reads support a very limited subset of the shader
+  * image formats.  Translate it into the closest format the hardware
+  * supports.
+  */
+ assert(aux_usage == ISL_AUX_USAGE_NONE);
+ view.format = isl_lower_storage_image_format(&device->info,
+  view.format);
+  }
+
+  isl_surf_fill_state(&device->isl_dev, state->map,
+  .surf = &surface->isl,
+  .view = &view,
+  .clear_color = *clear_color,
+  .aux_surf = &iview->image->aux_surface.isl,
+  .aux_usage = aux_usage,
+  .mocs = device->default_mocs);
+   }
+
+   anv_state_flush(device, *state);
+
+   if (image_param_out) {
+  assert(view_usage == ISL_SURF_USAGE_STORAGE_BIT);
+  isl_surf_fill_image_param(&device->isl_dev, image_param_out,
+&surface->isl, &view);
+   }
+}
 
 VkResult
 anv_CreateImageView(VkDevice _device,
@@ -663,37 +734,19 @@ anv_CreateImageView(VkDevice _device,
  anv_layout_to_aux_usage(&device->info, image, iview->aspect_mask,
  VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
 
-  /* If this is a HiZ buffer we can sample from with a programmable clear
-   * value (SKL+), define the clear value to the optimal constant.
-   */
-  union isl_color_value clear_color = { .u32 = { 0, } };
-  if ((iview->aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT) &&
-  device->info.gen >= 9)
- clear_color.f32[0] = ANV_HZ_FC_VAL;
-
-  struct isl_view view = iview->isl;
-  view.usage |= ISL_SURF_USAGE_TEXTURE_BIT;
-
-  isl_surf_fill_state(&device->isl_dev,
-  iview->optimal_sampler_surface_state.map,
-  .surf = &surface->isl,
-  .view = &view,
-  .clear_color = clear_color,
-  .aux_surf = &image->aux_surface.isl,
-  .aux_usage = iview->optimal_sampler_aux_usage,
-  .mocs = device->default_mocs);
-
-  isl_surf_fill_state(&device->isl_dev,
-  iview->general_sampler_surface_state.map,
-  .surf = &surface->isl,
-

[Mesa-dev] [PATCH 03/11] anv/cmd_buffer: Properly handle render passes with 0 attachments

2017-07-11 Thread Jason Ekstrand
We were early returning and never created the NULL surface state.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/vulkan/genX_cmd_buffer.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 53c58ca..9b3bb10 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -429,19 +429,18 @@ genX(cmd_buffer_setup_attachments)(struct anv_cmd_buffer *cmd_buffer,
 
vk_free(&cmd_buffer->pool->alloc, state->attachments);
 
-   if (pass->attachment_count == 0) {
+   if (pass->attachment_count > 0) {
+  state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
+pass->attachment_count *
+ sizeof(state->attachments[0]),
+8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+  if (state->attachments == NULL) {
+ /* Propagate VK_ERROR_OUT_OF_HOST_MEMORY to vkEndCommandBuffer */
+ return anv_batch_set_error(&cmd_buffer->batch,
+VK_ERROR_OUT_OF_HOST_MEMORY);
+  }
+   } else {
   state->attachments = NULL;
-  return VK_SUCCESS;
-   }
-
-   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
- pass->attachment_count *
-  sizeof(state->attachments[0]),
- 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
-   if (state->attachments == NULL) {
-  /* Propagate VK_ERROR_OUT_OF_HOST_MEMORY to vkEndCommandBuffer */
-  return anv_batch_set_error(&cmd_buffer->batch,
- VK_ERROR_OUT_OF_HOST_MEMORY);
}
 
/* Reserve one for the NULL state. */
-- 
2.5.0.400.gff86faf
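
The control-flow change in this patch can be shown in isolation: the zero-attachment case skips only the allocation, not the rest of the setup. The struct, the names, and the "one state per attachment plus one NULL state" count are simplifications invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for the command buffer state. */
struct attachment_setup {
   int *attachments;        /* NULL when there are no attachments */
   unsigned surface_states; /* always includes one NULL surface state */
};

/* Returns 0 on success, -1 on allocation failure. */
static int setup_attachments(struct attachment_setup *s, unsigned count)
{
   assert(s != NULL);
   free(s->attachments);

   if (count > 0) {
      s->attachments = calloc(count, sizeof(s->attachments[0]));
      if (s->attachments == NULL)
         return -1;
   } else {
      s->attachments = NULL;
      /* No early return here: the NULL surface state below is still
       * needed even when the render pass has no attachments. */
   }

   /* Reserve one for the NULL state, plus one per attachment. */
   s->surface_states = 1 + count;
   return 0;
}
```

With the old early return, the count == 0 path would have left surface_states untouched, which is the analogue of never creating the NULL surface state.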



[Mesa-dev] [PATCH 02/11] anv: Get rid of some unused function declarations

2017-07-11 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_private.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 4dce360..3e7f9c7 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2226,13 +2226,6 @@ void anv_fill_buffer_surface_state(struct anv_device *device,
uint32_t offset, uint32_t range,
uint32_t stride);
 
-void anv_image_view_fill_image_param(struct anv_device *device,
- struct anv_image_view *view,
- struct brw_image_param *param);
-void anv_buffer_view_fill_image_param(struct anv_device *device,
-  struct anv_buffer_view *view,
-  struct brw_image_param *param);
-
 struct anv_sampler {
uint32_t state[4];
 };
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 07/11] anv: Separate surface states by layout instead of aux_usage

2017-07-11 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_descriptor_set.c |  5 +---
 src/intel/vulkan/anv_image.c  | 56 ---
 src/intel/vulkan/anv_private.h| 21 ++---
 src/intel/vulkan/genX_cmd_buffer.c| 29 --
 4 files changed, 58 insertions(+), 53 deletions(-)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index 4b58b0b..91387c0 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -615,12 +615,9 @@ anv_descriptor_set_write_image_view(struct 
anv_descriptor_set *set,
 
*desc = (struct anv_descriptor) {
   .type = type,
+  .layout = info->imageLayout,
   .image_view = image_view,
   .sampler = sampler,
-  .aux_usage = image_view == NULL ? ISL_AUX_USAGE_NONE :
-   anv_layout_to_aux_usage(devinfo, image_view->image,
-   image_view->aspect_mask,
-   info->imageLayout),
};
 }
 
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index f555db8..fe5544e 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -654,54 +654,50 @@ anv_CreateImageView(VkDevice _device,
if (image->usage & VK_IMAGE_USAGE_SAMPLED_BIT ||
(image->usage & VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT &&
 !(iview->aspect_mask & VK_IMAGE_ASPECT_COLOR_BIT))) {
-  iview->sampler_surface_state = alloc_surface_state(device);
-  iview->no_aux_sampler_surface_state = alloc_surface_state(device);
+  iview->optimal_sampler_surface_state = alloc_surface_state(device);
+  iview->general_sampler_surface_state = alloc_surface_state(device);
 
-  /* Sampling is performed in one of two buffer configurations in anv: with
-   * an auxiliary buffer or without it. Sampler states aren't always needed
-   * for both configurations, but are currently created unconditionally for
-   * simplicity.
-   *
-   * TODO: Consider allocating each surface state only when necessary.
-   */
-
-  /* Create a sampler state with the optimal aux_usage for sampling. This
-   * may use the aux_buffer.
-   */
-  const enum isl_aux_usage surf_usage =
+  iview->general_sampler_aux_usage =
+ anv_layout_to_aux_usage(>info, image, iview->aspect_mask,
+ VK_IMAGE_LAYOUT_GENERAL);
+  iview->optimal_sampler_aux_usage =
  anv_layout_to_aux_usage(>info, image, iview->aspect_mask,
  VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
 
   /* If this is a HiZ buffer we can sample from with a programmable clear
* value (SKL+), define the clear value to the optimal constant.
*/
-  const float red_clear_color = surf_usage == ISL_AUX_USAGE_HIZ &&
-device->info.gen >= 9 ?
-ANV_HZ_FC_VAL : 0.0f;
+  union isl_color_value clear_color = { .u32 = { 0, } };
+  if ((iview->aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT) &&
+  device->info.gen >= 9)
+ clear_color.f32[0] = ANV_HZ_FC_VAL;
 
   struct isl_view view = iview->isl;
   view.usage |= ISL_SURF_USAGE_TEXTURE_BIT;
+
   isl_surf_fill_state(>isl_dev,
-  iview->sampler_surface_state.map,
+  iview->optimal_sampler_surface_state.map,
   .surf = >isl,
   .view = ,
-  .clear_color.f32 = { red_clear_color,},
+  .clear_color = clear_color,
   .aux_surf = >aux_surface.isl,
-  .aux_usage = surf_usage,
+  .aux_usage = iview->optimal_sampler_aux_usage,
   .mocs = device->default_mocs);
 
-  /* Create a sampler state that only uses the main buffer. */
   isl_surf_fill_state(>isl_dev,
-  iview->no_aux_sampler_surface_state.map,
+  iview->general_sampler_surface_state.map,
   .surf = >isl,
   .view = ,
+  .clear_color = clear_color,
+  .aux_surf = >aux_surface.isl,
+  .aux_usage = iview->general_sampler_aux_usage,
   .mocs = device->default_mocs);
 
-  anv_state_flush(device, iview->sampler_surface_state);
-  anv_state_flush(device, iview->no_aux_sampler_surface_state);
+  anv_state_flush(device, iview->optimal_sampler_surface_state);
+  anv_state_flush(device, iview->general_sampler_surface_state);
} else {
-  iview->sampler_surface_state.alloc_size = 0;
-  iview->no_aux_sampler_surface_state.alloc_size = 0;
+  iview->optimal_sampler_surface_state.alloc_size = 0;
+  iview->general_sampler_surface_state.alloc_size = 0;
}
 

[Mesa-dev] [PATCH 06/11] intel/isl: Add some sanity checks for compressed surfaces

2017-07-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 1620c93..e8bdb65 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -254,6 +254,24 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
if (info->surf->dim == ISL_SURF_DIM_1D)
   assert(!isl_format_is_compressed(info->view->format));
 
+   if (isl_format_is_compressed(info->surf->format)) {
+  /* You're not allowed to make a view of a compressed format with any
+   * format other than the surface format.  None of the userspace APIs
+   * allow for this directly and doing so would mess up a number of
+   * surface parameters such as Width, Height, and alignments.  Ideally,
+   * we'd like to assert that the two formats match.  However, we have an
+   * S3TC workaround that requires us to do reinterpretation.  So assert
+   * that they're at least the same bpb and block size.
+   */
+  MAYBE_UNUSED const struct isl_format_layout *surf_fmtl =
+ isl_format_get_layout(info->surf->format);
+  MAYBE_UNUSED const struct isl_format_layout *view_fmtl =
+ isl_format_get_layout(info->view->format);
+  assert(surf_fmtl->bpb == view_fmtl->bpb);
+  assert(surf_fmtl->bw == view_fmtl->bw);
+  assert(surf_fmtl->bh == view_fmtl->bh);
+   }
+
s.SurfaceFormat = info->view->format;
 
 #if GEN_GEN <= 5
-- 
2.5.0.400.gff86faf
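
For readers following along, the check described in the comment above can be
sketched in isolation. This is only an illustration under stated assumptions:
struct fmt_layout and view_layout_compatible() are hypothetical names, a
trimmed-down stand-in for the real isl format layout, and the intent is just
to show that the surface format's layout is compared against the view
format's layout field by field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for a format layout description. */
struct fmt_layout {
   uint16_t bpb; /* bits per block */
   uint8_t bw;   /* block width, in pixels */
   uint8_t bh;   /* block height, in pixels */
};

/* A view of a compressed surface may reinterpret the format only if the
 * block size in bits and the block dimensions are unchanged.
 */
static bool
view_layout_compatible(const struct fmt_layout *surf_fmtl,
                       const struct fmt_layout *view_fmtl)
{
   return surf_fmtl->bpb == view_fmtl->bpb &&
          surf_fmtl->bw == view_fmtl->bw &&
          surf_fmtl->bh == view_fmtl->bh;
}
```

A 64-bit 4x4 block format viewed as another 64-bit 4x4 block format passes;
viewing it as a 128-bit 4x4 format or as an uncompressed 32-bit 1x1 format
does not.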



[Mesa-dev] [PATCH 05/11] intel/isl: Add a helper to get a subimage surface

2017-07-11 Thread Jason Ekstrand
---
 src/intel/blorp/blorp_blit.c | 38 +-
 src/intel/isl/isl.c  | 35 +++
 src/intel/isl/isl.h  | 23 +++
 3 files changed, 71 insertions(+), 25 deletions(-)

diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index 0850473..6407808 100644
--- a/src/intel/blorp/blorp_blit.c
+++ b/src/intel/blorp/blorp_blit.c
@@ -1406,36 +1406,24 @@ blorp_surf_convert_to_single_slice(const struct 
isl_device *isl_dev,
   layer = info->view.base_array_layer;
 
uint32_t byte_offset;
-   isl_surf_get_image_offset_B_tile_sa(&info->surf,
-   info->view.base_level, layer, z,
-   &byte_offset,
-   &info->tile_x_sa, &info->tile_y_sa);
+   isl_surf_get_image_surf(isl_dev, &info->surf,
+   info->view.base_level, layer, z,
+   &info->surf,
+   &byte_offset, &info->tile_x_sa, &info->tile_y_sa);
info->addr.offset += byte_offset;
 
-   const uint32_t slice_width_px =
-  minify(info->surf.logical_level0_px.width, info->view.base_level);
-   const uint32_t slice_height_px =
-  minify(info->surf.logical_level0_px.height, info->view.base_level);
-
uint32_t tile_x_px, tile_y_px;
surf_get_intratile_offset_px(info, &tile_x_px, &tile_y_px);
 
-   struct isl_surf_init_info init_info = {
-  .dim = ISL_SURF_DIM_2D,
-  .format = info->surf.format,
-  .width = slice_width_px + tile_x_px,
-  .height = slice_height_px + tile_y_px,
-  .depth = 1,
-  .levels = 1,
-  .array_len = 1,
-  .samples = info->surf.samples,
-  .row_pitch = info->surf.row_pitch,
-  .usage = info->surf.usage,
-  .tiling_flags = 1 << info->surf.tiling,
-   };
-
-   ok = isl_surf_init_s(isl_dev, &info->surf, &init_info);
-   assert(ok);
+   /* Instead of using the X/Y Offset fields in RENDER_SURFACE_STATE, we place
+* the image at the tile boundary and offset our sampling or rendering.
+* For this reason, we need to grow the image by the offset to ensure that
+* the hardware doesn't think we've gone past the edge.
+*/
+   info->surf.logical_level0_px.w += tile_x_px;
+   info->surf.logical_level0_px.h += tile_y_px;
+   info->surf.phys_level0_sa.w += info->tile_x_sa;
+   info->surf.phys_level0_sa.h += info->tile_y_sa;
 
/* The view is also different now. */
info->view.base_level = 0;
diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index ba56d86..ce25c63 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -2221,6 +2221,41 @@ isl_surf_get_image_offset_B_tile_sa(const struct 
isl_surf *surf,
 }
 
 void
+isl_surf_get_image_surf(const struct isl_device *dev,
+const struct isl_surf *surf,
+uint32_t level,
+uint32_t logical_array_layer,
+uint32_t logical_z_offset_px,
+struct isl_surf *image_surf,
+uint32_t *offset_B,
+uint32_t *x_offset_sa,
+uint32_t *y_offset_sa)
+{
+   isl_surf_get_image_offset_B_tile_sa(surf,
+   level,
+   logical_array_layer,
+   logical_z_offset_px,
+   offset_B,
+   x_offset_sa,
+   y_offset_sa);
+
+   bool ok UNUSED;
+   ok = isl_surf_init(dev, image_surf,
+  .dim = ISL_SURF_DIM_2D,
+  .format = surf->format,
+  .width = isl_minify(surf->logical_level0_px.w, level),
+  .height = isl_minify(surf->logical_level0_px.h, level),
+  .depth = 1,
+  .levels = 1,
+  .array_len = 1,
+  .samples = surf->samples,
+  .row_pitch = surf->row_pitch,
+  .usage = surf->usage,
+  .tiling_flags = (1 << surf->tiling));
+   assert(ok);
+}
+
+void
 isl_tiling_get_intratile_offset_el(enum isl_tiling tiling,
uint32_t bpb,
uint32_t row_pitch,
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 07ff01a..94784e2 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -1780,6 +1780,29 @@ isl_surf_get_image_offset_B_tile_sa(const struct 
isl_surf *surf,
 uint32_t *y_offset_sa);
 
 /**
+ * Create an isl_surf that represents a particular subimage in the surface.
+ *
+ * The newly created surface will have a single miplevel and array slice.  The
+ * surface lives at the returned byte and intratile offsets, in samples.
+ *
+ * It is safe to call this function with surf == image_surf.
+ *
+ * @invariant level < surface levels
+ 
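
As a rough illustration of the size arithmetic in this patch (the subimage is
minified to the requested level, and blorp then grows it by the intratile
offset), here is a small standalone sketch. The names minify_dim(),
struct extent2d and subimage_extent() are hypothetical and not part of isl;
the sketch assumes level is less than 32.

```c
#include <stdint.h>

/* Size of one dimension at the given miplevel, clamped to at least 1. */
static uint32_t
minify_dim(uint32_t n, uint32_t level)
{
   uint32_t m = n >> level;
   return m > 1 ? m : 1;
}

struct extent2d {
   uint32_t w, h;
};

/* Extent of a single-level, single-slice subimage that is placed at a tile
 * boundary and then grown by the intratile offset so that sampling or
 * rendering at (tile_x_px, tile_y_px) stays inside the surface.
 */
static struct extent2d
subimage_extent(struct extent2d level0, uint32_t level,
                uint32_t tile_x_px, uint32_t tile_y_px)
{
   struct extent2d e = {
      .w = minify_dim(level0.w, level) + tile_x_px,
      .h = minify_dim(level0.h, level) + tile_y_px,
   };
   return e;
}
```

For a 256x100 level-0 image, level 3 is 32x12; with an intratile offset of
(8, 4) pixels the grown extent is 40x16.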

[Mesa-dev] [PATCH 01/11] anv: Stop leaking the no_aux sampler surface state

2017-07-11 Thread Jason Ekstrand
Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/vulkan/anv_image.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index c84fc8d..a4cf6f8 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -776,6 +776,11 @@ anv_DestroyImageView(VkDevice _device, VkImageView _iview,
   iview->sampler_surface_state);
}
 
+   if (iview->no_aux_sampler_surface_state.alloc_size > 0) {
+  anv_state_pool_free(>surface_state_pool,
+  iview->no_aux_sampler_surface_state);
+   }
+
if (iview->storage_surface_state.alloc_size > 0) {
   anv_state_pool_free(>surface_state_pool,
   iview->storage_surface_state);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 00/11] anv: A few refactors and fixes

2017-07-11 Thread Jason Ekstrand
I started working on another up-and-coming Vulkan feature today and, as is
frequently the case, found a bunch of bugs along the way.  This tiny little
series fixes some bugs, does a few cleanups, and does most of the needed
refactoring for said feature.  The last two patches are the important part.
They move us away from the world where the binding table building code
knows about the details of your surface states.

Nanley, I know this series conflicts a bit with what you're working on.
I'm not in all that much of a rush to land these so don't worry about
rebasing.

Jason Ekstrand (11):
  anv: Stop leaking the no_aux sampler surface state
  anv: Get rid of some unused function declarations
  anv/cmd_buffer: Properly handle render passes with 0 attachments
  anv/image: Add INPUT_ATTACHMENT to the list of required usages
  intel/isl: Add a helper to get a subimage surface
  intel/isl: Add some sanity checks for compressed surfaces
  anv: Separate surface states by layout instead of aux_usage
  anv/image: Use vk_zalloc instead of an explicit memset
  anv/image: zalloc image views
  anv/image: Break surface state fill logic into a helper
  anv: Add a new anv_surface_state struct

 src/intel/blorp/blorp_blit.c  |  38 ++---
 src/intel/isl/isl.c   |  35 +
 src/intel/isl/isl.h   |  23 
 src/intel/isl/isl_surface_state.c |  18 +++
 src/intel/vulkan/anv_blorp.c  |   4 +-
 src/intel/vulkan/anv_descriptor_set.c |   5 +-
 src/intel/vulkan/anv_image.c  | 252 +++---
 src/intel/vulkan/anv_private.h|  67 +
 src/intel/vulkan/genX_cmd_buffer.c| 144 ---
 9 files changed, 339 insertions(+), 247 deletions(-)

-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH v3 11/16] anv/cmd_buffer: Move aux_usage assignment up

2017-07-11 Thread Nanley Chery
On Mon, Jul 10, 2017 at 02:36:16PM -0700, Jason Ekstrand wrote:
> On Wed, Jun 28, 2017 at 2:14 PM, Nanley Chery  wrote:
> 
> > For readability, bring the assignment of CCS closer to the assignment of
> > NONE and MCS.
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/intel/vulkan/genX_cmd_buffer.c | 62 ++
> > 
> >  1 file changed, 30 insertions(+), 32 deletions(-)
> >
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > b/src/intel/vulkan/genX_cmd_buffer.c
> > index 49ad41edbd..1aa79c8e7b 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -253,6 +253,36 @@ color_attachment_compute_aux_usage(struct anv_device
> > * device,
> >att_state->input_aux_usage = ISL_AUX_USAGE_MCS;
> >att_state->fast_clear = false;
> >return;
> > +   } else if (iview->image->aux_usage == ISL_AUX_USAGE_CCS_E) {
> > +  att_state->aux_usage = ISL_AUX_USAGE_CCS_E;
> > +  att_state->input_aux_usage = ISL_AUX_USAGE_CCS_E;
> >
> 
> I'm not sure if this actually improves readability as-is.  The no aux case
> and MCS cases are early returns.  Maybe what we want is something like this:
> 
> if (aux_surface.isl.size == 0) {
>/* set to none */
>return;
> }
> 
> switch (aux_usage) {
> case MCS:
> case CCS_E:
>aux_state->aux_usage = iview->image->aux_usage;
>aux_state->input_aux_usage = iview->image->aux_usage;
>break;
> 
> case NONE:
>assert(samples == 1);
>/* stuff below */
>break;
> 
> default:
>unreachable();
> }
> 
> /* Now we determine whether or not we want to fast-clear */
> 
> if (samples > 1) {
>perf_debug();
>fast_clear = false;
>return;
> }
> 
> /* Other fast clear determination. */
> 
> Incidentally, it may be cleaner in the long run if we split this into two
> functions: compute_ccs_usage and compute_mcs_usage.
> 
> Just thoughts BTW.  I'm not 100% sure how to make this the most readable.
> 
> 

I'm personally not finding the alternative block above more readable. To
overcome this disagreement, I don't mind doing any of the following:

1. Omit the rationale in the commit message.
2. Drop the patch from this series.

Thoughts?

-Nanley

> > +   } else {
> > +  att_state->aux_usage = ISL_AUX_USAGE_CCS_D;
> > +  /* From the Sky Lake PRM, RENDER_SURFACE_STATE::
> > AuxiliarySurfaceMode:
> > +   *
> > +   *"If Number of Multisamples is MULTISAMPLECOUNT_1, AUX_CCS_D
> > +   *setting is only allowed if Surface Format supported for Fast
> > +   *Clear. In addition, if the surface is bound to the sampling
> > +   *engine, Surface Format must be supported for Render Target
> > +   *Compression for surfaces bound to the sampling engine."
> > +   *
> > +   * In other words, we can only sample from a fast-cleared image if
> > it
> > +   * also supports color compression.
> > +   */
> > +  if (isl_format_supports_ccs_e(>info, iview->isl.format)) {
> > + /* TODO: Consider using a heuristic to determine if temporarily
> > enabling
> > +  * CCS_E for this image view would be beneficial.
> > +  *
> > +  * While fast-clear resolves and partial resolves are fairly
> > cheap in the
> > +  * case where you render to most of the pixels, full resolves
> > are not
> > +  * because they potentially involve reading and writing the
> > entire
> > +  * framebuffer.  If we can't texture with CCS_E, we should leave
> > it off and
> > +  * limit ourselves to fast clears.
> > +  */
> > + att_state->input_aux_usage = ISL_AUX_USAGE_CCS_D;
> > +  } else {
> > + att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
> > +  }
> > }
> >
> > assert(iview->image->aux_surface.isl.usage & ISL_SURF_USAGE_CCS_BIT);
> > @@ -315,38 +345,6 @@ color_attachment_compute_aux_usage(struct anv_device
> > * device,
> > } else {
> >att_state->fast_clear = false;
> > }
> > -
> > -   /**
> > -* TODO: Consider using a heuristic to determine if temporarily
> > enabling
> > -* CCS_E for this image view would be beneficial.
> > -*
> > -* While fast-clear resolves and partial resolves are fairly cheap in
> > the
> > -* case where you render to most of the pixels, full resolves are not
> > -* because they potentially involve reading and writing the entire
> > -* framebuffer.  If we can't texture with CCS_E, we should leave it
> > off and
> > -* limit ourselves to fast clears.
> > -*/
> > -   if (iview->image->aux_usage == ISL_AUX_USAGE_CCS_E) {
> > -  att_state->aux_usage = ISL_AUX_USAGE_CCS_E;
> > -  att_state->input_aux_usage = ISL_AUX_USAGE_CCS_E;
> > -   } else {
> > -  att_state->aux_usage = ISL_AUX_USAGE_CCS_D;
> > -  /* From the Sky Lake PRM, RENDER_SURFACE_STATE::
> > AuxiliarySurfaceMode:
> > -   *
> > -   *"If Number 

Re: [Mesa-dev] [PATCH 1/6] radeonsi: declare new user SGPR indices for bindless samplers/images

2017-07-11 Thread Marek Olšák
On Tue, Jul 4, 2017 at 3:05 PM, Samuel Pitoiset
 wrote:
> A new pair of user SGPR is needed for loading the bindless
> descriptors from shaders. Because the descriptors are global for
> all stages, there is no need to add separate indices for GFX9.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeonsi/si_shader.c  | 9 +++--
>  src/gallium/drivers/radeonsi/si_shader.h  | 4 +++-
>  src/gallium/drivers/radeonsi/si_shader_internal.h | 1 +
>  3 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 55d1232512..b56d2d791b 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -4094,10 +4094,12 @@ static void declare_per_stage_desc_pointers(struct 
> si_shader_context *ctx,
>  SI_NUM_SHADER_BUFFERS + 
> SI_NUM_CONST_BUFFERS);
> params[(*num_params)++] = si_const_array(ctx->v8i32,
>  SI_NUM_IMAGES + 
> SI_NUM_SAMPLERS * 2);
> +   params[(*num_params)++] = si_const_array(ctx->v8i32, 0);
>
> if (assign_params) {
> -   ctx->param_const_and_shader_buffers = *num_params - 2;
> -   ctx->param_samplers_and_images = *num_params - 1;
> +   ctx->param_const_and_shader_buffers = *num_params - 3;
> +   ctx->param_samplers_and_images = *num_params - 2;
> +   ctx->param_bindless_samplers_and_images = *num_params - 1;

This function is for per-stage pointers, but bindless resources are
global. It will break GFX9 HS and GS, because this function is called
twice per shader.

See how rw_buffers or vertex_buffers are declared.
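
Roughly, as a standalone model rather than actual radeonsi code (struct
param_list, declare_per_stage() and declare_global() are hypothetical names
used only for illustration), the problem looks like this: anything declared
inside a helper that runs once per API stage gets declared twice when two
stages are merged into one shader.

```c
#include <assert.h>

#define MAX_PARAMS 16

enum param_kind { PARAM_PER_STAGE_DESC, PARAM_GLOBAL_DESC };

/* Hypothetical stand-in for a shader's list of user SGPR parameters. */
struct param_list {
   enum param_kind kinds[MAX_PARAMS];
   unsigned count;
};

static void
add_param(struct param_list *p, enum param_kind kind)
{
   assert(p->count < MAX_PARAMS);
   p->kinds[p->count++] = kind;
}

/* Runs once per API stage, so a merged shader calls it twice. */
static void
declare_per_stage(struct param_list *p, int also_declare_global)
{
   add_param(p, PARAM_PER_STAGE_DESC);
   if (also_declare_global)
      add_param(p, PARAM_GLOBAL_DESC); /* the pattern being objected to */
}

/* Runs once per shader, no matter how many stages were merged. */
static void
declare_global(struct param_list *p)
{
   add_param(p, PARAM_GLOBAL_DESC);
}

static unsigned
count_kind(const struct param_list *p, enum param_kind kind)
{
   unsigned n = 0;
   for (unsigned i = 0; i < p->count; i++)
      n += p->kinds[i] == kind;
   return n;
}
```

With the global pointer declared inside the per-stage helper, a merged shader
ends up with two copies of it; declared separately, it appears once.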

Marek


Re: [Mesa-dev] [PATCH] gallium/util: remove unused util_{begin, end}_pipestat_query() helpers

2017-07-11 Thread Marek Olšák
I'd like to keep these for debugging.

Marek

On Thu, Jul 6, 2017 at 6:43 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/auxiliary/util/u_helpers.c | 50 
> --
>  src/gallium/auxiliary/util/u_helpers.h |  7 -
>  2 files changed, 57 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_helpers.c 
> b/src/gallium/auxiliary/util/u_helpers.c
> index e0feade3ac..6946a1ada9 100644
> --- a/src/gallium/auxiliary/util/u_helpers.c
> +++ b/src/gallium/auxiliary/util/u_helpers.c
> @@ -117,53 +117,3 @@ util_upload_index_buffer(struct pipe_context *pipe,
> *out_offset -= start_offset;
> return *out_buffer != NULL;
>  }
> -
> -struct pipe_query *
> -util_begin_pipestat_query(struct pipe_context *ctx)
> -{
> -   struct pipe_query *q =
> -  ctx->create_query(ctx, PIPE_QUERY_PIPELINE_STATISTICS, 0);
> -   if (!q)
> -  return NULL;
> -
> -   ctx->begin_query(ctx, q);
> -   return q;
> -}
> -
> -void
> -util_end_pipestat_query(struct pipe_context *ctx, struct pipe_query *q,
> -FILE *f)
> -{
> -   static unsigned counter;
> -   struct pipe_query_data_pipeline_statistics stats;
> -
> -   ctx->end_query(ctx, q);
> -   ctx->get_query_result(ctx, q, true, (void*)&stats);
> -   ctx->destroy_query(ctx, q);
> -
> -   fprintf(f,
> -   "Draw call %u:\n"
> -   "ia_vertices= %"PRIu64"\n"
> -   "ia_primitives  = %"PRIu64"\n"
> -   "vs_invocations = %"PRIu64"\n"
> -   "gs_invocations = %"PRIu64"\n"
> -   "gs_primitives  = %"PRIu64"\n"
> -   "c_invocations  = %"PRIu64"\n"
> -   "c_primitives   = %"PRIu64"\n"
> -   "ps_invocations = %"PRIu64"\n"
> -   "hs_invocations = %"PRIu64"\n"
> -   "ds_invocations = %"PRIu64"\n"
> -   "cs_invocations = %"PRIu64"\n",
> -   p_atomic_inc_return(&counter),
> -   stats.ia_vertices,
> -   stats.ia_primitives,
> -   stats.vs_invocations,
> -   stats.gs_invocations,
> -   stats.gs_primitives,
> -   stats.c_invocations,
> -   stats.c_primitives,
> -   stats.ps_invocations,
> -   stats.hs_invocations,
> -   stats.ds_invocations,
> -   stats.cs_invocations);
> -}
> diff --git a/src/gallium/auxiliary/util/u_helpers.h 
> b/src/gallium/auxiliary/util/u_helpers.h
> index ab970d791b..7fed144d81 100644
> --- a/src/gallium/auxiliary/util/u_helpers.h
> +++ b/src/gallium/auxiliary/util/u_helpers.h
> @@ -50,13 +50,6 @@ bool util_upload_index_buffer(struct pipe_context *pipe,
>struct pipe_resource **out_buffer,
>unsigned *out_offset);
>
> -struct pipe_query *
> -util_begin_pipestat_query(struct pipe_context *ctx);
> -
> -void
> -util_end_pipestat_query(struct pipe_context *ctx, struct pipe_query *q,
> -FILE *f);
> -
>  #ifdef __cplusplus
>  }
>  #endif
> --
> 2.13.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] ddebug: fix parsing of the pipelined mode

2017-07-11 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Jul 7, 2017 at 9:52 AM, Samuel Pitoiset
 wrote:
> Trivial.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/ddebug/dd_screen.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/ddebug/dd_screen.c 
> b/src/gallium/drivers/ddebug/dd_screen.c
> index a5d2be1402..14e6f6b011 100644
> --- a/src/gallium/drivers/ddebug/dd_screen.c
> +++ b/src/gallium/drivers/ddebug/dd_screen.c
> @@ -374,7 +374,7 @@ ddebug_screen_create(struct pipe_screen *screen)
>
>if (sscanf(option+8, "%u", _dump_call) != 1)
>   return screen;
> -   } else if (!strncmp(option, "pipelined", 8)) {
> +   } else if (!strncmp(option, "pipelined", 9)) {
>mode = DD_DETECT_HANGS_PIPELINED;
>
>if (sscanf(option+10, "%u", ) != 1)
> --
> 2.13.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
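
For anyone wondering why the length matters in the one-character fix quoted
above: strncmp() with a length of 8 only compares "pipeline", so any option
that merely starts with those eight characters is accepted. A minimal sketch,
where matches_pipelined() is a hypothetical helper and the option strings in
the note below are made up for illustration:

```c
#include <string.h>

/* Returns nonzero when the first n bytes of option equal the first n bytes
 * of "pipelined".
 */
static int
matches_pipelined(const char *option, size_t n)
{
   return strncmp(option, "pipelined", n) == 0;
}
```

With n = 8 the string "pipelinex" is accepted; with n = 9 it is rejected,
while a string that really begins with "pipelined" still matches.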


Re: [Mesa-dev] [PATCH v3 16/16] anv: Predicate fast-clear resolves

2017-07-11 Thread Nanley Chery
On Mon, Jul 10, 2017 at 04:00:23PM -0700, Jason Ekstrand wrote:
> On Wed, Jun 28, 2017 at 2:14 PM, Nanley Chery  wrote:
> 
> > Image layouts only let us know that an image *may* be fast-cleared. For
> > this reason we can end up with redundant resolves. Testing has shown
> > that such resolves can measurably hurt performance and that predicating
> > them can avoid the penalty.
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/intel/vulkan/anv_blorp.c   |  3 +-
> >  src/intel/vulkan/anv_private.h | 13 --
> >  src/intel/vulkan/genX_cmd_buffer.c | 87 ++
> > ++--
> >  3 files changed, 95 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > index 35317ba6be..d06d7e2cc3 100644
> > --- a/src/intel/vulkan/anv_blorp.c
> > +++ b/src/intel/vulkan/anv_blorp.c
> > @@ -1619,7 +1619,8 @@ anv_ccs_resolve(struct anv_cmd_buffer * const
> > cmd_buffer,
> >return;
> >
> > struct blorp_batch batch;
> > -   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
> > +   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
> > +BLORP_BATCH_PREDICATE_ENABLE);
> >
> > struct blorp_surf surf;
> > get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT,
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > private.h
> > index be1623f3c3..951cf50842 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -2118,11 +2118,16 @@ anv_fast_clear_state_entry_size(const struct
> > anv_device *device)
> >  {
> > assert(device);
> > /* Entry contents:
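> > -*   +----------------------+
> > -*   | clear value dword(s) |
> > -*   +----------------------+
> > +*   +--------------------------------------------+
> > +*   | clear value dword(s) | needs resolve dword |
> > +*   +--------------------------------------------+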
> >  */
> > -   return device->isl_dev.ss.clear_value_size;
> > +
> > +   /* Ensure that the needs resolve dword is in fact dword-aligned to
> > enable
> > +* GPU memcpy operations.
> > +*/
> > +   assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > +   return device->isl_dev.ss.clear_value_size + 4;
> >  }
> >
> >  /* Returns true if a HiZ-enabled depth buffer can be sampled from. */
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > b/src/intel/vulkan/genX_cmd_buffer.c
> > index 62a2f22782..65d9c92783 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -421,6 +421,59 @@ get_fast_clear_state_entry_offset(const struct
> > anv_device *device,
> > return offset;
> >  }
> >
> > +#define MI_PREDICATE_SRC0  0x2400
> > +#define MI_PREDICATE_SRC1  0x2408
> > +
> > +enum ccs_resolve_state {
> > +   CCS_RESOLVE_NOT_NEEDED,
> > +   CCS_RESOLVE_NEEDED,
> >
> 
> Are these two values sufficient?  Do we ever have a scenario where we do a
> partial resolve and then a full resolve?  Do we need to be able to track
> that?
> 
> 

Yes, they are. We don't currently have such a scenario. This may come up
later if we start temporarily enabling CCS_E, but I can't think of what
sequence of events would trigger that to happen.
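
To make the two-state tracking concrete, here is a CPU-side model of the idea.
It is only a sketch under stated assumptions: the real patch does this with
command-streamer packets and a predicate register, and struct subresource,
mark_fast_cleared(), predicated_resolve(), entry_size() and
resolve_flag_offset() are hypothetical names that do not appear in anv.

```c
#include <assert.h>
#include <stdint.h>

enum resolve_state {
   RESOLVE_NOT_NEEDED = 0,
   RESOLVE_NEEDED = 1,
};

/* Hypothetical per-subresource tracking state. */
struct subresource {
   uint32_t needs_resolve;  /* models the "needs resolve" dword */
   unsigned resolves_done;  /* counts resolves that actually ran */
};

/* One entry is the clear value dword(s) followed by one flag dword. */
static uint32_t
entry_size(uint32_t clear_value_size)
{
   assert(clear_value_size % 4 == 0);
   return clear_value_size + 4;
}

/* The flag dword sits right after the clear value within the entry. */
static uint32_t
resolve_flag_offset(uint32_t entry_offset, uint32_t clear_value_size)
{
   return entry_offset + clear_value_size;
}

static void
mark_fast_cleared(struct subresource *s)
{
   s->needs_resolve = RESOLVE_NEEDED;
}

/* predicate = do_resolve = resolve_flag != 0 */
static void
predicated_resolve(struct subresource *s)
{
   if (s->needs_resolve != 0) {
      s->resolves_done++;
      s->needs_resolve = RESOLVE_NOT_NEEDED;
   }
}
```

In this model a second resolve request after a single fast clear becomes a
no-op, which is the redundant work the predication is meant to avoid.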

> > +   CCS_RESOLVE_STARTING,
> > +};
> > +
> > +/* Manages the state of a color image subresource to ensure resolves are
> > + * performed properly.
> > + */
> > +static void
> > +genX(set_resolve_state)(struct anv_cmd_buffer *cmd_buffer,
> > +const struct anv_image *image,
> > +unsigned level,
> > +enum ccs_resolve_state state)
> > +{
> > +   assert(cmd_buffer && image);
> > +   assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
> > +   assert(level < anv_image_aux_levels(image));
> > +
> > +   const uint32_t resolve_flag_offset =
> > +  get_fast_clear_state_entry_offset(cmd_buffer->device, image,
> > level) +
> > +  cmd_buffer->device->isl_dev.ss.clear_value_size;
> > +
> > +   if (state != CCS_RESOLVE_STARTING) {
> > +  assert(state == CCS_RESOLVE_NEEDED || state ==
> > CCS_RESOLVE_NOT_NEEDED);
> > +  /* The HW docs say that there is no way to guarantee the completion
> > of
> > +   * the following command. We use it nevertheless because it shows no
> > +   * issues in testing and is currently being used in the GL driver.
> > +   */
> > +  anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
> > + sdi.Address = (struct anv_address) { image->bo,
> > resolve_flag_offset };
> > + sdi.ImmediateData = state == CCS_RESOLVE_NEEDED;
> > +  }
> > +   } else {
> > +  /* Make the pending predicated resolve a no-op if one is not needed.
> > +   * predicate = do_resolve = resolve_flag != 0;
> > +   */
> > +  emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1, 0);
> > +  emit_lri(_buffer->batch, MI_PREDICATE_SRC1 + 4, 

Re: [Mesa-dev] [PATCH] radeonsi: enable support for EXT_memory_object v2

2017-07-11 Thread Marek Olšák
On Fri, Jul 7, 2017 at 6:34 AM, Andres Rodriguez  wrote:
> v2: fix an indentation error
>
> Signed-off-by: Andres Rodriguez 
> ---
>  src/gallium/drivers/r600/r600_pipe.c   | 2 +-
>  src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_pipe.c 
> b/src/gallium/drivers/r600/r600_pipe.c
> index e3abc10..dc225aa 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -296,6 +296,7 @@ static int r600_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> case PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEX:
> case PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION:
> +   case PIPE_CAP_MEMOBJ:

R600 will never support Vulkan, so exposing the extension there is unnecessary.

Marek


Re: [Mesa-dev] [PATCH] Initial support for EXT_external_objects v2

2017-07-11 Thread Marek Olšák
ac, radeonsi, and gallium patches look OK to me except where I commented.

Marek

On Fri, Jul 7, 2017 at 6:24 AM, Andres Rodriguez  wrote:
> This series is an initial step towards the implementation of
> EXT_external_objects. It implements the functionality under
> EXT_memory_object and EXT_memory_object_fd.
>
> This updated version of the series has the following changes:
>
>  * Re-worked UUIDs to be provided by the gallium driver
>  * Use a PIPE_CAP to gate the exposure of the extension
>  * Add a comment for the non-dedicated memobj path
>  * Fixed radeonsi and radv producing different driver UUIDs
>
> Regards,
> Andres
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/25] mesa: hook up UUID queries for driver and device v2

2017-07-11 Thread Marek Olšák
On Fri, Jul 7, 2017 at 6:24 AM, Andres Rodriguez  wrote:
> v2: respective changes for new gallium interface
>
> Signed-off-by: Andres Rodriguez 
> ---
>  src/mesa/main/dd.h  | 15 +++
>  src/mesa/main/get.c | 17 +
>  src/mesa/main/version.c | 13 +
>  src/mesa/main/version.h |  6 ++
>  src/mesa/state_tracker/st_context.c | 20 
>  5 files changed, 71 insertions(+)
>
> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
> index 27c6efc..f7fe217 100644
> --- a/src/mesa/main/dd.h
> +++ b/src/mesa/main/dd.h
> @@ -1108,6 +1108,21 @@ struct dd_function_table {
>GLenum usage,
>struct gl_buffer_object *bufObj);
>
> +   /**
> +* Fill uuid with a unique identifier for this driver
> +*
> +* uuid must point to GL_UUID_SIZE_EXT bytes of available memory
> +*/
> +   void (*GetDriverUuid)(struct gl_context *ctx, char *uuid);
> +
> +   /**
> +* Fill uuid with a unique identifier for the device associated
> +* to this driver
> +*
> +* uuid must point to GL_UUID_SIZE_EXT bytes of available memory
> +*/
> +   void (*GetDeviceUuid)(struct gl_context *ctx, char *uuid);
> +
> /*@}*/
>
> /**
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 9f26ad1..bcbec1a 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -40,6 +40,7 @@
>  #include "framebuffer.h"
>  #include "samplerobj.h"
>  #include "stencil.h"
> +#include "version.h"
>
>  /* This is a table driven implemetation of the glGet*v() functions.
>   * The basic idea is that most getters just look up an int somewhere
> @@ -832,6 +833,14 @@ find_custom_value(struct gl_context *ctx, const struct 
> value_desc *d, union valu
>  ctx->Texture.Unit[unit].CurrentTex[d->offset]->Name;
>break;
>
> +   /* GL_EXT_external_objects */
> +   case GL_DRIVER_UUID_EXT:
> +  _mesa_get_driver_uuid(ctx, v->value_int_4);
> +  break;
> +   case GL_DEVICE_UUID_EXT:
> +  _mesa_get_device_uuid(ctx, v->value_int_4);
> +  break;
> +
> /* GL_EXT_packed_float */
> case GL_RGBA_SIGNED_COMPONENTS_EXT:
>{
> @@ -2491,6 +2500,14 @@ find_value_indexed(const char *func, GLenum pname, 
> GLuint index, union value *v)
>   goto invalid_value;
>v->value_int = ctx->Const.MaxComputeVariableGroupSize[index];
>return TYPE_INT;
> +
> +   /* GL_EXT_external_objects */
> +   case GL_DRIVER_UUID_EXT:
> +  _mesa_get_driver_uuid(ctx, v->value_int_4);
> +  return TYPE_INT_4;
> +   case GL_DEVICE_UUID_EXT:
> +  _mesa_get_device_uuid(ctx, v->value_int_4);
> +  return TYPE_INT_4;
> }
>
>   invalid_enum:
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index 34f8bbb..c8aa3ca 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -635,3 +635,16 @@ _mesa_compute_version(struct gl_context *ctx)
>break;
> }
>  }
> +
> +
> +void
> +_mesa_get_driver_uuid(struct gl_context *ctx, GLint *uuid)
> +{
> +   ctx->Driver.GetDriverUuid(ctx, (char*) uuid);
> +}
> +
> +void
> +_mesa_get_device_uuid(struct gl_context *ctx, GLint *uuid)
> +{
> +   ctx->Driver.GetDeviceUuid(ctx, (char*) uuid);
> +}
> diff --git a/src/mesa/main/version.h b/src/mesa/main/version.h
> index ee7cb75..4cb5e5f 100644
> --- a/src/mesa/main/version.h
> +++ b/src/mesa/main/version.h
> @@ -47,4 +47,10 @@ _mesa_override_gl_version(struct gl_context *ctx);
>  extern void
>  _mesa_override_glsl_version(struct gl_constants *consts);
>
> +extern void
> +_mesa_get_driver_uuid(struct gl_context *ctx, GLint *uuid);
> +
> +extern void
> +_mesa_get_device_uuid(struct gl_context *ctx, GLint *uuid);
> +
>  #endif /* VERSION_H */
> diff --git a/src/mesa/state_tracker/st_context.c 
> b/src/mesa/state_tracker/st_context.c
> index a846be3..a8194ed 100644
> --- a/src/mesa/state_tracker/st_context.c
> +++ b/src/mesa/state_tracker/st_context.c
> @@ -641,6 +641,24 @@ st_set_background_context(struct gl_context *ctx,
> smapi->set_background_context(>iface, queue_info);
>  }
>
> +static void
> +st_get_device_uuid(struct gl_context *ctx, char *uuid)
> +{
> +   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
> +
> +   assert(GL_UUID_SIZE_EXT <= PIPE_UUID_SIZE);

This assertion seems to be the other way around.

> +   screen->get_device_uuid(screen, uuid);
> +}
> +
> +static void
> +st_get_driver_uuid(struct gl_context *ctx, char *uuid)
> +{
> +   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
> +
> +   assert(GL_UUID_SIZE_EXT <= PIPE_UUID_SIZE);

Same here.
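
One reading of the two comments above, as a minimal sketch: the caller provides GL_UUID_SIZE_EXT bytes and the screen hook writes PIPE_UUID_SIZE bytes, so the write is only safe when the pipe size does not exceed the GL size. The names and the value 16 below are hypothetical stand-ins for illustration, not the real header definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for GL_UUID_SIZE_EXT and PIPE_UUID_SIZE. */
#define SKETCH_GL_UUID_SIZE   16
#define SKETCH_PIPE_UUID_SIZE 16

/* Hypothetical screen hook: writes exactly SKETCH_PIPE_UUID_SIZE bytes. */
static void
sketch_screen_get_device_uuid(char *uuid)
{
   memset(uuid, 0xab, SKETCH_PIPE_UUID_SIZE);
}

/* uuid points to SKETCH_GL_UUID_SIZE bytes, so the source size must
 * fit into the destination size, not the other way around.
 */
static void
sketch_get_device_uuid(char *uuid)
{
   assert(SKETCH_PIPE_UUID_SIZE <= SKETCH_GL_UUID_SIZE);
   sketch_screen_get_device_uuid(uuid);
}
```

Since both sizes are compile-time constants, a static assertion would probably serve equally well here.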

Marek


[Mesa-dev] [PATCH 2/4] dri: Add KHR_no_error DRI extension

2017-07-11 Thread Grigori Goronzy
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.
---
 include/GL/internal/dri_interface.h  | 19 +++
 src/gallium/state_trackers/dri/dri2.c|  6 ++
 src/gallium/state_trackers/dri/dri_context.c |  3 ++-
 src/mesa/drivers/dri/common/dri_util.c   |  8 ++--
 src/mesa/drivers/dri/i965/intel_screen.c |  8 +++-
 5 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 2da46f7..da60648 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1050,6 +1050,12 @@ struct __DRIdri2LoaderExtensionRec {
 #define __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS0x0004
 
 /**
+ * \requires __DRI2_NO_ERROR.
+ *
+ */
+#define __DRI_CTX_FLAG_NO_ERROR0x0008
+
+/**
  * \name Context reset strategies.
  */
 /*@{*/
@@ -1612,6 +1618,19 @@ struct __DRIrobustnessExtensionRec {
 };
 
 /**
+ * No-error context driver extension.
+ *
+ * Existence of this extension means the driver can accept the
+ * __DRI_CTX_FLAG_NO_ERROR flag.
+ */
+#define __DRI2_NO_ERROR "DRI_NoError"
+#define __DRI2_NO_ERROR_VERSION 1
+
+typedef struct __DRInoErrorExtensionRec {
+   __DRIextension base;
+} __DRInoErrorExtension;
+
+/**
  * DRI config options extension.
  *
  * This extension provides the XML string containing driver options for use by
diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 5da1c4e..244a6ad 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -1667,6 +1667,10 @@ static const __DRIrobustnessExtension dri2Robustness = {
.base = { __DRI2_ROBUSTNESS, 1 }
 };
 
+static const __DRInoErrorExtension driNoError = {
+   .base = { __DRI2_NO_ERROR, 1 }
+};
+
 static int
 dri2_interop_query_device_info(__DRIcontext *_ctx,
struct mesa_glinterop_device_info *out)
@@ -2002,6 +2006,7 @@ static const __DRIextension *dri_screen_extensions[] = {
,
,
,
+   ,
NULL
 };
 
@@ -2015,6 +2020,7 @@ static const __DRIextension 
*dri_robust_screen_extensions[] = {
,
,
,
+   ,
NULL
 };
 
diff --git a/src/gallium/state_trackers/dri/dri_context.c 
b/src/gallium/state_trackers/dri/dri_context.c
index ec555e4..e25f186 100644
--- a/src/gallium/state_trackers/dri/dri_context.c
+++ b/src/gallium/state_trackers/dri/dri_context.c
@@ -57,7 +57,8 @@ dri_create_context(gl_api api, const struct gl_config * 
visual,
struct st_context_attribs attribs;
enum st_context_error ctx_err = 0;
unsigned allowed_flags = __DRI_CTX_FLAG_DEBUG |
-__DRI_CTX_FLAG_FORWARD_COMPATIBLE;
+__DRI_CTX_FLAG_FORWARD_COMPATIBLE |
+__DRI_CTX_FLAG_NO_ERROR;
const __DRIbackgroundCallableExtension *backgroundCallable =
   screen->sPriv->dri2.backgroundCallable;
 
diff --git a/src/mesa/drivers/dri/common/dri_util.c 
b/src/mesa/drivers/dri/common/dri_util.c
index f6df488..174356f 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -403,7 +403,8 @@ driCreateContextAttribs(__DRIscreen *screen, int api,
 if (mesa_api != API_OPENGL_COMPAT
 && mesa_api != API_OPENGL_CORE
 && (flags & ~(__DRI_CTX_FLAG_DEBUG |
- __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS))) {
+ __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS |
+ __DRI_CTX_FLAG_NO_ERROR))) {
*error = __DRI_CTX_ERROR_BAD_FLAG;
return NULL;
 }
@@ -425,7 +426,8 @@ driCreateContextAttribs(__DRIscreen *screen, int api,
 
 const uint32_t allowed_flags = (__DRI_CTX_FLAG_DEBUG
 | __DRI_CTX_FLAG_FORWARD_COMPATIBLE
-| __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS);
+| __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS
+| __DRI_CTX_FLAG_NO_ERROR);
 if (flags & ~allowed_flags) {
*error = __DRI_CTX_ERROR_UNKNOWN_FLAG;
return NULL;
@@ -467,6 +469,8 @@ driContextSetFlags(struct gl_context *ctx, uint32_t flags)
_mesa_set_debug_state_int(ctx, GL_DEBUG_OUTPUT, GL_TRUE);
 ctx->Const.ContextFlags |= GL_CONTEXT_FLAG_DEBUG_BIT;
 }
+if ((flags & __DRI_CTX_FLAG_NO_ERROR) != 0)
+ctx->Const.ContextFlags |= GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR;
 }
 
 static __DRIcontext *
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index c75f212..2bfe0b9 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1247,7 +1247,11 @@ static const __DRI2rendererQueryExtension 
intelRendererQueryExtension = {
 };
 
 static const 
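
The flag checks in driCreateContextAttribs above follow a common mask pattern: any bit outside the allowed set is rejected. A minimal sketch of that pattern, assuming illustrative flag values; the SKETCH_ names are hypothetical and not the real __DRI_CTX_FLAG_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, laid out like the context flags in the patch. */
#define SKETCH_FLAG_DEBUG                0x1u
#define SKETCH_FLAG_FORWARD_COMPATIBLE   0x2u
#define SKETCH_FLAG_ROBUST_BUFFER_ACCESS 0x4u
#define SKETCH_FLAG_NO_ERROR             0x8u

/* Returns true when flags contains only bits from the allowed set. */
static bool
sketch_flags_allowed(uint32_t flags)
{
   const uint32_t allowed = (SKETCH_FLAG_DEBUG
                             | SKETCH_FLAG_FORWARD_COMPATIBLE
                             | SKETCH_FLAG_ROBUST_BUFFER_ACCESS
                             | SKETCH_FLAG_NO_ERROR);

   return (flags & ~allowed) == 0;
}

/* Usage: the new bit passes, an unknown bit is rejected. */
void
sketch_flags_demo(void)
{
   assert(sketch_flags_allowed(SKETCH_FLAG_DEBUG | SKETCH_FLAG_NO_ERROR));
   assert(!sketch_flags_allowed(0x10u));
}
```

Without the new bit in the allowed mask, a context request carrying the no-error flag would be rejected before the driver ever sees it.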

[Mesa-dev] [PATCH 1/4] egl: Add EGL_KHR_create_context_no_error support

2017-07-11 Thread Grigori Goronzy
This only adds the EGL side; it still needs to be plumbed into the Mesa frontend.
---
 src/egl/drivers/dri2/egl_dri2.c | 20 ++--
 src/egl/drivers/dri2/egl_dri2.h |  1 +
 src/egl/main/eglapi.c   |  1 +
 src/egl/main/eglcontext.c   | 30 ++
 src/egl/main/eglcontext.h   |  1 +
 src/egl/main/egldisplay.h   |  1 +
 6 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index cf26242..6bb94e4 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -428,6 +428,7 @@ static const struct dri2_extension_match 
swrast_core_extensions[] = {
 
 static const struct dri2_extension_match optional_core_extensions[] = {
{ __DRI2_ROBUSTNESS, 1, offsetof(struct dri2_egl_display, robustness) },
+   { __DRI2_NO_ERROR, 1, offsetof(struct dri2_egl_display, no_error) },
{ __DRI2_CONFIG_QUERY, 1, offsetof(struct dri2_egl_display, config) },
{ __DRI2_FENCE, 1, offsetof(struct dri2_egl_display, fence) },
{ __DRI2_RENDERER_QUERY, 1, offsetof(struct dri2_egl_display, 
rendererQuery) },
@@ -665,6 +666,9 @@ dri2_setup_screen(_EGLDisplay *disp)
  disp->Extensions.EXT_create_context_robustness = EGL_TRUE;
}
 
+   if (dri2_dpy->no_error)
+  disp->Extensions.KHR_create_context_no_error = EGL_TRUE;
+
if (dri2_dpy->fence) {
   disp->Extensions.KHR_fence_sync = EGL_TRUE;
   disp->Extensions.KHR_wait_sync = EGL_TRUE;
@@ -1056,7 +1060,7 @@ dri2_fill_context_attribs(struct dri2_egl_context 
*dri2_ctx,
ctx_attribs[pos++] = __DRI_CTX_ATTRIB_MINOR_VERSION;
ctx_attribs[pos++] = dri2_ctx->base.ClientMinorVersion;
 
-   if (dri2_ctx->base.Flags != 0) {
+   if (dri2_ctx->base.Flags != 0 || dri2_ctx->base.NoError) {
   /* If the implementation doesn't support the __DRI2_ROBUSTNESS
* extension, don't even try to send it the robust-access flag.
* It may explode.  Instead, generate the required EGL error here.
@@ -1068,7 +1072,8 @@ dri2_fill_context_attribs(struct dri2_egl_context 
*dri2_ctx,
   }
 
   ctx_attribs[pos++] = __DRI_CTX_ATTRIB_FLAGS;
-  ctx_attribs[pos++] = dri2_ctx->base.Flags;
+  ctx_attribs[pos++] = dri2_ctx->base.Flags |
+(dri2_ctx->base.NoError ? __DRI_CTX_FLAG_NO_ERROR : 0);
}
 
if (dri2_ctx->base.ResetNotificationStrategy != 
EGL_NO_RESET_NOTIFICATION_KHR) {
@@ -1131,6 +1136,17 @@ dri2_create_context(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLConfig *conf,
   goto cleanup;
}
 
+   /* The EGL_KHR_create_context_no_error spec says:
+*
+*"BAD_MATCH is generated if the value of EGL_CONTEXT_OPENGL_NO_ERROR_KHR
+*used to create <share_context> does not match the value of
+*EGL_CONTEXT_OPENGL_NO_ERROR_KHR for the context being created."
+*/
+   if (share_list && share_list->NoError != dri2_ctx->base.NoError) {
+  _eglError(EGL_BAD_MATCH, "eglCreateContext");
+  goto cleanup;
+   }
+
switch (dri2_ctx->base.ClientAPI) {
case EGL_OPENGL_ES_API:
   switch (dri2_ctx->base.ClientMajorVersion) {
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 4a5cf8e..5b3e93a 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -170,6 +170,7 @@ struct dri2_egl_display
const __DRItexBufferExtension  *tex_buffer;
const __DRIimageExtension  *image;
const __DRIrobustnessExtension *robustness;
+   const __DRInoErrorExtension*no_error;
const __DRI2configQueryExtension *config;
const __DRI2fenceExtension *fence;
const __DRI2rendererQueryExtension *rendererQuery;
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 9b899d8..000368a 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -494,6 +494,7 @@ _eglCreateExtensionsString(_EGLDisplay *dpy)
_EGL_CHECK_EXTENSION(KHR_cl_event2);
_EGL_CHECK_EXTENSION(KHR_config_attribs);
_EGL_CHECK_EXTENSION(KHR_create_context);
+   _EGL_CHECK_EXTENSION(KHR_create_context_no_error);
_EGL_CHECK_EXTENSION(KHR_fence_sync);
_EGL_CHECK_EXTENSION(KHR_get_all_proc_addresses);
_EGL_CHECK_EXTENSION(KHR_gl_colorspace);
diff --git a/src/egl/main/eglcontext.c b/src/egl/main/eglcontext.c
index df8b45c..4244ca0 100644
--- a/src/egl/main/eglcontext.c
+++ b/src/egl/main/eglcontext.c
@@ -312,6 +312,36 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay 
*dpy,
 ctx->Flags |= EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR;
  break;
 
+  case EGL_CONTEXT_OPENGL_NO_ERROR_KHR:
+ if (dpy->Version < 14) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+
+ /* The KHR_no_error spec only applies against OpenGL 2.0+ and
+  * OpenGL ES 2.0+
+  */
+ if ((api != EGL_OPENGL_API && api != EGL_OPENGL_ES_API) ||
+ ctx->ClientMajorVersion < 2) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+
+ 
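
A note on combining a flag word with a conditional bit, as dri2_fill_context_attribs does when it builds the flags attribute: in C, `|` binds tighter than `?:`, so the conditional needs its own parentheses. A minimal sketch with a hypothetical flag value (SKETCH_FLAG_NO_ERROR is a stand-in, not the real DRI constant):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the no-error context flag. */
#define SKETCH_FLAG_NO_ERROR 0x8u

/* Intended meaning: keep the base flags and add the bit if requested. */
static uint32_t
sketch_merge_flags(uint32_t base, bool no_error)
{
   return base | (no_error ? SKETCH_FLAG_NO_ERROR : 0u);
}

/* How "base | no_error ? FLAG : 0" is parsed without the inner
 * parentheses: the OR becomes the condition and base is dropped.
 */
static uint32_t
sketch_merge_flags_as_parsed(uint32_t base, bool no_error)
{
   return (base | no_error) ? SKETCH_FLAG_NO_ERROR : 0u;
}

/* Usage: the two forms disagree as soon as base is non-zero. */
void
sketch_precedence_demo(void)
{
   assert(sketch_merge_flags(0x1u, true) == 0x9u);
   assert(sketch_merge_flags(0x1u, false) == 0x1u);
   assert(sketch_merge_flags_as_parsed(0x1u, true) == 0x8u);
   assert(sketch_merge_flags_as_parsed(0x1u, false) == 0x8u);
}
```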

[Mesa-dev] KHR_no_error improvements

2017-07-11 Thread Grigori Goronzy
Hi,

this series implements support for the EGL_KHR_create_context_no_error
extension and the associated plumbing through the different
layers of Mesa - EGL, DRI, Gallium state tracker, Mesa frontend. It
took me a while to figure out how everything is connected together
and still it's somewhat confusing to me, so please bear with me if
I did something stupid. :)

With all the infrastructure in place, it's easy to add driconf
support for KHR_no_error, so that's done as well. Maybe games can be
whitelisted, similar to glthread, although that seems to be a slightly
controversial idea.
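
As a rough illustration of the plumbing, here is a sketch of how a single no-error request could be translated from layer to layer. All names and bit values are hypothetical stand-ins chosen for the example; the real definitions are in the individual patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-layer flag bits, loosely modeled on the series. */
#define SKETCH_DRI_FLAG_NO_ERROR 0x8u       /* DRI context flag */
#define SKETCH_ST_FLAG_NO_ERROR  (1u << 4)  /* state tracker flag */
#define SKETCH_GL_FLAG_NO_ERROR  0x8u       /* GL context flag bit */

/* EGL: the boolean context attribute becomes a DRI flag. */
static uint32_t
sketch_egl_to_dri(bool egl_no_error)
{
   return egl_no_error ? SKETCH_DRI_FLAG_NO_ERROR : 0u;
}

/* DRI state tracker: the DRI flag becomes a state tracker flag. */
static uint32_t
sketch_dri_to_st(uint32_t dri_flags)
{
   return (dri_flags & SKETCH_DRI_FLAG_NO_ERROR) ? SKETCH_ST_FLAG_NO_ERROR : 0u;
}

/* Mesa frontend: the state tracker flag ends up in the GL context flags. */
static uint32_t
sketch_st_to_gl(uint32_t st_flags)
{
   return (st_flags & SKETCH_ST_FLAG_NO_ERROR) ? SKETCH_GL_FLAG_NO_ERROR : 0u;
}

/* Whole chain, EGL attribute in, GL context flags out. */
static uint32_t
sketch_context_flags(bool egl_no_error)
{
   return sketch_st_to_gl(sketch_dri_to_st(sketch_egl_to_dri(egl_no_error)));
}

/* Usage: the request survives every translation step. */
void
sketch_plumbing_demo(void)
{
   assert(sketch_context_flags(true) == SKETCH_GL_FLAG_NO_ERROR);
   assert(sketch_context_flags(false) == 0u);
}
```

Each layer has its own flag namespace, which is why every boundary needs an explicit translation rather than passing the bits through unchanged.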

Grigori



[Mesa-dev] [PATCH 3/4] st/mesa: add support for KHR_no_error flag

2017-07-11 Thread Grigori Goronzy
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.
---
 src/gallium/include/state_tracker/st_api.h   |  1 +
 src/gallium/state_trackers/dri/dri_context.c |  3 +++
 src/mesa/state_tracker/st_context.c  | 10 +++---
 src/mesa/state_tracker/st_context.h  |  3 ++-
 src/mesa/state_tracker/st_manager.c  |  6 +-
 5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index 222e565..29e05e9 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -90,6 +90,7 @@ enum st_api_feature
 #define ST_CONTEXT_FLAG_FORWARD_COMPATIBLE  (1 << 1)
 #define ST_CONTEXT_FLAG_ROBUST_ACCESS   (1 << 2)
 #define ST_CONTEXT_FLAG_RESET_NOTIFICATION_ENABLED (1 << 3)
+#define ST_CONTEXT_FLAG_NO_ERROR(1 << 4)
 
 /**
  * Reasons that context creation might fail.
diff --git a/src/gallium/state_trackers/dri/dri_context.c 
b/src/gallium/state_trackers/dri/dri_context.c
index e25f186..275c0d4 100644
--- a/src/gallium/state_trackers/dri/dri_context.c
+++ b/src/gallium/state_trackers/dri/dri_context.c
@@ -107,6 +107,9 @@ dri_create_context(gl_api api, const struct gl_config * 
visual,
if (notify_reset)
   attribs.flags |= ST_CONTEXT_FLAG_RESET_NOTIFICATION_ENABLED;
 
+   if (flags & __DRI_CTX_FLAG_NO_ERROR)
+  attribs.flags |= ST_CONTEXT_FLAG_NO_ERROR;
+
if (sharedContextPrivate) {
   st_share = ((struct dri_context *)sharedContextPrivate)->st;
}
diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index f535139..b8677f4 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -288,7 +288,7 @@ static void st_init_driver_flags(struct st_context *st);
 
 static struct st_context *
 st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe,
-   const struct st_config_options *options)
+   const struct st_config_options *options, bool no_error)
 {
struct pipe_screen *screen = pipe->screen;
uint i;
@@ -369,6 +369,9 @@ st_create_context_priv( struct gl_context *ctx, struct 
pipe_context *pipe,
 
ctx->VertexProgram._MaintainTnlProgram = GL_TRUE;
 
+   if (no_error)
+  ctx->Const.ContextFlags |= GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR;
+
st->has_stencil_export =
   screen->get_param(screen, PIPE_CAP_SHADER_STENCIL_EXPORT);
st->has_shader_model3 = screen->get_param(screen, PIPE_CAP_SM3);
@@ -535,7 +538,8 @@ static void st_init_driver_flags(struct st_context *st)
 struct st_context *st_create_context(gl_api api, struct pipe_context *pipe,
  const struct gl_config *visual,
  struct st_context *share,
- const struct st_config_options *options)
+ const struct st_config_options *options,
+ bool no_error)
 {
struct gl_context *ctx;
struct gl_context *shareCtx = share ? share->ctx : NULL;
@@ -566,7 +570,7 @@ struct st_context *st_create_context(gl_api api, struct 
pipe_context *pipe,
if (debug_get_option_mesa_mvp_dp4())
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = 
GL_TRUE;
 
-   st = st_create_context_priv(ctx, pipe, options);
+   st = st_create_context_priv(ctx, pipe, options, no_error);
if (!st) {
   _mesa_destroy_context(ctx);
}
diff --git a/src/mesa/state_tracker/st_context.h 
b/src/mesa/state_tracker/st_context.h
index af9149e..b2ea6b5 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -390,7 +390,8 @@ extern struct st_context *
 st_create_context(gl_api api, struct pipe_context *pipe,
   const struct gl_config *visual,
   struct st_context *share,
-  const struct st_config_options *options);
+  const struct st_config_options *options,
+  bool no_error);
 
 extern void
 st_destroy_context(struct st_context *st);
diff --git a/src/mesa/state_tracker/st_manager.c 
b/src/mesa/state_tracker/st_manager.c
index 7a3205c..242262b 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -654,6 +654,7 @@ st_api_create_context(struct st_api *stapi, struct 
st_manager *smapi,
struct pipe_context *pipe;
struct gl_config mode;
gl_api api;
+   bool no_error = false;
unsigned ctx_flags = PIPE_CONTEXT_PREFER_THREADED;
 
if (!(stapi->profile_mask & (1 << attribs->profile)))
@@ -680,6 +681,9 @@ st_api_create_context(struct st_api *stapi, struct 
st_manager *smapi,
if (attribs->flags & ST_CONTEXT_FLAG_ROBUST_ACCESS)
   ctx_flags |= PIPE_CONTEXT_ROBUST_BUFFER_ACCESS;
 
+   if (attribs->flags & ST_CONTEXT_FLAG_NO_ERROR)
+  

[Mesa-dev] [PATCH 4/4] dri: Add KHR_no_error toggle to driconf

2017-07-11 Thread Grigori Goronzy
Allows applications to be whitelisted.
---
 src/gallium/state_trackers/dri/dri_context.c| 3 +++
 src/gallium/state_trackers/dri/dri_screen.c | 1 +
 src/mesa/drivers/dri/common/dri_util.c  | 3 +++
 src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 +
 4 files changed, 12 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri_context.c 
b/src/gallium/state_trackers/dri/dri_context.c
index 275c0d4..e4f7c96 100644
--- a/src/gallium/state_trackers/dri/dri_context.c
+++ b/src/gallium/state_trackers/dri/dri_context.c
@@ -124,6 +124,9 @@ dri_create_context(gl_api api, const struct gl_config * 
visual,
ctx->cPriv = cPriv;
ctx->sPriv = sPriv;
 
+   if (driQueryOptionb(>optionCache, "mesa_no_error"))
+  attribs.flags |=  ST_CONTEXT_FLAG_NO_ERROR;
+
attribs.options = screen->options;
dri_fill_st_visual(, screen, visual);
ctx->st = stapi->create_context(stapi, >base, , _err,
diff --git a/src/gallium/state_trackers/dri/dri_screen.c 
b/src/gallium/state_trackers/dri/dri_screen.c
index 6b58830..de0840b 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -56,6 +56,7 @@ const __DRIconfigOptionsExtension gallium_config_options = {
DRI_CONF_BEGIN
   DRI_CONF_SECTION_PERFORMANCE
  DRI_CONF_MESA_GLTHREAD("false")
+ DRI_CONF_MESA_NO_ERROR("false")
  DRI_CONF_DISABLE_EXT_BUFFER_AGE("false")
  DRI_CONF_DISABLE_OML_SYNC_CONTROL("false")
   DRI_CONF_SECTION_END
diff --git a/src/mesa/drivers/dri/common/dri_util.c 
b/src/mesa/drivers/dri/common/dri_util.c
index 174356f..cc97c2d 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -437,6 +437,9 @@ driCreateContextAttribs(__DRIscreen *screen, int api,
   major_version, minor_version, error))
return NULL;
 
+if (driQueryOptionb(>optionCache, "mesa_no_error"))
+flags |= __DRI_CTX_FLAG_NO_ERROR;
+
 context = calloc(1, sizeof *context);
 if (!context) {
*error = __DRI_CTX_ERROR_NO_MEMORY;
diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h 
b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index 9aa1798..e308839 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -332,6 +332,11 @@ DRI_CONF_OPT_BEGIN_B(mesa_glthread, def) \
 DRI_CONF_DESC(en,gettext("Enable offloading GL driver work to a 
separate thread")) \
 DRI_CONF_OPT_END
 
+#define DRI_CONF_MESA_NO_ERROR(def) \
+DRI_CONF_OPT_BEGIN_B(mesa_no_error, def) \
+DRI_CONF_DESC(en,gettext("Disable GL driver error checking")) \
+DRI_CONF_OPT_END
+
 #define DRI_CONF_DISABLE_EXT_BUFFER_AGE(def) \
 DRI_CONF_OPT_BEGIN_B(glx_disable_ext_buffer_age, def) \
DRI_CONF_DESC(en, gettext("Disable the GLX_EXT_buffer_age extension")) \
-- 
2.7.4



Re: [Mesa-dev] [PATCH 16/25] gallium: introduce device/driver UUID queries

2017-07-11 Thread Marek Olšák
On Fri, Jul 7, 2017 at 6:24 AM, Andres Rodriguez  wrote:
> Signed-off-by: Andres Rodriguez 
> ---
>  src/gallium/drivers/ddebug/dd_screen.c | 17 +
>  src/gallium/include/pipe/p_defines.h   |  1 +
>  src/gallium/include/pipe/p_screen.h| 13 +
>  3 files changed, 31 insertions(+)
>
> diff --git a/src/gallium/drivers/ddebug/dd_screen.c 
> b/src/gallium/drivers/ddebug/dd_screen.c
> index fe9c841..955158f 100644
> --- a/src/gallium/drivers/ddebug/dd_screen.c
> +++ b/src/gallium/drivers/ddebug/dd_screen.c
> @@ -197,6 +197,21 @@ dd_screen_get_driver_query_group_info(struct pipe_screen 
> *_screen,
>  }
>
>
> +static void
> +dd_screen_get_driver_uuid(struct pipe_screen *_screen, char *uuid)
> +{
> +   struct pipe_screen *screen = dd_screen(_screen)->screen;
> +
> +   return screen->get_driver_uuid(screen, uuid);

return is unnecessary

> +}
> +
> +static void
> +dd_screen_get_device_uuid(struct pipe_screen *_screen, char *uuid)
> +{
> +   struct pipe_screen *screen = dd_screen(_screen)->screen;
> +
> +   return screen->get_device_uuid(screen, uuid);

return is unnecessary
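
For reference, a minimal sketch of a void wrapper that forwards to the wrapped screen without `return`. ISO C does not permit a return statement with an expression in a function returning void, although some compilers accept it when the expression itself is void. The struct and function names below are hypothetical, not the gallium ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical screen type with a void-returning hook. */
struct sketch_screen {
   void (*get_uuid)(struct sketch_screen *screen, char *uuid);
   struct sketch_screen *wrapped;
};

/* Hypothetical wrapped driver: writes a marker byte. */
static void
sketch_driver_get_uuid(struct sketch_screen *screen, char *uuid)
{
   (void) screen;
   uuid[0] = 'x';
}

/* Wrapper: just forward the call, nothing is returned. */
static void
sketch_wrapper_get_uuid(struct sketch_screen *_screen, char *uuid)
{
   struct sketch_screen *screen = _screen->wrapped;

   assert(screen != NULL);
   screen->get_uuid(screen, uuid);
}
```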

Marek


Re: [Mesa-dev] [PATCH 09/25] radeonsi: add basic memory object support

2017-07-11 Thread Marek Olšák
On Fri, Jul 7, 2017 at 6:24 AM, Andres Rodriguez  wrote:
> From: Dave Airlie 
>
> Signed-off-by: Andres Rodriguez 
> ---
>  src/gallium/drivers/radeon/r600_pipe_common.h |   7 ++
>  src/gallium/drivers/radeon/r600_texture.c | 112 
> ++
>  2 files changed, 119 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
> b/src/gallium/drivers/radeon/r600_pipe_common.h
> index b22a3a7..4c1a706 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
> @@ -377,6 +377,13 @@ union r600_mmio_counters {
> unsigned array[0];
>  };
>
> +struct r600_memory_object {
> +   struct pipe_memory_object   b;
> +   struct pb_buffer*buf;
> +   uint32_tstride;
> +   uint32_toffset;
> +};
> +
>  struct r600_common_screen {
> struct pipe_screen  b;
> struct radeon_winsys*ws;
> diff --git a/src/gallium/drivers/radeon/r600_texture.c 
> b/src/gallium/drivers/radeon/r600_texture.c
> index 2deb56a..0baa0ee 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -2822,10 +2822,122 @@ void evergreen_do_fast_color_clear(struct 
> r600_common_context *rctx,
> }
>  }
>
> +static struct pipe_memory_object *
> +r600_memobj_from_handle(struct pipe_screen *screen,
> +   struct winsys_handle *whandle,
> +   bool dedicated)
> +{
> +   struct r600_common_screen *rscreen = (struct 
> r600_common_screen*)screen;
> +   struct r600_memory_object *memobj = CALLOC_STRUCT(r600_memory_object);
> +   struct pb_buffer *buf = NULL;
> +   uint32_t stride, offset;
> +
> +   if (!memobj)
> +   return NULL;
> +
> +   buf = rscreen->ws->buffer_from_handle(rscreen->ws, whandle,
> + , );
> +   if (!buf)
> +   return NULL;
> +
> +   memobj->b.dedicated = dedicated;
> +   memobj->buf = buf;
> +   memobj->stride = stride;
> +   memobj->offset = offset;
> +
> +   return (struct pipe_memory_object *)memobj;
> +}
> +
> +static void
> +r600_memobj_destroy(struct pipe_screen *screen,
> +   struct pipe_memory_object *memobj)
> +{
> +   free(memobj);
> +}
> +
> +static struct pipe_resource *
> +r600_texture_from_memobj(struct pipe_screen *screen,
> +const struct pipe_resource *templ,
> +struct pipe_memory_object *_memobj,
> +uint64_t offset)
> +{
> +   int r;
> +   struct r600_common_screen *rscreen = (struct 
> r600_common_screen*)screen;
> +   struct r600_memory_object *memobj = (struct r600_memory_object 
> *)_memobj;
> +   struct r600_texture *rtex;
> +   struct radeon_surf surface;
> +   struct radeon_bo_metadata metadata = {};
> +   unsigned array_mode;
> +
> +   if (memobj->b.dedicated) {
> +   rscreen->ws->buffer_get_metadata(memobj->buf, );
> +
> +   surface.u.legacy.pipe_config = metadata.u.legacy.pipe_config;
> +   surface.u.legacy.bankw = metadata.u.legacy.bankw;
> +   surface.u.legacy.bankh = metadata.u.legacy.bankh;
> +   surface.u.legacy.tile_split = metadata.u.legacy.tile_split;
> +   surface.u.legacy.mtilea = metadata.u.legacy.mtilea;
> +   surface.u.legacy.num_banks = metadata.u.legacy.num_banks;
> +
> +   if (metadata.u.legacy.macrotile == RADEON_LAYOUT_TILED)
> +   array_mode = RADEON_SURF_MODE_2D;
> +   else if (metadata.u.legacy.microtile == RADEON_LAYOUT_TILED)
> +   array_mode = RADEON_SURF_MODE_1D;
> +   else
> +   array_mode = RADEON_SURF_MODE_LINEAR_ALIGNED;

This "legacy" metadata is only for SI-CI-VI. GFX9 should be handled here too.

Marek


Re: [Mesa-dev] [PATCH] st/mesa: add a winsys buffers list in st_context

2017-07-11 Thread Brian Paul

On 07/11/2017 11:15 AM, Charmaine Lee wrote:

Commit a5e733c6b52e93de3000647d075f5ca2f55fcb71 fixes the dangling
framebuffer object by unreferencing the window system draw/read buffers
when the context is released. However, this can prematurely destroy the
resources associated with these window system buffers. The problem is
reproducible with the Turbine Demo running on the VMware driver. In this
case, the depth buffer content is lost when the context is rebound to a
drawable.

To prevent premature destruction of the resources associated with
window system buffers, this patch maintains a list of these buffers in
the context, making sure the reference counts of these buffers will not
reach zero until the associated framebuffer interface objects no
longer exist. This also helps to avoid unnecessary destruction and
re-construction of the resources associated with the framebuffer.

Fixes VMware bug 1909807.
---
  src/gallium/include/state_tracker/st_api.h|  5 +++
  src/gallium/state_trackers/dri/dri_drawable.c |  4 ++
  src/gallium/state_trackers/wgl/stw_st.c   |  4 +-
  src/mesa/state_tracker/st_context.c   | 22 ++
  src/mesa/state_tracker/st_context.h   |  7 
  src/mesa/state_tracker/st_manager.c   | 59 ++-
  src/mesa/state_tracker/st_manager.h   |  4 ++
  7 files changed, 94 insertions(+), 11 deletions(-)

diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index d641092..3fd5f01 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -311,6 +311,11 @@ struct st_framebuffer_iface
 int32_t stamp;

 /**
+* Identifier that uniquely identifies the framebuffer interface object.
+*/
+   uint32_t ID;
+
+   /**
  * Available for the state tracker manager to use.
  */
 void *st_manager_private;
diff --git a/src/gallium/state_trackers/dri/dri_drawable.c 
b/src/gallium/state_trackers/dri/dri_drawable.c
index 3c2e307..0cfdc30 100644
--- a/src/gallium/state_trackers/dri/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/dri_drawable.c
@@ -38,6 +38,8 @@
  #include "util/u_memory.h"
  #include "util/u_inlines.h"

+static uint32_t drifb_ID = 0;
+
  static void
  swap_fences_unref(struct dri_drawable *draw);

@@ -155,6 +157,7 @@ dri_create_buffer(__DRIscreen * sPriv,

 dPriv->driverPrivate = (void *)drawable;
 p_atomic_set(>base.stamp, 1);
+   drawable->base.ID = p_atomic_inc_return(_ID);

 return GL_TRUE;
  fail:
@@ -177,6 +180,7 @@ dri_destroy_buffer(__DRIdrawable * dPriv)

 swap_fences_unref(drawable);

+   drawable->base.ID = 0;
 FREE(drawable);
  }

diff --git a/src/gallium/state_trackers/wgl/stw_st.c 
b/src/gallium/state_trackers/wgl/stw_st.c
index 7806a2a..c2844b0 100644
--- a/src/gallium/state_trackers/wgl/stw_st.c
+++ b/src/gallium/state_trackers/wgl/stw_st.c
@@ -46,7 +46,7 @@ struct stw_st_framebuffer {
 unsigned texture_mask;
  };

-
+static uint32_t stwfb_ID = 0;

  /**
   * Is the given mutex held by the calling thread?
@@ -234,6 +234,7 @@ stw_st_create_framebuffer(struct stw_framebuffer *fb)

 stwfb->fb = fb;
 stwfb->stvis = fb->pfi->stvis;
+   stwfb->base.ID = p_atomic_inc_return(_ID);

 stwfb->base.visual = >stvis;
 p_atomic_set(>base.stamp, 1);
@@ -255,6 +256,7 @@ stw_st_destroy_framebuffer_locked(struct 
st_framebuffer_iface *stfb)
 for (i = 0; i < ST_ATTACHMENT_COUNT; i++)
pipe_resource_reference(>textures[i], NULL);

+   stwfb->base.ID = 0;
 FREE(stwfb);
  }

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index f535139..fb0182f 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -38,6 +38,7 @@
  #include "program/prog_cache.h"
  #include "vbo/vbo.h"
  #include "glapi/glapi.h"
+#include "st_manager.h"
  #include "st_context.h"
  #include "st_debug.h"
  #include "st_cb_bitmap.h"
@@ -571,6 +572,9 @@ struct st_context *st_create_context(gl_api api, struct 
pipe_context *pipe,
_mesa_destroy_context(ctx);
 }

+   /* Initialize context's winsys buffers list */
+   LIST_INITHEAD(>winsys_buffers);
+
 return st;
  }

@@ -591,6 +595,19 @@ destroy_tex_sampler_cb(GLuint id, void *data, void 
*userData)
  void st_destroy_context( struct st_context *st )
  {
 struct gl_context *ctx = st->ctx;
+   struct st_framebuffer *stfb, *next;
+
+   GET_CURRENT_CONTEXT(curctx);
+   if (curctx == NULL) {
+  boolean ret;
+
+  /* No current context, but we need one to release
+   * renderbuffer surface when we release framebuffer.
+   * So temporarily bind the context.
+   */
+  ret = _mesa_make_current(ctx, NULL, NULL);
+  (void) ret;


We might as well just call _mesa_make_current() and get rid of the 'ret' 
variable.
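
Something along these lines, as a sketch only; sketch_make_current() is a hypothetical stand-in for the real call and the context type is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context type and current-context pointer. */
struct sketch_ctx { int id; };
static struct sketch_ctx *sketch_current = NULL;

/* Hypothetical stand-in for a make-current call that reports success. */
static bool
sketch_make_current(struct sketch_ctx *ctx)
{
   sketch_current = ctx;
   return true;
}

/* Temporarily bind ctx when nothing is current.  The result is not
 * needed, so the call stands alone with no 'ret' variable.
 */
static void
sketch_bind_if_needed(struct sketch_ctx *ctx)
{
   if (sketch_current == NULL)
      sketch_make_current(ctx);
}

/* Usage: only the first call binds, the second leaves the binding alone. */
void
sketch_bind_demo(void)
{
   struct sketch_ctx a = { 1 }, b = { 2 };

   sketch_current = NULL;
   sketch_bind_if_needed(&a);
   assert(sketch_current == &a);
   sketch_bind_if_needed(&b);
   assert(sketch_current == &a);
   sketch_current = NULL;
}
```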




+   }

 /* This must be called first so that glthread has a chance to finish */
 _mesa_glthread_destroy(ctx);
@@ -604,6 +621,11 

Re: [Mesa-dev] [PATCH] radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT

2017-07-11 Thread Bas Nieuwenhuizen
On Fri, Jun 30, 2017 at 12:18 PM, Alex Smith
 wrote:
> If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image
> view's descriptor was set to a 2D array (and a few other fields adjusted
> accordingly). This is correct when the image view is actually bound as a
> storage image, but not when bound as a sampled image. In that case the
> type should be set as a cube.
>
> Fix by generating 2 sets of descriptors at view creation time for both
> storage and non-storage usage, and then choose between them based on
> descriptor type when writing descriptor sets.
>
> Signed-off-by: Alex Smith 
> ---
>  src/amd/vulkan/radv_descriptor_set.c | 18 +++--
>  src/amd/vulkan/radv_image.c  | 77 
> 
>  src/amd/vulkan/radv_private.h|  6 +++
>  3 files changed, 72 insertions(+), 29 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_descriptor_set.c 
> b/src/amd/vulkan/radv_descriptor_set.c
> index ec7fd3d..b4a78aa 100644
> --- a/src/amd/vulkan/radv_descriptor_set.c
> +++ b/src/amd/vulkan/radv_descriptor_set.c
> @@ -603,11 +603,18 @@ write_image_descriptor(struct radv_device *device,
>struct radv_cmd_buffer *cmd_buffer,
>unsigned *dst,
>struct radeon_winsys_bo **buffer_list,
> +  VkDescriptorType descriptor_type,
>const VkDescriptorImageInfo *image_info)
>  {
> RADV_FROM_HANDLE(radv_image_view, iview, image_info->imageView);
> -   memcpy(dst, iview->descriptor, 8 * 4);
> -   memcpy(dst + 8, iview->fmask_descriptor, 8 * 4);
> +
> +   if (descriptor_type == VK_DESCRIPTOR_TYPE_STORAGE_IMAGE) {
> +   memcpy(dst, iview->storage_descriptor, 8 * 4);
> +   memcpy(dst + 8, iview->storage_fmask_descriptor, 8 * 4);
> +   } else {
> +   memcpy(dst, iview->descriptor, 8 * 4);
> +   memcpy(dst + 8, iview->fmask_descriptor, 8 * 4);
> +   }
>
> if (cmd_buffer)
> device->ws->cs_add_buffer(cmd_buffer->cs, iview->bo, 7);
> @@ -620,12 +627,13 @@ write_combined_image_sampler_descriptor(struct 
> radv_device *device,
> struct radv_cmd_buffer *cmd_buffer,
> unsigned *dst,
> struct radeon_winsys_bo **buffer_list,
> +   VkDescriptorType descriptor_type,
> const VkDescriptorImageInfo 
> *image_info,
> bool has_sampler)
>  {
> RADV_FROM_HANDLE(radv_sampler, sampler, image_info->sampler);
>
> -   write_image_descriptor(device, cmd_buffer, dst, buffer_list, 
> image_info);
> +   write_image_descriptor(device, cmd_buffer, dst, buffer_list, 
> descriptor_type, image_info);
> /* copy over sampler state */
> if (has_sampler)
> memcpy(dst + 16, sampler->state, 16);
> @@ -696,10 +704,12 @@ void radv_update_descriptor_sets(
> case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
> case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
> write_image_descriptor(device, cmd_buffer, 
> ptr, buffer_list,
> +  
> writeset->descriptorType,
>writeset->pImageInfo + 
> j);
> break;
> case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
> 
> write_combined_image_sampler_descriptor(device, cmd_buffer, ptr, buffer_list,
> +   
> writeset->descriptorType,
> 
> writeset->pImageInfo + j,
> 
> !binding_layout->immutable_samplers_offset);
> if (copy_immutable_samplers) {
> @@ -866,10 +876,12 @@ void radv_update_descriptor_set_with_template(struct 
> radv_device *device,
> case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
> case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
> write_image_descriptor(device, cmd_buffer, 
> pDst, buffer_list,
> +  
> templ->entry[i].descriptor_type,
>(struct 
> VkDescriptorImageInfo *) pSrc);
> break;
> case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
> 
> write_combined_image_sampler_descriptor(device, cmd_buffer, pDst, buffer_list,
> +   

[Mesa-dev] [PATCH] i965: Fix up a failed CPU/WC mmaping with a GTT mapping

2017-07-11 Thread Chris Wilson
Not all objects will be mappable for direct access by the CPU (either
using WB/CPU or WC paths), for example, a dmabuf wrapping an object on a
foreign device or an object wrapping access to stolen memory. Since
either the physical pages are not known or even do not exist, we need to
use the mediated, indirect access via the GTT. (If one day, the kernel
does suddenly start providing mediated access via a regular WB/WC
mmapping, we no longer need the fallback.)

Cc: Kenneth Graunke 
Cc: Matt Turner 
---
Note that until we actually have WC mmaps, we can legitimately hit the
fallback case for !llc writing into a linear buffer. So please just
squash this into the WC mmap patch.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 34 +++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 98a75dd4a6..055ac59a4a 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -740,6 +740,13 @@ brw_bo_map_cpu(struct brw_context *brw, struct brw_bo *bo, 
unsigned flags)
 }
 
 static void *
+brw_bo_map_wc(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
+{
+   /* Intentionally left blank */
+   return NULL;
+}
+
+static void *
 brw_bo_map_gtt(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
 {
struct brw_bufmgr *bufmgr = bo->bufmgr;
@@ -823,12 +830,33 @@ can_map_cpu(struct brw_bo *bo, unsigned flags)
 void *
 brw_bo_map(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
 {
+
if (bo->tiling_mode != I915_TILING_NONE && !(flags & MAP_RAW))
   return brw_bo_map_gtt(brw, bo, flags);
-   else if (can_map_cpu(bo, flags))
-  return brw_bo_map_cpu(brw, bo, flags);
+
+   void *map;
+
+   if (can_map_cpu(bo, flags))
+  map = brw_bo_map_cpu(brw, bo, flags);
else
-  return brw_bo_map_gtt(brw, bo, flags);
+  map = brw_bo_map_wc(brw, bo, flags);
+
+   /* Allow the attempt to fail by falling back to the GTT where necessary.
+*
+* Not every buffer can be mmaped directly using the CPU (or WC), for
+* example buffers that wrap stolen memory or are imported from other
+* devices. For those, we have little choice but to use a GTT mmapping.
+* However, if we use a slow GTT mmapping for reads where we expected fast
+* access, that order of magnitude difference in throughput will be clearly
+* expressed by angry users.
+*/
+   if (!map) {
+  perf_debug("Fallback GTT mapping for %s with access flags %x\n",
+ bo->name, flags);
+  map = brw_bo_map_gtt(brw, bo, flags);
+   }
+
+   return map;
 }
 
 int
-- 
2.13.2
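
For readers skimming the thread, the fallback pattern in the patch above can be shown without any driver code. This is a minimal sketch under stated assumptions: `sketch_bo` and the `sketch_map_*` helpers are hypothetical, and the `directly_mappable` flag stands in for whatever makes a CPU/WC mmap succeed or fail in the real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical buffer object; not the i965 brw_bo. */
struct sketch_bo {
   const char *name;
   int directly_mappable;
   char storage[64];
};

/* Fast path: may fail, e.g. for imported or stolen-memory buffers. */
static void *
sketch_map_direct(struct sketch_bo *bo)
{
   return bo->directly_mappable ? bo->storage : NULL;
}

/* Slow path: assumed to work for every buffer in this sketch. */
static void *
sketch_map_indirect(struct sketch_bo *bo)
{
   return bo->storage;
}

/* Try the fast path first, log and fall back to the slow one on failure. */
static void *
sketch_map(struct sketch_bo *bo, int *used_fallback)
{
   assert(bo != NULL && used_fallback != NULL);

   void *map = sketch_map_direct(bo);
   *used_fallback = 0;
   if (!map) {
      fprintf(stderr, "fallback mapping for %s\n", bo->name);
      map = sketch_map_indirect(bo);
      *used_fallback = 1;
   }
   return map;
}
```

Logging on the fallback path mirrors the perf_debug() call in the patch, so a slow mapping is at least visible when it happens.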

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: fix texture swizzle writemasking

2017-07-11 Thread Charmaine Lee

Looks good.  Thanks for the quick fix.

Reviewed-by: Charmaine Lee 


From: Brian Paul 
Sent: Tuesday, July 11, 2017 2:02 PM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee; Neha Bhende
Subject: [PATCH] svga: fix texture swizzle writemasking

Commit bfe1e7737a76e3b046 changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.

This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).

Fixes issues seen with ETQW apitrace.
---
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c 
b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
index d29ac28..77911ad 100644
--- a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
+++ b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
@@ -5047,6 +5047,7 @@ end_tex_swizzle(struct svga_shader_emitter_v10 *emit,
  ((swz_g == PIPE_SWIZZLE_0) << 1) |
  ((swz_b == PIPE_SWIZZLE_0) << 2) |
  ((swz_a == PIPE_SWIZZLE_0) << 3));
+  writemask_0 &= swz->inst_dst->Register.WriteMask;

   if (writemask_0) {
  struct tgsi_full_src_register zero = int_tex ?
@@ -5066,6 +5067,8 @@ end_tex_swizzle(struct svga_shader_emitter_v10 *emit,
  ((swz_b == PIPE_SWIZZLE_1) << 2) |
  ((swz_a == PIPE_SWIZZLE_1) << 3));

+  writemask_1 &= swz->inst_dst->Register.WriteMask;
+
   if (writemask_1) {
  struct tgsi_full_src_register one = int_tex ?
 make_immediate_reg_int(emit, 1) :
--
1.9.1



Re: [Mesa-dev] [PATCH] i965: Use VALGRIND_MAKE_MEM_x in place of MALLOCLIKE/FREELIKE

2017-07-11 Thread Kenneth Graunke
On Tuesday, July 11, 2017 8:54:25 AM PDT Chris Wilson wrote:
> Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the
> exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block
> inaccessible, but still leaves it defined in its allocation tracker i.e.
> it will report the mmap as lost despite the call to FREELIKE!
> 
> Instead of treating the mmap as an allocation, treat it as changing the
> access bits upon the memory, i.e. that it becomes defined (because
> the buffer objects always contain valid content from the user's
> perspective) upon mmap and inaccessible upon munmap. This makes memcheck
> happy without leaving it thinking there is a very large leak.
> 
> Finally for consistency, we treat all the mmap/munmap paths the same
> even though valgrind can intercept the regular mmap used for GTT. We
> could move this into the drm_mmap/drm_munmap macros, but that quickly
> looks ugly given the desire for those to support different OSes, but I
> didn't try that hard!

Reviewed-by: Kenneth Graunke 

and pushed:

To ssh://git.freedesktop.org/git/mesa/mesa
   314879f7fec..cead51a0c63  master -> master
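
A minimal sketch of the approach described in the quoted commit message, for reference. Assumptions: the `sketch_` helpers and the `HAVE_VALGRIND` guard are illustrative, and an anonymous mmap stands in for a buffer-object mapping. The two memcheck client requests, VALGRIND_MAKE_MEM_DEFINED and VALGRIND_MAKE_MEM_NOACCESS, come from valgrind/memcheck.h.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifdef HAVE_VALGRIND
#include <valgrind/memcheck.h>
#define VG(x) ((void)(x))
#else
#define VG(x) ((void)0) /* compiles to nothing without valgrind headers */
#endif

/* Map memory and tell memcheck that its contents are valid. */
static void *
sketch_map(size_t size)
{
   void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
   if (ptr == MAP_FAILED)
      return NULL;

   VG(VALGRIND_MAKE_MEM_DEFINED(ptr, size));
   return ptr;
}

/* Mark the range inaccessible, then unmap it. Nothing is registered as an
 * allocation, so memcheck has no block to report as lost. */
static int
sketch_unmap(void *ptr, size_t size)
{
   assert(ptr != NULL);

   VG(VALGRIND_MAKE_MEM_NOACCESS(ptr, size));
   return munmap(ptr, size);
}
```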



[Mesa-dev] [PATCH] svga: fix texture swizzle writemasking

2017-07-11 Thread Brian Paul
Commit bfe1e7737a76e3b046 changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.

This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).

Fixes issues seen with ETQW apitrace.
---
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c 
b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
index d29ac28..77911ad 100644
--- a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
+++ b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
@@ -5047,6 +5047,7 @@ end_tex_swizzle(struct svga_shader_emitter_v10 *emit,
  ((swz_g == PIPE_SWIZZLE_0) << 1) |
  ((swz_b == PIPE_SWIZZLE_0) << 2) |
  ((swz_a == PIPE_SWIZZLE_0) << 3));
+  writemask_0 &= swz->inst_dst->Register.WriteMask;
 
   if (writemask_0) {
  struct tgsi_full_src_register zero = int_tex ?
@@ -5066,6 +5067,8 @@ end_tex_swizzle(struct svga_shader_emitter_v10 *emit,
  ((swz_b == PIPE_SWIZZLE_1) << 2) |
  ((swz_a == PIPE_SWIZZLE_1) << 3));
 
+  writemask_1 &= swz->inst_dst->Register.WriteMask;
+
   if (writemask_1) {
  struct tgsi_full_src_register one = int_tex ?
 make_immediate_reg_int(emit, 1) :
-- 
1.9.1
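
The masking step in the patch above can be illustrated on its own. This is a sketch under stated assumptions: the `sketch_` names and the enum are hypothetical, and the 4-bit masks use bit 0 for x through bit 3 for w, as the shifts in the patch suggest.

```c
#include <assert.h>

/* Hypothetical swizzle terms: four channel selects plus constants 0 and 1. */
enum sketch_swizzle { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* Build a 4-bit mask with bit i set when channel i selects 'term'. */
static unsigned
sketch_term_mask(const enum sketch_swizzle swz[4], enum sketch_swizzle term)
{
   unsigned mask = 0;
   for (int i = 0; i < 4; i++)
      mask |= (unsigned)(swz[i] == term) << i;
   return mask;
}

/* Channels that must receive the constant: those that select it AND that
 * the instruction actually writes. Without the final AND, the constant
 * would also land in channels the instruction was told to leave alone. */
static unsigned
sketch_constant_writemask(const enum sketch_swizzle swz[4],
                          enum sketch_swizzle term,
                          unsigned inst_writemask)
{
   assert(term == SWZ_0 || term == SWZ_1);
   return sketch_term_mask(swz, term) & inst_writemask;
}
```

For a swizzle of (X, 0, Z, 1) and an instruction writing only x and z, both constant masks come out empty, so no constant moves are needed.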



[Mesa-dev] [PATCH v2] st/dri: add 32-bit RGBX/RGBA formats

2017-07-11 Thread Rob Herring
From: Marek Olšák 

Add support for 32-bit RGBX/RGBA formats which are required for Android.

The original patch (commit ccdcf91104a5) was reverted (commit
c0c6ca40a25e) in mesa as it broke GLX resulting in swapped colors. Based
on further investigation by Chad Versace, moving the RGBX/RGBA configs
to the end is enough to prevent breaking GLX.

The handling of RGBA/RGBX in dri_fill_st_visual is a fix from Marek
Olšák.

Cc: Eric Anholt 
Cc: Chad Versace 
Cc: Mauro Rossi 
Reviewed-by: Marek Olšák 
Signed-off-by: Rob Herring 
---
v2:
- Incorporated dri_fill_st_visual RGBA/X handling from Marek
- Handle RGBA/X in dri2_drawable_get_buffers for completeness
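
To illustrate why the ordering matters, here is a minimal sketch of size-only versus mask-aware config matching. It is not the GLX or EGL code: the `sketch_` names are hypothetical and each config is reduced to a single channel size and mask.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced config: one channel size and one channel mask. */
struct sketch_config {
   unsigned red_size;
   unsigned red_mask;
};

/* Size-only matching: the first config with the right channel size wins,
 * regardless of where the channel sits in the pixel. */
static int
sketch_match_by_size(const struct sketch_config *cfgs, size_t n,
                     const struct sketch_config *want)
{
   assert(cfgs != NULL && want != NULL);
   for (size_t i = 0; i < n; i++)
      if (cfgs[i].red_size == want->red_size)
         return (int)i;
   return -1;
}

/* Mask-aware matching: size and channel position must both agree. */
static int
sketch_match_by_mask(const struct sketch_config *cfgs, size_t n,
                     const struct sketch_config *want)
{
   assert(cfgs != NULL && want != NULL);
   for (size_t i = 0; i < n; i++)
      if (cfgs[i].red_size == want->red_size &&
          cfgs[i].red_mask == want->red_mask)
         return (int)i;
   return -1;
}
```

With size-only matching, a request for BGRA resolves correctly only while the BGRA entry comes first, which is why the new RGBA/RGBX configs go at the end of the list.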

 src/gallium/state_trackers/dri/dri2.c   |  8 
 src/gallium/state_trackers/dri/dri_screen.c | 69 -
 2 files changed, 65 insertions(+), 12 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 60ec38d8e445..e20b2c075361 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -186,6 +186,9 @@ static enum pipe_format dri2_format_to_pipe_format (int 
format)
case __DRI_IMAGE_FORMAT_ARGB8888:
   pf = PIPE_FORMAT_BGRA8888_UNORM;
   break;
+   case __DRI_IMAGE_FORMAT_XBGR8888:
+  pf = PIPE_FORMAT_RGBX8888_UNORM;
+  break;
case __DRI_IMAGE_FORMAT_ABGR8888:
   pf = PIPE_FORMAT_RGBA8888_UNORM;
   break;
@@ -356,9 +359,11 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable,
*/
   switch(format) {
   case PIPE_FORMAT_BGRA8888_UNORM:
+  case PIPE_FORMAT_RGBA8888_UNORM:
 depth = 32;
 break;
   case PIPE_FORMAT_BGRX8888_UNORM:
+  case PIPE_FORMAT_RGBX8888_UNORM:
 depth = 24;
 break;
   case PIPE_FORMAT_B5G6R5_UNORM:
@@ -434,6 +439,9 @@ dri_image_drawable_get_buffers(struct dri_drawable 
*drawable,
   case PIPE_FORMAT_BGRA8888_UNORM:
  image_format = __DRI_IMAGE_FORMAT_ARGB8888;
  break;
+  case PIPE_FORMAT_RGBX8888_UNORM:
+ image_format = __DRI_IMAGE_FORMAT_XBGR8888;
+ break;
   case PIPE_FORMAT_RGBA8888_UNORM:
  image_format = __DRI_IMAGE_FORMAT_ABGR8888;
  break;
diff --git a/src/gallium/state_trackers/dri/dri_screen.c 
b/src/gallium/state_trackers/dri/dri_screen.c
index 6b58830e0b42..a0d9b34d667b 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -132,6 +132,27 @@ dri_fill_in_modes(struct dri_screen *screen)
   MESA_FORMAT_B8G8R8A8_SRGB,
   MESA_FORMAT_B8G8R8X8_SRGB,
   MESA_FORMAT_B5G6R5_UNORM,
+
+  /* The 32-bit RGBA format must not precede the 32-bit BGRA format.
+   * Likewise for RGBX and BGRX.  Otherwise, the GLX client and the GLX
+   * server may disagree on which format the GLXFBConfig represents,
+   * resulting in swapped color channels.
+   *
+   * The problem, as of 2017-05-30:
+   * When matching a GLXFBConfig to a __DRIconfig, GLX ignores the channel
+   * order and chooses the first __DRIconfig with the expected channel
+   * sizes. Specifically, GLX compares the GLXFBConfig's and __DRIconfig's
+   * __DRI_ATTRIB_{CHANNEL}_SIZE but ignores __DRI_ATTRIB_{CHANNEL}_MASK.
+   *
+   * EGL does not suffer from this problem. It correctly compares the
+   * channel masks when matching EGLConfig to __DRIconfig.
+   */
+
+  /* Required by Android, for HAL_PIXEL_FORMAT_RGBA_8888. */
+  MESA_FORMAT_R8G8B8A8_UNORM,
+
+  /* Required by Android, for HAL_PIXEL_FORMAT_RGBX_8888. */
+  MESA_FORMAT_R8G8B8X8_UNORM,
};
static const enum pipe_format pipe_formats[] = {
   PIPE_FORMAT_BGRA8888_UNORM,
@@ -139,6 +160,8 @@ dri_fill_in_modes(struct dri_screen *screen)
   PIPE_FORMAT_BGRA8888_SRGB,
   PIPE_FORMAT_BGRX8888_SRGB,
   PIPE_FORMAT_B5G6R5_UNORM,
+  PIPE_FORMAT_RGBA8888_UNORM,
+  PIPE_FORMAT_RGBX8888_UNORM,
};
mesa_format format;
__DRIconfig **configs = NULL;
@@ -275,19 +298,41 @@ dri_fill_st_visual(struct st_visual *stvis, struct 
dri_screen *screen,
if (!mode)
   return;
 
-   if (mode->redBits == 8) {
-  if (mode->alphaBits == 8)
- if (mode->sRGBCapable)
-stvis->color_format = PIPE_FORMAT_BGRA8888_SRGB;
- else
-stvis->color_format = PIPE_FORMAT_BGRA8888_UNORM;
-  else
- if (mode->sRGBCapable)
-stvis->color_format = PIPE_FORMAT_BGRX8888_SRGB;
- else
-stvis->color_format = PIPE_FORMAT_BGRX8888_UNORM;
-   } else {
+   /* Deduce the color format. */
+   switch (mode->redMask) {
+   case 0x00FF0000:
+  if (mode->alphaMask) {
+ assert(mode->alphaMask == 0xFF000000);
+ stvis->color_format = mode->sRGBCapable ?
+  

Re: [Mesa-dev] [PATCH 08/19] mesa/st: implement memory objects as a backend for texture storage

2017-07-11 Thread Marek Olšák
On Sat, Jul 1, 2017 at 1:03 AM, Andres Rodriguez  wrote:
> From: Dave Airlie 
>
> Instead of allocating memory to back a texture, use the provided memory
> object.
>
> Signed-off-by: Andres Rodriguez 
> ---
>  src/mesa/main/mtypes.h |   2 +
>  src/mesa/state_tracker/st_cb_texture.c | 123 
> +
>  src/mesa/state_tracker/st_extensions.c |   5 ++
>  3 files changed, 130 insertions(+)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 8dcc1a8..463f444 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4125,6 +4125,8 @@ struct gl_extensions
> GLboolean EXT_framebuffer_sRGB;
> GLboolean EXT_gpu_program_parameters;
> GLboolean EXT_gpu_shader4;
> +   GLboolean EXT_memory_object;
> +   GLboolean EXT_memory_object_fd;
> GLboolean EXT_packed_float;
> GLboolean EXT_pixel_buffer_object;
> GLboolean EXT_point_parameters;
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index 9798321..063f6d6 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -53,6 +53,7 @@
>  #include "state_tracker/st_cb_flush.h"
>  #include "state_tracker/st_cb_texture.h"
>  #include "state_tracker/st_cb_bufferobjects.h"
> +#include "state_tracker/st_cb_memoryobjects.h"
>  #include "state_tracker/st_format.h"
>  #include "state_tracker/st_pbo.h"
>  #include "state_tracker/st_texture.h"
> @@ -2890,6 +2891,125 @@ st_TexParameter(struct gl_context *ctx,
> }
>  }
>
> +/**
> + * Allocate a new pipe_resource object
> + * width0, height0, depth0 are the dimensions of the level 0 image
> + * (the highest resolution).  last_level indicates how many mipmap levels
> + * to allocate storage for.  For non-mipmapped textures, this will be zero.
> + */
> +static struct pipe_resource *
> +st_texture_create_memory(struct st_context *st,

st_texture_create_from_memory?

Marek


Re: [Mesa-dev] [PATCH 08/19] mesa/st: implement memory objects as a backend for texture storage

2017-07-11 Thread Marek Olšák
On Sat, Jul 1, 2017 at 1:03 AM, Andres Rodriguez  wrote:
> From: Dave Airlie 
>
> Instead of allocating memory to back a texture, use the provided memory
> object.
>
> Signed-off-by: Andres Rodriguez 
> ---
>  src/mesa/main/mtypes.h |   2 +
>  src/mesa/state_tracker/st_cb_texture.c | 123 
> +
>  src/mesa/state_tracker/st_extensions.c |   5 ++
>  3 files changed, 130 insertions(+)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 8dcc1a8..463f444 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4125,6 +4125,8 @@ struct gl_extensions
> GLboolean EXT_framebuffer_sRGB;
> GLboolean EXT_gpu_program_parameters;
> GLboolean EXT_gpu_shader4;
> +   GLboolean EXT_memory_object;
> +   GLboolean EXT_memory_object_fd;
> GLboolean EXT_packed_float;
> GLboolean EXT_pixel_buffer_object;
> GLboolean EXT_point_parameters;
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index 9798321..063f6d6 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -53,6 +53,7 @@
>  #include "state_tracker/st_cb_flush.h"
>  #include "state_tracker/st_cb_texture.h"
>  #include "state_tracker/st_cb_bufferobjects.h"
> +#include "state_tracker/st_cb_memoryobjects.h"
>  #include "state_tracker/st_format.h"
>  #include "state_tracker/st_pbo.h"
>  #include "state_tracker/st_texture.h"
> @@ -2890,6 +2891,125 @@ st_TexParameter(struct gl_context *ctx,
> }
>  }
>
> +/**
> + * Allocate a new pipe_resource object
> + * width0, height0, depth0 are the dimensions of the level 0 image
> + * (the highest resolution).  last_level indicates how many mipmap levels
> + * to allocate storage for.  For non-mipmapped textures, this will be zero.
> + */
> +static struct pipe_resource *
> +st_texture_create_memory(struct st_context *st,
> + struct st_memory_object *memObj,
> + GLuint64 offset,
> + enum pipe_texture_target target,
> + enum pipe_format format,
> + GLuint last_level,
> + GLuint width0,
> + GLuint height0,
> + GLuint depth0,
> + GLuint layers,
> + GLuint nr_samples,
> + GLuint bind )
> +{
> +   struct pipe_resource pt, *newtex;
> +   struct pipe_screen *screen = st->pipe->screen;
> +
> +   assert(target < PIPE_MAX_TEXTURE_TYPES);
> +   assert(width0 > 0);
> +   assert(height0 > 0);
> +   assert(depth0 > 0);
> +   if (target == PIPE_TEXTURE_CUBE)
> +  assert(layers == 6);
> +
> +   DBG("%s target %d format %s last_level %d\n", __func__,
> +   (int) target, util_format_name(format), last_level);
> +
> +   assert(format);
> +   assert(screen->is_format_supported(screen, format, target, 0,
> +  PIPE_BIND_SAMPLER_VIEW));
> +
> +   memset(&pt, 0, sizeof(pt));
> +   pt.target = target;
> +   pt.format = format;
> +   pt.last_level = last_level;
> +   pt.width0 = width0;
> +   pt.height0 = height0;
> +   pt.depth0 = depth0;
> +   pt.array_size = layers;
> +   pt.usage = PIPE_USAGE_DEFAULT;
> +   pt.bind = bind;
> +   /* only set this for OpenGL textures, not renderbuffers */
> +   pt.flags = PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY;
> +   pt.nr_samples = nr_samples;
> +
> +   newtex = screen->resource_from_memobj(screen, &pt, memObj->memory, 
> offset);
> +
> +   assert(!newtex || pipe_is_referenced(&newtex->reference));
> +
> +   return newtex;
> +}
> +
> +
> +static bool
> +st_SetTextureStorageForMemoryObject(struct gl_context *ctx,
> +struct gl_texture_object *texObj,
> +struct gl_memory_object *memObj,
> +GLsizei levels, GLsizei width,
> +GLsizei height, GLsizei depth,
> +GLuint64 offset)
> +{
> +   struct st_context *st = st_context(ctx);
> +   struct gl_texture_image *texImage = texObj->Image[0][0];
> +   struct st_texture_object *stObj = st_texture_object(texObj);
> +   struct st_memory_object *smObj = st_memory_object(memObj);
> +   struct pipe_screen *screen = st->pipe->screen;
> +   unsigned ptWidth;
> +   uint16_t ptHeight, ptDepth, ptLayers;
> +   GLuint bindings;
> +   enum pipe_format fmt;
> +   GLuint num_samples = texImage->NumSamples;
> +
> +   stObj->lastLevel = levels - 1;
> +
> +   fmt = st_mesa_format_to_pipe_format(st, texImage->TexFormat);
> +
> +   bindings = default_bindings(st, fmt);
> +
> +   /* Raise the sample count if the requested one is unsupported. */
> +   if (num_samples > 1) {
> +  GLboolean found = GL_FALSE;
> +
> +  for (; num_samples <= ctx->Const.MaxSamples; 

[Mesa-dev] OSdemo32 GLU undefined reference errors..

2017-07-11 Thread Trevor Sandy
I'm having some difficulty linking OSdemo32 [Mesa Demo] on MSYS2/MinGW.
Basically, some functions are not included in libGLU.a as the error
returned is undefined reference to __imp_glu... It's strange because these
functions exist in the libGLU source and I am building and using a static
libGLU library (versus a .dll which could have some dll import issues as
explained at the bottom).

Here is what the last bit of the linking trace looks like:

* building Mesa demo...
g++ -DHAVE_FREEGLUT -DFREEGLUT_STATIC -O3 -I/opt/osmesa/include
-I../../src/util -include /opt/osmesa/include/GL/gl.h -include
/opt/osmesa/include/GL/glu.h -include /opt/osmesa/include/GL/freeglut.h -o
osdemo32 osdemo32.c -L/opt/osmesa/lib -losmesa -lfreeglut -lGLU -lz
-LC:\Users\Trevor\Projects\osmesa-install\build\install\llvm/lib
-lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMX86CodeGen -lLLVMGlobalISel
-lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMDebugInfoCodeView
-lLLVMDebugInfoMSF -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine
-lLLVMTransformUtils -lLLVMBitWriter -lLLVMX86Desc -lLLVMMCDisassembler
-lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils -lLLVMMCJIT
-lLLVMExecutionEngine -lLLVMTarget -lLLVMAnalysis -lLLVMProfileData
-lLLVMRuntimeDyld -lLLVMObject -lLLVMMCParser -lLLVMBitReader -lLLVMMC
-lLLVMCore -lLLVMSupport -lLLVMDemangle -lpsapi -lshell32 -lole32 -luuid

C:\msys64\tmp\cc4gd4Lj.o:osdemo32.c:(.text.startup+0xe1): undefined
reference to `__imp_gluNewQuadric'
C:\msys64\tmp\cc4gd4Lj.o:osdemo32.c:(.text.startup+0x347): undefined
reference to `__imp_gluCylinder'
C:\msys64\tmp\cc4gd4Lj.o:osdemo32.c:(.text.startup+0x3a7): undefined
reference to `__imp_gluSphere'
C:\msys64\tmp\cc4gd4Lj.o:osdemo32.c:(.text.startup+0x3bf): undefined
reference to `__imp_gluDeleteQuadric'

collect2.exe: error: ld returned 1 exit status

I had numerous 'redeclared without dllimport attribute: previous dllimport
ignored [-Wattributes]' warnings while building libGLU but as I am using a
static lib, I imagine this should not be related to my issue. However, this
situation could indeed cause functions declared without dllimport
attributes to not export to the compiled [dynamic] library. Thanks.
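
For context, the `__imp_` prefix is what the toolchain emits for references to symbols declared with `__declspec(dllimport)`, so these errors suggest the GLU declarations were seen as imports while osdemo32.c was compiled. The usual header pattern looks like the minimal sketch below. All `SKETCH_` names are hypothetical; whether glu.h offers an equivalent switch is not confirmed here.

```c
#include <assert.h>

/* Hypothetical switch: define it when linking the static library so the
 * declarations below are plain externs rather than DLL imports. */
#define SKETCH_STATIC 1

#if defined(_WIN32) && !defined(SKETCH_STATIC)
#  ifdef SKETCH_BUILD
#    define SKETCH_API __declspec(dllexport) /* building the DLL itself */
#  else
#    define SKETCH_API __declspec(dllimport) /* callers reference __imp_sketch_add */
#  endif
#else
#  define SKETCH_API /* static library or non-Windows: ordinary symbol */
#endif

/* What a public header would declare. */
SKETCH_API int sketch_add(int a, int b);

/* What the library would define. */
int
sketch_add(int a, int b)
{
   return a + b;
}
```

If the header and the library disagree about that switch, a static archive contains `sketch_add` while the caller asks for `__imp_sketch_add`, which gives the same kind of undefined reference as in the trace above.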

Cheers,


Re: [Mesa-dev] [PATCH v6 0/6] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

2017-07-11 Thread Gert Wollny
Hi all, 

I just wanted to send a ping to ask whether there might be a chance to
get this into the next stable release? 

I also noted today that this updated register merge algorithm fixed The
Witcher 2 for me on r600g (BARTS) 

many thanks, 
Gert


Am Dienstag, den 04.07.2017, 16:18 +0200 schrieb Gert Wollny:
> Dear Nicolai, 
> 
> this new version of the patch set should address all the
> comments you gave 
> for v5.
> 
> Changes are: 
> 
> - the components are now tracked individually and the life time of a
> temporary 
>   is evaluated by merging the life-times of their components, 
> - BRK/CONT are now handled differently, 
> - the final algorithm to evaluate the life-times was simplified, 
> - read and write in the same instruction is now considered to be
> always 
>   well defined
> - adherence to the coding style was improved, 
> - the case scope level is now below the according switch scope
> level, 
> - the new register merge method replaces the old version, i.e. no
> environment 
>   variables to switch between implementations. In theory, one could
> also remove 
>   the function get_last_temp_read_first_temp_write, but it is still
> used in 
>   some code in a #if 0 block, so I didn't touch it.  
> 
> In addition to your comments I applied these changes: 
> 
> - when compiled in debug mode and with the environment variable 
>   GLSL_TO_TGSI_RENAME_DEBUG specified the TGSI and resulting
> register 
>   lifetimes will be dumped to stderr.
> - unused registers are now ignored in the rename mapping evaluation
> - registers that are only read get a life-time {x,x}, with x the
> instruction 
>   line where the register is last read, so they can be merged.
> - the patch has been rebased against 7d7bcd65d (Date: Fri Jun 30
> 10:39:53 2017)
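
The per-component merging mentioned in the list above can be sketched as follows. This is not the code from the patch set: the `sketch_` names are hypothetical, instruction indices are plain ints, and the non-overlap test in `sketch_can_merge` is an assumption based on the remark that a read and a write in the same instruction are considered well defined.

```c
#include <assert.h>

/* Hypothetical lifetime: first write and last read, as instruction indices.
 * begin < 0 marks a component that is never used. */
struct sketch_lifetime {
   int begin;
   int end;
};

/* The lifetime of a temporary is the union of its components' lifetimes. */
static struct sketch_lifetime
sketch_merge_components(const struct sketch_lifetime comp[4])
{
   struct sketch_lifetime out = { -1, -1 };

   for (int i = 0; i < 4; i++) {
      if (comp[i].begin < 0)
         continue; /* unused component, ignore */
      if (out.begin < 0 || comp[i].begin < out.begin)
         out.begin = comp[i].begin;
      if (comp[i].end > out.end)
         out.end = comp[i].end;
   }
   return out;
}

/* Assumed rule: two temporaries may share a register when one is last read
 * no later than the instruction in which the other is first written. */
static int
sketch_can_merge(struct sketch_lifetime a, struct sketch_lifetime b)
{
   return a.end <= b.begin || b.end <= a.begin;
}
```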
> 
> The patches are also in  
>   https://github.com/gerddie/mesa/tree/regrename-v6
> 
> Additional notes: 
> * According to a user report against v5, the patch fixes #99349
> * The new register merge code doesn't have a measurable effect on the
> overall 
>   run-time of the shader-db.
> * There is actually not a single shader in the shader-db that
> requires 
>   per-component tracking of the temporaries, but I managed to write a
> shader 
>   that really needs it and I think I will submit this to piglit.
> 
> best regards, 
> Gert
> 
> 
> Gert Wollny (6):
>   mesa/st: glsl_to_tgsi move some helper classes to extra files
>   mesa/st: glsl_to_tgsi: implement new temporary register lifetime
> tracker
>   mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime
> tracker
>   mesa/st: glsl_to_tgsi: add register rename mapping evaluator
>   mesa/st: glsl_to_tgsi: Add test set for evaluation of rename
> mapping
>   mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
> 
>  configure.ac   |1 +
>  src/mesa/Makefile.am   |2 +-
>  src/mesa/Makefile.sources  |4 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  344 +
>  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp |  196 +++
>  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   |  167 +++
>  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 1066
> ++
>  .../state_tracker/st_glsl_to_tgsi_temprename.h |   67 +
>  src/mesa/state_tracker/tests/Makefile.am   |   36 +
>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 1556
> 
>  10 files changed, 3106 insertions(+), 333 deletions(-)
>  create mode 100644
> src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
>  create mode 100644
> src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
>  create mode 100644
> src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>  create mode 100644
> src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
> 


Re: [Mesa-dev] [PATCH] swr/rast: make SWR_VISIBLE attribute work for windows

2017-07-11 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak  

> On Jul 11, 2017, at 9:52 AM, Tim Rowley  wrote:
> 
> From: George Kyriazis 
> 
> Needed to expose SwrGetInterface
> ---
> src/gallium/drivers/swr/rasterizer/common/os.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h 
> b/src/gallium/drivers/swr/rasterizer/common/os.h
> index 8657709..a16f577 100644
> --- a/src/gallium/drivers/swr/rasterizer/common/os.h
> +++ b/src/gallium/drivers/swr/rasterizer/common/os.h
> @@ -30,7 +30,7 @@
> #if (defined(FORCE_WINDOWS) || defined(_WIN32)) && !defined(FORCE_LINUX)
> 
> #define SWR_API __cdecl
> -#define SWR_VISIBLE
> +#define SWR_VISIBLE  __declspec(dllexport)
> 
> #ifndef NOMINMAX
> #define NOMINMAX
> -- 
> 2.7.4
> 



Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Emil Velikov
On 11 July 2017 at 16:23, Tomasz Figa  wrote:
> Hi Zhongmin,
>
> On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin  wrote:
>> By the way,
>>
>> For cancelBuffer, sorry I forget such function, thanks for notice. It should 
>> also pass the same fence fd as the queuebuffer.
>>
>> And Yogesh, you mentioned the gallium,   is it another platform supported by 
>> mesa ?  I am sorry I have no idea about this,  could you please help to 
>> check this ?
>>
>> I think we can co-work with mesa team to work out an acceptable fix which 
>> can meet the requirement of Android without any break on other platforms.
>
> One thing needs clarifying here. Release fences from EGL are _not_ a
> requirement. It is an optional feature. Android compliance suites pass
> fully without Android sync fence support in Mesa at all.
>
> Other than that, it's been taking long enough and I agree that we
> should finally wire both acquire and release fence support in EGL and
> related drivers. Otherwise we can forget about getting good user
> experience on Android.
>
Right, I'm not trying to say otherwise.

The strange part, IMHO, is that now flatland has a hard requirement on
both fences, where the [developer-side of the] documentation does not
say anything about this.
This sounds a bit backwards. I believe documentation update is in order?

FWIW I was under the impression that EGL_ANDROID_native_fence_sync can
be used in flatland. Although as Rob mentioned... not sure if the
extension is available since the EGL meta seems to block/strip it out.


> On a technical side, the EGL change needs to take into account that
> not all drivers support fences and so it needs to have a fallback to
> old behavior for those which don't.
>

> Other than that, correct me if I'm wrong, but could we just use the
> DRI2 fence extension instead of adding some custom callbacks? I can
> see that a normal client request to create a sync fence would end up
> calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
> Could we do the same?
>
Reusing existing API would be ideal.

If not, Zhongmin/Yogesh please note:
 - when extending the interface, the version number must be bumped
 - user should check the version and the function pointer prior to
use, falling back to the old scheme
 - get_retrive_fd [barring the typo - retrieve], should have at least
the fd ownership documented
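
The three points above can be sketched roughly as follows. Assumptions: the struct, the version number and the function names are hypothetical and are not the actual DRI interface, and -1 is used here to mean "no fence, use the old scheme".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extension vtable; bump 'version' whenever a hook is added. */
struct sketch_fence_extension {
   int version;

   /* Added in (hypothetical) version 2.
    * Returns a file descriptor that the caller owns and must close,
    * or -1 if no fd is available. */
   int (*get_fence_fd)(void *fence);
};

/* Check both the version and the pointer before calling the new hook. */
static int
sketch_get_fence_fd(const struct sketch_fence_extension *ext, void *fence)
{
   if (ext && ext->version >= 2 && ext->get_fence_fd)
      return ext->get_fence_fd(fence);

   return -1; /* driver predates the hook: fall back to the old scheme */
}

/* Fake driver hook, present only to exercise the sketch. */
static int
sketch_fake_get_fence_fd(void *fence)
{
   (void)fence;
   return 7;
}
```

Checking the pointer as well as the version covers drivers that advertise the newer version but leave the hook unimplemented.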

Thanks
Emil


Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Marathe, Yogesh
> -Original Message-
> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf
> Of Tomasz Figa
> Sent: Tuesday, July 11, 2017 9:59 PM
> To: Marathe, Yogesh 
> Cc: Gao, Shuo ; Liu, Zhiquan ;
> Kondapally, Kalyan ; Chad Versace
> ; Eric Engestrom ; Emil
> Velikov ; Wu, Zhongmin
> ; Kenneth Graunke ; Rob
> Clark ; Widawsky, Benjamin
> ; ML mesa-dev  d...@lists.freedesktop.org>; Kristian H . Kristensen
> ; Timothy Arceri 
> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965:
> Queue the buffer with a sync fence for Android OS
> 
> Hi Yogesh,
> 
> On Wed, Jul 12, 2017 at 1:09 AM, Marathe, Yogesh
>  wrote:
> > Hello Figa, Few caveats on that approach
> 
> (I'm Tomasz, by the way)

My bad Tomasz.

> >> Queue the buffer with a sync fence for Android OS
> >>
> >> Now for real, sorry guys... (Seriously gmail why you do this to me.)
> >>
> >> -chuanbo.w...@intel.com (bounces)
> >> +"Gao, Shuo"  (forgot to add originally, sorry)
> >>
> >> On Wed, Jul 12, 2017 at 12:23 AM, Tomasz Figa 
> wrote:
> >> > Hi Zhongmin,
> >> >
> >> > On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin
> >> > 
> >> wrote:
> >> >> By the way,
> >> >>
> >> >> For cancelBuffer, sorry I forget such function, thanks for notice.
> >> >> It should also pass the same fence fd as the queuebuffer.
> >> >>
> >> >> And Yogesh, you mentioned the gallium, is it another platform
> >> >> supported by mesa? I am sorry I have no idea about this, could you
> >> >> please help to check this?
> >> >>
> >> >> I think we can co-work with mesa team to work out an acceptable
> >> >> fix which can meet the requirement of Android without any break on
> >> >> other platforms.
> >> >
> >> > One thing needs clarifying here. Release fences from EGL are _not_
> >> > a requirement. It is an optional feature. Android compliance suites
> >> > pass fully without Android sync fence support in Mesa at all.
> >> >
> >> > Other than that, it's been taking long enough and I agree that we
> >> > should finally wire both acquire and release fence support in EGL
> >> > and related drivers. Otherwise we can forget about getting good
> >> > user experience on Android.
> >> >
> >> > On a technical side, the EGL change needs to take into account that
> >> > not all drivers support fences and so it needs to have a fallback
> >> > to old behavior for those which don't.
> >> >
> >> > Other than that, correct me if I'm wrong, but could we just use the
> >> > DRI2 fence extension instead of adding some custom callbacks? I can
> >> > see that a normal client request to create a sync fence would end
> >> > up calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
> >> > Could we do the same?
> >
> > May be here we need to look at complete sequence eglCreateSyncKHR ->
> > _eglCreateSync -> dri2_create_sync, as eglCreateSyncKHR  seems entry
> > point and its doing attrib/type checks before reaching
> > dri2_create_sync(). Also, dri2_create_sync is static, can't be called
> > directly, there needs to be an entry point / interface.
> >
> 
> I think you misunderstood my suggestion. I didn't mean dri2_create_sync(), but
> rather using the DRI2 fence extension directly, just as dri2_create_sync() 
> does.
> You can access dri2_egl_display from Android EGL code and in fact it already
> uses other extensions like this.

Sorry, I'm still searching around. To try this out, can you please specify
which functions you meant by the DRI2 fence extension? An example within EGL
code would help. Please note that we ultimately need to call these from
platform_android.c.

> 
> Best regards,
> Tomasz
> 
> >> >
> >> > [1]
> >> > https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
> >> >
> >> > + Kristian, Chad and Dominik who have been looking into sync fence
> >> > integration with our EGL drivers.
> >> >
> >> > Best regards,
> >> > Tomasz
> >> >
> >> >>
> >> >> -Original Message-
> >> >> From: Wu, Zhongmin
> >> >> Sent: Tuesday, July 11, 2017 8:40
> >> >> To: 'Emil Velikov' ; Marathe, Yogesh
> >> >> 
> >> >> Cc: Widawsky, Benjamin ; Liu, Zhiquan
> >> >> ; Eric Engestrom ; Rob
> >> >> Clark ; Tomasz Figa
> >> >> ; Kenneth Graunke ;
> >> >> Kondapally, Kalyan ; ML mesa-dev
> >> >> ; Timothy Arceri
> >> >> ; Chuanbo Weng 

[Mesa-dev] [PATCH] st/mesa: add a winsys buffers list in st_context

2017-07-11 Thread Charmaine Lee
Commit a5e733c6b52e93de3000647d075f5ca2f55fcb71 fixes the dangling
framebuffer object by unreferencing the window system draw/read buffers
when the context is released. However, this can prematurely destroy the
resources associated with these window system buffers. The problem is
reproducible with Turbine Demo running with VMware driver. In this case,
the depth buffer content was lost when the context is rebound to a
drawable.

To prevent premature destruction of the resources associated with
window system buffers, this patch maintains a list of these buffers in
the context, making sure the reference counts of these buffers will not
reach zero until the associated framebuffer interface objects no
longer exist. This also helps to avoid unnecessary destruction and
re-construction of the resources associated with the framebuffer.

Fixes VMware bug 1909807.
---
 src/gallium/include/state_tracker/st_api.h|  5 +++
 src/gallium/state_trackers/dri/dri_drawable.c |  4 ++
 src/gallium/state_trackers/wgl/stw_st.c   |  4 +-
 src/mesa/state_tracker/st_context.c   | 22 ++
 src/mesa/state_tracker/st_context.h   |  7 
 src/mesa/state_tracker/st_manager.c   | 59 ++-
 src/mesa/state_tracker/st_manager.h   |  4 ++
 7 files changed, 94 insertions(+), 11 deletions(-)

diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h
index d641092..3fd5f01 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -311,6 +311,11 @@ struct st_framebuffer_iface
int32_t stamp;
 
/**
+* Identifier that uniquely identifies the framebuffer interface object.
+*/
+   uint32_t ID;
+
+   /**
 * Available for the state tracker manager to use.
 */
void *st_manager_private;
diff --git a/src/gallium/state_trackers/dri/dri_drawable.c b/src/gallium/state_trackers/dri/dri_drawable.c
index 3c2e307..0cfdc30 100644
--- a/src/gallium/state_trackers/dri/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/dri_drawable.c
@@ -38,6 +38,8 @@
 #include "util/u_memory.h"
 #include "util/u_inlines.h"
 
+static uint32_t drifb_ID = 0;
+
 static void
 swap_fences_unref(struct dri_drawable *draw);
 
@@ -155,6 +157,7 @@ dri_create_buffer(__DRIscreen * sPriv,
 
dPriv->driverPrivate = (void *)drawable;
p_atomic_set(&drawable->base.stamp, 1);
+   drawable->base.ID = p_atomic_inc_return(&drifb_ID);
 
return GL_TRUE;
 fail:
@@ -177,6 +180,7 @@ dri_destroy_buffer(__DRIdrawable * dPriv)
 
swap_fences_unref(drawable);
 
+   drawable->base.ID = 0;
FREE(drawable);
 }
 
diff --git a/src/gallium/state_trackers/wgl/stw_st.c b/src/gallium/state_trackers/wgl/stw_st.c
index 7806a2a..c2844b0 100644
--- a/src/gallium/state_trackers/wgl/stw_st.c
+++ b/src/gallium/state_trackers/wgl/stw_st.c
@@ -46,7 +46,7 @@ struct stw_st_framebuffer {
unsigned texture_mask;
 };
 
-
+static uint32_t stwfb_ID = 0;
 
 /**
  * Is the given mutex held by the calling thread?
@@ -234,6 +234,7 @@ stw_st_create_framebuffer(struct stw_framebuffer *fb)
 
stwfb->fb = fb;
stwfb->stvis = fb->pfi->stvis;
+   stwfb->base.ID = p_atomic_inc_return(&stwfb_ID);
 
stwfb->base.visual = &stwfb->stvis;
p_atomic_set(&stwfb->base.stamp, 1);
@@ -255,6 +256,7 @@ stw_st_destroy_framebuffer_locked(struct st_framebuffer_iface *stfb)
for (i = 0; i < ST_ATTACHMENT_COUNT; i++)
   pipe_resource_reference(&stwfb->textures[i], NULL);
 
+   stwfb->base.ID = 0;
FREE(stwfb);
 }
 
diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
index f535139..fb0182f 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -38,6 +38,7 @@
 #include "program/prog_cache.h"
 #include "vbo/vbo.h"
 #include "glapi/glapi.h"
+#include "st_manager.h"
 #include "st_context.h"
 #include "st_debug.h"
 #include "st_cb_bitmap.h"
@@ -571,6 +572,9 @@ struct st_context *st_create_context(gl_api api, struct pipe_context *pipe,
   _mesa_destroy_context(ctx);
}
 
+   /* Initialize context's winsys buffers list */
+   LIST_INITHEAD(&st->winsys_buffers);
+
return st;
 }
 
@@ -591,6 +595,19 @@ destroy_tex_sampler_cb(GLuint id, void *data, void *userData)
 void st_destroy_context( struct st_context *st )
 {
struct gl_context *ctx = st->ctx;
+   struct st_framebuffer *stfb, *next;
+
+   GET_CURRENT_CONTEXT(curctx);
+   if (curctx == NULL) {
+  boolean ret;
+
+  /* No current context, but we need one to release
+   * renderbuffer surface when we release framebuffer.
+   * So temporarily bind the context.
+   */
+  ret = _mesa_make_current(ctx, NULL, NULL);
+  (void) ret;
+   }
 
/* This must be called first so that glthread has a chance to finish */
_mesa_glthread_destroy(ctx);
@@ -604,6 +621,11 @@ void st_destroy_context( struct st_context *st )
st_reference_prog(st, &st->tep, NULL);
st_reference_compprog(st, &st->cp, NULL);
 
+   /* release framebuffer in the 

Re: [Mesa-dev] [PATCH 1/2] egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support

2017-07-11 Thread Rob Herring
On Tue, Jul 11, 2017 at 9:34 AM, Tomasz Figa  wrote:
> On Tue, Jul 11, 2017 at 11:16 PM, Rob Herring  wrote:
>> On Tue, Jul 11, 2017 at 8:27 AM, Emil Velikov  wrote:
>>> From: Emil Velikov 
>>>
>>> As said in the EGL_KHR_platform_android extensions
>>>
>>> For each EGLConfig that belongs to the Android platform, the
>>> EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
>>> WINDOW_FORMAT_RGBA_8888.
>>>
>>> Although it should be applicable overall.
>>>
>>> Even though we use HAL_PIXEL_FORMAT here, those are numerically
>>> identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.
>>>
>>> Barring the said format of course. That one is only listed in HAL.
>>>
>>> Keep in mind that even if we try to use the said format, you'll get
>>> caught by droid_create_surface(). The function compares the format of
>>> the underlying window, against the NATIVE_VISUAL_ID of the config.
>>>
>>> Unfortunately it only prints a warning, rather than error out, likely
>>> leading to visual corruption.
>>>
>>> While SDL will even call ANativeWindow_setBuffersGeometry() with the
>>> wrong format, and conveniently ignore the [expected] failure.
>>>
>>> Cc: mesa-sta...@lists.freedesktop.org
>>> Cc: Chad Versace 
>>> Cc: Tomasz Figa 
>>> Signed-off-by: Emil Velikov 
>>> ---
>>> I'm about 99.99% sure the above is correct, but I haven't tested it.
>>
>> Isn't this going to break if there's no driver support for RGBA/RGBX
>> which is the case for stable (and master for gallium drvs).
>
> First of all, Android hardcodes HAL_PIXEL_FORMAT_RGBA_8888 in a number
> of places, which means that those users use a patched Android. However
> I'm not sure if we can just break them like this. I'll leave it to you
> guys, though.

Yes, patched to work around mesa's lack of RGBA/X support. Not sure
why they went this route. Maybe RGBA/X support in mesa was attempted
before.

> Other than that, CTS seems to require only RGBA_8888 and RGB_565, so
> this change is not going to affect compliance with unpatched Android.

Okay, good to know.

Rob


Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Tomasz Figa
Hi Yogesh,

On Wed, Jul 12, 2017 at 1:09 AM, Marathe, Yogesh
 wrote:
> Hello Figa, Few caveats on that approach

(I'm Tomasz, by the way)

>
>> -Original Message-
>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf
>> Of Tomasz Figa
>> Sent: Tuesday, July 11, 2017 8:58 PM
>> To: Wu, Zhongmin 
>> Cc: Gao, Shuo ; Liu, Zhiquan ;
>> ML mesa-dev ; Emil Velikov
>> ; Chad Versace ; Eric
>> Engestrom ; Marathe, Yogesh
>> ; Rob Clark ; Kenneth
>> Graunke ; Widawsky, Benjamin
>> ; Kondapally, Kalyan
>> ; Kristian H . Kristensen
>> ; Timothy Arceri 
>> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965:
>> Queue the buffer with a sync fence for Android OS
>>
>> Now for real, sorry guys... (Seriously gmail why you do this to me.)
>>
>> -chuanbo.w...@intel.com (bounces)
>> +"Gao, Shuo"  (forgot to add originally, sorry)
>>
>> On Wed, Jul 12, 2017 at 12:23 AM, Tomasz Figa  wrote:
>> > Hi Zhongmin,
>> >
>> > On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin 
>> wrote:
>> >> By the way,
>> >>
>> >> For cancelBuffer, sorry I forget such function, thanks for notice. It
>> >> should also pass the same fence fd as the queuebuffer.
>> >>
>> >> And Yogesh, you mentioned the gallium, is it another platform supported
>> >> by mesa? I am sorry I have no idea about this, could you please help to
>> >> check this?
>> >>
>> >> I think we can co-work with mesa team to work out an acceptable fix which
>> >> can meet the requirement of Android without any break on other platforms.
>> >
>> > One thing needs clarifying here. Release fences from EGL are _not_ a
>> > requirement. It is an optional feature. Android compliance suites pass
>> > fully without Android sync fence support in Mesa at all.
>> >
>> > Other than that, it's been taking long enough and I agree that we
>> > should finally wire both acquire and release fence support in EGL and
>> > related drivers. Otherwise we can forget about getting good user
>> > experience on Android.
>> >
>> > On a technical side, the EGL change needs to take into account that
>> > not all drivers support fences and so it needs to have a fallback to
>> > old behavior for those which don't.
>> >
>> > Other than that, correct me if I'm wrong, but could we just use the
>> > DRI2 fence extension instead of adding some custom callbacks? I can
>> > see that a normal client request to create a sync fence would end up
>> > calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
>> > Could we do the same?
>
> May be here we need to look at complete sequence
> eglCreateSyncKHR ->  _eglCreateSync -> dri2_create_sync,
> as eglCreateSyncKHR  seems entry point and its doing attrib/type checks
> before reaching  dri2_create_sync(). Also, dri2_create_sync is static,
> can't be called directly, there needs to be an entry point / interface.
>

I think you misunderstood my suggestion. I didn't mean
dri2_create_sync(), but rather using the DRI2 fence extension
directly, just as dri2_create_sync() does. You can access
dri2_egl_display from Android EGL code and in fact it already uses
other extensions like this.
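
A rough sketch of that idea, including the fallback for drivers without fence support, could look like the code below. Everything here is a simplified stand-in and an assumption for illustration: `struct mock_fence_ext`, `struct mock_display` and `mock_queue_fence_fd()` are hypothetical and only mimic the shape of `dri2_dpy->fence->create_fence_fd()`; the stub hooks return fixed values instead of real fds. The sketch also assumes that a fence fd of -1 is what gets passed along when no fence is available.

```c
#include <stddef.h>

/* Simplified stand-in for a DRI2-style fence extension table (hypothetical). */
struct mock_fence_ext {
   void *(*create_fence_fd)(void *ctx, int fd);
   int (*get_fence_fd)(void *screen, void *fence);
   void (*destroy_fence)(void *screen, void *fence);
};

/* Simplified stand-in for the EGL display; fence is NULL when the driver
 * does not expose the extension.
 */
struct mock_display {
   const struct mock_fence_ext *fence;
   void *screen;
};

/* Returns a fence fd to hand over together with the queued buffer, or -1
 * ("no fence") when the driver cannot provide one, which keeps the old
 * behavior for such drivers.
 */
static int
mock_queue_fence_fd(const struct mock_display *dpy, void *ctx)
{
   const struct mock_fence_ext *ext = dpy->fence;
   void *fence;
   int fd;

   if (ext == NULL || ext->create_fence_fd == NULL ||
       ext->get_fence_fd == NULL)
      return -1;

   /* -1 asks the (mock) driver to create a new fence, not import one. */
   fence = ext->create_fence_fd(ctx, -1);
   if (fence == NULL)
      return -1;

   fd = ext->get_fence_fd(dpy->screen, fence);
   if (ext->destroy_fence != NULL)
      ext->destroy_fence(dpy->screen, fence);

   return fd;
}

/* Stub driver hooks used only to exercise the sketch. */
static int mock_fence_storage;

static void *
mock_create(void *ctx, int fd)
{
   (void) ctx;
   (void) fd;
   return &mock_fence_storage;
}

static int
mock_get_fd(void *screen, void *fence)
{
   (void) screen;
   (void) fence;
   return 5;
}

static const struct mock_fence_ext mock_ext = { mock_create, mock_get_fd, NULL };
```

The point of the shape is that the platform code only looks at the extension table it already has access to, so no new custom callback is needed.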

Best regards,
Tomasz

>> >
>> > [1]
>> > https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
>> >
>> > + Kristian, Chad and Dominik who have been looking into sync fence
>> > integration with our EGL drivers.
>> >
>> > Best regards,
>> > Tomasz
>> >
>> >>
>> >> -Original Message-
>> >> From: Wu, Zhongmin
>> >> Sent: Tuesday, July 11, 2017 8:40
>> >> To: 'Emil Velikov' ; Marathe, Yogesh
>> >> 
>> >> Cc: Widawsky, Benjamin ; Liu, Zhiquan
>> >> ; Eric Engestrom ; Rob
>> >> Clark ; Tomasz Figa ;
>> >> Kenneth Graunke ; Kondapally, Kalyan
>> >> ; ML mesa-dev
>> >> ; Timothy Arceri
>> >> ; Chuanbo Weng 
>> >> Subject: RE: [Mesa-dev] [EGL android: accquire fence implementation]
>> >> i965: Queue the buffer with a sync fence for Android OS
>> >>
>> >> Hi Emil and Yogesh
>> >> Thank you for your comments,  and thanks Yogesh for giving the
>> >> detailed explanations
>> >>
>> >>
>> >> And according to the document of Android below
>> >> (https://source.android.com/devices/graphics/arch-bq-gralloc):
>> >>
>> >> Recent Android devices support the sync framework, which enables the
>> 

Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Marathe, Yogesh
Hello Figa, Few caveats on that approach

> -Original Message-
> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf
> Of Tomasz Figa
> Sent: Tuesday, July 11, 2017 8:58 PM
> To: Wu, Zhongmin 
> Cc: Gao, Shuo ; Liu, Zhiquan ;
> ML mesa-dev ; Emil Velikov
> ; Chad Versace ; Eric
> Engestrom ; Marathe, Yogesh
> ; Rob Clark ; Kenneth
> Graunke ; Widawsky, Benjamin
> ; Kondapally, Kalyan
> ; Kristian H . Kristensen
> ; Timothy Arceri 
> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965:
> Queue the buffer with a sync fence for Android OS
> 
> Now for real, sorry guys... (Seriously gmail why you do this to me.)
> 
> -chuanbo.w...@intel.com (bounces)
> +"Gao, Shuo"  (forgot to add originally, sorry)
> 
> On Wed, Jul 12, 2017 at 12:23 AM, Tomasz Figa  wrote:
> > Hi Zhongmin,
> >
> > On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin 
> wrote:
> >> By the way,
> >>
> >> For cancelBuffer, sorry I forget such function, thanks for notice. It
> >> should also pass the same fence fd as the queuebuffer.
> >>
> >> And Yogesh, you mentioned the gallium, is it another platform supported
> >> by mesa? I am sorry I have no idea about this, could you please help to
> >> check this?
> >>
> >> I think we can co-work with mesa team to work out an acceptable fix which
> >> can meet the requirement of Android without any break on other platforms.
> >
> > One thing needs clarifying here. Release fences from EGL are _not_ a
> > requirement. It is an optional feature. Android compliance suites pass
> > fully without Android sync fence support in Mesa at all.
> >
> > Other than that, it's been taking long enough and I agree that we
> > should finally wire both acquire and release fence support in EGL and
> > related drivers. Otherwise we can forget about getting good user
> > experience on Android.
> >
> > On a technical side, the EGL change needs to take into account that
> > not all drivers support fences and so it needs to have a fallback to
> > old behavior for those which don't.
> >
> > Other than that, correct me if I'm wrong, but could we just use the
> > DRI2 fence extension instead of adding some custom callbacks? I can
> > see that a normal client request to create a sync fence would end up
> > calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
> > Could we do the same?

Maybe here we need to look at the complete sequence
eglCreateSyncKHR -> _eglCreateSync -> dri2_create_sync,
as eglCreateSyncKHR seems to be the entry point and it does attrib/type checks
before reaching dri2_create_sync(). Also, dri2_create_sync is static, so it
can't be called directly; there needs to be an entry point / interface.

> >
> > [1]
> > https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
> >
> > + Kristian, Chad and Dominik who have been looking into sync fence
> > integration with our EGL drivers.
> >
> > Best regards,
> > Tomasz
> >
> >>
> >> -Original Message-
> >> From: Wu, Zhongmin
> >> Sent: Tuesday, July 11, 2017 8:40
> >> To: 'Emil Velikov' ; Marathe, Yogesh
> >> 
> >> Cc: Widawsky, Benjamin ; Liu, Zhiquan
> >> ; Eric Engestrom ; Rob
> >> Clark ; Tomasz Figa ;
> >> Kenneth Graunke ; Kondapally, Kalyan
> >> ; ML mesa-dev
> >> ; Timothy Arceri
> >> ; Chuanbo Weng 
> >> Subject: RE: [Mesa-dev] [EGL android: accquire fence implementation]
> >> i965: Queue the buffer with a sync fence for Android OS
> >>
> >> Hi Emil and Yogesh
> >> Thank you for your comments,  and thanks Yogesh for giving the
> >> detailed explanations
> >>
> >>
> >> And according to the document of Android below
> >> (https://source.android.com/devices/graphics/arch-bq-gralloc):
> >>
> >> Recent Android devices support the sync framework, which enables the
> >> system to do nifty things when combined with hardware components that
> >> can manipulate graphics data asynchronously. For example, a producer can
> >> submit a series of OpenGL ES drawing commands and then enqueue the
> >> output buffer before rendering completes. The buffer is accompanied by a
> >> fence that signals when the contents are ready.
> >>
> >>
> >> I think the things is very clear, that is if the rendering is completed
> >> already when we call queueBuffer() in mesa ?   If not, we should queue
> >> the buffer with a

[Mesa-dev] [PATCH] i965: Use VALGRIND_MAKE_MEM_x in place of MALLOCLIKE/FREELIKE

2017-07-11 Thread Chris Wilson
Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the
exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block
inaccessible, but still leaves it defined in its allocation tracker i.e.
it will report the mmap as lost despite the call to FREELIKE!

Instead of treating the mmap as an allocation, treat it as changing the
access bits of the memory, i.e. it becomes defined (because
the buffer objects always contain valid content from the user's
perspective) upon mmap and inaccessible upon munmap. This makes memcheck
happy without leaving it thinking there is a very large leak.

Finally for consistency, we treat all the mmap/munmap paths the same
even though valgrind can intercept the regular mmap used for GTT. We
could move this in the drm_mmap/drm_munmap macros, but that quickly
looks ugly given the desire for those to support different OSes, but I
didn't try that hard!

Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 34 +++---
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index d468011774..a73fea95b1 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -78,6 +78,16 @@
 #define VG(x)
 #endif
 
+/* VALGRIND_FREELIKE_BLOCK unfortunately does not actually undo the earlier
+ * VALGRIND_MALLOCLIKE_BLOCK but instead leaves vg convinced the memory is
+ * leaked. All because it does not call VG(cli_free) from its
+ * VG_USERREQ__FREELIKE_BLOCK handler. Instead of treating the memory like
+ * an allocation, we mark it available for use upon mmapping and remove
+ * it upon unmapping.
+ */
+#define VG_DEFINED(ptr, size) VG(VALGRIND_MAKE_MEM_DEFINED(ptr, size))
+#define VG_NOACCESS(ptr, size) VG(VALGRIND_MAKE_MEM_NOACCESS(ptr, size))
+
 #define memclear(s) memset(&s, 0, sizeof(s))
 
 #define FILE_DEBUG_FLAG DEBUG_BUFMGR
@@ -531,14 +541,15 @@ bo_free(struct brw_bo *bo)
struct hash_entry *entry;
 
if (bo->map_cpu) {
-  VG(VALGRIND_FREELIKE_BLOCK(bo->map_cpu, 0));
+  VG_NOACCESS(bo->map_cpu, bo->size);
   drm_munmap(bo->map_cpu, bo->size);
}
if (bo->map_wc) {
-  VG(VALGRIND_FREELIKE_BLOCK(bo->map_wc, 0));
+  VG_NOACCESS(bo->map_wc, bo->size);
   drm_munmap(bo->map_wc, bo->size);
}
if (bo->map_gtt) {
+  VG_NOACCESS(bo->map_gtt, bo->size);
   drm_munmap(bo->map_gtt, bo->size);
}
 
@@ -723,14 +734,16 @@ brw_bo_map_cpu(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
  __FILE__, __LINE__, bo->gem_handle, bo->name);
  return NULL;
   }
-  VG(VALGRIND_MALLOCLIKE_BLOCK(mmap_arg.addr_ptr, mmap_arg.size, 0, 1));
   map = (void *) (uintptr_t) mmap_arg.addr_ptr;
+  VG_DEFINED(map, bo->size);
 
   if (p_atomic_cmpxchg(&bo->map_cpu, NULL, map)) {
- VG(VALGRIND_FREELIKE_BLOCK(map, 0));
+ VG_NOACCESS(map, bo->size);
  drm_munmap(map, bo->size);
   }
}
+   assert(bo->map_cpu);
+
DBG("brw_bo_map_cpu: %d (%s) -> %p, ", bo->gem_handle, bo->name,
bo->map_cpu);
print_flags(flags);
@@ -765,9 +778,7 @@ brw_bo_map_gtt(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
  return NULL;
   }
 
-  /* and mmap it.  We don't need to use VALGRIND_MALLOCLIKE_BLOCK
-   * because Valgrind will already intercept this mmap call.
-   */
+  /* and mmap it. */
   map = drm_mmap(0, bo->size, PROT_READ | PROT_WRITE,
  MAP_SHARED, bufmgr->fd, mmap_arg.offset);
   if (map == MAP_FAILED) {
@@ -776,10 +787,19 @@ brw_bo_map_gtt(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
  return NULL;
   }
 
+  /* We don't need to use VALGRIND_MALLOCLIKE_BLOCK
+   * because Valgrind will already intercept this mmap call. However, for
+   * consistency between all the mmap paths, we mark the pointer as defined
+   * now and mark it as inaccessible afterwards.
+   */
+  VG_DEFINED(map, bo->size);
+
   if (p_atomic_cmpxchg(&bo->map_gtt, NULL, map)) {
+ VG_NOACCESS(map, bo->size);
  drm_munmap(map, bo->size);
   }
}
+   assert(bo->map_gtt);
 
DBG("bo_map_gtt: %d (%s) -> %p, ", bo->gem_handle, bo->name, bo->map_gtt);
print_flags(flags);
-- 
2.13.2



Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Tomasz Figa
-chuanbo.w...@intel.com (bounces)
+"Gao, Shuo"  (forgot to add originally, sorry)

On Wed, Jul 12, 2017 at 12:23 AM, Tomasz Figa  wrote:
> Hi Zhongmin,
>
> On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin  wrote:
>> By the way,
>>
>> For cancelBuffer, sorry I forget such function, thanks for notice. It should 
>> also pass the same fence fd as the queuebuffer.
>>
>> And Yogesh, you mentioned the gallium,   is it another platform supported by 
>> mesa ?  I am sorry I have no idea about this,  could you please help to 
>> check this ?
>>
>> I think we can co-work with mesa team to work out an acceptable fix which 
>> can meet the requirement of Android without any break on other platforms.
>
> One thing needs clarifying here. Release fences from EGL are _not_ a
> requirement. It is an optional feature. Android compliance suites pass
> fully without Android sync fence support in Mesa at all.
>
> Other than that, it's been taking long enough and I agree that we
> should finally wire both acquire and release fence support in EGL and
> related drivers. Otherwise we can forget about getting good user
> experience on Android.
>
> On a technical side, the EGL change needs to take into account that
> not all drivers support fences and so it needs to have a fallback to
> old behavior for those which don't.
>
> Other than that, correct me if I'm wrong, but could we just use the
> DRI2 fence extension instead of adding some custom callbacks? I can
> see that a normal client request to create a sync fence would end up
> calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
> Could we do the same?
>
> [1] 
> https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
>
> + Kristian, Chad and Dominik who have been looking into sync fence
> integration with our EGL drivers.
>
> Best regards,
> Tomasz
>
>>
>> -Original Message-
>> From: Wu, Zhongmin
>> Sent: Tuesday, July 11, 2017 8:40
>> To: 'Emil Velikov' ; Marathe, Yogesh 
>> 
>> Cc: Widawsky, Benjamin ; Liu, Zhiquan 
>> ; Eric Engestrom ; Rob Clark 
>> ; Tomasz Figa ; Kenneth 
>> Graunke ; Kondapally, Kalyan 
>> ; ML mesa-dev ; 
>> Timothy Arceri ; Chuanbo Weng 
>> Subject: RE: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
>> Queue the buffer with a sync fence for Android OS
>>
>> Hi Emil and Yogesh
>> Thank you for your comments,  and thanks Yogesh for giving the detailed 
>> explanations
>>
>>
>> And according to the document of Android below 
>> (https://source.android.com/devices/graphics/arch-bq-gralloc):
>>
>> Recent Android devices support the sync framework, which enables the system 
>> to do nifty things when combined with hardware components that can 
>> manipulate graphics data asynchronously. For example, a producer can submit 
>> a series of OpenGL ES drawing commands and then enqueue the output buffer 
>> before rendering completes. The buffer is accompanied by a fence that 
>> signals when the contents are ready.
>>
>>
>> I think the things is very clear, that is if the rendering is completed 
>> already when we call queueBuffer() in mesa ?   If not, we should queue the 
>> buffer with a fence which will be signaled when the buffer is ready.
>>
>>
>>
>> -Original Message-
>> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
>> Sent: Tuesday, July 11, 2017 1:18
>> To: Marathe, Yogesh 
>> Cc: Wu, Zhongmin ; Widawsky, Benjamin 
>> ; Liu, Zhiquan ; Eric 
>> Engestrom ; Rob Clark ; Tomasz 
>> Figa ; Kenneth Graunke ; 
>> Kondapally, Kalyan ; ML mesa-dev 
>> ; Timothy Arceri ; 
>> Chuanbo Weng 
>> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
>> Queue the buffer with a sync fence for Android OS
>>
>> On 10 July 2017 at 15:26, Marathe, Yogesh  wrote:
>>> Hello Emil, My two cents since I too spent some time on this.
>>>
>>>> -Original Message-
>>>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
>>>> Behalf Of Emil Velikov
>>>> Sent: Monday, July 10, 2017 4:41 PM
>>>> To: Wu, Zhongmin 
>>>> Cc: Widawsky, Benjamin ; Liu, Zhiquan
>>>> ; Eric Engestrom ; Rob
>>>> Clark ; Tomasz Figa ;
>>>> Kenneth Graunke ; 

Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Tomasz Figa
Now for real, sorry guys... (Seriously gmail why you do this to me.)

-chuanbo.w...@intel.com (bounces)
+"Gao, Shuo"  (forgot to add originally, sorry)

On Wed, Jul 12, 2017 at 12:23 AM, Tomasz Figa  wrote:
> Hi Zhongmin,
>
> On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin  wrote:
>> By the way,
>>
>> For cancelBuffer, sorry I forget such function, thanks for notice. It should 
>> also pass the same fence fd as the queuebuffer.
>>
>> And Yogesh, you mentioned the gallium,   is it another platform supported by 
>> mesa ?  I am sorry I have no idea about this,  could you please help to 
>> check this ?
>>
>> I think we can co-work with mesa team to work out an acceptable fix which 
>> can meet the requirement of Android without any break on other platforms.
>
> One thing needs clarifying here. Release fences from EGL are _not_ a
> requirement. It is an optional feature. Android compliance suites pass
> fully without Android sync fence support in Mesa at all.
>
> Other than that, it's been taking long enough and I agree that we
> should finally wire both acquire and release fence support in EGL and
> related drivers. Otherwise we can forget about getting good user
> experience on Android.
>
> On a technical side, the EGL change needs to take into account that
> not all drivers support fences and so it needs to have a fallback to
> old behavior for those which don't.
>
> Other than that, correct me if I'm wrong, but could we just use the
> DRI2 fence extension instead of adding some custom callbacks? I can
> see that a normal client request to create a sync fence would end up
> calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
> Could we do the same?
>
> [1] 
> https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
>
> + Kristian, Chad and Dominik who have been looking into sync fence
> integration with our EGL drivers.
>
> Best regards,
> Tomasz
>
>>
>> -Original Message-
>> From: Wu, Zhongmin
>> Sent: Tuesday, July 11, 2017 8:40
>> To: 'Emil Velikov' ; Marathe, Yogesh 
>> 
>> Cc: Widawsky, Benjamin ; Liu, Zhiquan 
>> ; Eric Engestrom ; Rob Clark 
>> ; Tomasz Figa ; Kenneth 
>> Graunke ; Kondapally, Kalyan 
>> ; ML mesa-dev ; 
>> Timothy Arceri ; Chuanbo Weng 
>> Subject: RE: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
>> Queue the buffer with a sync fence for Android OS
>>
>> Hi Emil and Yogesh
>> Thank you for your comments,  and thanks Yogesh for giving the detailed 
>> explanations
>>
>>
>> And according to the document of Android below 
>> (https://source.android.com/devices/graphics/arch-bq-gralloc):
>>
>> Recent Android devices support the sync framework, which enables the system 
>> to do nifty things when combined with hardware components that can 
>> manipulate graphics data asynchronously. For example, a producer can submit 
>> a series of OpenGL ES drawing commands and then enqueue the output buffer 
>> before rendering completes. The buffer is accompanied by a fence that 
>> signals when the contents are ready.
>>
>>
>> I think the things is very clear, that is if the rendering is completed 
>> already when we call queueBuffer() in mesa ?   If not, we should queue the 
>> buffer with a fence which will be signaled when the buffer is ready.
>>
>>
>>
>> -Original Message-
>> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
>> Sent: Tuesday, July 11, 2017 1:18
>> To: Marathe, Yogesh 
>> Cc: Wu, Zhongmin ; Widawsky, Benjamin 
>> ; Liu, Zhiquan ; Eric 
>> Engestrom ; Rob Clark ; Tomasz 
>> Figa ; Kenneth Graunke ; 
>> Kondapally, Kalyan ; ML mesa-dev 
>> ; Timothy Arceri ; 
>> Chuanbo Weng 
>> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
>> Queue the buffer with a sync fence for Android OS
>>
>> On 10 July 2017 at 15:26, Marathe, Yogesh  wrote:
>>> Hello Emil, My two cents since I too spent some time on this.
>>>
>>>> -Original Message-
>>>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
>>>> Behalf Of Emil Velikov
>>>> Sent: Monday, July 10, 2017 4:41 PM
>>>> To: Wu, Zhongmin 
>>>> Cc: Widawsky, Benjamin ; Liu, Zhiquan
>>>> ; Eric Engestrom ; Rob
>>>> Clark ; Tomasz Figa 

Re: [Mesa-dev] [PATCH 07/10] util/u_queue: add an option to resize the queue when it's full

2017-07-11 Thread Marek Olšák
On Tue, Jul 11, 2017 at 12:05 PM, Grazvydas Ignotas  wrote:
> On Tue, Jul 11, 2017 at 12:21 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> Consider the following situation:
>>   mtx_lock(mutex);
>>   do_something();
>>   util_queue_add_job(...);
>>   mtx_unlock(mutex);
>>
>> If the queue is full, util_queue_add_job will wait for a free slot.
>> If the job which is currently being executed tries to lock the mutex,
>> it will be stuck forever, because util_queue_add_job is stuck.
>>
>> The deadlock can be trivially resolved by increasing the queue size
>> (reallocating the queue) in util_queue_add_job if the queue is full.
>> Then util_queue_add_job becomes wait-free.
>>
>> radeonsi will use it.
>
> Can't this cause the queue to grow uncontrollably, like on GPU hangs,
> making already difficult to debug situations worse? Perhaps
> util_queue_add_job() could have a non-blocking-fail option and the
> caller could then retry after releasing the mutex for a bit.

The thing with GPU hangs is that the driver is unable to continue its
operation and will be stuck one way or another.

The caller can't release the mutex, because it has done an operation
(do_something() above) that must be done together with
util_queue_add_job and can't be separated. The atomicity of command
submission starts with the first mtx_lock call. Things are
irreversible after do_something(). The only two possible outcomes are
that util_queue_add_job either succeeds or waits and then succeeds.
There is no other option.
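
To make the "grow instead of wait" idea concrete, here is a minimal sketch of a ring buffer that doubles its storage when it is full. It is not Mesa's util_queue: the names (job_ring, ring_add, ring_pop) are hypothetical, jobs are plain ints, and locking and worker threads are left out entirely, so it only illustrates the reallocation step.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-threaded ring buffer; not Mesa's util_queue. */
struct job_ring {
   int *jobs;          /* job payloads, plain ints for illustration */
   unsigned max_jobs;  /* current capacity */
   unsigned num_jobs;  /* number of queued jobs */
   unsigned read_idx;  /* index of the oldest job */
   unsigned write_idx; /* index of the next free slot */
};

/* Initialise the ring; the initial capacity must be non-zero. */
static int ring_init(struct job_ring *r, unsigned max_jobs)
{
   memset(r, 0, sizeof(*r));
   if (max_jobs == 0)
      return -1;
   r->jobs = calloc(max_jobs, sizeof(*r->jobs));
   if (!r->jobs)
      return -1;
   r->max_jobs = max_jobs;
   return 0;
}

/* Add a job. When the ring is full, double its size instead of waiting. */
static int ring_add(struct job_ring *r, int job)
{
   if (r->num_jobs == r->max_jobs) {
      unsigned new_max = r->max_jobs * 2;
      int *jobs = calloc(new_max, sizeof(*jobs));
      if (!jobs)
         return -1;

      /* Copy the queued jobs in FIFO order to the start of the new array. */
      for (unsigned i = 0; i < r->num_jobs; i++)
         jobs[i] = r->jobs[(r->read_idx + i) % r->max_jobs];

      free(r->jobs);
      r->jobs = jobs;
      r->read_idx = 0;
      r->write_idx = r->num_jobs;
      r->max_jobs = new_max;
   }

   r->jobs[r->write_idx] = job;
   r->write_idx = (r->write_idx + 1) % r->max_jobs;
   r->num_jobs++;
   return 0;
}

/* Remove the oldest job; returns -1 if the ring is empty. */
static int ring_pop(struct job_ring *r, int *job)
{
   if (r->num_jobs == 0)
      return -1;
   *job = r->jobs[r->read_idx];
   r->read_idx = (r->read_idx + 1) % r->max_jobs;
   r->num_jobs--;
   return 0;
}
```

With this shape, the only way ring_add can fail is an allocation failure, which is also why the queue can grow without bound if nothing ever drains it.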

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-11 Thread Tomasz Figa
Hi Zhongmin,

On Tue, Jul 11, 2017 at 11:02 AM, Wu, Zhongmin  wrote:
> By the way,
>
> For cancelBuffer, sorry, I forgot that function; thanks for pointing it out. 
> It should also pass the same fence fd as queueBuffer.
>
> And Yogesh, you mentioned Gallium; is it another platform supported by 
> Mesa? I am sorry, I have no idea about this; could you please help to check 
> this?
>
> I think we can work with the Mesa team on an acceptable fix which can 
> meet the requirements of Android without breaking other platforms.

One thing needs clarifying here. Release fences from EGL are _not_ a
requirement. It is an optional feature. Android compliance suites pass
fully without Android sync fence support in Mesa at all.

Other than that, it's been taking long enough and I agree that we
should finally wire both acquire and release fence support in EGL and
related drivers. Otherwise we can forget about getting good user
experience on Android.

On the technical side, the EGL change needs to take into account that
not all drivers support fences, so it needs a fallback to the
old behavior for those which don't.

Other than that, correct me if I'm wrong, but could we just use the
DRI2 fence extension instead of adding some custom callbacks? I can
see that a normal client request to create a sync fence would end up
calling dri2_dpy->fence->create_fence_fd() (if it's present) [1].
Could we do the same?

[1] 
https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/egl_dri2.c#n2772
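
A rough sketch of the fallback I have in mind, under stated assumptions: struct fence_ops and get_queue_fence_fd() are hypothetical stand-ins and not the real DRI2 fence extension (whose create_fence_fd has a different signature), and fake_create_fence_fd() is only a stub for illustration. The one fact relied on is that Android's queueBuffer accepts -1 as "no fence".

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver's optional fence interface. */
struct fence_ops {
   /* Returns a sync fence fd, or -1 on failure. NULL if unsupported. */
   int (*create_fence_fd)(void *ctx);
};

/* Stub driver hook, for illustration only. */
static int fake_create_fence_fd(void *ctx)
{
   (void)ctx;
   return 42;
}

/*
 * Pick the fd to pass along when queuing a buffer.
 * Returns -1 ("no fence") when the driver has no fence support or the
 * fence could not be created. In that case *needs_cpu_wait is set, and
 * the caller has to make sure rendering finished before queuing, which
 * is the old behavior.
 */
static int get_queue_fence_fd(const struct fence_ops *ops, void *ctx,
                              int *needs_cpu_wait)
{
   int fd = -1;

   if (ops && ops->create_fence_fd)
      fd = ops->create_fence_fd(ctx);

   *needs_cpu_wait = (fd < 0);
   return fd;
}
```

The point of returning both values is that drivers without fences keep working unchanged, while drivers with fences skip the CPU wait.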

+ Kristian, Chad and Dominik who have been looking into sync fence
integration with our EGL drivers.

Best regards,
Tomasz

>
> -Original Message-
> From: Wu, Zhongmin
> Sent: Tuesday, July 11, 2017 8:40
> To: 'Emil Velikov' ; Marathe, Yogesh 
> 
> Cc: Widawsky, Benjamin ; Liu, Zhiquan 
> ; Eric Engestrom ; Rob Clark 
> ; Tomasz Figa ; Kenneth Graunke 
> ; Kondapally, Kalyan ; ML 
> mesa-dev ; Timothy Arceri 
> ; Chuanbo Weng 
> Subject: RE: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
> Queue the buffer with a sync fence for Android OS
>
> Hi Emil and Yogesh
> Thank you for your comments,  and thanks Yogesh for giving the detailed 
> explanations
>
>
> And according to the document of Android below 
> (https://source.android.com/devices/graphics/arch-bq-gralloc):
>
> Recent Android devices support the sync framework, which enables the system 
> to do nifty things when combined with hardware components that can manipulate 
> graphics data asynchronously. For example, a producer can submit a series of 
> OpenGL ES drawing commands and then enqueue the output buffer before 
> rendering completes. The buffer is accompanied by a fence that signals when 
> the contents are ready.
>
>
> I think the question is clear: has rendering already completed when we 
> call queueBuffer() in Mesa? If not, we should queue the buffer with a 
> fence which will be signaled when the buffer is ready.
>
>
>
> -Original Message-
> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> Sent: Tuesday, July 11, 2017 1:18
> To: Marathe, Yogesh 
> Cc: Wu, Zhongmin ; Widawsky, Benjamin 
> ; Liu, Zhiquan ; Eric 
> Engestrom ; Rob Clark ; Tomasz 
> Figa ; Kenneth Graunke ; 
> Kondapally, Kalyan ; ML mesa-dev 
> ; Timothy Arceri ; 
> Chuanbo Weng 
> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965: 
> Queue the buffer with a sync fence for Android OS
>
> On 10 July 2017 at 15:26, Marathe, Yogesh  wrote:
>> Hello Emil, My two cents since I too spent some time on this.
>>
>>> -Original Message-
>>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
>>> Behalf Of Emil Velikov
>>> Sent: Monday, July 10, 2017 4:41 PM
>>> To: Wu, Zhongmin 
>>> Cc: Widawsky, Benjamin ; Liu, Zhiquan
>>> ; Eric Engestrom ; Rob
>>> Clark ; Tomasz Figa ;
>>> Kenneth Graunke ; Kondapally, Kalyan
>>> ; ML mesa-dev >> d...@lists.freedesktop.org>; Timothy Arceri ;
>>> Chuanbo Weng 
>>> Subject: Re: [Mesa-dev] [EGL android: accquire fence implementation] i965:
>>> Queue the buffer with a sync fence for Android 

Re: [Mesa-dev] [PATCH] anv: Stop leaking the no_aux sampler surface state

2017-07-11 Thread Lionel Landwerlin

Well spotted!

Reviewed-by: Lionel Landwerlin 

On 11/07/17 16:14, Jason Ekstrand wrote:

Cc: mesa-sta...@lists.freedesktop.org
---
  src/intel/vulkan/anv_image.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 76ab923..9992685 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -775,6 +775,11 @@ anv_DestroyImageView(VkDevice _device, VkImageView _iview,
                           iview->sampler_surface_state);
    }
 
+   if (iview->no_aux_sampler_surface_state.alloc_size > 0) {
+      anv_state_pool_free(&device->surface_state_pool,
+                          iview->no_aux_sampler_surface_state);
+   }
+
    if (iview->storage_surface_state.alloc_size > 0) {
       anv_state_pool_free(&device->surface_state_pool,
                           iview->storage_surface_state);





[Mesa-dev] [PATCH] anv: Stop leaking the no_aux sampler surface state

2017-07-11 Thread Jason Ekstrand
Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/vulkan/anv_image.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 76ab923..9992685 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -775,6 +775,11 @@ anv_DestroyImageView(VkDevice _device, VkImageView _iview,
                           iview->sampler_surface_state);
    }
 
+   if (iview->no_aux_sampler_surface_state.alloc_size > 0) {
+      anv_state_pool_free(&device->surface_state_pool,
+                          iview->no_aux_sampler_surface_state);
+   }
+
    if (iview->storage_surface_state.alloc_size > 0) {
       anv_state_pool_free(&device->surface_state_pool,
                           iview->storage_surface_state);
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH v2 2/2] swr: build driver proper separate from rasterizer

2017-07-11 Thread Emil Velikov
On 11 July 2017 at 03:09, Tim Rowley  wrote:
> swr used to build and link the rasterizer to the driver, and to support
> multiple architectures we needed to have multiple versions of the
> driver/rasterizer combination, which needed to link in much of mesa.
>
> Changing to having one instance of the driver and just building
> architecture specific versions of the rasterizer gives a large reduction
> in disk space.
>
> libGL.so        6464 Kb ->  7000 Kb
> libswrAVX.so   10068 Kb ->  5432 Kb
> libswrAVX2.so   9828 Kb ->  5200 Kb
>
> Total          26360 Kb -> 17632 Kb
> ---
>  src/gallium/drivers/swr/Makefile.am | 31 ++-
>  src/gallium/drivers/swr/SConscript  | 26 +-
>  src/gallium/drivers/swr/swr_context.cpp |  2 +-
>  src/gallium/drivers/swr/swr_loader.cpp  | 14 ++
>  src/gallium/drivers/swr/swr_screen.h|  2 ++
>  5 files changed, 36 insertions(+), 39 deletions(-)
>
> diff --git a/src/gallium/drivers/swr/Makefile.am 
> b/src/gallium/drivers/swr/Makefile.am
> index 4b4bd37..7461228 100644
> --- a/src/gallium/drivers/swr/Makefile.am
> +++ b/src/gallium/drivers/swr/Makefile.am
> @@ -26,7 +26,14 @@ AM_CXXFLAGS = $(GALLIUM_DRIVER_CFLAGS) 
> $(SWR_CXX11_CXXFLAGS)
>
>  noinst_LTLIBRARIES = libmesaswr.la
>
> -libmesaswr_la_SOURCES = $(LOADER_SOURCES)
> +# gen_knobs.* included here to provide driver access to swr configuration
> +libmesaswr_la_SOURCES = \
> +   $(CXX_SOURCES) \
> +   $(COMMON_CXX_SOURCES) \
> +   $(JITTER_CXX_SOURCES) \
> +   rasterizer/codegen/gen_knobs.cpp \
> +   rasterizer/codegen/gen_knobs.h \
> +   $(LOADER_SOURCES)
>
>  COMMON_CXXFLAGS = \
> -fno-strict-aliasing \
> @@ -43,12 +50,15 @@ COMMON_CXXFLAGS = \
> -I$(srcdir)/rasterizer/jitter \
> -I$(srcdir)/rasterizer/archrast
>
> +# SWR_AVX_CXXFLAGS needed for intrinsic usage in swr api headers

Still seems strange to have the loader built for AVX, which will then
dive into an AVX- or AVX2-only backend.
I'm pretty sure you guys will unwrap that once the more important
parts are sorted.

From build POV, things look great.
Reviewed-by: Emil Velikov 

Thanks
Emil


[Mesa-dev] [Bug 101709] [llvmpipe] piglit gl-1.0-scissor-offscreen regression

2017-07-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101709

Vedran Miletić  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ved...@miletic.net

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] swr/rast: make SWR_VISIBLE attribute work for windows

2017-07-11 Thread Tim Rowley
From: George Kyriazis 

Needed to expose SwrGetInterface
---
 src/gallium/drivers/swr/rasterizer/common/os.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h 
b/src/gallium/drivers/swr/rasterizer/common/os.h
index 8657709..a16f577 100644
--- a/src/gallium/drivers/swr/rasterizer/common/os.h
+++ b/src/gallium/drivers/swr/rasterizer/common/os.h
@@ -30,7 +30,7 @@
 #if (defined(FORCE_WINDOWS) || defined(_WIN32)) && !defined(FORCE_LINUX)
 
 #define SWR_API __cdecl
-#define SWR_VISIBLE
+#define SWR_VISIBLE  __declspec(dllexport)
 
 #ifndef NOMINMAX
 #define NOMINMAX
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/2] egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support

2017-07-11 Thread Emil Velikov
On 11 July 2017 at 15:16, Rob Herring  wrote:
> On Tue, Jul 11, 2017 at 8:27 AM, Emil Velikov  
> wrote:
>> From: Emil Velikov 
>>
>> As said in the EGL_KHR_platform_android extensions
>>
>> For each EGLConfig that belongs to the Android platform, the
>> EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
>> WINDOW_FORMAT_RGBA_8888.
>>
>> Although it should be applicable overall.
>>
>> Even though we use HAL_PIXEL_FORMAT here, those are numerically
>> identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.
>>
>> Barring the said format of course. That one is only listed in HAL.
>>
>> Keep in mind that even if we try to use the said format, you'll get
>> caught by droid_create_surface(). The function compares the format of
>> the underlying window, against the NATIVE_VISUAL_ID of the config.
>>
>> Unfortunately it only prints a warning, rather than error out, likely
>> leading to visual corruption.
>>
>> While SDL will even call ANativeWindow_setBuffersGeometry() with the
>> wrong format, and conveniently ignore the [expected] failure.
>>
>> Cc: mesa-sta...@lists.freedesktop.org
>> Cc: Chad Versace 
>> Cc: Tomasz Figa 
>> Signed-off-by: Emil Velikov 
>> ---
>> I'm about 99.99% sure the above is correct, but I haven't tested it.
>
> Isn't this going to break if there's no driver support for RGBA/RGBX
> which is the case for stable (and master for gallium drvs).
>
Right, I should have sent this after the Gallium patches have landed. Silly me.

I thought you guys are carrying the RGBA/RGBX on top of the stable
branch, so this should be safe.
Although in either case, there's limited use of having this in
-stable, so stable nomination withdrawn.

-Emil


Re: [Mesa-dev] [PATCH 1/2] egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support

2017-07-11 Thread Tomasz Figa
On Tue, Jul 11, 2017 at 11:16 PM, Rob Herring  wrote:
> On Tue, Jul 11, 2017 at 8:27 AM, Emil Velikov  
> wrote:
>> From: Emil Velikov 
>>
>> As said in the EGL_KHR_platform_android extensions
>>
>> For each EGLConfig that belongs to the Android platform, the
>> EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
>> WINDOW_FORMAT_RGBA_8888.
>>
>> Although it should be applicable overall.
>>
>> Even though we use HAL_PIXEL_FORMAT here, those are numerically
>> identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.
>>
>> Barring the said format of course. That one is only listed in HAL.
>>
>> Keep in mind that even if we try to use the said format, you'll get
>> caught by droid_create_surface(). The function compares the format of
>> the underlying window, against the NATIVE_VISUAL_ID of the config.
>>
>> Unfortunately it only prints a warning, rather than error out, likely
>> leading to visual corruption.
>>
>> While SDL will even call ANativeWindow_setBuffersGeometry() with the
>> wrong format, and conveniently ignore the [expected] failure.
>>
>> Cc: mesa-sta...@lists.freedesktop.org
>> Cc: Chad Versace 
>> Cc: Tomasz Figa 
>> Signed-off-by: Emil Velikov 
>> ---
>> I'm about 99.99% sure the above is correct, but I haven't tested it.
>
> Isn't this going to break if there's no driver support for RGBA/RGBX
> which is the case for stable (and master for gallium drvs).

First of all, Android hardcodes HAL_PIXEL_FORMAT_RGBA_8888 in a number
of places, which means that those users use a patched Android. However
I'm not sure if we can just break them like this. I'll leave it to you
guys, though.

Other than that, CTS seems to require only RGBA_8888 and RGB_565, so
this change is not going to affect compliance with unpatched Android.

Best regards,
Tomasz


Re: [Mesa-dev] [PATCH 1/2] egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support

2017-07-11 Thread Rob Herring
On Tue, Jul 11, 2017 at 8:27 AM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> As said in the EGL_KHR_platform_android extensions
>
> For each EGLConfig that belongs to the Android platform, the
> EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
> WINDOW_FORMAT_RGBA_8888.
>
> Although it should be applicable overall.
>
> Even though we use HAL_PIXEL_FORMAT here, those are numerically
> identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.
>
> Barring the said format of course. That one is only listed in HAL.
>
> Keep in mind that even if we try to use the said format, you'll get
> caught by droid_create_surface(). The function compares the format of
> the underlying window, against the NATIVE_VISUAL_ID of the config.
>
> Unfortunately it only prints a warning, rather than error out, likely
> leading to visual corruption.
>
> While SDL will even call ANativeWindow_setBuffersGeometry() with the
> wrong format, and conveniently ignore the [expected] failure.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Cc: Chad Versace 
> Cc: Tomasz Figa 
> Signed-off-by: Emil Velikov 
> ---
> I'm about 99.99% sure the above is correct, but I haven't tested it.

Isn't this going to break if there's no driver support for RGBA/RGBX
which is the case for stable (and master for gallium drvs).

Rob


Re: [Mesa-dev] [RFC 01/22] RFC: egl/x11: Support DRI3 v1.1

2017-07-11 Thread Emil Velikov
On 10 July 2017 at 22:26, Louis-Francis Ratté-Boulianne
 wrote:
> Hi,
>
> On Tue, 2017-06-20 at 15:19 +0100, Emil Velikov wrote:
>> > +for (i = 0; i < count; i++) {
>> > +   modifiers[i] = (uint64_t) mod_parts[i * 2] << 32;
>> > +   modifiers[i] |= (uint64_t) mod_parts[i * 2 + 1] &
>> > 0xffffffff;
>> > +}
>> > + }
>> > +
>> > + free(mod_reply);
>> > +
>> > + buffer->image = draw->ext->image-
>> > >createImageWithModifiers(draw->dri_screen,
>> > +
>> >   width, height,
>> > +
>> >   format,
>> > +
>> >   modifiers, count,
>> > +
>> >   buffer);
>> > + free(modifiers);
>> > +  }
>> > +#endif
>> > +
>> > +  if (!buffer->image)
>>
>> Does not align with all the error paths above. There we bail out, yet
>> here we fall-back to the old extension.
>> Perhaps change the former to come here? One could even move that whole
>> hunk into a separate function.
>
> The rationale is that we only try to create a buffer with the supported
> modifiers. If it doesn't work, it's still sensible to try the old path
> as it's better to have a DRI image without any optimization than none.
>
Getting confused here...
In the hunk that you've dropped, if the (optional) xcb comms/malloc
fails we error out (goto no_image). At the same time, if
createImageWithModifiers() fails we fall back to the path w/o modifiers.

So a failure in the dependencies is fatal, while a failure in the
dependent call is not... why? What would happen if we have a new
X server (DRI3 1.1 capable), Mesa and xcb, but an old X driver?
IMHO if any of the dependencies fail, fall back to w/o mods.
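
Roughly what I mean, as a standalone sketch rather than a patch against loader_dri3: every name below (unpack_modifiers, create_image, the callback types and the two stubs) is hypothetical. The first helper rebuilds 64-bit modifiers from (high, low) 32-bit pairs like the quoted loop does; the second treats every failure on the modifier path the same way and falls back to the plain path.

```c
#include <stdint.h>
#include <stdlib.h>

/* Rebuild 64-bit modifiers from (high, low) 32-bit pairs. */
static uint64_t *unpack_modifiers(const uint32_t *parts, unsigned count)
{
   uint64_t *mods;

   if (!parts || count == 0)
      return NULL;

   mods = malloc(count * sizeof(*mods));
   if (!mods)
      return NULL;

   for (unsigned i = 0; i < count; i++) {
      mods[i] = (uint64_t)parts[i * 2] << 32;
      mods[i] |= (uint64_t)parts[i * 2 + 1] & 0xffffffffu;
   }
   return mods;
}

/* Hypothetical image-creation callbacks; both return NULL on failure. */
typedef void *(*create_with_mods_fn)(const uint64_t *mods, unsigned count);
typedef void *(*create_plain_fn)(void);

/* Stubs for illustration: the modifier path fails, the plain path works. */
static int placeholder_image;

static void *stub_with_mods_fail(const uint64_t *mods, unsigned count)
{
   (void)mods;
   (void)count;
   return NULL;
}

static void *stub_plain(void)
{
   return &placeholder_image;
}

/*
 * Try the modifier path first. Any failure on that path (no modifier
 * list, allocation failure, or the creation call itself failing) falls
 * back to the plain path instead of erroring out.
 */
static void *create_image(const uint32_t *parts, unsigned count,
                          create_with_mods_fn with_mods,
                          create_plain_fn plain)
{
   void *image = NULL;
   uint64_t *mods = with_mods ? unpack_modifiers(parts, count) : NULL;

   if (mods) {
      image = with_mods(mods, count);
      free(mods);
   }

   if (!image)
      image = plain();

   return image;
}
```

This keeps a single fallback point, so there is no need to distinguish which step of the modifier path failed.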


>> +   }
>> >
>> > -   xcb_dri3_pixmap_from_buffer(draw->conn,
>> > -   (pixmap = xcb_generate_id(draw-
>> > >conn)),
>> > -   draw->drawable,
>> > -   buffer->size,
>> > -   width, height, buffer->pitch,
>> > -   depth, buffer->cpp * 8,
>> > -   buffer_fd);
>> > +#if XCB_DRI3_MAJOR_VERSION > 1 || XCB_DRI3_MINOR_VERSION >= 1
>> > +   if (draw->multiplanes_available) {
>>
>> This else looks a bit odd. If we fail to manage multiple buffers
>> above, multiplanes_available will still be true, yet we could have a
>> DRIImage.
>> We should track that (modify multiplanes_available/other) and act
>> accordingly here.
>
> What can fail above (and not be critical) is creating an image with
> modifiers. I grant you that it means the resulting image will only have
> one plane, but that doesn't make it "bad" to use
> xcb_dri3_pixmap_from_buffers. multiplanes_available simply means that
> the X server actually supports using multiple planes.
>
Whether it's "bad" would depend on the implementation. I'm thinking
more of confusing and/or misleading.
Can you please add a couple of words as inline comment. Anything like
what you/Dan said seems reasonable IMHO.

Speaking of implementation, I might throw some questions.

BTW, Mesa's branch (feature freeze) is ~21st July. Personally it seems
perfectly possible to get this work in - the frontend bits at least.
What do you think?

Thanks
Emil


Re: [Mesa-dev] [PATCH] radv: set cb base tile swizzles for MRT speedups (v3)

2017-07-11 Thread Alex Smith
On 11 July 2017 at 14:27, Alex Smith  wrote:

> On 10 July 2017 at 05:59, Dave Airlie  wrote:
>
>> From: Dave Airlie 
>>
>> This patch uses addrlib to work out the tile swizzles according
>> to the surface index. It seems to produce the same values as
>> amdgpu-pro for the deferred test.
>>
>> v2: don't apply swizzle to CMASK. the eg docs don't mention
>> it, and we clearly don't align cmask for that.
>> v3: disable surf index for dedicated images, as these will
>> most likely be shared, and I don't think the metadata has
>> space for this info in it yet.
>>
>
> FWIW, disabling this for images marked as dedicated means this won't get
> any improvements for render targets on our games. We create all render
> targets as dedicated when NV_dedicated_allocation is available since this
> gets us significant perf improvement on NVIDIA.
>
> If it's not currently possible to have this enabled for dedicated images
> we could avoid using it on AMD, though I'm curious if there's likely to be
> any other perf benefits to marking RTs as dedicated we'd then be missing
> out on? I've not done any testing to see if there's any benefit from using
> it.
>

Realised this possibly didn't sound clear - what I'm asking is does using
NV_dedicated_allocation give any perf benefit on RADV at all like it does
for NV? If not we could avoid it to get the benefits from this patch.

Alex


>
> Thanks,
> Alex
>
>
>> This gets the deferred demo from 730->950fps on my rx480.
>> (dcc cmask elim predication patches get it further)
>> I'm also seeing some improvements in Mad Max at 4K
>>
>> Signed-off-by: Dave Airlie 
>>
>> fixup for dedicate
>> ---
>>  src/amd/common/ac_surface.c   | 14 ++
>>  src/amd/common/ac_surface.h   |  2 ++
>>  src/amd/vulkan/radv_device.c  |  7 ++-
>>  src/amd/vulkan/radv_image.c   | 19 ++-
>>  src/amd/vulkan/radv_private.h |  2 ++
>>  5 files changed, 42 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
>> index 23fb66b..0aebacc 100644
>> --- a/src/amd/common/ac_surface.c
>> +++ b/src/amd/common/ac_surface.c
>> @@ -692,6 +692,20 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
>> surf->htile_size *= 2;
>>
>> surf->is_linear = surf->u.legacy.level[0].mode ==
>> RADEON_SURF_MODE_LINEAR_ALIGNED;
>> +
>> +   /* workout base swizzle */
>> +   if (!(surf->flags & RADEON_SURF_Z_OR_SBUFFER)) {
>> +   ADDR_COMPUTE_BASE_SWIZZLE_INPUT AddrBaseSwizzleIn = {0};
>> +   ADDR_COMPUTE_BASE_SWIZZLE_OUTPUT AddrBaseSwizzleOut =
>> {0};
>> +
>> +   AddrBaseSwizzleIn.surfIndex = config->info.surf_index;
>> +   AddrBaseSwizzleIn.tileIndex = AddrSurfInfoIn.tileIndex;
>> +   AddrBaseSwizzleIn.macroModeIndex =
>> AddrSurfInfoOut.macroModeIndex;
>> +   AddrBaseSwizzleIn.pTileInfo = AddrSurfInfoOut.pTileInfo;
>> +   AddrBaseSwizzleIn.tileMode = AddrSurfInfoOut.tileMode;
>> +   AddrComputeBaseSwizzle(addrlib, &AddrBaseSwizzleIn,
>> &AddrBaseSwizzleOut);
>> +   surf->u.legacy.combined_swizzle =
>> AddrBaseSwizzleOut.tileSwizzle;
>> +   }
>> return 0;
>>  }
>>
>> diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
>> index 4d893ff..7901b86 100644
>> --- a/src/amd/common/ac_surface.h
>> +++ b/src/amd/common/ac_surface.h
>> @@ -97,6 +97,7 @@ struct legacy_surf_layout {
>>  unsigneddepth_adjusted:1;
>>  unsignedstencil_adjusted:1;
>>
>> +uint8_t combined_swizzle;
>>  struct legacy_surf_levellevel[RADEON_SURF_MAX_LEVELS];
>>  struct legacy_surf_levelstencil_level[RADEON_SURF_MAX_LEVELS];
>>  uint8_t tiling_index[RADEON_SURF_MAX_LEVELS];
>> @@ -194,6 +195,7 @@ struct ac_surf_info {
>> uint32_t width;
>> uint32_t height;
>> uint32_t depth;
>> +   uint32_t surf_index;
>> uint8_t samples;
>> uint8_t levels;
>> uint16_t array_size;
>> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
>> index 789c90d..eb77914 100644
>> --- a/src/amd/vulkan/radv_device.c
>> +++ b/src/amd/vulkan/radv_device.c
>> @@ -2757,7 +2757,8 @@ radv_initialise_color_surface(struct radv_device
>> *device,
>> }
>>
>> cb->cb_color_base = va >> 8;
>> -
>> +   if (device->physical_device->rad_info.chip_class < GFX9)
>> +   cb->cb_color_base |= iview->image->surface.u.legacy
>> .combined_swizzle;
>> /* CMASK variables */
>> va = device->ws->buffer_get_va(iview->bo) + iview->image->offset;
>> va += iview->image->cmask.offset;
>> @@ -2766,6 +2767,8 @@ radv_initialise_color_surface(struct radv_device
>> *device,
>> va = device->ws->buffer_get_va(iview->bo) + iview->image->offset;
>> va += iview->image->dcc_offset;
>> 

[Mesa-dev] [PATCH 2/2] egl/drm: set the VISUAL_TYPE alongside the VISUAL_ID

2017-07-11 Thread Emil Velikov
From: Emil Velikov 

According to the EGL_KHR_platform_gbm extension:

For each EGLConfig that belongs to the GBM platform, the
EGL_NATIVE_VISUAL_ID attribute is a GBM color format, such as
GBM_FORMAT_XRGB8888.

Which we correctly manage. At the same time the EGL 1.4 spec says

If an EGLConfig supports windows then it may have an associated
native visual. EGL_NATIVE_VISUAL_ID specifies an identifier for this
visual, and EGL_NATIVE_VISUAL_TYPE specifies its type. If an
EGLConfig does not support windows, or if there is no associated
native visual type, then querying EGL_NATIVE_VISUAL_ID will return 0
and querying EGL_NATIVE_VISUAL_TYPE will return EGL_NONE.

Based on this, either both of ID and TYPE should be set, or neither.

Cc: mesa-sta...@lists.freedesktop.org
Cc: Chad Versace 
Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/platform_drm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index 8e12aed0b32..b2146aa64af 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -613,6 +613,7 @@ drm_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay 
*disp)
 
  const EGLint attr_list[] = {
 EGL_NATIVE_VISUAL_ID,  visuals[j].format,
+EGL_NATIVE_VISUAL_TYPE,  visuals[j].format,
 EGL_NONE,
  };
 
-- 
2.13.0



[Mesa-dev] [PATCH 1/2] egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support

2017-07-11 Thread Emil Velikov
From: Emil Velikov 

As said in the EGL_KHR_platform_android extensions

For each EGLConfig that belongs to the Android platform, the
EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
WINDOW_FORMAT_RGBA_8888.

Although it should be applicable overall.

Even though we use HAL_PIXEL_FORMAT here, those are numerically
identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.

Barring the said format of course. That one is only listed in HAL.

Keep in mind that even if we try to use the said format, you'll get
caught by droid_create_surface(). The function compares the format of
the underlying window, against the NATIVE_VISUAL_ID of the config.

Unfortunately it only prints a warning, rather than error out, likely
leading to visual corruption.

While SDL will even call ANativeWindow_setBuffersGeometry() with the
wrong format, and conveniently ignore the [expected] failure.

Cc: mesa-sta...@lists.freedesktop.org
Cc: Chad Versace 
Cc: Tomasz Figa 
Signed-off-by: Emil Velikov 
---
I'm about 99.99% sure the above is correct, but I haven't tested it.
---
 src/egl/drivers/dri2/platform_android.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index 13006fee873..e909810d678 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -91,7 +91,6 @@ get_format_bpp(int native)
    switch (native) {
    case HAL_PIXEL_FORMAT_RGBA_8888:
    case HAL_PIXEL_FORMAT_RGBX_8888:
-   case HAL_PIXEL_FORMAT_BGRA_8888:
       bpp = 4;
       break;
    case HAL_PIXEL_FORMAT_RGB_565:
@@ -110,7 +109,6 @@ static int get_fourcc(int native)
 {
    switch (native) {
    case HAL_PIXEL_FORMAT_RGB_565:   return __DRI_IMAGE_FOURCC_RGB565;
-   case HAL_PIXEL_FORMAT_BGRA_8888: return __DRI_IMAGE_FOURCC_ARGB8888;
    case HAL_PIXEL_FORMAT_RGBA_8888: return __DRI_IMAGE_FOURCC_ABGR8888;
    case HAL_PIXEL_FORMAT_RGBX_8888: return __DRI_IMAGE_FOURCC_XBGR8888;
    default:
@@ -122,7 +120,6 @@ static int get_fourcc(int native)
 static int get_format(int format)
 {
    switch (format) {
-   case HAL_PIXEL_FORMAT_BGRA_8888: return __DRI_IMAGE_FORMAT_ARGB8888;
    case HAL_PIXEL_FORMAT_RGB_565:   return __DRI_IMAGE_FORMAT_RGB565;
    case HAL_PIXEL_FORMAT_RGBA_8888: return __DRI_IMAGE_FORMAT_ABGR8888;
    case HAL_PIXEL_FORMAT_RGBX_8888: return __DRI_IMAGE_FORMAT_XBGR8888;
@@ -1027,7 +1024,6 @@ droid_add_configs_for_visuals(_EGLDriver *drv, 
_EGLDisplay *dpy)
       { HAL_PIXEL_FORMAT_RGBA_8888, { 0x000000ff, 0x0000ff00, 0x00ff0000, 
0xff000000 } },
       { HAL_PIXEL_FORMAT_RGBX_8888, { 0x000000ff, 0x0000ff00, 0x00ff0000, 
0x00000000 } },
       { HAL_PIXEL_FORMAT_RGB_565,   { 0x0000f800, 0x000007e0, 0x0000001f, 
0x00000000 } },
-      { HAL_PIXEL_FORMAT_BGRA_8888, { 0x00ff0000, 0x0000ff00, 0x000000ff, 
0xff000000 } },
};
 
unsigned int format_count[ARRAY_SIZE(visuals)] = { 0 };
-- 
2.13.0



Re: [Mesa-dev] [PATCH] radv: set cb base tile swizzles for MRT speedups (v3)

2017-07-11 Thread Alex Smith
On 10 July 2017 at 05:59, Dave Airlie  wrote:

> From: Dave Airlie 
>
> This patch uses addrlib to work out the tile swizzles according
> to the surface index. It seems to produce the same values as
> amdgpu-pro for the deferred test.
>
> v2: don't apply swizzle to CMASK. the eg docs don't mention
> it, and we clearly don't align cmask for that.
> v3: disable surf index for dedicated images, as these will
> most likely be shared, and I don't think the metadata has
> space for this info in it yet.
>

FWIW, disabling this for images marked as dedicated means this won't get
any improvements for render targets on our games. We create all render
targets as dedicated when NV_dedicated_allocation is available since this
gets us significant perf improvement on NVIDIA.

If it's not currently possible to have this enabled for dedicated images we
could avoid using it on AMD, though I'm curious if there's likely to be any
other perf benefits to marking RTs as dedicated we'd then be missing out
on? I've not done any testing to see if there's any benefit from using it.

Thanks,
Alex


> This gets the deferred demo from 730->950fps on my rx480.
> (dcc cmask elim predication patches get it further)
> I'm also seeing some improvements in Mad Max at 4K
>
> Signed-off-by: Dave Airlie 
>
> fixup for dedicate
> ---
>  src/amd/common/ac_surface.c   | 14 ++
>  src/amd/common/ac_surface.h   |  2 ++
>  src/amd/vulkan/radv_device.c  |  7 ++-
>  src/amd/vulkan/radv_image.c   | 19 ++-
>  src/amd/vulkan/radv_private.h |  2 ++
>  5 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
> index 23fb66b..0aebacc 100644
> --- a/src/amd/common/ac_surface.c
> +++ b/src/amd/common/ac_surface.c
> @@ -692,6 +692,20 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
> surf->htile_size *= 2;
>
> surf->is_linear = surf->u.legacy.level[0].mode ==
> RADEON_SURF_MODE_LINEAR_ALIGNED;
> +
> +   /* workout base swizzle */
> +   if (!(surf->flags & RADEON_SURF_Z_OR_SBUFFER)) {
> +   ADDR_COMPUTE_BASE_SWIZZLE_INPUT AddrBaseSwizzleIn = {0};
> +   ADDR_COMPUTE_BASE_SWIZZLE_OUTPUT AddrBaseSwizzleOut = {0};
> +
> +   AddrBaseSwizzleIn.surfIndex = config->info.surf_index;
> +   AddrBaseSwizzleIn.tileIndex = AddrSurfInfoIn.tileIndex;
> +   AddrBaseSwizzleIn.macroModeIndex = AddrSurfInfoOut.
> macroModeIndex;
> +   AddrBaseSwizzleIn.pTileInfo = AddrSurfInfoOut.pTileInfo;
> +   AddrBaseSwizzleIn.tileMode = AddrSurfInfoOut.tileMode;
> +   AddrComputeBaseSwizzle(addrlib, &AddrBaseSwizzleIn,
> &AddrBaseSwizzleOut);
> +   surf->u.legacy.combined_swizzle = AddrBaseSwizzleOut.
> tileSwizzle;
> +   }
> return 0;
>  }
>
> diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
> index 4d893ff..7901b86 100644
> --- a/src/amd/common/ac_surface.h
> +++ b/src/amd/common/ac_surface.h
> @@ -97,6 +97,7 @@ struct legacy_surf_layout {
>  unsigneddepth_adjusted:1;
>  unsignedstencil_adjusted:1;
>
> +uint8_t combined_swizzle;
>  struct legacy_surf_levellevel[RADEON_SURF_MAX_LEVELS];
>  struct legacy_surf_levelstencil_level[RADEON_SURF_MAX_LEVELS];
>  uint8_t tiling_index[RADEON_SURF_MAX_LEVELS];
> @@ -194,6 +195,7 @@ struct ac_surf_info {
> uint32_t width;
> uint32_t height;
> uint32_t depth;
> +   uint32_t surf_index;
> uint8_t samples;
> uint8_t levels;
> uint16_t array_size;
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 789c90d..eb77914 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -2757,7 +2757,8 @@ radv_initialise_color_surface(struct radv_device
> *device,
> }
>
> cb->cb_color_base = va >> 8;
> -
> +   if (device->physical_device->rad_info.chip_class < GFX9)
> +   cb->cb_color_base |= iview->image->surface.u.legacy.combined_swizzle;
> /* CMASK variables */
> va = device->ws->buffer_get_va(iview->bo) + iview->image->offset;
> va += iview->image->cmask.offset;
> @@ -2766,6 +2767,8 @@ radv_initialise_color_surface(struct radv_device
> *device,
> va = device->ws->buffer_get_va(iview->bo) + iview->image->offset;
> va += iview->image->dcc_offset;
> cb->cb_dcc_base = va >> 8;
> +   if (device->physical_device->rad_info.chip_class < GFX9)
> +   cb->cb_dcc_base |= iview->image->surface.u.legacy.combined_swizzle;
>
> uint32_t max_slice = radv_surface_layer_count(iview);
> cb->cb_color_view = S_028C6C_SLICE_START(iview->base_layer) |
> @@ -2781,6 +2784,8 @@ radv_initialise_color_surface(struct radv_device
> *device,
> if 

Re: [Mesa-dev] [PATCH] etnaviv: Use the correct LOG instruction on GC3000

2017-07-11 Thread Lucas Stach
Am Dienstag, den 11.07.2017, 15:07 +0200 schrieb Wladimir J. van der
Laan:
> GC3000 has a new LOG instruction, similar to the new SIN and COS instructions.
> 
> Generate the new instruction sequence when appropriate; there are
> two occasions, as part of LIT and the generator for the LG2
> instruction itself.
> 
> Signed-off-by: Wladimir J. van der Laan 

In a quick test this fixes all the misrendering I was observing in the
various glmark2 benchmarks on GC3000.

Tested-by: Lucas Stach 

> ---
>  src/gallium/drivers/etnaviv/etnaviv_compiler.c | 63 
> +++---
>  src/gallium/drivers/etnaviv/etnaviv_internal.h |  4 +-
>  src/gallium/drivers/etnaviv/etnaviv_screen.c   |  2 +-
>  3 files changed, 59 insertions(+), 10 deletions(-)
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_compiler.c 
> b/src/gallium/drivers/etnaviv/etnaviv_compiler.c
> index 07315f7..6435b84 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_compiler.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_compiler.c
> @@ -1389,12 +1389,27 @@ trans_lit(const struct instr_translater *t, struct 
> etna_compile *c,
> else
>src_w = swizzle(src[0], SWIZZLE(W, W, W, W));
>  
> -   struct etna_inst ins[3] = { };
> -   ins[0].opcode = INST_OPCODE_LOG;
> -   ins[0].dst = etna_native_to_dst(inner_temp, INST_COMPS_X);
> -   ins[0].src[2] = src_y;
> +   if (c->specs->has_new_transcendentals) { /* Alternative LOG sequence */
> +  emit_inst(c, &(struct etna_inst) {
> + .opcode = INST_OPCODE_LOG,
> + .dst = etna_native_to_dst(inner_temp, INST_COMPS_X | INST_COMPS_Y),
> + .src[2] = src_y,
> + .tex = { .amode=1 }, /* Unknown bit needs to be set */
> +  });
> +  emit_inst(c, &(struct etna_inst) {
> + .opcode = INST_OPCODE_MUL,
> + .dst = etna_native_to_dst(inner_temp, INST_COMPS_X),
> + .src[0] = etna_native_to_src(inner_temp, SWIZZLE(X, X, X, X)),
> + .src[1] = etna_native_to_src(inner_temp, SWIZZLE(Y, Y, Y, Y)),
> +  });
> +   } else {
> +  struct etna_inst ins[3] = { };
> +  ins[0].opcode = INST_OPCODE_LOG;
> +  ins[0].dst = etna_native_to_dst(inner_temp, INST_COMPS_X);
> +  ins[0].src[2] = src_y;
>  
> -   emit_inst(c, &ins[0]);
> +  emit_inst(c, &ins[0]);
> +   }
> emit_inst(c, &(struct etna_inst) {
>.opcode = INST_OPCODE_MUL,
>.sat = 0,
> @@ -1450,7 +1465,7 @@ static void
>  trans_trig(const struct instr_translater *t, struct etna_compile *c,
> const struct tgsi_full_instruction *inst, struct etna_inst_src 
> *src)
>  {
> -   if (c->specs->has_new_sin_cos) { /* Alternative SIN/COS */
> +   if (c->specs->has_new_transcendentals) { /* Alternative SIN/COS */
>/* On newer chips alternative SIN/COS instructions are implemented,
> * which:
> * - Need their input scaled by 1/pi instead of 2/pi
> @@ -1613,6 +1628,40 @@ trans_trig(const struct instr_translater *t, struct 
> etna_compile *c,
>  }
>  
>  static void
> +trans_lg2(const struct instr_translater *t, struct etna_compile *c,
> +const struct tgsi_full_instruction *inst, struct etna_inst_src 
> *src)
> +{
> +   if (c->specs->has_new_transcendentals) {
> +  /* On newer chips alternative LOG instruction is implemented,
> +   * which outputs an x and y component, which need to be multiplied to
> +   * get the result.
> +   */
> +  struct etna_native_reg temp = etna_compile_get_inner_temp(c); /* only 
> using .xy */
> +  emit_inst(c, &(struct etna_inst) {
> + .opcode = INST_OPCODE_LOG,
> + .sat = 0,
> + .dst = etna_native_to_dst(temp, INST_COMPS_X | INST_COMPS_Y),
> + .src[2] = src[0],
> + .tex = { .amode=1 }, /* Unknown bit needs to be set */
> +  });
> +  emit_inst(c, &(struct etna_inst) {
> + .opcode = INST_OPCODE_MUL,
> + .sat = inst->Instruction.Saturate,
> + .dst = convert_dst(c, &inst->Dst[0]),
> + .src[0] = etna_native_to_src(temp, SWIZZLE(X, X, X, X)),
> + .src[1] = etna_native_to_src(temp, SWIZZLE(Y, Y, Y, Y)),
> +  });
> +   } else {
> +  emit_inst(c, &(struct etna_inst) {
> + .opcode = INST_OPCODE_LOG,
> + .sat = inst->Instruction.Saturate,
> + .dst = convert_dst(c, &inst->Dst[0]),
> + .src[2] = src[0],
> +  });
> +   }
> +}
> +
> +static void
>  trans_dph(const struct instr_translater *t, struct etna_compile *c,
>const struct tgsi_full_instruction *inst, struct etna_inst_src 
> *src)
>  {
> @@ -1753,7 +1802,7 @@ static const struct instr_translater 
> translaters[TGSI_OPCODE_LAST] = {
> INSTR(DST, trans_instr, .opc = INST_OPCODE_DST, .src = {0, 1, -1}),
> INSTR(MAD, trans_instr, .opc = INST_OPCODE_MAD, .src = {0, 1, 2}),
> INSTR(EX2, trans_instr, .opc = INST_OPCODE_EXP, .src = {2, -1, -1}),
> -   INSTR(LG2, trans_instr, .opc = INST_OPCODE_LOG, .src = {2, -1, -1}),
> +   INSTR(LG2, trans_lg2),
> 

Re: [Mesa-dev] XCOM: Enemy Unknown vs. NaN texture unit LOD bias

2017-07-11 Thread Roland Scheidegger
Am 11.07.2017 um 08:25 schrieb Kenneth Graunke:
> Hello,
> 
> Mesa master has been hitting assert failures when running "XCOM: Enemy
> Unknown" since commit f8d69beed49c64f883bb8ffb28d4960306baf575, where we
> started asserting that the SAMPLER_STATE LOD Bias value actually fits in
> the correct number of bits.
> 
> Apparently, XCOM calls
> 
>glTexEnv(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT, val);
> 
> to set the texture unit LOD bias...but according to gdb, the value is:
> 
>-nan(0x73)
> 
> In i965, we do CLAMP(lod bias, -16, 15)...but NaN ends up failing both
> the < min and > max comparisons, so it slips through.  But, that raises
> the question...what value *should* we be using?  0?  Min?  Max?
> 
> I couldn't find any immediately applicable GL spec text.  Anyone know of
> any?  If not, does DirectX mandate something?
I would guess behavior is undefined for GL.
I don't think d3d10 would say anything directly for this case, however
generally when things get converted to fixed point somewhere in the gpu
pipeline, the spec mandates nans get converted to 0. This may or may
not be applicable here as it isn't really converted to fixed point
directly here.

We could also just make the CLAMP macro nan-safe easily (min2/max2
already are if you use them reversed...)
So instead of
#define CLAMP( X, MIN, MAX )  ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) : (X)) )

use
#define CLAMP( X, MIN, MAX )  ( (X)>(MIN) ? ((X)>(MAX) ? (MAX) : (X)) : (MIN) )

Which would fix this elsewhere too, with no extra effort even (at least
I'd think so - on x86_64, both should probably compile to just
minss/maxss combo, and if you get a nan or not there just depends on the
order of the arguments).
(This is, of course, assuming that min/max aren't NaN, but I'd think in
most (or all?) uses of CLAMP these are constant, non-NaN values).
Though it doesn't give you zero, unless the lower bound is zero (which
is probably often the case but not here). I have of course no idea if
the app would be happy with that...
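
To make the comparison-order argument concrete, here is a minimal sketch in C.
The names CLAMP_ORIG, CLAMP_NAN_SAFE and the two wrapper functions are made up
for illustration and are not Mesa's macros. It assumes the bounds are ordered,
non-NaN values and that the code is not built with options such as -ffast-math
that discard NaN semantics.

```c
#include <assert.h>
#include <math.h>

/* Original ordering: a NaN input fails both comparisons and falls through. */
#define CLAMP_ORIG(X, MIN, MAX) \
   ((X) < (MIN) ? (MIN) : ((X) > (MAX) ? (MAX) : (X)))

/* Reordered: a NaN input fails (X) > (MIN), so the lower bound is returned. */
#define CLAMP_NAN_SAFE(X, MIN, MAX) \
   ((X) > (MIN) ? ((X) > (MAX) ? (MAX) : (X)) : (MIN))

/* Function wrappers so the two variants are easy to call and compare. */
static float clamp_orig(float x, float lo, float hi)
{
   assert(lo <= hi); /* also rejects NaN bounds */
   return CLAMP_ORIG(x, lo, hi);
}

static float clamp_nan_safe(float x, float lo, float hi)
{
   assert(lo <= hi); /* also rejects NaN bounds */
   return CLAMP_NAN_SAFE(x, lo, hi);
}
```

With these definitions, clamp_orig(NAN, -16, 15) hands the NaN back, while
clamp_nan_safe(NAN, -16, 15) returns the lower bound, -16.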

And of course that's not filtering the values as they come through the api.

Roland


> 
> I wrote a hack to check isnan and replace it with 0, which gets the game
> working again, but...it seems like we could have this problem in a lot of
> other places too...and I'm not sure what the right answer is.
> 
> https://cgit.freedesktop.org/~kwg/mesa/commit/?h=xcom&id=6a1c0515b760c943eb547cced754b465aa3bd4ca
> 
> Thanks for any advice :)
> 
> --Ken

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] etnaviv: Use the correct LOG instruction on GC3000

2017-07-11 Thread Wladimir J. van der Laan
GC3000 has a new LOG instruction, similar to the new SIN and COS instructions.

Generate the new instruction sequence when appropriate; there are
two occasions, as part of LIT and the generator for the LG2
instruction itself.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/etnaviv/etnaviv_compiler.c | 63 +++---
 src/gallium/drivers/etnaviv/etnaviv_internal.h |  4 +-
 src/gallium/drivers/etnaviv/etnaviv_screen.c   |  2 +-
 3 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_compiler.c 
b/src/gallium/drivers/etnaviv/etnaviv_compiler.c
index 07315f7..6435b84 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_compiler.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_compiler.c
@@ -1389,12 +1389,27 @@ trans_lit(const struct instr_translater *t, struct 
etna_compile *c,
else
   src_w = swizzle(src[0], SWIZZLE(W, W, W, W));
 
-   struct etna_inst ins[3] = { };
-   ins[0].opcode = INST_OPCODE_LOG;
-   ins[0].dst = etna_native_to_dst(inner_temp, INST_COMPS_X);
-   ins[0].src[2] = src_y;
+   if (c->specs->has_new_transcendentals) { /* Alternative LOG sequence */
+  emit_inst(c, &(struct etna_inst) {
+ .opcode = INST_OPCODE_LOG,
+ .dst = etna_native_to_dst(inner_temp, INST_COMPS_X | INST_COMPS_Y),
+ .src[2] = src_y,
+ .tex = { .amode=1 }, /* Unknown bit needs to be set */
+  });
+  emit_inst(c, &(struct etna_inst) {
+ .opcode = INST_OPCODE_MUL,
+ .dst = etna_native_to_dst(inner_temp, INST_COMPS_X),
+ .src[0] = etna_native_to_src(inner_temp, SWIZZLE(X, X, X, X)),
+ .src[1] = etna_native_to_src(inner_temp, SWIZZLE(Y, Y, Y, Y)),
+  });
+   } else {
+  struct etna_inst ins[3] = { };
+  ins[0].opcode = INST_OPCODE_LOG;
+  ins[0].dst = etna_native_to_dst(inner_temp, INST_COMPS_X);
+  ins[0].src[2] = src_y;
 
-   emit_inst(c, &ins[0]);
+  emit_inst(c, &ins[0]);
+   }
emit_inst(c, &(struct etna_inst) {
   .opcode = INST_OPCODE_MUL,
   .sat = 0,
@@ -1450,7 +1465,7 @@ static void
 trans_trig(const struct instr_translater *t, struct etna_compile *c,
const struct tgsi_full_instruction *inst, struct etna_inst_src *src)
 {
-   if (c->specs->has_new_sin_cos) { /* Alternative SIN/COS */
+   if (c->specs->has_new_transcendentals) { /* Alternative SIN/COS */
   /* On newer chips alternative SIN/COS instructions are implemented,
* which:
* - Need their input scaled by 1/pi instead of 2/pi
@@ -1613,6 +1628,40 @@ trans_trig(const struct instr_translater *t, struct 
etna_compile *c,
 }
 
 static void
+trans_lg2(const struct instr_translater *t, struct etna_compile *c,
+const struct tgsi_full_instruction *inst, struct etna_inst_src 
*src)
+{
+   if (c->specs->has_new_transcendentals) {
+  /* On newer chips alternative LOG instruction is implemented,
+   * which outputs an x and y component, which need to be multiplied to
+   * get the result.
+   */
+  struct etna_native_reg temp = etna_compile_get_inner_temp(c); /* only 
using .xy */
+  emit_inst(c, &(struct etna_inst) {
+ .opcode = INST_OPCODE_LOG,
+ .sat = 0,
+ .dst = etna_native_to_dst(temp, INST_COMPS_X | INST_COMPS_Y),
+ .src[2] = src[0],
+ .tex = { .amode=1 }, /* Unknown bit needs to be set */
+  });
+  emit_inst(c, &(struct etna_inst) {
+ .opcode = INST_OPCODE_MUL,
+ .sat = inst->Instruction.Saturate,
+ .dst = convert_dst(c, &inst->Dst[0]),
+ .src[0] = etna_native_to_src(temp, SWIZZLE(X, X, X, X)),
+ .src[1] = etna_native_to_src(temp, SWIZZLE(Y, Y, Y, Y)),
+  });
+   } else {
+  emit_inst(c, &(struct etna_inst) {
+ .opcode = INST_OPCODE_LOG,
+ .sat = inst->Instruction.Saturate,
+ .dst = convert_dst(c, &inst->Dst[0]),
+ .src[2] = src[0],
+  });
+   }
+}
+
+static void
 trans_dph(const struct instr_translater *t, struct etna_compile *c,
   const struct tgsi_full_instruction *inst, struct etna_inst_src *src)
 {
@@ -1753,7 +1802,7 @@ static const struct instr_translater 
translaters[TGSI_OPCODE_LAST] = {
INSTR(DST, trans_instr, .opc = INST_OPCODE_DST, .src = {0, 1, -1}),
INSTR(MAD, trans_instr, .opc = INST_OPCODE_MAD, .src = {0, 1, 2}),
INSTR(EX2, trans_instr, .opc = INST_OPCODE_EXP, .src = {2, -1, -1}),
-   INSTR(LG2, trans_instr, .opc = INST_OPCODE_LOG, .src = {2, -1, -1}),
+   INSTR(LG2, trans_lg2),
INSTR(SQRT, trans_instr, .opc = INST_OPCODE_SQRT, .src = {2, -1, -1}),
INSTR(FRC, trans_instr, .opc = INST_OPCODE_FRC, .src = {2, -1, -1}),
INSTR(CEIL, trans_instr, .opc = INST_OPCODE_CEIL, .src = {2, -1, -1}),
diff --git a/src/gallium/drivers/etnaviv/etnaviv_internal.h 
b/src/gallium/drivers/etnaviv/etnaviv_internal.h
index 1212fdf..8a31167 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_internal.h
+++ b/src/gallium/drivers/etnaviv/etnaviv_internal.h
@@ -70,8 

Re: [Mesa-dev] [PATCH] meta: Fix BlitFramebuffer temp texture setup

2017-07-11 Thread Ville Syrjälä
On Mon, Jul 10, 2017 at 11:42:18PM +0300, Andres Gomez wrote:
> Ville, has this patch fallen through the cracks ?

Nope. I've still been looking into the issue whenever I've had a
few minutes to spare. I think I have it more or less figured out
at this stage, but I'll need to respin this patch, and clean up
some further patches to fix this stuff properly for i915.

> 
> On Fri, 2017-06-23 at 14:58 +0300, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä 
> > 
> > Pass the correct src coordinates to CopyTexSubImage()
> > when creating the temporary texture, and also take care to adjust
> > flipX/Y if the original src coordinates were flipped compared to
> > the new temporary texture src coordinates.
> > 
> > This fixes all the flip_src_x/y tests in
> > piglit.spec.arb_framebuffer_object.fbo-blit-stretch on i915, but
> > we're still left with some failures in the stretch tests.
> > 
> > It looks to me like commit b702233f53d6 ("meta: Refactor the
> > BlitFramebuffer color CopyTexImage fallback.") most likely
> > broke this codepath.
> > 
> > Cc: mesa-sta...@lists.freedesktop.org
> > Cc: Eric Anholt 
> > Cc: Kenneth Graunke 
> > Cc: Ian Romanick 
> > Cc: Anuj Phogat 
> > Fixes: b702233f53d6 ("meta: Refactor the BlitFramebuffer color CopyTexImage 
> > fallback.")
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101414
> > Signed-off-by: Ville Syrjälä 
> > ---
> >  src/mesa/drivers/common/meta_blit.c | 8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/common/meta_blit.c 
> > b/src/mesa/drivers/common/meta_blit.c
> > index 7adad469aceb..7262ecdfaf13 100644
> > --- a/src/mesa/drivers/common/meta_blit.c
> > +++ b/src/mesa/drivers/common/meta_blit.c
> > @@ -680,12 +680,16 @@ blitframebuffer_texture(struct gl_context *ctx,
> >}
> >  
> >_mesa_meta_setup_copypix_texture(ctx, meta_temp_texture,
> > -   srcX0, srcY0,
> > +   MIN2(srcX0, srcX1),
> > +   MIN2(srcY0, srcY1),
> > srcW, srcH,
> > tex_base_format,
> > filter);
> >  
> > -
> > +  if (srcX0 > srcX1)
> > + flipX = -flipX;
> > +  if (srcY0 > srcY1)
> > + flipY = -flipY;
> >srcX0 = 0;
> >srcY0 = 0;
> >srcX1 = srcW;
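
The coordinate fix in the quoted patch can be sketched on its own as follows.
This is only an illustration under stated assumptions: demo_blit_src and
demo_normalize_src() are hypothetical names, the flip factors are assumed to
be +1 or -1, and the width and height are assumed to be the absolute
differences of the source coordinates.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_blit_src {
   int x, y;            /* corner handed to the texture copy */
   int w, h;            /* always non-negative */
   int flip_x, flip_y;  /* +1 = keep orientation, -1 = mirror */
};

/* Pick the smaller coordinate as the copy origin and fold a reversed
 * source range into the flip factor instead.
 */
static struct demo_blit_src
demo_normalize_src(int srcX0, int srcY0, int srcX1, int srcY1,
                   int flip_x, int flip_y)
{
   struct demo_blit_src s;

   assert(flip_x == 1 || flip_x == -1);
   assert(flip_y == 1 || flip_y == -1);

   s.x = srcX0 < srcX1 ? srcX0 : srcX1;
   s.y = srcY0 < srcY1 ? srcY0 : srcY1;
   s.w = abs(srcX1 - srcX0);
   s.h = abs(srcY1 - srcY0);
   s.flip_x = srcX0 > srcX1 ? -flip_x : flip_x;
   s.flip_y = srcY0 > srcY1 ? -flip_y : flip_y;
   return s;
}
```

For a source range given as X 10..4 and Y 2..8, the copy would start at (4, 2)
with size 6x6, and only the X flip factor changes sign.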
> -- 
> Br,
> 
> Andres

-- 
Ville Syrjälä
Intel OTC


Re: [Mesa-dev] [PATCH] st/dri: add 32-bit RGBX/RGBA formats

2017-07-11 Thread Emil Velikov
On 10 July 2017 at 22:33, Chad Versace  wrote:
> On Mon 10 Jul 2017, Chad Versace wrote:
>> On Fri 07 Jul 2017, Rob Herring wrote:
>> > On Wed, Jul 5, 2017 at 5:14 PM, Chad Versace  
>> > wrote:
>> > > On Fri 30 Jun 2017, Rob Herring wrote:
>> > >> Add support for 32-bit RGBX/RGBA formats which are required for Android.
>> > >>
>> > >> The original patch (commit ccdcf91104a5) was reverted (commit
>> > >> c0c6ca40a25e) in mesa as it broke GLX resulting in swapped colors. Based
>> > >> on further investigation by Chad Versace, moving the RGBX/RGBA configs
>> > >> to the end is enough to prevent breaking GLX.
>> > >>
>> > >> Cc: Marek Olšák 
>> > >> Cc: Eric Anholt 
>> > >> Cc: Chad Versace 
>> > >> Cc: Mauro Rossi 
>> > >> Signed-off-by: Rob Herring 
>> > >> ---
>> > >> I've tested only on Android and could use help testing with KDE which
>> > >> broke last time. This has been done on the Intel driver and *should* be
>> > >> okay, but maybe not.
>> > >
>> > > Should this patch also update the switch statement in
>> > > dri2.c:dri2_drawable_get_buffers()? I think so, but am not certain.
>> >
>> > I don't know. At least for Android, I think we'd always take the
>> > dri_image_drawable_get_buffers path which already has the formats.
>>
>> True, I think Android always takes the dri_image_drawable_get_buffers()
>> path. It wouldn't hurt to also add the formats to
>> dri2_drawable_get_buffers(), but I doubt that function will ever see the
>> new formats.
>>
>> Reviewed-by: Chad Versace 
>
> Oops. I retract my r-b.
>
> dri_create_context() and dri_create_buffer() call dri_fill_st_visual(),
> but dri_fill_st_visual() hasn't yet been taught about the new formats.
> But I don't understand Gallium well enough to know when those functions
> get called.

Marek had a patch for that in https://bugs.freedesktop.org/show_bug.cgi?id=95071
I thought it was merged, but it seems not.

-Emil


[Mesa-dev] [PATCH 2/2] - drirc: whitelist glthread for Euro Truck Simulator 2

2017-07-11 Thread Petr Sebor
---
 src/mesa/drivers/dri/common/drirc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index 1311db9..be731a0 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -178,6 +178,9 @@ TODO: document the other workarounds.
 
 
 
+
+
+
 
 
 
-- 
2.5.0



[Mesa-dev] [PATCH 1/2] - drirc: whitelist glthread for American Truck Simulator

2017-07-11 Thread Petr Sebor
---
 src/mesa/drivers/dri/common/drirc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index 69b735c..1311db9 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -175,6 +175,9 @@ TODO: document the other workarounds.
 
 
 
+
+
+
 
 
 
-- 
2.5.0



Re: [Mesa-dev] [PATCH 07/10] util/u_queue: add an option to resize the queue when it's full

2017-07-11 Thread Grazvydas Ignotas
On Tue, Jul 11, 2017 at 12:21 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Consider the following situation:
>   mtx_lock(mutex);
>   do_something();
>   util_queue_add_job(...);
>   mtx_unlock(mutex);
>
> If the queue is full, util_queue_add_job will wait for a free slot.
> If the job which is currently being executed tries to lock the mutex,
> it will be stuck forever, because util_queue_add_job is stuck.
>
> The deadlock can be trivially resolved by increasing the queue size
> (reallocating the queue) in util_queue_add_job if the queue is full.
> Then util_queue_add_job becomes wait-free.
>
> radeonsi will use it.

Can't this cause the queue to grow uncontrollably, like on GPU hangs,
making already difficult to debug situations worse? Perhaps
util_queue_add_job() could have a non-blocking-fail option and the
caller could then retry after releasing the mutex for a bit.
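
A rough sketch of the non-blocking idea, with everything in it hypothetical:
demo_queue, demo_queue_try_add() and demo_submit_locked() are illustration
names, not the util_queue API, and a fixed-size ring of ints stands in for
real jobs. The retry limit is an extra assumption so a stuck worker cannot
make the caller spin forever.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

#define DEMO_QUEUE_SLOTS 4

struct demo_queue {
   pthread_mutex_t lock;
   int jobs[DEMO_QUEUE_SLOTS];
   unsigned head;   /* index of the oldest job */
   unsigned count;  /* number of queued jobs */
};

static void demo_queue_init(struct demo_queue *q)
{
   pthread_mutex_init(&q->lock, NULL);
   q->head = 0;
   q->count = 0;
}

/* Returns false instead of waiting when no slot is free. */
static bool demo_queue_try_add(struct demo_queue *q, int job)
{
   bool added = false;

   pthread_mutex_lock(&q->lock);
   if (q->count < DEMO_QUEUE_SLOTS) {
      q->jobs[(q->head + q->count) % DEMO_QUEUE_SLOTS] = job;
      q->count++;
      added = true;
   }
   pthread_mutex_unlock(&q->lock);
   return added;
}

/* Removes the oldest job; a worker thread would call this. */
static bool demo_queue_pop(struct demo_queue *q, int *job)
{
   bool popped = false;

   assert(job);
   pthread_mutex_lock(&q->lock);
   if (q->count > 0) {
      *job = q->jobs[q->head];
      q->head = (q->head + 1) % DEMO_QUEUE_SLOTS;
      q->count--;
      popped = true;
   }
   pthread_mutex_unlock(&q->lock);
   return popped;
}

/* Called with caller_mutex held and returns with it held. While the queue
 * is full, the mutex is dropped briefly so a running job that needs it can
 * make progress, then the add is retried up to max_retries times.
 */
static bool demo_submit_locked(struct demo_queue *q,
                               pthread_mutex_t *caller_mutex,
                               int job, unsigned max_retries)
{
   for (unsigned attempt = 0; ; attempt++) {
      if (demo_queue_try_add(q, job))
         return true;
      if (attempt == max_retries)
         return false;
      pthread_mutex_unlock(caller_mutex);
      sched_yield();
      pthread_mutex_lock(caller_mutex);
   }
}
```

The queue never grows here, so a hung consumer shows up as a failed submit
rather than as unbounded memory use.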

Gražvydas


Re: [Mesa-dev] [PATCH] radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT

2017-07-11 Thread Alex Smith
Ping.

On 30 June 2017 at 11:18, Alex Smith  wrote:
> If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image
> view's descriptor was set to a 2D array (and a few other fields adjusted
> accordingly). This is correct when the image view is actually bound as a
> storage image, but not when bound as a sampled image. In that case the
> type should be set as a cube.
>
> Fix by generating 2 sets of descriptors at view creation time for both
> storage and non-storage usage, and then choose between them based on
> descriptor type when writing descriptor sets.
>
> Signed-off-by: Alex Smith 
> ---
>  src/amd/vulkan/radv_descriptor_set.c | 18 +++--
>  src/amd/vulkan/radv_image.c  | 77 
> 
>  src/amd/vulkan/radv_private.h|  6 +++
>  3 files changed, 72 insertions(+), 29 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_descriptor_set.c 
> b/src/amd/vulkan/radv_descriptor_set.c
> index ec7fd3d..b4a78aa 100644
> --- a/src/amd/vulkan/radv_descriptor_set.c
> +++ b/src/amd/vulkan/radv_descriptor_set.c
> @@ -603,11 +603,18 @@ write_image_descriptor(struct radv_device *device,
>struct radv_cmd_buffer *cmd_buffer,
>unsigned *dst,
>struct radeon_winsys_bo **buffer_list,
> +  VkDescriptorType descriptor_type,
>const VkDescriptorImageInfo *image_info)
>  {
> RADV_FROM_HANDLE(radv_image_view, iview, image_info->imageView);
> -   memcpy(dst, iview->descriptor, 8 * 4);
> -   memcpy(dst + 8, iview->fmask_descriptor, 8 * 4);
> +
> +   if (descriptor_type == VK_DESCRIPTOR_TYPE_STORAGE_IMAGE) {
> +   memcpy(dst, iview->storage_descriptor, 8 * 4);
> +   memcpy(dst + 8, iview->storage_fmask_descriptor, 8 * 4);
> +   } else {
> +   memcpy(dst, iview->descriptor, 8 * 4);
> +   memcpy(dst + 8, iview->fmask_descriptor, 8 * 4);
> +   }
>
> if (cmd_buffer)
> device->ws->cs_add_buffer(cmd_buffer->cs, iview->bo, 7);
> @@ -620,12 +627,13 @@ write_combined_image_sampler_descriptor(struct 
> radv_device *device,
> struct radv_cmd_buffer *cmd_buffer,
> unsigned *dst,
> struct radeon_winsys_bo **buffer_list,
> +   VkDescriptorType descriptor_type,
> const VkDescriptorImageInfo 
> *image_info,
> bool has_sampler)
>  {
> RADV_FROM_HANDLE(radv_sampler, sampler, image_info->sampler);
>
> -   write_image_descriptor(device, cmd_buffer, dst, buffer_list, 
> image_info);
> +   write_image_descriptor(device, cmd_buffer, dst, buffer_list, 
> descriptor_type, image_info);
> /* copy over sampler state */
> if (has_sampler)
> memcpy(dst + 16, sampler->state, 16);
> @@ -696,10 +704,12 @@ void radv_update_descriptor_sets(
> case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
> case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
> write_image_descriptor(device, cmd_buffer, 
> ptr, buffer_list,
> +  
> writeset->descriptorType,
>writeset->pImageInfo + 
> j);
> break;
> case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
> 
> write_combined_image_sampler_descriptor(device, cmd_buffer, ptr, buffer_list,
> +   
> writeset->descriptorType,
> 
> writeset->pImageInfo + j,
> 
> !binding_layout->immutable_samplers_offset);
> if (copy_immutable_samplers) {
> @@ -866,10 +876,12 @@ void radv_update_descriptor_set_with_template(struct 
> radv_device *device,
> case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
> case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
> write_image_descriptor(device, cmd_buffer, 
> pDst, buffer_list,
> +  
> templ->entry[i].descriptor_type,
>(struct 
> VkDescriptorImageInfo *) pSrc);
> break;
> case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
> 
> write_combined_image_sampler_descriptor(device, cmd_buffer, pDst, buffer_list,
> +   

[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-07-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #23 from John  ---
Some apps worked, others froze the system.

I'm still hopeful to find a fix here :)



[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-07-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #22 from Marko  ---
(In reply to John from comment #21)
> I believe that's a same generation card, so it would make sense to behave
> similarly.

Yeah. Did it ever work for you? RADV was a no-go on my card from day one.



[Mesa-dev] [PATCH 2/2] i965: Fix asynchronous mappings on !LLC platforms.

2017-07-11 Thread Kenneth Graunke
When using a read-only CPU mapping, we may encounter stale buffer
contents.  For example, the Piglit primitive-restart test offers the
following scenario:

   1. Read data via a CPU map.
   2. Destroy that buffer.
   3. Create a new buffer - obtaining the same one via the BO cache.
   4. Call BufferSubData, which does a GTT map with MAP_WRITE | MAP_ASYNC.
  (We avoid set_domain for async mappings, so no flushing occurs.)
   5. Read data via a CPU map.
  (Without explicit clflushing, this will contain data from step 1!)

Otherwise, everything ought to work, keeping in mind that we never use
CPU maps for writing - just read-only CPU maps.
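
For readers unfamiliar with explicit cache invalidation, the idea can be
sketched roughly as below. This is not the code in common/gen_clflush.h: the
demo_* names are hypothetical, a 64-byte cacheline is assumed, and the flush
is only issued when the compiler targets x86 with SSE2 (elsewhere the sketch
just counts lines).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_clflush(), _mm_mfence() */
#endif

#define DEMO_CACHELINE_SIZE 64u /* assumption: 64-byte cachelines */

/* Number of cachelines overlapped by [start, start + size). */
static size_t demo_cachelines_in_range(const void *start, size_t size)
{
   const uintptr_t mask = ~(uintptr_t)(DEMO_CACHELINE_SIZE - 1);
   uintptr_t first, last;

   if (size == 0)
      return 0;
   first = (uintptr_t)start & mask;
   last = ((uintptr_t)start + size - 1) & mask;
   return (size_t)((last - first) / DEMO_CACHELINE_SIZE) + 1;
}

/* Flush every cacheline covering the range, then fence so that later loads
 * are not satisfied from stale lines. Returns the number of lines visited.
 */
static size_t demo_invalidate_range(const void *start, size_t size)
{
   const uintptr_t mask = ~(uintptr_t)(DEMO_CACHELINE_SIZE - 1);
   uintptr_t line = (uintptr_t)start & mask;
   size_t n = demo_cachelines_in_range(start, size);

   assert(start != NULL || size == 0);
   for (size_t i = 0; i < n; i++, line += DEMO_CACHELINE_SIZE) {
#if defined(__SSE2__)
      _mm_clflush((const void *)line);
#else
      (void)line; /* no portable equivalent; nothing is flushed here */
#endif
   }
#if defined(__SSE2__)
   _mm_mfence();
#endif
   return n;
}
```

In the scenario above, a call like this would sit just before the CPU read in
step 5, covering the whole mapping.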

This restores the performance gains after Matt's revert in commit
71651b3139c501f50e6547c21a1cdb816b0a9dde.

v2: Do the invalidate later, and even when asking for a brand new map.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 30e4b28b9e0..af6524b8100 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -56,6 +56,7 @@
 #ifndef ETIME
 #define ETIME ETIMEDOUT
 #endif
+#include "common/gen_clflush.h"
 #include "common/gen_debug.h"
 #include "common/gen_device_info.h"
 #include "libdrm_macros.h"
@@ -703,11 +704,24 @@ brw_bo_map_cpu(struct brw_context *brw, struct brw_bo 
*bo, unsigned flags)
bo->map_cpu);
print_flags(flags);
 
-   if (!(flags & MAP_ASYNC) || !bufmgr->has_llc) {
+   if (!(flags & MAP_ASYNC)) {
   set_domain(brw, "CPU mapping", bo, I915_GEM_DOMAIN_CPU,
  flags & MAP_WRITE ? I915_GEM_DOMAIN_CPU : 0);
}
 
+   if (!bo->cache_coherent) {
+  /* If we're reusing an existing CPU mapping, the CPU caches may
+   * contain stale data from the last time we read from that mapping.
+   * (With the BO cache, it might even be data from a previous buffer!)
+   * Even if it's a brand new mapping, the kernel may have zeroed the
+   * buffer via CPU writes.
+   *
+   * We need to invalidate those cachelines so that we see the latest
+   * contents.
+   */
+  gen_invalidate_range(bo->map_cpu, bo->size);
+   }
+
return bo->map_cpu;
 }
 
@@ -754,7 +768,7 @@ brw_bo_map_gtt(struct brw_context *brw, struct brw_bo *bo, 
unsigned flags)
DBG("bo_map_gtt: %d (%s) -> %p, ", bo->gem_handle, bo->name, bo->map_gtt);
print_flags(flags);
 
-   if (!(flags & MAP_ASYNC) || !bufmgr->has_llc) {
+   if (!(flags & MAP_ASYNC)) {
   set_domain(brw, "GTT mapping", bo,
  I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
}
-- 
2.13.2



[Mesa-dev] [PATCH 1/2] i965: Don't use PREAD for glGetBufferSubData().

2017-07-11 Thread Kenneth Graunke
Just map the buffer and memcpy.  This will do a CPU mmap, which should
be reasonably efficient, and doing this gives us full control over the
domains and caching instead of leaving it to the kernel.

This prevents regressions on Braswell in the next commit.  Specifically
GL45-CTS.shader_atomic_counters.basic-buffer-operations.  Because async
maps start skipping set-domain, the pread thought everything was nicely
still in the CPU domain, and returned stale data.

v2: Use _mesa_error_no_memory() if the map fails instead of crashing.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c   | 24 
 src/mesa/drivers/dri/i965/brw_bufmgr.h   |  3 ---
 src/mesa/drivers/dri/i965/intel_buffer_objects.c | 11 ++-
 3 files changed, 10 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 11251f15edc..30e4b28b9e0 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -820,30 +820,6 @@ brw_bo_subdata(struct brw_bo *bo, uint64_t offset,
return ret;
 }
 
-int
-brw_bo_get_subdata(struct brw_bo *bo, uint64_t offset,
-   uint64_t size, void *data)
-{
-   struct brw_bufmgr *bufmgr = bo->bufmgr;
-   struct drm_i915_gem_pread pread;
-   int ret;
-
-   memclear(pread);
-   pread.handle = bo->gem_handle;
-   pread.offset = offset;
-   pread.size = size;
-   pread.data_ptr = (uint64_t) (uintptr_t) data;
-   ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_PREAD, &pread);
-   if (ret != 0) {
-  ret = -errno;
-  DBG("%s:%d: Error reading data from buffer %d: "
-  "(%"PRIu64" %"PRIu64") %s .\n",
-  __FILE__, __LINE__, bo->gem_handle, offset, size, strerror(errno));
-   }
-
-   return ret;
-}
-
 /** Waits for all GPU rendering with the object to have completed. */
 void
 brw_bo_wait_rendering(struct brw_bo *bo)
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h 
b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index d388e5ad150..01a540f5315 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -222,9 +222,6 @@ static inline int brw_bo_unmap(struct brw_bo *bo) { return 
0; }
 /** Write data into an object. */
 int brw_bo_subdata(struct brw_bo *bo, uint64_t offset,
uint64_t size, const void *data);
-/** Read data from an object. */
-int brw_bo_get_subdata(struct brw_bo *bo, uint64_t offset,
-   uint64_t size, void *data);
 /**
  * Waits for rendering to an object by the GPU to have completed.
  *
diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c 
b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
index a9ac29a6a81..85cc1a694bf 100644
--- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
@@ -289,7 +289,16 @@ brw_get_buffer_subdata(struct gl_context *ctx,
if (brw_batch_references(&brw->batch, intel_obj->buffer)) {
   intel_batchbuffer_flush(brw);
}
-   brw_bo_get_subdata(intel_obj->buffer, offset, size, data);
+
+   void *map = brw_bo_map(brw, intel_obj->buffer, MAP_READ);
+
+   if (unlikely(!map)) {
+  _mesa_error_no_memory(__func__);
+  return;
+   }
+
+   memcpy(data, map + offset, size);
+   brw_bo_unmap(intel_obj->buffer);
 
mark_buffer_inactive(intel_obj);
 }
-- 
2.13.2



[Mesa-dev] XCOM: Enemy Unknown vs. NaN texture unit LOD bias

2017-07-11 Thread Kenneth Graunke
Hello,

Mesa master has been hitting assert failures when running "XCOM: Enemy
Unknown" since commit f8d69beed49c64f883bb8ffb28d4960306baf575, where we
started asserting that the SAMPLER_STATE LOD Bias value actually fits in
the correct number of bits.

Apparently, XCOM calls

   glTexEnv(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT, val);

to set the texture unit LOD bias...but according to gdb, the value is:

   -nan(0x73)

In i965, we do CLAMP(lod bias, -16, 15)...but NaN ends up failing both
the < min and > max comparisons, so it slips through.  But, that raises
the question...what value *should* we be using?  0?  Min?  Max?

I couldn't find any immediately applicable GL spec text.  Anyone know of
any?  If not, does DirectX mandate something?

I wrote a hack to check isnan and replace it with 0, which gets the game
working again, but...it seems like we could have this problem in a lot of
other places too...and I'm not sure what the right answer is.
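
A minimal sketch of the isnan approach, not the actual commit linked below:
the helper name sanitize_lod_bias() is hypothetical, and mapping NaN to 0 is
exactly the open question in this mail rather than a settled answer.

```c
#include <assert.h>
#include <math.h>

/* Replace a NaN bias with 0, then clamp the result into [lo, hi].
 * Because the replacement happens before the clamp, the returned value
 * always lies inside the range, even if 0 itself is outside it.
 */
static float sanitize_lod_bias(float bias, float lo, float hi)
{
   assert(lo <= hi); /* also rejects NaN bounds */

   if (isnan(bias))
      bias = 0.0f;
   if (bias < lo)
      return lo;
   if (bias > hi)
      return hi;
   return bias;
}
```

With the i965 range of -16 to 15, a NaN bias comes out as 0 and ordinary
values are clamped as before.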

https://cgit.freedesktop.org/~kwg/mesa/commit/?h=xcom&id=6a1c0515b760c943eb547cced754b465aa3bd4ca

Thanks for any advice :)

--Ken
