date:20171017

[Mesa-dev] [PATCH] radv/image: bump all the offset to uint64_t.

2017-10-17 Thread Dave Airlie

From: Dave Airlie 

So one of the CTS tests tries to allocate a 16384x1 texture with 2048
array layers. This overflows a bunch of calculations when we want it
tiled, as the height goes to 128.

addrlib returns us the correct size (16GB or so), but we mangle
it in the htile calcs due to the 32-bit offset fields, then
userspace gives us the reduced number and we try to allocate
it on a heap and things blow up.

We really need to give the app back the correct size for the
image so we can blow up properly in memory allocation later.

This should fix hangs in
dEQP-VK.pipeline.render_to_image.core.1d_array.huge.width_layers.r8g8b8a8_unorm_d32_sfloat_s8_uint
since
Fixes: ad3d98da9f (radv: enable tc compatible htile for d32s8 also.)

Now there's an open question if we should be enabling tc-compat
htile at all for shallow textures like the above.

This might cause some other weird side effects in CTS even
without the tc compat so:
Cc: "17.2" 

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_private.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index cb5bf6881e1..7496fe51a68 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1213,15 +1213,15 @@ struct radv_image {
/* Set when bound */
struct radeon_winsys_bo *bo;
VkDeviceSize offset;
-   uint32_t dcc_offset;
-   uint32_t htile_offset;
+   uint64_t dcc_offset;
+   uint64_t htile_offset;
bool tc_compatible_htile;
struct radeon_surf surface;
 
struct radv_fmask_info fmask;
struct radv_cmask_info cmask;
-   uint32_t clear_value_offset;
-   uint32_t dcc_pred_offset;
+   uint64_t clear_value_offset;
+   uint64_t dcc_pred_offset;
 };
 
 /* Whether the image has a htile that is known consistent with the contents of
-- 
2.14.2
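
The truncation described in the commit message can be reproduced in isolation. The following is a minimal sketch, not radv code: the helper names are hypothetical, and the 4 bytes per texel is an assumption picked so the total matches the "16GB or so" quoted above (16384 wide, height padded to 128, 2048 layers).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total bytes of a tiled array surface.
 * The cast promotes to 64 bits before multiplying, so the product
 * cannot wrap the way a 32-bit calculation would. */
static uint64_t
surface_size_bytes(uint32_t width, uint32_t padded_height,
                   uint32_t layers, uint32_t bytes_per_texel)
{
    assert(bytes_per_texel > 0);
    return (uint64_t)width * padded_height * layers * bytes_per_texel;
}

/* What a uint32_t offset field would end up holding: the low 32 bits. */
static uint32_t
truncate_to_u32(uint64_t value)
{
    return (uint32_t)value;
}

/* Hypothetical metadata offset placed after the main surface,
 * rounded up to a power-of-two alignment and kept in 64 bits. */
static uint64_t
metadata_offset(uint64_t surface_size, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (surface_size + alignment - 1) & ~(alignment - 1);
}
```

With these assumed numbers the surface is exactly 2^34 bytes, so a 32-bit field stores 0 for any offset placed right behind it, which is the kind of "reduced number" the message refers to.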

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] drirc: Group a few games in the glthread whitelist together.

2017-10-17 Thread Darren Salt

---
 src/util/drirc | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/src/util/drirc b/src/util/drirc
index 3cf3d8dc69..39ac3c858c 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -166,27 +166,37 @@ TODO: document the other workarounds.
 
 
 
-
-
-
+
 
 
 
+
 
 
 
+
 
 
 
+
+
+
+
 
 
 
+
 
 
 
+
 
 
 
+
+
+
+
 
 
 
@@ -196,39 +206,44 @@ TODO: document the other workarounds.
 
 
 
+
 
 
 
+
 
 
 
+
 
 
 
 
 
 
+
 
 
 
 
 
 
+
 
 
 
+
 
 
 
-
-
-
 
 
 
+
 
 
 
+
 
 
 
-- 
2.11.0


[Mesa-dev] [PATCH V3] automake: intel: move expat handling where it's used

2017-10-17 Thread Hongxu Jia

Linking libvulkan_intel.so can fail, due to unresolved references to
libexpat.so.

EXPAT_CFLAGS should be moved as well.

Signed-off-by: Hongxu Jia 
---
 src/intel/Makefile.common.am | 1 +
 src/intel/Makefile.tools.am  | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/intel/Makefile.common.am b/src/intel/Makefile.common.am
index 49e9c6a..3789dc1 100644
--- a/src/intel/Makefile.common.am
+++ b/src/intel/Makefile.common.am
@@ -23,6 +23,7 @@ noinst_LTLIBRARIES += common/libintel_common.la
 
 common_libintel_common_la_CFLAGS = $(AM_CFLAGS) $(LIBDRM_CFLAGS)
 common_libintel_common_la_SOURCES = $(COMMON_FILES)
+common_libintel_common_la_LIBADD = $(EXPAT_LIBS)
 
 if HAVE_PLATFORM_ANDROID
 common_libintel_common_la_CFLAGS += $(ANDROID_CFLAGS)
diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
index 8071220..32cdc70 100644
--- a/src/intel/Makefile.tools.am
+++ b/src/intel/Makefile.tools.am
@@ -41,7 +41,6 @@ tools_aubinator_LDADD = \
$(PER_GEN_LIBS) \
$(PTHREAD_LIBS) \
$(DLOPEN_LIBS) \
-   $(EXPAT_LIBS) \
$(ZLIB_LIBS) \
-lm
 
@@ -56,7 +55,6 @@ tools_aubinator_error_decode_LDADD = \
compiler/libintel_compiler.la \
$(top_builddir)/src/util/libmesautil.la \
$(PTHREAD_LIBS) \
-   $(EXPAT_LIBS) \
$(ZLIB_LIBS)
 
 tools_aubinator_error_decode_CFLAGS = \
-- 
2.7.4


[Mesa-dev] [PATCH 1/2] drirc: Enable glthread for more games (Saints Row 4 & Gat out of Hell).

2017-10-17 Thread Darren Salt

“Saints Row: Gat out of Hell” benefits from this on slower CPUs in that
usage spikes on individual cores are avoided, which in turn makes it harder
to hit a bug which causes broken audio and the game to hang on exit.

“Saints Row IV” appears to be fine either way, but also exhibits the audio
breakage bug: glthread is therefore being enabled on the grounds that it should
make it a little harder to hit that bug.
---
 src/util/drirc | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/util/drirc b/src/util/drirc
index 5ca4a266ec..3cf3d8dc69 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -190,6 +190,12 @@ TODO: document the other workarounds.
 
 
 
+
+
+
+
+
+
 
 
 
-- 
2.11.0


Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] auxiliary: use vl_drm_screen_create method for surfaceless

2017-10-17 Thread Christian König


Hi Suresh,

in general please don't use HTML mail (e.g. no color or bold formatting
etc. in mails).

Mailing lists usually reject that, but Mesa isn't so strict about it, so
it hopefully works out.


For more comments see below.

On 17.10.2017 at 08:36, Guttula, Suresh wrote:
On 11 October 2017 at 07:13, Guttula, Suresh wrote:

> Hi,
>
>> - why do we need "surfaceless" support
> ChromeOS supports surfaceless and we need this VA enablement for
> surfaceless in Chromium.
Ack, that should have been part of the commit message.
>>> I will update the commit message.
>> - does upstream VAAPI have a surfaceless platform
> Yes. It uses headless support of VA-API for decoding.
There's no VA_DISPLAY_SURFACELESS in libva [1]. Thus adding one here
is _very_ confusing and misleading.
>>> Sorry, I misunderstood the question; I thought you were asking
>>> about mesa-vaapi. libva uses the DRM path only. If I understand
>>> correctly, there is no need for a VA_DISPLAY_SURFACELESS macro in
>>> libva, as there is no problem using the DRM path for the EGL
>>> surfaceless platform. The problem is on the Mesa side, where the
>>> check that enables VA is based on the platform.

[1] https://github.com/01org/libva/blob/master/va/va_backend.h#L39
>>> libva uses “VA_DISPLAY_DRM_RENDERNODES” in this case. With libva,
>>> Chromium (Ozone) on the EGL surfaceless platform uses the DRM display:
>>> https://cs.chromium.org/chromium/src/media/gpu/vaapi_wrapper.cc?rcl=e1a85cf02acf0b4ccaad6e37afcf41d1fd26ce24&l=1188
>> - why is the surfaceless implementation identical to the DRM one
> If I understand your question correctly: on the surfaceless platform,
> it uses the headless support of VAAPI, which uses the DRM
> implementation. If I missed something here, please provide some more
> details on the question.

>
To put it otherwise:
You're "adding" support for surfaceless for the sake of adding a name.
There's no functional difference nor upstream (see the libva question
above) demand for it.
>>> The reason for adding "surfaceless" in Mesa is that the condition
>>> that enables VA checks for the platforms "drm/wayland/x11".
>>> But for Chromium we build Mesa with
>>> *with_egl_platforms=surfaceless* and *not mesa_gbm*, because Chromium
>>> uses *minigbm*. So echo $platform prints surfaceless and, even though
>>> it uses the DRM path, the condition check fails because the platform
>>> type is picked as *surfaceless*, and VA is not enabled.

What is stopping you from using --with-platforms=drm ?
>>> Because if we use *with_platforms=drm*, we need to enable
>>> *mesa_gbm*, which is not required for
>>> Chromium (*with_egl_platforms=surfaceless*).


That sounds reasonable to me and I think Emil can agree on that.

But we should add that to the commit message to make it obvious for 
everybody why we do this.


Regards,
Christian.


Thanks,
Suresh G




Re: [Mesa-dev] [PATCH 1/6] i965: Only put external handles into the handle ht

2017-10-17 Thread Chris Wilson

Quoting Kenneth Graunke (2017-10-16 20:07:05)
> I'd like to try out the
> AMD_pinned_memory support with Left 4 Dead 2, as I know a bunch of
> the Source 1 games use AMD_pinned_memory for better performance...

Finally got around to building a 32-bit Mesa for Steam; Left 4 Dead 2 reports:

SDL failed to create GL compatibility profile (whichProfile=0!
This system supports the OpenGL extension GL_EXT_framebuffer_object.
This system supports the OpenGL extension GL_EXT_framebuffer_blit.
This system supports the OpenGL extension GL_EXT_framebuffer_multisample.
This system DOES NOT support the OpenGL extension GL_APPLE_fence.
This system DOES NOT support the OpenGL extension GL_NV_fence.
This system supports the OpenGL extension GL_ARB_sync.
This system supports the OpenGL extension GL_EXT_draw_buffers2.
This system DOES NOT support the OpenGL extension GL_EXT_bindable_uniform.
This system DOES NOT support the OpenGL extension GL_APPLE_flush_buffer_range.
This system supports the OpenGL extension GL_ARB_map_buffer_range.
This system supports the OpenGL extension GL_ARB_vertex_buffer_object.
This system supports the OpenGL extension GL_ARB_occlusion_query.
This system DOES NOT support the OpenGL extension GL_APPLE_texture_range.
This system DOES NOT support the OpenGL extension GL_APPLE_client_storage.
This system DOES NOT support the OpenGL extension GL_ARB_uniform_buffer.
This system supports the OpenGL extension GL_ARB_vertex_array_bgra.
This system supports the OpenGL extension GL_EXT_vertex_array_bgra.
This system supports the OpenGL extension GL_ARB_framebuffer_object.
This system DOES NOT support the OpenGL extension GL_GREMEDY_string_marker.
This system supports the OpenGL extension GL_ARB_debug_output.
This system DOES NOT support the OpenGL extension GL_EXT_direct_state_access.
This system DOES NOT support the OpenGL extension GL_NV_bindless_texture.
This system supports the OpenGL extension GL_AMD_pinned_memory.
This system supports the OpenGL extension GL_EXT_framebuffer_multisample_blit_scaled.
This system supports the OpenGL extension GL_EXT_texture_sRGB_decode.
This system DOES NOT support the OpenGL extension GL_NVX_gpu_memory_info.
This system DOES NOT support the OpenGL extension GL_ATI_meminfo.
This system supports the OpenGL extension GL_EXT_texture_compression_s3tc.
This system supports the OpenGL extension GL_EXT_texture_compression_dxt1.
This system supports the OpenGL extension GL_ANGLE_texture_compression_dxt3.
This system supports the OpenGL extension GL_ANGLE_texture_compression_dxt5.
This system DOES NOT support the OpenGL extension GLX_EXT_swap_control_tear.
GL_NV_bindless_texture: DISABLED
GL_AMD_pinned_memory: DISABLED
GL_EXT_texture_sRGB_decode: AVAILABLE
GL_NVX_gpu_memory_info: UNAVAILABLE
GL_ATI_meminfo: UNAVAILABLE
GL_MAX_SAMPLES_EXT: 16

Halfway there?
-Chris


[Mesa-dev] [PATCH v2] glsl/linker: produce error when invalid explicit locations are used

2017-10-17 Thread Iago Toral Quiroga

We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.

v2: compute location slots properly depending on shader stage and
variable type / direction

Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit
---
 src/compiler/glsl/link_varyings.cpp | 43 ++++++++++++++++++++++++++++++++++++++++---
 src/compiler/glsl/link_varyings.h   |  3 ++-
 src/compiler/glsl/linker.cpp        |  2 +-
 3 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
index 29842ecacd..69c92bf53b 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -377,11 +377,38 @@ cross_validate_front_and_back_color(struct gl_shader_program *prog,
   consumer_stage, producer_stage);
 }
 
+static unsigned
+compute_variable_location_slot(ir_variable *var, gl_shader_stage stage)
+{
+   unsigned location_start = VARYING_SLOT_VAR0;
+
+   switch (stage) {
+  case MESA_SHADER_VERTEX:
+ if (var->data.mode == ir_var_shader_in)
+location_start = VERT_ATTRIB_GENERIC0;
+ break;
+  case MESA_SHADER_TESS_CTRL:
+  case MESA_SHADER_TESS_EVAL:
+ if (var->data.patch)
+location_start = VARYING_SLOT_PATCH0;
+ break;
+  case MESA_SHADER_FRAGMENT:
+ if (var->data.mode == ir_var_shader_out)
+location_start = FRAG_RESULT_DATA0;
+ break;
+  default:
+ break;
+   }
+
+   return var->data.location - location_start;
+}
+
 /**
  * Validate that outputs from one stage match inputs of another
  */
 void
-cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
+cross_validate_outputs_to_inputs(struct gl_context *ctx,
+ struct gl_shader_program *prog,
  gl_linked_shader *producer,
  gl_linked_shader *consumer)
 {
@@ -406,10 +433,19 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
   */
  const glsl_type *type = get_varying_type(var, producer->Stage);
  unsigned num_elements = type->count_attribute_slots(false);
- unsigned idx = var->data.location - VARYING_SLOT_VAR0;
+ unsigned idx = compute_variable_location_slot(var, producer->Stage);
  unsigned slot_limit = idx + num_elements;
  unsigned last_comp;
 
+ unsigned slot_max =
+ctx->Const.Program[producer->Stage].MaxOutputComponents / 4;
+ if (slot_limit > slot_max) {
+linker_error(prog,
+ "Invalid location %u in %s shader\n",
+ idx, _mesa_shader_stage_to_string(producer->Stage));
+return;
+ }
+
  if (type->without_array()->is_record()) {
 /* The component qualifier can't be used on structs so just treat
  * all component slots as used.
@@ -515,7 +551,8 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
 
 const glsl_type *type = get_varying_type(input, consumer->Stage);
 unsigned num_elements = type->count_attribute_slots(false);
-unsigned idx = input->data.location - VARYING_SLOT_VAR0;
+unsigned idx =
+   compute_variable_location_slot(input, consumer->Stage);
 unsigned slot_limit = idx + num_elements;
 
 while (idx < slot_limit) {
diff --git a/src/compiler/glsl/link_varyings.h b/src/compiler/glsl/link_varyings.h
index 4e1f6d2e42..081b04ea38 100644
--- a/src/compiler/glsl/link_varyings.h
+++ b/src/compiler/glsl/link_varyings.h
@@ -300,7 +300,8 @@ link_varyings(struct gl_shader_program *prog, unsigned first, unsigned last,
   struct gl_context *ctx, void *mem_ctx);
 
 void
-cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
+cross_validate_outputs_to_inputs(struct gl_context *ctx,
+ struct gl_shader_program *prog,
  gl_linked_shader *producer,
  gl_linked_shader *consumer);
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 03eb05bf63..3798309678 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4929,7 +4929,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   if (!prog->data->LinkStatus)
  goto done;
 
-  cross_validate_outputs_to_inputs(prog,
+  cross_validate_outputs_to_inputs(ctx, prog,
prog->_LinkedShaders[prev],
prog->_LinkedShaders[i]);
   if (!prog->data->LinkStatus)
-- 
2.11.0
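
The slot computation and the new limit check in this patch can be tried outside of Mesa. The sketch below is a standalone approximation only: the enums, the `struct variable` type and the numeric base values are hypothetical stand-ins (the real bases come from Mesa's `VARYING_SLOT_*`, `VERT_ATTRIB_*` and `FRAG_RESULT_*` enums), and `output_slots_fit()` mirrors the `slot_limit > slot_max` test under the assumption of four components per slot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for Mesa's stage and variable-mode enums. */
enum stage { STAGE_VERTEX, STAGE_TESS_CTRL, STAGE_TESS_EVAL,
             STAGE_GEOMETRY, STAGE_FRAGMENT };
enum var_mode { MODE_SHADER_IN, MODE_SHADER_OUT };

/* Made-up base values; only their distinctness matters here. */
enum {
    BASE_FRAG_RESULT_DATA0 = 4,
    BASE_VERT_ATTRIB_GENERIC0 = 16,
    BASE_VARYING_VAR0 = 32,
    BASE_VARYING_PATCH0 = 64
};

struct variable {
    enum var_mode mode;
    bool patch;
    unsigned location; /* absolute location, base already added */
};

/* Pick the base that applies to this stage and direction, then
 * return the zero-based slot relative to it. */
static unsigned
location_slot(const struct variable *var, enum stage stage)
{
    unsigned start = BASE_VARYING_VAR0;

    switch (stage) {
    case STAGE_VERTEX:
        if (var->mode == MODE_SHADER_IN)
            start = BASE_VERT_ATTRIB_GENERIC0;
        break;
    case STAGE_TESS_CTRL:
    case STAGE_TESS_EVAL:
        if (var->patch)
            start = BASE_VARYING_PATCH0;
        break;
    case STAGE_FRAGMENT:
        if (var->mode == MODE_SHADER_OUT)
            start = BASE_FRAG_RESULT_DATA0;
        break;
    default:
        break;
    }

    assert(var->location >= start);
    return var->location - start;
}

/* An output is acceptable when its last slot is within the limit
 * derived from the per-stage component count (4 components per slot). */
static bool
output_slots_fit(unsigned first_slot, unsigned num_slots,
                 unsigned max_output_components)
{
    unsigned slot_max = max_output_components / 4;
    return first_slot + num_slots <= slot_max;
}
```

For example, with an assumed limit of 128 output components (32 slots), a single-slot output at slot 31 fits while one at slot 32 would be rejected.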


[Mesa-dev] [PATCH 3/9] radv: refactor indirect draws (+count buffer) with radv_draw_info

2017-10-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 169 +++
 1 file changed, 48 insertions(+), 121 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 1cacdda767..b03dc4c864 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2945,6 +2945,12 @@ struct radv_draw_info {
struct radv_buffer *indirect;
uint64_t indirect_offset;
uint32_t stride;
+
+   /**
+* Draw count parameters resource.
+*/
+   struct radv_buffer *count_buffer;
+   uint64_t count_buffer_offset;
 };
 
 static void
@@ -2986,6 +2992,7 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
 
if (info->indirect) {
uint64_t va = radv_buffer_get_va(info->indirect->bo);
+   uint64_t count_va = 0;
 
va += info->indirect->offset + info->indirect_offset;
 
@@ -2996,11 +3003,19 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
 
+   if (info->count_buffer) {
+   count_va = radv_buffer_get_va(info->count_buffer->bo);
+   count_va += info->count_buffer->offset +
+   info->count_buffer_offset;
+
+   ws->cs_add_buffer(cs, info->count_buffer->bo, 8);
+   }
+
if (!state->subpass->view_mask) {
radv_cs_emit_indirect_draw_packet(cmd_buffer,
  info->indexed,
  info->count,
- 0 /* count_va */,
+ count_va,
  info->stride);
} else {
unsigned i;
@@ -3010,7 +3025,7 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
radv_cs_emit_indirect_draw_packet(cmd_buffer,
  info->indexed,
  info->count,
- 0 /* count_va */,
+ count_va,
  info->stride);
}
}
@@ -3105,115 +3120,6 @@ void radv_CmdDrawIndexed(
radv_emit_draw_packets(cmd_buffer, &info);
 }
 
-static void
-radv_emit_indirect_draw(struct radv_cmd_buffer *cmd_buffer,
-   VkBuffer _buffer,
-   VkDeviceSize offset,
-   VkBuffer _count_buffer,
-   VkDeviceSize count_offset,
-   uint32_t draw_count,
-   uint32_t stride,
-   bool indexed)
-{
-   RADV_FROM_HANDLE(radv_buffer, buffer, _buffer);
-   RADV_FROM_HANDLE(radv_buffer, count_buffer, _count_buffer);
-   struct radeon_winsys_cs *cs = cmd_buffer->cs;
-
-   uint64_t indirect_va = radv_buffer_get_va(buffer->bo);
-   indirect_va += offset + buffer->offset;
-   uint64_t count_va = 0;
-
-   if (count_buffer) {
-   count_va = radv_buffer_get_va(count_buffer->bo);
-   count_va += count_offset + count_buffer->offset;
-
-   cmd_buffer->device->ws->cs_add_buffer(cs, count_buffer->bo, 8);
-   }
-
-   if (!draw_count)
-   return;
-
-   cmd_buffer->device->ws->cs_add_buffer(cs, buffer->bo, 8);
-
-   radeon_emit(cs, PKT3(PKT3_SET_BASE, 2, 0));
-   radeon_emit(cs, 1);
-   radeon_emit(cs, indirect_va);
-   radeon_emit(cs, indirect_va >> 32);
-
-   if (!cmd_buffer->state.subpass->view_mask) {
-   radv_cs_emit_indirect_draw_packet(cmd_buffer, indexed, draw_count, count_va, stride);
-   } else {
-   unsigned i;
-   for_each_bit(i, cmd_buffer->state.subpass->view_mask) {
-   radv_emit_view_index(cmd_buffer, i);
-
-   radv_cs_emit_indirect_draw_packet(cmd_buffer, indexed, draw_count, count_va, stride);
-   }
-   }
-   radv_cmd_buffer_after_draw(cmd_buffer);
-}
-
-static void
-radv_cmd_draw_indirect_count(VkCommandBuffer commandBuffer,
- VkBuffer buffer,
- VkDeviceSize offset,
- VkBuffer countBuffer,
- VkDeviceSize countBufferOffset,
- uint32_t

[Mesa-dev] [PATCH 5/9] radv: emit primitive restart from radv_emit_draw_registers()

2017-10-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 59 
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index d58bf19b2e..e68174ff69 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1644,33 +1644,6 @@ radv_flush_constants(struct radv_cmd_buffer *cmd_buffer,
assert(cmd_buffer->cs->cdw <= cdw_max);
 }
 
-static void radv_emit_primitive_reset_state(struct radv_cmd_buffer *cmd_buffer,
-   bool indexed_draw)
-{
-   int32_t primitive_reset_en = indexed_draw && cmd_buffer->state.pipeline->graphics.prim_restart_enable;
-
-   if (primitive_reset_en != cmd_buffer->state.last_primitive_reset_en) {
-   cmd_buffer->state.last_primitive_reset_en = primitive_reset_en;
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
-   radeon_set_uconfig_reg(cmd_buffer->cs, R_03092C_VGT_MULTI_PRIM_IB_RESET_EN,
-  primitive_reset_en);
-   } else {
-   radeon_set_context_reg(cmd_buffer->cs, R_028A94_VGT_MULTI_PRIM_IB_RESET_EN,
-  primitive_reset_en);
-   }
-   }
-
-   if (primitive_reset_en) {
-   uint32_t primitive_reset_index = cmd_buffer->state.index_type ? 0xffffffffu : 0xffffu;
-
-   if (primitive_reset_index != cmd_buffer->state.last_primitive_reset_index) {
-   cmd_buffer->state.last_primitive_reset_index = primitive_reset_index;
-   radeon_set_context_reg(cmd_buffer->cs, R_02840C_VGT_MULTI_PRIM_IB_RESET_INDX,
-  primitive_reset_index);
-   }
-   }
-}
-
 static bool
 radv_cmd_buffer_update_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer)
 {
@@ -1731,6 +1704,7 @@ radv_emit_draw_registers(struct radv_cmd_buffer *cmd_buffer, bool indexed_draw,
struct radeon_info *info = &cmd_buffer->device->physical_device->rad_info;
struct radv_cmd_state *state = &cmd_buffer->state;
struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   int32_t primitive_reset_en;
uint32_t ia_multi_vgt_param;
 
/* Draw state. */
@@ -1753,6 +1727,35 @@ radv_emit_draw_registers(struct radv_cmd_buffer *cmd_buffer, bool indexed_draw,
}
state->last_ia_multi_vgt_param = ia_multi_vgt_param;
}
+
+   /* Primitive restart. */
+   primitive_reset_en =
+   indexed_draw && state->pipeline->graphics.prim_restart_enable;
+
+   if (primitive_reset_en != state->last_primitive_reset_en) {
+   state->last_primitive_reset_en = primitive_reset_en;
+   if (info->chip_class >= GFX9) {
+   radeon_set_uconfig_reg(cs,
+  R_03092C_VGT_MULTI_PRIM_IB_RESET_EN,
+  primitive_reset_en);
+   } else {
+   radeon_set_context_reg(cs,
+  R_028A94_VGT_MULTI_PRIM_IB_RESET_EN,
+  primitive_reset_en);
+   }
+   }
+
+   if (primitive_reset_en) {
+   uint32_t primitive_reset_index =
+   state->index_type ? 0xffffffffu : 0xffffu;
+
+   if (primitive_reset_index != state->last_primitive_reset_index) {
+   radeon_set_context_reg(cs,
+  R_02840C_VGT_MULTI_PRIM_IB_RESET_INDX,
+  primitive_reset_index);
+   state->last_primitive_reset_index = primitive_reset_index;
+   }
+   }
 }
 
 static void
@@ -1778,8 +1781,6 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
 
radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
 
-   radv_emit_primitive_reset_state(cmd_buffer, indexed_draw);
-
radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_ALL_GRAPHICS);
radv_flush_constants(cmd_buffer, cmd_buffer->state.pipeline,
 VK_SHADER_STAGE_ALL_GRAPHICS);
-- 
2.14.2
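
The code moved by this patch follows a common pattern: remember the last value written to a register and skip the write when nothing changed. A minimal standalone sketch of that pattern follows; `struct restart_state`, the `writes` counter and the function name are hypothetical and only stand in for the real command-stream emission, and the assumption that a non-zero index type means 32-bit indices is mine, not stated in the patch. The restart index values 0xffffffff and 0xffff are the ones Vulkan defines for 32-bit and 16-bit index buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shadow of the last values sent to the hardware. */
struct restart_state {
    int32_t last_enable;  /* -1 means "unknown, must emit" */
    uint32_t last_index;
    unsigned writes;      /* counts simulated register writes */
};

static void
emit_primitive_restart(struct restart_state *st, bool indexed_draw,
                       bool pipeline_restart_enable, bool index_is_32bit)
{
    int32_t enable = indexed_draw && pipeline_restart_enable;

    /* Only touch the enable register when the value changes. */
    if (enable != st->last_enable) {
        st->last_enable = enable;
        st->writes++;
    }

    if (enable) {
        uint32_t index = index_is_32bit ? 0xffffffffu : 0xffffu;

        /* Same idea for the restart index register. */
        if (index != st->last_index) {
            st->last_index = index;
            st->writes++;
        }
    }
}
```

Calling it twice with the same arguments performs the writes only once; switching the index size afterwards costs a single extra write.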


[Mesa-dev] [PATCH 2/9] radv: refactor indirect draws with radv_draw_info

2017-10-17 Thread Samuel Pitoiset

Indirect draws with a count buffer will be refactored in a
separate patch.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 217 ++-
 1 file changed, 143 insertions(+), 74 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index f0abafad3b..1cacdda767 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2869,6 +2869,45 @@ radv_cs_emit_draw_indexed_packet(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cmd_buffer->cs, V_0287F0_DI_SRC_SEL_DMA);
 }
 
+static void
+radv_cs_emit_indirect_draw_packet(struct radv_cmd_buffer *cmd_buffer,
+  bool indexed,
+  uint32_t draw_count,
+  uint64_t count_va,
+  uint32_t stride)
+{
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   unsigned di_src_sel = indexed ? V_0287F0_DI_SRC_SEL_DMA
+ : V_0287F0_DI_SRC_SEL_AUTO_INDEX;
+   bool draw_id_enable = cmd_buffer->state.pipeline->shaders[MESA_SHADER_VERTEX]->info.info.vs.needs_draw_id;
+   uint32_t base_reg = cmd_buffer->state.pipeline->graphics.vtx_base_sgpr;
+   assert(base_reg);
+
+   if (draw_count == 1 && !count_va && !draw_id_enable) {
+   radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT :
+PKT3_DRAW_INDIRECT, 3, false));
+   radeon_emit(cs, 0);
+   radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, di_src_sel);
+   } else {
+   radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT_MULTI :
+PKT3_DRAW_INDIRECT_MULTI,
+8, false));
+   radeon_emit(cs, 0);
+   radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, (((base_reg + 8) - SI_SH_REG_OFFSET) >> 2) |
+   S_2C3_DRAW_INDEX_ENABLE(draw_id_enable) |
+   S_2C3_COUNT_INDIRECT_ENABLE(!!count_va));
+   radeon_emit(cs, draw_count); /* count */
+   radeon_emit(cs, count_va); /* count_addr */
+   radeon_emit(cs, count_va >> 32);
+   radeon_emit(cs, stride); /* stride */
+   radeon_emit(cs, di_src_sel);
+   }
+}
+
 struct radv_draw_info {
/**
 * Number of vertices.
@@ -2899,6 +2938,13 @@ struct radv_draw_info {
 * Whether it's an indexed draw.
 */
bool indexed;
+
+   /**
+* Indirect draw parameters resource.
+*/
+   struct radv_buffer *indirect;
+   uint64_t indirect_offset;
+   uint32_t stride;
 };
 
 static void
@@ -2906,15 +2952,16 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
   const struct radv_draw_info *info)
 {
struct radv_cmd_state *state = &cmd_buffer->state;
+   struct radeon_winsys *ws = cmd_buffer->device->ws;
struct radv_device *device = cmd_buffer->device;
struct radeon_winsys_cs *cs = cmd_buffer->cs;
 
radv_cmd_buffer_flush_state(cmd_buffer, info->indexed,
-   info->instance_count > 1, false,
-   info->count);
+   info->instance_count > 1, info->indirect,
+   info->indirect ? 0 : info->count);
 
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(device->ws, cs,
-  26 * MAX_VIEWS);
+  31 * MAX_VIEWS);
 
if (info->indexed) {
if (device->physical_device->rad_info.chip_class >= GFX9) {
@@ -2924,49 +2971,93 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, PKT3(PKT3_INDEX_TYPE, 0, 0));
radeon_emit(cs, state->index_type);
}
+
+   if (info->indirect) {
+   uint64_t index_va = cmd_buffer->state.index_va;
+
+   radeon_emit(cs, PKT3(PKT3_INDEX_BASE, 1, 0));
+   radeon_emit(cs, index_va);
+   radeon_emit(cs, index_va >> 32);
+
+   radeon_emit(cs, PKT3(PKT3_INDEX_BUFFER_SIZE, 0, 0));
+   radeon_emit(cs, state->max_index_count);
+   }
}
 
-   assert(cmd_buffer->state.pipeline->graphics.vtx_base_sgpr);
-   radeon_set_sh_reg_seq(cs, state->pipeline->graphics.vtx_base_sgpr,
- state->pipeline->graphics.vtx_emit_num);
-   radeon_emit(cs, info->vertex_

[Mesa-dev] [PATCH] radv: remove XtoY_temps structs

2017-10-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_bufimage.c | 62 -
 1 file changed, 26 insertions(+), 36 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_bufimage.c b/src/amd/vulkan/radv_meta_bufimage.c
index f5bbf3cb90..dfd99aa75f 100644
--- a/src/amd/vulkan/radv_meta_bufimage.c
+++ b/src/amd/vulkan/radv_meta_bufimage.c
@@ -823,14 +823,10 @@ create_bview(struct radv_cmd_buffer *cmd_buffer,
 
 }
 
-struct itob_temps {
-   struct radv_image_view src_iview;
-   struct radv_buffer_view dst_bview;
-};
-
 static void
 itob_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
- struct itob_temps *tmp)
+ struct radv_image_view *src,
+ struct radv_buffer_view *dst)
 {
struct radv_device *device = cmd_buffer->device;
 
@@ -849,7 +845,7 @@ itob_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
  .pImageInfo = (VkDescriptorImageInfo[]) {
  {
  .sampler = VK_NULL_HANDLE,
- .imageView = radv_image_view_to_handle(&tmp->src_iview),
+ .imageView = radv_image_view_to_handle(src),
  .imageLayout = VK_IMAGE_LAYOUT_GENERAL,
  },
  }
@@ -860,7 +856,7 @@ itob_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
  .dstArrayElement = 0,
  .descriptorCount = 1,
  .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER,
- .pTexelBufferView = (VkBufferView[])  { radv_buffer_view_to_handle(&tmp->dst_bview) },
+ .pTexelBufferView = (VkBufferView[])  { radv_buffer_view_to_handle(dst) },
  }
  });
 }
@@ -874,11 +870,12 @@ radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
 {
VkPipeline pipeline = cmd_buffer->device->meta_state.itob.pipeline;
struct radv_device *device = cmd_buffer->device;
-   struct itob_temps temps;
+   struct radv_image_view src_view;
+   struct radv_buffer_view dst_view;
 
-   create_iview(cmd_buffer, src, &temps.src_iview);
-   create_bview(cmd_buffer, dst->buffer, dst->offset, dst->format, &temps.dst_bview);
-   itob_bind_descriptors(cmd_buffer, &temps);
+   create_iview(cmd_buffer, src, &src_view);
+   create_bview(cmd_buffer, dst->buffer, dst->offset, dst->format, &dst_view);
+   itob_bind_descriptors(cmd_buffer, &src_view, &dst_view);
 
 
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
@@ -899,14 +896,10 @@ radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
}
 }
 
-struct btoi_temps {
-   struct radv_buffer_view src_bview;
-   struct radv_image_view dst_iview;
-};
-
 static void
 btoi_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
- struct btoi_temps *tmp)
+ struct radv_buffer_view *src,
+ struct radv_image_view *dst)
 {
struct radv_device *device = cmd_buffer->device;
 
@@ -922,7 +915,7 @@ btoi_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
  .dstArrayElement = 0,
  .descriptorCount = 1,
  .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER,
- .pTexelBufferView = (VkBufferView[])  { radv_buffer_view_to_handle(&tmp->src_bview) },
+ .pTexelBufferView = (VkBufferView[])  { radv_buffer_view_to_handle(src) },
  },
  {
  .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
@@ -933,7 +926,7 @@ btoi_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
  .pImageInfo = (VkDescriptorImageInfo[]) {
  {
  .sampler = VK_NULL_HANDLE,
- .imageView = radv_image_view_to_handle(&tmp->dst_iview),
+

[Mesa-dev] [PATCH 8/9] radv: add radv_emit_shaders_prefetch()

2017-10-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 38 ++
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ec4e34966c..e72ef5ffb7 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -606,6 +606,30 @@ radv_emit_shader_prefetch(struct radv_cmd_buffer *cmd_buffer,
si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
 }
 
+static void
+radv_emit_shaders_prefetch(struct radv_cmd_buffer *cmd_buffer,
+  struct radv_pipeline *pipeline)
+{
+   radv_emit_shader_prefetch(cmd_buffer,
+ pipeline->shaders[MESA_SHADER_VERTEX]);
+
+   if (pipeline->shaders[MESA_SHADER_TESS_EVAL]) {
+   radv_emit_shader_prefetch(cmd_buffer,
+ pipeline->shaders[MESA_SHADER_TESS_CTRL]);
+   radv_emit_shader_prefetch(cmd_buffer,
+ pipeline->shaders[MESA_SHADER_TESS_EVAL]);
+   }
+
+   if (pipeline->shaders[MESA_SHADER_GEOMETRY]) {
+   radv_emit_shader_prefetch(cmd_buffer,
+ pipeline->shaders[MESA_SHADER_GEOMETRY]);
+   radv_emit_shader_prefetch(cmd_buffer, pipeline->gs_copy_shader);
+   }
+
+   radv_emit_shader_prefetch(cmd_buffer,
+ pipeline->shaders[MESA_SHADER_FRAGMENT]);
+}
+
 static void
 radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
struct radv_pipeline *pipeline,
@@ -615,8 +639,6 @@ radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
unsigned export_count;
 
-   radv_emit_shader_prefetch(cmd_buffer, shader);
-
export_count = MAX2(1, outinfo->param_exports);
radeon_set_context_reg(cmd_buffer->cs, R_0286C4_SPI_VS_OUT_CONFIG,
   S_0286C4_VS_EXPORT_COUNT(export_count - 1));
@@ -662,8 +684,6 @@ radv_emit_hw_es(struct radv_cmd_buffer *cmd_buffer,
 {
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
 
-   radv_emit_shader_prefetch(cmd_buffer, shader);
-
radeon_set_context_reg(cmd_buffer->cs, R_028AAC_VGT_ESGS_RING_ITEMSIZE,
   outinfo->esgs_itemsize / 4);
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B320_SPI_SHADER_PGM_LO_ES, 4);
@@ -680,8 +700,6 @@ radv_emit_hw_ls(struct radv_cmd_buffer *cmd_buffer,
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
uint32_t rsrc2 = shader->rsrc2;
 
-   radv_emit_shader_prefetch(cmd_buffer, shader);
-
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B520_SPI_SHADER_PGM_LO_LS, 2);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
@@ -702,8 +720,6 @@ radv_emit_hw_hs(struct radv_cmd_buffer *cmd_buffer,
 {
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
 
-   radv_emit_shader_prefetch(cmd_buffer, shader);
-
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B420_SPI_SHADER_PGM_LO_HS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
@@ -835,8 +851,6 @@ radv_emit_geometry_shader(struct radv_cmd_buffer *cmd_buffer,
 
va = radv_buffer_get_va(gs->bo) + gs->bo_offset;
 
-   radv_emit_shader_prefetch(cmd_buffer, gs);
-
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B220_SPI_SHADER_PGM_LO_GS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
@@ -875,8 +889,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
ps = pipeline->shaders[MESA_SHADER_FRAGMENT];
va = radv_buffer_get_va(ps->bo) + ps->bo_offset;
 
-   radv_emit_shader_prefetch(cmd_buffer, ps);
-
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B020_SPI_SHADER_PGM_LO_PS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
@@ -953,6 +965,8 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer *cmd_buffer)
radv_emit_fragment_shader(cmd_buffer, pipeline);
radv_emit_vgt_vertex_reuse(cmd_buffer, pipeline);
 
+   radv_emit_shaders_prefetch(cmd_buffer, pipeline);
+
cmd_buffer->scratch_size_needed =
  MAX2(cmd_buffer->scratch_size_needed,
    pipeline->max_waves * pipeline->scratch_bytes_per_wave);
-- 
2.14.2


[Mesa-dev] [PATCH 9/9] radv: use optimal packet order for draws

2017-10-17 Thread Samuel Pitoiset

Ported from RadeonSI. The time during which shaders are idle should
be shorter now. This can give a small boost, e.g. +6% with
the dynamicubo Vulkan demo.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 94 
 1 file changed, 77 insertions(+), 17 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index e72ef5ffb7..9ebdbb011a 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -965,8 +965,6 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer *cmd_buffer)
radv_emit_fragment_shader(cmd_buffer, pipeline);
radv_emit_vgt_vertex_reuse(cmd_buffer, pipeline);
 
-   radv_emit_shaders_prefetch(cmd_buffer, pipeline);
-
cmd_buffer->scratch_size_needed =
  MAX2(cmd_buffer->scratch_size_needed,
    pipeline->max_waves * pipeline->scratch_bytes_per_wave);
@@ -1709,6 +1707,19 @@ radv_cmd_buffer_update_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer)
return true;
 }
 
+static bool
+radv_upload_graphics_shader_descriptors(struct radv_cmd_buffer *cmd_buffer)
+{
+   if (!radv_cmd_buffer_update_vertex_descriptors(cmd_buffer))
+   return false;
+
+   radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_ALL_GRAPHICS);
+   radv_flush_constants(cmd_buffer, cmd_buffer->state.pipeline,
+VK_SHADER_STAGE_ALL_GRAPHICS);
+
+   return true;
+}
+
 static void
 radv_emit_draw_registers(struct radv_cmd_buffer *cmd_buffer, bool indexed_draw,
 bool instanced_draw, bool indirect_draw,
@@ -3074,35 +3085,84 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
 }
 
 static void
-radv_draw(struct radv_cmd_buffer *cmd_buffer,
- const struct radv_draw_info *info)
+radv_emit_all_graphics_states(struct radv_cmd_buffer *cmd_buffer,
+ const struct radv_draw_info *info)
 {
-   MAYBE_UNUSED unsigned cdw_max =
-   radeon_check_space(cmd_buffer->device->ws,
-  cmd_buffer->cs, 4096);
-
-   if (!radv_cmd_buffer_update_vertex_descriptors(cmd_buffer))
-   return;
-
if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE)
radv_emit_graphics_pipeline(cmd_buffer);
 
if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_RENDER_TARGETS)
radv_emit_framebuffer_state(cmd_buffer);
 
+   radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
+
radv_emit_draw_registers(cmd_buffer, info->indexed,
 info->instance_count > 1, info->indirect,
 info->indirect ? 0 : info->count);
+}
 
-   radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
+static void
+radv_draw(struct radv_cmd_buffer *cmd_buffer,
+ const struct radv_draw_info *info)
+{
+   bool pipeline_is_dirty =
+   (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE) &&
+   cmd_buffer->state.pipeline &&
+   cmd_buffer->state.pipeline != cmd_buffer->state.emitted_pipeline;
 
-   radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_ALL_GRAPHICS);
-   radv_flush_constants(cmd_buffer, cmd_buffer->state.pipeline,
-VK_SHADER_STAGE_ALL_GRAPHICS);
+   MAYBE_UNUSED unsigned cdw_max =
+   radeon_check_space(cmd_buffer->device->ws,
+  cmd_buffer->cs, 4096);
 
-   si_emit_cache_flush(cmd_buffer);
+   /* Use optimal packet order based on whether we need to sync the
+* pipeline.
+*/
+   if (cmd_buffer->state.flush_bits & (RADV_CMD_FLAG_FLUSH_AND_INV_CB |
+   RADV_CMD_FLAG_FLUSH_AND_INV_DB |
+   RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
+   RADV_CMD_FLAG_CS_PARTIAL_FLUSH)) {
+   /* If we have to wait for idle, set all states first, so that
+* all SET packets are processed in parallel with previous draw
+* calls. Then upload descriptors, set shader pointers, and
+* draw, and prefetch at the end. This ensures that the time
+* the CUs are idle is very short. (there are only SET_SH
+* packets between the wait and the draw)
+*/
+   radv_emit_all_graphics_states(cmd_buffer, info);
+   si_emit_cache_flush(cmd_buffer);
+   /* <-- CUs are idle here --> */
 
-   radv_emit_draw_packets(cmd_buffer, info);
+   if (!radv_upload_graphics_shader_descriptors(cmd_buffer))
+   return;
+
+   radv_emit_draw_packets(cmd_buffer, info);
+   /* <-- CUs are busy here --> */
+
+   /* Start prefetches after the draw has been started. Both will
+

[Mesa-dev] [PATCH 1/9] radv: refactor simple and indexed draws with radv_draw_info

2017-10-17 Thread Samuel Pitoiset

Similar to the dispatch compute logic but for draw calls. For
convenience, indirect draws will be converted in a separate
patch.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 202 ---
 1 file changed, 127 insertions(+), 75 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 9d59028bfd..f0abafad3b 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2856,47 +2856,6 @@ radv_cs_emit_draw_packet(struct radv_cmd_buffer *cmd_buffer,
S_0287F0_USE_OPAQUE(0));
 }
 
-void radv_CmdDraw(
-   VkCommandBuffer commandBuffer,
-   uint32_t vertexCount,
-   uint32_t instanceCount,
-   uint32_t firstVertex,
-   uint32_t firstInstance)
-{
-   RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
-
-   radv_cmd_buffer_flush_state(cmd_buffer, false, (instanceCount > 1), false, vertexCount);
-
-   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 20 * MAX_VIEWS);
-
-   assert(cmd_buffer->state.pipeline->graphics.vtx_base_sgpr);
-   radeon_set_sh_reg_seq(cmd_buffer->cs, cmd_buffer->state.pipeline->graphics.vtx_base_sgpr,
- cmd_buffer->state.pipeline->graphics.vtx_emit_num);
-   radeon_emit(cmd_buffer->cs, firstVertex);
-   radeon_emit(cmd_buffer->cs, firstInstance);
-   if (cmd_buffer->state.pipeline->graphics.vtx_emit_num == 3)
-   radeon_emit(cmd_buffer->cs, 0);
-
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_NUM_INSTANCES, 0, cmd_buffer->state.predicating));
-   radeon_emit(cmd_buffer->cs, instanceCount);
-
-   if (!cmd_buffer->state.subpass->view_mask) {
-   radv_cs_emit_draw_packet(cmd_buffer, vertexCount);
-   } else {
-   unsigned i;
-   for_each_bit(i, cmd_buffer->state.subpass->view_mask) {
-   radv_emit_view_index(cmd_buffer, i);
-
-   radv_cs_emit_draw_packet(cmd_buffer, vertexCount);
-   }
-   }
-
-   assert(cmd_buffer->cs->cdw <= cdw_max);
-
-   radv_cmd_buffer_after_draw(cmd_buffer);
-}
-
-
 static void
 radv_cs_emit_draw_indexed_packet(struct radv_cmd_buffer *cmd_buffer,
  uint64_t index_va,
@@ -2910,58 +2869,151 @@ radv_cs_emit_draw_indexed_packet(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cmd_buffer->cs, V_0287F0_DI_SRC_SEL_DMA);
 }
 
-void radv_CmdDrawIndexed(
-   VkCommandBuffer commandBuffer,
-   uint32_t indexCount,
-   uint32_t instanceCount,
-   uint32_t firstIndex,
-   int32_t vertexOffset,
-   uint32_t firstInstance)
+struct radv_draw_info {
+   /**
+* Number of vertices.
+*/
+   uint32_t count;
+
+   /**
+* Index of the first vertex.
+*/
+   int32_t vertex_offset;
+
+   /**
+* First instance id.
+*/
+   uint32_t first_instance;
+
+   /**
+* Number of instances.
+*/
+   uint32_t instance_count;
+
+   /**
+* First index (indexed draws only).
+*/
+   uint32_t first_index;
+
+   /**
+* Whether it's an indexed draw.
+*/
+   bool indexed;
+};
+
+static void
+radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
+  const struct radv_draw_info *info)
 {
-   RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
-   int index_size = cmd_buffer->state.index_type ? 4 : 2;
-   uint64_t index_va;
+   struct radv_cmd_state *state = &cmd_buffer->state;
+   struct radv_device *device = cmd_buffer->device;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
 
-   radv_cmd_buffer_flush_state(cmd_buffer, true, (instanceCount > 1), false, indexCount);
+   radv_cmd_buffer_flush_state(cmd_buffer, info->indexed,
+   info->instance_count > 1, false,
+   info->count);
 
-   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 26 * MAX_VIEWS);
+   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(device->ws, cs,
+  26 * MAX_VIEWS);
 
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
-   radeon_set_uconfig_reg_idx(cmd_buffer->cs, R_03090C_VGT_INDEX_TYPE,
-  2, cmd_buffer->state.index_type);
-   } else {
-

[Mesa-dev] [PATCH 7/9] radv: add radv_emit_shader_prefetch()

2017-10-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 47 +++-
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index f958d1a14e..ec4e34966c 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -589,12 +589,21 @@ radv_emit_graphics_raster_state(struct radv_cmd_buffer *cmd_buffer,
   raster->pa_su_sc_mode_cntl);
 }
 
-static inline void
-radv_emit_prefetch(struct radv_cmd_buffer *cmd_buffer, uint64_t va,
-  unsigned size)
+static void
+radv_emit_shader_prefetch(struct radv_cmd_buffer *cmd_buffer,
+ struct radv_shader_variant *shader)
 {
+   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   uint64_t va;
+
+   assert(shader);
+
+   va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
+
+   ws->cs_add_buffer(cs, shader->bo, 8);
if (cmd_buffer->device->physical_device->rad_info.chip_class >= CIK)
-   si_cp_dma_prefetch(cmd_buffer, va, size);
+   si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
 }
 
 static void
@@ -603,12 +612,10 @@ radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader,
struct ac_vs_output_info *outinfo)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
unsigned export_count;
 
-   ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_shader_prefetch(cmd_buffer, shader);
 
export_count = MAX2(1, outinfo->param_exports);
radeon_set_context_reg(cmd_buffer->cs, R_0286C4_SPI_VS_OUT_CONFIG,
@@ -653,11 +660,9 @@ radv_emit_hw_es(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader,
struct ac_es_output_info *outinfo)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
 
-   ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_shader_prefetch(cmd_buffer, shader);
 
radeon_set_context_reg(cmd_buffer->cs, R_028AAC_VGT_ESGS_RING_ITEMSIZE,
   outinfo->esgs_itemsize / 4);
@@ -672,12 +677,10 @@ static void
 radv_emit_hw_ls(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
uint32_t rsrc2 = shader->rsrc2;
 
-   ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_shader_prefetch(cmd_buffer, shader);
 
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B520_SPI_SHADER_PGM_LO_LS, 2);
radeon_emit(cmd_buffer->cs, va >> 8);
@@ -697,11 +700,9 @@ static void
 radv_emit_hw_hs(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
 
-   ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_shader_prefetch(cmd_buffer, shader);
 
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B420_SPI_SHADER_PGM_LO_HS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
@@ -800,7 +801,6 @@ static void
 radv_emit_geometry_shader(struct radv_cmd_buffer *cmd_buffer,
  struct radv_pipeline *pipeline)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
struct radv_shader_variant *gs;
uint64_t va;
 
@@ -834,8 +834,8 @@ radv_emit_geometry_shader(struct radv_cmd_buffer *cmd_buffer,
   S_028B90_ENABLE(gs_num_invocations > 0));
 
va = radv_buffer_get_va(gs->bo) + gs->bo_offset;
-   ws->cs_add_buffer(cmd_buffer->cs, gs->bo, 8);
-   radv_emit_prefetch(cmd_buffer, va, gs->code_size);
+
+   radv_emit_shader_prefetch(cmd_buffer, gs);
 
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B220_SPI_SHADER_PGM_LO_GS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
@@ -866,7 +866,6 @@ static void
 radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
  struct radv_pipeline *pipeline)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
struct radv_shader_variant *ps;
uint64_t va;
unsigned spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
@@ -875,8 +874,8 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
 
ps

[Mesa-dev] [PATCH 0/9] radv: draw codepath refactoring + optimal packet order

2017-10-17 Thread Samuel Pitoiset

Hi,

This series first refactors the draw codepath to follow the dispatch
codepath (i.e. using a new structure called radv_draw_info). Then it
adds a few helpers, and finally it tries to use a better packet order
in order to reduce the time during which shaders are idle.
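
To illustrate the refactoring idea, here is a minimal standalone sketch of
how both vkCmdDraw-style and vkCmdDrawIndexed-style entry points can fill
one info structure and share a single draw path. The field names follow the
radv_draw_info struct from patch 1/9; everything else (draw_info, emit_draw,
cmd_draw, cmd_draw_indexed, last_draw) is a hypothetical stand-in and not
the actual radv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields copied from the radv_draw_info introduced in patch 1/9. */
struct draw_info {
	uint32_t count;          /* number of vertices (index count when indexed) */
	int32_t vertex_offset;   /* index of the first vertex */
	uint32_t first_instance; /* first instance id */
	uint32_t instance_count; /* number of instances */
	uint32_t first_index;    /* first index, indexed draws only */
	bool indexed;            /* whether it's an indexed draw */
};

/* Hypothetical stand-in for the shared draw path: it only records its input. */
struct draw_info last_draw;

void emit_draw(const struct draw_info *info)
{
	assert(info);
	last_draw = *info;
}

/* Entry point shaped like vkCmdDraw (assumed mapping of arguments). */
void cmd_draw(uint32_t vertex_count, uint32_t instance_count,
	      uint32_t first_vertex, uint32_t first_instance)
{
	struct draw_info info = {0};

	info.count = vertex_count;
	info.instance_count = instance_count;
	info.first_instance = first_instance;
	info.vertex_offset = (int32_t)first_vertex;

	emit_draw(&info);
}

/* Entry point shaped like vkCmdDrawIndexed (assumed mapping of arguments). */
void cmd_draw_indexed(uint32_t index_count, uint32_t instance_count,
		      uint32_t first_index, int32_t vertex_offset,
		      uint32_t first_instance)
{
	struct draw_info info = {0};

	info.indexed = true;
	info.count = index_count;
	info.instance_count = instance_count;
	info.first_index = first_index;
	info.vertex_offset = vertex_offset;
	info.first_instance = first_instance;

	emit_draw(&info);
}
```

The point of the design is that state flushing and packet emission only
need to look at one structure, regardless of which Vulkan entry point was
called.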

This is loosely based on RadeonSI and it should give a little boost.
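
As a rough illustration of the packet order used when a draw has to wait
for idle (the branch visible in patch 9/9), the sketch below records the
steps in order and counts how many sit between the cache flush and the
draw. The enum, struct and function names are hypothetical; only the
ordering is taken from the comments in the patch, and the other branch is
not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Steps of a draw, named after the radv helpers they stand for. */
enum step {
	STEP_EMIT_STATES,        /* radv_emit_all_graphics_states() */
	STEP_CACHE_FLUSH,        /* si_emit_cache_flush(): CUs are idle after this */
	STEP_UPLOAD_DESCRIPTORS, /* radv_upload_graphics_shader_descriptors() */
	STEP_DRAW,               /* radv_emit_draw_packets(): CUs are busy again */
	STEP_PREFETCH            /* shader prefetch, started after the draw */
};

#define MAX_STEPS 8

struct plan {
	enum step steps[MAX_STEPS];
	size_t count;
};

static void push(struct plan *p, enum step s)
{
	assert(p->count < MAX_STEPS);
	p->steps[p->count++] = s;
}

/* Order for a draw that must wait for idle: set all states first so they
 * overlap with previous draws, then flush, upload descriptors, draw, and
 * prefetch at the end. */
void plan_synced_draw(struct plan *p)
{
	p->count = 0;
	push(p, STEP_EMIT_STATES);
	push(p, STEP_CACHE_FLUSH);
	push(p, STEP_UPLOAD_DESCRIPTORS);
	push(p, STEP_DRAW);
	push(p, STEP_PREFETCH);
}

/* Number of steps that run between the flush and the draw, i.e. while the
 * CUs have nothing to do. */
size_t steps_while_idle(const struct plan *p)
{
	size_t flush = p->count, draw = p->count;

	for (size_t i = 0; i < p->count; i++) {
		if (p->steps[i] == STEP_CACHE_FLUSH)
			flush = i;
		else if (p->steps[i] == STEP_DRAW)
			draw = i;
	}
	assert(flush < draw && draw < p->count);
	return draw - flush - 1;
}
```

In this model only the descriptor upload remains between the wait and the
draw, which matches the intent stated in the patch of keeping that window
short.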

Please review,
Thanks!

Samuel Pitoiset (9):
  radv: refactor simple and indexed draws with radv_draw_info
  radv: refactor indirect draws with radv_draw_info
  radv: refactor indirect draws (+count buffer) with radv_draw_info
  radv: add radv_emit_draw_registers()
  radv: emit primitive restart from radv_emit_draw_registers()
  radv: rename radv_cmd_buffer_flush_state() to radv_draw()
  radv: add radv_emit_shader_prefetch()
  radv: add radv_emit_shaders_prefetch()
  radv: use optimal packet order for draws

 src/amd/vulkan/radv_cmd_buffer.c | 673 +++
 1 file changed, 404 insertions(+), 269 deletions(-)

-- 
2.14.2


[Mesa-dev] [PATCH 4/9] radv: add radv_emit_draw_registers()

2017-10-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 46 +---
 1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index b03dc4c864..d58bf19b2e 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1723,14 +1723,44 @@ radv_cmd_buffer_update_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer)
return true;
 }
 
+static void
+radv_emit_draw_registers(struct radv_cmd_buffer *cmd_buffer, bool indexed_draw,
+bool instanced_draw, bool indirect_draw,
+uint32_t draw_vertex_count)
+{
+   struct radeon_info *info = &cmd_buffer->device->physical_device->rad_info;
+   struct radv_cmd_state *state = &cmd_buffer->state;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   uint32_t ia_multi_vgt_param;
+
+   /* Draw state. */
+   ia_multi_vgt_param =
+   si_get_ia_multi_vgt_param(cmd_buffer, instanced_draw,
+ indirect_draw, draw_vertex_count);
+
+   if (state->last_ia_multi_vgt_param != ia_multi_vgt_param) {
+   if (info->chip_class >= GFX9) {
+   radeon_set_uconfig_reg_idx(cs,
+  R_030960_IA_MULTI_VGT_PARAM,
+  4, ia_multi_vgt_param);
+   } else if (info->chip_class >= CIK) {
+   radeon_set_context_reg_idx(cs,
+  R_028AA8_IA_MULTI_VGT_PARAM,
+  1, ia_multi_vgt_param);
+   } else {
+   radeon_set_context_reg(cs, R_028AA8_IA_MULTI_VGT_PARAM,
+  ia_multi_vgt_param);
+   }
+   state->last_ia_multi_vgt_param = ia_multi_vgt_param;
+   }
+}
+
 static void
 radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
bool indexed_draw, bool instanced_draw,
bool indirect_draw,
uint32_t draw_vertex_count)
 {
-   uint32_t ia_multi_vgt_param;
-
 MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws,
    cmd_buffer->cs, 4096);
 
@@ -1743,16 +1773,8 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_RENDER_TARGETS)
radv_emit_framebuffer_state(cmd_buffer);
 
-   ia_multi_vgt_param = si_get_ia_multi_vgt_param(cmd_buffer, instanced_draw, indirect_draw, draw_vertex_count);
-   if (cmd_buffer->state.last_ia_multi_vgt_param != ia_multi_vgt_param) {
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9)
-   radeon_set_uconfig_reg_idx(cmd_buffer->cs, R_030960_IA_MULTI_VGT_PARAM, 4, ia_multi_vgt_param);
-   else if (cmd_buffer->device->physical_device->rad_info.chip_class >= CIK)
-   radeon_set_context_reg_idx(cmd_buffer->cs, R_028AA8_IA_MULTI_VGT_PARAM, 1, ia_multi_vgt_param);
-   else
-   radeon_set_context_reg(cmd_buffer->cs, R_028AA8_IA_MULTI_VGT_PARAM, ia_multi_vgt_param);
-   cmd_buffer->state.last_ia_multi_vgt_param = ia_multi_vgt_param;
-   }
+   radv_emit_draw_registers(cmd_buffer, indexed_draw, instanced_draw,
+indirect_draw, draw_vertex_count);
 
radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
 
-- 
2.14.2


[Mesa-dev] [PATCH 6/9] radv: rename radv_cmd_buffer_flush_state() to radv_draw()

2017-10-17 Thread Samuel Pitoiset

Similar to the dispatch codepath.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 85 ++--
 1 file changed, 39 insertions(+), 46 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index e68174ff69..f958d1a14e 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1758,38 +1758,6 @@ radv_emit_draw_registers(struct radv_cmd_buffer *cmd_buffer, bool indexed_draw,
}
 }
 
-static void
-radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
-   bool indexed_draw, bool instanced_draw,
-   bool indirect_draw,
-   uint32_t draw_vertex_count)
-{
-   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws,
-  cmd_buffer->cs, 4096);
-
-   if (!radv_cmd_buffer_update_vertex_descriptors(cmd_buffer))
-   return;
-
-   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE)
-   radv_emit_graphics_pipeline(cmd_buffer);
-
-   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_RENDER_TARGETS)
-   radv_emit_framebuffer_state(cmd_buffer);
-
-   radv_emit_draw_registers(cmd_buffer, indexed_draw, instanced_draw,
-indirect_draw, draw_vertex_count);
-
-   radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
-
-   radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_ALL_GRAPHICS);
-   radv_flush_constants(cmd_buffer, cmd_buffer->state.pipeline,
-VK_SHADER_STAGE_ALL_GRAPHICS);
-
-   assert(cmd_buffer->cs->cdw <= cdw_max);
-
-   si_emit_cache_flush(cmd_buffer);
-}
-
 static void radv_stage_flush(struct radv_cmd_buffer *cmd_buffer,
 VkPipelineStageFlags src_stage_mask)
 {
@@ -2985,13 +2953,6 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
struct radv_device *device = cmd_buffer->device;
struct radeon_winsys_cs *cs = cmd_buffer->cs;
 
-   radv_cmd_buffer_flush_state(cmd_buffer, info->indexed,
-   info->instance_count > 1, info->indirect,
-   info->indirect ? 0 : info->count);
-
-   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(device->ws, cs,
-  31 * MAX_VIEWS);
-
if (info->indexed) {
if (device->physical_device->rad_info.chip_class >= GFX9) {
radeon_set_uconfig_reg_idx(cs, R_03090C_VGT_INDEX_TYPE,
@@ -3099,8 +3060,40 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
}
}
}
+}
 
-   assert(cs->cdw <= cdw_max);
+static void
+radv_draw(struct radv_cmd_buffer *cmd_buffer,
+ const struct radv_draw_info *info)
+{
+   MAYBE_UNUSED unsigned cdw_max =
+   radeon_check_space(cmd_buffer->device->ws,
+  cmd_buffer->cs, 4096);
+
+   if (!radv_cmd_buffer_update_vertex_descriptors(cmd_buffer))
+   return;
+
+   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE)
+   radv_emit_graphics_pipeline(cmd_buffer);
+
+   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_RENDER_TARGETS)
+   radv_emit_framebuffer_state(cmd_buffer);
+
+   radv_emit_draw_registers(cmd_buffer, info->indexed,
+info->instance_count > 1, info->indirect,
+info->indirect ? 0 : info->count);
+
+   radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
+
+   radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_ALL_GRAPHICS);
+   radv_flush_constants(cmd_buffer, cmd_buffer->state.pipeline,
+VK_SHADER_STAGE_ALL_GRAPHICS);
+
+   si_emit_cache_flush(cmd_buffer);
+
+   radv_emit_draw_packets(cmd_buffer, info);
+
+   assert(cmd_buffer->cs->cdw <= cdw_max);
radv_cmd_buffer_after_draw(cmd_buffer);
 }
 
@@ -3119,7 +3112,7 @@ void radv_CmdDraw(
info.first_instance = firstInstance;
info.vertex_offset = firstVertex;
 
-   radv_emit_draw_packets(cmd_buffer, &info);
+   radv_draw(cmd_buffer, &info);
 }
 
 void radv_CmdDrawIndexed(
@@ -3140,7 +3133,7 @@ void radv_CmdDrawIndexed(
info.vertex_offset = vertexOffset;
info.first_instance = firstInstance;
 
-   radv_emit_draw_packets(cmd_buffer, &info);
+   radv_draw(cmd_buffer, &info);
 }
 
 void radv_CmdDrawIndirect(
@@ -3159,7 +3152,7 @@ void radv_CmdDrawIndirect(
info.indirect_offset = offset;
info.stride = stride;
 
-   radv_emit_draw_packets(cmd_buffer, &info);
+   radv_draw(cmd_buffer, &info);
 }
 
 void radv_CmdDrawIndexedIndirect(
@@ -3179,7 +3172,7 @@ void radv_CmdDrawIndexedIndirect(
info.indirect_offset = offset;
info.stride

Re: [Mesa-dev] [PATCH 1/9] radv: refactor simple and indexed draws with radv_draw_info

2017-10-17 Thread Samuel Pitoiset


Just noticed that I missed the predicate stuff.


[Mesa-dev] [PATCH] [AMD] dri3: Add adaptive_sync_enable driconf option

2017-10-17 Thread Nicolai Hähnle

From: Nicolai Hähnle 

When enabled, this will request FreeSync via the hybrid amdgpu DDX's
AMDGPU X11 protocol extension.

Due to limitations in the DDX this will only work for applications
that cover the entire X screen (which is important to keep in mind when
you have a multi-monitor setup).
--
We currently have no plans to upstream this patch in this form. So
this is mostly informational.

Still, for any future upstream solution, it would be nice to expose
the same adaptive_sync_enable option, to reduce user confusion.
The meaning of that option is basically: enable FreeSync / use
VESA Adaptive Sync in the way that makes sense for games, i.e. try
to produce frames as fast as possible, and adjust the monitor refresh
rate to the game's update rate.
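
As a rough illustration of "adjust the monitor refresh rate to the game's
update rate", the sketch below clamps a frame time to a panel's supported
refresh window. The struct, its fields and the function name are
hypothetical and not part of this patch or of any existing interface; real
hardware behaviour outside the supported range may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a panel's variable refresh range, in ns. */
struct refresh_range {
	uint64_t min_duration_ns; /* shortest time between two refreshes */
	uint64_t max_duration_ns; /* longest time a frame may stay on screen */
};

/* Refresh interval the monitor would use for a game producing one frame
 * every frame_time_ns, assuming it simply follows the game inside its
 * supported range and saturates outside of it. */
uint64_t effective_refresh_ns(const struct refresh_range *r,
			      uint64_t frame_time_ns)
{
	assert(r && r->min_duration_ns <= r->max_duration_ns);

	if (frame_time_ns < r->min_duration_ns)
		return r->min_duration_ns;
	if (frame_time_ns > r->max_duration_ns)
		return r->max_duration_ns;
	return frame_time_ns;
}
```

For a hypothetical panel supporting 40 Hz to 144 Hz, a game running at
100 fps would be displayed at its own 10 ms cadence, while faster or
slower games would be limited by the panel.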
---
 src/gallium/drivers/radeonsi/driinfo_radeonsi.h |  4 ++
 src/loader/loader_dri3_helper.c | 60 -
 src/loader/loader_dri3_helper.h |  1 +
 src/util/xmlpool/t_options.h|  5 +++
 4 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
index 402d3406d45..29b4346a726 100644
--- a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
+++ b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
@@ -1,10 +1,14 @@
 // DriConf options specific to radeonsi
+DRI_CONF_SECTION_QUALITY
+   DRI_CONF_ADAPTIVE_SYNC_ENABLE("false")
+DRI_CONF_SECTION_END
+
 DRI_CONF_SECTION_PERFORMANCE
 DRI_CONF_RADEONSI_ENABLE_SISCHED("false")
 DRI_CONF_RADEONSI_ASSUME_NO_Z_FIGHTS("false")
 DRI_CONF_RADEONSI_COMMUTATIVE_BLEND_ADD("false")
 DRI_CONF_SECTION_END
 
 DRI_CONF_SECTION_DEBUG
DRI_CONF_RADEONSI_CLEAR_DB_META_BEFORE_CLEAR("false")
 DRI_CONF_SECTION_END
diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 19ab5815100..c685afa7661 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -15,25 +15,27 @@
  * THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
  * INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO
  * EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR
  * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,
  * DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
  * TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
  * OF THIS SOFTWARE.
  */
 
 #include 
+#include 
 #include 
 #include 
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
 #include 
 
 #include 
 #include "loader_dri3_helper.h"
 
 /* From xmlpool/options.h, user exposed so should be stable */
 #define DRI_CONF_VBLANK_NEVER 0
@@ -240,20 +242,64 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw)
if (draw->special_event) {
   xcb_void_cookie_t cookie =
       xcb_present_select_input_checked(draw->conn, draw->eid, draw->drawable,
   XCB_PRESENT_EVENT_MASK_NO_EVENT);
 
   xcb_discard_reply(draw->conn, cookie.sequence);
   xcb_unregister_for_special_event(draw->conn, draw->special_event);
}
 }
 
+#define X_AMDGPUFreesyncCapability 0
+
+/* Requests must be a multiple of 4 bytes */
+typedef struct _AMDGPUFreesyncCapabilityReq {
+   uint8_t reqType;
+   uint8_t amdgpuReqType;
+   uint16_t length;
+   uint32_t screen;
+   uint32_t drawable;
+} xAMDGPUFreesyncCapabilityReq;
+
+static xcb_extension_t amdgpu_ext_id = { "AMDGPU", 0 };
+
+static bool
+loader_dri3_amdgpu_freesync_enable(xcb_connection_t *conn,
+   xcb_drawable_t drawable)
+{
+   const xcb_query_extension_reply_t *extension;
+
+   extension = xcb_get_extension_data(conn, &amdgpu_ext_id);
+   if (!(extension && extension->present)) {
+  fprintf(stderr, "AMDGPU extension not present -- cannot enable FreeSync\n");
+  return false;
+   }
+
+   const xcb_protocol_request_t xcb_req = {
+  .count = 1,
+  .ext = &amdgpu_ext_id,
+  .opcode = X_AMDGPUFreesyncCapability,
+  .isvoid = 1,
+   };
+   xAMDGPUFreesyncCapabilityReq req;
+   struct iovec xcb_parts[3];
+
+   req.screen = 0; /* TODO: do we need to support multiple screens? */
+   req.drawable = drawable;
+
+   xcb_parts[2].iov_base = (char *)&req;
+   xcb_parts[2].iov_len = sizeof(req);
+
+   xcb_send_request(conn, 0, xcb_parts + 2, &xcb_req);
+   return true;
+}
+
 int
 loader_dri3_drawable_init(xcb_connection_t *conn,
   xcb_drawable_t drawable,
   __DRIscreen *dri_screen,
   bool is_different_gpu,
   const __DRIconfig *dri_config,
   struct loader_dri3_extensions *ext,
   const struct loader_dri3_vtable *vtable,
   struct loader_dri3_drawable *draw)
 {
@@ -269,25 +315,32 @@ loader_dri3_drawable_init(xcb_connection_t *conn,

[Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Nicolai Hähnle


Hi all,

I just sent out a patch that enables FreeSync in Mesa for the somewhat 
hacked implementation of FreeSync that exists in our hybrid (amdgpu-pro) 
stack's DDX and kernel module. [0]


While this patch isn't meant for upstream, this is as good a time as any 
to raise the question of what a proper upstream solution would look like. It 
needs to cut across the entire stack, and we should try to align the KMS 
interface with the X11 protocol and the Wayland protocol.


Prior art that I'm aware of includes:

1. The Present protocol extension has a PresentOptionUST bit for 
requesting a specific present time, but the reality is that the 
implementation of that is even less than what the protocol docs claim. [1]


2. There's a VK_GOOGLE_display_timing extension which similarly allows 
providing a desiredPresentTime (in ns).


3. Keith Packard's CRTC_{GET,QUEUE}_SEQUENCE is not specific to Adaptive 
Sync, but seems like something Adaptive Sync-aware applications would 
want to use. [2]


Common sense suggests that there need to be two sides to FreeSync / VESA 
Adaptive Sync support:


1. Query the display capabilities. This means querying minimum / maximum 
refresh duration, plus possibly a query for the earliest/latest 
timing of the *next* refresh.


2. Signal desired present time. This means passing a target timer value 
instead of a target vblank count, e.g. something like this for the KMS 
interface:


  int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
  uint32_t flags, void *user_data,
  uint64_t target);

  + a flag to indicate whether target is the vblank count or the 
CLOCK_MONOTONIC (?) time in ns.
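As a rough illustration of the "target timer value" idea, here is a minimal sketch of how a client might compute a CLOCK_MONOTONIC target in nanoseconds. The flag names and helper functions are hypothetical; nothing like them exists in libdrm, and drmModePageFlipTarget64 itself is only the proposal above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical flag values, invented for this sketch only. */
#define SKETCH_FLIP_TARGET_VBLANK  (1u << 0) /* target is a vblank count */
#define SKETCH_FLIP_TARGET_TIME_NS (1u << 1) /* target is CLOCK_MONOTONIC ns */

/* Convert a timespec to a single nanosecond count. */
static uint64_t
sketch_timespec_to_ns(const struct timespec *ts)
{
   return (uint64_t)ts->tv_sec * 1000000000ull + (uint64_t)ts->tv_nsec;
}

/* Desired present time: "now" on CLOCK_MONOTONIC plus delay_ns.
 * Returns 0 if the clock cannot be read.
 */
static uint64_t
sketch_present_time_ns(uint64_t delay_ns)
{
   struct timespec now;

   if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
      return 0;

   return sketch_timespec_to_ns(&now) + delay_ns;
}
```

A caller would pass the result as the 64-bit target together with the (hypothetical) time flag, and a vblank count together with the vblank flag.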


These two sides then need to be plumbed through the entire stack.

I guess for now the main questions are: Is there more "prior art" that 
we should be aware of? Does anybody have partial prototypes or very 
strong feelings about what the interfaces and protocols should look like?


Cheers,
Nicolai

[0] https://patchwork.freedesktop.org/patch/183117/

[1] 
https://cgit.freedesktop.org/xorg/proto/presentproto/tree/presentproto.txt


[2] https://lists.freedesktop.org/archives/dri-devel/2017-August/148905.html
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: use the dispatch initiator for indirect dispatches

2017-10-17 Thread Samuel Pitoiset

Missed that when I allowed waves to be launched out-of-order.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 9ebdbb011a..5358ae71bd 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -3322,6 +3322,7 @@ radv_emit_dispatch_packets(struct radv_cmd_buffer 
*cmd_buffer,
struct radeon_winsys *ws = cmd_buffer->device->ws;
struct radeon_winsys_cs *cs = cmd_buffer->cs;
struct ac_userdata_info *loc;
+   unsigned dispatch_initiator;
uint8_t grid_used;
 
grid_used = compute_shader->info.info.cs.grid_components_used;
@@ -3331,6 +3332,16 @@ radv_emit_dispatch_packets(struct radv_cmd_buffer 
*cmd_buffer,
 
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(ws, cs, 25);
 
+   dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1) |
+S_00B800_FORCE_START_AT_000(1);
+
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= CIK) {
+   /* If the KMD allows it (there is a KMD hw register for it),
+* allow launching waves out-of-order.
+*/
+   dispatch_initiator |= S_00B800_ORDER_MODE(1);
+   }
+
if (info->indirect) {
uint64_t va = radv_buffer_get_va(info->indirect->bo);
 
@@ -3356,7 +3367,7 @@ radv_emit_dispatch_packets(struct radv_cmd_buffer 
*cmd_buffer,
PKT3_SHADER_TYPE_S(1));
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
-   radeon_emit(cs, 1);
+   radeon_emit(cs, dispatch_initiator);
} else {
radeon_emit(cs, PKT3(PKT3_SET_BASE, 2, 0) |
PKT3_SHADER_TYPE_S(1));
@@ -3367,19 +3378,10 @@ radv_emit_dispatch_packets(struct radv_cmd_buffer 
*cmd_buffer,
radeon_emit(cs, PKT3(PKT3_DISPATCH_INDIRECT, 1, 0) |
PKT3_SHADER_TYPE_S(1));
radeon_emit(cs, 0);
-   radeon_emit(cs, 1);
+   radeon_emit(cs, dispatch_initiator);
}
} else {
unsigned blocks[3] = { info->blocks[0], info->blocks[1], 
info->blocks[2] };
-   unsigned dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1) |
- S_00B800_FORCE_START_AT_000(1);
-
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
CIK) {
-   /* If the KMD allows it (there is a KMD hw register for
-* it), allow launching waves out-of-order.
-*/
-   dispatch_initiator |= S_00B800_ORDER_MODE(1);
-   }
 
if (info->unaligned) {
unsigned *cs_block_size = 
compute_shader->info.cs.block_size;
-- 
2.14.2


Re: [Mesa-dev] [PATCH] [AMD] dri3: Add adaptive_sync_enable driconf option

2017-10-17 Thread Michel Dänzer

On 17/10/17 11:33 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> When enabled, this will request FreeSync via the hybrid amdgpu DDX's
> AMDGPU X11 protocol extension.
> 
> Due to limitations in the DDX this will only work for applications
> that cover the entire X screen (which is important to keep in mind when
> you have a multi-monitor setup).

This limitation already applies to page flipping in general, it's not
specific to FreeSync.


> @@ -269,25 +315,32 @@ loader_dri3_drawable_init(xcb_connection_t *conn,
> draw->drawable = drawable;
> draw->dri_screen = dri_screen;
> draw->is_different_gpu = is_different_gpu;
>  
> draw->have_back = 0;
> draw->have_fake_front = 0;
> draw->first_init = true;
>  
> draw->cur_blit_source = -1;
> draw->back_format = __DRI_IMAGE_FORMAT_NONE;
> +   draw->adaptive_sync = false;
>  
> -   if (draw->ext->config)
> +   if (draw->ext->config) {
>draw->ext->config->configQueryi(draw->dri_screen,
>"vblank_mode", &vblank_mode);
>  
> +  unsigned char adaptive_sync_enable = 1;

Adaptive sync shouldn't be enabled by default. (Maybe that effectively
isn't the case, but then initializing this variable to 1 is just
confusing :)


> @@ -767,20 +820,25 @@ loader_dri3_swap_buffers_msc(struct 
> loader_dri3_drawable *draw,
>   bool force_copy)
>  {
> struct loader_dri3_buffer *back;
> int64_t ret = 0;
> uint32_t options = XCB_PRESENT_OPTION_NONE;
>  
> draw->vtable->flush_drawable(draw, flush_flags);
>  
> back = dri3_find_back_alloc(draw);
>  
> +   if (draw->adaptive_sync) {
> +  if (!loader_dri3_amdgpu_freesync_enable(draw->conn, draw->drawable))
> + draw->adaptive_sync = false;
> +   }

The AMDGPUFreesyncCapability request only has to be submitted once per
window, not for every buffer swap.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH 04/10] gbm: handle queryImage() failure for GBM_BO_IMPORT_EGL_IMAGE

2017-10-17 Thread Eric Engestrom

On Monday, 2017-10-16 16:04:06 +, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The function can fail. Check and teardown accordingly.
> 
> Fixes: a43d286ef7f ("gbm: Add import from fd")
> Cc: Kristian Høgsberg 
> Signed-off-by: Emil Velikov 
> ---
>  src/gbm/backends/dri/gbm_dri.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index 4a51bd39903..9c9066e6661 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -891,6 +891,7 @@ gbm_dri_bo_import(struct gbm_device *gbm,
> __DRIimage *image;
> unsigned dri_use = 0;
> int gbm_format;
> +   unsigned query; /* EGLBoolean, but we cannot include the header */

`bool`? <stdbool.h> is already included.

>  
> /* Required for query image WIDTH & HEIGHT */
> if (dri->image == NULL || dri->image->base.version < 4) {
> @@ -934,7 +935,12 @@ gbm_dri_bo_import(struct gbm_device *gbm,
>  
>image = dri->lookup_image(dri->screen, buffer, dri->lookup_user_data);
>image = dri->image->dupImage(image, NULL);
> -  dri->image->queryImage(image, __DRI_IMAGE_ATTRIB_FORMAT, &dri_format);
> +  query = dri->image->queryImage(image, __DRI_IMAGE_ATTRIB_FORMAT, 
> &dri_format);
> +  if (!query) {
> + errno = EINVAL;
> + dri->image->destroyImage(image);
> + break;
> +  }
>gbm_format = gbm_dri_to_gbm_format(dri_format);
>if (gbm_format == 0) {
>   errno = EINVAL;
> -- 
> 2.14.1
> 

Re: [Mesa-dev] [PATCH 05/10] gbm: handle queryImage() failure through rest of gbm_dri_bo_import()

2017-10-17 Thread Eric Engestrom

On Monday, 2017-10-16 16:04:07 +, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Fixes: 6a7dea93fa7 ("dri: Rework planar image interface")
> Cc: Jakob Bornecrantz 
> Signed-off-by: Emil Velikov 
> ---
>  src/gbm/backends/dri/gbm_dri.c | 23 +++
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index 9c9066e6661..a6c80cf1ec7 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -1048,14 +1048,21 @@ gbm_dri_bo_import(struct gbm_device *gbm,
> bo->base.gbm = gbm;
> bo->base.format = gbm_format;
>  
> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_WIDTH,
> -  (int*)&bo->base.width);
> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HEIGHT,
> -  (int*)&bo->base.height);
> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_STRIDE,
> -  (int*)&bo->base.stride);
> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HANDLE,
> -  &bo->base.handle.s32);
> +   query = dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_WIDTH,
> +  (int*)&bo->base.width);
> +   query &= dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HEIGHT,

With a bitwise AND, each operand really needs to be 0 or 1 here. An
implementation could return `2`, which counts as true when treated as a
bool, but `(1 & 2) == 0`, which is not what we want here.
Please add `!!` in front of the queryImage() calls.
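
To make the point concrete, here is a minimal standalone sketch. The two helper names are made up for illustration and plain unsigned values stand in for the queryImage() return values.

```c
/* Accumulate four results with a bitwise AND, without normalizing.
 * A "true" value other than 1 can wrongly clear the accumulator.
 */
static unsigned
combine_raw(unsigned a, unsigned b, unsigned c, unsigned d)
{
   unsigned query = a;

   query &= b;
   query &= c;
   query &= d;
   return query;
}

/* Same accumulation, but each result is first collapsed to 0 or 1. */
static unsigned
combine_normalized(unsigned a, unsigned b, unsigned c, unsigned d)
{
   unsigned query = !!a;

   query &= !!b;
   query &= !!c;
   query &= !!d;
   return query;
}
```

With the inputs (1, 2, 1, 1), all of which are "true", combine_raw() yields 0 while combine_normalized() yields 1.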

> +   (int*)&bo->base.height);
> +   query &= dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_STRIDE,
> +   (int*)&bo->base.stride);
> +   query &= dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HANDLE,
> +   &bo->base.handle.s32);
> +
> +   if (!query) {
> +  errno = EINVAL;
> +  dri->image->destroyImage(bo->image);
> +  free(bo);
> +  return NULL;
> +   }
>  
> return &bo->base;
>  }
> -- 
> 2.14.1
> 

Re: [Mesa-dev] [PATCH 1/2] nir: Add a helper for adding texture instruction sources

2017-10-17 Thread Lionel Landwerlin


Thanks!

Reviewed-by: Lionel Landwerlin 

On 17/10/17 02:09, Jason Ekstrand wrote:

---
  src/compiler/nir/nir.c   | 22 +++
  src/compiler/nir/nir.h   |  4 
  src/compiler/nir/nir_lower_samplers.c| 27 ++--
  src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 19 +
  4 files changed, 29 insertions(+), 43 deletions(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index afd4d1a..5bc07b7 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -542,6 +542,28 @@ nir_tex_instr_create(nir_shader *shader, unsigned num_srcs)
  }
  
  void

+nir_tex_instr_add_src(nir_tex_instr *tex,
+  nir_tex_src_type src_type,
+  nir_src src)
+{
+   nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src,
+ tex->num_srcs + 1);
+
+   for (unsigned i = 0; i < tex->num_srcs; i++) {
+  new_srcs[i].src_type = tex->src[i].src_type;
+  nir_instr_move_src(&tex->instr, &new_srcs[i].src,
+ &tex->src[i].src);
+   }
+
+   ralloc_free(tex->src);
+   tex->src = new_srcs;
+
+   tex->src[tex->num_srcs].src_type = src_type;
+   nir_instr_rewrite_src(&tex->instr, &tex->src[tex->num_srcs].src, src);
+   tex->num_srcs++;
+}
+
+void
  nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
  {
 assert(src_idx < tex->num_srcs);
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index bd6035e..70c23c2 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1379,6 +1379,10 @@ nir_tex_instr_src_index(const nir_tex_instr *instr, 
nir_tex_src_type type)
 return -1;
  }
  
+void nir_tex_instr_add_src(nir_tex_instr *tex,

+   nir_tex_src_type src_type,
+   nir_src src);
+
  void nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx);
  
  typedef struct {

diff --git a/src/compiler/nir/nir_lower_samplers.c 
b/src/compiler/nir/nir_lower_samplers.c
index 0c4e91b..f75fb1a 100644
--- a/src/compiler/nir/nir_lower_samplers.c
+++ b/src/compiler/nir/nir_lower_samplers.c
@@ -109,32 +109,9 @@ lower_sampler(nir_tex_instr *instr, const struct 
gl_shader_program *shader_progr
assert(array_elements >= 1);
indirect = nir_umin(b, indirect, nir_imm_int(b, array_elements - 1));
  
-  /* First, we have to resize the array of texture sources */

-  nir_tex_src *new_srcs = rzalloc_array(instr, nir_tex_src,
-instr->num_srcs + 2);
-
-  for (unsigned i = 0; i < instr->num_srcs; i++) {
- new_srcs[i].src_type = instr->src[i].src_type;
- nir_instr_move_src(&instr->instr, &new_srcs[i].src,
-&instr->src[i].src);
-  }
-
-  ralloc_free(instr->src);
-  instr->src = new_srcs;
-
-  /* Now we can go ahead and move the source over to being a
-   * first-class texture source.
-   */
-  instr->src[instr->num_srcs].src_type = nir_tex_src_texture_offset;
-  instr->num_srcs++;
-  nir_instr_rewrite_src(&instr->instr,
-&instr->src[instr->num_srcs - 1].src,
+  nir_tex_instr_add_src(instr, nir_tex_src_texture_offset,
  nir_src_for_ssa(indirect));
-
-  instr->src[instr->num_srcs].src_type = nir_tex_src_sampler_offset;
-  instr->num_srcs++;
-  nir_instr_rewrite_src(&instr->instr,
-&instr->src[instr->num_srcs - 1].src,
+  nir_tex_instr_add_src(instr, nir_tex_src_sampler_offset,
  nir_src_for_ssa(indirect));
  
instr->texture_array_size = array_elements;

diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c 
b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
index 26e7dcc..1c86513 100644
--- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
+++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
@@ -157,24 +157,7 @@ lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
   if (state->add_bounds_checks)
  index = nir_umin(b, index, nir_imm_int(b, array_size - 1));
  
- nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src,

-   tex->num_srcs + 1);
-
- for (unsigned i = 0; i < tex->num_srcs; i++) {
-new_srcs[i].src_type = tex->src[i].src_type;
-nir_instr_move_src(&tex->instr, &new_srcs[i].src, 
&tex->src[i].src);
- }
-
- ralloc_free(tex->src);
- tex->src = new_srcs;
-
- /* Now we can go ahead and move the source over to being a
-  * first-class texture source.
-  */
- tex->src[tex->num_srcs].src_type = src_type;
- nir_instr_rewrite_src(&tex->instr, &tex->src[tex->num_srcs].src,
-   nir_src_for_ssa(index));
- tex->num_srcs++;
+ nir_tex_instr_add_src(tex, src_typ

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Michel Dänzer

On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> Hi all,
> 
> I just sent out a patch that enables FreeSync in Mesa for the somewhat
> hacked implementation of FreeSync that exists in our hybrid (amdgpu-pro)
> stack's DDX and kernel module. [0]
> 
> While this patch isn't meant for upstream, this is as good a time as any
> to raise the question of what a proper upstream solution would look like. It
> needs to cut across the entire stack, and we should try to align the KMS
> interface with the X11 protocol and the Wayland protocol.
> 
> Prior art that I'm aware of includes:
> 
> 1. The Present protocol extension has a PresentOptionUST bit for
> requesting a specific present time, but the reality is that the
> implementation of that is even less than what the protocol docs claim. [1]

FWIW, I do think this could be a good way for clients to signal that
they want a frame to be displayed ASAP. It would also allow for e.g.
video players to naturally adapt the refresh rate to the video frame
rate (the VDPAU presentation API has a target timestamp for this).


> 3. Keith Packard's CRTC_{GET,QUEUE}_SEQUENCE is not specific to Adaptive
> Sync, but seems like something Adaptive Sync-aware applications would
> want to use. [2]

Not sure I can agree with that. Applications should use higher level
APIs, not low level ones like these directly. (Also, they're basically
just KMS variants of DRM_IOCTL_WAIT_VBLANK, not directly related to
adaptive sync)


> Common sense suggests that there need to be two sides to FreeSync / VESA
> Adaptive Sync support:
> 
> 1. Query the display capabilities. This means querying minimum / maximum
> refresh duration, plus possibly a query for the earliest/latest
> timing of the *next* refresh.
> 
> 2. Signal desired present time. This means passing a target timer value
> instead of a target vblank count, e.g. something like this for the KMS
> interface:
> 
>   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
>   uint32_t flags, void *user_data,
>   uint64_t target);
> 
>   + a flag to indicate whether target is the vblank count or the
> CLOCK_MONOTONIC (?) time in ns.

drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
sync should probably only be supported via the atomic API, presumably
via output properties.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH] [AMD] dri3: Add adaptive_sync_enable driconf option

2017-10-17 Thread Nicolai Hähnle


On 17.10.2017 12:07, Michel Dänzer wrote:

> On 17/10/17 11:33 AM, Nicolai Hähnle wrote:
>> From: Nicolai Hähnle 
>>
>> When enabled, this will request FreeSync via the hybrid amdgpu DDX's
>> AMDGPU X11 protocol extension.
>>
>> Due to limitations in the DDX this will only work for applications
>> that cover the entire X screen (which is important to keep in mind when
>> you have a multi-monitor setup).
>
> This limitation already applies to page flipping in general, it's not
> specific to FreeSync.


Maybe it shouldn't apply in general? After all, what if you have one 
monitor with 60Hz and one with 90Hz?




>> @@ -269,25 +315,32 @@ loader_dri3_drawable_init(xcb_connection_t *conn,
>>      draw->drawable = drawable;
>>      draw->dri_screen = dri_screen;
>>      draw->is_different_gpu = is_different_gpu;
>>
>>      draw->have_back = 0;
>>      draw->have_fake_front = 0;
>>      draw->first_init = true;
>>
>>      draw->cur_blit_source = -1;
>>      draw->back_format = __DRI_IMAGE_FORMAT_NONE;
>> +   draw->adaptive_sync = false;
>>
>> -   if (draw->ext->config)
>> +   if (draw->ext->config) {
>>         draw->ext->config->configQueryi(draw->dri_screen,
>>                                         "vblank_mode", &vblank_mode);
>>
>> +  unsigned char adaptive_sync_enable = 1;
>
> Adaptive sync shouldn't be enabled by default. (Maybe that effectively
> isn't the case, but then initializing this variable to 1 is just
> confusing :)


You're right, I'll change it :)



>> @@ -767,20 +820,25 @@ loader_dri3_swap_buffers_msc(struct loader_dri3_drawable *draw,
>>                               bool force_copy)
>>   {
>>      struct loader_dri3_buffer *back;
>>      int64_t ret = 0;
>>      uint32_t options = XCB_PRESENT_OPTION_NONE;
>>
>>      draw->vtable->flush_drawable(draw, flush_flags);
>>
>>      back = dri3_find_back_alloc(draw);
>>
>> +   if (draw->adaptive_sync) {
>> +  if (!loader_dri3_amdgpu_freesync_enable(draw->conn, draw->drawable))
>> + draw->adaptive_sync = false;
>> +   }
>
> The AMDGPUFreesyncCapability request only has to be submitted once per
> window, not for every buffer swap.


Unfortunately, that's not true due to the implementation in the DDX.

For one, we *definitely* have to re-submit the request at least each 
time the window is moved or resized, because the request doesn't 
register unless the window is already fullscreen.


For another, some applications create multiple windows, but the DDX can 
only keep track of one.


There's perhaps an argument to be made that both of these issues would 
better be fixed in the DDX. However, since this is a non-upstream 
interim solution anyway, and the final upstream solution will likely 
also do something in swap_buffers, I think this approach is reasonable.


Correct me if I got some of the xcb code wrong, but from what I 
understand there should be no stalls associated with it, just a bunch of 
additional bytes sent over the socket on each swap buffers, which 
doesn't seem too bad to me.


(For what it's worth, the closed GL driver sends this request on 
glXMakeCurrent, which makes even less sense to me :-))


Cheers,
Nicolai







--
Learn how the world really is,
but never forget how it should be.

Re: [Mesa-dev] [PATCH 08/10] i965: implement DRIImage::createImageFromRenderbuffer2

2017-10-17 Thread Eric Engestrom

On Monday, 2017-10-16 16:04:10 +, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The new entry point has a way to report the error back. Thus we no longer
> need to call _mesa_error() but instead we can pass the correct value.
> 
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 31 ++-
>  1 file changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index ea04a72e860..7cbb5e3b060 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -474,8 +474,9 @@ intel_create_image_from_name(__DRIscreen *dri_screen,
>  }
>  
>  static __DRIimage *
> -intel_create_image_from_renderbuffer(__DRIcontext *context,
> -  int renderbuffer, void *loaderPrivate)
> +intel_create_image_from_renderbuffer2(__DRIcontext *context,
> +   int renderbuffer, void *loaderPrivate,
> +   unsigned *error)
>  {
> __DRIimage *image;
> struct brw_context *brw = context->driverPrivate;
> @@ -485,15 +486,17 @@ intel_create_image_from_renderbuffer(__DRIcontext 
> *context,
>  
> rb = _mesa_lookup_renderbuffer(ctx, renderbuffer);
> if (!rb) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glRenderbufferExternalMESA");
> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
>return NULL;
> }

[there]

>  
> irb = intel_renderbuffer(rb);
> intel_miptree_make_shareable(brw, irb->mt);
> image = calloc(1, sizeof *image);
> -   if (image == NULL)
> +   if (image == NULL) {
> +  *error = __DRI_IMAGE_ERROR_BAD_ALLOC;
>return NULL;
> +   }
>  
> image->internal_format = rb->InternalFormat;
> image->format = rb->Format;
> @@ -508,12 +511,29 @@ intel_create_image_from_renderbuffer(__DRIcontext 
> *context,
> image->height = rb->Height;
> image->pitch = irb->mt->surf.row_pitch;
> image->dri_format = driGLFormatToImageFormat(image->format);
> +   if (image->dri_format == __DRI_IMAGE_FORMAT_NONE) {

__DRI_IMAGE_FORMAT_NONE can only come from MESA_FORMAT_NONE; did you
mean `== 0`? Or both?

> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
> +  brw_bo_unreference(irb->mt->bo);
> +  free(image);
> +  return NULL;
> +   }

You can move this check before the calloc and ref ([there] above), so
that the error path becomes just setting `error` and returning NULL.

You'll want a temp var to avoid calling driGLFormatToImageFormat()
twice:
uint32_t dri_format = driGLFormatToImageFormat(rb->Format);
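
A minimal sketch of that reordering, using stand-in types and placeholder constants rather than the real i965/DRI definitions (every `sketch_` / `SKETCH_` name is hypothetical). The format is validated before anything is allocated or referenced, so the failure path only sets `error` and returns.

```c
#include <stdint.h>
#include <stdlib.h>

/* Placeholder error codes and format value; not the real DRI constants. */
enum {
   SKETCH_ERROR_SUCCESS = 0,
   SKETCH_ERROR_BAD_ALLOC = 1,
   SKETCH_ERROR_BAD_PARAMETER = 2,
};
#define SKETCH_FORMAT_NONE 0u

struct sketch_renderbuffer { uint32_t format; };
struct sketch_image { uint32_t format; uint32_t dri_format; };

/* Stand-in for driGLFormatToImageFormat(): only format 1 is "known". */
static uint32_t
sketch_format_to_image_format(uint32_t format)
{
   return format == 1 ? 0x1002u : SKETCH_FORMAT_NONE;
}

static struct sketch_image *
sketch_create_image(const struct sketch_renderbuffer *rb, unsigned *error)
{
   if (!rb) {
      *error = SKETCH_ERROR_BAD_PARAMETER;
      return NULL;
   }

   /* Check the format first: nothing to free or unreference on failure. */
   uint32_t dri_format = sketch_format_to_image_format(rb->format);
   if (dri_format == SKETCH_FORMAT_NONE) {
      *error = SKETCH_ERROR_BAD_PARAMETER;
      return NULL;
   }

   struct sketch_image *image = calloc(1, sizeof *image);
   if (image == NULL) {
      *error = SKETCH_ERROR_BAD_ALLOC;
      return NULL;
   }

   image->format = rb->format;
   image->dri_format = dri_format; /* reuse the temp, no second lookup */
   *error = SKETCH_ERROR_SUCCESS;
   return image;
}
```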

> +
> image->has_depthstencil = irb->mt->stencil_mt? true : false;
>  
> rb->NeedsFinishRenderTexture = true;
> +   *error = __DRI_IMAGE_ERROR_SUCCESS;
> return image;
>  }
>  
> +static __DRIimage *
> +intel_create_image_from_renderbuffer(__DRIcontext *context,
> +  int renderbuffer, void *loaderPrivate)
> +{
> +   unsigned error;
> +   return intel_create_image_from_renderbuffer2(context, renderbuffer,
> +loaderPrivate, &error);

if (error == __DRI_IMAGE_ERROR_BAD_PARAMETER)
_mesa_error(ctx, GL_INVALID_OPERATION, 
"glRenderbufferExternalMESA");

(maybe even `if (error != __DRI_IMAGE_ERROR_SUCCESS)`?)

> +}
> +
>  static __DRIimage *
>  intel_create_image_from_texture(__DRIcontext *context, int target,
>  unsigned texture, int zoffset,
> @@ -1289,7 +1309,7 @@ intel_from_planar(__DRIimage *parent, int plane, void 
> *loaderPrivate)
>  }
>  
>  static const __DRIimageExtension intelImageExtension = {
> -.base = { __DRI_IMAGE, 16 },
> +.base = { __DRI_IMAGE, 17 },
>  
>  .createImageFromName= intel_create_image_from_name,
>  .createImageFromRenderbuffer= 
> intel_create_image_from_renderbuffer,
> @@ -1312,6 +1332,7 @@ static const __DRIimageExtension intelImageExtension = {
>  .queryDmaBufFormats = intel_query_dma_buf_formats,
>  .queryDmaBufModifiers   = intel_query_dma_buf_modifiers,
>  .queryDmaBufFormatModifierAttribs   = 
> intel_query_format_modifier_attribs,
> +.createImageFromRenderbuffer2   = 
> intel_create_image_from_renderbuffer2,
>  };
>  
>  static uint64_t
> -- 
> 2.14.1
> 

Re: [Mesa-dev] [PATCH V3] automake: intel: move expat handling where it's used

2017-10-17 Thread Lionel Landwerlin


Yeah, it also applies to i965.
I'm guessing we haven't seen that problem because expat gets pulled in 
by other bits of Mesa.


Looks good to me, thanks :

Reviewed-by: Lionel Landwerlin 

On 17/10/17 02:10, Hongxu Jia wrote:

Linking libvulkan_intel.so can fail, due to unresolved references to
libexpat.so.

EXPAT_CFLAGS should be moved as well.

Signed-off-by: Hongxu Jia 
---
  src/intel/Makefile.common.am | 1 +
  src/intel/Makefile.tools.am  | 2 --
  2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/intel/Makefile.common.am b/src/intel/Makefile.common.am
index 49e9c6a..3789dc1 100644
--- a/src/intel/Makefile.common.am
+++ b/src/intel/Makefile.common.am
@@ -23,6 +23,7 @@ noinst_LTLIBRARIES += common/libintel_common.la
  
  common_libintel_common_la_CFLAGS = $(AM_CFLAGS) $(LIBDRM_CFLAGS)

  common_libintel_common_la_SOURCES = $(COMMON_FILES)
+common_libintel_common_la_LIBADD = $(EXPAT_LIBS)
  
  if HAVE_PLATFORM_ANDROID

  common_libintel_common_la_CFLAGS += $(ANDROID_CFLAGS)
diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
index 8071220..32cdc70 100644
--- a/src/intel/Makefile.tools.am
+++ b/src/intel/Makefile.tools.am
@@ -41,7 +41,6 @@ tools_aubinator_LDADD = \
$(PER_GEN_LIBS) \
$(PTHREAD_LIBS) \
$(DLOPEN_LIBS) \
-   $(EXPAT_LIBS) \
$(ZLIB_LIBS) \
-lm
  
@@ -56,7 +55,6 @@ tools_aubinator_error_decode_LDADD = \

compiler/libintel_compiler.la \
$(top_builddir)/src/util/libmesautil.la \
$(PTHREAD_LIBS) \
-   $(EXPAT_LIBS) \
$(ZLIB_LIBS)
  
  tools_aubinator_error_decode_CFLAGS = \




Re: [Mesa-dev] [RFC] radv: copy indirect lowering settings from radeonsi

2017-10-17 Thread Timothy Arceri




On 17/10/17 17:49, Bas Nieuwenhuizen wrote:

On Tue, Oct 17, 2017 at 7:41 AM, Timothy Arceri  wrote:

It looks like the original indirect mask was probably copied from
ANV.

Here we drop lowering locals altogether and allow indirects
on inputs where supported.

Sascha Willems demo results:

tessellation ~4000 -> ~4200 fps
---

  Radeonsi also does a couple of other things.

  1. It sets the following LLVM flag:

sscreen->llvm_has_working_vgpr_indexing ? "" : ",-promote-alloca"

  2. Lowers indirect outputs on certain hardware:

   return sscreen->llvm_has_working_vgpr_indexing ||
/* TCS stores outputs directly to memory. */
shader == PIPE_SHADER_TESS_CTRL;


  I'm not sure if we should be doing these things also. Comments?


Yeah, we should be doing those also (the latter is when *NOT* to lower
indirect outputs though?).


Yeah sorry the result of the above code is inverted later on, I realized 
after sending it might not have been the best example.



Everything besides TCS uses our custom
vector buffering code, so is susceptible to LLVM indirect addressing
bugs. We might be reducing this for LS/ES by directly writing to
LDS/memory instead of to internal vars first, but that is not the way
it is done currently.

In the meantime, this patch is

Reviewed-by: Bas Nieuwenhuizen 


Thanks!





  src/amd/vulkan/radv_shader.c | 20 ++--
  1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 055787a705..819d33b1ad 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -238,23 +238,39 @@ radv_shader_compile_to_nir(struct radv_device *device,
 NIR_PASS_V(nir, nir_lower_constant_initializers, ~0);
 NIR_PASS_V(nir, nir_lower_system_values);
 NIR_PASS_V(nir, nir_lower_clip_cull_distance_arrays);
 }

 /* Vulkan uses the separate-shader linking model */
 nir->info.separate_shader = true;

 nir_shader_gather_info(nir, entry_point->impl);

+   /* While it would be nice not to have this flag, we are constrained
+* by the reality that LLVM 5.0 doesn't have working VGPR indexing
+* on GFX9.
+*/
+   bool llvm_has_working_vgpr_indexing =
+   device->physical_device->rad_info.chip_class <= VI;
+
+   /* TODO: Indirect indexing of GS inputs is unimplemented.
+*
+* TCS and TES load inputs directly from LDS or offchip memory, so
+* indirect indexing is trivial.
+*/
 nir_variable_mode indirect_mask = 0;
-   indirect_mask |= nir_var_shader_in;
-   indirect_mask |= nir_var_local;
+   if (nir->stage == MESA_SHADER_GEOMETRY ||
+   (nir->stage != MESA_SHADER_TESS_CTRL &&
+nir->stage != MESA_SHADER_TESS_EVAL &&
+!llvm_has_working_vgpr_indexing)) {
+   indirect_mask |= nir_var_shader_in;
+   }

 nir_lower_indirect_derefs(nir, indirect_mask);

 static const nir_lower_tex_options tex_options = {
   .lower_txp = ~0,
 };

 nir_lower_tex(nir, &tex_options);

 nir_lower_vars_to_ssa(nir);
--
2.13.6
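
For readers following the thread, the condition that decides whether indirect input derefs get lowered in the patch above can be restated as a small standalone predicate. This is only a sketch: the enum and function names are hypothetical stand-ins for the Mesa shader stage enum and for the inline `if` in radv_shader_compile_to_nir().

```c
#include <stdbool.h>

/* Stand-in for the Mesa shader stage enum. */
enum sketch_stage {
   SKETCH_VERTEX,
   SKETCH_TESS_CTRL,
   SKETCH_TESS_EVAL,
   SKETCH_GEOMETRY,
   SKETCH_FRAGMENT,
   SKETCH_COMPUTE,
};

/* Returns true when indirect indexing of shader inputs should be lowered. */
static bool
sketch_lower_indirect_inputs(enum sketch_stage stage,
                             bool has_working_vgpr_indexing)
{
   /* Indirect indexing of GS inputs is unimplemented: always lower. */
   if (stage == SKETCH_GEOMETRY)
      return true;

   /* TCS and TES load inputs straight from LDS or offchip memory. */
   if (stage == SKETCH_TESS_CTRL || stage == SKETCH_TESS_EVAL)
      return false;

   /* Everything else depends on whether VGPR indexing works in LLVM. */
   return !has_working_vgpr_indexing;
}
```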


Re: [Mesa-dev] [PATCH 09/10] i915: implement DRIImage::createImageFromRenderbuffer2

2017-10-17 Thread Eric Engestrom

On Monday, 2017-10-16 16:04:11 +, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Signed-off-by: Emil Velikov 

(same code, same comments as the previous/i965 patch apply here)

> ---
>  src/mesa/drivers/dri/i915/intel_screen.c | 33 
> +---
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
> b/src/mesa/drivers/dri/i915/intel_screen.c
> index ba49e90fef5..d75e59ab98a 100644
> --- a/src/mesa/drivers/dri/i915/intel_screen.c
> +++ b/src/mesa/drivers/dri/i915/intel_screen.c
> @@ -334,8 +334,9 @@ intel_create_image_from_name(__DRIscreen *screen,
>  }
>  
>  static __DRIimage *
> -intel_create_image_from_renderbuffer(__DRIcontext *context,
> -  int renderbuffer, void *loaderPrivate)
> +intel_create_image_from_renderbuffer2(__DRIcontext *context,
> +   int renderbuffer, void *loaderPrivate,
> +   unsigned *error)
>  {
> __DRIimage *image;
> struct intel_context *intel = context->driverPrivate;
> @@ -344,15 +345,16 @@ intel_create_image_from_renderbuffer(__DRIcontext 
> *context,
>  
> rb = _mesa_lookup_renderbuffer(&intel->ctx, renderbuffer);
> if (!rb) {
> -  _mesa_error(&intel->ctx,
> -   GL_INVALID_OPERATION, "glRenderbufferExternalMESA");
> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
>return NULL;
> }
>  
> irb = intel_renderbuffer(rb);
> image = calloc(1, sizeof *image);
> -   if (image == NULL)
> +   if (image == NULL) {
> +  *error = __DRI_IMAGE_ERROR_BAD_ALLOC;
>return NULL;
> +   }
>  
> image->internal_format = rb->InternalFormat;
> image->format = rb->Format;
> @@ -361,11 +363,27 @@ intel_create_image_from_renderbuffer(__DRIcontext 
> *context,
> intel_region_reference(&image->region, irb->mt->region);
> intel_setup_image_from_dimensions(image);
> image->dri_format = driGLFormatToImageFormat(image->format);
> +   if (image->dri_format == __DRI_IMAGE_FORMAT_NONE) {
> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
> +  intel_region_release(&image->region);
> +  free(image);
> +  return NULL;
> +   }
>  
> rb->NeedsFinishRenderTexture = true;
> +   *error = __DRI_IMAGE_ERROR_SUCCESS;
> return image;
>  }
>  
> +static __DRIimage *
> +intel_create_image_from_renderbuffer(__DRIcontext *context,
> +  int renderbuffer, void *loaderPrivate)
> +{
> +   unsigned error;
> +   return intel_create_image_from_renderbuffer2(context, renderbuffer,
> +loaderPrivate, &error);
> +}
> +
>  static __DRIimage *
>  intel_create_image_from_texture(__DRIcontext *context, int target,
>  unsigned texture, int zoffset,
> @@ -694,7 +712,7 @@ intel_from_planar(__DRIimage *parent, int plane, void 
> *loaderPrivate)
>  }
>  
>  static const __DRIimageExtension intelImageExtension = {
> -.base = { __DRI_IMAGE, 7 },
> +.base = { __DRI_IMAGE, 17 },
>  
>  .createImageFromName= intel_create_image_from_name,
>  .createImageFromRenderbuffer= 
> intel_create_image_from_renderbuffer,
> @@ -706,7 +724,8 @@ static const __DRIimageExtension intelImageExtension = {
>  .createImageFromNames   = intel_create_image_from_names,
>  .fromPlanar = intel_from_planar,
>  .createImageFromTexture = intel_create_image_from_texture,
> -.createImageFromFds = intel_create_image_from_fds
> +.createImageFromFds = intel_create_image_from_fds,
> +.createImageFromRenderbuffer2   = 
> intel_create_image_from_renderbuffer2,
>  };
>  
>  static int
> -- 
> 2.14.1
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] auxiliary: use vl_drm_screen_create method for surfaceless

2017-10-17 Thread Emil Velikov

Hi Suresh,

Please try to avoid HTML emails. If that's not possible, make sure
your reply is more readable.
At the moment it has 3 different fonts and 3 different colours, which
makes it distracting and hard to read.

Please tweak the quoting format - older replies should be a level
deeper. See [1] [2]

On 17 October 2017 at 07:36, Guttula, Suresh  wrote:
> On 11 October 2017 at 07:13, Guttula, Suresh  wrote:
>> HI,
>>
>>>- why do we need "surfaceless" support
>>ChromeOS supports surfaceless and we need this va enablement for
>> surfaceless in chromium.
> Ack, that should have been part of the commit message.
>>>>I will update the commit message.
>
>>> - does upstream VAAPI has surfaceless platform
>> Yes. It uses headless support of VA-API for decoding.
> There's no VA_DISPLAY_SURFACELESS in libva [1]. Thus adding one here is
> _very_ confusing and misleading.
>   >>>Sorry I understood wrongly the question, I thought you are asking about
> mesa-vaapi. In libva it is using drm path only. If I understood correctly ,
> no need of any macro  VA_DISPLAY_SURFACELESS in libva as there is no problem
> to use drm path for egl platform surfaceless. The problem exists in mesa
> side as the check is added to enable va based on platform.
> https://github.com/01org/libva/blob/master/va/va_backend.h#L39
>   >>>libva uses “VA_DISPLAY_DRM_RENDERNODES”  in this case. In libva
> ,Chromium (Ozone) for egl surfaceless platform goes for drm display .
> https://cs.chromium.org/chromium/src/media/gpu/vaapi_wrapper.cc?rcl=e1a85cf02acf0b4ccaad6e37afcf41d1fd26ce24&l=1188
>
>>>  - why is the surfaceless implementation identical to the DRM one
>> If I understand your question correctly, In case of surfaceless
>> platform ,it uses headless support of VAAPI, which will use drm
>> implementation. If I miss something here please provide some more details on
>> the question.
>>
Precisely - in hindsight, one might have called the libva display
VA_DISPLAY_SURFACELESS.
Regardless, it's just a name, so it's not a big deal what it's called, as
long as things are consistent.

You're adding surfaceless for Mesa VAAPI (backend) that interacts with
libva VA_DISPLAY_DRM* (frontend). That's the confusing/misleading part
I'm talking about.

> To put it otherwise:
>
> You're "adding" support for surfaceless for the sake of adding a name.
> There's no functional difference nor upstream (see the libva question
> above) demand for it.
>>>>The reason for adding "surfaceless"  in mesa is the condition checks
> for platform "drm/wayland/x11" to enable va.
> But in case of chromium ,we build mesa with_egl_platforms=surfaceless and
> not mesa_gbm because chromium uses minigbm .So echo $platform is
> surfaceless,
> even it is using drm path, condition check fail because of platform type
> picked as surfaceless and va is not enabled.
>
Or in other words:
 - CrOS uses its own GBM,
 - using --with-platforms=gbm requires Mesa's GBM

Thus the solution here really is:
 - decouple the link-time dependency to a (once-off) runtime one
 - and(?) demote the configure error to a warning ;-)
Right?

While we're there we could/should:
 - drop the (first_pointer == gbm_create_device) hack
   Replace with dladdr(first_pointer, &info) + strcmp(info.dli_sname,
   "gbm_create_device") combo
 - make egl_dri2.c free of calls into gbm - only gbm_device_destroy remains
   Move the remaining gbm_device_destroy to platform_drm.c

Bonus points:
 - Add ABI and/or version check for Mesa GBM <> EGL interop.

Does that make sense? It seems like a more robust solution, IMHO.

Thanks
Emil

[1] https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
[2] https://www.extendoffice.com/documents/outlook/4006-outlook-reply-quote.html#a1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 10/10] radeon: implement DRIImage::createImageFromRenderbuffer2

2017-10-17 Thread Eric Engestrom

On Monday, 2017-10-16 16:04:12 +, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/drivers/dri/radeon/radeon_screen.c | 37 
> +++--
>  1 file changed, 25 insertions(+), 12 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c 
> b/src/mesa/drivers/dri/radeon/radeon_screen.c
> index 51af452e245..02f0c1a6147 100644
> --- a/src/mesa/drivers/dri/radeon/radeon_screen.c
> +++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
> @@ -241,8 +241,9 @@ radeon_create_image_from_name(__DRIscreen *screen,
>  }
>  
>  static __DRIimage *
> -radeon_create_image_from_renderbuffer(__DRIcontext *context,
> -  int renderbuffer, void *loaderPrivate)
> +radeon_create_image_from_renderbuffer2(__DRIcontext *context,
> +   int renderbuffer, void *loaderPrivate,
> +   unsigned *error)
>  {
> __DRIimage *image;
> radeonContextPtr radeon = context->driverPrivate;
> @@ -251,15 +252,16 @@ radeon_create_image_from_renderbuffer(__DRIcontext 
> *context,
>  
> rb = _mesa_lookup_renderbuffer(&radeon->glCtx, renderbuffer);
> if (!rb) {
> -  _mesa_error(&radeon->glCtx,
> -  GL_INVALID_OPERATION, "glRenderbufferExternalMESA");
> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
>return NULL;
> }
>  
> rrb = radeon_renderbuffer(rb);
> image = calloc(1, sizeof *image);
> -   if (image == NULL)
> +   if (image == NULL) {
> +  *error = __DRI_IMAGE_ERROR_BAD_ALLOC;
>return NULL;
> +   }
>  
> image->internal_format = rb->InternalFormat;
> image->format = rb->Format;
> @@ -273,9 +275,19 @@ radeon_create_image_from_renderbuffer(__DRIcontext 
> *context,
> image->height = rb->Height;
> image->pitch = rrb->pitch / image->cpp;
>  
> +   *error = __DRI_IMAGE_ERROR_SUCCESS;
> return image;
>  }
>  
> +static __DRIimage *
> +radeon_create_image_from_renderbuffer(__DRIcontext *context,
> +  int renderbuffer, void *loaderPrivate)
> +{
> +   unsigned error;
> +   return radeon_create_image_from_renderbuffer2(context, renderbuffer,
> + loaderPrivate, &error);

Same as the previous two patches, I think you should keep _mesa_error()
here.

> +}
> +
>  static void
>  radeon_destroy_image(__DRIimage *image)
>  {
> @@ -359,13 +371,14 @@ radeon_query_image(__DRIimage *image, int attrib, int 
> *value)
>  }
>  
>  static const __DRIimageExtension radeonImageExtension = {
> -   .base = { __DRI_IMAGE, 1 },
> -
> -   .createImageFromName = radeon_create_image_from_name,
> -   .createImageFromRenderbuffer = radeon_create_image_from_renderbuffer,
> -   .destroyImage= radeon_destroy_image,
> -   .createImage = radeon_create_image,
> -   .queryImage  = radeon_query_image
> +   .base = { __DRI_IMAGE, 17 },
> +
> +   .createImageFromName  = radeon_create_image_from_name,
> +   .createImageFromRenderbuffer  = radeon_create_image_from_renderbuffer,
> +   .destroyImage = radeon_destroy_image,
> +   .createImage  = radeon_create_image,
> +   .queryImage   = radeon_query_image,
> +   .createImageFromRenderbuffer2 = radeon_create_image_from_renderbuffer2,
>  };
>  
>  static int radeon_set_screen_flags(radeonScreenPtr screen, int device_id)
> -- 
> 2.14.1
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] [AMD] dri3: Add adaptive_sync_enable driconf option

2017-10-17 Thread Michel Dänzer

On 17/10/17 12:29 PM, Nicolai Hähnle wrote:
> On 17.10.2017 12:07, Michel Dänzer wrote:
>> On 17/10/17 11:33 AM, Nicolai Hähnle wrote:
>>> From: Nicolai Hähnle 
>>>
>>> When enabled, this will request FreeSync via the hybrid amdgpu DDX's
>>> AMDGPU X11 protocol extension.
>>>
>>> Due to limitations in the DDX this will only work for applications
>>> that cover the entire X screen (which is important to keep in mind when
>>> you have a multi-monitor setup).
>>
>> This limitation already applies to page flipping in general, it's not
>> specific to FreeSync.
> 
> Maybe it shouldn't apply in general? After all, what if you have one
> monitor with 60Hz and one with 90Hz?

Page flipping can be used in that case. The application's buffer swaps
will be synchronized to one or the other CRTC, by default the one on
which the largest part of the window is visible. On the other CRTC(s),
each flip may take effect slightly earlier or later than on the
synchronization CRTC.


>>> @@ -767,20 +820,25 @@ loader_dri3_swap_buffers_msc(struct
>>> loader_dri3_drawable *draw,
>>>    bool force_copy)
>>>   {
>>>  struct loader_dri3_buffer *back;
>>>  int64_t ret = 0;
>>>  uint32_t options = XCB_PRESENT_OPTION_NONE;
>>>    draw->vtable->flush_drawable(draw, flush_flags);
>>>    back = dri3_find_back_alloc(draw);
>>>   +   if (draw->adaptive_sync) {
>>> +  if (!loader_dri3_amdgpu_freesync_enable(draw->conn,
>>> draw->drawable))
>>> + draw->adaptive_sync = false;
>>> +   }
>>
>> The AMDGPUFreesyncCapability request only has to be submitted once per
>> window, not for every buffer swap.
> 
> Unfortunately, that's not true due to the implementation in the DDX.
> 
> For one,  we *definitely* have to re-submit the request at least each
> time the window is moved or resized, because the request doesn't
> register unless the window is already fullscreen.
> 
> For another, some applications create multiple windows, but the DDX can
> only keep track of one.
> 
> There's perhaps an argument to be made that both of these issues would
> better be fixed in the DDX. However, since this is a non-upstream
> interim solution anyway, and the final upstream solution will likely
> also do something in swap_buffers, I think this approach is reasonable.
> 
> Correct me if I got some of the xcb code wrong, but from what I
> understand there should be no stalls associated with it, just a bunch of
> additional bytes sent over the socket on each swap buffers, which
> doesn't seem too bad to me.
> 
> (For what it's worth, the closed GL driver sends this request on
> glXMakeCurrent, which makes even less sense to me :-))

I agree with your analysis on all counts. Really, this should have
simply used a per-window property (part of the original X11 protocol)
instead of creating an extension...


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH V3] automake: intel: move expat handling where it's used

2017-10-17 Thread Emil Velikov

On 17 October 2017 at 02:10, Hongxu Jia  wrote:
> Linking libvulkan_intel.so can fail, due to unresolved references to
> libexpat.so.
>
> EXPAT_CFLAGS should be moved as well.
>
The EXPAT_CFLAGS change seems to be missing. Did you forget to git add?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH mesa] meson: add missing radv_extensions.c generation for libvulkan_radeon

2017-10-17 Thread Eric Engestrom

Signed-off-by: Eric Engestrom 
---
 src/amd/vulkan/meson.build | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
index a5a4f81352807beac92d..6a416d988674504281c6 100644
--- a/src/amd/vulkan/meson.build
+++ b/src/amd/vulkan/meson.build
@@ -26,6 +26,14 @@ radv_entrypoints = custom_target(
  '--outdir', meson.current_build_dir()],
 )
 
+radv_extensions = custom_target(
+  'radv_extensions.c',
+  input : ['radv_extensions.py', vk_api_xml],
+  output : ['radv_extensions.c'],
+  command : [prog_python2, '@INPUT0@', '--xml', '@INPUT1@',
+ '--out', '@OUTPUT@'],
+)
+
 vk_format_table_c = custom_target(
   'vk_format_table.c',
   input : ['vk_format_table.py', 'vk_format_layout.csv'],
@@ -102,7 +110,7 @@ endif
 
 libvulkan_radeon = shared_library(
   'vulkan_radeon',
-  [libradv_files, radv_entrypoints, nir_opcodes_h, vk_format_table_c],
+  [libradv_files, radv_entrypoints, radv_extensions, nir_opcodes_h, vk_format_table_c],
   include_directories : [inc_common, inc_amd, inc_amd_common, inc_compiler,
  inc_vulkan_util, inc_vulkan_wsi],
   link_with : [libamd_common, libamdgpu_addrlib, libvulkan_util,
-- 
Cheers,
  Eric

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Nicolai Hähnle


On 17.10.2017 12:28, Michel Dänzer wrote:
> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
>> Hi all,
>>
>> I just sent out a patch that enables FreeSync in Mesa for the somewhat
>> hacked implementation of FreeSync that exists in our hybrid (amdgpu-pro)
>> stack's DDX and kernel module. [0]
>>
>> While this patch isn't meant for upstream, that's as good a time as any
>> to raise the issue of what a proper upstream solution would look like. It
>> needs to cut across the entire stack, and we should try to align the KMS
>> interface with the X11 protocol and the Wayland protocol.
>>
>> Prior art that I'm aware of includes:
>>
>> 1. The Present protocol extension has a PresentOptionUST bit for
>> requesting a specific present time, but the reality is that the
>> implementation of that is even less than what the protocol docs claim. [1]
>
> FWIW, I do think this could be a good way for clients to signal that
> they want a frame to be displayed ASAP. It would also allow for e.g.
> video players to naturally adapt the refresh rate to the video frame
> rate (the VDPAU presentation API has a target timestamp for this).

Agreed. The point is that a lot of implementation work needs to be done,
and the protocol docs need to be fixed (the doc claims that every
implementation will treat PresentOptionUST reasonably, rounding to the
nearest MSC when PresentCapabilityUST is missing, but that's false as
far as I can tell).

>> 3. Keith Packard's CRTC_{GET,QUEUE}_SEQUENCE is not specific to Adaptive
>> Sync, but seems like something Adaptive Sync-aware applications would
>> want to use. [2]
>
> Not sure I can agree with that. Applications should use higher level
> APIs, not low level ones like these directly. (Also, they're basically
> just KMS variants of DRM_IOCTL_WAIT_VBLANK, not directly related to
> adaptive sync)
>
>> Common sense suggests that there need to be two sides to FreeSync / VESA
>> Adaptive Sync support:
>>
>> 1. Query the display capabilities. This means querying minimum / maximum
>> refresh duration, plus possibly a query for the earliest/latest
>> timing of the *next* refresh.
>>
>> 2. Signal desired present time. This means passing a target timer value
>> instead of a target vblank count, e.g. something like this for the KMS
>> interface:
>>
>>    int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
>>                                uint32_t flags, void *user_data,
>>                                uint64_t target);
>>
>>    + a flag to indicate whether target is the vblank count or the
>> CLOCK_MONOTONIC (?) time in ns.
>
> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
> sync should probably only be supported via the atomic API, presumably
> via output properties.


Time for xf86-video-amdgpu to grow atomic support, then? ;)

How is a target vblank count communicated via the atomic API? Or is that 
simply not part of the design and up to user space?


Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] [AMD] dri3: Add adaptive_sync_enable driconf option

2017-10-17 Thread Nicolai Hähnle


On 17.10.2017 12:41, Michel Dänzer wrote:
> On 17/10/17 12:29 PM, Nicolai Hähnle wrote:
>> On 17.10.2017 12:07, Michel Dänzer wrote:
>>> On 17/10/17 11:33 AM, Nicolai Hähnle wrote:
>>>> From: Nicolai Hähnle
>>>>
>>>> When enabled, this will request FreeSync via the hybrid amdgpu DDX's
>>>> AMDGPU X11 protocol extension.
>>>>
>>>> Due to limitations in the DDX this will only work for applications
>>>> that cover the entire X screen (which is important to keep in mind when
>>>> you have a multi-monitor setup).
>>>
>>> This limitation already applies to page flipping in general, it's not
>>> specific to FreeSync.
>>
>> Maybe it shouldn't apply in general? After all, what if you have one
>> monitor with 60Hz and one with 90Hz?
>
> Page flipping can be used in that case. The application's buffer swaps
> will be synchronized to one or the other CRTC, by default the one on
> which the largest part of the window is visible. On the other CRTC(s),
> each flip may take effect slightly earlier or later than on the
> synchronization CRTC.


Hmm. I seem to have lost the plot and/or we're talking past each other. 
Maybe we need to go back to square one?


To rephrase the point that I was trying to get across: When you have a 
multi-monitor setup, and your application window covers (only) one of 
these monitors, then FreeSync will *not* be enabled.


At least that's how I understand the current DDX implementation, and I 
think it's surprising enough that it should be pointed out explicitly.


Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 23/43] i965/fs: Add byte scattered read message and fs support

2017-10-17 Thread Chema Casanova



On 15/10/17 11:47, Pohjolainen, Topi wrote:
> On Thu, Oct 12, 2017 at 08:38:12PM +0200, Jose Maria Casanova Crespo wrote:
>> ---
>>  src/intel/compiler/brw_eu.h|  7 +
>>  src/intel/compiler/brw_eu_defines.h|  2 ++
>>  src/intel/compiler/brw_eu_emit.c   | 41 
>> ++
>>  src/intel/compiler/brw_fs.cpp  | 10 +++
>>  src/intel/compiler/brw_fs_copy_propagation.cpp |  2 ++
>>  src/intel/compiler/brw_fs_generator.cpp|  5 
>>  src/intel/compiler/brw_fs_surface_builder.cpp  | 12 
>>  src/intel/compiler/brw_fs_surface_builder.h|  5 
>>  src/intel/compiler/brw_shader.cpp  |  6 
>>  9 files changed, 90 insertions(+)
>>
>> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
>> index b44ca0f518..ca1ff21a83 100644
>> --- a/src/intel/compiler/brw_eu.h
>> +++ b/src/intel/compiler/brw_eu.h
>> @@ -476,6 +476,13 @@ brw_typed_surface_write(struct brw_codegen *p,
>>  unsigned num_channels);
>>  
>>  void
>> +brw_byte_scattered_read(struct brw_codegen *p,
>> +struct brw_reg dst,
>> +struct brw_reg payload,
>> +struct brw_reg surface,
>> +unsigned msg_length);
>> +
>> +void
>>  brw_byte_scattered_write(struct brw_codegen *p,
>>   struct brw_reg payload,
>>   struct brw_reg surface,
>> diff --git a/src/intel/compiler/brw_eu_defines.h 
>> b/src/intel/compiler/brw_eu_defines.h
>> index 9aac385ba7..c5dc5fd5fb 100644
>> --- a/src/intel/compiler/brw_eu_defines.h
>> +++ b/src/intel/compiler/brw_eu_defines.h
>> @@ -397,6 +397,8 @@ enum opcode {
>>  * opcode, but instead of taking a single payload blog they expect their
>>  * arguments separately as individual sources, like untyped write/read.
>>  */
>> +   SHADER_OPCODE_BYTE_SCATTERED_READ,
>> +   SHADER_OPCODE_BYTE_SCATTERED_READ_LOGICAL,
>> SHADER_OPCODE_BYTE_SCATTERED_WRITE,
>> SHADER_OPCODE_BYTE_SCATTERED_WRITE_LOGICAL,
>>  
>> diff --git a/src/intel/compiler/brw_eu_emit.c 
>> b/src/intel/compiler/brw_eu_emit.c
>> index 84d85be653..8c83d8b500 100644
>> --- a/src/intel/compiler/brw_eu_emit.c
>> +++ b/src/intel/compiler/brw_eu_emit.c
>> @@ -2929,6 +2929,47 @@ brw_untyped_surface_write(struct brw_codegen *p,
>>p, insn, num_channels);
>>  }
>>  
>> +
>> +
>> +static void
>> +brw_set_dp_byte_scattered_read_message(struct brw_codegen *p,
>> +   struct brw_inst *insn)
>> +{
>> +
>> +   const struct gen_device_info *devinfo = p->devinfo;
>> +   /* Set mask of 32-bit channels to drop. */
>> +   unsigned msg_control = GEN7_BYTE_SCATTERED_DATA_SIZE_WORD << 2;
>> +
>> +   if (brw_inst_access_mode(devinfo, p->current) == BRW_ALIGN_1) {
>> +  if (brw_inst_exec_size(devinfo, p->current) == BRW_EXECUTE_16)
>> + msg_control |= 1; /* SIMD16 mode */
>> +  else
>> + msg_control |= 2; /* SIMD8 mode */
>> +   }
>> +
>> +   brw_inst_set_dp_msg_type(devinfo, insn,
>> +(devinfo->gen >= 8 || devinfo->is_haswell ?
>> + HSW_DATAPORT_DC_PORT0_BYTE_SCATTERED_READ :
>> + GEN7_DATAPORT_DC_BYTE_SCATTERED_READ));
>> +   brw_inst_set_dp_msg_control(devinfo, insn, msg_control);
>> +}
>> +
>> +void
>> +brw_byte_scattered_read(struct brw_codegen *p,
>> +struct brw_reg dst,
>> +struct brw_reg payload,
>> +struct brw_reg surface,
>> +unsigned msg_length)
>> +{
>> +   const unsigned sfid =  GEN7_SFID_DATAPORT_DATA_CACHE;
>> +   struct brw_inst *insn = brw_send_indirect_scattered_message(
>> +  p, sfid, dst, payload, surface, msg_length,
>> +  brw_surface_payload_size(p, 1, true, true),
>> +  false);
>> +
>> +   brw_set_dp_byte_scattered_read_message(p, insn);
>> +}
>> +
>>  static void
>>  brw_set_dp_byte_scattered_write(struct brw_codegen *p,
>>  struct brw_inst *insn)
>> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
>> index e4a94ff053..bd0d32b741 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -251,6 +251,7 @@ fs_inst::is_send_from_grf() const
>> case SHADER_OPCODE_UNTYPED_SURFACE_READ:
>> case SHADER_OPCODE_UNTYPED_SURFACE_WRITE:
>> case SHADER_OPCODE_BYTE_SCATTERED_WRITE:
>> +   case SHADER_OPCODE_BYTE_SCATTERED_READ:
>> case SHADER_OPCODE_TYPED_ATOMIC:
>> case SHADER_OPCODE_TYPED_SURFACE_READ:
>> case SHADER_OPCODE_TYPED_SURFACE_WRITE:
>> @@ -733,6 +734,7 @@ fs_inst::components_read(unsigned i) const
>>  
>> case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
>> case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
>> +   case SHADER_OPCODE_BYTE_SCATTERED_READ_LOGICAL:
>>assert(src[3].file == IM

Re: [Mesa-dev] [PATCH 2/2] anv/apply_pipeline_layout: Use nir_tex_instr_remove_src

2017-10-17 Thread Lionel Landwerlin


Reviewed-by: Lionel Landwerlin 

On 17/10/17 02:09, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 17 +
>  1 file changed, 5 insertions(+), 12 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> index 1c86513..3ca2b04 100644
> --- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> +++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> @@ -192,10 +192,10 @@ has_tex_src_plane(nir_tex_instr *tex)
>  static uint32_t
>  extract_tex_src_plane(nir_tex_instr *tex)
>  {
> -   nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src, tex->num_srcs - 1);
>     unsigned plane = 0;
>
> -   for (unsigned i = 0, w = 0; i < tex->num_srcs; i++) {
> +   int plane_src_idx = -1;
> +   for (unsigned i = 0; i < tex->num_srcs; i++) {
>        if (tex->src[i].src_type == nir_tex_src_plane) {
>           nir_const_value *const_plane =
>              nir_src_as_const_value(tex->src[i].src);
> @@ -204,19 +204,12 @@ extract_tex_src_plane(nir_tex_instr *tex)
>            * constants. */
>           assert(const_plane);
>           plane = const_plane->u32[0];
> -
> -         /* Remove the source from the instruction */
> -         nir_instr_rewrite_src(&tex->instr, &tex->src[i].src, NIR_SRC_INIT);
> -      } else {
> -         new_srcs[w].src_type = tex->src[i].src_type;
> -         nir_instr_move_src(&tex->instr, &new_srcs[w].src, &tex->src[i].src);
> -         w++;
> +         plane_src_idx = i;
>        }
>     }
>
> -   ralloc_free(tex->src);
> -   tex->src = new_srcs;
> -   tex->num_srcs--;
> +   assert(plane_src_idx >= 0);
> +   nir_tex_instr_remove_src(tex, plane_src_idx);
>
>     return plane;
>  }



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
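
Editorial sketch, not part of the thread: the patch above swaps a
hand-rolled "copy every source except one into a new array" loop for a
single remove-by-index call. The shape of that operation on a plain array
is shown below. This is only an illustration of the technique;
sketch_remove_at is a hypothetical name and is not the NIR
implementation, which works on nir_tex_src entries and presumably also
has to keep NIR's use tracking consistent.

```c
#include <assert.h>
#include <string.h>

/* Remove the element at idx from an array holding *count ints by
 * shifting the tail down one slot, then shrink the count. No
 * reallocation is needed because the array only gets shorter. */
void
sketch_remove_at(int *items, unsigned *count, unsigned idx)
{
   assert(idx < *count);

   /* memmove handles the overlapping source and destination ranges. */
   memmove(&items[idx], &items[idx + 1],
           (*count - idx - 1) * sizeof(items[0]));
   (*count)--;
}
```

As in the patch, a caller would first record the index while scanning the
array and call the helper once after the loop, so the array is never
modified while it is being iterated.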

Re: [Mesa-dev] [PATCH 04/10] gbm: handle queryImage() failure for GBM_BO_IMPORT_EGL_IMAGE

2017-10-17 Thread Emil Velikov

On 17 October 2017 at 11:09, Eric Engestrom  wrote:
> On Monday, 2017-10-16 16:04:06 +, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> The function can fail. Check and teardown accordingly.
>>
>> Fixes: a43d286ef7f ("gbm: Add import from fd")
>> Cc: Kristian Høgsberg 
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/gbm/backends/dri/gbm_dri.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>> index 4a51bd39903..9c9066e6661 100644
>> --- a/src/gbm/backends/dri/gbm_dri.c
>> +++ b/src/gbm/backends/dri/gbm_dri.c
>> @@ -891,6 +891,7 @@ gbm_dri_bo_import(struct gbm_device *gbm,
>> __DRIimage *image;
>> unsigned dri_use = 0;
>> int gbm_format;
>> +   unsigned query; /* EGLBoolean, but we cannot include the header */
>
I stand corrected - s/EGLBoolean/GLboolean/

> `bool`?   is already included.
>
Having a skim through dri_interface.h, we could drop the GL types in
favour of the standard ones altogether.
Any objection if I do that as a follow-up?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 103312] meson/macOS: Dependency libdrm_intel found: NO

2017-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103312

            Bug ID: 103312
           Summary: meson/macOS: Dependency libdrm_intel found: NO
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Mac OS X (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: yuriko...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Seems like meson on macOS should not check for libdrm_intel dependency.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 103312] meson/macOS: Dependency libdrm_intel found: NO

2017-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103312

Yurii Kolesnykov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gist.github.com/yurikoles/cb1905ab50e3076713e5857ea5f8bb1d

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 05/10] gbm: handle queryImage() failure through rest of gbm_dri_bo_import()

2017-10-17 Thread Emil Velikov

On 17 October 2017 at 11:12, Eric Engestrom  wrote:
> On Monday, 2017-10-16 16:04:07 +, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> Fixes: 6a7dea93fa7 ("dri: Rework planar image interface")
>> Cc: Jakob Bornecrantz 
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/gbm/backends/dri/gbm_dri.c | 23 +++
>>  1 file changed, 15 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>> index 9c9066e6661..a6c80cf1ec7 100644
>> --- a/src/gbm/backends/dri/gbm_dri.c
>> +++ b/src/gbm/backends/dri/gbm_dri.c
>> @@ -1048,14 +1048,21 @@ gbm_dri_bo_import(struct gbm_device *gbm,
>> bo->base.gbm = gbm;
>> bo->base.format = gbm_format;
>>
>> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_WIDTH,
>> -  (int*)&bo->base.width);
>> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HEIGHT,
>> -  (int*)&bo->base.height);
>> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_STRIDE,
>> -  (int*)&bo->base.stride);
>> -   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HANDLE,
>> -  &bo->base.handle.s32);
>> +   query = dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_WIDTH,
>> +  (int*)&bo->base.width);
>> +   query &= dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_HEIGHT,
>
> With bitmasks, you really need to have 0/1 here. Someone could return `2`,
> which would count as `true` as bool, but `1 & 2 == 0`, which is not what
> we want here.
> Please add `!!` in front of the queryImage() calls.
>
I've assumed that won't be the case since we already have a few &= instances.
Regardless, the suggestion makes sense.

My earlier comment (use standard types) won't fly for
s/GLboolean/bool/ since it's an ABI break.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
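
Editorial sketch, not part of the thread: the review comment about `&=`
is easy to check in isolation. When success is reported as "any non-zero
value", AND-ing raw results can turn two successes into a failure
(1 & 2 == 0), while normalizing each result with `!!` first cannot. The
function names below are hypothetical stand-ins, not the queryImage()
interface.

```c
#include <stdbool.h>

/* Hypothetical query that signals success with an arbitrary non-zero
 * value, the way a GLboolean-style return might. */
unsigned
sketch_query(unsigned value)
{
   return value;
}

/* Accumulating raw results: bits that do not overlap cancel out. */
bool
sketch_all_ok_raw(unsigned a, unsigned b)
{
   unsigned ok = sketch_query(a);
   ok &= sketch_query(b);
   return ok != 0;
}

/* Accumulating normalized results: !! collapses each value to 0 or 1
 * before the AND, so the result is true only if every query succeeded. */
bool
sketch_all_ok_normalized(unsigned a, unsigned b)
{
   unsigned ok = !!sketch_query(a);
   ok &= !!sketch_query(b);
   return ok != 0;
}
```

With return values 1 and 2, the raw variant reports failure and the
normalized variant reports success; both agree whenever every return
value is exactly 0 or 1.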

Re: [Mesa-dev] [PATCH 08/10] i965: implement DRIImage::createImageFromRenderbuffer2

2017-10-17 Thread Emil Velikov

On 17 October 2017 at 11:29, Eric Engestrom  wrote:
> On Monday, 2017-10-16 16:04:10 +, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> The new entry point has a way to feedback the error. Thus we no longer
>> need to call _mesa_error() but instead we can pass the correct value.
>>
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/mesa/drivers/dri/i965/intel_screen.c | 31 
>> ++-
>>  1 file changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
>> b/src/mesa/drivers/dri/i965/intel_screen.c
>> index ea04a72e860..7cbb5e3b060 100644
>> --- a/src/mesa/drivers/dri/i965/intel_screen.c
>> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
>> @@ -474,8 +474,9 @@ intel_create_image_from_name(__DRIscreen *dri_screen,
>>  }
>>
>>  static __DRIimage *
>> -intel_create_image_from_renderbuffer(__DRIcontext *context,
>> -  int renderbuffer, void *loaderPrivate)
>> +intel_create_image_from_renderbuffer2(__DRIcontext *context,
>> +   int renderbuffer, void *loaderPrivate,
>> +   unsigned *error)
>>  {
>> __DRIimage *image;
>> struct brw_context *brw = context->driverPrivate;
>> @@ -485,15 +486,17 @@ intel_create_image_from_renderbuffer(__DRIcontext 
>> *context,
>>
>> rb = _mesa_lookup_renderbuffer(ctx, renderbuffer);
>> if (!rb) {
>> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glRenderbufferExternalMESA");
>> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
>>return NULL;
>> }
>
> [there]
>
>>
>> irb = intel_renderbuffer(rb);
>> intel_miptree_make_shareable(brw, irb->mt);
>> image = calloc(1, sizeof *image);
>> -   if (image == NULL)
>> +   if (image == NULL) {
>> +  *error = __DRI_IMAGE_ERROR_BAD_ALLOC;
>>return NULL;
>> +   }
>>
>> image->internal_format = rb->InternalFormat;
>> image->format = rb->Format;
>> @@ -508,12 +511,29 @@ intel_create_image_from_renderbuffer(__DRIcontext 
>> *context,
>> image->height = rb->Height;
>> image->pitch = irb->mt->surf.row_pitch;
>> image->dri_format = driGLFormatToImageFormat(image->format);
>> +   if (image->dri_format == __DRI_IMAGE_FORMAT_NONE) {
>
> __DRI_IMAGE_FORMAT_NONE can only come from MESA_FORMAT_NONE; did you
> mean `== 0`? Or both?
>
Right, I've assumed that __DRI_IMAGE_FORMAT_NONE is the default case,
which doesn't seem to be the case ...
The whole __DRI_IMAGE_FORMAT handling seems to need some work, but the
first step would be to fix up driGLFormatToImageFormat()
Patch coming in v2.

>> +  *error = __DRI_IMAGE_ERROR_BAD_PARAMETER;
>> +  brw_bo_unreference(irb->mt->bo);
>> +  free(image);
>> +  return NULL;
>> +   }
>
> You can move this check before the calloc and ref ([there] above), so
> that the error path becomes just setting `error` and returning NULL.
>
> You'll want a temp var to avoid calling driGLFormatToImageFormat()
> twice:
> uint32_t dri_format = driGLFormatToImageFormat(rb->Format);
>
Thought about it, then decided against it... not sure why. Will fix.
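
For reference, a minimal sketch of that reordering. The types and names here (`mock_image`, `to_image_format`, `IMAGE_ERROR_*`, `IMAGE_FORMAT_UNSUPPORTED`) are hypothetical stand-ins, not the real i965/DRI ones, and the "unsupported" value is simply whatever the conversion helper returns for a format it cannot map:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-ins for the DRI image error codes.
enum image_error {
   IMAGE_ERROR_SUCCESS,
   IMAGE_ERROR_BAD_ALLOC,
   IMAGE_ERROR_BAD_PARAMETER
};

// Assumed sentinel returned by the conversion helper for unmappable formats.
const uint32_t IMAGE_FORMAT_UNSUPPORTED = 0;

struct mock_image {
   uint32_t dri_format;
};

// Stand-in for the format conversion helper: only format 1 is mappable here.
static uint32_t to_image_format(uint32_t mesa_format)
{
   return mesa_format == 1 ? 0x1002u : IMAGE_FORMAT_UNSUPPORTED;
}

static mock_image *create_image(uint32_t mesa_format, unsigned *error)
{
   assert(error != nullptr);

   // Convert once, into a temporary, before any resource is acquired...
   const uint32_t dri_format = to_image_format(mesa_format);
   if (dri_format == IMAGE_FORMAT_UNSUPPORTED) {
      // ...so this failure path has nothing to unreference or free.
      *error = IMAGE_ERROR_BAD_PARAMETER;
      return nullptr;
   }

   mock_image *image = static_cast<mock_image *>(calloc(1, sizeof *image));
   if (image == nullptr) {
      *error = IMAGE_ERROR_BAD_ALLOC;
      return nullptr;
   }

   image->dri_format = dri_format;
   *error = IMAGE_ERROR_SUCCESS;
   return image;
}
```

With the check first, every early return only sets `*error` and returns NULL; the caller owns (and frees) the image only on success.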

>> +
>> image->has_depthstencil = irb->mt->stencil_mt? true : false;
>>
>> rb->NeedsFinishRenderTexture = true;
>> +   *error = __DRI_IMAGE_ERROR_SUCCESS;
>> return image;
>>  }
>>
>> +static __DRIimage *
>> +intel_create_image_from_renderbuffer(__DRIcontext *context,
>> +  int renderbuffer, void *loaderPrivate)
>> +{
>> +   unsigned error;
>> +   return intel_create_image_from_renderbuffer2(context, renderbuffer,
>> +loaderPrivate, &error);
>
> if (error == __DRI_IMAGE_ERROR_BAD_PARAMETER)
> _mesa_error(ctx, GL_INVALID_OPERATION, 
> "glRenderbufferExternalMESA");
>
> (maybe even `if (error != __DRI_IMAGE_ERROR_SUCCESS)`?)
>
The use of _mesa_error() is a bug in the original code. It generates a
GL error, whereas an EGL one is expected.
Thus programs thoroughly checking the error states will be misled/lied to.

I'll add an explicit note in the commit message about it.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Daniel Vetter

On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> > Hi all,
> > 
> > I just sent out a patch that enables FreeSync in Mesa for the somewhat
> > hacked implementation of FreeSync that exists in our hybrid (amdgpu-pro)
> > stack's DDX and kernel module. [0]
> > 
> > While this patch isn't meant for upstream, that's as good a time as any
> > to raise the issue of what a proper upstream solution would look like. It
> > needs to cut across the entire stack, and we should try to align the KMS
> > interface with the X11 protocol and the Wayland protocol.
> > 
> > Prior art that I'm aware of includes:
> > 
> > 1. The Present protocol extension has a PresentOptionUST bit for
> > requesting a specific present time, but the reality is that the
> > implementation of that is even less than what the protocol docs claim. [1]
> 
> FWIW, I do think this could be a good way for clients to signal that
> they want a frame to be displayed ASAP. It would also allow for e.g.
> video players to naturally adapt the refresh rate to the video frame
> rate (the VDPAU presentation API has a target timestamp for this).
> 
> 
> > 3. Keith Packard's CRTC_{GET,QUEUE}_SEQUENCE is not specific to Adaptive
> > Sync, but seems like something Adaptive Sync-aware applications would
> > want to use. [2]
> 
> Not sure I can agree with that. Applications should use higher level
> APIs, not low level ones like these directly. (Also, they're basically
> just KMS variants of DRM_IOCTL_WAIT_VBLANK, not directly related to
> adaptive sync)

+1.

We already accidentally exposed the legacy vblank ioctl to clients, and
they abuse it badly (second-guessing the compositor, which works so well).
If you're a client app and want timings, you must query the compositor.
Neither the client nor the kernel know enough to make this happen
directly.

Ofc the compositor will want to query timings through ioctls, but we
provide them all already (flip timestamp and vblank timestamp), Keith's
new ioctl simply provides a bit of an uapi cleanup + nanosecond accuracy
(not that any driver can do that anyway right now).

> > Common sense suggests that there need to be two sides to FreeSync / VESA
> > Adaptive Sync support:
> > 
> > 1. Query the display capabilities. This means querying minimum / maximum
> > refresh duration, plus possibly a query for when the earliest/latest
> > timing of the *next* refresh.
> > 
> > 2. Signal desired present time. This means passing a target timer value
> > instead of a target vblank count, e.g. something like this for the KMS
> > interface:
> > 
> >   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
> >   uint32_t flags, void *user_data,
> >   uint64_t target);
> > 
> >   + a flag to indicate whether target is the vblank count or the
> > CLOCK_MONOTONIC (?) time in ns.
> 
> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
> sync should probably only be supported via the atomic API, presumably
> via output properties.

+1

At least now that DC is on track to land properly, and you want to do this
for DC-only anyway there's no reason to pimp the legacy interfaces
further. And atomic is soo much easier to extend.

The big question imo is where we need to put the flag on the kms side,
since freesync is not just about presenting earlier, but also about
presenting later. But for backwards compat we can't stretch the refresh
rate by default for everyone, or clients that rely on high precision
timestamps and regular refresh will get a bad surprise.

I think a boolean enable_freesync property is probably what we want, which
enables freesync for as long as it's set.

The other side is communicating to userspace which modes are freesync
capable. We might want to extend the mode struct with a min/max vrefresh
rate, or something similar.
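
To make the "earlier and later" point concrete, here is a rough model of refresh timing. It is not kernel code: `refresh_window` and `next_refresh_ns` are made-up names for illustration, and it assumes that the fixed-rate case runs at the panel's maximum refresh rate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of a freesync-capable mode, in nanoseconds.
struct refresh_window {
   uint64_t min_duration_ns; // shortest time between refreshes (max rate)
   uint64_t max_duration_ns; // longest time a frame can be held (min rate)
};

// Start time of the next refresh after the one that began at `last_ns`,
// for a frame that becomes ready at `ready_ns`.
static uint64_t next_refresh_ns(const refresh_window &w, uint64_t last_ns,
                                uint64_t ready_ns, bool adaptive)
{
   assert(w.min_duration_ns <= w.max_duration_ns);

   if (!adaptive) {
      // Fixed rate: the refresh happens on schedule no matter when the
      // frame is ready, so timestamps stay perfectly regular.
      return last_ns + w.min_duration_ns;
   }

   // Adaptive: refresh when the frame is ready, but never sooner than the
   // panel allows and never later than it can hold the old frame.
   const uint64_t earliest = last_ns + w.min_duration_ns;
   const uint64_t latest = last_ns + w.max_duration_ns;
   return std::min(std::max(ready_ns, earliest), latest);
}
```

In this model a 144 Hz / 40 Hz panel with a frame ready 10 ms after the last refresh would refresh at 10 ms when adaptive, instead of at roughly 6.9 ms. That stretched interval is exactly what a client expecting a regular refresh would not anticipate, hence the opt-in property.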

Finally I'm not sure we want to insist on a target time for freesync. At
least as far as I understand things you just want "as soon as possible".
This might change with some of the VK/EGL/GLX extensions where you
specify a precise timing (media playback). But that needs a bit more work
to make it happen I think, so perhaps better to postpone.

Also note that right now no driver except amdgpu has support for a target
vblank on a flip. That's imo another reason for not requiring target
support for at least basic freesync support.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Mesa-dev] [PATCH 15/16] ac: clean up ac_build_indexed_load function interfaces

2017-10-17 Thread Nicolai Hähnle


This patch and patches 1 - 13:

Reviewed-by: Nicolai Hähnle 


On 13.10.2017 14:04, Marek Olšák wrote:

From: Marek Olšák 

---
  src/amd/common/ac_llvm_build.c| 42 ++-
  src/amd/common/ac_llvm_build.h| 14 
  src/amd/common/ac_nir_to_llvm.c   | 22 ++--
  src/gallium/drivers/radeonsi/si_shader.c  | 34 +-
  src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c |  4 +--
  5 files changed, 61 insertions(+), 55 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 1d97b09..949f181 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -710,46 +710,54 @@ ac_build_indexed_store(struct ac_llvm_context *ctx,
   ac_build_gep0(ctx, base_ptr, index));
  }
  
  /**

   * Build an LLVM bytecode indexed load using LLVMBuildGEP + LLVMBuildLoad.
   * It's equivalent to doing a load from &base_ptr[index].
   *
   * \param base_ptr  Where the array starts.
   * \param index The element index into the array.
   * \param uniform   Whether the base_ptr and index can be assumed to be
- *  dynamically uniform
+ *  dynamically uniform (i.e. load to an SGPR)
+ * \param invariant Whether the load is invariant (no other opcodes affect it)
   */
-LLVMValueRef
-ac_build_indexed_load(struct ac_llvm_context *ctx,
- LLVMValueRef base_ptr, LLVMValueRef index,
- bool uniform)
+static LLVMValueRef
+ac_build_load_custom(struct ac_llvm_context *ctx, LLVMValueRef base_ptr,
+LLVMValueRef index, bool uniform, bool invariant)
  {
-   LLVMValueRef pointer;
+   LLVMValueRef pointer, result;
  
  	pointer = ac_build_gep0(ctx, base_ptr, index);

if (uniform)
LLVMSetMetadata(pointer, ctx->uniform_md_kind, ctx->empty_md);
-   return LLVMBuildLoad(ctx->builder, pointer, "");
+   result = LLVMBuildLoad(ctx->builder, pointer, "");
+   if (invariant)
+   LLVMSetMetadata(result, ctx->invariant_load_md_kind, 
ctx->empty_md);
+   return result;
  }
  
-/**

- * Do a load from &base_ptr[index], but also add a flag that it's loading
- * a constant from a dynamically uniform index.
- */
-LLVMValueRef
-ac_build_indexed_load_const(struct ac_llvm_context *ctx,
-   LLVMValueRef base_ptr, LLVMValueRef index)
+LLVMValueRef ac_build_load(struct ac_llvm_context *ctx, LLVMValueRef base_ptr,
+  LLVMValueRef index)
  {
-   LLVMValueRef result = ac_build_indexed_load(ctx, base_ptr, index, true);
-   LLVMSetMetadata(result, ctx->invariant_load_md_kind, ctx->empty_md);
-   return result;
+   return ac_build_load_custom(ctx, base_ptr, index, false, false);
+}
+
+LLVMValueRef ac_build_load_invariant(struct ac_llvm_context *ctx,
+LLVMValueRef base_ptr, LLVMValueRef index)
+{
+   return ac_build_load_custom(ctx, base_ptr, index, false, true);
+}
+
+LLVMValueRef ac_build_load_to_sgpr(struct ac_llvm_context *ctx,
+  LLVMValueRef base_ptr, LLVMValueRef index)
+{
+   return ac_build_load_custom(ctx, base_ptr, index, true, true);
  }
  
  /* TBUFFER_STORE_FORMAT_{X,XY,XYZ,XYZW} <- the suffix is selected by num_channels=1..4.

   * The type of vdata must be one of i32 (num_channels=1), v2i32 
(num_channels=2),
   * or v4i32 (num_channels=3,4).
   */
  void
  ac_build_buffer_store_dword(struct ac_llvm_context *ctx,
LLVMValueRef rsrc,
LLVMValueRef vdata,
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index ac8ea9c..f0b5875 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -143,28 +143,26 @@ ac_build_fs_interp_mov(struct ac_llvm_context *ctx,
  LLVMValueRef
  ac_build_gep0(struct ac_llvm_context *ctx,
  LLVMValueRef base_ptr,
  LLVMValueRef index);
  
  void

  ac_build_indexed_store(struct ac_llvm_context *ctx,
   LLVMValueRef base_ptr, LLVMValueRef index,
   LLVMValueRef value);
  
-LLVMValueRef

-ac_build_indexed_load(struct ac_llvm_context *ctx,
- LLVMValueRef base_ptr, LLVMValueRef index,
- bool uniform);
-
-LLVMValueRef
-ac_build_indexed_load_const(struct ac_llvm_context *ctx,
-   LLVMValueRef base_ptr, LLVMValueRef index);
+LLVMValueRef ac_build_load(struct ac_llvm_context *ctx, LLVMValueRef base_ptr,
+  LLVMValueRef index);
+LLVMValueRef ac_build_load_invariant(struct ac_llvm_context *ctx,
+LLVMValueRef base_ptr, LLVMValueRef index);
+LLVMValueRef ac_build_load_to_sgpr(struct ac_llvm_context *ctx,
+  LLVMValueRef base_ptr, LLVMValueRef index);
  
  void

Re: [Mesa-dev] [PATCH 16/16] radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer

2017-10-17 Thread Nicolai Hähnle


On 13.10.2017 14:04, Marek Olšák wrote:

From: Marek Olšák 

SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0
if there is no other buffer there.

Benefits:
- there is no constbuf descriptor upload and shader load

It's assumed that all constant addresses are within bounds. Non-constant
addresses are clamped against the last declared CONST variable.
This only works if the state tracker ensures the bound constant buffer
matches what the shader needs.

Once we get 32-bit pointers, we can only do this for user constant buffers
where the driver is in charge of the upload so that it can guarantee a 32-bit
address.

The real performance benefit might not be measurable.
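
The decision made at upload time can be sketched as follows. This is a simplified model under stated assumptions, not the driver code: `mock_descriptors` and `pointer_for_shader` are hypothetical, and the per-slot buffer address array stands in for extracting the address from the descriptor itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, trimmed-down descriptor list; field names loosely follow
// the patch below.
struct mock_descriptors {
   unsigned first_active_slot;
   unsigned num_active_slots;
   int slot_index_to_bind_directly; // -1 disables direct binding
   uint64_t slot_buffer_address[4]; // GPU address of each slot's buffer
};

// Returns the address that would be written to the user SGPRs.
// `uploaded_list_address` stands for a freshly uploaded descriptor list.
static uint64_t pointer_for_shader(const mock_descriptors &d,
                                   uint64_t uploaded_list_address,
                                   bool *bound_directly)
{
   const bool single_direct_slot =
      d.num_active_slots == 1 &&
      (int)d.first_active_slot == d.slot_index_to_bind_directly;

   *bound_directly = single_direct_slot;
   if (single_direct_slot) {
      assert(d.first_active_slot < 4);
      // No descriptor upload: point straight at the buffer contents.
      return d.slot_buffer_address[d.first_active_slot];
   }

   // Otherwise the shader receives a pointer to the descriptor list and
   // has to load the descriptor from it first.
   return uploaded_list_address;
}
```

The shader has to be compiled with the same assumption, since it either treats the SGPR pair as a buffer address or as a pointer to descriptors.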

These apps get 100% theoretical benefit in all shaders (except where noted):
- antichamber
- barman arkham origins
- borderlands 2
- borderlands pre-sequel
- brutal legend
- civilization BE
- CS:GO
- deadcore
- dota 2 -- most shaders
- europa universalis
- grid autosport -- most shaders
- left 4 dead 2
- legend of grimrock
- life is strange
- payday 2
- portal
- rocket league
- serious sam 3 bfe
- talos principle
- team fortress 2
- thea
- unigine heaven
- unigine valley -- also sanctuary and tropics
- wasteland 2
- xcom: enemy unknown & enemy within
- tesseract
- unity (engine)

Changed stats only:
 SGPRS: 2059998 -> 2086238 (1.27 %)
 VGPRS: 1626888 -> 1626904 (0.00 %)
 Spilled SGPRs: 7902 -> 7865 (-0.47 %)
 Code Size: 60924520 -> 60982660 (0.10 %) bytes
 Max Waves: 374539 -> 374526 (-0.00 %)
---
  src/gallium/drivers/radeonsi/si_descriptors.c | 23 +++--
  src/gallium/drivers/radeonsi/si_shader.c  | 72 +++
  src/gallium/drivers/radeonsi/si_shader.h  |  2 +-
  src/gallium/drivers/radeonsi/si_state.h   |  3 ++
  4 files changed, 87 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 0c1fca8..da6efa8 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -119,20 +119,21 @@ static void si_init_descriptor_list(uint32_t *desc_list,
  
  static void si_init_descriptors(struct si_descriptors *desc,

unsigned shader_userdata_index,
unsigned element_dw_size,
unsigned num_elements)
  {
desc->list = CALLOC(num_elements, element_dw_size * 4);
desc->element_dw_size = element_dw_size;
desc->num_elements = num_elements;
desc->shader_userdata_offset = shader_userdata_index * 4;
+   desc->slot_index_to_bind_directly = -1;
  }
  
  static void si_release_descriptors(struct si_descriptors *desc)

  {
r600_resource_reference(&desc->buffer, NULL);
FREE(desc->list);
  }
  
  static bool si_upload_descriptors(struct si_context *sctx,

  struct si_descriptors *desc)
@@ -141,20 +142,34 @@ static bool si_upload_descriptors(struct si_context *sctx,
unsigned first_slot_offset = desc->first_active_slot * slot_size;
unsigned upload_size = desc->num_active_slots * slot_size;
  
  	/* Skip the upload if no shader is using the descriptors. dirty_mask

 * will stay dirty and the descriptors will be uploaded when there is
 * a shader using them.
 */
if (!upload_size)
return true;
  
+	/* If there is just one active descriptor, bind it directly. */

+   if ((int)desc->first_active_slot == desc->slot_index_to_bind_directly &&
+   desc->num_active_slots == 1) {
+   uint32_t *descriptor = 
&desc->list[desc->slot_index_to_bind_directly *
+  desc->element_dw_size];
+
+   /* The buffer is already in the buffer list. */
+   r600_resource_reference(&desc->buffer, NULL);
+   desc->gpu_list = NULL;
+   desc->gpu_address = si_desc_extract_buffer_address(descriptor);
+   si_mark_atom_dirty(sctx, &sctx->shader_pointers.atom);
+   return true;
+   }
+
uint32_t *ptr;
int buffer_offset;
u_upload_alloc(sctx->b.b.const_uploader, 0, upload_size,
   si_optimal_tcc_alignment(sctx, upload_size),
   (unsigned*)&buffer_offset,
   (struct pipe_resource**)&desc->buffer,
   (void**)&ptr);
if (!desc->buffer) {
desc->gpu_address = 0;
return false; /* skip the draw call */
@@ -2524,38 +2539,40 @@ void si_init_all_descriptors(struct si_context *sctx)
int i;
  
  	STATIC_ASSERT(GFX9_SGPR_TCS_CONST_AND_SHADER_BUFFERS % 2 == 0);

STATIC_ASSERT(GFX9_SGPR_GS_CONST_AND_SHADER_BUFFERS % 2 == 0);
  
  	for (i = 0; i < SI_NUM_SHADERS; i++) {

bool gfx9_tcs = false;
bool gfx9_gs = false;
unsigned n

[Mesa-dev] [Bug 103312] meson/macOS: Dependency libdrm_intel found: NO

2017-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103312

--- Comment #1 from Eric Engestrom  ---
libdrm_intel is needed to build i915; if you want to disable it you can use
`-Ddri-drivers=i965` (or any other value that doesn't contain `i915`, although
right now those are the only possibilities)

How did you configure autotools when that's what you were using to build mesa?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] Build fail since configure.ac: rework llvm libs handling for 3.9+

2017-10-17 Thread Emil Velikov

On 16 October 2017 at 23:32, Dieter Nützel  wrote:
> Am 16.10.2017 20:13, schrieb Andy Furniss:
>>
>> Emil Velikov wrote:
>>>
>>> On 16 October 2017 at 03:22, Jan Vesely  wrote:

 On Sun, 2017-10-15 at 00:00 +0100, Andy Furniss wrote:
>
> Andy Furniss wrote:
>>
>> Since
>>
>> commit 13a53c4f5cdd664fd155c9e78fb46a4387af006c
>> Author: Emil Velikov 
>> Date:   Thu Oct 5 11:19:05 2017 +0100
>>
>>   configure.ac: rework llvm libs handling for 3.9+
>>
>> I am getting 00s of
>>
>>  /mesa/src/amd/common/. undefined reference to LLVM..
>>
>> Using git llvm have tried with -DLLVM_APPEND_VC_REV=OFF
>>
>> My llvm config =
>>
>> cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=Release
>> -DLLVM_TARGETS_TO_BUILD="host;AMDGPU" -DLLVM_ENABLE_ASSERTIONS=ON
>> -DLLVM_BUILD_LLVM_DYLIB=ON
>
>
> Build is OK with llvm built with -DLLVM_LINK_LLVM_DYLIB=ON


 This is weird. That option should only set whether llvm executables are
 dynamically linked. Is the llvm-config output different when you toggle
 this setting?

>>> Precisely what I was wondering as well.
>>>
>>> Andy can you please share the output of the following commands across
>>> your build combinations:
>>> llvm-config --link-shared --libs bitwriter
>>> llvm-config --link-static --libs bitwriter
>>> llvm-config --link-static --system-libs
>>
>>
>> On a "bad" build (-DLLVM_BUILD_LLVM_DYLIB=ON) I get
>>
>> andy [~]$ llvm-config --link-shared --libs bitwriter
>> llvm-config: error: missing: /usr/lib/libLLVMDemangle.so
>> llvm-config: error: missing: /usr/lib/libLLVMSupport.so
>> llvm-config: error: missing: /usr/lib/libLLVMBinaryFormat.so
>> llvm-config: error: missing: /usr/lib/libLLVMCore.so
>> llvm-config: error: missing: /usr/lib/libLLVMBitReader.so
>> llvm-config: error: missing: /usr/lib/libLLVMMC.so
>> llvm-config: error: missing: /usr/lib/libLLVMMCParser.so
>> llvm-config: error: missing: /usr/lib/libLLVMObject.so
>> llvm-config: error: missing: /usr/lib/libLLVMProfileData.so
>> llvm-config: error: missing: /usr/lib/libLLVMAnalysis.so
>> llvm-config: error: missing: /usr/lib/libLLVMBitWriter.so

These here indicate that something in LLVM broke. Please report it
ASAP so that we don't get an LLVM release with this bug.

>> andy [~]$
>> andy [~]$ llvm-config --link-static --libs bitwriter
>> -lLLVMBitWriter -lLLVMAnalysis -lLLVMProfileData -lLLVMObject
>> -lLLVMMCParser -lLLVMMC -lLLVMBitReader -lLLVMCore -lLLVMBinaryFormat
>> -lLLVMSupport -lLLVMDemangle
>> andy [~]$
>> andy [~]$ llvm-config --link-static --system-libs
>> -lrt -ldl -lcurses -lpthread -lz -lm
>>
>> On a "good" (-DLLVM_LINK_LLVM_DYLIB=ON) I get
>
>
> With both:
> -DLLVM_BUILD_LLVM_DYLIB=ON and -DLLVM_LINK_LLVM_DYLIB=ON
>
Something's a bit strange:



>> andy [~]$ llvm-config --link-shared --libs bitwriter
>> -lLLVM-6.0svn
>> andy [~]$ llvm-config --link-static --libs bitwriter
>> -lLLVMBitWriter -lLLVMAnalysis -lLLVMProfileData -lLLVMObject
>> -lLLVMMCParser -lLLVMMC -lLLVMBitReader -lLLVMCore -lLLVMBinaryFormat
>> -lLLVMSupport -lLLVMDemangle
>> andy [~]$ llvm-config --link-static --system-libs
>> -lrt -ldl -lcurses -lpthread -lz -lm
>
>
> I get mostly the same:
>
> Radeon/Mesa> llvm-config --link-shared --libs bitwriter
> -lLLVM-6.0svn
> Radeon/Mesa> llvm-config --link-static --libs bitwriter
> -lLLVMBitWriter -lLLVMAnalysis -lLLVMProfileData -lLLVMObject -lLLVMMCParser
> -lLLVMMC -lLLVMBitReader -lLLVMCore -lLLVMBinaryFormat -lLLVMSupport
> -lLLVMDemangle
> Radeon/Mesa> llvm-config --link-static --system-libs
> -lrt -ldl -ltinfo -lpthread -lz -lm
>   
>
> Dieter

Re: [Mesa-dev] Build fail since configure.ac: rework llvm libs handling for 3.9+

2017-10-17 Thread Emil Velikov

On 17 October 2017 at 13:40, Emil Velikov  wrote:

>> With both:
>> -DLLVM_BUILD_LLVM_DYLIB=ON and -DLLVM_LINK_LLVM_DYLIB=ON
>>
> Something's a bit strange:
>
... setting -DLLVM_LINK_LLVM_DYLIB=ON should also set
-DLLVM_BUILD_LLVM_DYLIB=ON [1].

If that's not the case please report it to the LLVM devs.

Thanks
Emil

[1] https://llvm.org/docs/CMake.html

[Mesa-dev] [PATCH v2 02/10] mesa/st/tests: unify MockCodeLine* classes

2017-10-17 Thread Gert Wollny

 * Merge the classes MockCodeLine and MockCodelineWithSwizzle into
   one and refactor tests accordingly.
 * Change memory allocations to use ralloc* interface.
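
The merged class relies on an empty tag type to select the swizzle-aware constructor. A minimal sketch of that idiom, with a hypothetical `MockLine` class that is much smaller than the real `MockCodeline`:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Empty tag type: carries no data, only steers overload resolution.
struct SWZ {};

struct MockLine {
   // Plain registers: every source gets the identity swizzle.
   MockLine(unsigned op_, const std::vector<int> &src)
      : op(op_), swizzled(false)
   {
      for (int index : src)
         regs.push_back({index, "xyzw"});
   }

   // Registers with explicit swizzles; the unnamed SWZ argument exists
   // only to make this overload distinguishable at the call site.
   MockLine(unsigned op_,
            const std::vector<std::pair<int, const char *>> &src, SWZ)
      : op(op_), swizzled(true)
   {
      for (const auto &s : src) {
         assert(s.second != nullptr);
         regs.push_back({s.first, s.second});
      }
   }

   unsigned op;
   bool swizzled;
   std::vector<std::pair<int, std::string>> regs;
};
```

Without the tag, a call such as `MockLine(op, {})` would be ambiguous, because an empty brace list can initialize either vector type. With it, `MockLine(op, {}, SWZ())` picks the swizzle variant unambiguously.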

Signed-off-by: Gert Wollny 
---
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 481 ++---
 1 file changed, 234 insertions(+), 247 deletions(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index d0ac8b1020..80ea19fa80 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -29,35 +29,50 @@
 
 #include 
 #include 
+#include 
+#include 
 
 using std::vector;
 using std::pair;
 using std::make_pair;
+using std::transform;
+using std::copy;
+
+/* Use this to make the compiler pick the swizzle constructor below */
+struct SWZ {};
 
 /* A line to describe a TGSI instruction for building mock shaders. */
 struct MockCodeline {
-   MockCodeline(unsigned _op): op(_op) {}
-   MockCodeline(unsigned _op, const vector& _dst, const vector& 
_src, const vector&_to):
-  op(_op), dst(_dst), src(_src), tex_offsets(_to){}
-   unsigned op;
-   vector dst;
-   vector src;
-   vector tex_offsets;
-};
+   MockCodeline(unsigned _op): op(_op), max_temp_id(0){}
+   MockCodeline(unsigned _op, const vector& _dst, const vector& _src,
+const vector&_to);
+
+   MockCodeline(unsigned _op, const vector>& _dst,
+const vector>& _src,
+const vector>&_to, SWZ with_swizzle);
+
+   int get_max_reg_id() const { return max_temp_id;}
+
+   glsl_to_tgsi_instruction *get_codeline() const;
+
+   static void set_mem_ctx(void *ctx);
+
+private:
+   st_src_reg create_src_register(int src_idx);
+   st_src_reg create_src_register(int src_idx, const char *swizzle);
+   st_src_reg create_src_register(int src_idx, gl_register_file file);
+
+   st_dst_reg create_dst_register(int dst_idx);
+   st_dst_reg create_dst_register(int dst_idx, int writemask);
+   st_dst_reg create_dst_register(int dst_idx, gl_register_file file);
 
-/* A line to describe a TGSI instruction with swizzeling and write makss
- * for building mock shaders.
- */
-struct MockCodelineWithSwizzle {
-   MockCodelineWithSwizzle(unsigned _op): op(_op) {}
-   MockCodelineWithSwizzle(unsigned _op, const vector>& _dst,
-   const vector>& _src,
-   const vector>&_to):
-  op(_op), dst(_dst), src(_src), tex_offsets(_to){}
unsigned op;
-   vector> dst;
-   vector> src;
-   vector> tex_offsets;
+   vector dst;
+   vector src;
+   vector tex_offsets;
+
+   int max_temp_id;
+   static void *mem_ctx;
 };
 
 /* A few constants that will notbe tracked as temporary registers by the
@@ -72,22 +87,14 @@ const int out1 = -2;
 
 class MockShader {
 public:
-   MockShader(const vector& source);
-   MockShader(const vector& source);
-   ~MockShader();
-
-   void free();
+   MockShader(const vector& source, void *ctx);
 
exec_list* get_program() const;
int get_num_temps() const;
+
 private:
-   st_src_reg create_src_register(int src_idx);
-   st_dst_reg create_dst_register(int dst_idx);
-   st_src_reg create_src_register(int src_idx, const char *swizzle);
-   st_dst_reg create_dst_register(int dst_idx,int writemask);
exec_list* program;
int num_temps;
-   void *mem_ctx;
 };
 
 using expectation = vector>;
@@ -102,7 +109,6 @@ protected:
 class LifetimeEvaluatorTest : public MesaTestWithMemCtx {
 protected:
void run(const vector& code, const expectation& e);
-   void run(const vector& code, const expectation& e);
 private:
virtual void check(const vector& result, const expectation& e) = 
0;
 };
@@ -136,21 +142,9 @@ protected:
 class RegisterLifetimeAndRemappingTest : public RegisterRemappingTest  {
 protected:
using RegisterRemappingTest::run;
-   template 
-   void run(const vector& code, const vector& expect);
+   void run(const vector& code, const vector& expect);
 };
 
-template 
-void RegisterLifetimeAndRemappingTest::run(const vector& code,
-  const vector& expect)
-{
- MockShader shader(code);
- std::vector lt(shader.get_num_temps());
- get_temp_registers_required_lifetimes(mem_ctx, shader.get_program(),
-   shader.get_num_temps(), &lt[0]);
- this->run(lt, expect);
-}
-
 TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
 {
const vector code = {
@@ -831,80 +825,81 @@ TEST_F(LifetimeEvaluatorExactTest, 
FirstWriteAtferReadInNestedLoop)
  */
 TEST_F(LifetimeEvaluatorExactTest, LoopWithConditionalComponentWrite_X)
 {
-   const vector code = {
-  MockCodelineWithSwizzle(TGSI_OPCODE_BGNLOOP),
-  MockCodelineWithSwizzle(TGSI_OPCODE_MOV, DST(1, WRITEMASK_Y), SRC(in1, 
"x"), {}),
-  MockCodelineWithSwizzle(TGSI_OPCODE_IF, {}, SRC(in0, ""), {}),
-  MockCodelineWithSwizzle(TGSI_OPCODE_MOV, DST(1, WRITEMASK_X), SRC(in1, 
"y"),

[Mesa-dev] [Bug 103312] meson/macOS: Dependency libdrm_intel found: NO

2017-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103312

--- Comment #2 from Emil Velikov  ---
autotools detects possible drivers per platform and per cpu arch.
The latter may be overkill, but the former is a must IMHO.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH v2 01/10] mesa/st/tests: Fix zero-byte allocation leaks

2017-10-17 Thread Gert Wollny

Don't allocate a zero-sized array when no texture offsets are given.
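
In C++, `new T[0]` still returns a distinct allocation that has to be released with `delete[]`, so skipping the allocation entirely is the simpler fix. A stand-alone sketch of the guard, using a hypothetical `mock_src_reg` type and `copy_offsets` helper rather than the test's real classes:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for st_src_reg.
struct mock_src_reg {
   int index;
};

// Returns a heap array with one register per offset, or nullptr when
// there are no offsets, so no zero-length allocation is ever created.
static mock_src_reg *copy_offsets(const int *offsets, std::size_t count)
{
   assert(count == 0 || offsets != nullptr);

   if (count == 0)
      return nullptr;

   mock_src_reg *out = new mock_src_reg[count];
   for (std::size_t k = 0; k < count; ++k)
      out[k].index = offsets[k];
   return out;
}
```

Callers can then pair a non-null result with `delete[]` and treat `nullptr` as "nothing to free".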

Reviewed-by: Nicolai Hähnle 
Signed-off-by: Gert Wollny 
---
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 23 +++---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 93f4020ebf..d0ac8b1020 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -1374,10 +1374,14 @@ MockShader::MockShader(const 
vector& source):
  next_instr->dst[k] = create_dst_register(i.dst[k].first, 
i.dst[k].second);
   }
   next_instr->tex_offset_num_offset = i.tex_offsets.size();
-  next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
-  for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
- next_instr->tex_offsets[k] = 
create_src_register(i.tex_offsets[k].first,
-  
i.tex_offsets[k].second);
+  if (next_instr->tex_offset_num_offset > 0) {
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+next_instr->tex_offsets[k] = 
create_src_register(i.tex_offsets[k].first,
+ 
i.tex_offsets[k].second);
+ }
+  } else {
+ next_instr->tex_offsets = nullptr;
   }
   program->push_tail(next_instr);
}
@@ -1407,10 +1411,15 @@ MockShader::MockShader(const vector& 
source):
  next_instr->dst[k] = create_dst_register(i.dst[k]);
   }
   next_instr->tex_offset_num_offset = i.tex_offsets.size();
-  next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
-  for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
- next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+  if (next_instr->tex_offset_num_offset > 0) {
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+  } else {
+ next_instr->tex_offsets = nullptr;
   }
+
   program->push_tail(next_instr);
}
++num_temps;
-- 
2.14.2


[Mesa-dev] [PATCH v2 05/10] mesa/st/glsl_to_tgsi: Correct debug output for indirect access

2017-10-17 Thread Gert Wollny

For arrays print the array ID, and with indirect access also print the
reladdr* registers. The reladdr* registers are always used in the
printout, even though the actual code may use an address register.

Specifically, a sequence involving src.reladdr = TEMP[2] and src.index=10
that emits the address register loading instruction will be printed like:

  MOV ADDR[0].x, TEMP[2].
  MOV TEMP[3], ARRAY(2)[TEMP[2]. + 10]

The reason for this is that the src register currently gives no indication
of whether the address-loading instruction was or must be emitted.
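
The printing scheme can be sketched in isolation like this. `mock_reg` and `format_reg` are hypothetical simplifications (no 2D index, no swizzle or writemask), so the output only approximates what the patch prints:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, reduced register description.
struct mock_reg {
   const char *file;        // register file name, e.g. "TEMP" or "ARRAY"
   int array_id;            // printed only when > 0, i.e. for arrays
   int index;
   const mock_reg *reladdr; // optional relative-address register
};

static std::ostream &operator<<(std::ostream &os, const mock_reg &reg)
{
   assert(reg.file != nullptr);

   os << reg.file;
   if (reg.array_id > 0)
      os << "(" << reg.array_id << ")";

   os << "[";
   if (reg.reladdr)
      os << *reg.reladdr << "+"; // recurse: print the reladdr register itself
   os << reg.index << "]";
   return os;
}

// Convenience wrapper so the result can be inspected as a string.
static std::string format_reg(const mock_reg &reg)
{
   std::ostringstream os;
   os << reg;
   return os.str();
}
```

Taking the stream as a parameter, as the patch does for dump_instruction(), also makes the output easy to capture in a test instead of always going to cerr.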

Signed-off-by: Gert Wollny 
---
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 120 -
 1 file changed, 67 insertions(+), 53 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 76c198e165..839dfff078 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -601,7 +601,7 @@ public:
 
 #ifndef NDEBUG
 /* Function used for debugging. */
-static void dump_instruction(int line, prog_scope *scope,
+static void dump_instruction(std::ostream& os, int line, prog_scope *scope,
  const glsl_to_tgsi_instruction& inst);
 #endif
 
@@ -647,7 +647,7 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
  break;
   }
 
-  RENAME_DEBUG(dump_instruction(line, cur_scope, *inst));
+  RENAME_DEBUG(dump_instruction(cerr, line, cur_scope, *inst));
 
   switch (inst->op) {
   case TGSI_OPCODE_BGNLOOP: {
@@ -918,8 +918,59 @@ static const char *tgsi_file_names[PROGRAM_FILE_MAX] =  {
"IMM", "BUF",  "MEM",  "IMAGE"
 };
 
+template 
+void dump_reg_access(std::ostream& os, const st_reg& reg)
+{
+   os << tgsi_file_names[reg.file];
+   if (reg.file == PROGRAM_ARRAY)
+  os << "(" << reg.array_id << ")";
+
+   if (reg.has_index2) {
+  os << "[";
+  if (reg.reladdr2)
+ os << *reg.reladdr2 << "+";
+  os << reg.index2D << "]";
+   }
+
+   os << "[";
+   if (reg.reladdr)
+  os << *reg.reladdr << "+";
+   os << reg.index << "]";
+}
+
+static std::ostream& operator << (std::ostream& os, const st_src_reg& reg)
+{
+   dump_reg_access(os, reg);
+
+   if (reg.swizzle != SWIZZLE_XYZW) {
+  os << ".";
+  for (int idx = 0; idx < 4; ++idx) {
+ int swz = GET_SWZ(reg.swizzle, idx);
+ if (swz < 4) {
+os << swizzle_txt[swz];
+ }
+  }
+   }
+   return os;
+}
+
+static std::ostream& operator << (std::ostream& os, const st_dst_reg& dst)
+{
+   dump_reg_access(os, dst);
+
+   if (dst.writemask != TGSI_WRITEMASK_XYZW) {
+  os << ".";
+  if (dst.writemask & TGSI_WRITEMASK_X) os << "x";
+  if (dst.writemask & TGSI_WRITEMASK_Y) os << "y";
+  if (dst.writemask & TGSI_WRITEMASK_Z) os << "z";
+  if (dst.writemask & TGSI_WRITEMASK_W) os << "w";
+   }
+
+   return os;
+}
+
 static
-void dump_instruction(int line, prog_scope *scope,
+void dump_instruction(std::ostream& os, int line, prog_scope *scope,
   const glsl_to_tgsi_instruction& inst)
 {
const struct tgsi_opcode_info *info = tgsi_get_opcode_info(inst.op);
@@ -937,74 +988,37 @@ void dump_instruction(int line, prog_scope *scope,
info->opcode == TGSI_OPCODE_ENDSWITCH)
   --indent;
 
-   cerr << setw(4) << line << ": ";
+   os << setw(4) << line << ": ";
for (int i = 0; i < indent; ++i)
-  cerr << "";
-   cerr << tgsi_get_opcode_name(info->opcode) << " ";
+  os << "";
+   os << tgsi_get_opcode_name(info->opcode) << " ";
 
bool has_operators = false;
for (unsigned j = 0; j < num_inst_dst_regs(&inst); j++) {
   has_operators = true;
   if (j > 0)
- cerr << ", ";
+ os << ", ";
 
-  const st_dst_reg& dst = inst.dst[j];
-  cerr << tgsi_file_names[dst.file];
+  os << inst.dst[j];
 
-  if (dst.file == PROGRAM_ARRAY)
- cerr << "(" << dst.array_id << ")";
-
-  cerr << "[" << dst.index << "]";
-
-  if (dst.writemask != TGSI_WRITEMASK_XYZW) {
- cerr << ".";
- if (dst.writemask & TGSI_WRITEMASK_X) cerr << "x";
- if (dst.writemask & TGSI_WRITEMASK_Y) cerr << "y";
- if (dst.writemask & TGSI_WRITEMASK_Z) cerr << "z";
- if (dst.writemask & TGSI_WRITEMASK_W) cerr << "w";
-  }
}
if (has_operators)
-  cerr << " := ";
+  os << " := ";
 
for (unsigned j = 0; j < num_inst_src_regs(&inst); j++) {
   if (j > 0)
- cerr << ", ";
-
-  const st_src_reg& src = inst.src[j];
-  cerr << tgsi_file_names[src.file]
-   << "[" << src.index << "]";
-  if (src.swizzle != SWIZZLE_XYZW) {
- cerr << ".";
- for (int idx = 0; idx < 4; ++idx) {
-int swz = GET_SWZ(src.swizzle, idx);
-if (swz < 4) {
-   cerr << swizzle_txt[swz];
-}
- }
-  }
+

[Mesa-dev] [PATCH v2 00/10] glsl_to_tgsi: Further improvement of lifetime tracking for register merge

2017-10-17 Thread Gert Wollny

Dear all, 

this is the updated patch set that adds enhanced tracking of IF/ELSE 
branches and tracking of reladdr* registers for the register_merge step. 

So far patches 1 & 5 (now 8) are

  Reviewed-by: Nicolai Hähnle 

Changes w.r.t. v1: 

* patches 2-4 (new): As suggested by Nicolai, these patches unify the test
  classes with respect to the different register inputs (at this point: plain
  and with swizzle). In addition, some comments are corrected and the use of
  whitespace in the test cases is made more consistent.
* patch 5: Correct the debug output for indirect addressing. Nicolai suggested
  that another patch might be in order to properly propagate the information
  about when and which address register is used, but since st_*_reg is passed
  through various levels by value, I'd prefer to deal with that in another,
  dedicated patch series.
* patch 6: Further improve the tracking algorithm and, as requested by
  Nicolai, rename some variables and add comments to make the algorithm
  clearer.
* patch 7: Add yet more tests.
* patch 9: Update the tests to adhere to the new, unified interface.
* patch 10 (new): Remove the no longer needed assert for the use of address
  registers in register_merge (I was considering adding this to 8, but since
  that one was already reviewed ...)

Many thanks for any comments,
Gert

--
Submitter has no write access to mesa-git 

Gert Wollny (10):
  mesa/st/tests: Fix zero-byte allocation leaks
  mesa/st/tests: unify MockCodeLine* classes
  mesa/st/tests: base check of number of registers on opcode info
  mesa/st/tests: cleanup whitespace usage and correct some comments
  mesa/st/glsl_to_tgsi: Correct debug output for indirect access
  mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register
merging
  mesa/st/tests: Add tests for improved tracking of temporaries
  mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers
  mesa/st/tests: Add tests for lifetime tracking with indirect
addressing
  mesa/st/glsl_to_tgsi: remove now unneeded assert.

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |1 -
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   |  540 +++--
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 1276 +++-
 3 files changed, 1399 insertions(+), 418 deletions(-)

-- 
2.14.2

[Mesa-dev] [PATCH v2 08/10] mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers

2017-10-17 Thread Gert Wollny

So far indirect addressing was not tracked when estimating the lifetime of
temporaries, and it was not needed, because code to load the address
registers was always emitted, eliminating the reladdr* handles in the
earlier glsl_to_tgsi stages. With Marek's patch 9a88580a4b3d, which allows
any 1D register to be used for addressing on some hardware, this changed,
and the tracking becomes necessary.

Because the registers give no direct indication of whether the reladdr* was
already loaded into an address register, the temporaries in reladdr* are
always tracked as reads. This may result in a slight over-estimation of the
lifetime in the cases where the load to the address register was emitted.
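
As a rough illustration of the rule described above, the following sketch
records a read for a register and then, recursively, for whatever it is
indirectly addressed by. SketchReg and SketchRecorder are hypothetical,
reduced stand-ins for st_src_reg and access_recorder and are not part of
the patch; only the recursion over reladdr/reladdr2 mirrors the change.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for st_src_reg: only the fields
// needed to illustrate the recursion are modelled.
struct SketchReg {
   bool is_temporary;          // stands in for file == PROGRAM_TEMPORARY
   int index;
   const SketchReg *reladdr;   // first level of indirect addressing, or nullptr
   const SketchReg *reladdr2;  // second level, or nullptr
};

// Hypothetical recorder: stores the line of the most recent read per
// temporary (-1 means "never read").
struct SketchRecorder {
   std::vector<int> last_read;

   explicit SketchRecorder(int ntemps) : last_read(ntemps, -1) {}

   void record_read(const SketchReg& reg, int line)
   {
      if (reg.is_temporary)
         last_read[reg.index] = line;

      // The address operands are always counted as reads, whether or not
      // a load to an address register was emitted for them.
      if (reg.reladdr)
         record_read(*reg.reladdr, line);
      if (reg.reladdr2)
         record_read(*reg.reladdr2, line);
   }
};
```

A non-temporary source that is addressed through temporary 2 therefore
still extends the lifetime of temporary 2 to the line of the access.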

Reviewed-by: Nicolai Hähnle 
Signed-off-by: Gert Wollny 
---
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 108 ++---
 1 file changed, 74 insertions(+), 34 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 4490a44468..0903f2feba 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -869,6 +869,69 @@ public:
}
 };
 
+class access_recorder {
+public:
+   access_recorder(int _ntemps);
+   ~access_recorder();
+
+   void record_read(const st_src_reg& src, int line, prog_scope *scope);
+   void record_write(const st_dst_reg& src, int line, prog_scope *scope);
+
+   void get_required_lifetimes(struct lifetime *lifetimes);
+private:
+
+   int ntemps;
+   temp_access *acc;
+
+};
+
+access_recorder::access_recorder(int _ntemps):
+   ntemps(_ntemps)
+{
+   acc = new temp_access[ntemps];
+}
+
+access_recorder::~access_recorder()
+{
+   delete[] acc;
+}
+
+void access_recorder::record_read(const st_src_reg& src, int line,
+  prog_scope *scope)
+{
+   if (src.file == PROGRAM_TEMPORARY)
+  acc[src.index].record_read(line, scope, src.swizzle);
+
+   if (src.reladdr)
+  record_read(*src.reladdr, line, scope);
+   if (src.reladdr2)
+  record_read(*src.reladdr2, line, scope);
+}
+
+void access_recorder::record_write(const st_dst_reg& dst, int line,
+   prog_scope *scope)
+{
+   if (dst.file == PROGRAM_TEMPORARY)
+  acc[dst.index].record_write(line, scope, dst.writemask);
+
+   if (dst.reladdr)
+  record_read(*dst.reladdr, line, scope);
+   if (dst.reladdr2)
+  record_read(*dst.reladdr2, line, scope);
+}
+
+void access_recorder::get_required_lifetimes(struct lifetime *lifetimes)
+{
+   RENAME_DEBUG(cerr << "= lifetimes ==\n");
+   for(int i = 0; i < ntemps; ++i) {
+  RENAME_DEBUG(cerr << setw(4) << i);
+  lifetimes[i] = acc[i].get_required_lifetime();
+  RENAME_DEBUG(cerr << ": [" << lifetimes[i].begin << ", "
+<< lifetimes[i].end << "]\n");
+   }
+   RENAME_DEBUG(cerr << "==\n\n");
+}
+
 }
 
 #ifndef NDEBUG
@@ -889,7 +952,6 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
int if_id = 1;
int switch_id = 0;
bool is_at_end = false;
-   bool ok = true;
int n_scopes = 1;
 
/* Count scopes to allocate the needed space without the need for
@@ -907,7 +969,8 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
}
 
prog_scope_storage scopes(mem_ctx, n_scopes);
-   temp_access *acc = new temp_access[ntemps];
+
+   access_recorder access(ntemps);
 
prog_scope *cur_scope = scopes.create(nullptr, outer_scope, 0, 0, line);
 
@@ -936,9 +999,7 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
   case TGSI_OPCODE_IF:
   case TGSI_OPCODE_UIF: {
  assert(num_inst_src_regs(inst) == 1);
- const st_src_reg& src = inst->src[0];
- if (src.file == PROGRAM_TEMPORARY)
-acc[src.index].record_read(line, cur_scope, src.swizzle);
+ access.record_read(inst->src[0], line, cur_scope);
  cur_scope = scopes.create(cur_scope, if_branch, if_id++,
cur_scope->nesting_depth() + 1, line + 1);
  break;
@@ -964,14 +1025,12 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
   }
   case TGSI_OPCODE_SWITCH: {
  assert(num_inst_src_regs(inst) == 1);
- const st_src_reg& src = inst->src[0];
  prog_scope *scope = scopes.create(cur_scope, switch_body, switch_id++,
cur_scope->nesting_depth() + 1, 
line);
  /* We record the read only for the SWITCH statement itself, like it
   * is used by the only consumer of TGSI_OPCODE_SWITCH in tgsi_exec.c.
   */
- if (src.file == PROGRAM_TEMPORARY)
-acc[src.index].record_read(line, cur_scope, src.swizzle);
+ access.record_read(inst->src[0], line, cur_scope);
  cur_scope = scope;
  break;
   }
@@ -993

[Mesa-dev] [PATCH v2 04/10] mesa/st/tests: cleanup whitespace usage and correct some comments

2017-10-17 Thread Gert Wollny

Signed-off-by: Gert Wollny 
---
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 127 ++---
 1 file changed, 63 insertions(+), 64 deletions(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 1a3c8cfa32..908791fbf6 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -289,9 +289,9 @@ TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
run (code, expectation({{-1,-1}, {0,9}, {3,7}, {7,10}}));
 }
 
-/* In loop if/else value written in both path, read in else path
- * before write and also read later
- * - value must survive the whole loop
+/* Test that read before write in ELSE path is properly tracked:
+ * In loop if/else value written in both path but read in else path
+ * before write and also read later - value must survive the whole loop.
  */
 TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
 {
@@ -351,9 +351,9 @@ TEST_F(LifetimeEvaluatorExactTest, 
ReadInLoopInIfBeforeWriteAndLifeToTheEnd)
run (code, expectation({{-1,-1}, {0,6}}));
 }
 
-/* In loop if/else read in one path before written in the same loop
- * read after the loop, value must survivethe whole loop and
- * to the read.
+/* In loop read before written in the same loop read after the loop,
+ * value must survive the whole loop and to the read.
+ * This is kind of undefined behaviour though ...
  */
 TEST_F(LifetimeEvaluatorExactTest, ReadInLoopBeforeWriteAndLifeToTheEnd)
 {
@@ -686,7 +686,6 @@ TEST_F(LifetimeEvaluatorExactTest, 
LoopWithReadWriteInSwitchDifferentCaseFallThr
run (code, expectation({{-1,-1}, {0,8}}));
 }
 
-
 /* Here we read and write from an to the same temp in the same instruction,
  * but the read is conditional (select operation), hence the lifetime must
  * start with the first write.
@@ -694,21 +693,21 @@ TEST_F(LifetimeEvaluatorExactTest, 
LoopWithReadWriteInSwitchDifferentCaseFallThr
 TEST_F(LifetimeEvaluatorExactTest, WriteSelectFromSelf)
 {
const vector code = {
-  {TGSI_OPCODE_USEQ, {5}, {in0,in1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_FSLT, {2}, {1,in1}, {}},
-  {TGSI_OPCODE_UIF, {}, {2}, {}},
-  {  TGSI_OPCODE_MOV, {3}, {in1}, {}},
-  {TGSI_OPCODE_ELSE},
-  {  TGSI_OPCODE_MOV, {4}, {in1}, {}},
-  {  TGSI_OPCODE_MOV, {4}, {4}, {}},
-  {  TGSI_OPCODE_MOV, {3}, {4}, {}},
-  {TGSI_OPCODE_ENDIF},
-  {TGSI_OPCODE_MOV, {out1}, {3}, {}},
-  {TGSI_OPCODE_END}
+  { TGSI_OPCODE_USEQ, {5}, {in0,in1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_FSLT, {2}, {1,in1}, {}},
+  { TGSI_OPCODE_UIF, {}, {2}, {}},
+  {   TGSI_OPCODE_MOV, {3}, {in1}, {}},
+  { TGSI_OPCODE_ELSE},
+  {   TGSI_OPCODE_MOV, {4}, {in1}, {}},
+  {   TGSI_OPCODE_MOV, {4}, {4}, {}},
+  {   TGSI_OPCODE_MOV, {3}, {4}, {}},
+  { TGSI_OPCODE_ENDIF},
+  { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+  { TGSI_OPCODE_END}
};
run (code, expectation({{-1,-1}, {1,5}, {5,6}, {7,13}, {9,11}, {0,4}}));
 }
@@ -1268,21 +1267,21 @@ TEST_F(RegisterRemappingTest, 
RegisterRemappingMergeZeroLifetimeRegisters)
 TEST_F(RegisterLifetimeAndRemappingTest, LifetimeAndRemapping)
 {
const vector code = {
-  {TGSI_OPCODE_USEQ, {5}, {in0,in1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
-  {TGSI_OPCODE_FSLT, {2}, {1,in1}, {}},
-  {TGSI_OPCODE_UIF, {}, {2}, {}},
-  {  TGSI_OPCODE_MOV, {3}, {in1}, {}},
-  {TGSI_OPCODE_ELSE},
-  {  TGSI_OPCODE_MOV, {4}, {in1}, {}},
-  {  TGSI_OPCODE_MOV, {4}, {4}, {}},
-  {  TGSI_OPCODE_MOV, {3}, {4}, {}},
-  {TGSI_OPCODE_ENDIF},
-  {TGSI_OPCODE_MOV, {out1}, {3}, {}},
-  {TGSI_OPCODE_END}
+  { TGSI_OPCODE_USEQ, {5}, {in0,in1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  { TGSI_OPCODE_FSLT, {2}, {1,in1}, {}},
+  { TGSI_OPCODE_UIF, {}, {2}, {}},
+  {   TGSI_OPCODE_MOV, {3}, {in1}, {}},
+  { TGSI_OPCODE_ELSE},
+  {   TGSI_OPCODE_MOV, {4}, {in1}, {}},
+  {   TGSI_OPCODE_MOV, {4}, {4}, {}},
+  {   TGSI_OPCODE_MOV, {3}, {4}, {}},
+  { TGSI_OPCODE_ENDIF},
+  { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+  { TGSI_OPCODE_END}
};
run (code, vector({0,1,5,5,1,5}));
 }
@@ -1290,15 +1289,15 @@ TEST_F(RegisterLifetimeAndR

[Mesa-dev] [PATCH v2 10/10] mesa/st/glsl_to_tgsi: remove now unneeded assert.

2017-10-17 Thread Gert Wollny

With the tracking of the registers used in reladdr* implemented,
asserting that a driver calling merge_registers() uses the address register
is no longer needed.

Signed-off-by: Gert Wollny 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index a45f0047a8..68b80e015d 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5257,7 +5257,6 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
 void
 glsl_to_tgsi_visitor::merge_registers(void)
 {
-   assert(need_uarl);
struct lifetime *lifetimes =
  rzalloc_array(mem_ctx, struct lifetime, this->next_temp);
 
-- 
2.14.2

[Mesa-dev] [PATCH v2 09/10] mesa/st/tests: Add tests for lifetime tracking with indirect addressing

2017-10-17 Thread Gert Wollny

 Add a code line type that accepts one layer of indirect addressing and
 add tests to check that temporary register access used for indirect
 addressing is accounted for in the lifetime estimation.

Signed-off-by: Gert Wollny 
---
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 184 -
 1 file changed, 183 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 5b6c40ffec..88f8844876 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -37,10 +37,14 @@ using std::pair;
 using std::make_pair;
 using std::transform;
 using std::copy;
+using std::tuple;
 
 /* Use this to make the compiler pick the swizzle constructor below */
 struct SWZ {};
 
+/* Use this to make the compiler pick the constructor with reladdr below */
+struct RA {};
+
 /* A line to describe a TGSI instruction for building mock shaders. */
 struct MockCodeline {
MockCodeline(unsigned _op): op(_op), max_temp_id(0){}
@@ -51,6 +55,10 @@ struct MockCodeline {
 const vector>& _src,
 const vector>&_to, SWZ with_swizzle);
 
+   MockCodeline(unsigned _op, const vector>& _dst,
+const vector>& _src,
+const vector>&_to, RA with_reladdr);
+
int get_max_reg_id() const { return max_temp_id;}
 
glsl_to_tgsi_instruction *get_codeline() const;
@@ -61,11 +69,13 @@ private:
st_src_reg create_src_register(int src_idx);
st_src_reg create_src_register(int src_idx, const char *swizzle);
st_src_reg create_src_register(int src_idx, gl_register_file file);
+   st_src_reg create_src_register(const tuple& src);
+   st_src_reg *create_rel_src_register(int idx);
 
st_dst_reg create_dst_register(int dst_idx);
st_dst_reg create_dst_register(int dst_idx, int writemask);
st_dst_reg create_dst_register(int dst_idx, gl_register_file file);
-
+   st_dst_reg create_dst_register(const tuple& dest);
unsigned op;
vector dst;
vector src;
@@ -1674,6 +1684,90 @@ TEST_F(LifetimeEvaluatorExactTest, 
NestedLoopWithWriteAfterBreak)
run (code, expectation({{-1,-1}, {0,8}}));
 }
 
+/* Check lifetime estimation with a relative addressing in src.
+ * Note, since the lifetime estimation always extends the lifetime
+ * to at least one instruction after the last write, for the
+ * test the last read must be at least two instructions after the
+ * last write to obtain a proper test.
+ */
+
+TEST_F(LifetimeEvaluatorExactTest, ReadIndirectReladdr1)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+  { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+  { TGSI_OPCODE_MOV, {{3,0,0}}, {{2,1,0}}, {}, RA()},
+  { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,2}, {1,2}, {2,3}}));
+}
+
+/* Check lifetime estimation with a relative addressing in src */
+TEST_F(LifetimeEvaluatorExactTest, ReadIndirectReladdr2)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV , {1}, {in1}, {}},
+  { TGSI_OPCODE_MOV , {2}, {in0}, {}},
+  { TGSI_OPCODE_MOV , {{3,0,0}}, {{4,0,1}}, {}, RA()},
+  { TGSI_OPCODE_MOV , {out0}, {3}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,2}, {1,2},{2,3}}));
+}
+
+/* Check lifetime estimation with a relative addressing in src */
+TEST_F(LifetimeEvaluatorExactTest, ReadIndirectTexOffsReladdr1)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV , {1}, {in1}, {}},
+  { TGSI_OPCODE_MOV , {2}, {in0}, {}},
+  { TGSI_OPCODE_MOV , {{3,0,0}}, {{in2,0,0}}, {{5,1,0}}, RA()},
+  { TGSI_OPCODE_MOV , {out0}, {3}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,2}, {1,2}, {2,3}}));
+}
+
+/* Check lifetime estimation with a relative addressing in src */
+TEST_F(LifetimeEvaluatorExactTest, ReadIndirectTexOffsReladdr2)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV , {1}, {in1}, {}},
+  { TGSI_OPCODE_MOV , {2}, {in0}, {}},
+  { TGSI_OPCODE_MOV , {{3,0,0}}, {{in2,0,0}}, {{2,0,1}}, RA()},
+  { TGSI_OPCODE_MOV , {out0}, {3}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,2}, {1,2}, {2,3}}));
+}
+
+/* Check lifetime estimation with a relative addressing in dst */
+TEST_F(LifetimeEvaluatorExactTest, WriteIndirectReladdr1)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV , {1}, {in0}, {}},
+  { TGSI_OPCODE_MOV , {1}, {in1}, {}},
+  { TGSI_OPCODE_MOV , {{5,1,0}}, {{in1,0,0}}, {}, RA()},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,2}}));
+}
+
+/* Check lifetime estimation with a relative addressing in dst */
+TEST_F(LifetimeEvaluatorExactTest, WriteIndirectReladdr2)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV , {1}, {in0}, {}},
+  { TGSI_OPCODE_MOV , {2}, {in1}, {}},
+  { TGSI_OPCODE_MOV , {{5,0,1}}

[Mesa-dev] [PATCH v2 03/10] mesa/st/tests: base check of number of registers on opcode info

2017-10-17 Thread Gert Wollny

 * Test number of operands by using num_inst_src_regs/num_inst_dst_regs
   and fix tests accordingly.

Signed-off-by: Gert Wollny 
---
 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 80ea19fa80..1a3c8cfa32 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -912,7 +912,7 @@ TEST_F(LifetimeEvaluatorExactTest, 
FRaWSameInstructionInLoopAndCondition)
const vector code = {
   { TGSI_OPCODE_BGNLOOP },
   {   TGSI_OPCODE_BGNLOOP },
-  { TGSI_OPCODE_IF, {0}, {in0}, {} },
+  { TGSI_OPCODE_IF, {}, {in0}, {} },
   {   TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
   { TGSI_OPCODE_ENDIF},
   { TGSI_OPCODE_MOV, {1}, {in1}, {}},
@@ -1468,8 +1468,8 @@ glsl_to_tgsi_instruction *MockCodeline::get_codeline() 
const
next_instr->op = op;
next_instr->info = tgsi_get_opcode_info(op);
 
-   assert(src.size() < 5);
-   assert(dst.size() < 3);
+   assert(src.size() == num_inst_src_regs(next_instr));
+   assert(dst.size() == num_inst_dst_regs(next_instr));
assert(tex_offsets.size() < 3);
 
copy(src.begin(), src.end(), next_instr->src);
-- 
2.14.2

[Mesa-dev] [PATCH v2 06/10] mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging

2017-10-17 Thread Gert Wollny

Improve the lifetime evaluation of temporary registers by also tracking
writes in both if and else branches and in up to 32 nested scopes.
As a result, the estimated required register lifetimes can be further
reduced, enabling more registers to be merged.
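
To make the 32-level limit more concrete, here is a minimal sketch of how a
32-bit field can note, per nesting depth, an IF-branch write that has not
yet been matched by a write in the corresponding ELSE branch. The class
IfElseWriteSketch and its methods are hypothetical and only loosely
modelled on the if_scope_write_flags member; it covers the pairing case
only and is not the implementation in this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tracker for unpaired IF-branch writes, one bit per
// if/else nesting depth.
class IfElseWriteSketch {
public:
   static const int supported_depth = 32;

   // Note a write in the IF branch at the given nesting depth.
   // Returns false when the depth does not fit into the bit field.
   bool record_if_write(int depth)
   {
      if (depth < 0 || depth >= supported_depth)
         return false;
      flags |= UINT32_C(1) << depth;
      return true;
   }

   // A write in the matching ELSE branch pairs up with the IF write,
   // so at this depth the write no longer counts as conditional.
   void record_else_write(int depth)
   {
      if (depth >= 0 && depth < supported_depth)
         flags &= ~(UINT32_C(1) << depth);
   }

   // True while some IF branch wrote without a paired ELSE write.
   bool has_unpaired_if_write() const { return flags != 0; }

private:
   std::uint32_t flags = 0;
};
```

A temporary whose flags are all cleared at the end of a loop was written on
every tracked if/else path pair, which is the situation in which a shorter
lifetime can be assumed.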

Signed-off-by: Gert Wollny 
---
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 312 +++--
 1 file changed, 292 insertions(+), 20 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 839dfff078..4490a44468 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -98,15 +98,19 @@ public:
int begin() const;
int loop_break_line() const;
 
+   const prog_scope *in_else_scope() const;
const prog_scope *in_ifelse_scope() const;
-   const prog_scope *in_switchcase_scope() const;
+   const prog_scope *in_parent_ifelse_scope() const;
const prog_scope *innermost_loop() const;
const prog_scope *outermost_loop() const;
const prog_scope *enclosing_conditional() const;
 
bool is_loop() const;
bool is_in_loop() const;
+   bool is_switchcase_scope_in_loop() const;
bool is_conditional() const;
+   bool is_child_of(const prog_scope *scope) const;
+   bool is_child_of_ifelse_id_sibling(const prog_scope *scope) const;
 
bool break_is_for_switchcase() const;
bool contains_range_of(const prog_scope& other) const;
@@ -137,25 +141,81 @@ private:
prog_scope *storage;
 };
 
+/* Class to track the access to a component of a temporary register. */
+
 class temp_comp_access {
 public:
temp_comp_access();
+
void record_read(int line, prog_scope *scope);
void record_write(int line, prog_scope *scope);
lifetime get_required_lifetime();
 private:
void propagate_lifetime_to_dominant_write_scope();
+   bool conditional_ifelse_write_in_loop() const;
+
+   void record_ifelse_write(const prog_scope& scope);
+   void record_if_write(const prog_scope& scope);
+   void record_else_write(const prog_scope& scope);
 
prog_scope *last_read_scope;
prog_scope *first_read_scope;
prog_scope *first_write_scope;
+
int first_write;
int last_read;
int last_write;
int first_read;
-   bool keep_for_full_loop;
+
+   /* This member variable tracks the current resolution of conditional writing
+* to this temporary in IF/ELSE clauses.
+*
+* The initial value "conditionality_untouched" indicates that this
+* temporary has not yet been written to within an if clause.
+*
+* A positive (other than "conditionality_untouched") number refers to the
+* last loop id for which the write was resolved as unconditional. With each
+* new loop this value will be overwritten by "conditionality_unresolved"
+* on entering the first IF clause writing this temporary.
+*
+* The value "conditionality_unresolved" indicates that no resolution has
+* been achieved so far. If the variable is set to this value at the end of
+* the processing of the whole shader it also indicates a conditional write.
+*
+* The value "write_is_conditional" marks that the variable is written
+* conditionally (i.e. not in all relevant IF/ELSE code path pairs) in at
+* least one loop.
+*/
+   int conditionality_in_loop_id;
+
+   /* Helper constants to make the tracking code more readable. */
+   static const int write_is_conditional = -1;
+   static const int conditionality_unresolved = 0;
+   static const int conditionality_untouched;
+
+   /* A bit field tracking the nesting levels of if-else clauses where the
+* temporary has (so far) been written to in the if branch, but not in the
+* else branch.
+*/
+   unsigned int if_scope_write_flags;
+
+   int next_ifelse_nesting_depth;
+   static const int supported_ifelse_nesting_depth = 32;
+
+   /* Tracks the last if scope in which the temporary was written to
+* without a write in the corresponding else branch. Is also used
+* to track read-before-write in the according scope.
+*/
+   const prog_scope *current_unpaired_if_write_scope;
+
+   /* Flag to resolve read-before-write in the else scope. */
+   bool was_written_in_current_else_scope;
 };
 
+const int
+temp_comp_access::conditionality_untouched = numeric_limits::max();
+
+/* Class to track the access to all components of a temporary register. */
 class temp_access {
 public:
temp_access();
@@ -258,6 +318,32 @@ const prog_scope *prog_scope::outermost_loop() const
return loop;
 }
 
+bool prog_scope::is_child_of_ifelse_id_sibling(const prog_scope *scope) const
+{
+   const prog_scope *my_parent = in_parent_ifelse_scope();
+   while (my_parent) {
+  /* is a direct child? */
+  if (my_parent == scope)
+ return false;
+  /* is a child of the conditions sibling? */
+  if (my_parent->id() == scope->id())
+ return true;
+  my_parent = my_parent->in_pare

[Mesa-dev] [PATCH v2 07/10] mesa/st/tests: Add tests for improved tracking of temporaries

2017-10-17 Thread Gert Wollny

Additional tests are added that check the tracking of access to temporaries
in if-else branches.

Signed-off-by: Gert Wollny 
---
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 493 -
 1 file changed, 486 insertions(+), 7 deletions(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 908791fbf6..5b6c40ffec 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -270,7 +270,7 @@ TEST_F(LifetimeEvaluatorExactTest, MoveInIfInNestedLoop)
  * - value must survive from first write to last read in loop
  * for now we only check that the minimum life time is correct.
  */
-TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseInLoop)
 {
const vector code = {
   { TGSI_OPCODE_MOV, {1}, {in0}, {}},
@@ -312,6 +312,137 @@ TEST_F(LifetimeEvaluatorExactTest, 
WriteInIfAndElseReadInElseInLoop)
run (code, expectation({{-1,-1}, {0,9}, {1,9}, {7,10}}));
 }
 
+
+/* Test that a write in ELSE path only in loop is properly tracked:
+ * In loop if/else value written in else path and read outside
+ * - value must survive the whole loop.
+ */
+TEST_F(LifetimeEvaluatorExactTest, WriteInElseReadInLoop)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+  { TGSI_OPCODE_BGNLOOP },
+  {   TGSI_OPCODE_IF, {}, {1}, {}},
+  { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+  {   TGSI_OPCODE_ELSE },
+  { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+  {   TGSI_OPCODE_ENDIF},
+  {   TGSI_OPCODE_UADD, {1}, {3,in1}, {}},
+  { TGSI_OPCODE_ENDLOOP },
+  { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,9}, {1,8}, {1,8}}));
+}
+
+/* Test that tracking a second write in an ELSE path is not attributed
+ * to the IF path: In loop if/else value written in else path twice and
+ * read outside - value must survive the whole loop
+ */
+TEST_F(LifetimeEvaluatorExactTest, WriteInElseTwiceReadInLoop)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+  { TGSI_OPCODE_BGNLOOP },
+  {   TGSI_OPCODE_IF, {}, {1}, {}},
+  { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+  {   TGSI_OPCODE_ELSE },
+  { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+  { TGSI_OPCODE_ADD, {3}, {1,3}, {}},
+  {   TGSI_OPCODE_ENDIF},
+  {   TGSI_OPCODE_UADD, {1}, {3,in1}, {}},
+  { TGSI_OPCODE_ENDLOOP },
+  { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,10}, {1,9}, {1,9}}));
+}
+
+/* Test that the IF and ELSE scopes from different IF/ELSE pairs are not
+ * merged: In loop if/else value written in if, and then in different else path
+ * and read outside - value must survive the whole loop
+ */
+TEST_F(LifetimeEvaluatorExactTest, WriteInOneIfandInAnotherElseInLoop)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+  { TGSI_OPCODE_BGNLOOP },
+  {   TGSI_OPCODE_IF, {}, {1}, {}},
+  { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+  {   TGSI_OPCODE_ENDIF},
+  {   TGSI_OPCODE_IF, {}, {1}, {}},
+  {   TGSI_OPCODE_ELSE },
+  { TGSI_OPCODE_ADD, {2}, {1,1}, {}},
+  {   TGSI_OPCODE_ENDIF},
+  {   TGSI_OPCODE_UADD, {1}, {2,in1}, {}},
+  { TGSI_OPCODE_ENDLOOP },
+  { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,11}, {1,10}}));
+}
+
+/* Test that with a new loop the resolution of the IF/ELSE write conditionality
+ * is restarted: In first loop value is written in both if and else, in second
+ * loop value is written only in if - must survive the second loop.
+ * However, the tracking is currently not able to restrict the lifetime
+ * in the first loop, hence the "AtLeast" test.
+ */
+TEST_F(LifetimeEvaluatorAtLeastTest, 
UnconditionalInFirstLoopConditionalInSecond)
+{
+   const vector code = {
+  { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+  { TGSI_OPCODE_BGNLOOP },
+  {   TGSI_OPCODE_IF, {}, {1}, {}},
+  { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+  {   TGSI_OPCODE_ELSE },
+  { TGSI_OPCODE_UADD, {2}, {1,in1}, {}},
+  {   TGSI_OPCODE_ENDIF},
+  { TGSI_OPCODE_ENDLOOP },
+  { TGSI_OPCODE_BGNLOOP },
+  {   TGSI_OPCODE_IF, {}, {1}, {}},
+  { TGSI_OPCODE_ADD, {2}, {in0,1}, {}},
+  {   TGSI_OPCODE_ENDIF},
+  {   TGSI_OPCODE_UADD, {1}, {2,in1}, {}},
+  { TGSI_OPCODE_ENDLOOP },
+  { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+  { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1}, {0,14}, {3,13}}));
+}
+
+/* Test that with a new loop the resolution of the IF/ELSE write conditionality
+ * is restarted, and also takes care of write before read in else scope:
+ * In first loop value is written in both if and else, in second loop v

[Mesa-dev] seems meson vulkan build is currently broken on travis

2017-10-17 Thread Gert Wollny

While testing my own patches I saw that the meson/vulkan-specific build
failed on travis:

https://travis-ci.org/gerddie/mesa/builds/288995180

To check that it is not related to my changes I also ran that specific
build with the latest git master (35c66f3e4017), and it failed as well.

https://travis-ci.org/gerddie/mesa/jobs/289010988

Build log tail: 

/home/travis/build/gerddie/mesa/_build/../src/amd/vulkan/radv_device.c:301: undefined reference to `radv_instance_extension_supported'
src/amd/vulkan/vulkan_radeon@sha/radv_device.c.o: In function `radv_GetPhysicalDeviceProperties':
/home/travis/build/gerddie/mesa/_build/../src/amd/vulkan/radv_device.c:635: undefined reference to `radv_physical_device_api_version'
src/amd/vulkan/vulkan_radeon@sha/radv_device.c.o: In function `radv_CreateDevice':
/home/travis/build/gerddie/mesa/_build/../src/amd/vulkan/radv_device.c:926: undefined reference to `radv_physical_device_extension_supported'
collect2: error: ld returned 1 exit status
[339/404] Compiling C++ object 'src/intel/compiler/intel_compiler@sta/brw_vec4_visitor.cpp.o'.
ninja: build stopped: subcommand failed.


Best, 
Gert 

Re: [Mesa-dev] [PATCH v3 25/43] compiler: Mark when input/ouput attribute at VS uses 16-bit

2017-10-17 Thread Chema Casanova

On 15/10/17 12:00, Pohjolainen, Topi wrote:
> On Thu, Oct 12, 2017 at 08:38:14PM +0200, Jose Maria Casanova Crespo wrote:
>> New shader attribute to mark when a location has 16-bit
>> value. This patch includes support on mesa glsl and nir.
>> ---
>>  src/compiler/glsl_types.h  | 24 
>>  src/compiler/nir/nir_gather_info.c | 23 ---
>>  src/compiler/nir_types.cpp |  6 ++
>>  src/compiler/nir_types.h   |  1 +
>>  src/compiler/shader_info.h |  2 ++
>>  5 files changed, 49 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
>> index 32399df351..d05e612e66 100644
>> --- a/src/compiler/glsl_types.h
>> +++ b/src/compiler/glsl_types.h
>> @@ -93,6 +93,13 @@ static inline bool glsl_base_type_is_integer(enum 
>> glsl_base_type type)
>>type == GLSL_TYPE_IMAGE;
>>  }
>>  
>> +static inline bool glsl_base_type_is_16bit(enum glsl_base_type type)
>> +{
>> +   return type == GLSL_TYPE_FLOAT16 ||
>> +  type == GLSL_TYPE_UINT16 ||
>> +  type == GLSL_TYPE_INT16;
>> +}
>> +
>>  enum glsl_sampler_dim {
>> GLSL_SAMPLER_DIM_1D = 0,
>> GLSL_SAMPLER_DIM_2D,
>> @@ -546,6 +553,15 @@ struct glsl_type {
>>return is_64bit() && vector_elements > 2;
>> }
>>  
>> +
>> +   /**
>> +* Query whether a 16-bit type takes half slots.
>> +*/
>> +   bool is_half_slot() const
> 
> I haven't checked later patches but here at least I'm wondering why we need
> two functionally identical helpers with different names, i.e., is_half_slot()
> and is_16bit().

It is true that, at this moment, any use of is_half_slot could be
replaced directly with is_16bit.

So removing is_half_slot would make the code easier to understand. In
the end, the only idea behind having two names was to use the concept of
half slots when tracking the locations of 16-bit input attributes at the
VS, similar to what was done with dual slots for 64-bit attributes
(64-bit && (vec3 || vec4)).

After thinking about it, it would also be clearer to keep is_16bit as a
helper for future uses. But in the particular case of checking half
slots we could just use:

(glsl_get_bit_size(glsl_without_array(var->type)) == 16)


In this case what really matters is that we have 16-bit values, so we
need to unshuffle them, regardless of the fact that 16-bit values use
half of a slot.

>> +   {
>> +  return is_16bit();
>> +   }
>> +
>> /**
>>  * Query whether or not a type is 64-bit
>>  */
>> @@ -555,6 +571,14 @@ struct glsl_type {
>> }
>>  
>> /**
>> +* Query whether or not a type is 16-bit
>> +*/
>> +   bool is_16bit() const
>> +   {
>> +  return glsl_base_type_is_16bit(base_type);
>> +   }
>> +
>> +   /**
>>  * Query whether or not a type is a non-array boolean type
>>  */
>> bool is_boolean() const
>> diff --git a/src/compiler/nir/nir_gather_info.c 
>> b/src/compiler/nir/nir_gather_info.c
>> index ac87bec46c..c7f8ff29cb 100644
>> --- a/src/compiler/nir/nir_gather_info.c
>> +++ b/src/compiler/nir/nir_gather_info.c
>> @@ -212,14 +212,22 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
>> nir_shader *shader)
>>   if (!try_mask_partial_io(shader, instr->variables[0]))
>>  mark_whole_variable(shader, var);
>>  
>> - /* We need to track which input_reads bits correspond to a
>> -  * dvec3/dvec4 input attribute */
>> + /* We need to track which input_reads bits correspond to
>> +  * dvec3/dvec4 or 16-bit  input attributes */
>>   if (shader->stage == MESA_SHADER_VERTEX &&
>> - var->data.mode == nir_var_shader_in &&
>> - glsl_type_is_dual_slot(glsl_without_array(var->type))) {
>> -for (uint i = 0; i < glsl_count_attribute_slots(var->type, 
>> false); i++) {
>> -   int idx = var->data.location + i;
>> -   shader->info.double_inputs_read |= BITFIELD64_BIT(idx);
>> + var->data.mode == nir_var_shader_in) {
>> +if (glsl_type_is_dual_slot(glsl_without_array(var->type))) {
>> +   for (uint i = 0; i < glsl_count_attribute_slots(var->type, 
>> false); i++) {
>> +  int idx = var->data.location + i;
>> +  shader->info.double_inputs_read |= BITFIELD64_BIT(idx);
>> +   }
>> +} else {
>> +   if (glsl_type_is_half_slot(glsl_without_array(var->type))) {
> 
> This could be:
> 
>} else if 
> (glsl_type_is_half_slot(glsl_without_array(var->type))) {
> 
> allowing us to reduce indentation in the block.

I will also apply the change proposed above here:

} else if (glsl_get_bit_size(glsl_without_array(var->type)) == 16) {

I'm sending a v2 of this patch with these changes.
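
To make the resulting control flow concrete, here is a minimal standalone
sketch of the input tracking discussed above. It does not use the Mesa
types: struct sketch_input, struct sketch_info and the half_inputs_read
field name are assumptions made only for illustration; only the dual-slot
rule (64-bit with more than two elements) and the bit-size check come from
the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a VS input variable; not the Mesa types. */
struct sketch_input {
   unsigned bit_size;    /* 16, 32 or 64 */
   unsigned components;  /* vector elements, 1..4 */
   unsigned location;    /* first attribute slot */
   unsigned num_slots;   /* attribute slots occupied */
};

struct sketch_info {
   uint64_t double_inputs_read;
   uint64_t half_inputs_read;  /* assumed name for the 16-bit mask */
};

/* Mirrors the dual-slot rule quoted above: 64-bit with more than 2 elements. */
static bool
sketch_is_dual_slot(const struct sketch_input *in)
{
   return in->bit_size == 64 && in->components > 2;
}

/* Sets one bit per occupied slot in the mask that matches the input type. */
static void
sketch_mark_vs_input(struct sketch_info *info, const struct sketch_input *in)
{
   if (sketch_is_dual_slot(in)) {
      for (unsigned i = 0; i < in->num_slots; i++)
         info->double_inputs_read |= UINT64_C(1) << (in->location + i);
   } else if (in->bit_size == 16) {
      for (unsigned i = 0; i < in->num_slots; i++)
         info->half_inputs_read |= UINT64_C(1) << (in->location + i);
   }
}
```

With the `else if` form the two cases sit at the same indentation level, and
inputs that are neither dual-slot nor 16-bit leave both masks untouched.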

>> +  for (uint i = 0; i < 
>> glsl_count_attribute_slots(var->type, false); i++) {
>> + int idx = var->data.location + i;

Re: [Mesa-dev] [PATCH] [AMD] dri3: Add adaptive_sync_enable driconf option

2017-10-17 Thread Michel Dänzer

On 17/10/17 01:18 PM, Nicolai Hähnle wrote:
> On 17.10.2017 12:41, Michel Dänzer wrote:
>> On 17/10/17 12:29 PM, Nicolai Hähnle wrote:
>>> On 17.10.2017 12:07, Michel Dänzer wrote:
>>>> On 17/10/17 11:33 AM, Nicolai Hähnle wrote:
>>>>> From: Nicolai Hähnle 
>>>>>
>>>>> When enabled, this will request FreeSync via the hybrid amdgpu DDX's
>>>>> AMDGPU X11 protocol extension.
>>>>>
>>>>> Due to limitations in the DDX this will only work for applications
>>>>> that cover the entire X screen (which is important to keep in mind
>>>>> when you have a multi-monitor setup).
>>>>
>>>> This limitation already applies to page flipping in general, it's not
>>>> specific to FreeSync.
>>>
>>> Maybe it shouldn't apply in general? After all, what if you have one
>>> monitor with 60Hz and one with 90Hz?
>>
>> Page flipping can be used in that case. The application's buffer swaps
>> will be synchronized to one or the other CRTC, by default the one where
>> the largest part of the window is visible on. On the other CRTC(s), each
>> flip may take effect slightly earlier or later than on the
>> synchronization CRTC.
> 
> Hmm. I seem to have lost the plot and/or we're talking past each other.
> Maybe we need to go back to square one?
> 
> To rephrase the point that I was trying to get across: When you have a
> multi-monitor setup, and your application window covers (only) one of
> these monitors, then FreeSync will *not* be enabled.
> 
> At least that's how I understand the current DDX implementation, and I
> think it's surprising enough that it should be pointed out explicitly.

I'm pointing out that the reason for this is that it's a limitation of
page flipping, and adaptive sync can only work with page flipping.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH v2 1/8] egl: add dri2_egl_surface_free_outdated_buffers_and_update_size() helper (v2)

2017-10-17 Thread Emil Velikov

Hi Gwan-gyeong,

On 6 October 2017 at 22:38, Gwan-gyeong Mun  wrote:
> To share the common code that frees outdated buffers and updates the size.
> This compares the width and height arguments with the current EGL surface
> dimensions; if they differ, it frees the local buffers and updates the
> dimensions.
>
> In preparation for adding a new platform which uses this helper.
>
> v2: Fixes from Eric's review:
>a) Split out series of refactor for helpers to a separate series.
>b) Add the new helper function and use them to replace the old code in the
>   same patch.
>
> Signed-off-by: Mun Gwan-gyeong 

The name dri2_egl_surface_free_outdated_buffers_and_update_size might
be a bit long/too verbose, but I'm out of ideas for an alternative.
For the patch:
Reviewed-by: Emil Velikov 

Side note:
We should be able to reuse this for platform_wayland, in the future.

-Emil

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Michel Dänzer

On 17/10/17 01:04 PM, Nicolai Hähnle wrote:
> On 17.10.2017 12:28, Michel Dänzer wrote:
>> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
>>>
>>> Common sense suggests that there need to be two sides to FreeSync / VESA
>>> Adaptive Sync support:
>>>
>>> 1. Query the display capabilities. This means querying minimum / maximum
>>> refresh duration, plus possibly a query for when the earliest/latest
>>> timing of the *next* refresh.
>>>
>>> 2. Signal desired present time. This means passing a target timer value
>>> instead of a target vblank count, e.g. something like this for the KMS
>>> interface:
>>>
>>>    int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
>>>    uint32_t flags, void *user_data,
>>>    uint64_t target);
>>>
>>>    + a flag to indicate whether target is the vblank count or the
>>> CLOCK_MONOTONIC (?) time in ns.
>>
>> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
>> sync should probably only be supported via the atomic API, presumably
>> via output properties.
> 
> Time for xf86-video-amdgpu to grow atomic support, then? ;)

Yes, that will likely be part of an upstreamable solution. There are
already patches for this for the modesetting driver; adapting those
might not be that hard.


> How is a target vblank count communicated via the atomic API? Or is that
> simply not part of the design and up to user space?

From Daniel's followup it sounds like there's no support for this yet in
the atomic API, but I'm assuming it would be communicated via
properties, as is the case for most things in the atomic API.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] seems meson vulkan build is currently broken on travis

2017-10-17 Thread Eric Engestrom

On Tuesday, 2017-10-17 13:03:28 +, Gert Wollny wrote:
> While testing my own patches I saw that the meson/vulkan-specific build
> failed on travis: 
> 
> https://travis-ci.org/gerddie/mesa/builds/288995180
> 
> To check that it is not related to my changes I also did that specific 
> build with the latest git master (35c66f3e4017) that failed. 
> 
> https://travis-ci.org/gerddie/mesa/jobs/289010988
> 
> Build log tail: 
> 
> /home/travis/build/gerddie/mesa/_build/../src/amd/vulkan/radv_device.c:301: undefined reference to `radv_instance_extension_supported'
> 
> src/amd/vulkan/vulkan_radeon@sha/radv_device.c.o: In function `radv_GetPhysicalDeviceProperties':
> 
> /home/travis/build/gerddie/mesa/_build/../src/amd/vulkan/radv_device.c:635: undefined reference to `radv_physical_device_api_version'
> 
> src/amd/vulkan/vulkan_radeon@sha/radv_device.c.o: In function `radv_CreateDevice':
> 
> /home/travis/build/gerddie/mesa/_build/../src/amd/vulkan/radv_device.c:926: undefined reference to `radv_physical_device_extension_supported'
> 
> collect2: error: ld returned 1 exit status
> 
> [339/404] Compiling C++ object 'src/intel/compiler/intel_compiler@sta/brw_vec4_visitor.cpp.o'.
> 
> ninja: build stopped: subcommand failed.
> 

Indeed, and I sent a fix to the list this morning [1] [2], you're
welcome to review it :)

[1] https://patchwork.freedesktop.org/patch/183177/
[2] https://lists.freedesktop.org/archives/mesa-dev/2017-October/173092.html

> 
> Best, 
> Gert 

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Michel Dänzer

On 17/10/17 02:22 PM, Daniel Vetter wrote:
> On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
>> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> 
>>> Common sense suggests that there need to be two sides to FreeSync / VESA
>>> Adaptive Sync support:
>>>
>>> 1. Query the display capabilities. This means querying minimum / maximum
>>> refresh duration, plus possibly a query for when the earliest/latest
>>> timing of the *next* refresh.
>>>
>>> 2. Signal desired present time. This means passing a target timer value
>>> instead of a target vblank count, e.g. something like this for the KMS
>>> interface:
>>>
>>>   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
>>>   uint32_t flags, void *user_data,
>>>   uint64_t target);
>>>
>>>   + a flag to indicate whether target is the vblank count or the
>>> CLOCK_MONOTONIC (?) time in ns.
>>
>> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
>> sync should probably only be supported via the atomic API, presumably
>> via output properties.
> 
> +1
> 
> At least now that DC is on track to land properly, and you want to do this
> for DC-only anyway there's no reason to pimp the legacy interfaces
> further. And atomic is soo much easier to extend.
> 
> The big question imo is where we need to put the flag on the kms side,
> since freesync is not just about presenting earlier, but also about
> presenting later. But for backwards compat we can't stretch the refresh
> rate by default for everyone, or clients that rely on high precision
> timestamps and regular refresh will get a bad surprise.

The idea described above is that adaptive sync would be used for flips
with a target timestamp. Apps which don't want to use adaptive sync
wouldn't set a target timestamp.
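
As a rough illustration of how a target timestamp could interact with the
display limits from the capability query, here is a minimal sketch. Nothing
in it is an existing KMS interface: struct sketch_vrr_range and
sketch_clamp_present_ns are hypothetical names, and clamping a stale or
too-late target into the allowed window is only one possible policy, not a
decided semantic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical display limits; real values would come from a capability query. */
struct sketch_vrr_range {
   uint64_t min_refresh_ns;   /* shortest allowed time between refreshes */
   uint64_t max_refresh_ns;   /* longest allowed time between refreshes */
};

/* Clamps the requested present time into the window the display allows
 * after the previous refresh. All times are in nanoseconds on one clock.
 */
static uint64_t
sketch_clamp_present_ns(const struct sketch_vrr_range *range,
                        uint64_t last_refresh_ns, uint64_t target_ns)
{
   uint64_t earliest = last_refresh_ns + range->min_refresh_ns;
   uint64_t latest = last_refresh_ns + range->max_refresh_ns;

   if (target_ns < earliest)
      return earliest;   /* stale or too early: present as soon as possible */
   if (target_ns > latest)
      return latest;     /* the display cannot hold the frame any longer */
   return target_ns;
}
```

A flip without a target timestamp would simply not go through such a path
and would keep the regular fixed refresh behaviour.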


> I think a boolean enable_freesync property is probably what we want, which
> enables freesync for as long as it's set.

The question then becomes under what circumstances the property is (not)
set. Not sure offhand this will actually solve any problem, or just push
it somewhere else.


> Finally I'm not sure we want to insist on a target time for freesync. At
> least as far as I understand things you just want "as soon as possible".
> This might change with some of the VK/EGL/GLX extensions where you
> specify a precise timing (media playback). But that needs a bit more work
> to make it happen I think, so perhaps better to postpone.

I don't see why. There's an obvious use case for this now, for video
playback. At least VDPAU already has target timestamps for this.


> Also note that right now no driver except amdgpu has support for a target
> vblank on a flip. That's imo another reason for not requiring target
> support for at least basic freesync support.

I think that's a bad reason. :) Adding it for atomic drivers shouldn't
be that hard.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH v2 2/8] egl: refactor color_buffers structure for deduplicating

2017-10-17 Thread Emil Velikov

On 6 October 2017 at 22:38, Gwan-gyeong Mun  wrote:
> This avoids adding a new color buffers structure and back* each time a
> new platform backend is added.
> This refactoring separates out the common and platform-specific bits.
> It introduces some odd casting in platform_foo.c, but it avoids adding
> new ifdef magic.
> Because the color_buffers array size differs between Android and
> Wayland/DRM, it adds a COLOR_BUFFERS_SIZE macro:
>  - Android's color buffers array size is 3.
>  - DRM and Wayland's color buffers array size is 4.
>
> Fixes from Rob's review:
>  - refactor to separate out the common and platform specific bits.
>
> Fixes from Emil's review:
>  - use suggested color buffers structure shape.
>it makes a buffer pointer of each platform to void pointer type.
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/egl_dri2.h | 30 +-
>  src/egl/drivers/dri2/platform_android.c | 10 +++---
>  src/egl/drivers/dri2/platform_drm.c | 55 
> +
>  src/egl/drivers/dri2/platform_wayland.c | 46 +--
>  4 files changed, 71 insertions(+), 70 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
> index 017895f0d9..08ccf24410 100644
> --- a/src/egl/drivers/dri2/egl_dri2.h
> +++ b/src/egl/drivers/dri2/egl_dri2.h
> @@ -65,6 +65,15 @@ struct zwp_linux_dmabuf_v1;
>
>  #endif /* HAVE_ANDROID_PLATFORM */
>
> +#if defined(HAVE_WAYLAND_PLATFORM) || defined(HAVE_DRM_PLATFORM)
The guard should be: if Android -> 3, otherwise 4.

> +#define COLOR_BUFFERS_SIZE 4
> +#else
> +   /* Usually Android uses at most triple buffers in ANativeWindow
> +* so hardcode the number of color_buffers to 3.
> +*/
Props for keeping the comment around. Please drop the indentation.
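
A minimal sketch of how the reworked guard might look, assuming
HAVE_ANDROID_PLATFORM is the macro to test (it is the one used earlier in
this header). The struct sketch_color_buffer type is a trimmed-down,
hypothetical stand-in that is only there to show the array sizing.

```c
#include <assert.h>

#ifdef HAVE_ANDROID_PLATFORM
/* Usually Android uses at most triple buffers in ANativeWindow
 * so hardcode the number of color_buffers to 3.
 */
#define COLOR_BUFFERS_SIZE 3
#else
#define COLOR_BUFFERS_SIZE 4
#endif

/* Hypothetical, trimmed-down slot type; the real struct has more fields. */
struct sketch_color_buffer {
   void *native_buffer;
   int age;
};

static struct sketch_color_buffer sketch_buffers[COLOR_BUFFERS_SIZE];
```

This way any platform that is not Android gets 4 by default, without having
to list each of them in the condition.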

> +#define COLOR_BUFFERS_SIZE 3
> +#endif
> +
>  #include "eglconfig.h"
>  #include "eglcontext.h"
>  #include "egldisplay.h"
> @@ -286,39 +295,28 @@ struct dri2_egl_surface
> /* EGL-owned buffers */
> __DRIbuffer   *local_buffers[__DRI_BUFFER_COUNT];
>
> -#if defined(HAVE_WAYLAND_PLATFORM) || defined(HAVE_DRM_PLATFORM)
> +   /* Used to record all the buffers created by each platform's native window
> +* and their ages.
> +*/
> struct {
> +  void *native_buffer; // aka wl_buffer/gbm_bo/ANativeWindowBuffer
>  #ifdef HAVE_WAYLAND_PLATFORM
I would drop this guard. Sure, it will make the struct a tiny bit larger,
but it will allow us to have more generic and widespread helpers.

The rest of the patch should use a handful of:
 - drop unneeded $native_type -> void * cast
 - create the local native_buffer of $native_type and cast on assignment

Some partial examples follow:

> --- a/src/egl/drivers/dri2/platform_drm.c
> +++ b/src/egl/drivers/dri2/platform_drm.c
> @@ -53,7 +53,7 @@ lock_front_buffer(struct gbm_surface *_surf)
>return NULL;
> }
>
> -   bo = dri2_surf->current->bo;
> +   bo = (struct gbm_bo *)dri2_surf->current->native_buffer;
>
Unneeded cast?

> for (unsigned i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
> -  if (dri2_surf->color_buffers[i].bo)
> -gbm_bo_destroy(dri2_surf->color_buffers[i].bo);
> +  if (dri2_surf->color_buffers[i].native_buffer)
> +gbm_bo_destroy((struct gbm_bo 
> *)dri2_surf->color_buffers[i].native_buffer);
Ditto.

> }
>
> dri2_egl_surface_free_local_buffers(dri2_surf);
> @@ -204,23 +204,24 @@ get_back_bo(struct dri2_egl_surface *dri2_surf)
>
> if (dri2_surf->back == NULL)
>return -1;
> -   if (dri2_surf->back->bo == NULL) {
> +   if (dri2_surf->back->native_buffer == NULL) {
>if (surf->base.modifiers)
> - dri2_surf->back->bo = 
> gbm_bo_create_with_modifiers(&dri2_dpy->gbm_dri->base,
> -surf->base.width,
> -
> surf->base.height,
> -
> surf->base.format,
> -
> surf->base.modifiers,
> -
> surf->base.count);
> + dri2_surf->back->native_buffer =
> +(void *)gbm_bo_create_with_modifiers(&dri2_dpy->gbm_dri->base,
> + surf->base.width,
> + surf->base.height,
> + surf->base.format,
> + surf->base.modifiers,
> + surf->base.count);
>else
> - dri2_surf->back->bo = gbm_bo_create(&dri2_dpy->gbm_dri->base,
> - surf->base.width,
> - surf->base.height,
> - surf->base.format,
> -

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Christian König


On 17.10.2017 at 15:46, Michel Dänzer wrote:
> On 17/10/17 02:22 PM, Daniel Vetter wrote:
[SNIP]
>> Finally I'm not sure we want to insist on a target time for freesync. At
>> least as far as I understand things you just want "as soon as possible".
>> This might change with some of the VK/EGL/GLX extensions where you
>> specify a precise timing (media playback). But that needs a bit more work
>> to make it happen I think, so perhaps better to postpone.
>
> I don't see why. There's an obvious use case for this now, for video
> playback. At least VDPAU already has target timestamps for this.

Applications calculate their frames for a certain point in time.

As far as I know this is very important for any VR application if you
don't want to get seasick.


Regards,
Christian.


Re: [Mesa-dev] [PATCH v2 3/8] egl: add dri2_egl_surface_record_buffers_and_update_back_buffer() helper (v2)

2017-10-17 Thread Emil Velikov

Hi Gwan-gyeong,

There's a small nit but otherwise looks good.

On 6 October 2017 at 22:38, Gwan-gyeong Mun  wrote:
> To share the common code that records buffers and updates the back buffer.
> This records all the buffers created by each platform's native window and
> updates the back buffer for updating the buffer's age in swap_buffers.
>
> In preparation for adding a new platform which uses this helper.
>
> v2:
>  - Remove unnedded ifdef magic
s/unnedded/unneeded/

>  - Fixes from Eric's review:
>a) Split out series of refactor for helpers to a separate series.
>b) Add the new helper function and use them to replace the old code in the
>   same patch.
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/egl_dri2.c | 32 
>  src/egl/drivers/dri2/egl_dri2.h |  5 +
>  src/egl/drivers/dri2/platform_android.c | 25 ++---
>  3 files changed, 39 insertions(+), 23 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index 3c4e525040..3622d18a24 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -1068,6 +1068,38 @@ 
> dri2_egl_surface_free_outdated_buffers_and_update_size(struct 
> dri2_egl_surface *
> }
>  }
>
> +void
> +dri2_egl_surface_record_buffers_and_update_back_buffer(struct 
> dri2_egl_surface *dri2_surf,
> +   void *buffer)
> +{
> +   /* Record all the buffers created by each platform's native window and
> +* update back buffer for updating buffer's age in swap_buffers.
> +*/
> +   EGLBoolean updated = EGL_FALSE;
> +
> +   for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
> +  if (!dri2_surf->color_buffers[i].native_buffer) {
> + dri2_surf->color_buffers[i].native_buffer = buffer;
> + dri2_surf->color_buffers[i].age = 0;
"age" seems like a new addition. It's correct to have it, but I'd
keep that as a separate patch.

Also I would set "updated" and bail out at this point. There's no
point in continuing to loop.
If you want to address that (it's optional), please keep it as a separate patch.
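
A minimal sketch of the early exit, assuming the loop only needs the first
free slot. The types and the function name are hypothetical stand-ins, and
the rest of the real helper (back buffer update and so on) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one color_buffers entry. */
struct sketch_slot {
   void *native_buffer;
};

/* Records buffer in the first free slot; returns whether a slot was found. */
static bool
sketch_record_buffer(struct sketch_slot *slots, size_t count, void *buffer)
{
   bool updated = false;

   for (size_t i = 0; i < count; i++) {
      if (!slots[i].native_buffer) {
         slots[i].native_buffer = buffer;
         updated = true;
         break;   /* slot found, no point in continuing to loop */
      }
   }

   return updated;
}
```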

With the age line dropped and typo fixed
Reviewed-by: Emil Velikov 

-Emil

Re: [Mesa-dev] [PATCH v2 8/8] egl/wayland: add dri2_wl_free_buffers() helper

2017-10-17 Thread Eric Engestrom

On Friday, 2017-10-06 21:38:35 +, Gwan-gyeong Mun wrote:
> This deduplicates free routines of color_buffers array.
> 
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/platform_wayland.c | 60 
> +
>  1 file changed, 31 insertions(+), 29 deletions(-)
> 
> diff --git a/src/egl/drivers/dri2/platform_wayland.c 
> b/src/egl/drivers/dri2/platform_wayland.c
> index 1518a24b7c..cfe474cf58 100644
> --- a/src/egl/drivers/dri2/platform_wayland.c
> +++ b/src/egl/drivers/dri2/platform_wayland.c
> @@ -253,6 +253,35 @@ dri2_wl_create_pixmap_surface(_EGLDriver *drv, 
> _EGLDisplay *disp,
> return NULL;
>  }
>  
> +static void
> +dri2_wl_free_buffers(struct dri2_egl_surface *dri2_surf, bool check_lock)
> +{
> +   struct dri2_egl_display *dri2_dpy =
> +  dri2_egl_display(dri2_surf->base.Resource.Display);
> +
> +   for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
> +  if (dri2_surf->color_buffers[i].native_buffer) {
> + if (check_lock && !dri2_surf->color_buffers[i].locked)
> +wl_buffer_destroy((struct wl_buffer 
> *)dri2_surf->color_buffers[i].native_buffer);
> + else
> +wl_buffer_destroy((struct wl_buffer 
> *)dri2_surf->color_buffers[i].native_buffer);

Both branches have the same code, should be a hint :P
I think this should be:

   if (!check_lock || !dri2_surf->color_buffers[i].locked) {
  wl_buffer_destroy((struct wl_buffer 
*)dri2_surf->color_buffers[i].native_buffer);
  dri2_surf->color_buffers[i].native_buffer = NULL;
   }

without an `else`. You also want to remove the `native_buffer = NULL`
from below, to avoid leaking locked buffers :)
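
To show the intended behaviour in isolation, here is a small standalone
sketch of that loop. The types, the destroy counter and the function names
are hypothetical stand-ins for the Wayland ones; only the condition and the
placement of the NULL assignment follow the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one color_buffers entry. */
struct sketch_slot {
   void *native_buffer;
   bool locked;
};

static int sketch_destroyed;   /* counts destroy calls, for illustration */

static void
sketch_destroy(void *buffer)
{
   (void)buffer;
   sketch_destroyed++;
}

static void
sketch_free_buffers(struct sketch_slot *slots, size_t count, bool check_lock)
{
   for (size_t i = 0; i < count; i++) {
      if (slots[i].native_buffer &&
          (!check_lock || !slots[i].locked)) {
         sketch_destroy(slots[i].native_buffer);
         slots[i].native_buffer = NULL;
      }
      /* A locked buffer that was skipped keeps its pointer, so it can
       * still be destroyed later instead of being leaked.
       */
   }
}
```

With check_lock set, locked buffers survive the call with their pointer
intact; without it, every recorded buffer is destroyed.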

> +  }
> +  if (dri2_surf->color_buffers[i].dri_image)
> + 
> dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].dri_image);
> +  if (dri2_surf->color_buffers[i].linear_copy)
> + 
> dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].linear_copy);
> +  if (dri2_surf->color_buffers[i].data)
> + munmap(dri2_surf->color_buffers[i].data,
> +dri2_surf->color_buffers[i].data_size);
> +
> +  dri2_surf->color_buffers[i].native_buffer = NULL;
> +  dri2_surf->color_buffers[i].dri_image = NULL;
> +  dri2_surf->color_buffers[i].linear_copy = NULL;
> +  dri2_surf->color_buffers[i].data = NULL;
> +  dri2_surf->color_buffers[i].locked = false;
> +   }
> +}
> +
>  /**
>   * Called via eglDestroySurface(), drv->API.DestroySurface().
>   */
> @@ -266,17 +295,7 @@ dri2_wl_destroy_surface(_EGLDriver *drv, _EGLDisplay 
> *disp, _EGLSurface *surf)
>  
> dri2_dpy->core->destroyDrawable(dri2_surf->dri_drawable);
>  
> -   for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
> -  if (dri2_surf->color_buffers[i].native_buffer)
> - wl_buffer_destroy((struct wl_buffer 
> *)dri2_surf->color_buffers[i].native_buffer);
> -  if (dri2_surf->color_buffers[i].dri_image)
> - 
> dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].dri_image);
> -  if (dri2_surf->color_buffers[i].linear_copy)
> - 
> dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].linear_copy);
> -  if (dri2_surf->color_buffers[i].data)
> - munmap(dri2_surf->color_buffers[i].data,
> -dri2_surf->color_buffers[i].data_size);
> -   }
> +   dri2_wl_free_buffers(dri2_surf, false);
>  
> if (dri2_dpy->dri2)
>dri2_egl_surface_free_local_buffers(dri2_surf);
> @@ -308,24 +327,7 @@ dri2_wl_release_buffers(struct dri2_egl_surface 
> *dri2_surf)
> struct dri2_egl_display *dri2_dpy =
>dri2_egl_display(dri2_surf->base.Resource.Display);
>  
> -   for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
> -  if (dri2_surf->color_buffers[i].native_buffer &&
> -  !dri2_surf->color_buffers[i].locked)
> - wl_buffer_destroy((struct wl_buffer 
> *)dri2_surf->color_buffers[i].native_buffer);
> -  if (dri2_surf->color_buffers[i].dri_image)
> - 
> dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].dri_image);
> -  if (dri2_surf->color_buffers[i].linear_copy)
> - 
> dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].linear_copy);
> -  if (dri2_surf->color_buffers[i].data)
> - munmap(dri2_surf->color_buffers[i].data,
> -dri2_surf->color_buffers[i].data_size);
> -
> -  dri2_surf->color_buffers[i].native_buffer = NULL;
> -  dri2_surf->color_buffers[i].dri_image = NULL;
> -  dri2_surf->color_buffers[i].linear_copy = NULL;
> -  dri2_surf->color_buffers[i].data = NULL;
> -  dri2_surf->color_buffers[i].locked = false;
> -   }
> +   dri2_wl_free_buffers(dri2_surf, true);
>  
> if (dri2_dpy->dri2)
>dri2_egl_surface_free_local_buffers(dri2_surf);
> -- 
> 2.14.2
> 

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Ville Syrjälä

On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote:
> On 17/10/17 02:22 PM, Daniel Vetter wrote:
> > On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
> >> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> > 
> >>> Common sense suggests that there need to be two sides to FreeSync / VESA
> >>> Adaptive Sync support:
> >>>
> >>> 1. Query the display capabilities. This means querying minimum / maximum
> >>> refresh duration, plus possibly a query for when the earliest/latest
> >>> timing of the *next* refresh.
> >>>
> >>> 2. Signal desired present time. This means passing a target timer value
> >>> instead of a target vblank count, e.g. something like this for the KMS
> >>> interface:
> >>>
> >>>   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
> >>>   uint32_t flags, void *user_data,
> >>>   uint64_t target);
> >>>
> >>>   + a flag to indicate whether target is the vblank count or the
> >>> CLOCK_MONOTONIC (?) time in ns.
> >>
> >> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
> >> sync should probably only be supported via the atomic API, presumably
> >> via output properties.
> > 
> > +1
> > 
> > At least now that DC is on track to land properly, and you want to do this
> > for DC-only anyway there's no reason to pimp the legacy interfaces
> > further. And atomic is soo much easier to extend.
> > 
> > The big question imo is where we need to put the flag on the kms side,
> > since freesync is not just about presenting earlier, but also about
> > presenting later. But for backwards compat we can't stretch the refresh
> > rate by default for everyone, or clients that rely on high precision
> > timestamps and regular refresh will get a bad surprise.
> 
> The idea described above is that adaptive sync would be used for flips
> with a target timestamp. Apps which don't want to use adaptive sync
> wouldn't set a target timestamp.
> 
> 
> > I think a boolean enable_freesync property is probably what we want, which
> > enables freesync for as long as it's set.
> 
> The question then becomes under what circumstances the property is (not)
> set. Not sure offhand this will actually solve any problem, or just push
> it somewhere else.
> 
> 
> > Finally I'm not sure we want to insist on a target time for freesync. At
> > least as far as I understand things you just want "as soon as possible".
> > This might change with some of the VK/EGL/GLX extensions where you
> > specify a precise timing (media playback). But that needs a bit more work
> > to make it happen I think, so perhaps better to postpone.
> 
> I don't see why. There's an obvious use case for this now, for video
> playback. At least VDPAU already has target timestamps for this.
> 
> 
> > Also note that right now no driver except amdgpu has support for a target
> > vblank on a flip. That's imo another reason for not requiring target
> > support for at least basic freesync support.
> 
> I think that's a bad reason. :) Adding it for atomic drivers shouldn't
> be that hard.

Apart from the actual implementation hurdles it does open up some new questions:
- Is it going to be per-plane or per-crtc?
- What happens if the target timestamp is already stale?
- What happens if the target timestamp is good when it gets scheduled,
  but can't be met once the fences and whatnot have signalled?
- What happens if another operation is already queued with a more
  recent timestamp?
- Apart from a pure timestamp do we want to move the OML_sync/swap_whatever
  msc remainder etc. semantics into the kernel as well? It's just
  another way to specify the target flip time after all.

I do like the idea, but clearly there's a bit of thought required to
make sure the semantics are good.

-- 
Ville Syrjälä
Intel OTC

[Mesa-dev] [PATCH] vulkan/wsi: Avoid waiting indefinitely for present completion in x11_manage_fifo_queues().

2017-10-17 Thread Henri Verbeet

In particular, if the window was destroyed before the present request
completed, xcb_wait_for_special_event() may never return.

Note that the usage of xcb_poll_for_special_event() requires a version
of libxcb that includes commit fad81b63422105f9345215ab2716c4b804ec7986
to work properly.

Signed-off-by: Henri Verbeet 
---
This applies on top of "vulkan/wsi: Free the event in x11_manage_fifo_queues()."
---
 src/vulkan/wsi/wsi_common_x11.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 22b067b..ceb0d66 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -908,10 +908,14 @@ static void *
 x11_manage_fifo_queues(void *state)
 {
struct x11_swapchain *chain = state;
+   struct pollfd pfds;
VkResult result;
 
assert(chain->base.present_mode == VK_PRESENT_MODE_FIFO_KHR);
 
+   pfds.fd = xcb_get_file_descriptor(chain->conn);
+   pfds.events = POLLIN;
+
while (chain->status == VK_SUCCESS) {
   /* It should be safe to unconditionally block here.  Later in the loop
* we blocks until the previous present has landed on-screen.  At that
@@ -934,9 +938,18 @@ x11_manage_fifo_queues(void *state)
 
   while (chain->last_present_msc < target_msc) {
  xcb_generic_event_t *event =
-xcb_wait_for_special_event(chain->conn, chain->special_event);
- if (!event)
-goto fail;
+xcb_poll_for_special_event(chain->conn, chain->special_event);
+ if (!event) {
+int ret = poll(&pfds, 1, 100);
+if (ret < 0) {
+   result = VK_ERROR_OUT_OF_DATE_KHR;
+   goto fail;
+} else if (chain->status != VK_SUCCESS) {
+   return NULL;
+}
+
+continue;
+ }
 
  result = x11_handle_dri3_present_event(chain, (void *)event);
  free(event);
-- 
2.1.4
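
For readers less familiar with the pattern, the following standalone sketch
shows the same idea without any xcb calls: wait on a file descriptor with a
timeout and re-check a stop condition on every wakeup instead of blocking
indefinitely. The function name and the stop flag are hypothetical; only
POSIX poll() is used.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns 1 if fd became readable, 0 if *stop was set, -1 on error. */
static int
sketch_wait_readable(int fd, const volatile bool *stop, int timeout_ms)
{
   struct pollfd pfd = { .fd = fd, .events = POLLIN };

   while (!*stop) {
      int ret = poll(&pfd, 1, timeout_ms);

      if (ret < 0) {
         if (errno == EINTR)
            continue;   /* interrupted by a signal: just retry */
         return -1;
      }
      if (ret > 0)
         return (pfd.revents & POLLIN) ? 1 : -1;
      /* ret == 0: timed out, loop around and re-check the stop flag. */
   }

   return 0;
}
```

The timeout bounds how long a shutdown request can go unnoticed, at the
cost of one extra wakeup per interval while nothing is happening.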


Re: [Mesa-dev] [PATCH v2 4/8] egl: add dri2_egl_surface_update_buffer_age() helper (v2)

2017-10-17 Thread Emil Velikov

On 6 October 2017 at 22:38, Gwan-gyeong Mun  wrote:
> To share common update buffer age code.
> This updates old buffer's age and sets current back buffer's age to 1.
>
> In preparation for adding a new platform that uses this helper.
>
> v2:
>  - Fixes from Eric's review:
>a) Split out series of refactor for helpers to a separate series.
>b) Add the new helper function and use them to replace the old code in the
>   same patch.
>  - Fixes from Rob's review:
>Remove unneeded ifdef block
>
Reviewed-by: Emil Velikov 

-Emil

Re: [Mesa-dev] [PATCH v2 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-17 Thread Jason Ekstrand

I sent them out yesterday (sorry you weren't on the CC) and Lionel reviewed
them.  They were pushed as:

commit 759ab66db036dd911cb589429eb4dbb3eb4fdc4c
Author: Jason Ekstrand 
Date:   Mon Oct 16 08:50:44 2017 -0700

anv/apply_pipeline_layout: Use nir_tex_instr_remove_src

Reviewed-by: Lionel Landwerlin 

commit 41c75b5354e5d4382786ff853f6f5143a0fe4c6d
Author: Jason Ekstrand 
Date:   Mon Oct 16 08:50:23 2017 -0700

nir: Add a helper for adding texture instruction sources

Reviewed-by: Lionel Landwerlin 


On Mon, Oct 16, 2017 at 3:49 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:

> On Wed, 2017-10-11 at 09:12 +0100, Lionel Landwerlin wrote:
> > On 11/10/17 09:00, Samuel Iglesias Gonsálvez wrote:
> > > On Tuesday, October 10, 2017 4:40:47 PM CEST Lionel Landwerlin
> > > wrote:
> > > > On 10/10/17 14:35, Samuel Iglesias Gonsálvez wrote:
> > > > > Signed-off-by: Samuel Iglesias Gonsálvez 
> > > > > ---
> > > > >
> > > > >   src/compiler/nir/nir_lower_tex.c | 68
> > > > >    1 file changed, 68
> > > > >   insertions(+)
> > > > >
> > > > > diff --git a/src/compiler/nir/nir_lower_tex.c
> > > > > b/src/compiler/nir/nir_lower_tex.c index
> > > > > 65681decb1c..d3380710405 100644
> > > > > --- a/src/compiler/nir/nir_lower_tex.c
> > > > > +++ b/src/compiler/nir/nir_lower_tex.c
> > > > > @@ -717,6 +717,52 @@ linearize_srgb_result(nir_builder *b,
> > > > > nir_tex_instr
> > > > > *tex)>
> > > > > result->parent_instr);
> > > > >
> > > > >   }
> > > > >
> > > > > +static void
> > > > > +set_default_lod(nir_builder *b, nir_tex_instr *tex)
> > > > > +{
> > > > > +   b->cursor = nir_before_instr(&tex->instr);
> > > > > +
> > > > > +   /* We are going to emit the same texture but adding a
> > > > > default LOD.
> > > > > +*/
> > > > > +   int num_srcs = tex->num_srcs + 1;
> > > > > +   nir_tex_instr *new = nir_tex_instr_create(b->shader,
> > > > > num_srcs);
> > > > > +
> > > > > +   new->op = tex->op;
> > > > > +   new->sampler_dim = tex->sampler_dim;
> > > > > +   new->texture_index = tex->texture_index;
> > > > > +   new->dest_type = tex->dest_type;
> > > > > +   new->is_array = tex->is_array;
> > > > > +   new->is_shadow = tex->is_shadow;
> > > > > +   new->is_new_style_shadow = tex->is_new_style_shadow;
> > > > > +   new->sampler_index = tex->sampler_index;
> > > > > +   new->texture = nir_deref_var_clone(tex->texture, new);
> > > > > +   new->sampler = nir_deref_var_clone(tex->sampler, new);
> > > > > +   new->coord_components = tex->coord_components;
> > > >
> > > > There are a couple of fields you're not copying : component &
> > > > texture_array_size
> > > > Not 100% sure whether they need to be.
> > > >
> > >
> > > I added them locally.
> > >
> > > > > +
> > > > > +   nir_ssa_dest_init(&new->instr, &new->dest, 4, 32, NULL);
> > > > > +
> > > > > +   int src_num = 0;
> > > > > +   for (int i = 0; i < tex->num_srcs; i++) {
> > > > > +  nir_src_copy(&new->src[src_num].src, &tex->src[i].src,
> > > > > new);
> > > > > +  new->src[src_num].src_type = tex->src[i].src_type;
> > > > > +  src_num++;
> > > > > +   }
> > > > > +
> > > > > +   new->src[src_num].src = nir_src_for_ssa(nir_imm_int(b, 0));
> > > > > +   new->src[src_num].src_type = nir_tex_src_lod;
> > > >
> > > > I think you could get rid of the src_num variable and just use
> > > > (new->num_srcs - 1) to set the default lod src.
> > > >
> > >
> > > Done.
> > >
> > > Does it get your R-b?
> > >
> > > Thanks,
> > >
> > > Sam
> >
> > Thanks!
> > Although I think Eric has a point about avoiding memcpy(); since I used
> > roughly the same code in the ycbcr anv pass, I'll try to come up with
> > a helper.
> > Patches 1-3 are :
> >
> > Reviewed-by: Lionel Landwerlin 
> >
>
> I forgot to reply before. I will wait for your helper then.
> Please add me in Cc so I am aware when you submit it for review :)
>
> Thanks,
>
> Sam
>
> > > > > +   src_num++;
> > > > > +
> > > > > +   assert(src_num == num_srcs);
> > > > > +
> > > > > +   nir_ssa_dest_init(&new->instr, &new->dest,
> > > > > + tex->dest.ssa.num_components, 32, NULL);
> > > > > +   nir_builder_instr_insert(b, &new->instr);
> > > > > +
> > > > > +   nir_ssa_def_rewrite_uses(&tex->dest.ssa,
> > > > > nir_src_for_ssa(&new->dest.ssa)); +
> > > > > +   nir_instr_remove(&tex->instr);
> > > > > +}
> > > > > +
> > > > >
> > > > >   static bool
> > > > >   nir_lower_tex_block(nir_block *block, nir_builder *b,
> > > > >
> > > > >   const nir_lower_tex_options *options)
> > > > >
> > > > > @@ -813,6 +859,28 @@ nir_lower_tex_block(nir_block *block,
> > > > > nir_builder *b,
> > > > >
> > > > >progress = true;
> > > > >continue;
> > > > >
> > > > > }
> > > > >
> > > > > +
> > > > > +  /* TXF, TXS and TXL require a LOD but not everything we
> > > > > implement
> > > > > using those +   * three opcodes provides one.  Provide a
> > >
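
The review suggestion above (drop the running src_num counter and write the default LOD into the last slot) can be sketched as follows. The types here are hypothetical, simplified stand-ins for the NIR texture instruction and its sources, not the real NIR structures or helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for nir_tex_src / nir_tex_instr. */
enum src_type { SRC_COORD, SRC_LOD };

struct tex_src {
   enum src_type type;
   int value;
};

struct tex_instr {
   unsigned num_srcs;
   struct tex_src *srcs;
};

/* Build a copy of 'tex' with one extra LOD source set to 0. */
static struct tex_instr *
tex_with_default_lod(const struct tex_instr *tex)
{
   struct tex_instr *new_tex = malloc(sizeof(*new_tex));
   assert(new_tex != NULL);

   new_tex->num_srcs = tex->num_srcs + 1;
   new_tex->srcs = calloc(new_tex->num_srcs, sizeof(*new_tex->srcs));
   assert(new_tex->srcs != NULL);

   /* Copy the existing sources in order. */
   for (unsigned i = 0; i < tex->num_srcs; i++)
      new_tex->srcs[i] = tex->srcs[i];

   /* The appended source always lives in the last slot. */
   new_tex->srcs[new_tex->num_srcs - 1] = (struct tex_src){ SRC_LOD, 0 };
   return new_tex;
}
```

Indexing with num_srcs - 1 removes the need for a separate counter and the assert that checked it.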

[Mesa-dev] [PATCH v2] st/mesa: Initialize textures array in st_framebuffer_validate

2017-10-17 Thread Michel Dänzer

From: Michel Dänzer 

And just reference pipe_resources to it in the validate callbacks.

Avoids pipe_resource leaks when st_framebuffer_validate ends up calling
the validate callback multiple times, e.g. when a window is resized.

v2:
* Use generic stable tag instead of Fixes: tag, since the problem could
  already happen before the commit referenced in v1 (Thomas Hellstrom)
* Use memset to initialize the array on the stack instead of allocating
  the array with os_calloc.

Cc: mesa-sta...@lists.freedesktop.org
Reviewed-by: Thomas Hellstrom  # v1
Signed-off-by: Michel Dänzer 
---
 src/gallium/state_trackers/dri/dri_drawable.c | 4 +---
 src/gallium/state_trackers/glx/xlib/xm_st.c   | 4 +---
 src/gallium/state_trackers/hgl/hgl.c  | 4 +---
 src/gallium/state_trackers/osmesa/osmesa.c| 1 +
 src/gallium/state_trackers/wgl/stw_st.c   | 4 +---
 src/mesa/state_tracker/st_manager.c   | 2 ++
 6 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri_drawable.c 
b/src/gallium/state_trackers/dri/dri_drawable.c
index 75a8197d330..d586b7564ef 100644
--- a/src/gallium/state_trackers/dri/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/dri_drawable.c
@@ -99,10 +99,8 @@ dri_st_framebuffer_validate(struct st_context_iface *stctx,
   return TRUE;
 
/* Set the window-system buffers for the state tracker. */
-   for (i = 0; i < count; i++) {
-  out[i] = NULL;
+   for (i = 0; i < count; i++)
   pipe_resource_reference(&out[i], textures[statts[i]]);
-   }
 
return TRUE;
 }
diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.c 
b/src/gallium/state_trackers/glx/xlib/xm_st.c
index 0c42e653c76..946b5dcff29 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_st.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_st.c
@@ -245,10 +245,8 @@ xmesa_st_framebuffer_validate(struct st_context_iface 
*stctx,
   }
}
 
-   for (i = 0; i < count; i++) {
-  out[i] = NULL;
+   for (i = 0; i < count; i++)
   pipe_resource_reference(&out[i], xstfb->textures[statts[i]]);
-   }
 
return TRUE;
 }
diff --git a/src/gallium/state_trackers/hgl/hgl.c 
b/src/gallium/state_trackers/hgl/hgl.c
index 1b702815a3a..bbc477a978c 100644
--- a/src/gallium/state_trackers/hgl/hgl.c
+++ b/src/gallium/state_trackers/hgl/hgl.c
@@ -193,10 +193,8 @@ hgl_st_framebuffer_validate(struct st_context_iface 
*stctxi,
//}
}
 
-   for (i = 0; i < count; i++) {
-   out[i] = NULL;
+   for (i = 0; i < count; i++)
pipe_resource_reference(&out[i], buffer->textures[statts[i]]);
-   }
 
return TRUE;
 }
diff --git a/src/gallium/state_trackers/osmesa/osmesa.c 
b/src/gallium/state_trackers/osmesa/osmesa.c
index 2f9558db312..44a0cc43812 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -432,6 +432,7 @@ osmesa_st_framebuffer_validate(struct st_context_iface 
*stctx,
 
   templat.format = format;
   templat.bind = bind;
+  pipe_resource_reference(&out[i], NULL);
   out[i] = osbuffer->textures[statts[i]] =
  screen->resource_create(screen, &templat);
}
diff --git a/src/gallium/state_trackers/wgl/stw_st.c 
b/src/gallium/state_trackers/wgl/stw_st.c
index 5e165c89f56..7cf18f0a8b0 100644
--- a/src/gallium/state_trackers/wgl/stw_st.c
+++ b/src/gallium/state_trackers/wgl/stw_st.c
@@ -161,10 +161,8 @@ stw_st_framebuffer_validate(struct st_context_iface *stctx,
   stwfb->fb->must_resize = FALSE;
}
 
-   for (i = 0; i < count; i++) {
-  out[i] = NULL;
+   for (i = 0; i < count; i++)
   pipe_resource_reference(&out[i], stwfb->textures[statts[i]]);
-   }
 
stw_framebuffer_unlock(stwfb->fb);
 
diff --git a/src/mesa/state_tracker/st_manager.c 
b/src/mesa/state_tracker/st_manager.c
index 50bc3c33c62..047337e22db 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -190,6 +190,8 @@ st_framebuffer_validate(struct st_framebuffer *stfb,
if (stfb->iface_stamp == new_stamp)
   return;
 
+   memset(textures, 0, stfb->num_statts * sizeof(textures[0]));
+
/* validate the fb */
do {
   if (!stfb->iface->validate(&st->iface, stfb->iface, stfb->statts,
-- 
2.15.0.rc0
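
Why clearing out[i] inside the callback leaks can be shown with a toy reference count. This is a sketch under stated assumptions: `struct res` and `res_reference()` are hypothetical stand-ins that mimic the usual "reference the new pointer, unreference the old one" pattern, and are not the gallium implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted object; a hypothetical stand-in for pipe_resource. */
struct res {
   int refcount;
};

/* Point *dst at src, adjusting both reference counts.  The object is
 * never freed in this sketch so the counts stay observable. */
static void
res_reference(struct res **dst, struct res *src)
{
   assert(dst != NULL);
   if (*dst == src)
      return;
   if (src)
      src->refcount++;
   if (*dst)
      (*dst)->refcount--;
   *dst = src;
}

/* Old callback behaviour: forget the slot's previous pointer first. */
static void
validate_clobber(struct res **out, struct res *tex)
{
   *out = NULL;
   res_reference(out, tex);
}

/* New callback behaviour: let res_reference drop the previous pointer. */
static void
validate_keep(struct res **out, struct res *tex)
{
   res_reference(out, tex);
}
```

Calling validate_clobber() twice on the same slot takes two references while the slot can only ever release one; validate_keep() takes a single reference, provided the caller zeroes the slot once before the first call.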


Re: [Mesa-dev] Build fail since configure.ac: rework llvm libs handling for 3.9+

2017-10-17 Thread Andy Furniss


Emil Velikov wrote:


On a "bad" build (-DLLVM_BUILD_LLVM_DYLIB=ON) I get

andy [~]$ llvm-config --link-shared --libs bitwriter
llvm-config: error: missing: /usr/lib/libLLVMDemangle.so
llvm-config: error: missing: /usr/lib/libLLVMSupport.so
llvm-config: error: missing: /usr/lib/libLLVMBinaryFormat.so
llvm-config: error: missing: /usr/lib/libLLVMCore.so
llvm-config: error: missing: /usr/lib/libLLVMBitReader.so
llvm-config: error: missing: /usr/lib/libLLVMMC.so
llvm-config: error: missing: /usr/lib/libLLVMMCParser.so
llvm-config: error: missing: /usr/lib/libLLVMObject.so
llvm-config: error: missing: /usr/lib/libLLVMProfileData.so
llvm-config: error: missing: /usr/lib/libLLVMAnalysis.so
llvm-config: error: missing: /usr/lib/libLLVMBitWriter.so


These here indicate that something in LLVM broke. Please report it
ASAP so that we don't get a LLVM release with this bug.


https://bugs.llvm.org/show_bug.cgi?id=34977


Re: [Mesa-dev] [PATCH 2/2] loader/dri3: Make sure we invalidate a drawable on size change

2017-10-17 Thread Michel Dänzer

On 16/10/17 07:16 PM, Thomas Hellstrom wrote:
> On 10/16/2017 04:39 PM, Michel Dänzer wrote:
>> On 16/10/17 02:21 PM, Thomas Hellstrom wrote:
>>> On 10/16/2017 12:53 PM, Thomas Hellstrom wrote:
>>>> Hi, Michel,
>>>>
>>>> On 10/16/2017 12:35 PM, Michel Dänzer wrote:
>>>>> Hi Thomas,
>>>>>
>>>>> On 05/09/17 10:15 AM, Thomas Hellstrom wrote:
>>>>>> If we're seeing a drawable size change, in particular after
>>>>>> processing a
>>>>>> configure notify event, make sure we invalidate so that the state
>>>>>> tracker
>>>>>> picks up the new geometry.
>>>>>>
>>>>>> Signed-off-by: Thomas Hellstrom 
>>>>>> ---
>>>>>>    src/loader/loader_dri3_helper.c | 2 ++
>>>>>>    1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/src/loader/loader_dri3_helper.c
>>>>>> b/src/loader/loader_dri3_helper.c
>>>>>> index 51e4e97..bcd5a66 100644
>>>>>> --- a/src/loader/loader_dri3_helper.c
>>>>>> +++ b/src/loader/loader_dri3_helper.c
>>>>>> @@ -348,6 +348,7 @@ dri3_handle_present_event(struct
>>>>>> loader_dri3_drawable *draw,
>>>>>>  draw->width = ce->width;
>>>>>>  draw->height = ce->height;
>>>>>>  draw->vtable->set_drawable_size(draw, draw->width,
>>>>>> draw->height);
>>>>>> + draw->ext->flush->invalidate(draw->dri_drawable);
>>>>>>  break;
>>>>>>   }
>>>>>>   case XCB_PRESENT_COMPLETE_NOTIFY: {
>>>>>> @@ -1592,6 +1593,7 @@ loader_dri3_update_drawable_geometry(struct
>>>>>> loader_dri3_drawable *draw)
>>>>>>  draw->width = geom_reply->width;
>>>>>>  draw->height = geom_reply->height;
>>>>>>  draw->vtable->set_drawable_size(draw, draw->width,
>>>>>> draw->height);
>>>>>> + draw->ext->flush->invalidate(draw->dri_drawable);
>>>>>>        free(geom_reply);
>>>>>>   }
>>>>>>
>>>>> unfortunately, I just bisected a regression to this commit. With
>>>>> it, the
>>>>> pipe_resource textures backing DRI3 back buffers are leaked when the
>>>>> window is resized:
>>>>>
>>>>> ==3408== 273,760 (228,464 direct, 45,296 indirect) bytes in 131
>>>>> blocks are definitely lost in loss record 1,285 of 1,286
>>>>> ==3408==    at 0x4C2DC05: calloc (vg_replace_malloc.c:711)
>>>>> ==3408==    by 0xA04D5D3: r600_texture_create_object
>>>>> (r600_texture.c:1125)
>>>>> ==3408==    by 0xA04E1C1: si_texture_create (r600_texture.c:1382)
>>>>> ==3408==    by 0x9CE3A17: dri2_allocate_textures (dri2.c:803)
>>>>> ==3408==    by 0x9CDDAE2: dri_st_framebuffer_validate
>>>>> (dri_drawable.c:85)
>>>>> ==3408==    by 0x9B6D07F: st_framebuffer_validate (st_manager.c:195)
>>>>> ==3408==    by 0x9B6EE84: st_manager_validate_framebuffers
>>>>> (st_manager.c:1063)
>>>>> ==3408==    by 0x9B21807: st_validate_state (st_atom.c:202)
>>>>> ==3408==    by 0x9B2AF75: st_Clear (st_cb_clear.c:411)
>>>>> ==3408==    by 0x10A6BD: ??? (in /usr/bin/glxgears)
>>>>> ==3408==    by 0x109E37: ??? (in /usr/bin/glxgears)
>>>>> ==3408==    by 0x5E202E0: (below main) (libc-start.c:291)
>>>>> ==3408==
>>>>> ==3408== 362,768 (230,208 direct, 132,560 indirect) bytes in 132
>>>>> blocks are definitely lost in loss record 1,286 of 1,286
>>>>> ==3408==    at 0x4C2DC05: calloc (vg_replace_malloc.c:711)
>>>>> ==3408==    by 0xA04D5D3: r600_texture_create_object
>>>>> (r600_texture.c:1125)
>>>>> ==3408==    by 0xA04E1C1: si_texture_create (r600_texture.c:1382)
>>>>> ==3408==    by 0x9CE0C0D: dri2_create_image_common (dri2.c:1119)
>>>>> ==3408==    by 0x9CE0C6F: dri2_create_image (dri2.c:1140)
>>>>> ==3408==    by 0x5386BA8: dri3_alloc_render_buffer
>>>>> (loader_dri3_helper.c:1030)
>>>>> ==3408==    by 0x53876F4: dri3_get_buffer.isra.15
>>>>> (loader_dri3_helper.c:1364)
>>>>> ==3408==    by 0x53886DC: loader_dri3_get_buffers
>>>>> (loader_dri3_helper.c:1549)
>>>>> ==3408==    by 0x9CE298C: dri_image_drawable_get_buffers (dri2.c:452)
>>>>> ==3408==    by 0x9CE298C: dri2_allocate_textures (dri2.c:576)
>>>>> ==3408==    by 0x9CDDAE2: dri_st_framebuffer_validate
>>>>> (dri_drawable.c:85)
>>>>> ==3408==    by 0x9B6D07F: st_framebuffer_validate (st_manager.c:195)
>>>>> ==3408==    by 0x9B6EE84: st_manager_validate_framebuffers
>>>>> (st_manager.c:1063)
>>>>>
>>>>>
>>>>> Can you reproduce this with vmwgfx as well? Any ideas why this is
>>>>> happening / how to fix it?
>>>>>
>>>>>
>>>> Looks like other buffers (depth ?) are leaked too, outside dri3, right?
>>>>
>>>> I'll take a look.
>>>>
>>>> /Thomas
>>>>
>>>>
>>> Could you try the attached patch? Fixes the issue on my side.
>> Yes, it fixes the problem for me as well, thanks! I was looking at that
>> code as well, but didn't realize what's going on.
>>
>> I think the attached patch would be a cleaner solution though.
> 
> I agree. I had that in mind but had a bit too much on my plate today.
> 
> Perhaps we should not restrict the application of the patch with a
> Fixes: tag, though:
> Although improbable, the bug could also be hit without that dri3 patch.

Fair enough, replaced with a generic stable tag in v2.


> Also do we need to do a os_calloc() of the texture pointer ar

Re: [Mesa-dev] [PATCH v2] st/mesa: Initialize textures array in st_framebuffer_validate

2017-10-17 Thread Thomas Hellstrom


On 10/17/2017 04:39 PM, Michel Dänzer wrote:

From: Michel Dänzer 

And just reference pipe_resources to it in the validate callbacks.

Avoids pipe_resource leaks when st_framebuffer_validate ends up calling
the validate callback multiple times, e.g. when a window is resized.

v2:
* Use generic stable tag instead of Fixes: tag, since the problem could
   already happen before the commit referenced in v1 (Thomas Hellstrom)
* Use memset to initialize the array on the stack instead of allocating
   the array with os_calloc.

Cc: mesa-sta...@lists.freedesktop.org
Reviewed-by: Thomas Hellstrom  # v1
Signed-off-by: Michel Dänzer 
---
  src/gallium/state_trackers/dri/dri_drawable.c | 4 +---
  src/gallium/state_trackers/glx/xlib/xm_st.c   | 4 +---
  src/gallium/state_trackers/hgl/hgl.c  | 4 +---
  src/gallium/state_trackers/osmesa/osmesa.c| 1 +
  src/gallium/state_trackers/wgl/stw_st.c   | 4 +---
  src/mesa/state_tracker/st_manager.c   | 2 ++
  6 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri_drawable.c 
b/src/gallium/state_trackers/dri/dri_drawable.c
index 75a8197d330..d586b7564ef 100644
--- a/src/gallium/state_trackers/dri/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/dri_drawable.c
@@ -99,10 +99,8 @@ dri_st_framebuffer_validate(struct st_context_iface *stctx,
return TRUE;
  
 /* Set the window-system buffers for the state tracker. */

-   for (i = 0; i < count; i++) {
-  out[i] = NULL;
+   for (i = 0; i < count; i++)
pipe_resource_reference(&out[i], textures[statts[i]]);
-   }
  
 return TRUE;

  }
diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.c 
b/src/gallium/state_trackers/glx/xlib/xm_st.c
index 0c42e653c76..946b5dcff29 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_st.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_st.c
@@ -245,10 +245,8 @@ xmesa_st_framebuffer_validate(struct st_context_iface 
*stctx,
}
 }
  
-   for (i = 0; i < count; i++) {

-  out[i] = NULL;
+   for (i = 0; i < count; i++)
pipe_resource_reference(&out[i], xstfb->textures[statts[i]]);
-   }
  
 return TRUE;

  }
diff --git a/src/gallium/state_trackers/hgl/hgl.c 
b/src/gallium/state_trackers/hgl/hgl.c
index 1b702815a3a..bbc477a978c 100644
--- a/src/gallium/state_trackers/hgl/hgl.c
+++ b/src/gallium/state_trackers/hgl/hgl.c
@@ -193,10 +193,8 @@ hgl_st_framebuffer_validate(struct st_context_iface 
*stctxi,
//}
}
  
-	for (i = 0; i < count; i++) {

-   out[i] = NULL;
+   for (i = 0; i < count; i++)
pipe_resource_reference(&out[i], buffer->textures[statts[i]]);
-   }
  
  	return TRUE;

  }
diff --git a/src/gallium/state_trackers/osmesa/osmesa.c 
b/src/gallium/state_trackers/osmesa/osmesa.c
index 2f9558db312..44a0cc43812 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -432,6 +432,7 @@ osmesa_st_framebuffer_validate(struct st_context_iface 
*stctx,
  
templat.format = format;

templat.bind = bind;
+  pipe_resource_reference(&out[i], NULL);
out[i] = osbuffer->textures[statts[i]] =
   screen->resource_create(screen, &templat);
 }
diff --git a/src/gallium/state_trackers/wgl/stw_st.c 
b/src/gallium/state_trackers/wgl/stw_st.c
index 5e165c89f56..7cf18f0a8b0 100644
--- a/src/gallium/state_trackers/wgl/stw_st.c
+++ b/src/gallium/state_trackers/wgl/stw_st.c
@@ -161,10 +161,8 @@ stw_st_framebuffer_validate(struct st_context_iface *stctx,
stwfb->fb->must_resize = FALSE;
 }
  
-   for (i = 0; i < count; i++) {

-  out[i] = NULL;
+   for (i = 0; i < count; i++)
pipe_resource_reference(&out[i], stwfb->textures[statts[i]]);
-   }
  
 stw_framebuffer_unlock(stwfb->fb);
  
diff --git a/src/mesa/state_tracker/st_manager.c b/src/mesa/state_tracker/st_manager.c

index 50bc3c33c62..047337e22db 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -190,6 +190,8 @@ st_framebuffer_validate(struct st_framebuffer *stfb,
 if (stfb->iface_stamp == new_stamp)
return;
  
+   memset(textures, 0, stfb->num_statts * sizeof(textures[0]));

+
 /* validate the fb */
 do {
if (!stfb->iface->validate(&st->iface, stfb->iface, stfb->statts,


Also for v2:

Reviewed-by: Thomas Hellstrom 



Re: [Mesa-dev] Build fail since configure.ac: rework llvm libs handling for 3.9+

2017-10-17 Thread Andy Furniss


Emil Velikov wrote:

On 17 October 2017 at 13:40, Emil Velikov  wrote:


With both:
-DLLVM_BUILD_LLVM_DYLIB=ON and -DLLVM_LINK_LLVM_DYLIB=ON


Something's a bit strange:


... setting -DLLVM_LINK_LLVM_DYLIB=ON should also set
-DLLVM_BUILD_LLVM_DYLIB=ON [1].

If that's not the case please report it to the LLVM devs.


Seems OK, I only set -DLLVM_LINK_LLVM_DYLIB=ON to get it working (had 
already seen that it auto enables the other).


Dieter specified both and got the same result = OK (unless I mis-read?)



Thanks
Emil

[1] https://llvm.org/docs/CMake.html




Re: [Mesa-dev] [PATCH mesa] meson: add missing radv_extensions.c generation for libvulkan_radeon

2017-10-17 Thread Andres Gomez

On Tue, 2017-10-17 at 12:00 +0100, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom 

I would add a line like:

fixes: 17201a2eb0b (radv: port to using updated anv entrypoint/extension 
generator.)

> ---
>  src/amd/vulkan/meson.build | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
> index a5a4f81352807beac92d..6a416d988674504281c6 100644
> --- a/src/amd/vulkan/meson.build
> +++ b/src/amd/vulkan/meson.build
> @@ -26,6 +26,14 @@ radv_entrypoints = custom_target(
>   '--outdir', meson.current_build_dir()],

Since radv_entrypoints_gen.py depends on it and it is also explicit in
the Makefile.am, I think we should also add something like this here:

depend_files : files('radv_extensions.py'),

>  )
>  
> +radv_extensions = custom_target(
> +  'radv_extensions.c',
> +  input : ['radv_extensions.py', vk_api_xml],
> +  output : ['radv_extensions.c'],
> +  command : [prog_python2, '@INPUT0@', '--xml', '@INPUT1@',
> + '--out', '@OUTPUT@'],
> +)
> +
>  vk_format_table_c = custom_target(
>'vk_format_table.c',
>input : ['vk_format_table.py', 'vk_format_layout.csv'],
> @@ -102,7 +110,7 @@ endif
>  
>  libvulkan_radeon = shared_library(
>'vulkan_radeon',
> -  [libradv_files, radv_entrypoints, nir_opcodes_h, vk_format_table_c],
> +  [libradv_files, radv_entrypoints, radv_extensions, nir_opcodes_h, 
> vk_format_table_c],
>include_directories : [inc_common, inc_amd, inc_amd_common, inc_compiler,
>   inc_vulkan_util, inc_vulkan_wsi],
>link_with : [libamd_common, libamdgpu_addrlib, libvulkan_util,

Other than that, this is:

Reviewed-by: Andres Gomez 

-- 
Br,

Andres

Re: [Mesa-dev] [PATCH v2 8/8] egl/wayland: add dri2_wl_free_buffers() helper

2017-10-17 Thread Emil Velikov

On 17 October 2017 at 15:07, Eric Engestrom  wrote:
> On Friday, 2017-10-06 21:38:35 +, Gwan-gyeong Mun wrote:
>> This deduplicates free routines of color_buffers array.
>>
>> Signed-off-by: Mun Gwan-gyeong 
>> ---
>>  src/egl/drivers/dri2/platform_wayland.c | 60 
>> +
>>  1 file changed, 31 insertions(+), 29 deletions(-)
>>
>> diff --git a/src/egl/drivers/dri2/platform_wayland.c 
>> b/src/egl/drivers/dri2/platform_wayland.c
>> index 1518a24b7c..cfe474cf58 100644
>> --- a/src/egl/drivers/dri2/platform_wayland.c
>> +++ b/src/egl/drivers/dri2/platform_wayland.c
>> @@ -253,6 +253,35 @@ dri2_wl_create_pixmap_surface(_EGLDriver *drv, 
>> _EGLDisplay *disp,
>> return NULL;
>>  }
>>
>> +static void
>> +dri2_wl_free_buffers(struct dri2_egl_surface *dri2_surf, bool check_lock)
>> +{
>> +   struct dri2_egl_display *dri2_dpy =
>> +  dri2_egl_display(dri2_surf->base.Resource.Display);
>> +
>> +   for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
>> +  if (dri2_surf->color_buffers[i].native_buffer) {
>> + if (check_lock && !dri2_surf->color_buffers[i].locked)
>> +wl_buffer_destroy((struct wl_buffer 
>> *)dri2_surf->color_buffers[i].native_buffer);
>> + else
>> +wl_buffer_destroy((struct wl_buffer 
>> *)dri2_surf->color_buffers[i].native_buffer);
>
> Both branches have the same code, should be a hint :P
> I think this should be:
>
>if (!check_lock || !dri2_surf->color_buffers[i].locked) {
>   wl_buffer_destroy((struct wl_buffer 
> *)dri2_surf->color_buffers[i].native_buffer);
Drop the unneeded cast?

>   dri2_surf->color_buffers[i].native_buffer = NULL;
>}
>
> without an `else`. You also want to remove the `native_buffer = NULL`
> from below, to avoid leaking locked buffers :)
>
I'm not sure if that's an actual leak. In either case, I'd leave that
as follow-up.

There are a few instances where this can be used:
 - dri2_wl_destroy_surface, done
 - dri2_wl_release_buffers, done
 - update_buffers, todo
 - swrast_update_buffers, todo

Just an idea that came to mind, don't bother if you don't want to.
 - keep the platform specific buffer destroy (wl_buffer_destroy here)
in the platform code moving the rest as helper
 - plug the platform_android/platform_drm leak (?) by using the helper

Patches 5-7 look ok and are
Reviewed-by: Emil Velikov 

Thanks
Emil
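
The condition Eric suggests above can be sketched on its own. This is a minimal illustration, assuming a hypothetical `struct color_buffer` and a `fake_destroy()` counter standing in for wl_buffer_destroy(); it is not the platform_wayland.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the color_buffers entries. */
struct color_buffer {
   void *native_buffer;
   bool locked;
};

/* Counts destroy calls; stands in for wl_buffer_destroy(). */
static int destroyed;

static void
fake_destroy(void *buf)
{
   assert(buf != NULL);
   destroyed++;
}

static void
free_buffers(struct color_buffer *bufs, size_t n, bool check_lock)
{
   for (size_t i = 0; i < n; i++) {
      if (!bufs[i].native_buffer)
         continue;
      /* Skip locked buffers only when the caller asked for the check. */
      if (!check_lock || !bufs[i].locked) {
         fake_destroy(bufs[i].native_buffer);
         bufs[i].native_buffer = NULL;
      }
   }
}
```

With check_lock set, a locked buffer keeps its pointer and can still be released later; with check_lock clear, every buffer is destroyed, as in the surface teardown path.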

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Daniel Vetter

On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote:
> On 17/10/17 02:22 PM, Daniel Vetter wrote:
> > On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
> >> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> > 
> >>> Common sense suggests that there need to be two sides to FreeSync / VESA
> >>> Adaptive Sync support:
> >>>
> >>> 1. Query the display capabilities. This means querying minimum / maximum
> >>> refresh duration, plus possibly a query for the earliest/latest
> >>> timing of the *next* refresh.
> >>>
> >>> 2. Signal desired present time. This means passing a target timer value
> >>> instead of a target vblank count, e.g. something like this for the KMS
> >>> interface:
> >>>
> >>>   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
> >>>   uint32_t flags, void *user_data,
> >>>   uint64_t target);
> >>>
> >>>   + a flag to indicate whether target is the vblank count or the
> >>> CLOCK_MONOTONIC (?) time in ns.
> >>
> >> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
> >> sync should probably only be supported via the atomic API, presumably
> >> via output properties.
> > 
> > +1
> > 
> > At least now that DC is on track to land properly, and you want to do this
> > for DC-only anyway there's no reason to pimp the legacy interfaces
> > further. And atomic is soo much easier to extend.
> > 
> > The big question imo is where we need to put the flag on the kms side,
> > since freesync is not just about presenting earlier, but also about
> > presenting later. But for backwards compat we can't stretch the refresh
> > rate by default for everyone, or clients that rely on high precision
> > timestamps and regular refresh will get a bad surprise.
> 
> The idea described above is that adaptive sync would be used for flips
> with a target timestamp. Apps which don't want to use adaptive sync
> wouldn't set a target timestamp.
> 
> 
> > I think a boolean enable_freesync property is probably what we want, which
> > enables freesync for as long as it's set.
> 
> The question then becomes under what circumstances the property is (not)
> set. Not sure offhand this will actually solve any problem, or just push
> it somewhere else.

I thought that's what the driconf switch is for, with a policy of "please
schedule asap" instead of a specific timestamp.

> > Finally I'm not sure we want to insist on a target time for freesync. At
> > least as far as I understand things you just want "as soon as possible".
> > This might change with some of the VK/EGL/GLX extensions where you
> > specify a precise timing (media playback). But that needs a bit more work
> > to make it happen I think, so perhaps better to postpone.
> 
> I don't see why. There's an obvious use case for this now, for video
> playback. At least VDPAU already has target timestamps for this.
> 
> 
> > Also note that right now no driver except amdgpu has support for a target
> > vblank on a flip. That's imo another reason for not requiring target
> > support for at least basic freesync support.
> 
> I think that's a bad reason. :) Adding it for atomic drivers shouldn't
> be that hard.

I thought the primary reason for adaptive sync is the adaptive frame rate
to cope with the occasional stall in games. If the intended use-case is
vr/media, then I agree going with timestamps from the beginning makes
sense. That still leaves the "schedule asap, with some leeway" mode. Or is
that (no longer) something we want?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
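
How a target timestamp would interact with the display's refresh limits can be sketched as a simple clamp. This is purely illustrative: `struct vrr_range` and `next_refresh_ns()` are hypothetical names, not part of KMS or any driver, and all times are assumed to be nanoseconds on one monotonic clock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical display capabilities, in nanoseconds per refresh. */
struct vrr_range {
   uint64_t min_refresh_ns;  /* shortest allowed frame duration */
   uint64_t max_refresh_ns;  /* longest allowed frame duration */
};

/* Given the time of the last refresh and a requested present time,
 * return when the next refresh could start. */
static uint64_t
next_refresh_ns(const struct vrr_range *r, uint64_t last_ns,
                uint64_t target_ns)
{
   uint64_t earliest, latest;

   assert(r->min_refresh_ns <= r->max_refresh_ns);

   earliest = last_ns + r->min_refresh_ns;
   latest = last_ns + r->max_refresh_ns;

   if (target_ns < earliest)
      return earliest;   /* "as soon as possible" requests land here */
   if (target_ns > latest)
      return latest;     /* the panel has to refresh by this point */
   return target_ns;
}
```

Under this model an "as soon as possible" policy is just a target of zero, while a media player would pass the intended presentation time of the frame.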

Re: [Mesa-dev] [PATCH mesa] meson: add missing radv_extensions.c generation for libvulkan_radeon

2017-10-17 Thread Eric Engestrom

On Tuesday, 2017-10-17 14:55:06 +, Andres Gomez wrote:
> On Tue, 2017-10-17 at 12:00 +0100, Eric Engestrom wrote:
> > Signed-off-by: Eric Engestrom 
> 
> I would add a line like:
> 
> fixes: 17201a2eb0b (radv: port to using updated anv entrypoint/extension 
> generator.)

Thanks, I forgot to do that.

> 
> > ---
> >  src/amd/vulkan/meson.build | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
> > index a5a4f81352807beac92d..6a416d988674504281c6 100644
> > --- a/src/amd/vulkan/meson.build
> > +++ b/src/amd/vulkan/meson.build
> > @@ -26,6 +26,14 @@ radv_entrypoints = custom_target(
> >   '--outdir', meson.current_build_dir()],
> 
> Since radv_entrypoints_gen.py depends on it and it is also explicit in
> the Makefile.am, I think we should also add something like this here:
> 
> depend_files : files('radv_extensions.py'),

I'll send a separate patch for this, as it's unrelated to the break
fixed here, but good catch, I didn't think to check inside the scripts.

Dylan, meson tracks includes in C files automatically, right? As in,
a file including foo.h would get rebuilt if foo.h is touched, right?
Any idea if the same is/can be done for python files?

> 
> >  )
> >  
> > +radv_extensions = custom_target(
> > +  'radv_extensions.c',
> > +  input : ['radv_extensions.py', vk_api_xml],
> > +  output : ['radv_extensions.c'],
> > +  command : [prog_python2, '@INPUT0@', '--xml', '@INPUT1@',
> > + '--out', '@OUTPUT@'],
> > +)
> > +
> >  vk_format_table_c = custom_target(
> >'vk_format_table.c',
> >input : ['vk_format_table.py', 'vk_format_layout.csv'],
> > @@ -102,7 +110,7 @@ endif
> >  
> >  libvulkan_radeon = shared_library(
> >'vulkan_radeon',
> > -  [libradv_files, radv_entrypoints, nir_opcodes_h, vk_format_table_c],
> > +  [libradv_files, radv_entrypoints, radv_extensions, nir_opcodes_h, 
> > vk_format_table_c],
> >include_directories : [inc_common, inc_amd, inc_amd_common, inc_compiler,
> >   inc_vulkan_util, inc_vulkan_wsi],
> >link_with : [libamd_common, libamdgpu_addrlib, libvulkan_util,
> 
> Other than that, this is:
> 
> Reviewed-by: Andres Gomez 
> 
> -- 
> Br,
> 
> Andres

[Mesa-dev] [PATCH mesa] meson: track python script dependency

2017-10-17 Thread Eric Engestrom

Suggested-by: Andres Gomez 
Signed-off-by: Eric Engestrom 
---
 src/amd/vulkan/meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
index 6a416d988674504281c6..0d433e944efe7a3790a5 100644
--- a/src/amd/vulkan/meson.build
+++ b/src/amd/vulkan/meson.build
@@ -24,6 +24,7 @@ radv_entrypoints = custom_target(
   output : ['radv_entrypoints.h', 'radv_entrypoints.c'],
   command : [prog_python2, '@INPUT0@', '--xml', '@INPUT1@',
  '--outdir', meson.current_build_dir()],
+  depend_files : files('radv_extensions.py'),
 )
 
 radv_extensions = custom_target(
-- 
Cheers,
  Eric


[Mesa-dev] [PATCH] egl/dri2: disambiguate driver name

2017-10-17 Thread Kai Wasserbäch

So far the Mesa-internal EGL driver "dri2" returned "DRI2" as its driver
name. This causes confusion, because there is a kernel interface by the
same name where a version 3 is available.

This change attempts to make it clearer that the "dri2" name of the Mesa
EGL driver has nothing to do with that kernel interface and is rather a
Mesa internal versioning of an interface.

See eg. the thread beginning at

for an example.

Signed-off-by: Kai Wasserbäch 
---
 src/egl/drivers/dri2/egl_dri2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 77f09271f0..3991dfc84d 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -3260,7 +3260,7 @@ _eglBuiltInDriver(void)
dri2_drv->base.API.GLInteropExportObject = dri2_interop_export_object;
dri2_drv->base.API.DupNativeFenceFDANDROID = dri2_dup_native_fence_fd;
 
-   dri2_drv->base.Name = "DRI2";
+   dri2_drv->base.Name = "MESA-DRI2";
 
return &dri2_drv->base;
 }
-- 
2.14.2


Re: [Mesa-dev] [PATCH] egl/wayland: Support for KHR_partial_update

2017-10-17 Thread Harish Krupo


Eric Engestrom  writes:

> On Monday, 2017-10-16 13:54:25 +, Emil Velikov wrote:
>> Hi Harish,
>> 
>> Overall looks great, a few comments/questions inline.
>> 
>
> I agree with everything Emil said :)
>
>> On 13 October 2017 at 19:49, Harish Krupo  wrote:
>> > This passes 33/37 deqp tests related to partial_update, 4 are not
>> > supported.
>
> Which 4 are not supported? It's definitely not obvious :)
> I'm guessing dEQP-EGL.functional.negative_partial_update.not_current_surface2
> isn't supported (pbuffer), but I can't tell which other ones would be
> unsupported.
>

Sorry, should have posted it.
Here are the ones not supported:
dEQP-EGL.functional.negative_partial_update.not_postable_surface
dEQP-EGL.functional.negative_partial_update.not_current_surface
dEQP-EGL.functional.negative_partial_update.buffer_preserved
dEQP-EGL.functional.negative_partial_update.not_current_surface2
Reason: No matching egl config found.

I will update the commit in v2.

>> >
>> > Signed-off-by: Harish Krupo 
>> > ---
>> >  src/egl/drivers/dri2/platform_wayland.c | 68 
>> > -
>> >  1 file changed, 59 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/src/egl/drivers/dri2/platform_wayland.c 
>> > b/src/egl/drivers/dri2/platform_wayland.c
>> > index 14db55ca74..483d588b92 100644
>> > --- a/src/egl/drivers/dri2/platform_wayland.c
>> > +++ b/src/egl/drivers/dri2/platform_wayland.c
>> > @@ -810,15 +810,39 @@ try_damage_buffer(struct dri2_egl_surface *dri2_surf,
>> > }
>> > return EGL_TRUE;
>> >  }
>> > +
>> >  /**
>> > - * Called via eglSwapBuffers(), drv->API.SwapBuffers().
>> > + * Called via eglSetDamageRegionKHR(), drv->API.SetDamageRegion().
>> >   */
>> >  static EGLBoolean
>> > -dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> > - _EGLDisplay *disp,
>> > - _EGLSurface *draw,
>> > - const EGLint *rects,
>> > - EGLint n_rects)
>> > +wl_set_damage_region(_EGLDriver *drv,
>> Yes, it might be a bit confusing, but please stay consistent and call
>> this dri2_wl_set_damage_region.
>> 
>> > + _EGLDisplay *dpy,
>> > + _EGLSurface *surf,
>> > + const EGLint *rects,
>> > + EGLint n_rects)
>> > +{
>> > +   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
>> > +
>> > +   /* The spec doesn't mention what should be returned in case of
>> > +* failure in setting the damage buffer with the window system, so
>> > +* setting the damage to maximum surface area
>> > +   */
>> > +   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects)) {
>> > +  wl_surface_damage(dri2_surf->wl_surface_wrapper,
>> > +0, 0, INT32_MAX, INT32_MAX);
>> > +  return EGL_TRUE;
>> Drop this return - it's only confusing the reader.
>> 
>> > +   }
>> > +
>> > +   return EGL_TRUE;
>> ... since this should suffice.
>> 
>> > +}
>> > +
>> > +static EGLBoolean
>> > +dri2_wl_swap_buffers_common(_EGLDriver *drv,
>> > +_EGLDisplay *disp,
>> > +_EGLSurface *draw,
>> > +const EGLint *rects,
>> > +EGLint n_rects,
>> > +EGLBoolean with_damage)
>> >  {
>> > struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
>> > struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
>> > @@ -876,7 +900,17 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> > /* If the compositor doesn't support damage_buffer, we deliberately
>> >  * ignore the damage region and post maximum damage, due to
>> >  * https://bugs.freedesktop.org/78190 */
>> > -   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
>> > +
>> > +   if (!with_damage) {
>> > +
>> > +  /* If called from swapBuffers, check if the damage region
>> > +   * is already set, if not set to full damage
>> > +   */
>> > +  if (!dri2_surf->base.SetDamageRegionCalled)
>> > + wl_surface_damage(dri2_surf->wl_surface_wrapper,
>> > +   0, 0, INT32_MAX, INT32_MAX);
>> > +   }
>> > +   else if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
>> >wl_surface_damage(dri2_surf->wl_surface_wrapper,
>> >  0, 0, INT32_MAX, INT32_MAX);
>> >
>> > @@ -912,6 +946,20 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> > return EGL_TRUE;
>> >  }
>> >
>> > +/**
>> > + * Called via eglSwapBuffers(), drv->API.SwapBuffers().
>> > + */
>> > +static EGLBoolean
>> > +dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> > + _EGLDisplay *disp,
>> > + _EGLSurface *draw,
>> > + const EGLint *rects,
>> > + EGLint n_rects)
>> > +{
>> > +   return dri2_wl_swap_buffe

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Michel Dänzer

On 17/10/17 05:04 PM, Daniel Vetter wrote:
> On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote:
>> On 17/10/17 02:22 PM, Daniel Vetter wrote:
>>> On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
>>>> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
>>>
>>>>> Common sense suggests that there need to be two sides to FreeSync / VESA
>>>>> Adaptive Sync support:
>>>>>
>>>>> 1. Query the display capabilities. This means querying minimum / maximum
>>>>> refresh duration, plus possibly a query for the earliest/latest
>>>>> timing of the *next* refresh.
>>>>>
>>>>> 2. Signal desired present time. This means passing a target timer value
>>>>> instead of a target vblank count, e.g. something like this for the KMS
>>>>> interface:
>>>>>
>>>>>   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
>>>>>   uint32_t flags, void *user_data,
>>>>>   uint64_t target);
>>>>>
>>>>>   + a flag to indicate whether target is the vblank count or the
>>>>> CLOCK_MONOTONIC (?) time in ns.
>>>>
>>>> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adaptive
>>>> sync should probably only be supported via the atomic API, presumably
>>>> via output properties.
>>>
>>> +1
>>>
>>> At least now that DC is on track to land properly, and you want to do this
>>> for DC-only anyway there's no reason to pimp the legacy interfaces
>>> further. And atomic is soo much easier to extend.
>>>
>>> The big question imo is where we need to put the flag on the kms side,
>>> since freesync is not just about presenting earlier, but also about
>>> presenting later. But for backwards compat we can't stretch the refresh
>>> rate by default for everyone, or clients that rely on high precision
>>> timestamps and regular refresh will get a bad surprise.
>>
>> The idea described above is that adaptive sync would be used for flips
>> with a target timestamp. Apps which don't want to use adaptive sync
>> wouldn't set a target timestamp.
>>
>>
>>> I think a boolean enable_freesync property is probably what we want, which
>>> enables freesync for as long as it's set.
>>
>> The question then becomes under what circumstances the property is (not)
>> set. Not sure offhand this will actually solve any problem, or just push
>> it somewhere else.
> 
> I thought that's what the driconf switch is for, with a policy of "please
> schedule asap" instead of a specific timestamp.

The driconf switch is just for the user's intention to use adaptive sync
when possible. A property as you suggest cannot be set by the client
directly, because it can't know when adaptive sync can actually be used
(only when its window is fullscreen and using page flipping). So the
property would have to be set by the X server/driver / Wayland
compositor / ... instead. The question is whether such a property is
actually needed, or whether the kernel could just enable adaptive sync
when there's a flip with a target timestamp, and disable it when there's
a flip without a target timestamp, or something like that.


>>> Finally I'm not sure we want to insist on a target time for freesync. At
>>> least as far as I understand things you just want "as soon as possible".
>>> This might change with some of the VK/EGL/GLX extensions where you
>>> specify a precise timing (media playback). But that needs a bit more work
>>> to make it happen I think, so perhaps better to postpone.
>>
>> I don't see why. There's an obvious use case for this now, for video
>> playback. At least VDPAU already has target timestamps for this.
>>
>>
>>> Also note that right now no driver except amdgpu has support for a target
>>> vblank on a flip. That's imo another reason for not requiring target
>>> support for at least basic freesync support.
>>
>> I think that's a bad reason. :) Adding it for atomic drivers shouldn't
>> be that hard.
> 
> I thought the primary reason for adaptive sync is the adaptive frame rate
> to cope with the occasional stall in games. If the intended use-case is
> vr/media, then I agree going with timestamps from the beginning makes
> sense. That still leaves the "schedule asap, with some leeway" mode. Or is
> that (no longer) something we want?

Both are use cases for adaptive sync. Both can be covered by a target
timestamp. There may be other possible solutions which work for both though.
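
To make the target timestamp idea a bit more concrete, here is a minimal
sketch of how a client or compositor might pick one. It assumes the target
is a CLOCK_MONOTONIC time in ns, as floated in the quoted proposal, and that
the display limits come from some capability query. struct refresh_limits
and both helper names are hypothetical and only serve the illustration;
none of this is an existing KMS or libdrm interface.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ull

/* Hypothetical result of a display capability query (point 1 of the quoted
 * proposal); not an existing KMS structure. */
struct refresh_limits {
   uint64_t min_duration_ns; /* shortest refresh duration the display allows */
   uint64_t max_duration_ns; /* longest refresh duration the display allows */
};

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static uint64_t
monotonic_now_ns(void)
{
   struct timespec ts;

   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/* Clamp the wanted present time into the window the display can honour,
 * measured from the time of the previous flip. */
static uint64_t
clamp_present_time_ns(uint64_t last_flip_ns, uint64_t wanted_ns,
                      const struct refresh_limits *lim)
{
   assert(lim->min_duration_ns <= lim->max_duration_ns);

   uint64_t earliest = last_flip_ns + lim->min_duration_ns;
   uint64_t latest = last_flip_ns + lim->max_duration_ns;

   if (wanted_ns < earliest)
      return earliest;
   if (wanted_ns > latest)
      return latest;
   return wanted_ns;
}
```

In this model a game that wants "as soon as possible" would pass
monotonic_now_ns() as wanted_ns, while a video player would pass the
presentation time of the next frame; both end up as a single target
timestamp.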


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH] egl/dri2: disambiguate driver name

2017-10-17 Thread Michel Dänzer

On 17/10/17 05:26 PM, Kai Wasserbäch wrote:
> So far the Mesa-internal EGL driver "dri2" returned "DRI2" as its driver
> name. This causes confusion, because there is a kernel interface by the
> same name where a version 3 is available.
> 
> This change attempts to make it clearer that the "dri2" name of the Mesa
> EGL driver has nothing to do with that kernel interface and is rather a
> Mesa internal versioning of an interface.

DRI3 is an X11 protocol extension, not a kernel interface.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH 16/16] radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer

2017-10-17 Thread Marek Olšák

On Tue, Oct 17, 2017 at 2:25 PM, Nicolai Hähnle  wrote:
> On 13.10.2017 14:04, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer
>> 0
>> if there is no other buffer there.
>>
>> Benefits:
>> - there is no constbuf descriptor upload and shader load
>>
>> It's assumed that all constant addresses are within bounds. Non-constant
>> addresses are clamped against the last declared CONST variable.
>> This only works if the state tracker ensures the bound constant buffer
>> matches what the shader needs.
>>
>> Once we get 32-bit pointers, we can only do this for user constant buffers
>> where the driver is in charge of the upload so that it can guarantee a
>> 32-bit
>> address.
>>
>> The real performance benefit might not be measurable.
>>
>> These apps get 100% theoretical benefit in all shaders (except where
>> noted):
>> - antichamber
>> - batman arkham origins
>> - borderlands 2
>> - borderlands pre-sequel
>> - brutal legend
>> - civilization BE
>> - CS:GO
>> - deadcore
>> - dota 2 -- most shaders
>> - europa universalis
>> - grid autosport -- most shaders
>> - left 4 dead 2
>> - legend of grimrock
>> - life is strange
>> - payday 2
>> - portal
>> - rocket league
>> - serious sam 3 bfe
>> - talos principle
>> - team fortress 2
>> - thea
>> - unigine heaven
>> - unigine valley -- also sanctuary and tropics
>> - wasteland 2
>> - xcom: enemy unknown & enemy within
>> - tesseract
>> - unity (engine)
>>
>> Changed stats only:
>>  SGPRS: 2059998 -> 2086238 (1.27 %)
>>  VGPRS: 1626888 -> 1626904 (0.00 %)
>>  Spilled SGPRs: 7902 -> 7865 (-0.47 %)
>>  Code Size: 60924520 -> 60982660 (0.10 %) bytes
>>  Max Waves: 374539 -> 374526 (-0.00 %)
>> ---
>>   src/gallium/drivers/radeonsi/si_descriptors.c | 23 +++--
>>   src/gallium/drivers/radeonsi/si_shader.c  | 72
>> +++
>>   src/gallium/drivers/radeonsi/si_shader.h  |  2 +-
>>   src/gallium/drivers/radeonsi/si_state.h   |  3 ++
>>   4 files changed, 87 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
>> b/src/gallium/drivers/radeonsi/si_descriptors.c
>> index 0c1fca8..da6efa8 100644
>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>> @@ -119,20 +119,21 @@ static void si_init_descriptor_list(uint32_t
>> *desc_list,
>> static void si_init_descriptors(struct si_descriptors *desc,
>> unsigned shader_userdata_index,
>> unsigned element_dw_size,
>> unsigned num_elements)
>>   {
>> desc->list = CALLOC(num_elements, element_dw_size * 4);
>> desc->element_dw_size = element_dw_size;
>> desc->num_elements = num_elements;
>> desc->shader_userdata_offset = shader_userdata_index * 4;
>> +   desc->slot_index_to_bind_directly = -1;
>>   }
>> static void si_release_descriptors(struct si_descriptors *desc)
>>   {
>> r600_resource_reference(&desc->buffer, NULL);
>> FREE(desc->list);
>>   }
>> static bool si_upload_descriptors(struct si_context *sctx,
>>   struct si_descriptors *desc)
>> @@ -141,20 +142,34 @@ static bool si_upload_descriptors(struct si_context
>> *sctx,
>> unsigned first_slot_offset = desc->first_active_slot * slot_size;
>> unsigned upload_size = desc->num_active_slots * slot_size;
>> /* Skip the upload if no shader is using the descriptors.
>> dirty_mask
>>  * will stay dirty and the descriptors will be uploaded when there
>> is
>>  * a shader using them.
>>  */
>> if (!upload_size)
>> return true;
>>   + /* If there is just one active descriptor, bind it directly. */
>> +   if ((int)desc->first_active_slot ==
>> desc->slot_index_to_bind_directly &&
>> +   desc->num_active_slots == 1) {
>> +   uint32_t *descriptor =
>> &desc->list[desc->slot_index_to_bind_directly *
>> +  desc->element_dw_size];
>> +
>> +   /* The buffer is already in the buffer list. */
>> +   r600_resource_reference(&desc->buffer, NULL);
>> +   desc->gpu_list = NULL;
>> +   desc->gpu_address =
>> si_desc_extract_buffer_address(descriptor);
>> +   si_mark_atom_dirty(sctx, &sctx->shader_pointers.atom);
>> +   return true;
>> +   }
>> +
>> uint32_t *ptr;
>> int buffer_offset;
>> u_upload_alloc(sctx->b.b.const_uploader, 0, upload_size,
>>si_optimal_tcc_alignment(sctx, upload_size),
>>(unsigned*)&buffer_offset,
>>(struct pipe_resource**)&desc->buffer,
>>(void**)&ptr);
>> if (!desc->buffer) {
>> desc->gpu_address = 0;
>>

Re: [Mesa-dev] [PATCH] egl/dri2: disambiguate driver name

2017-10-17 Thread Kai Wasserbäch

Michel Dänzer wrote on 17.10.2017 17:42:
> On 17/10/17 05:26 PM, Kai Wasserbäch wrote:
>> So far the Mesa-internal EGL driver "dri2" returned "DRI2" as its driver
>> name. This causes confusion, because there is a kernel interface by the
>> same name where a version 3 is available.
>>
>> This change attempts to make it clearer that the "dri2" name of the Mesa
>> EGL driver has nothing to do with that kernel interface and is rather a
>> Mesa internal versioning of an interface.
> 
> DRI3 is an X11 protocol extension, not a kernel interface.

Ihrg, yes. Total brainfart there. Should I resend or can this be fixed when this
patch might be committed? As I do not have commit access somebody else would need
to do it then.

Cheers,
Kai




Re: [Mesa-dev] [PATCH v4 0/2] build system: Unify c++11 detection and used [was: configure+mesa/st:check -std=c++11 support and enable tests accordingly]

2017-10-17 Thread Chuck Atkins

> I also think adding a test for each C++11 feature used in the code is
> too tedious, regardless of the build system, and it would really need a
> dedicated maintainer.

Certainly.  Rather than checking for everything, I think a code snippet
that just includes a few c++11-only headers would be sufficient and just
assume if those are there then you have a working c++11 std library.
That's all we're doing with the std=c++11 check anyways; i.e. we're
assuming that there's no need to specifically check for support of
range-for loops and lambda expressions separately.  Partial standard
implementation is less of an issue these days I think than in the early
C++11 years.

- Chuck

Re: [Mesa-dev] [PATCH 2/2] configure: enable the OpenCL ICD by default

2017-10-17 Thread Aaron Watry

On Mon, Oct 16, 2017 at 10:40 AM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> Nearly all the distributions* that build Mesa OpenCL enable the ICD,
> since building a non-ICD driver has the chance of conflicting with
> an existing OpenCL binary (libOpenCL.so).
>
> Furthermore, some applications expect the library to provide
> annotated/versioned symbols.
>
> https://lists.freedesktop.org/archives/mesa-dev/2017-September/171093.html
>
> *Fedora, Suse, Arch, Debian, Ubuntu, FreeBSD use the ICD
> Gentoo manages the conflicting files via eselect.
>
> Cc: Aaron Watry 

Series is Reviewed-By: Aaron Watry 

> Cc: Francisco Jerez 
> Cc: Matt Turner 
> Cc: Jan Vesely 
> Signed-off-by: Emil Velikov 
> ---
>  Makefile.am  | 1 +
>  configure.ac | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Makefile.am b/Makefile.am
> index a4f49d3d332..ec432b471d5 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -35,6 +35,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
> --enable-glx-tls \
> --enable-nine \
> --enable-opencl \
> +   --enable-opencl-icd \
> --enable-opengl \
> --enable-va \
> --enable-vdpau \
> diff --git a/configure.ac b/configure.ac
> index 62d33a1941c..9728675ccb2 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1269,9 +1269,9 @@ AC_ARG_ENABLE([opencl],
>  AC_ARG_ENABLE([opencl_icd],
> [AS_HELP_STRING([--enable-opencl-icd],
>[Build an OpenCL ICD library to be loaded by an ICD implementation
> -   @<:@default=disabled@:>@])],
> +   @<:@default=enabled@:>@])],
>  [enable_opencl_icd="$enableval"],
> -[enable_opencl_icd=no])
> +[enable_opencl_icd=yes])
>
>  AC_ARG_ENABLE([gallium-tests],
>  [AS_HELP_STRING([--enable-gallium-tests],
> --
> 2.14.1
>

Re: [Mesa-dev] [PATCH] egl/dri2: disambiguate driver name

2017-10-17 Thread Eric Engestrom

On Tuesday, 2017-10-17 15:26:00 +, Kai Wasserbäch wrote:
> So far the Mesa-internal EGL driver "dri2" returned "DRI2" as its driver
> name. This causes confusion, because there is a kernel interface by the
> same name where a version 3 is available.

What confusion? Do you have an example?

I'm not really opposed to a change, but I don't see the benefit,
especially of just adding this "MESA-" prefix...

(Nit: please spell it "Mesa", it's a name, not an acronym :)

> 
> This change attempts to make it clearer that the "dri2" name of the Mesa
> EGL driver has nothing to do with that kernel interface and is rather a
> Mesa internal versioning of an interface.
> 
> See eg. the thread beginning at
> 
> for an example.
> 
> Signed-off-by: Kai Wasserbäch 
> ---
>  src/egl/drivers/dri2/egl_dri2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index 77f09271f0..3991dfc84d 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -3260,7 +3260,7 @@ _eglBuiltInDriver(void)
> dri2_drv->base.API.GLInteropExportObject = dri2_interop_export_object;
> dri2_drv->base.API.DupNativeFenceFDANDROID = dri2_dup_native_fence_fd;
>  
> -   dri2_drv->base.Name = "DRI2";
> +   dri2_drv->base.Name = "MESA-DRI2";
>  
> return &dri2_drv->base;
>  }
> -- 
> 2.14.2
> 

Re: [Mesa-dev] [PATCH] egl/wayland: Support for KHR_partial_update

2017-10-17 Thread Harish Krupo

Hi Emil,

Thank you for the comments, will fix the code accordingly and send a
v2 patch.
I have answered the question inline.

Emil Velikov  writes:

> Hi Harish,
>
> Overall looks great, a few comments/questions inline.
>
> On 13 October 2017 at 19:49, Harish Krupo  wrote:
>> This passes 33/37 deqp tests related to partial_update, 4 are not
>> supported.
>>
>> Signed-off-by: Harish Krupo 
>> ---
>>  src/egl/drivers/dri2/platform_wayland.c | 68 
>> -
>>  1 file changed, 59 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/egl/drivers/dri2/platform_wayland.c 
>> b/src/egl/drivers/dri2/platform_wayland.c
>> index 14db55ca74..483d588b92 100644
>> --- a/src/egl/drivers/dri2/platform_wayland.c
>> +++ b/src/egl/drivers/dri2/platform_wayland.c
>> @@ -810,15 +810,39 @@ try_damage_buffer(struct dri2_egl_surface *dri2_surf,
>> }
>> return EGL_TRUE;
>>  }
>> +
>>  /**
>> - * Called via eglSwapBuffers(), drv->API.SwapBuffers().
>> + * Called via eglSetDamageRegionKHR(), drv->API.SetDamageRegion().
>>   */
>>  static EGLBoolean
>> -dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> - _EGLDisplay *disp,
>> - _EGLSurface *draw,
>> - const EGLint *rects,
>> - EGLint n_rects)
>> +wl_set_damage_region(_EGLDriver *drv,
> Yes, it might be a bit confusing, but please stay consistent and call
> this dri2_wl_set_damage_region.
>
>> + _EGLDisplay *dpy,
>> + _EGLSurface *surf,
>> + const EGLint *rects,
>> + EGLint n_rects)
>> +{
>> +   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
>> +
>> +   /* The spec doesn't mention what should be returned in case of
>> +* failure in setting the damage buffer with the window system, so
>> +* setting the damage to maximum surface area
>> +   */
>> +   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects)) {
>> +  wl_surface_damage(dri2_surf->wl_surface_wrapper,
>> +0, 0, INT32_MAX, INT32_MAX);
>> +  return EGL_TRUE;
> Drop this return - it's only confusing the reader.
>
>> +   }
>> +
>> +   return EGL_TRUE;
> ... since this should suffice.
>
>> +}
>> +
>> +static EGLBoolean
>> +dri2_wl_swap_buffers_common(_EGLDriver *drv,
>> +_EGLDisplay *disp,
>> +_EGLSurface *draw,
>> +const EGLint *rects,
>> +EGLint n_rects,
>> +EGLBoolean with_damage)
>>  {
>> struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
>> struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
>> @@ -876,7 +900,17 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> /* If the compositor doesn't support damage_buffer, we deliberately
>>  * ignore the damage region and post maximum damage, due to
>>  * https://bugs.freedesktop.org/78190 */
>> -   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
>> +
>> +   if (!with_damage) {
>> +
>> +  /* If called from swapBuffers, check if the damage region
>> +   * is already set, if not set to full damage
>> +   */
>> +  if (!dri2_surf->base.SetDamageRegionCalled)
>> + wl_surface_damage(dri2_surf->wl_surface_wrapper,
>> +   0, 0, INT32_MAX, INT32_MAX);
>> +   }
>> +   else if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
>>wl_surface_damage(dri2_surf->wl_surface_wrapper,
>>  0, 0, INT32_MAX, INT32_MAX);
>>
>> @@ -912,6 +946,20 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> return EGL_TRUE;
>>  }
>>
>> +/**
>> + * Called via eglSwapBuffers(), drv->API.SwapBuffers().
>> + */
>> +static EGLBoolean
>> +dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
>> + _EGLDisplay *disp,
>> + _EGLSurface *draw,
>> + const EGLint *rects,
>> + EGLint n_rects)
>> +{
>> +   return dri2_wl_swap_buffers_common(drv, disp, draw,
>> +  rects, n_rects, EGL_TRUE);
> Hmm, what will happen if the user calls:
>
>   eglSetDamageRegionKHR(dpy, surf, rects, n_rects)
>   ...
>   eglSwapBuffersWithDamageKHR(dpy, surf, NULL, 0)
>
>
> eglSwapBuffersWithDamageKHR should behave identically to eglSwapBuffers,
> correct?
> If so, the proposed code seems buggy - we might as well drop the
> explicit with_damage and derive it from rects/n_rects locally.
>

Yes, in the above case, the surface damage will be set to the full
surface just as the spec defines it.
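
For reference, the branch in the quoted hunk can be modelled as a small
pure function. This is only a sketch of the decision as it reads in the
patch above, not the driver code itself: choose_damage() and enum
damage_action are hypothetical names, and damage_buffer_ok stands in for
the return value of try_damage_buffer().

```c
#include <assert.h>
#include <stdbool.h>

enum damage_action {
   DAMAGE_NONE,  /* keep the region posted by eglSetDamageRegionKHR() */
   DAMAGE_FULL,  /* post 0, 0, INT32_MAX, INT32_MAX */
   DAMAGE_RECTS, /* the rectangles were posted via damage_buffer */
};

/* Mirrors the if/else chain in dri2_wl_swap_buffers_common() from the
 * quoted hunk. */
static enum damage_action
choose_damage(bool with_damage, bool set_damage_region_called,
              int n_rects, bool damage_buffer_ok)
{
   assert(n_rects >= 0);

   /* Plain eglSwapBuffers(): only fall back to full damage if the
    * application never set a damage region. */
   if (!with_damage)
      return set_damage_region_called ? DAMAGE_NONE : DAMAGE_FULL;

   /* eglSwapBuffersWithDamageKHR(): no rectangles, or a compositor
    * without damage_buffer support, means full damage. */
   if (n_rects == 0 || !damage_buffer_ok)
      return DAMAGE_FULL;

   return DAMAGE_RECTS;
}
```

With with_damage set and n_rects equal to 0 the model returns DAMAGE_FULL
whether or not eglSetDamageRegionKHR() was called earlier, which matches
the answer above.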
Now consider this scenario:

   eglSetDamageRegionKHR(dpy, surf, rects, n_rects)
   ...
   eglSwapBuffers(dpy, surf)
   (from dEQP)

In this case if we were to distinguish based on n_rects then we would
end up setting the buffer

Re: [Mesa-dev] [PATCH] egl/wayland: Support for KHR_partial_update

2017-10-17 Thread Eric Engestrom

On 17 October 2017 17:36:06 BST, Harish Krupo  
wrote:
> Hi Emil,
> 
> Thank you for the comments, will fix the code accordingly and send a
> v2 patch.
> I have answered the question inline.
> 
> Emil Velikov  writes:
> 
> > Hi Harish,
> >
> > Overall looks great, a few comments/questions inline.
> >
> > On 13 October 2017 at 19:49, Harish Krupo
>  wrote:
> >> This passes 33/37 deqp tests related to partial_update, 4 are not
> >> supported.
> >>
> >> Signed-off-by: Harish Krupo 
> >> ---
> >>  src/egl/drivers/dri2/platform_wayland.c | 68
> -
> >>  1 file changed, 59 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/src/egl/drivers/dri2/platform_wayland.c
> b/src/egl/drivers/dri2/platform_wayland.c
> >> index 14db55ca74..483d588b92 100644
> >> --- a/src/egl/drivers/dri2/platform_wayland.c
> >> +++ b/src/egl/drivers/dri2/platform_wayland.c
> >> @@ -810,15 +810,39 @@ try_damage_buffer(struct dri2_egl_surface
> *dri2_surf,
> >> }
> >> return EGL_TRUE;
> >>  }
> >> +
> >>  /**
> >> - * Called via eglSwapBuffers(), drv->API.SwapBuffers().
> >> + * Called via eglSetDamageRegionKHR(), drv->API.SetDamageRegion().
> >>   */
> >>  static EGLBoolean
> >> -dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
> >> - _EGLDisplay *disp,
> >> - _EGLSurface *draw,
> >> - const EGLint *rects,
> >> - EGLint n_rects)
> >> +wl_set_damage_region(_EGLDriver *drv,
> > Yes, it might be a bit confusing, but please stay consistent and
> call
> > this dri2_wl_set_damage_region.
> >
> >> + _EGLDisplay *dpy,
> >> + _EGLSurface *surf,
> >> + const EGLint *rects,
> >> + EGLint n_rects)
> >> +{
> >> +   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
> >> +
> >> +   /* The spec doesn't mention what should be returned in case of
> >> +* failure in setting the damage buffer with the window system,
> so
> >> +* setting the damage to maximum surface area
> >> +   */
> >> +   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
> {
> >> +  wl_surface_damage(dri2_surf->wl_surface_wrapper,
> >> +0, 0, INT32_MAX, INT32_MAX);
> >> +  return EGL_TRUE;
> > Drop this return - it's only confusing the reader.
> >
> >> +   }
> >> +
> >> +   return EGL_TRUE;
> > ... since this should suffice.
> >
> >> +}
> >> +
> >> +static EGLBoolean
> >> +dri2_wl_swap_buffers_common(_EGLDriver *drv,
> >> +_EGLDisplay *disp,
> >> +_EGLSurface *draw,
> >> +const EGLint *rects,
> >> +EGLint n_rects,
> >> +EGLBoolean with_damage)
> >>  {
> >> struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
> >> struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
> >> @@ -876,7 +900,17 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver
> *drv,
> >> /* If the compositor doesn't support damage_buffer, we
> deliberately
> >>  * ignore the damage region and post maximum damage, due to
> >>  * https://bugs.freedesktop.org/78190 */
> >> -   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
> >> +
> >> +   if (!with_damage) {
> >> +
> >> +  /* If called from swapBuffers, check if the damage region
> >> +   * is already set, if not set to full damage
> >> +   */
> >> +  if (!dri2_surf->base.SetDamageRegionCalled)
> >> + wl_surface_damage(dri2_surf->wl_surface_wrapper,
> >> +   0, 0, INT32_MAX, INT32_MAX);
> >> +   }
> >> +   else if (!n_rects || !try_damage_buffer(dri2_surf, rects,
> n_rects))
> >>wl_surface_damage(dri2_surf->wl_surface_wrapper,
> >>  0, 0, INT32_MAX, INT32_MAX);
> >>
> >> @@ -912,6 +946,20 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver
> *drv,
> >> return EGL_TRUE;
> >>  }
> >>
> >> +/**
> >> + * Called via eglSwapBuffers(), drv->API.SwapBuffers().
> >> + */
> >> +static EGLBoolean
> >> +dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
> >> + _EGLDisplay *disp,
> >> + _EGLSurface *draw,
> >> + const EGLint *rects,
> >> + EGLint n_rects)
> >> +{
> >> +   return dri2_wl_swap_buffers_common(drv, disp, draw,
> >> +  rects, n_rects, EGL_TRUE);
> > Hmm, what will happen if the user calls:
> >
> >   eglSetDamageRegionKHR(dpy, surf, rects, n_rects)
> >   ...
> >   eglSwapBuffersWithDamageKHR(dpy, surf, NULL, 0)
> >
> >
> > eglSwapBuffersWithDamageKHR should behave identically to
> eglSwapBuffers, correct?
> > If so, the proposed code seems buggy - we might as well drop the
> > explicit with_damage and derive it from rects/n_rects locally.
> >
> 
> Yes, in the ab
