date:20160906

Re: [Mesa-dev] [PATCH 2/6] intel/isl: Fix up asserts in calc_phys_level0_extent_sa

2016-09-06 Thread Pohjolainen, Topi

On Fri, Sep 02, 2016 at 03:50:43PM -0700, Jason Ekstrand wrote:
> First, compressed 1D textures should be allowed.  There's nothing in the
> Vulkan spec (or in GL as far as I can remember) that disallows them.  It
> just wastes a bit of vertical space because it's only one pixel tall.
> 
> Second, the assertion that a format is uncompressed in the multisample
> layouts isn't quite right either.  What we really want to assert is that
> the format supports multisampling, which is a slightly more complicated query.

Reviewed-by: Topi Pohjolainen 
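
To illustrate the first point, here is a minimal sketch (not ISL code) of
why a compressed 1D texture is legal but wastes rows. The struct and
function names are hypothetical, and the 4x4 block size used in the usage
note is an assumption that matches formats such as BC1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a block-compressed format. */
struct fmt_block {
   uint32_t bw; /* block width in pixels */
   uint32_t bh; /* block height in pixels */
};

/* Round a pixel extent up to a whole number of blocks. */
static uint32_t
blocks_for(uint32_t px, uint32_t block_px)
{
   assert(block_px > 0);
   return (px + block_px - 1) / block_px;
}

/* Pixel rows left unused per block row by a 1D texture (height 1). */
static uint32_t
wasted_rows_1d(struct fmt_block f)
{
   return blocks_for(1, f.bh) * f.bh - 1;
}
```

With a 4x4 block, a 16 pixel wide 1D texture occupies four blocks in a
single block row, and three of the four pixel rows in that row are padding.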

> ---
>  src/intel/isl/isl.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index c460ddb..f8f5802 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -518,7 +518,6 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
>assert(info->height == 1);
>assert(info->depth == 1);
>assert(info->samples == 1);
> -  assert(!isl_format_is_compressed(info->format));
>  
>switch (dim_layout) {
>case ISL_DIM_LAYOUT_GEN4_3D:
> @@ -558,7 +557,7 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
>case ISL_MSAA_LAYOUT_ARRAY:
>   assert(info->depth == 1);
>   assert(info->levels == 1);
> - assert(!isl_format_is_compressed(info->format));
> + assert(isl_format_supports_multisampling(dev->info, info->format));
>  
>   *phys_level0_sa = (struct isl_extent4d) {
>  .w = info->width,
> @@ -571,7 +570,7 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
>case ISL_MSAA_LAYOUT_INTERLEAVED:
>   assert(info->depth == 1);
>   assert(info->levels == 1);
> - assert(!isl_format_is_compressed(info->format));
> + assert(isl_format_supports_multisampling(dev->info, info->format));
>  
>   *phys_level0_sa = (struct isl_extent4d) {
>  .w = info->width,
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 97549] [SNB, BXT] up to 40% perf drop from "loader/dri3: Overhaul dri3_update_num_back" commit

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97549

Michel Dänzer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #9 from Michel Dänzer  ---
Thanks for the information and testing! Fix pushed to master:

Module: Mesa
Branch: master
Commit: dc3bb5db8c81e7f08ae12ea5d3ee999e2afcbfd1
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=dc3bb5db8c81e7f08ae12ea5d3ee999e2afcbfd1

Author: Michel Dänzer 
Date:   Tue Sep  6 11:34:49 2016 +0900

loader/dri3: Always use at least two back buffers

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [Mesa-announce] Mesa 12.0.2 release candidate

2016-09-06 Thread Michel Dänzer

On 02/09/16 06:12 PM, Emil Velikov wrote:
> On 2 September 2016 at 03:26, Michel Dänzer  wrote:
>> On 01/09/16 11:25 PM, Emil Velikov wrote:
>>> Hello list,
>>>
>>> The candidate for the Mesa 12.0.2 is now available. Currently we have:
>>>  - 160 queued
>>>  - 9 nominated (outstanding)
>>>  - and 1 rejected patches
>>
>> [...]
>>
>>> Mesa stable queue
>>> -----------------
>>>
>>> Nominated (9)
>>> =============
>> [...]
>>> Michel Dänzer (1):
>>>   loader/dri3: Overhaul dri3_update_num_back
>>
>> FWIW, it's better to hold off on this one until
>> https://bugs.freedesktop.org/show_bug.cgi?id=97549 is resolved.
>>
> Indeed. I had it queued actually and pulled it out as I saw that bug.

Should be good to go for the next stable release(s) together with
dc3bb5db8c81 ("loader/dri3: Always use at least two back buffers").


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

[Mesa-dev] [PATCH] glsl: use hash instead of exec_list in copy propagation

2016-09-06 Thread Tapani Pälli

This change makes copy propagation pass faster. Complete link time
spent in test case attached to bug 94477 goes down to ~400 secs from
over 500 secs on my HSW machine. Does not fix the actual issue but
brings down the total. No regressions seen in CI.

Signed-off-by: Tapani Pälli 
---

Next I'll attempt to make similar change to opt_copy_propagation_elements.
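
For readers who want the idea without the Mesa types, here is a minimal
standalone sketch of the change in lookup strategy: a pointer-keyed map
replaces the linear walk over a list of lhs/rhs pairs. It is not the Mesa
hash table; acp_map, acp_insert and acp_search are hypothetical names, and
the fixed-size table without deletion is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACP_SLOTS 64 /* power of two; fixed size keeps the sketch short */

struct acp_map {
   const void *key[ACP_SLOTS];   /* lhs variable, NULL when the slot is free */
   const void *value[ACP_SLOTS]; /* rhs variable to substitute */
};

static size_t
acp_slot(const void *key)
{
   /* Drop the low alignment bits before masking. */
   return ((uintptr_t) key >> 3) & (ACP_SLOTS - 1);
}

/* Insert or overwrite lhs -> rhs using linear probing.
 * Returns 0 on success, -1 if the table is full. */
static int
acp_insert(struct acp_map *m, const void *lhs, const void *rhs)
{
   size_t i = acp_slot(lhs);
   for (size_t n = 0; n < ACP_SLOTS; n++, i = (i + 1) & (ACP_SLOTS - 1)) {
      if (m->key[i] == NULL || m->key[i] == lhs) {
         m->key[i] = lhs;
         m->value[i] = rhs;
         return 0;
      }
   }
   return -1;
}

/* Look up the copy source for var; NULL when no copy is available. */
static const void *
acp_search(const struct acp_map *m, const void *var)
{
   size_t i = acp_slot(var);
   for (size_t n = 0; n < ACP_SLOTS; n++, i = (i + 1) & (ACP_SLOTS - 1)) {
      if (m->key[i] == NULL)
         return NULL;
      if (m->key[i] == var)
         return m->value[i];
   }
   return NULL;
}
```

Each dereference then costs an expected constant-time probe instead of a
scan over every available copy, which is where the link-time saving
would come from.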

 src/compiler/glsl/opt_copy_propagation.cpp | 92 +-
 1 file changed, 41 insertions(+), 51 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp
index 443905d..db2595d 100644
--- a/src/compiler/glsl/opt_copy_propagation.cpp
+++ b/src/compiler/glsl/opt_copy_propagation.cpp
@@ -37,25 +37,10 @@
 #include "ir_basic_block.h"
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
+#include "util/hash_table.h"
 
 namespace {
 
-class acp_entry : public exec_node
-{
-public:
-   acp_entry(ir_variable *lhs, ir_variable *rhs)
-   {
-  assert(lhs);
-  assert(rhs);
-  this->lhs = lhs;
-  this->rhs = rhs;
-   }
-
-   ir_variable *lhs;
-   ir_variable *rhs;
-};
-
-
 class kill_entry : public exec_node
 {
 public:
@@ -74,7 +59,8 @@ public:
{
   progress = false;
   mem_ctx = ralloc_context(0);
-  this->acp = new(mem_ctx) exec_list;
+  acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+_mesa_key_pointer_equal);
   this->kills = new(mem_ctx) exec_list;
   killed_all = false;
}
@@ -96,8 +82,8 @@ public:
void kill(ir_variable *ir);
void handle_if_block(exec_list *instructions);
 
-   /** List of acp_entry: The available copies to propagate */
-   exec_list *acp;
+   /** Hash of lhs->rhs: The available copies to propagate */
+   hash_table *acp;
/**
 * List of kill_entry: The variables whose values were killed in this
 * block.
@@ -120,17 +106,18 @@ ir_copy_propagation_visitor::visit_enter(ir_function_signature *ir)
 * block.  Any instructions at global scope will be shuffled into
 * main() at link time, so they're irrelevant to us.
 */
-   exec_list *orig_acp = this->acp;
+   hash_table *orig_acp = this->acp;
exec_list *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   this->acp = new(mem_ctx) exec_list;
+   acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+ _mesa_key_pointer_equal);
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
 
visit_list_elements(this, &ir->body);
 
-   ralloc_free(this->acp);
+   _mesa_hash_table_destroy(acp, NULL);
ralloc_free(this->kills);
 
this->kills = orig_kills;
@@ -170,14 +157,10 @@ ir_copy_propagation_visitor::visit(ir_dereference_variable *ir)
if (this->in_assignee)
   return visit_continue;
 
-   ir_variable *var = ir->var;
-
-   foreach_in_list(acp_entry, entry, this->acp) {
-  if (var == entry->lhs) {
-ir->var = entry->rhs;
-this->progress = true;
-break;
-  }
+   struct hash_entry *entry = _mesa_hash_table_search(acp, ir->var);
+   if (entry) {
+  ir->var = (ir_variable *) entry->data;
+  progress = true;
}
 
return visit_continue;
@@ -201,7 +184,7 @@ ir_copy_propagation_visitor::visit_enter(ir_call *ir)
/* Since we're unlinked, we don't (necessarily) know the side effects of
 * this call.  So kill all copies.
 */
-   acp->make_empty();
+   _mesa_hash_table_clear(acp, NULL);
this->killed_all = true;
 
return visit_continue_with_parent;
@@ -210,28 +193,30 @@ ir_copy_propagation_visitor::visit_enter(ir_call *ir)
 void
 ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
 {
-   exec_list *orig_acp = this->acp;
+   hash_table *orig_acp = this->acp;
exec_list *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   this->acp = new(mem_ctx) exec_list;
+   acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+ _mesa_key_pointer_equal);
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
 
/* Populate the initial acp with a copy of the original */
-   foreach_in_list(acp_entry, a, orig_acp) {
-  this->acp->push_tail(new(this->acp) acp_entry(a->lhs, a->rhs));
+   struct hash_entry *entry;
+   hash_table_foreach(orig_acp, entry) {
+  _mesa_hash_table_insert(acp, entry->key, entry->data);
}
 
visit_list_elements(this, instructions);
 
if (this->killed_all) {
-  orig_acp->make_empty();
+  _mesa_hash_table_clear(orig_acp, NULL);
}
 
exec_list *new_kills = this->kills;
this->kills = orig_kills;
-   ralloc_free(this->acp);
+   _mesa_hash_table_destroy(acp, NULL);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
 
@@ -257,30 +242,31 @@ ir_copy_propagation_visitor::visit_enter(ir_if *ir)
 void
 ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
 {
-   exec_l

Re: [Mesa-dev] [PATCH] main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer

2016-09-06 Thread Nicolai Hähnle


Reviewed-by: Nicolai Hähnle 

On 04.09.2016 02:24, Ilia Mirkin wrote:

Add a separate extension check for that format. Prevents glTexImage from
trying to find a matching format, which fails on drivers without support
for this format.

Fixes: sized-texture-format-channels (on a3xx)
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---

Technically we should be exposing it on a3xx, as it's required by ES 3.0.
However I've thus far failed to do so in a way that actually works for all
the necessary use-cases.
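
The shape of the fix can be sketched in isolation as follows. This is a
simplified model, not the glformats.c code: the enums, the extensions
struct and BASE_NONE are hypothetical stand-ins for the GL enums, the
ctx->Extensions flags and the "no matching format" result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL enums. */
enum fmt { FMT_RGBA8UI, FMT_RGB10_A2UI };
enum base_fmt { BASE_NONE, BASE_RGBA };

struct extensions {
   bool texture_integer;    /* models EXT_texture_integer */
   bool texture_rgb10_a2ui; /* models ARB_texture_rgb10_a2ui */
};

static enum base_fmt
base_tex_format(const struct extensions *ext, enum fmt f)
{
   /* RGB10_A2UI is gated by its own extension flag... */
   if (ext->texture_rgb10_a2ui && f == FMT_RGB10_A2UI)
      return BASE_RGBA;

   /* ...and no longer rides along with the integer texture formats. */
   if (ext->texture_integer && f == FMT_RGBA8UI)
      return BASE_RGBA;

   return BASE_NONE; /* format not supported by this context */
}
```

A driver that exposes only the integer texture extension then reports
RGB10_A2UI as unsupported up front instead of failing later when no
matching hardware format is found.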

 src/mesa/main/glformats.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 0c2210a..85d00af 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2518,7 +2518,6 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)
   case GL_RGBA8I_EXT:
   case GL_RGBA16I_EXT:
   case GL_RGBA32I_EXT:
-  case GL_RGB10_A2UI:
  return GL_RGBA;
   case GL_RGB8UI_EXT:
   case GL_RGB16UI_EXT:
@@ -2530,6 +2529,13 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)
   }
}

+   if (ctx->Extensions.ARB_texture_rgb10_a2ui) {
+  switch (internalFormat) {
+  case GL_RGB10_A2UI:
+ return GL_RGBA;
+  }
+   }
+
if (ctx->Extensions.EXT_texture_integer) {
   switch (internalFormat) {
   case GL_ALPHA8UI_EXT:



[Mesa-dev] [v2 5/6] isl/gen8+: Allow 1D and 3D auxiliary surfaces

2016-09-06 Thread Topi Pohjolainen

Otherwise, once the mcs buffer gets allocated without delay for lossless
compression (as we already do for msaa), an assert starts to fire in
the piglit test tex3d. The test uses a depth of one, which is in fact
already supported.

v2 (Jason): Allow also 1D case as there is nothing in the specs
constraining it either.

Signed-off-by: Topi Pohjolainen 
---
 src/intel/isl/isl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index c7639d0..3dfdf20 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1329,7 +1329,8 @@ isl_surf_get_ccs_surf(const struct isl_device *dev,
assert(surf->samples == 1 && surf->msaa_layout == ISL_MSAA_LAYOUT_NONE);
assert(ISL_DEV_GEN(dev) >= 7);
 
-   assert(surf->dim == ISL_SURF_DIM_2D);
+   assert(ISL_DEV_GEN(dev) >= 8 || surf->dim == ISL_SURF_DIM_2D);
+
assert(surf->logical_level0_px.depth == 1);
 
/* TODO: More conditions where it can fail. */
-- 
2.5.5


[Mesa-dev] [v2 4/6] i965/rbc: Set aux surface sampling engine according to rb settings

2016-09-06 Thread Topi Pohjolainen

Once the mcs buffer gets allocated without delay for lossless
compression (as we already do for msaa), one gets a regression in:

GL45-CTS.texture_barrier_ARB.same-texel-rw

Setting the auxiliary surface for both sampling engine and data
port seems to fix this. I haven't found any hardware documentation
backing this though.

v2 (Jason): Also prepare for the case where the surface is sampled with
a non-compressible format, which forces rendering without
compression as well.

Signed-off-by: Topi Pohjolainen 
---
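
As a reading aid, the decision this patch introduces can be modelled as a
pure function over a handful of booleans. This is only a sketch: the
aux_query struct and its field names are hypothetical, and the assertions
of the real helper are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, flattened inputs; the real code derives these from the
 * miptree, the view and the currently bound framebuffer. */
struct aux_query {
   bool has_mcs;               /* an auxiliary (mcs) buffer exists */
   bool is_render_target;      /* view is used as a render target */
   bool aux_disabled;          /* renderbuffer was flagged aux-disabled */
   bool surf_lossless;         /* surface is lossless compressed */
   bool view_fmt_compressible; /* view format supports lossless compression */
   bool bound_as_draw_buffer;  /* surface is also a current draw buffer */
   bool resolved;              /* fast clear state is RESOLVED */
};

static bool
needs_aux_surface(const struct aux_query *q)
{
   if (!q->has_mcs)
      return false;

   /* Rendering uses the aux buffer unless it was explicitly disabled. */
   if (q->is_render_target && !q->aux_disabled)
      return true;

   /* Sampling with a format the sampler cannot treat as compressed. */
   if (q->surf_lossless && !q->view_fmt_compressible)
      return false;

   /* Sampled and written in the same pass: keep both set up alike. */
   if (q->surf_lossless && q->bound_as_draw_buffer)
      return true;

   return !q->resolved;
}
```

The only behavioural difference from the old inline condition is in the
two middle branches, which cover a surface that is sampled and rendered in
the same pass.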
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 74 +++-
 1 file changed, 71 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index af102a9..05b214f 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -77,6 +77,76 @@ static const struct surface_state_info surface_state_infos[] = {
[9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
 };
 
+static unsigned
+brw_find_matching_rb(const struct gl_framebuffer *fb,
+ const struct intel_mipmap_tree *mt)
+{
+   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
+  const struct intel_renderbuffer *irb =
+ intel_renderbuffer(fb->_ColorDrawBuffers[i]);
+
+  if (irb->mt == mt)
+ return i;
+   }
+
+   return fb->_NumColorDrawBuffers;
+}
+
+static bool
+brw_needs_aux_surface(const struct brw_context *brw,
+  const struct intel_mipmap_tree *mt, int flags,
+  const struct isl_view *view)
+{
+   if (!mt->mcs_mt)
+  return false;
+
+   if (view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT &&
+   !(flags & INTEL_RENDERBUFFER_AUX_DISABLED))
+  return true;
+
+   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
+   const bool is_lossless_compressed =
+  intel_miptree_is_lossless_compressed(brw, mt);
+   const bool view_format_lossless_compressed =
+   isl_format_supports_lossless_compression(brw->intelScreen->devinfo,
+view->format);
+   const unsigned rb_index = brw_find_matching_rb(fb, mt);
+
+   /* If the underlying surface is compressed but it is sampled using a
+* format that the sampling engine doesn't support as compressed, there
+* is no alternative but to treat the surface as non-compressed.
+*/
+   if (is_lossless_compressed && !view_format_lossless_compressed) {
+  /* Logic elsewhere needs to take care to resolve the color buffer prior
+   * to sampling it as non-compressed.
+   */
+  assert(mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_RESOLVED);
+
+  /* In practise it looks that setting the same lossless compressed
+   * surface to be sampled without auxiliary surface and to be written
+   * with auxiliary surface confuses the hardware. Therefore any
+   * corresponding renderbuffer must be set up with auxiliary buffer
+   * disabled.
+   */
+  assert(rb_index == fb->_NumColorDrawBuffers ||
+ brw->draw_aux_buffer_disabled[rb_index]);
+  return false;
+   }
+
+   /* In practise it looks that setting the same lossless compressed surface
+* to be sampled without auxiliary surface and to be written with auxiliary
+* surface confuses the hardware. Therefore sampler engine must be provided
+* with auxiliary buffer regardless of the fast clear state if the same
+* surface is also going to be written during the same rendering pass.
+*/
+   if (is_lossless_compressed && rb_index < fb->_NumColorDrawBuffers) {
+  assert(!brw->draw_aux_buffer_disabled[rb_index]);
+  return true;
+   }
+
+   return mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED;
+}
+
 static void
 brw_emit_surface_state(struct brw_context *brw,
struct intel_mipmap_tree *mt, int flags,
@@ -140,9 +210,7 @@ brw_emit_surface_state(struct brw_context *brw,
struct isl_surf *aux_surf = NULL, aux_surf_s;
uint64_t aux_offset = 0;
enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
-   if (mt->mcs_mt &&
-   ((view.usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
-mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED)) {
+   if (brw_needs_aux_surface(brw, mt, flags, &view)) {
   intel_miptree_get_aux_isl_surf(brw, mt, &aux_surf_s, &aux_usage);
   aux_surf = &aux_surf_s;
   assert(mt->mcs_mt->offset == 0);
-- 
2.5.5


[Mesa-dev] [v2 1/6] i965/rbc: Allow integer formats as advertised in isl_format.c

2016-09-06 Thread Topi Pohjolainen

Blorp consults brw_is_color_fast_clear_compatible() to see if any
restrictions apply for fast clear in addition to the capabilities
advertised in isl_format.c::format_info[]. On Gen8+ integer formats
are blacklisted for plain old fast clear but there is no reason why
lossless compression shouldn't be supported. In fact, lossless
compression of integer formats is already supported for normal
render paths.

This patch prepares for dropping the delayed allocation of the mcs
buffer for lossless compression. Until now, skipping the fast clear
also prevented the mcs from being allocated, which effectively turned
off lossless compression for integer formats.
Once the mcs buffer is allocated beforehand, the assertion addressed
here would start triggering.

v2: Drop the assert instead of relaxing it (Jason)
Fix typo while at it.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index c902f2e..b0fbb64 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -759,10 +759,9 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
 
   if (intel_miptree_is_lossless_compressed(brw, irb->mt)) {
  /* Compressed buffers can be cleared also using normal rep-clear. In
-  * such case they bahave such as if they were drawn using normal 3D
+  * such case they behave such as if they were drawn using normal 3D
   * render pipeline, and we simply mark the mcs as dirty.
   */
- assert(partial_clear);
  irb->mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_UNRESOLVED;
   }
}
-- 
2.5.5


[Mesa-dev] i965: Allocate mcs directly for lossless compressed

2016-09-06 Thread Topi Pohjolainen

This mini-series replaces patches 1-4/12 in "Hardware assisted
layered clears". This is based on Jason's feedback and some offline
discussion.

Mostly it tries to tackle cases where a render pass uses the
same surface for both texturing and rendering. If the surface
supports lossless compression, care needs to be taken that both uses
are set up the same way.

1) Sampling through a texture view using a format that the sampling
   engine doesn't understand as compressed:
   Mark the aux buffer disabled for the renderbuffer, resolve the color
   buffer, and set up both the tex and rb surfaces without an aux buffer
   (i.e., as non-compressed).

2) Sampling a compressed buffer which is resolved:
   The sampler needs to be set up with the auxiliary buffer if the same
   surface is also going to be written in the same rendering pass.

CC: Jason Ekstrand 

Topi Pohjolainen (6):
  i965/rbc: Allow integer formats as advertised in isl_format.c
  i965: Replace boolean rb surface state setup argument with flags
  i965: Track non-compressible sampling of renderbuffers
  i965/rbc: Set aux surface sampling engine according to rb settings
  isl/gen8+: Allow 1D and 3D auxiliary surfaces
  i965/rbc: Allocate mcs directly

 src/intel/isl/isl.c  |   3 +-
 src/mesa/drivers/dri/i965/brw_blorp.c|  13 +--
 src/mesa/drivers/dri/i965/brw_context.c  |  16 
 src/mesa/drivers/dri/i965/brw_context.h  |  12 ++-
 src/mesa/drivers/dri/i965/brw_draw.c |   4 +-
 src/mesa/drivers/dri/i965/brw_state.h|   2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 110 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  68 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|   7 +-
 9 files changed, 154 insertions(+), 81 deletions(-)

-- 
2.5.5


[Mesa-dev] [v2 3/6] i965: Track non-compressible sampling of renderbuffers

2016-09-06 Thread Topi Pohjolainen

Signed-off-by: Topi Pohjolainen 
---
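
Since this patch has no description, here is a small standalone sketch of
the tracking it adds: every color draw buffer that shares its buffer
object with the sampled texture gets flagged. The function name and the
use of plain pointers for buffer objects are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag each draw buffer that shares its buffer object with the sampled
 * texture. Returns how many draw buffers were flagged. */
static unsigned
mark_aux_disabled(const void *sampled_bo,
                  const void *const *draw_bos, unsigned num_draw,
                  bool *aux_disabled)
{
   unsigned count = 0;

   for (unsigned i = 0; i < num_draw; i++) {
      /* Same underlying buffer: rendering must go without compression. */
      aux_disabled[i] = (draw_bos[i] == sampled_bo);
      if (aux_disabled[i])
         count++;
   }

   return count;
}
```

Note that the flag is rewritten for every draw buffer, so buffers that do
not alias the texture are explicitly reset to false.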
 src/mesa/drivers/dri/i965/brw_context.c  | 16 
 src/mesa/drivers/dri/i965/brw_context.h  | 10 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index b880b4f..c5c6fdd 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -197,6 +197,22 @@ intel_texture_view_requires_resolve(struct brw_context *brw,
   _mesa_get_format_name(intel_tex->_Format),
   _mesa_get_format_name(intel_tex->mt->format));
 
+   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
+   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
+  const struct intel_renderbuffer *irb =
+ intel_renderbuffer(fb->_ColorDrawBuffers[i]);
+
+  /* In case the same surface is also used for rendering one needs to
+   * disable the compression.
+   */
+  brw->draw_aux_buffer_disabled[i] = intel_tex->mt->bo == irb->mt->bo;
+
+  if (brw->draw_aux_buffer_disabled[i]) {
+ perf_debug("Sampling renderbuffer with non-compressible format - "
+"turning off compression");
+  }
+   }
+
return true;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 12ac8af..074d554 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1333,6 +1333,16 @@ struct brw_context
 
struct brw_fast_clear_state *fast_clear_state;
 
+   /* Array of flags telling if auxiliary buffer is disabled for corresponding
+* renderbuffer. If draw_aux_buffer_disabled[i] is set then use of
+* auxiliary buffer for gl_framebuffer::_ColorDrawBuffers[i] is
+* disabled.
+* This is needed in case the same underlying buffer is also configured
+* to be sampled but with a format that the sampling engine can't treat
+* compressed or fast cleared.
+*/
+   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
+
__DRIcontext *driContext;
struct intel_screen *intelScreen;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 073919e..af102a9 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -56,6 +56,7 @@
 
 enum {
INTEL_RENDERBUFFER_LAYERED = 1 << 0,
+   INTEL_RENDERBUFFER_AUX_DISABLED = 1 << 1,
 };
 
 struct surface_state_info {
@@ -194,6 +195,10 @@ brw_update_renderbuffer_surface(struct brw_context *brw,
struct intel_renderbuffer *irb = intel_renderbuffer(rb);
struct intel_mipmap_tree *mt = irb->mt;
 
+   if (brw->gen < 9) {
+  assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
+   }
+
assert(brw_render_target_supported(brw, rb));
intel_miptree_used_for_rendering(mt);
 
@@ -885,6 +890,7 @@ gen4_update_renderbuffer_surface(struct brw_context *brw,
/* BRW_NEW_FS_PROG_DATA */
 
assert(!(flags & INTEL_RENDERBUFFER_LAYERED));
+   assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
 
if (rb->TexImage && !brw->has_surface_tile_offset) {
   intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y);
@@ -987,8 +993,10 @@ brw_update_renderbuffer_surfaces(struct brw_context *brw,
if (fb->_NumColorDrawBuffers >= 1) {
   for (i = 0; i < fb->_NumColorDrawBuffers; i++) {
  const uint32_t surf_index = render_target_start + i;
- const int flags =
-_mesa_geometric_layers(fb) > 0 ? INTEL_RENDERBUFFER_LAYERED : 0;
+ const int flags = (_mesa_geometric_layers(fb) > 0 ?
+  INTEL_RENDERBUFFER_LAYERED : 0) |
+   (brw->draw_aux_buffer_disabled[i] ? 
+  INTEL_RENDERBUFFER_AUX_DISABLED : 0);
 
 if (intel_renderbuffer(fb->_ColorDrawBuffers[i])) {
 surf_offset[surf_index] =
-- 
2.5.5


[Mesa-dev] [v2 2/6] i965: Replace boolean rb surface state setup argument with flags

2016-09-06 Thread Topi Pohjolainen

And add plumbing to provide it all the way to surface state emitter.
This is not used yet but will be in subsequent patches to carry
additional constraints.

Signed-off-by: Topi Pohjolainen 
---
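
The motivation for switching from a bool to flags can be shown with a
minimal sketch. The names below are hypothetical; the second bit stands
for the kind of additional constraint a later patch could carry through
the same argument.

```c
#include <assert.h>
#include <stdbool.h>

enum rb_flags {
   RB_LAYERED      = 1 << 0,
   RB_AUX_DISABLED = 1 << 1, /* example of a later addition */
};

/* Pack independent properties into one int argument. */
static int
rb_make_flags(bool layered, bool aux_disabled)
{
   return (layered ? RB_LAYERED : 0) |
          (aux_disabled ? RB_AUX_DISABLED : 0);
}

/* Callees test only the bits they care about. */
static bool
rb_is_layered(int flags)
{
   return (flags & RB_LAYERED) != 0;
}
```

Adding a new property then needs a new enum value only, not another
parameter in every function signature along the call chain.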
 src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
 src/mesa/drivers/dri/i965/brw_state.h|  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 28 +++-
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index e7c90b7..12ac8af 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -747,7 +747,7 @@ struct brw_context
{
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
   struct gl_renderbuffer *rb,
-  bool layered, unsigned unit,
+  int flags, unsigned unit,
   uint32_t surf_index);
   void (*emit_null_surface_state)(struct brw_context *brw,
   unsigned width,
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index bfcdf29..1d370c3 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -288,7 +288,7 @@ void brw_update_texture_surface(struct gl_context *ctx,
 
 uint32_t brw_update_renderbuffer_surface(struct brw_context *brw,
  struct gl_renderbuffer *rb,
- bool layered, unsigned unit,
+ int flags, unsigned unit,
  uint32_t surf_index);
 
 void brw_update_renderbuffer_surfaces(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c347b5d..073919e 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -54,6 +54,10 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
+enum {
+   INTEL_RENDERBUFFER_LAYERED = 1 << 0,
+};
+
 struct surface_state_info {
unsigned num_dwords;
unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes */
@@ -74,7 +78,7 @@ static const struct surface_state_info surface_state_infos[] = {
 
 static void
 brw_emit_surface_state(struct brw_context *brw,
-   struct intel_mipmap_tree *mt,
+   struct intel_mipmap_tree *mt, int flags,
GLenum target, struct isl_view view,
uint32_t mocs, uint32_t *surf_offset, int surf_index,
unsigned read_domains, unsigned write_domains)
@@ -183,7 +187,7 @@ brw_emit_surface_state(struct brw_context *brw,
 uint32_t
 brw_update_renderbuffer_surface(struct brw_context *brw,
 struct gl_renderbuffer *rb,
-bool layered, unsigned unit /* unused */,
+int flags, unsigned unit /* unused */,
 uint32_t surf_index)
 {
struct gl_context *ctx = &brw->ctx;
@@ -220,7 +224,7 @@ brw_update_renderbuffer_surface(struct brw_context *brw,
};
 
uint32_t offset;
-   brw_emit_surface_state(brw, mt, mt->target, view,
+   brw_emit_surface_state(brw, mt, flags, mt->target, view,
   surface_state_infos[brw->gen].rb_mocs,
   &offset, surf_index,
   I915_GEM_DOMAIN_RENDER,
@@ -533,7 +537,8 @@ brw_update_texture_surface(struct gl_context *ctx,
   obj->Target == GL_TEXTURE_CUBE_MAP_ARRAY)
  view.usage |= ISL_SURF_USAGE_CUBE_BIT;
 
-  brw_emit_surface_state(brw, mt, mt->target, view,
+  const int flags = 0;
+  brw_emit_surface_state(brw, mt, flags, mt->target, view,
  surface_state_infos[brw->gen].tex_mocs,
  surf_offset, surf_index,
  I915_GEM_DOMAIN_SAMPLER, 0);
@@ -865,7 +870,7 @@ brw_emit_null_surface_state(struct brw_context *brw,
 static uint32_t
 gen4_update_renderbuffer_surface(struct brw_context *brw,
  struct gl_renderbuffer *rb,
- bool layered, unsigned unit,
+ int flags, unsigned unit,
  uint32_t surf_index)
 {
struct gl_context *ctx = &brw->ctx;
@@ -879,7 +884,7 @@ gen4_update_renderbuffer_surface(struct brw_context *brw,
mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
/* BRW_NEW_FS_PROG_DATA */
 
-   assert(!layered);
+   assert(!(flags & INTEL_RENDERBUFFER_LAYERED));
 
if (rb->TexImage && !brw->has_surface_tile_offset) {
   intel_renderbuffer_get_tile

[Mesa-dev] [v2 6/6] i965/rbc: Allocate mcs directly

2016-09-06 Thread Topi Pohjolainen

such as we do for compressed msaa. In case of non-compressed single
sampled buffers the allocation of the mcs is deferred until there is
actually a clear operation that needs it.
In case of render buffer compression the mcs buffer is always needed
and there is no real reason to defer the allocation. Allocating it
directly allows dropping quite a bit of unnecessary complexity.

Patch leaves brw_predraw_set_aux_buffers() a no-op. Subsequent
patches will re-use it and it seemed cleaner to leave it instead
of removing and re-introducing.

Signed-off-by: Topi Pohjolainen 
---
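
The allocation policy described above can be sketched with a toy model.
The struct and function names are hypothetical; the point is only the
split between allocation at creation time for compressed trees and lazy
allocation at the first fast clear for everything else.

```c
#include <assert.h>
#include <stdbool.h>

struct tree {
   bool lossless_compressed;
   bool has_mcs;
};

/* Creation: allocate the mcs right away when compression will be used. */
static void
tree_create(struct tree *t, bool supports_lossless, bool rbc_disabled)
{
   t->lossless_compressed = supports_lossless && !rbc_disabled;
   t->has_mcs = t->lossless_compressed;
}

/* Fast clear: only non-compressed trees still allocate lazily here. */
static void
tree_prepare_fast_clear(struct tree *t)
{
   if (!t->has_mcs) {
      assert(!t->lossless_compressed);
      t->has_mcs = true;
   }
}
```

With this split, draw-time code no longer needs a separate "prepare mcs"
step for compressed surfaces.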
 src/mesa/drivers/dri/i965/brw_blorp.c | 10 ++--
 src/mesa/drivers/dri/i965/brw_draw.c  |  4 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 68 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  7 +--
 4 files changed, 26 insertions(+), 63 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index b0fbb64..fdaf429 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -287,8 +287,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
intel_miptree_slice_resolve_depth(brw, src_mt, src_level, src_layer);
intel_miptree_slice_resolve_depth(brw, dst_mt, dst_level, dst_layer);
 
-   intel_miptree_prepare_mcs(brw, dst_mt);
-
DBG("%s from %dx %s mt %p %d %d (%f,%f) (%f,%f)"
"to %dx %s mt %p %d %d (%f,%f) (%f,%f) (flip %d,%d)\n",
__func__,
@@ -689,6 +687,9 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
 !brw_is_color_fast_clear_compatible(brw, irb->mt, &ctx->Color.ClearColor))
   can_fast_clear = false;
 
+   const bool is_lossless_compressed = intel_miptree_is_lossless_compressed(
+  brw, irb->mt);
+
if (can_fast_clear) {
   /* Record the clear color in the miptree so that it will be
* programmed in SURFACE_STATE by later rendering and resolve
@@ -708,7 +709,8 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
* it now.
*/
   if (!irb->mt->mcs_mt) {
- if (!intel_miptree_alloc_non_msrt_mcs(brw, irb->mt)) {
+ assert(!is_lossless_compressed);
+ if (!intel_miptree_alloc_non_msrt_mcs(brw, irb->mt, false)) {
 /* MCS allocation failed--probably this will only happen in
  * out-of-memory conditions.  But in any case, try to recover
  * by falling back to a non-blorp clear technique.
@@ -757,7 +759,7 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
   clear_color, color_write_disable);
   blorp_batch_finish(&batch);
 
-  if (intel_miptree_is_lossless_compressed(brw, irb->mt)) {
+  if (is_lossless_compressed) {
  /* Compressed buffers can be cleared also using normal rep-clear. In
   * such case they behave such as if they were drawn using normal 3D
   * render pipeline, and we simply mark the mcs as dirty.
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 9b1e18c..cab67c9 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -409,8 +409,8 @@ brw_predraw_set_aux_buffers(struct brw_context *brw)
   struct intel_renderbuffer *irb =
  intel_renderbuffer(fb->_ColorDrawBuffers[i]);
 
-  if (irb) {
- intel_miptree_prepare_mcs(brw, irb->mt);
+  if (!irb) {
+ continue;
   }
}
 }
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7b97183..427657c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -789,6 +789,20 @@ intel_miptree_create(struct brw_context *brw,
intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
   mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
   assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1);
+
+  /* On Gen9+ clients are not currently capable of consuming compressed
+   * single-sampled buffers. Disabling compression allows us to skip
+   * resolves.
+   */
+  const bool lossless_compression_disabled = INTEL_DEBUG & DEBUG_NO_RBC;
+  const bool is_lossless_compressed =
+ unlikely(!lossless_compression_disabled) &&
+ brw->gen >= 9 && !mt->is_scanout &&
+ intel_miptree_supports_lossless_compressed(brw, mt);
+
+  if (is_lossless_compressed) {
+ intel_miptree_alloc_non_msrt_mcs(brw, mt, is_lossless_compressed);
+  }
}
 
return mt;
@@ -1563,7 +1577,8 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
 
 bool
 intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
- struct intel_mipmap_tree *mt)
+ struct intel_mipmap_tree *mt,
+ bool is_lossless_compres

Re: [Mesa-dev] [PATCH 4/4] radeonsi: skip redundant INDEX_TYPE writes

2016-09-06 Thread Nicolai Hähnle


For the series:

Reviewed-by: Nicolai Hähnle 

On 06.09.2016 00:46, Marek Olšák wrote:

From: Marek Olšák 

Ported from Vulkan.
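
The idea behind this patch can be illustrated outside the driver. The following is a minimal sketch, assuming a toy command stream: `struct cmd_stream`, `struct draw_ctx`, `ctx_begin_new_cs`, `emit_index_type` and the `0x2A` opcode are hypothetical stand-ins, not the radeonsi structures or packet encodings. It only shows the caching pattern: remember the last value written, skip the write when it is unchanged, and reset the cache to -1 whenever a new command stream begins.

```c
#include <assert.h>
#include <stdint.h>

#define CS_MAX_DW 64

/* Hypothetical stand-in for a hardware command stream. */
struct cmd_stream {
   uint32_t dw[CS_MAX_DW];
   unsigned count;
};

struct draw_ctx {
   struct cmd_stream cs;
   int last_index_size;   /* -1 means "unknown, must be re-emitted" */
};

/* A new command stream starts with unknown hardware state,
 * so the cached value is invalidated. */
static void ctx_begin_new_cs(struct draw_ctx *ctx)
{
   ctx->cs.count = 0;
   ctx->last_index_size = -1;
}

/* Emits the index type only when it differs from the last emitted one.
 * Returns the number of dwords written (0 when the write was skipped). */
static unsigned emit_index_type(struct draw_ctx *ctx, int index_size)
{
   if (index_size == ctx->last_index_size)
      return 0;   /* redundant write, skip it */

   assert(ctx->cs.count + 2 <= CS_MAX_DW);
   ctx->cs.dw[ctx->cs.count++] = 0x2A;   /* placeholder opcode */
   ctx->cs.dw[ctx->cs.count++] = (uint32_t)index_size;
   ctx->last_index_size = index_size;
   return 2;
}
```

Drawing twice with the same index size then costs one state write instead of two, while a call to `ctx_begin_new_cs` forces the next draw to emit the state again.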
---
 src/gallium/drivers/radeonsi/si_hw_context.c |  1 +
 src/gallium/drivers/radeonsi/si_pipe.h   |  1 +
 src/gallium/drivers/radeonsi/si_state_draw.c | 50 +---
 3 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index a03b327..24b0360 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -218,20 +218,21 @@ void si_begin_new_cs(struct si_context *ctx)
si_mark_atom_dirty(ctx, &ctx->b.viewports.atom);

r600_postflush_resume_features(&ctx->b);

assert(!ctx->b.gfx.cs->prev_dw);
ctx->b.initial_gfx_cs_size = ctx->b.gfx.cs->current.cdw;

/* Invalidate various draw states so that they are emitted before
 * the first draw call. */
si_invalidate_draw_sh_constants(ctx);
+   ctx->last_index_size = -1;
ctx->last_primitive_restart_en = -1;
ctx->last_restart_index = SI_RESTART_INDEX_UNKNOWN;
ctx->last_gs_out_prim = -1;
ctx->last_prim = -1;
ctx->last_multi_vgt_param = -1;
ctx->last_ls_hs_config = -1;
ctx->last_rast_prim = -1;
ctx->last_sc_line_stipple = ~0;
ctx->last_vtx_reuse_depth = -1;
ctx->emit_scratch_reloc = true;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 5c041ce..a648d86 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -300,20 +300,21 @@ struct si_context {
booldb_flush_depth_inplace;
booldb_flush_stencil_inplace;
booldb_depth_clear;
booldb_depth_disable_expclear;
booldb_stencil_clear;
booldb_stencil_disable_expclear;
unsignedps_db_shader_control;
boolocclusion_queries_disabled;

/* Emitted draw state. */
+   int last_index_size;
int last_base_vertex;
int last_start_instance;
int last_drawid;
int last_sh_base_reg;
int last_primitive_restart_en;
int last_restart_index;
int last_gs_out_prim;
int last_prim;
int last_multi_vgt_param;
int last_ls_hs_config;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index d4447a9..d7325ff 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -546,49 +546,59 @@ static void si_emit_draw_packets(struct si_context *sctx,
radeon_emit(cs, R_028B2C_VGT_STRMOUT_DRAW_OPAQUE_BUFFER_FILLED_SIZE 
>> 2);
radeon_emit(cs, 0); /* unused */

radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  t->buf_filled_size, RADEON_USAGE_READ,
  RADEON_PRIO_SO_FILLED_SIZE);
}

/* draw packet */
if (info->indexed) {
-   radeon_emit(cs, PKT3(PKT3_INDEX_TYPE, 0, 0));
-
-   /* index type */
-   switch (ib->index_size) {
-   case 1:
-   radeon_emit(cs, V_028A7C_VGT_INDEX_8);
-   break;
-   case 2:
-   radeon_emit(cs, V_028A7C_VGT_INDEX_16 |
-   (SI_BIG_ENDIAN && sctx->b.chip_class <= CIK 
?
-V_028A7C_VGT_DMA_SWAP_16_BIT : 0));
-   break;
-   case 4:
-   radeon_emit(cs, V_028A7C_VGT_INDEX_32 |
-   (SI_BIG_ENDIAN && sctx->b.chip_class <= CIK 
?
-V_028A7C_VGT_DMA_SWAP_32_BIT : 0));
-   break;
-   default:
-   assert(!"unreachable");
-   return;
+   if (ib->index_size != sctx->last_index_size) {
+   radeon_emit(cs, PKT3(PKT3_INDEX_TYPE, 0, 0));
+
+   /* index type */
+   switch (ib->index_size) {
+   case 1:
+   radeon_emit(cs, V_028A7C_VGT_INDEX_8);
+   break;
+   case 2:
+   radeon_emit(cs, V_028A7C_VGT_INDEX_16 |
+   (SI_BIG_ENDIAN && sctx->b.chip_class 
<= CI

Re: [Mesa-dev] [PATCH 3/4] radeonsi: add more unlikely() uses into si_draw_vbo

2016-09-06 Thread Gustaw Smolarczyk

2016-09-06 3:56 GMT+02:00 Ilia Mirkin :
> On Mon, Sep 5, 2016 at 9:54 PM, Michel Dänzer  wrote:
>> On 06/09/16 07:46 AM, Marek Olšák wrote:
>>> From: Marek Olšák 
>>
>> Did you measure any significant performance boost with this change?
>> Otherwise, using (un)likely can be bad because it can defeat the CPU's
>> branch prediction, which tends to be pretty good these days.
>
> Is there a way to affect the branch predictor on x86 with instruction
> encodings? I didn't think so. I was under the impression that all
> likely/unlikely did was to affect placement of the code, i.e. where
> the "if" code was placed.

If I may add to the discussion: there was a way to add branch
prediction hints to the instruction encoding (using x86 prefixes that
were to be ignored according to ISA), which was used by NetBurst
architecture (Pentium 4). It is no longer recognized by any modern
architecture and AFAIK compilers will not generate code that uses
them.

The compiler should be able to do two things using the (un)likely
hints (there might be more tricks I am not aware of):
1. Make the likely branch not jump. When the CPU executes a jump
without any branch prediction data cached for it, it assumes that the
jump is not taken.
2. Move the unlikely parts of code outside of a function or to the end
of a function. That improves instruction cache and fetch utilization
for the likely code.

In the end, it would be best to measure the performance of the (un)likely hints.
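
For readers unfamiliar with the hints being discussed, here is a minimal sketch of how such macros are commonly built on the GCC/Clang builtin `__builtin_expect`. The function `sum_values` is a hypothetical example, not code from the patch, and the macro definitions are an assumption about the usual pattern rather than a copy of Mesa's. The hint only tells the compiler which path to lay out as the fall-through; it does not encode anything for the CPU's branch predictor.

```c
#include <stddef.h>

/* Branch-layout hints; the fallback keeps the code portable
 * to compilers without __builtin_expect. */
#if defined(__GNUC__) || defined(__clang__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (x)
#  define unlikely(x) (x)
#endif

/* Hypothetical helper: sums an array and treats a NULL or empty
 * input as the rare case. */
static long sum_values(const int *values, size_t count)
{
   if (unlikely(values == NULL || count == 0))
      return 0;   /* cold path, may be moved out of the hot code */

   long total = 0;
   for (size_t i = 0; i < count; i++)
      total += values[i];
   return total;
}
```

Whether the hint helps at all depends on the compiler and the workload, which is why measuring remains the only reliable check.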

Regards,
Gustaw Smolarczyk

>
>   -ilia
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] radeonsi: add more unlikely() uses into si_draw_vbo

2016-09-06 Thread Marek Olšák

On Tue, Sep 6, 2016 at 3:54 AM, Michel Dänzer  wrote:
> On 06/09/16 07:46 AM, Marek Olšák wrote:
>> From: Marek Olšák 
>
> Did you measure any significant performance boost with this change?

I didn't measure anything.

> Otherwise, using (un)likely can be bad because it can defeat the CPU's
> branch prediction, which tends to be pretty good these days.

I'm not an expert on that, but it doesn't seem to be the case
according to other people's comments here.

Marek

[Mesa-dev] [PATCH 0/3] Make eglExportDMABUFImageMESA return corresponding offset.

2016-09-06 Thread Chuanbo Weng

This patchset makes eglExportDMABUFImageMESA return the corresponding offset
of the EGLImage instead of 0 on Intel platforms with the classic DRI driver (i965).

Chuanbo Weng (3):
  dri: add offset attribute and bump version of EGLImage extensions.
  egl: return corresponding offset of EGLImage instead of 0.
  i965: implement querying __DRI_IMAGE_ATTRIB_OFFSET.

 include/GL/internal/dri_interface.h  |  4 +++-
 src/egl/drivers/dri2/egl_dri2.c  | 12 +---
 src/mesa/drivers/dri/i965/intel_screen.c |  9 +++--
 3 files changed, 19 insertions(+), 6 deletions(-)

-- 
1.9.1


[Mesa-dev] [PATCH 2/3] egl: return corresponding offset of EGLImage instead of 0.

2016-09-06 Thread Chuanbo Weng

The offset should not always be 0. For example, if EGLImage is
created from a 2D texture with EGL_GL_TEXTURE_LEVEL=1, then the
offset should be the actual start of miplevel 1 in bo.
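
As an illustration of the behaviour described above, here is a minimal sketch with hypothetical names (`struct image`, `IMG_ATTRIB_*`, `query_image` and `export_offset` are assumptions, not the DRI or EGL interfaces). It shows the pattern the patch follows: ask the driver for the offset, write it out only if the query succeeds, and propagate the failure otherwise.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute tokens, not the __DRI_IMAGE_ATTRIB_* values. */
#define IMG_ATTRIB_STRIDE 1
#define IMG_ATTRIB_OFFSET 2

/* Hypothetical image: a view into a buffer object at some byte offset. */
struct image {
   int stride;
   int offset;   /* e.g. the start of miplevel 1 inside the bo */
};

/* Driver-side query: returns false for attributes it does not know. */
static bool query_image(const struct image *img, int attrib, int *value)
{
   switch (attrib) {
   case IMG_ATTRIB_STRIDE:
      *value = img->stride;
      return true;
   case IMG_ATTRIB_OFFSET:
      *value = img->offset;
      return true;
   default:
      return false;
   }
}

/* Loader-side export: reports the real offset instead of a hard-coded 0. */
static bool export_offset(const struct image *img, int *offsets)
{
   int img_offset = 0;

   if (!offsets)
      return true;   /* caller did not ask for offsets */

   if (!query_image(img, IMG_ATTRIB_OFFSET, &img_offset))
      return false;

   offsets[0] = img_offset;
   return true;
}
```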

Signed-off-by: Chuanbo Weng 
---
 src/egl/drivers/dri2/egl_dri2.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 859612f..8ef0acd 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2249,6 +2249,8 @@ dri2_export_dma_buf_image_mesa(_EGLDriver *drv, 
_EGLDisplay *disp, _EGLImage *im
struct dri2_egl_image *dri2_img = dri2_egl_image(img);
 
(void) drv;
+   EGLBoolean ret = EGL_TRUE;
+   EGLint img_offset = 0;
 
/* rework later to provide multiple fds/strides/offsets */
if (fds)
@@ -2259,10 +2261,14 @@ dri2_export_dma_buf_image_mesa(_EGLDriver *drv, 
_EGLDisplay *disp, _EGLImage *im
   dri2_dpy->image->queryImage(dri2_img->dri_image,
  __DRI_IMAGE_ATTRIB_STRIDE, strides);
 
-   if (offsets)
-  offsets[0] = 0;
+   if (offsets){
+  ret = dri2_dpy->image->queryImage(dri2_img->dri_image,
+   __DRI_IMAGE_ATTRIB_OFFSET, &img_offset);
+  if(ret == EGL_TRUE)
+offsets[0] = img_offset;
+   }
 
-   return EGL_TRUE;
+   return ret;
 }
 
 #endif
-- 
1.9.1


[Mesa-dev] [PATCH 1/3] dri: add offset attribute and bump version of EGLImage extensions.

2016-09-06 Thread Chuanbo Weng

Offset is useful for buffer sharing with other components, so add
it to queryImage attributes.

Signed-off-by: Chuanbo Weng 
---
 include/GL/internal/dri_interface.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 1c73cce..d0b1bc6 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1094,7 +1094,7 @@ struct __DRIdri2ExtensionRec {
  * extensions.
  */
 #define __DRI_IMAGE "DRI_IMAGE"
-#define __DRI_IMAGE_VERSION 12
+#define __DRI_IMAGE_VERSION 13
 
 /**
  * These formats correspond to the similarly named MESA_FORMAT_*
@@ -1208,6 +1208,8 @@ struct __DRIdri2ExtensionRec {
 #define __DRI_IMAGE_ATTRIB_FOURCC   0x2008 /* available in versions 11 */
 #define __DRI_IMAGE_ATTRIB_NUM_PLANES   0x2009 /* available in versions 11 */
 
+#define __DRI_IMAGE_ATTRIB_OFFSET 0x200A /* available in versions 13 */
+
 enum __DRIYUVColorSpace {
__DRI_YUV_COLOR_SPACE_UNDEFINED = 0,
__DRI_YUV_COLOR_SPACE_ITU_REC601 = 0x327F,
-- 
1.9.1


[Mesa-dev] [PATCH 3/3] i965: implement querying __DRI_IMAGE_ATTRIB_OFFSET.

2016-09-06 Thread Chuanbo Weng

Implement querying this attribute in intelImageExtension and bump
version of intelImageExtension.

Signed-off-by: Chuanbo Weng 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index a3d252d..8c75e61 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -609,6 +609,9 @@ intel_query_image(__DRIimage *image, int attrib, int *value)
case __DRI_IMAGE_ATTRIB_NUM_PLANES:
   *value = 1;
   return true;
+   case __DRI_IMAGE_ATTRIB_OFFSET:
+  *value = image->offset;
+  return true;
 
   default:
   return false;
@@ -845,7 +848,7 @@ intel_from_planar(__DRIimage *parent, int plane, void 
*loaderPrivate)
 }
 
 static const __DRIimageExtension intelImageExtension = {
-.base = { __DRI_IMAGE, 11 },
+.base = { __DRI_IMAGE, 13 },
 
 .createImageFromName= intel_create_image_from_name,
 .createImageFromRenderbuffer= intel_create_image_from_renderbuffer,
@@ -860,7 +863,9 @@ static const __DRIimageExtension intelImageExtension = {
 .createImageFromFds = intel_create_image_from_fds,
 .createImageFromDmaBufs = intel_create_image_from_dma_bufs,
 .blitImage  = NULL,
-.getCapabilities= NULL
+.getCapabilities= NULL,
+.mapImage   = NULL,
+.unmapImage = NULL,
 };
 
 static int
-- 
1.9.1


Re: [Mesa-dev] [PATCH v2] egl: return corresponding offset of EGLImage instead of 0.

2016-09-06 Thread Weng Chuanbo

Hi Emil,
I have split up the patches and sent them out.
Please review them. Thanks!

2016-09-06 11:03 GMT+08:00 Weng Chuanbo :

> Got it. Thanks for your patient explanation!
>
> 2016-09-06 1:14 GMT+08:00 Emil Velikov :
>
>> On 5 September 2016 at 17:21, Weng Chuanbo 
>> wrote:
>>
>> > [Chuanbo] Could you explain " we want these NULL checks split
>> out
>> > and ported to older loader " more detailed?
>> >
>> > And what's older loaders? What's newer dri modules?
>> >
>> > From my understanding, the only path in mesa code invokes mapImage is
>> > gbm_dri_bo_map. So if apply my patch code above, no NULL deref will
>> happen.
>> >
>> For this particular exercise we can say that there are three loaders -
>> src/{gbm,glx,egl}, the interface is in include/ and everything else
>> can be considered* as dri module/drivers.
>>
>> So here you want to split out the null checks into a separate patches,
>> which precede the functional (EGLimage) change. This way the former
>> can be applied (as they are bugfixes) for older versions of mesa.
>>
>> -Emil
>>
>
>

Re: [Mesa-dev] [PATCH 0/7] omx hevc decode support

2016-09-06 Thread Christian König


The whole series is Acked-by: Christian König .

I tried to look closer into it, but I only have two hands and one head.

Regards,
Christian.

Am 31.08.2016 um 15:51 schrieb Leo Liu:

This set implements hevc decode for omx. It includes basic structures
from the h264 implementation and hevc specific sps, pps, slice header and
reference picture sets, as well as what's required by uvd.

Leo Liu (7):
   st/omx/dec: add initial omx hevc support
   st/omx/dec/h265: add sequence parameter sets
   st/omx/dec/h265: add picture parameter sets
   st/omx/dec/h265: add slice header
   st/omx/dec/h265: add short term reference picture sets
   st/omx/dec/h265: get the reference list for uvd
   st/omx/dec: enable hevc omx decode support

  src/gallium/state_trackers/omx/Makefile.sources |   1 +
  src/gallium/state_trackers/omx/vid_dec.c|  23 +-
  src/gallium/state_trackers/omx/vid_dec.h|  19 +
  src/gallium/state_trackers/omx/vid_dec_h265.c   | 897 
  4 files changed, 939 insertions(+), 1 deletion(-)
  create mode 100644 src/gallium/state_trackers/omx/vid_dec_h265.c




[Mesa-dev] [PATCH] gbm: fix potential NULL deref of mapImage/unmapImage.

2016-09-06 Thread Chuanbo Weng

The mapImage/unmapImage functions of the DRIimage extension can be NULL,
so we should add an additional check for them.
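
The reason a version check alone is not enough can be shown with a minimal sketch. The names below (`struct image_ext`, `bo_map`, `identity_map`) are hypothetical and only mimic the shape of a versioned extension table. A driver may advertise a high version and still leave an optional hook unset, so the caller tests both the version and the function pointer before calling it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical versioned extension table with optional hooks. */
struct image_ext {
   int version;
   void *(*map_image)(void *image);
   void (*unmap_image)(void *image, void *map);
};

/* Maps an image, failing with ENOSYS when the extension is missing,
 * too old, or does not implement the hook. */
static void *bo_map(const struct image_ext *ext, void *image)
{
   if (!ext || ext->version < 12 || !ext->map_image) {
      errno = ENOSYS;
      return NULL;
   }
   return ext->map_image(image);
}

/* Hypothetical driver hook used for illustration: "maps" an image
 * by returning the same pointer. */
static void *identity_map(void *image)
{
   return image;
}
```

A table declared with version 13 but `map_image` left NULL then yields an ENOSYS error instead of a NULL function call.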

Cc: 
Signed-off-by: Chuanbo Weng 
---
 src/gbm/backends/dri/gbm_dri.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
index c3626e3..b14faef 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -941,7 +941,7 @@ gbm_dri_bo_map(struct gbm_bo *_bo,
   return *map_data;
}
 
-   if (!dri->image || dri->image->base.version < 12) {
+   if (!dri->image || dri->image->base.version < 12 || !dri->image->mapImage) {
   errno = ENOSYS;
   return NULL;
}
@@ -972,7 +972,8 @@ gbm_dri_bo_unmap(struct gbm_bo *_bo, void *map_data)
   return;
}
 
-   if (!dri->context || !dri->image || dri->image->base.version < 12)
+   if (!dri->context || !dri->image ||
+   dri->image->base.version < 12 || !dri->image->unmapImage)
   return;
 
dri->image->unmapImage(dri->context, bo->image, map_data);
-- 
1.9.1


Re: [Mesa-dev] [Mesa-stable] [PATCH] gbm: fix potential NULL deref of mapImage/unmapImage.

2016-09-06 Thread Weng, Chuanbo

It seems the e-mails were held until I subscribed to mesa-stable.
So after I subscribed to mesa-stable, this e-mail appeared in mesa-dev three times.

-Original Message-
From: mesa-stable [mailto:mesa-stable-boun...@lists.freedesktop.org] On Behalf 
Of Chuanbo Weng
Sent: Tuesday, September 6, 2016 5:29 PM
To: mesa-dev@lists.freedesktop.org; emil.l.veli...@gmail.com
Cc: mesa-sta...@lists.freedesktop.org; Weng, Chuanbo 
Subject: [Mesa-stable] [PATCH] gbm: fix potential NULL deref of 
mapImage/unmapImage.

The mapImage/unmapImage functions of the DRIimage extension can be NULL, so we 
should add an additional check for them.

Cc: 
Signed-off-by: Chuanbo Weng 
---
 src/gbm/backends/dri/gbm_dri.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c 
index c3626e3..b14faef 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -941,7 +941,7 @@ gbm_dri_bo_map(struct gbm_bo *_bo,
   return *map_data;
}
 
-   if (!dri->image || dri->image->base.version < 12) {
+   if (!dri->image || dri->image->base.version < 12 || 
+ !dri->image->mapImage) {
   errno = ENOSYS;
   return NULL;
}
@@ -972,7 +972,8 @@ gbm_dri_bo_unmap(struct gbm_bo *_bo, void *map_data)
   return;
}
 
-   if (!dri->context || !dri->image || dri->image->base.version < 12)
+   if (!dri->context || !dri->image ||
+   dri->image->base.version < 12 || !dri->image->unmapImage)
   return;
 
dri->image->unmapImage(dri->context, bo->image, map_data);
--
1.9.1

___
mesa-stable mailing list
mesa-sta...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-stable

Re: [Mesa-dev] [PATCH 3/4] radeonsi: add more unlikely() uses into si_draw_vbo

2016-09-06 Thread Michel Dänzer

On 06/09/16 06:04 PM, Marek Olšák wrote:
> On Tue, Sep 6, 2016 at 3:54 AM, Michel Dänzer  wrote:
>> On 06/09/16 07:46 AM, Marek Olšák wrote:
>>> From: Marek Olšák 
>>
>> Did you measure any significant performance boost with this change?
> 
> I didn't measure anything.
> 
>> Otherwise, using (un)likely can be bad because it can defeat the CPU's
>> branch prediction, which tends to be pretty good these days.
> 
> I'm not an expert on that, but it doesn't seem to be the case
> according to other people's comments here.

My main point (which Gustaw seems to agree with) is that (un)likely
should only be used when measurements show that they have a positive effect.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer




Re: [Mesa-dev] [Mesa-stable] [PATCH] gbm: fix potential NULL deref of mapImage/unmapImage.

2016-09-06 Thread Emil Velikov

Hi Chuanbo,

On 6 September 2016 at 10:41, Weng, Chuanbo  wrote:
> Seems the e-mails are blocked before I subscribe mesa-stable.
> So after I subscribe mesa-stable, this e-mail appears in mesa-dev three times.
>
Don't worry too much about things bouncing off mesa-stable. I go
through the queue on a daily basis and flush it.
But even if you use suppress-cc, as long as the email has the Cc
mesa-stable tag the scripts will pick it up ;-)

Thanks
Emil

Re: [Mesa-dev] [Mesa-announce] Mesa 12.0.2 release candidate

2016-09-06 Thread Emil Velikov

On 6 September 2016 at 08:15, Michel Dänzer  wrote:
> On 02/09/16 06:12 PM, Emil Velikov wrote:
>> On 2 September 2016 at 03:26, Michel Dänzer  wrote:
>>> On 01/09/16 11:25 PM, Emil Velikov wrote:
>>>> Hello list,
>>>>
>>>> The candidate for the Mesa 12.0.2 is now available. Currently we have:
>>>>  - 160 queued
>>>>  - 9 nominated (outstanding)
>>>>  - and 1 rejected patches
>>>
>>> [...]
>>>
>>>> Mesa stable queue
>>>> -----------------
>>>>
>>>> Nominated (9)
>>>> =============
>>> [...]
>>>> Michel Dänzer (1):
>>>>   loader/dri3: Overhaul dri3_update_num_back
>>>
>>> FWIW, it's better to hold off on this one until
>>> https://bugs.freedesktop.org/show_bug.cgi?id=97549 is resolved.
>>>
>> Indeed. I had it queued actually and pulled it out as I saw that bug.
>
> Should be good to go for the next stable release(s) together with
> dc3bb5db8c81 ("loader/dri3: Always use at least two back buffers").
>
Smashing, thanks for the updates gents

Emil

[Mesa-dev] [Bug 97608] Account request

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97608

Bug ID: 97608
   Summary: Account request
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: edb+freedesk...@sigluy.net
QA Contact: mesa-dev@lists.freedesktop.org

Hello

I would like to request an account with commit access to Mesa.
I already have an account for Piglit (edb), can it be amended to grant push on
mesa git?

I'm working on Clover state tracker.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97260

--- Comment #42 from Dieter Nützel  ---
Hello Kai et al.,

can you please retest with current Mesa git master (dc3bb5d)
and let us know if the R9 290 low performance regression is fixed for you, too?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97260

Kai  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|k...@dev.carbon-project.org  |

--- Comment #43 from Kai  ---
(In reply to Dieter Nützel from comment #42)
> Hello Kai et al.,
> 
> can you please retest with current Mesa git master (dc3bb5d)
> and let us know if the R9 290 low performance regression is fixed for you, too?

My regression was fixed with 1e3218bc5ba2b739261f0c0bacf4eb662d377236, see e.g.
comment #28. Michel didn't remove the "always three buffers" part so no need to
retest anything.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [Piglit] [PATCH 1/2] egl: Add sanity test for EGL_EXT_device_query (v3)

2016-09-06 Thread Emil Velikov

[moving to mesa-dev, adding the EGL device spec authors for their input]

On 5 September 2016 at 08:48, Mathias Fröhlich
 wrote:
> On Friday, 2 September 2016 14:02:07 CEST Emil Velikov wrote:
>
>> On 2 September 2016 at 07:15, Mathias Fröhlich
>
>>  wrote:
>>
>> >
>> > Great!
>> >
>> > One question that I cannot foresee from your branch:
>> >
>> > The EGL_EXT_device_enumeration spec says
>> >
>> > [...] All implementations must support
>> > at least one device.
>> > [...]
>> >
>> > Which means to me that once an application successfully asked for
>> > EGL_EXT_device_query, this calling application can rely on receiving at
>> > least one usable(?) EGL device. As a last resort, that single guaranteed
>> > device can be a software renderer, but the application gets at least
>> > something that lets it render pictures in some way.
>> >
>> Yes we do need at least one device, which (modulo a few small changes)
>> is applicable with the above branch. There is no need for the single
>> guaranteed device to be software renderer.
>
> Well, how are you getting this single (drm) device when you are on a board
> with a pure framebuffer console on some simple VGA hardware just sufficient
> to bring up the boot screen?
>
>
> This situation is very common on some sort of modern systems. See below.
>

>> > Sure, the intent of the extension is to provide access to hw backed
>> > implementations.
>> >
>> Fully agree.
>>
>> >
>> > For us it means that we need to provide a software rendering context for
>> > the case that there is either no drm capable graphics driver.
>> I'm missing something here - barring the vendor neutral EGL
>> requirement for EGL_EXT_device_base, how is the presence or absence of
>> the device extensions going to affect any of your work?
>> Afaict all of them are simply not applicable in the software renderer
>> case.
>
> Now I am confused, what do you mean with 'your (my) work'?
>
>
>
> What I mean here - putting together what I read in the branch:
>
> At compile time of mesa, libdrm is there and usable, so lib EGL announces
> EGL_EXT_device_enumeration so eglQueryDevicesEXT shall be there and return
> at least one device. Now put those mesa libraries onto a freshly installed
> cluster system (just by installing a linux distribution that contains the
> mentioned precompiled mesa package). That cluster node I mean has nothing
> drm capable as its head never faces a console user apart from the operator
> seeing the boot screen at most once, if not even that is automated away with
> a kickstart install via network.
>
> How are you going to handle this situation?
>
>
>
> Of course a typical installation out there has selected nodes installed with
> a/several GPU(s) each. This gpu is supposed to be used for producing
> visualization results of your simulation. No monitors attached, just to
> reduce the usually huge amount of simulation data (up to several terabytes
> or even more) to something that you can actually download to your computer
> which is several thousand pictures (well more similar use cases but all
> share the property that you do not want to copy the simulation data but you
> can copy picture data in some sense). Sure on this node you expect
> EGL_EXT_device_enumeration to deliver a gpu and I would hope that we
> (mesa/oss graphics stack) also want to deliver EGL_EXT_platform_device where
> you can make use of that single EGLDevice to grab an EGLDisplay.
>
>
>
> In reality today, such cluster nodes are equipped exclusively with nvidia
> cards and the binary blob. VirtualGL is installed running a totally open X
> server that delivers the application local gl contexts via the binary blob
> through virtualgl. So having EGL_EXT_platform_device together with
> EGL_EXT_device_query is what those software vendors that have understood the
> security implications with VirtualGL will use in the future. Those vendors
> who do not (want to) think about security implications will probably
> continue to use the above virtualgl setup as this does not require any
> investment for changes in their software.
>

>> > Or an even more
>> > nasty case, when the device node is just not accessible by the user. I
>> > have seen distros that restrict the permissions of the render node
>> > devices to the user logged in to the running X server. So, even if there
>> > is hardware that you could potentially use, you may not be able to
>> > access it.
>> >
>> The libdrm helper provides a list of devices which have at least one
>> node available - be that card, control or render. For the purposes of
>> EGL_EXT_device_drm we could consider the card or render, although the
>> card one is exposed in pretty much all the open-source drivers and is
>> independent of the kernel age.
>>
>> That said if distributions restrict permissions to all of those then
>> ... I'm inclined to go with Distribution/User Error. Then again please
>> poke us if you see such cases.
>
> Fedora 24 that I use

Re: [Mesa-dev] [PATCH 3/4] radeonsi: add more unlikely() uses into si_draw_vbo

2016-09-06 Thread Marek Olšák

On Sep 6, 2016 12:03 PM, "Michel Dänzer"  wrote:
>
> On 06/09/16 06:04 PM, Marek Olšák wrote:
> > On Tue, Sep 6, 2016 at 3:54 AM, Michel Dänzer 
wrote:
> >> On 06/09/16 07:46 AM, Marek Olšák wrote:
> >>> From: Marek Olšák 
> >>
> >> Did you measure any significant performance boost with this change?
> >
> > I didn't measure anything.
> >
> >> Otherwise, using (un)likely can be bad because it can defeat the CPU's
> >> branch prediction, which tends to be pretty good these days.
> >
> > I'm not an expert on that, but it doesn't seem to be the case
> > according to other people's comments here.
>
> My main point (which Gustaw seems to agree with) is that (un)likely
> should only be used when measurements show that they have a positive
effect.

I agree with you, but do you always measure the effect of unlikely? I
almost never do and I just use it instinctively like most people do. Due to
our manpower constraints, we can't even afford to measure performance for
much bigger changes than this.

Marek

>
>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
>

[Mesa-dev] [Bug 97608] Account request

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97608

--- Comment #1 from Emil Velikov  ---
Fwiw I'm all for granting Serge commit access to mesa. He's got 30+ commits [1],
mostly in clover, ranging from build fixes to a few pieces of CL functionality. On the
piglit side the numbers are about the same and he has proven to do good
work.

[1] https://cgit.freedesktop.org/mesa/mesa/log/?qt=author&q=edb

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97260

--- Comment #44 from Jos van Wolput  ---
(In reply to Michel Dänzer from comment #39)
> Anyway, please test the patch I attached to bug 97549.

Fixed on my hardware, thanks!

[Mesa-dev] [Bug 97566] [dri3] The frame of a window and its open gl content are out of sync.

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97566

Manuel Schneider  changed:

   What|Removed |Added

Version|unspecified |12.0

[Mesa-dev] [Bug 97566] [dri3] The frame of a window and its open gl content are out of sync.

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97566

Manuel Schneider  changed:

   What|Removed |Added

  Component|Other   |Drivers/DRI/i915
   Assignee|mesa-dev@lists.freedesktop. |i...@freedesktop.org
   |org |
 QA Contact|mesa-dev@lists.freedesktop. |
   |org |

Re: [Mesa-dev] [PATCH 0/7] omx hevc decode support

2016-09-06 Thread Leo Liu




On 09/06/2016 05:25 AM, Christian König wrote:
> The whole series is Acked-by: Christian König .
>
> I tried to look closer into it, but I only have two hands and one head.

I think you got more efficient hands :-) Thanks for taking a look.

Regards,
Leo

> Regards,
> Christian.
>
> On 31.08.2016 at 15:51, Leo Liu wrote:
>> This set implements hevc decode for omx, it includes basic structures
>> from h264 implementation and hevc specific sps, pps, slice header and
>> reference picture sets, as well as what's required by uvd.
>>
>> Leo Liu (7):
>>    st/omx/dec: add initial omx hevc support
>>    st/omx/dec/h265: add sequence parameter sets
>>    st/omx/dec/h265: add picture parameter sets
>>    st/omx/dec/h265: add slice header
>>    st/omx/dec/h265: add short term reference picture sets
>>    st/omx/dec/h265: get the reference list for uvd
>>    st/omx/dec: enable hevc omx decode support
>>
>>   src/gallium/state_trackers/omx/Makefile.sources |   1 +
>>   src/gallium/state_trackers/omx/vid_dec.c        |  23 +-
>>   src/gallium/state_trackers/omx/vid_dec.h        |  19 +
>>   src/gallium/state_trackers/omx/vid_dec_h265.c   | 897 
>>   4 files changed, 939 insertions(+), 1 deletion(-)
>>   create mode 100644 src/gallium/state_trackers/omx/vid_dec_h265.c

Re: [Mesa-dev] [PATCH 2/2] st/vdapu: use lanczos filter for scaling v4

2016-09-06 Thread Nayan Deshmukh

Hi Leo,

I thought so. As Michel suggested, the Present extension needs
a linear buffer, and he and Christian agreed that we should have
a separate linear buffer for this. But I still don't understand the code
in vl_winsys_dri3.c, so I am not sure how this could be implemented.

Regards,
Nayan.

On Tue, Sep 6, 2016 at 7:08 PM, Leo Liu  wrote:

> Hi Nayan,
>
> This quick hack was just to prove Christian's idea, and for your reference.
> I don't have a multi-GPU system, and only ran a very brief test on a single
> GPU, so there might be some differences on your multi-GPU system.
> We have to dig more into it.
>
> Regards,
> Leo
>
>
>
> On 09/05/2016 03:51 AM, Nayan Deshmukh wrote:
>
> Hi Leo,
>
> I have tested your patch with mplayer, and it gives an error when I try to
> increase the size of the window. It gives the following errors:
>
> X11 error: BadAlloc (insufficient resources for operation)
> X11 error: BadDrawable (invalid Pixmap or Window parameter)
> X11 error: BadPixmap (invalid Pixmap parameter)
>
> Also, when I made the back buffer linear instead of providing the handle
> myself, it worked fine on my system.
>
> Regards,
> Nayan
>
> On Fri, Sep 2, 2016 at 8:51 PM, Leo Liu  wrote:
>
>>
>>
>> On 09/02/2016 10:48 AM, Christian König wrote:
>>
>> On 02.09.2016 at 16:10, Leo Liu wrote:
>>
>>
>>
>> On 09/02/2016 09:50 AM, Christian König wrote:
>>
>> On 02.09.2016 at 15:27, Leo Liu wrote:
>>
>>
>>
>> On 09/02/2016 02:11 AM, Christian König wrote:
>>
>> On 02.09.2016 at 04:03, Michel Dänzer wrote:
>>
>> On 02/09/16 10:17 AM, Michel Dänzer wrote:
>>
>> On 02/09/16 12:58 AM, Leo Liu wrote:
>>
>> On 09/01/2016 11:54 AM, Nayan Deshmukh wrote:
>>
>> I saw the code in dri3_glx.c and I could somewhat relate some basic
>> code structure to the vl_winsys_dri3.c. But I am new to this and not
>> aware of the
>> terminology that you used about the buffers. Could you please explain
>> what needs
>> to be done in more detail or point me to where I can read about it.
>>
>> I believe it's from loader_dri3_helper.c with "is_different_gpu"
>> condition true, that will include back buffer and front buffer case.
>> you could try only back buffer case for now.
>>
>>  From a high level, PRIME mainly affects presentation, not so much the
>> video decoding / rendering. The important thing is that the buffer used
>> for presentation via the Present extension is linear, not tiled. I'm not
>> sure whether it makes more sense to allocate a separate linear buffer
>> for this purpose, as is done for GLX, or for the vl code to make the
>> corresponding back (or front?) buffer linear in the first place.
>>
>> A separate linear buffer is probably better, actually, since it will
>> also be pinned to system memory while it's being shared with another GPU.
>>
>>
>> Yes, I agree. Nayan should also work on avoiding the extra copy which
>> currently occurs because we can't allocate output buffers directly in the
>> format needed for presentation.
>>
>> The general idea should be to check during presentation if the format
>> in the output surface is displayable directly.
>>
>>
>> Also we have to consider the drawable resized case.
>>
>>
>> Actually we don't. Take a look at the VDPAU spec: the output surface
>> should be sent for displaying without considering its size.
>>
>> E.g. when the window is 256x256 pixels, but the application allocated an
>> output surface of 1024x768 we should still send the whole surface to the X
>> server.
>>
>> It's the job of the application to resize the output surfaces, not that
>> of the VDPAU state tracker.
>>
>>
>> I thought this gets done by the vl compositor from presentation, scaling up
>> or down from output surface to back buffer based on the resize.
>>
>>
>> No, that is incorrect. Take a look at the VDPAU spec:
>>
>> Applications may choose to allow resizing of the presentation queue
>> target (which may be e.g. a regular Window when using an X11-based
>> implementation).
>>
>> *clip_width* and *clip_height* may be used to limit the size of the
>> displayed region of a surface, in order to match the specific region that
>> was rendered to.
>>
>> In turn, this allows the application to allocate over-sized (e.g.
>> screen-sized) surfaces, but render to a region that matches the current
>> size of the video window.
>>
>> Using this technique, an application's response to window resizing may
>> simply be to render to, and display, a different region of the surface,
>> rather than de-/re-allocation of surfaces to match the updated window size.
>>
>>
>> This means that we should send the original output surface size to X, no
>> matter what size it has or what size the window it is displayed in has.
>>
>> That wasn't possible with DRI2, that's why we have that workaround with
>> the delayed rendering in the mixer.
>>
>>
>> I did a quick hack on a single GPU and tested it; this proves the whole
>> idea is working, including resizing.
>> Linear is still displayable, it just looks kind of sluggish during playback.
>>
>> Here is

Re: [Mesa-dev] [PATCH 2/2] st/vdapu: use lanczos filter for scaling v4

2016-09-06 Thread Leo Liu


Hi Nayan,

This quick hack was just to prove Christian's idea, and for your reference.
I don't have a multi-GPU system, and only ran a very brief test on a single
GPU, so there might be some differences on your multi-GPU system.
We have to dig more into it.

Regards,
Leo


On 09/05/2016 03:51 AM, Nayan Deshmukh wrote:

Hi Leo,

I have tested your patch with mplayer, and it gives an error when I try to
increase the size of the window. It gives the following errors:

X11 error: BadAlloc (insufficient resources for operation)
X11 error: BadDrawable (invalid Pixmap or Window parameter)
X11 error: BadPixmap (invalid Pixmap parameter)

Also, when I made the back buffer linear instead of providing the handle
myself, it worked fine on my system.

Regards,
Nayan

On Fri, Sep 2, 2016 at 8:51 PM, Leo Liu wrote:

On 09/02/2016 10:48 AM, Christian König wrote:

On 02.09.2016 at 16:10, Leo Liu wrote:



On 09/02/2016 09:50 AM, Christian König wrote:

On 02.09.2016 at 15:27, Leo Liu wrote:



On 09/02/2016 02:11 AM, Christian König wrote:

On 02.09.2016 at 04:03, Michel Dänzer wrote:

On 02/09/16 10:17 AM, Michel Dänzer wrote:

On 02/09/16 12:58 AM, Leo Liu wrote:

On 09/01/2016 11:54 AM, Nayan Deshmukh wrote:

I saw the code in dri3_glx.c and I could somewhat relate
some basic
code structure to the vl_winsys_dri3.c. But I am new to
this and not aware of the
terminology that you used about the buffers. Could you
please explain what needs
to be done in more detail or point me to where I can read
about it.

I believe it's from loader_dri3_helper.c with
"is_different_gpu"
condition true, that will include back buffer and front
buffer case.
you could try only back buffer case for now.

 From a high level, PRIME mainly affects presentation, not
so much the
video decoding / rendering. The important thing is that the
buffer used
for presentation via the Present extension is linear, not
tiled. I'm not
sure whether it makes more sense to allocate a separate
linear buffer
for this purpose, as is done for GLX, or for the vl code to
make the
corresponding back (or front?) buffer linear in the first
place.

A separate linear buffer is probably better, actually, since
it will
also be pinned to system memory while it's being shared with
another GPU.


Yes, I agree. Nayan should also work on avoiding the extra
copy which currently occurs because we can't allocate output
buffers directly in the format needed for presentation.

The general idea should be to check during presentation if
the format in the output surface is displayable directly.


Also we have to consider the drawable resized case.


Actually we don't. Take a look at the VDPAU spec: the output
surface should be sent for displaying without considering its
size.

E.g. when the window is 256x256 pixels, but the application
allocated an output surface of 1024x768 we should still send
the whole surface to the X server.

It's the job of the application to resize the output surfaces,
not that of the VDPAU state tracker.


I thought this gets done by the vl compositor from presentation,
scaling up or down from output surface to back buffer based on
the resize.


No, that is incorrect. Take a look at the VDPAU spec:


Applications may choose to allow resizing of the presentation
queue target (which may be e.g. a regular Window when using an
X11-based implementation).

*clip_width* and *clip_height* may be used to limit the size of
the displayed region of a surface, in order to match the
specific region that was rendered to.

In turn, this allows the application to allocate over-sized
(e.g. screen-sized) surfaces, but render to a region that
matches the current size of the video window.

Using this technique, an application's response to window
resizing may simply be to render to, and display, a different
region of the surface, rather than de-/re-allocation of surfaces
to match the updated window size.
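
To make the quoted passage concrete, here is a small sketch of how the displayed region follows from the surface size and the clip values alone. The struct and function names are hypothetical and not part of the VDPAU headers, and treating a clip value of 0 as "no clipping" is an assumption of this sketch that should be checked against the spec.

```c
/* Hypothetical helper type; not part of the VDPAU headers. */
struct extent { unsigned w, h; };

/* Size of the region to display, given the allocated surface size and the
 * clip values passed at display time. A clip of 0 is taken to mean "no
 * clipping" (an assumption of this sketch); clips larger than the surface
 * are clamped to it. The window size is deliberately not an input. */
static struct extent
displayed_region(struct extent surface, unsigned clip_w, unsigned clip_h)
{
   struct extent r = surface;

   if (clip_w != 0 && clip_w < surface.w)
      r.w = clip_w;
   if (clip_h != 0 && clip_h < surface.h)
      r.h = clip_h;
   return r;
}
```

With a 1024x768 surface and a 256x256 window, an application following this technique would pass clips of 256 and 256 and keep the surface allocation unchanged across resizes.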



This means that we should send the original output surface size
to X, no matter what size it has or what size the window it is
displayed in has.

That wasn't possible with DRI2, that's why we have that
workaround with the delayed rendering in the mixer.


I did a quick hack on a single GPU and tested it; this proves the
whole idea is working, including resizing.
Linear is still displayable, it just looks kind of sluggish during
playback.

Here is the hack for reference; it removes back buffer creation and
presentation rendering, and uses the output surface handle for X:

diff --git a/src/gallium/auxiliary/vl/vl_winsys.h
b/src/gallium/auxiliary/vl/vl_winsys.h
index 26db9f2..908ec3a 100644
--- a/src/gallium/auxiliary/vl/vl_winsys.h
+++

Re: [Mesa-dev] [PATCH 1/6] intel/isl: Add a format_supports_multisampling helper

2016-09-06 Thread Jason Ekstrand

On Sep 5, 2016 10:39 PM, "Pohjolainen, Topi" wrote:
>
> On Fri, Sep 02, 2016 at 03:50:42PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/isl/isl.h|  2 ++
> >  src/intel/isl/isl_format.c | 30 ++
> >  src/intel/isl/isl_gen6.c   | 19 +--
> >  src/intel/isl/isl_gen7.c   | 16 +---
> >  src/intel/isl/isl_gen8.c   |  4 +---
> >  5 files changed, 35 insertions(+), 36 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > index ecedc05..cb7c22d 100644
> > --- a/src/intel/isl/isl.h
> > +++ b/src/intel/isl/isl.h
> > @@ -989,6 +989,8 @@ bool isl_format_supports_vertex_fetch(const struct
brw_device_info *devinfo,
> >enum isl_format format);
> >  bool isl_format_supports_lossless_compression(const struct
brw_device_info *devinfo,
> >enum isl_format format);
> > +bool isl_format_supports_multisampling(const struct brw_device_info
*devinfo,
> > +   enum isl_format format);
> >
> >  bool isl_format_has_unorm_channel(enum isl_format fmt) ATTRIBUTE_CONST;
> >  bool isl_format_has_snorm_channel(enum isl_format fmt) ATTRIBUTE_CONST;
> > diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> > index 8507cc5..5d43fe7 100644
> > --- a/src/intel/isl/isl_format.c
> > +++ b/src/intel/isl/isl_format.c
> > @@ -429,6 +429,36 @@ isl_format_supports_lossless_compression(const
struct brw_device_info *devinfo,
> > return format_gen(devinfo) >=
format_info[format].lossless_compression;
> >  }
> >
> > +bool
> > +isl_format_supports_multisampling(const struct brw_device_info *devinfo,
> > +                                  enum isl_format format)
> > +{
> > +   /* From the Sandybridge PRM, Volume 4 Part 1 p72, SURFACE_STATE, Surface
> > +    * Format:
> > +    *
> > +    *    If Number of Multisamples is set to a value other than
> > +    *    MULTISAMPLECOUNT_1, this field cannot be set to the following
> > +    *    formats:
> > +    *
> > +    *       - any format with greater than 64 bits per element
> > +    *       - any compressed texture format (BC*)
> > +    *       - any YCRCB* format
> > +    *
> > +    * The restriction on the format's size is removed on Broadwell. Also,
> > +    * there is an exception for HiZ which we treat as a compressed format and
> > +    * is allowed to be multisampled on Broadwell and earlier.
> > +    */
> > +   if (devinfo->gen < 8 && isl_format_get_layout(format)->bpb > 64) {
> > +      return false;
> > +   } else if (isl_format_is_compressed(format)) {
> > +      return false;
>
> I'm merely studying here a little while waiting for jenkins: HiZ case hits
> this condition with ISL_TXC_HIZ. Where is the exception you mention in the
> comment?

In another patch.  That hunk of the comment should probably be moved.
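
The point of the question can be seen in a stand-alone model of the check. This is only a sketch: struct fmt_desc and fmt_supports_multisampling() are hypothetical simplifications, not the ISL API, and the field values used for HiZ are assumptions for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-format data ISL keeps. */
struct fmt_desc {
   unsigned bpb;        /* bits per block */
   bool compressed;     /* BC*, and HiZ in this model */
   bool yuv;            /* YCRCB* formats */
};

/* Mirrors the order of checks in the quoted hunk; gen is the hardware
 * generation number (6 = Sandybridge, 8 = Broadwell). */
static bool
fmt_supports_multisampling(int gen, const struct fmt_desc *f)
{
   if (gen < 8 && f->bpb > 64)
      return false;
   if (f->compressed)
      return false;
   if (f->yuv)
      return false;
   return true;
}
```

Any descriptor flagged as compressed is rejected on every generation, so as long as HiZ is classified that way there is no path through this function that lets it through; the exception mentioned in the comment has to live somewhere else.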

> > +   } else if (isl_format_is_yuv(format)) {
> > +  return false;
> > +   } else {
> > +  return true;
> > +   }
> > +}
> > +
> >  static inline bool
> >  isl_format_has_channel_type(enum isl_format fmt, enum isl_base_type
type)
> >  {
> > diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
> > index 2c52e38..b30998d 100644
> > --- a/src/intel/isl/isl_gen6.c
> > +++ b/src/intel/isl/isl_gen6.c
> > @@ -30,8 +30,6 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
> >enum isl_tiling tiling,
> >enum isl_msaa_layout *msaa_layout)
> >  {
> > -   const struct isl_format_layout *fmtl =
isl_format_get_layout(info->format);
> > -
> > assert(ISL_DEV_GEN(dev) == 6);
> > assert(info->samples >= 1);
> >
> > @@ -40,22 +38,7 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
> >return false;
> > }
> >
> > -   /* From the Sandybridge PRM, Volume 4 Part 1 p72, SURFACE_STATE,
Surface
> > -* Format:
> > -*
> > -*If Number of Multisamples is set to a value other than
> > -*MULTISAMPLECOUNT_1, this field cannot be set to the following
> > -*formats:
> > -*
> > -*   - any format with greater than 64 bits per element
> > -*   - any compressed texture format (BC*)
> > -*   - any YCRCB* format
> > -*/
> > -   if (fmtl->bpb > 64)
> > -  return false;
> > -   if (isl_format_is_compressed(info->format))
> > -  return false;
> > -   if (isl_format_is_yuv(info->format))
> > +   if (!isl_format_supports_multisampling(dev->info, info->format))
> >return false;
> >
> > /* From the Sandybridge PRM, Volume 4 Part 1 p85, SURFACE_STATE,
Number of
> > diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> > index 02273f8..7b40291 100644
> > --- a/src/intel/isl/isl_gen7.c
> > +++ b/src/intel/isl/isl_gen7.c
> > @@ -30,8 +30,6 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
> >  enum isl_tiling tiling,
> >  enum isl_msaa_layout *msaa_layout)
> >  {
> > -

Re: [Mesa-dev] [PATCH 4/6] intel/isl: Handle HiZ and CCS tiling more directly

2016-09-06 Thread Pohjolainen, Topi

On Fri, Sep 02, 2016 at 03:50:45PM -0700, Jason Ekstrand wrote:
> The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces
> respectively.  There's no reason why we should go through filter_tiling and
> it's much easier to always get HiZ and CCS right if we just handle them
> directly.
> ---
>  src/intel/isl/isl.c  | 18 --
>  src/intel/isl/isl_gen7.c | 14 --
>  2 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index f8f5802..33e83b1 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -226,6 +226,22 @@ isl_surf_choose_tiling(const struct isl_device *dev,
>  {
> isl_tiling_flags_t tiling_flags = info->tiling_flags;
>  
> +   /* HiZ surfaces always use the HiZ tiling */
> +   if (info->usage & ISL_SURF_USAGE_HIZ_BIT) {

Similarly to CCS case below, should we also have:

 assert(isl_format_get_layout(info->format)->txc == ISL_TXC_HIZ);

Otherwise:

Reviewed-by: Topi Pohjolainen 

> +  assert(info->format == ISL_FORMAT_HIZ);
> +  assert(tiling_flags == ISL_TILING_HIZ_BIT);
> +  *tiling = ISL_TILING_HIZ;
> +  return true;
> +   }
> +
> +   /* CCS surfaces always use the CCS tiling */
> +   if (info->usage & ISL_SURF_USAGE_CCS_BIT) {
> +  assert(isl_format_get_layout(info->format)->txc == ISL_TXC_CCS);
> +  assert(tiling_flags == ISL_TILING_CCS_BIT);
> +  *tiling = ISL_TILING_CCS;
> +  return true;
> +   }
> +
> if (ISL_DEV_GEN(dev) >= 7) {
>gen7_filter_tiling(dev, info, &tiling_flags);
> } else {
> @@ -254,8 +270,6 @@ isl_surf_choose_tiling(const struct isl_device *dev,
>CHOOSE(ISL_TILING_LINEAR);
> }
>  
> -   CHOOSE(ISL_TILING_CCS);
> -   CHOOSE(ISL_TILING_HIZ);
> CHOOSE(ISL_TILING_Ys);
> CHOOSE(ISL_TILING_Yf);
> CHOOSE(ISL_TILING_Y0);
> diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> index 7b40291..316b51b 100644
> --- a/src/intel/isl/isl_gen7.c
> +++ b/src/intel/isl/isl_gen7.c
> @@ -217,24 +217,10 @@ gen7_filter_tiling(const struct isl_device *dev,
>*flags &= ~ISL_TILING_W_BIT;
> }
>  
> -   /* The HiZ format and tiling always go together */
> -   if (info->format == ISL_FORMAT_HIZ) {
> -  *flags &= ISL_TILING_HIZ_BIT;
> -   } else {
> -  *flags &= ~ISL_TILING_HIZ_BIT;
> -   }
> -
> /* MCS buffers are always Y-tiled */
> if (isl_format_get_layout(info->format)->txc == ISL_TXC_MCS)
>*flags &= ISL_TILING_Y0_BIT;
>  
> -   /* The CCS formats and tiling always go together */
> -   if (isl_format_get_layout(info->format)->txc == ISL_TXC_CCS) {
> -  *flags &= ISL_TILING_CCS_BIT;
> -   } else {
> -  *flags &= ~ISL_TILING_CCS_BIT;
> -   }
> -
> if (info->usage & (ISL_SURF_USAGE_DISPLAY_ROTATE_90_BIT |
>ISL_SURF_USAGE_DISPLAY_ROTATE_180_BIT |
>ISL_SURF_USAGE_DISPLAY_ROTATE_270_BIT)) {
> -- 
> 2.5.0.400.gff86faf
> 

Re: [Mesa-dev] [PATCH 09/10] st/vdpau: implement the new DMA-buf based interop

2016-09-06 Thread Ilia Mirkin

On Mon, Sep 5, 2016 at 2:48 AM, Michel Dänzer  wrote:
> On 05/09/16 04:37 AM, Ilia Mirkin wrote:
>> On Tue, Mar 8, 2016 at 7:21 AM, Christian König wrote:
>>> @@ -80,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
>>> res_tmpl.depth0 = 1;
>>> res_tmpl.array_size = 1;
>>> res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
>>> -   PIPE_BIND_LINEAR;
>>> +   PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
>>
>> Hi Christian,
>>
>> This change appears to have semi-broken vdpau on nouveau. Whenever I
>> flip on the OSD in mplayer, the rendering becomes *extremely* slow.
>> However regular up-scaling without the OSD is plenty fast. This
>> effectively is forcing the output surfaces to live in GART instead of
>> VRAM.
>
> Strictly speaking, they'd only need to be forced to GART while they're
> actually being shared between different GPUs. That's how it works with
> the amdgpu and radeon kernel drivers.

Any suggestions on how to handle this? Perhaps reallocate + copy the
surface in st/vdpau when actual dmabuf sharing is requested?

To be clear - with this change, vdpau with nouveau is unusable in the
presence of an OSD in mplayer. The OSD comes up whenever you seek
around in the video, so in effect, it's unusable. Used to work great.

  -ilia

Re: [Mesa-dev] [PATCH] android: fix a build issue with libmesa_st_mesa_32

2016-09-06 Thread Chih-Wei Huang

2016-08-29 16:52 GMT+08:00 Tapani Pälli :
> make sure nir_opcodes.h is in LOCAL_GENERATED_SOURCES otherwise
> build fails with:
>
> "In file included from
> external/mesa/src/mesa/state_tracker/st_glsl_to_nir.cpp:44:
> external/mesa/src/compiler/nir/nir.h:42:10: fatal error: 'nir_opcodes.h' file not found"

Could you explain how to reproduce this error?

Someone also reported a similar error to us recently.
However, from my debugging I can't see how it happens.
Any file which includes nir_opcodes.h should
get this dependency from its .P file:

$OUT/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
$OUT/obj_x86/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P

cwhuang@icm05:~/git/marshmallow-x86$ grep nir_opcodes.h $OUT/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
 out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h \
 out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h :
cwhuang@icm05:~/git/marshmallow-x86$ grep nir_opcodes.h $OUT/obj_x86/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
 out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h \
 out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h :

So I can't reproduce this issue.

Please refer to the discussion:

https://groups.google.com/d/msg/android-x86/EwCIlPer1i8/M439AaULCQAJ

> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/Android.libmesa_st_mesa.mk | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/Android.libmesa_st_mesa.mk b/src/mesa/Android.libmesa_st_mesa.mk
> index 785b6de..e70f51e 100644
> --- a/src/mesa/Android.libmesa_st_mesa.mk
> +++ b/src/mesa/Android.libmesa_st_mesa.mk
> @@ -63,6 +63,8 @@ LOCAL_C_INCLUDES := \
>  LOCAL_WHOLE_STATIC_LIBRARIES += \
> libmesa_program
>
> +LOCAL_GENERATED_SOURCES := $(MESA_GEN_NIR_H)
> +
>  LOCAL_STATIC_LIBRARIES += libmesa_nir
>
>  include $(LOCAL_PATH)/Android.gen.mk
> --




-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org

[Mesa-dev] [PATCH] gallium: fix clang warnings

2016-09-06 Thread Martina Kollarova

1. Variable 'hole' is uninitialized when used here [-Wuninitialized]
2. Comparison of constant -1 with expression of type 'unsigned int' is always
   false [-Wtautological-constant-out-of-range-compare]
---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 2 +-
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 5 -
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 56aab48..23ee8d5 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -200,7 +200,7 @@ static uint64_t radeon_bomgr_find_va(struct radeon_drm_winsys *rws,
 static void radeon_bomgr_free_va(struct radeon_drm_winsys *rws,
  uint64_t va, uint64_t size)
 {
-struct radeon_bo_va_hole *hole;
+struct radeon_bo_va_hole *hole = NULL;
 
 size = align(size, rws->info.gart_page_size);
 
diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index 07eca99..33f6850 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -259,11 +259,6 @@ kms_sw_displaytarget_add_from_prime(struct kms_sw_winsys *kms_sw, int fd,
kms_sw_dt->height = height;
kms_sw_dt->stride = stride;
 
-   if (kms_sw_dt->size == (off_t)-1) {
-  FREE(kms_sw_dt);
-  return NULL;
-   }
-
lseek(fd, 0, SEEK_SET);
 
list_add(&kms_sw_dt->link, &kms_sw->bo_list);
-- 
1.9.1
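
For context on the second warning: a value that has already been stored in an unsigned int can never compare equal to (off_t)-1 when off_t is wider than unsigned int, so the removed check was dead code. The sketch below shows one possible alternative that keeps the error handling instead of dropping it. It assumes the size field is filled from lseek(), which this patch does not show; fd_size() is a hypothetical helper, not a function in the winsys.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: query the size of the object behind fd. The result
 * of lseek() is kept in an off_t so that the -1 error value is still
 * distinguishable; only then is it narrowed to unsigned. */
static bool
fd_size(int fd, unsigned *size_out)
{
   off_t end = lseek(fd, 0, SEEK_END);

   if (end == (off_t)-1)
      return false;            /* lseek failed, errno is set */

   lseek(fd, 0, SEEK_SET);     /* rewind the file offset */
   *size_out = (unsigned)end;
   return true;
}
```

A caller would free its partially built object and return NULL when fd_size() reports failure, which is what the removed hunk appears to have intended.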

Re: [Mesa-dev] [PATCH 4/6] intel/isl: Handle HiZ and CCS tiling more directly

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 7:18 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Sep 02, 2016 at 03:50:45PM -0700, Jason Ekstrand wrote:
> > The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces
> > respectively.  There's no reason why we should go through filter_tiling and
> > it's much easier to always get HiZ and CCS right if we just handle them
> > directly.
> > ---
> >  src/intel/isl/isl.c  | 18 --
> >  src/intel/isl/isl_gen7.c | 14 --
> >  2 files changed, 16 insertions(+), 16 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index f8f5802..33e83b1 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -226,6 +226,22 @@ isl_surf_choose_tiling(const struct isl_device *dev,
> >  {
> > isl_tiling_flags_t tiling_flags = info->tiling_flags;
> >
> > +   /* HiZ surfaces always use the HiZ tiling */
> > +   if (info->usage & ISL_SURF_USAGE_HIZ_BIT) {
>
> Similarly to CCS case below, should we also have:
>
>  assert(isl_format_get_layout(info->format)->txc == ISL_TXC_HIZ);
>

The assert(info->format == ISL_FORMAT_HIZ) is effectively equivalent.  We
use txc for CCS because there are multiple CCS formats.
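
A toy model of that argument, with hypothetical cut-down enums rather than the real ISL tables: when exactly one format maps to a compression class, comparing the format and comparing the class are interchangeable, whereas a class with several members can only be tested by class.

```c
#include <stdbool.h>

/* Hypothetical, cut-down enums; the real ISL tables are much larger. */
enum txc { TXC_NONE, TXC_HIZ, TXC_CCS };
enum fmt { FMT_R8_UNORM, FMT_HIZ, FMT_CCS_A, FMT_CCS_B, FMT_COUNT };

/* Compression class of each format in this model. */
static const enum txc fmt_txc[FMT_COUNT] = {
   [FMT_R8_UNORM] = TXC_NONE,
   [FMT_HIZ]      = TXC_HIZ,
   [FMT_CCS_A]    = TXC_CCS,
   [FMT_CCS_B]    = TXC_CCS,
};

static bool is_hiz_by_format(enum fmt f) { return f == FMT_HIZ; }
static bool is_hiz_by_txc(enum fmt f)    { return fmt_txc[f] == TXC_HIZ; }
static bool is_ccs_by_txc(enum fmt f)    { return fmt_txc[f] == TXC_CCS; }
```

In this model the two HiZ predicates agree for every format, while no single equality test on the format could replace is_ccs_by_txc().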


> Otherwise:
>
> Reviewed-by: Topi Pohjolainen 
>
> > +  assert(info->format == ISL_FORMAT_HIZ);
> > +  assert(tiling_flags == ISL_TILING_HIZ_BIT);
> > +  *tiling = ISL_TILING_HIZ;
> > +  return true;
> > +   }
> > +
> > +   /* CCS surfaces always use the CCS tiling */
> > +   if (info->usage & ISL_SURF_USAGE_CCS_BIT) {
> > +  assert(isl_format_get_layout(info->format)->txc == ISL_TXC_CCS);
> > +  assert(tiling_flags == ISL_TILING_CCS_BIT);
> > +  *tiling = ISL_TILING_CCS;
> > +  return true;
> > +   }
> > +
> > if (ISL_DEV_GEN(dev) >= 7) {
> >gen7_filter_tiling(dev, info, &tiling_flags);
> > } else {
> > @@ -254,8 +270,6 @@ isl_surf_choose_tiling(const struct isl_device *dev,
> >CHOOSE(ISL_TILING_LINEAR);
> > }
> >
> > -   CHOOSE(ISL_TILING_CCS);
> > -   CHOOSE(ISL_TILING_HIZ);
> > CHOOSE(ISL_TILING_Ys);
> > CHOOSE(ISL_TILING_Yf);
> > CHOOSE(ISL_TILING_Y0);
> > diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> > index 7b40291..316b51b 100644
> > --- a/src/intel/isl/isl_gen7.c
> > +++ b/src/intel/isl/isl_gen7.c
> > @@ -217,24 +217,10 @@ gen7_filter_tiling(const struct isl_device *dev,
> >*flags &= ~ISL_TILING_W_BIT;
> > }
> >
> > -   /* The HiZ format and tiling always go together */
> > -   if (info->format == ISL_FORMAT_HIZ) {
> > -  *flags &= ISL_TILING_HIZ_BIT;
> > -   } else {
> > -  *flags &= ~ISL_TILING_HIZ_BIT;
> > -   }
> > -
> > /* MCS buffers are always Y-tiled */
> > if (isl_format_get_layout(info->format)->txc == ISL_TXC_MCS)
> >*flags &= ISL_TILING_Y0_BIT;
> >
> > -   /* The CCS formats and tiling always go together */
> > -   if (isl_format_get_layout(info->format)->txc == ISL_TXC_CCS) {
> > -  *flags &= ISL_TILING_CCS_BIT;
> > -   } else {
> > -  *flags &= ~ISL_TILING_CCS_BIT;
> > -   }
> > -
> > if (info->usage & (ISL_SURF_USAGE_DISPLAY_ROTATE_90_BIT |
> >ISL_SURF_USAGE_DISPLAY_ROTATE_180_BIT |
> >ISL_SURF_USAGE_DISPLAY_ROTATE_270_BIT)) {
> > --
> > 2.5.0.400.gff86faf
> >

Re: [Mesa-dev] [v2 3/6] i965: Track non-compressible sampling of renderbuffers

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:

> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c  | 16 
>  src/mesa/drivers/dri/i965/brw_context.h  | 10 ++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
>  3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> index b880b4f..c5c6fdd 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -197,6 +197,22 @@ intel_texture_view_requires_resolve(struct
> brw_context *brw,
>_mesa_get_format_name(intel_tex->_Format),
>_mesa_get_format_name(intel_tex->mt->format));
>
> +   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
> +   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
> +      const struct intel_renderbuffer *irb =
> +         intel_renderbuffer(fb->_ColorDrawBuffers[i]);
> +
> +      /* In case the same surface is also used for rendering one needs to
> +       * disable the compression.
> +       */
> +      brw->draw_aux_buffer_disabled[i] = intel_tex->mt->bo == irb->mt->bo;
> +
> +      if (brw->draw_aux_buffer_disabled[i]) {
> +         perf_debug("Sampling renderbuffer with non-compressible format - "
> +                    "turning off compression");
> +      }
> +   }
> +
> return true;
>  }
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 12ac8af..074d554 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1333,6 +1333,16 @@ struct brw_context
>
> struct brw_fast_clear_state *fast_clear_state;
>
> +   /* Array of flags telling if auxiliary buffer is disabled for corresponding
> +    * renderbuffer. If draw_aux_buffer_disabled[i] is set then use of
> +    * auxiliary buffer for gl_framebuffer::_ColorDrawBuffers[i] is
> +    * disabled.
> +    * This is needed in case the same underlying buffer is also configured
> +    * to be sampled but with a format that the sampling engine can't treat
> +    * compressed or fast cleared.
> +    */
> +   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
>

I like the way you handled this.  It's nice and clean.  However, I don't
see where you memset draw_aux_buffer_disabled to 0 to reset it.  Did that
just go missing?


> +
> __DRIcontext *driContext;
> struct intel_screen *intelScreen;
>  };
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 073919e..af102a9 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -56,6 +56,7 @@
>
>  enum {
> INTEL_RENDERBUFFER_LAYERED = 1 << 0,
> +   INTEL_RENDERBUFFER_AUX_DISABLED = 1 << 1,
>  };
>
>  struct surface_state_info {
> @@ -194,6 +195,10 @@ brw_update_renderbuffer_surface(struct brw_context
> *brw,
> struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> struct intel_mipmap_tree *mt = irb->mt;
>
> +   if (brw->gen < 9) {
> +  assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
> +   }
> +
> assert(brw_render_target_supported(brw, rb));
> intel_miptree_used_for_rendering(mt);
>
> @@ -885,6 +890,7 @@ gen4_update_renderbuffer_surface(struct brw_context
> *brw,
> /* BRW_NEW_FS_PROG_DATA */
>
> assert(!(flags & INTEL_RENDERBUFFER_LAYERED));
> +   assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
>
> if (rb->TexImage && !brw->has_surface_tile_offset) {
>intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y);
> @@ -987,8 +993,10 @@ brw_update_renderbuffer_surfaces(struct brw_context
> *brw,
> if (fb->_NumColorDrawBuffers >= 1) {
>for (i = 0; i < fb->_NumColorDrawBuffers; i++) {
>   const uint32_t surf_index = render_target_start + i;
> - const int flags =
> -_mesa_geometric_layers(fb) > 0 ? INTEL_RENDERBUFFER_LAYERED :
> 0;
> + const int flags = (_mesa_geometric_layers(fb) > 0 ?
> +  INTEL_RENDERBUFFER_LAYERED : 0) |
> +   (brw->draw_aux_buffer_disabled[i] ?
> +  INTEL_RENDERBUFFER_AUX_DISABLED : 0);
>
>  if (intel_renderbuffer(fb->_ColorDrawBuffers[i])) {
>  surf_offset[surf_index] =
> --
> 2.5.5
>

Re: [Mesa-dev] [v2 4/6] i965/rbc: Set aux surface sampling engine according to rb settings

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:

> Once mcs buffer gets allocated without delay for lossless
> compression (same as we do for msaa), one gets regression in:
>
> GL45-CTS.texture_barrier_ARB.same-texel-rw
>
> Setting the auxiliary surface for both sampling engine and data
> port seems to fix this. I haven't found any hardware documentation
> backing this though.
>
> v2 (Jason): Prepare also for the case where surface is sampled with
> non-compressible format forcing also rendering without
> compression.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 74
> +++-
>  1 file changed, 71 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index af102a9..05b214f 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -77,6 +77,76 @@ static const struct surface_state_info
> surface_state_infos[] = {
> [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
>  };
>
> +static unsigned
> +brw_find_matching_rb(const struct gl_framebuffer *fb,
> + const struct intel_mipmap_tree *mt)
> +{
> +   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
> +  const struct intel_renderbuffer *irb =
> + intel_renderbuffer(fb->_ColorDrawBuffers[i]);
> +
> +  if (irb->mt == mt)
> + return i;
> +   }
> +
> +   return fb->_NumColorDrawBuffers;
> +}
> +
> +static bool
> +brw_needs_aux_surface(const struct brw_context *brw,
> +  const struct intel_mipmap_tree *mt, int flags,
> +  const struct isl_view *view)
> +{
> +   if (!mt->mcs_mt)
> +  return false;
> +
> +   if (view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT &&
> +   !(flags & INTEL_RENDERBUFFER_AUX_DISABLED))
> +  return true;
> +
> +   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
> +   const bool is_lossless_compressed =
> +  intel_miptree_is_lossless_compressed(brw, mt);
> +   const bool view_format_lossless_compressed =
> +   isl_format_supports_lossless_compression(brw->intelScreen->
> devinfo,
> +view->format);
> +   const unsigned rb_index = brw_find_matching_rb(fb, mt);
> +
> +   /* If the underlying surface is compressed but it is sampled using a
> +* format that the sampling engine doesn't support as compressed, there
> +* is no alternative but to treat the surface as non-compressed.
> +*/
> +   if (is_lossless_compressed && !view_format_lossless_compressed) {
> +  /* Logic elsewhere needs to take care to resolve the color buffer
> prior
> +   * to sampling it as non-compressed.
> +   */
> +  assert(mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_RESOLVED);
> +
> +  /* In practise it looks that setting the same lossless compressed
> +   * surface to be sampled without auxiliary surface and to be written
> +   * with auxiliary surface confuses the hardware. Therefore any
> +   * corresponding renderbuffer must be set up with auxiliary buffer
> +   * disabled.
> +   */
> +  assert(rb_index == fb->_NumColorDrawBuffers ||
> + brw->draw_aux_buffer_disabled[rb_index]);
> +  return false;
> +   }
> +
> +   /* In practise it looks that setting the same lossless compressed
> surface
> +* to be sampled without auxiliary surface and to be written with
> auxiliary
> +* surface confuses the hardware. Therefore sampler engine must be
> provided
> +* with auxiliary buffer regardless of the fast clear state if the same
> +* surface is also going to be written during the same rendering pass.
> +*/
> +   if (is_lossless_compressed && rb_index < fb->_NumColorDrawBuffers) {
> +  assert(!brw->draw_aux_buffer_disabled[rb_index]);
> +  return true;
> +   }
> +
> +   return mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED;
>

How much of this really belongs in emit_surface_state?  It seems like we
ought to have someone else make those decisions and simply pass us the
AUX_DISABLED bit.


> +}
> +
>  static void
>  brw_emit_surface_state(struct brw_context *brw,
> struct intel_mipmap_tree *mt, int flags,
> @@ -140,9 +210,7 @@ brw_emit_surface_state(struct brw_context *brw,
> struct isl_surf *aux_surf = NULL, aux_surf_s;
> uint64_t aux_offset = 0;
> enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
> -   if (mt->mcs_mt &&
> -   ((view.usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
> -mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED)) {
> +   if (brw_needs_aux_surface(brw, mt, flags, &view)) {
>intel_miptree_get_aux_isl_surf(brw, mt, &aux_surf_s, &aux_usage);
>aux_surf = &aux_surf_s;
>assert(mt->mcs_mt->offset == 0);
> --
> 2.5.5
>
>

Re: [Mesa-dev] [PATCH] android: fix a build issue with libmesa_st_mesa_32

2016-09-06 Thread Emil Velikov

On 6 September 2016 at 15:33, Chih-Wei Huang  wrote:
> 2016-08-29 16:52 GMT+08:00 Tapani Pälli :
>> make sure nir_opcodes.h is in LOCAL_GENERATED_SOURCES otherwise
>> build fails with:
>>
>> "In file included from
>> external/mesa/src/mesa/state_tracker/st_glsl_to_nir.cpp:44:
>> external/mesa/src/compiler/nir/nir.h:42:10: fatal error: 'nir_opcodes.h' 
>> file not found"
>
> Could you explain how to reproduce this error?
>
> Someone also reported a similar error to us recently.
> However, from my debugging I can't see how it happens.
> Any file which includes nir_opcodes.h should
> get this dependency from its .P file:
>
> $OUT/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
> $OUT/obj_x86/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
>
> cwhuang@icm05:~/git/marshmallow-x86$ grep nir_opcodes.h
> $OUT/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
>  
> out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h
> \
>  
> out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h
> :
> cwhuang@icm05:~/git/marshmallow-x86$ grep nir_opcodes.h
> $OUT/obj_x86/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.P
>  
> out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h
> \
>  
> out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_nir_intermediates/nir/nir_opcodes.h
> :
>
> So I can't reproduce this issue.
>
> Please refer to the discussion:
>
> https://groups.google.com/d/msg/android-x86/EwCIlPer1i8/M439AaULCQAJ
>
Thanks for this - it reminded me that I should push Rob's patch.

Please check 244f0aba16a7e197ed30e118a9978e200aee2c64 for more info,
but the gist is that:
a) there's a circular dependency and b) depending on the # of jobs
(-j#) and Android build system 'version' you may or may not see the
problem.

-Emil

Re: [Mesa-dev] [v2 5/6] isl/gen8+: Allow 1D and 3D auxiliary surfaces

2016-09-06 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 

On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:

> Otherwise once mcs buffer gets allocated without delay for lossless
> compression (same as we do for msaa), assert starts to fire in
> piglit case: tex3d. The test uses depth of one which is in fact
> supported even now.
>
> v2 (Jason): Allow also 1D case as there is nothing in the specs
> constraining it either.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/intel/isl/isl.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index c7639d0..3dfdf20 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -1329,7 +1329,8 @@ isl_surf_get_ccs_surf(const struct isl_device *dev,
> assert(surf->samples == 1 && surf->msaa_layout ==
> ISL_MSAA_LAYOUT_NONE);
> assert(ISL_DEV_GEN(dev) >= 7);
>
> -   assert(surf->dim == ISL_SURF_DIM_2D);
> +   assert(ISL_DEV_GEN(dev) >= 8 || surf->dim == ISL_SURF_DIM_2D);
> +
> assert(surf->logical_level0_px.depth == 1);
>
> /* TODO: More conditions where it can fail. */
> --
> 2.5.5
>

Re: [Mesa-dev] [v2 3/6] i965: Track non-compressible sampling of renderbuffers

2016-09-06 Thread Pohjolainen, Topi

On Tue, Sep 06, 2016 at 07:54:16AM -0700, Jason Ekstrand wrote:
>On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen
><[1]topi.pohjolai...@gmail.com> wrote:
> 
>  Signed-off-by: Topi Pohjolainen <[2]topi.pohjolai...@intel.com>
>  ---
>   src/mesa/drivers/dri/i965/brw_context.c  | 16
>  
>   src/mesa/drivers/dri/i965/brw_context.h  | 10 ++
>   src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
>   3 files changed, 36 insertions(+), 2 deletions(-)
>  diff --git a/src/mesa/drivers/dri/i965/brw_context.c
>  b/src/mesa/drivers/dri/i965/brw_context.c
>  index b880b4f..c5c6fdd 100644
>  --- a/src/mesa/drivers/dri/i965/brw_context.c
>  +++ b/src/mesa/drivers/dri/i965/brw_context.c
>  @@ -197,6 +197,22 @@ intel_texture_view_requires_resolve(struct
>  brw_context *brw,
> _mesa_get_format_name(intel_tex->_Format),
> _mesa_get_format_name(intel_tex->mt->format));
>  +   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
>  +   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
>  +  const struct intel_renderbuffer *irb =
>  + intel_renderbuffer(fb->_ColorDrawBuffers[i]);
>  +
>  +  /* In case the same surface is also used for rendering one
>  needs to
>  +   * disable the compression.
>  +   */
>  +  brw->draw_aux_buffer_disabled[i] = intel_tex->mt->bo ==
>  irb->mt->bo;

This loop goes thru all render surfaces and explicitly sets the flag. In
other words all flags are reset before uploading state - no need for
separate memset.

>  +
>  +  if (brw->draw_aux_buffer_disabled[i]) {
>  + perf_debug("Sampling renderbuffer with non-compressible
>  format - "
>  +"turning off compression");
>  +  }
>  +   }
>  +
>  return true;
>   }
>  diff --git a/src/mesa/drivers/dri/i965/brw_context.h
>  b/src/mesa/drivers/dri/i965/brw_context.h
>  index 12ac8af..074d554 100644
>  --- a/src/mesa/drivers/dri/i965/brw_context.h
>  +++ b/src/mesa/drivers/dri/i965/brw_context.h
>  @@ -1333,6 +1333,16 @@ struct brw_context
>  struct brw_fast_clear_state *fast_clear_state;
>  +   /* Array of flags telling if auxiliary buffer is disabled for
>  corresponding
>  +* renderbuffer. If draw_aux_buffer_disabled[i] is set then use
>  of
>  +* auxiliary buffer for gl_framebuffer::_ColorDrawBuffers[i] is
>  +* disabled.
>  +* This is needed in case the same underlying buffer is also
>  configured
>  +* to be sampled but with a format that the sampling engine
>  can't treat
>  +* compressed or fast cleared.
>  +*/
>  +   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
> 
>I like the way you handled this.  It's nice and clean.  However, I
>don't see where you memset draw_aux_buffer_disabled to 0 to reset it.
>Did that just go missing?
> 
>  +
>  __DRIcontext *driContext;
>  struct intel_screen *intelScreen;
>   };
>  diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>  b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>  index 073919e..af102a9 100644
>  --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>  +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>  @@ -56,6 +56,7 @@
>   enum {
>  INTEL_RENDERBUFFER_LAYERED = 1 << 0,
>  +   INTEL_RENDERBUFFER_AUX_DISABLED = 1 << 1,
>   };
>   struct surface_state_info {
>  @@ -194,6 +195,10 @@ brw_update_renderbuffer_surface(struct
>  brw_context *brw,
>  struct intel_renderbuffer *irb = intel_renderbuffer(rb);
>  struct intel_mipmap_tree *mt = irb->mt;
>  +   if (brw->gen < 9) {
>  +  assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
>  +   }
>  +
>  assert(brw_render_target_supported(brw, rb));
>  intel_miptree_used_for_rendering(mt);
>  @@ -885,6 +890,7 @@ gen4_update_renderbuffer_surface(struct
>  brw_context *brw,
>  /* BRW_NEW_FS_PROG_DATA */
>  assert(!(flags & INTEL_RENDERBUFFER_LAYERED));
>  +   assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
>  if (rb->TexImage && !brw->has_surface_tile_offset) {
> intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y);
>  @@ -987,8 +993,10 @@ brw_update_renderbuffer_surfaces(struct
>  brw_context *brw,
>  if (fb->_NumColorDrawBuffers >= 1) {
> for (i = 0; i < fb->_NumColorDrawBuffers; i++) {
>const uint32_t surf_index = render_target_start + i;
>  - const int flags =
>  -_mesa_geometric_layers(fb) > 0 ?
>  INTEL_RENDERBUFFER_LAYERED : 0;
>  + const int flags = (_mesa_geometric_layers(fb) > 0 ?
>  +  INTEL_

Re: [Mesa-dev] [v2 6/6] i965/rbc: Allocate mcs directly

2016-09-06 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 

On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:

> such as we do for compressed msaa. In case of non-compressed single
> sampled buffers the allocation of mcs is deferred until there is
> actually a clear operation that needs the mcs.
> In case of render buffer compression the mcs buffer is always needed
> and there is no real reason to defer the allocation. Doing it
> directly allows us to drop quite a bit of unnecessary complexity.
>
> Patch leaves brw_predraw_set_aux_buffers() a no-op. Subsequent
> patches will re-use it and it seemed cleaner to leave it instead
> of removing and re-introducing.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 10 ++--
>  src/mesa/drivers/dri/i965/brw_draw.c  |  4 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 68
> +++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  7 +--
>  4 files changed, 26 insertions(+), 63 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index b0fbb64..fdaf429 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -287,8 +287,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> intel_miptree_slice_resolve_depth(brw, src_mt, src_level, src_layer);
> intel_miptree_slice_resolve_depth(brw, dst_mt, dst_level, dst_layer);
>
> -   intel_miptree_prepare_mcs(brw, dst_mt);
> -
> DBG("%s from %dx %s mt %p %d %d (%f,%f) (%f,%f)"
> "to %dx %s mt %p %d %d (%f,%f) (%f,%f) (flip %d,%d)\n",
> __func__,
> @@ -689,6 +687,9 @@ do_single_blorp_clear(struct brw_context *brw, struct
> gl_framebuffer *fb,
> !brw_is_color_fast_clear_compatible(brw, irb->mt,
> &ctx->Color.ClearColor))
>can_fast_clear = false;
>
> +   const bool is_lossless_compressed = intel_miptree_is_lossless_
> compressed(
> +  brw, irb->mt);
> +
> if (can_fast_clear) {
>/* Record the clear color in the miptree so that it will be
> * programmed in SURFACE_STATE by later rendering and resolve
> @@ -708,7 +709,8 @@ do_single_blorp_clear(struct brw_context *brw, struct
> gl_framebuffer *fb,
> * it now.
> */
>if (!irb->mt->mcs_mt) {
> - if (!intel_miptree_alloc_non_msrt_mcs(brw, irb->mt)) {
> + assert(!is_lossless_compressed);
> + if (!intel_miptree_alloc_non_msrt_mcs(brw, irb->mt, false)) {
>  /* MCS allocation failed--probably this will only happen in
>   * out-of-memory conditions.  But in any case, try to recover
>   * by falling back to a non-blorp clear technique.
> @@ -757,7 +759,7 @@ do_single_blorp_clear(struct brw_context *brw, struct
> gl_framebuffer *fb,
>clear_color, color_write_disable);
>blorp_batch_finish(&batch);
>
> -  if (intel_miptree_is_lossless_compressed(brw, irb->mt)) {
> +  if (is_lossless_compressed) {
>   /* Compressed buffers can be cleared also using normal
> rep-clear. In
>* such case they behave such as if they were drawn using normal
> 3D
>* render pipeline, and we simply mark the mcs as dirty.
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 9b1e18c..cab67c9 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -409,8 +409,8 @@ brw_predraw_set_aux_buffers(struct brw_context *brw)
>struct intel_renderbuffer *irb =
>   intel_renderbuffer(fb->_ColorDrawBuffers[i]);
>
> -  if (irb) {
> - intel_miptree_prepare_mcs(brw, irb->mt);
> +  if (!irb) {
> + continue;
>}
> }
>  }
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 7b97183..427657c 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -789,6 +789,20 @@ intel_miptree_create(struct brw_context *brw,
> intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
>mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
>assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1);
> +
> +  /* On Gen9+ clients are not currently capable of consuming
> compressed
> +   * single-sampled buffers. Disabling compression allows us to skip
> +   * resolves.
> +   */
> +  const bool lossless_compression_disabled = INTEL_DEBUG &
> DEBUG_NO_RBC;
> +  const bool is_lossless_compressed =
> + unlikely(!lossless_compression_disabled) &&
> + brw->gen >= 9 && !mt->is_scanout &&
> + intel_miptree_supports_lossless_compressed(brw, mt);
> +
> +  if (is_lossless_compressed) {
> + intel_miptree_alloc_non_msrt_mcs(brw, mt,
> is_lossless_compressed);
> +  }
> }
>
> return

Re: [Mesa-dev] [v2 2/6] i965: Replace boolean rb surface state setup argument with flags

2016-09-06 Thread Jason Ekstrand

Would I be too much of a pedant if I asked for uint32_t for flags?
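
For illustration only, here is a minimal sketch of what the uint32_t variant could look like. The flag names come from this series (the second one is added by patch 3/6); rb_flags() and check_gen4_flags() are hypothetical helpers that do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag names follow the series; everything else here is hypothetical. */
enum {
   INTEL_RENDERBUFFER_LAYERED      = 1u << 0,
   INTEL_RENDERBUFFER_AUX_DISABLED = 1u << 1,
};

/* Hypothetical helper: pack the per-renderbuffer conditions into a
 * fixed-width unsigned bitmask instead of a plain int.
 */
static uint32_t
rb_flags(bool layered, bool aux_disabled)
{
   uint32_t flags = 0;

   if (layered)
      flags |= INTEL_RENDERBUFFER_LAYERED;
   if (aux_disabled)
      flags |= INTEL_RENDERBUFFER_AUX_DISABLED;

   return flags;
}

/* Same style of check the gen4 path uses: no flag may be set there. */
static void
check_gen4_flags(uint32_t flags)
{
   assert(!(flags & INTEL_RENDERBUFFER_LAYERED));
   assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
}
```

With an unsigned fixed-width type the bit tests behave the same on every platform and there is no sign bit to worry about if more flags get added later.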

On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:

> And add plumbing to provide it all the way to surface state emitter.
> This is not used yet but will be in subsequent patches to carry
> additional constraints.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
>  src/mesa/drivers/dri/i965/brw_state.h|  2 +-
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 28
> +++-
>  3 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index e7c90b7..12ac8af 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -747,7 +747,7 @@ struct brw_context
> {
>uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
>struct gl_renderbuffer *rb,
> -  bool layered, unsigned unit,
> +  int flags, unsigned unit,
>uint32_t surf_index);
>void (*emit_null_surface_state)(struct brw_context *brw,
>unsigned width,
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h
> b/src/mesa/drivers/dri/i965/brw_state.h
> index bfcdf29..1d370c3 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -288,7 +288,7 @@ void brw_update_texture_surface(struct gl_context
> *ctx,
>
>  uint32_t brw_update_renderbuffer_surface(struct brw_context *brw,
>   struct gl_renderbuffer *rb,
> - bool layered, unsigned unit,
> + int flags, unsigned unit,
>   uint32_t surf_index);
>
>  void brw_update_renderbuffer_surfaces(struct brw_context *brw,
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index c347b5d..073919e 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -54,6 +54,10 @@
>  #include "brw_defines.h"
>  #include "brw_wm.h"
>
> +enum {
> +   INTEL_RENDERBUFFER_LAYERED = 1 << 0,
> +};
> +
>  struct surface_state_info {
> unsigned num_dwords;
> unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in
> bytes */
> @@ -74,7 +78,7 @@ static const struct surface_state_info
> surface_state_infos[] = {
>
>  static void
>  brw_emit_surface_state(struct brw_context *brw,
> -   struct intel_mipmap_tree *mt,
> +   struct intel_mipmap_tree *mt, int flags,
> GLenum target, struct isl_view view,
> uint32_t mocs, uint32_t *surf_offset, int
> surf_index,
> unsigned read_domains, unsigned write_domains)
> @@ -183,7 +187,7 @@ brw_emit_surface_state(struct brw_context *brw,
>  uint32_t
>  brw_update_renderbuffer_surface(struct brw_context *brw,
>  struct gl_renderbuffer *rb,
> -bool layered, unsigned unit /* unused */,
> +int flags, unsigned unit /* unused */,
>  uint32_t surf_index)
>  {
> struct gl_context *ctx = &brw->ctx;
> @@ -220,7 +224,7 @@ brw_update_renderbuffer_surface(struct brw_context
> *brw,
> };
>
> uint32_t offset;
> -   brw_emit_surface_state(brw, mt, mt->target, view,
> +   brw_emit_surface_state(brw, mt, flags, mt->target, view,
>surface_state_infos[brw->gen].rb_mocs,
>&offset, surf_index,
>I915_GEM_DOMAIN_RENDER,
> @@ -533,7 +537,8 @@ brw_update_texture_surface(struct gl_context *ctx,
>obj->Target == GL_TEXTURE_CUBE_MAP_ARRAY)
>   view.usage |= ISL_SURF_USAGE_CUBE_BIT;
>
> -  brw_emit_surface_state(brw, mt, mt->target, view,
> +  const int flags = 0;
> +  brw_emit_surface_state(brw, mt, flags, mt->target, view,
>   surface_state_infos[brw->gen].tex_mocs,
>   surf_offset, surf_index,
>   I915_GEM_DOMAIN_SAMPLER, 0);
> @@ -865,7 +870,7 @@ brw_emit_null_surface_state(struct brw_context *brw,
>  static uint32_t
>  gen4_update_renderbuffer_surface(struct brw_context *brw,
>   struct gl_renderbuffer *rb,
> - bool layered, unsigned unit,
> + int flags, unsigned unit,
>   uint32_t surf_index)
>  {
> struct gl_context *ctx = &brw->ctx;
> @@

Re: [Mesa-dev] [v2 3/6] i965: Track non-compressible sampling of renderbuffers

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 8:16 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Tue, Sep 06, 2016 at 07:54:16AM -0700, Jason Ekstrand wrote:
> >On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen
> ><[1]topi.pohjolai...@gmail.com> wrote:
> >
> >  Signed-off-by: Topi Pohjolainen <[2]topi.pohjolai...@intel.com>
> >  ---
> >   src/mesa/drivers/dri/i965/brw_context.c  | 16
> >  
> >   src/mesa/drivers/dri/i965/brw_context.h  | 10 ++
> >   src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
> >   3 files changed, 36 insertions(+), 2 deletions(-)
> >  diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> >  b/src/mesa/drivers/dri/i965/brw_context.c
> >  index b880b4f..c5c6fdd 100644
> >  --- a/src/mesa/drivers/dri/i965/brw_context.c
> >  +++ b/src/mesa/drivers/dri/i965/brw_context.c
> >  @@ -197,6 +197,22 @@ intel_texture_view_requires_resolve(struct
> >  brw_context *brw,
> > _mesa_get_format_name(intel_tex->_Format),
> > _mesa_get_format_name(intel_tex->mt->format));
> >  +   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
> >  +   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
> >  +  const struct intel_renderbuffer *irb =
> >  + intel_renderbuffer(fb->_ColorDrawBuffers[i]);
> >  +
> >  +  /* In case the same surface is also used for rendering one
> >  needs to
> >  +   * disable the compression.
> >  +   */
> >  +  brw->draw_aux_buffer_disabled[i] = intel_tex->mt->bo ==
> >  irb->mt->bo;
>
> This loop goes thru all render surfaces and explicitly sets the flag. In
> other words all flags are reset before uploading state - no need for
> separate memset.
>

Ugh... then it is busted if you have multiple render targets *or* multiple
textures where the one being rendered to isn't the last one.  Only the
render buffer for the last bound texture will get the
draw_aux_buffer_disabled[] bit set.  We really need a reset and
set-of-disable-needed model.
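
A rough sketch of the reset-then-set model, under stated assumptions: struct sketch_ctx, its fields, and all three functions are stand-ins invented for this example, and MAX_DRAW_BUFFERS is given an assumed value. Only draw_aux_buffer_disabled is taken from the patch. The last function reproduces the plain-assignment pattern from the patch so the difference is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DRAW_BUFFERS 8   /* assumed value for this sketch */

/* Hypothetical stand-in for the relevant bits of brw_context. */
struct sketch_ctx {
   const void *rb_bo[MAX_DRAW_BUFFERS];  /* stand-in for irb->mt->bo */
   unsigned num_rb;                      /* stand-in for _NumColorDrawBuffers */
   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
};

/* Step 1: clear every flag once, before any texture is examined. */
static void
reset_aux_disabled(struct sketch_ctx *ctx)
{
   memset(ctx->draw_aux_buffer_disabled, 0,
          sizeof(ctx->draw_aux_buffer_disabled));
}

/* Step 2: called once per texture that must be sampled uncompressed.
 * It only ever sets flags, so earlier textures are not forgotten.
 */
static void
disable_aux_for_texture(struct sketch_ctx *ctx, const void *tex_bo)
{
   assert(ctx->num_rb <= MAX_DRAW_BUFFERS);

   for (unsigned i = 0; i < ctx->num_rb; i++) {
      if (ctx->rb_bo[i] == tex_bo)
         ctx->draw_aux_buffer_disabled[i] = true;
   }
}

/* For comparison: plain assignment, as in the posted patch.  Each call
 * overwrites whatever the previous texture decided.
 */
static void
overwrite_aux_for_texture(struct sketch_ctx *ctx, const void *tex_bo)
{
   assert(ctx->num_rb <= MAX_DRAW_BUFFERS);

   for (unsigned i = 0; i < ctx->num_rb; i++)
      ctx->draw_aux_buffer_disabled[i] = (ctx->rb_bo[i] == tex_bo);
}
```

With two textures where only the first one aliases a render target, the set-only variant keeps the flag for that render target while the assignment variant ends up clearing it again.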


> >  +
> >  +  if (brw->draw_aux_buffer_disabled[i]) {
> >  + perf_debug("Sampling renderbuffer with non-compressible
> >  format - "
> >  +"turning off compression");
> >  +  }
> >  +   }
> >  +
> >  return true;
> >   }
> >  diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> >  b/src/mesa/drivers/dri/i965/brw_context.h
> >  index 12ac8af..074d554 100644
> >  --- a/src/mesa/drivers/dri/i965/brw_context.h
> >  +++ b/src/mesa/drivers/dri/i965/brw_context.h
> >  @@ -1333,6 +1333,16 @@ struct brw_context
> >  struct brw_fast_clear_state *fast_clear_state;
> >  +   /* Array of flags telling if auxiliary buffer is disabled for
> >  corresponding
> >  +* renderbuffer. If draw_aux_buffer_disabled[i] is set then use
> >  of
> >  +* auxiliary buffer for gl_framebuffer::_ColorDrawBuffers[i] is
> >  +* disabled.
> >  +* This is needed in case the same underlying buffer is also
> >  configured
> >  +* to be sampled but with a format that the sampling engine
> >  can't treat
> >  +* compressed or fast cleared.
> >  +*/
> >  +   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
> >
> >I like the way you handled this.  It's nice and clean.  However, I
> >don't see where you memset draw_aux_buffer_disabled to 0 to reset it.
> >Did that just go missing?
> >
> >  +
> >  __DRIcontext *driContext;
> >  struct intel_screen *intelScreen;
> >   };
> >  diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >  b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >  index 073919e..af102a9 100644
> >  --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >  +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >  @@ -56,6 +56,7 @@
> >   enum {
> >  INTEL_RENDERBUFFER_LAYERED = 1 << 0,
> >  +   INTEL_RENDERBUFFER_AUX_DISABLED = 1 << 1,
> >   };
> >   struct surface_state_info {
> >  @@ -194,6 +195,10 @@ brw_update_renderbuffer_surface(struct
> >  brw_context *brw,
> >  struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> >  struct intel_mipmap_tree *mt = irb->mt;
> >  +   if (brw->gen < 9) {
> >  +  assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
> >  +   }
> >  +
> >  assert(brw_render_target_supported(brw, rb));
> >  intel_miptree_used_for_rendering(mt);
> >  @@ -885,6 +890,7 @@ gen4_update_renderbuffer_surface(struct
> >  brw_context *brw,
> >  /* BRW_NEW_FS_PROG_DATA */
> >  assert(!(flags & INTEL_RENDERBUFFER_LAYERED));
> >  +   assert(!(flags & INTEL_RENDERBUFFER_AUX_DISABLED));
> >  if (rb->TexImage && !brw->has_surface_tile_offset) {
> >

Re: [Mesa-dev] [PATCH] gallium: fix clang warnings

2016-09-06 Thread Nicolai Hähnle


On 06.09.2016 16:37, Martina Kollarova wrote:

1. Variable 'hole' is uninitialized when used here [-Wuninitialized]
2. Comparison of constant -1 with expression of type 'unsigned int' is always
   false [-Wtautological-constant-out-of-range-compare]
---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 2 +-
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 5 -
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 56aab48..23ee8d5 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -200,7 +200,7 @@ static uint64_t radeon_bomgr_find_va(struct 
radeon_drm_winsys *rws,
 static void radeon_bomgr_free_va(struct radeon_drm_winsys *rws,
  uint64_t va, uint64_t size)
 {
-struct radeon_bo_va_hole *hole;
+struct radeon_bo_va_hole *hole = NULL;


This is a false positive, but sure, why not.



 size = align(size, rws->info.gart_page_size);

diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index 07eca99..33f6850 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -259,11 +259,6 @@ kms_sw_displaytarget_add_from_prime(struct kms_sw_winsys 
*kms_sw, int fd,
kms_sw_dt->height = height;
kms_sw_dt->stride = stride;

-   if (kms_sw_dt->size == (off_t)-1) {
-  FREE(kms_sw_dt);
-  return NULL;
-   }


NAK. You're just papering over the real bug here, which is that size 
comes from the return value of lseek, which may be -1 on error, and the 
compiler seems to be telling you that it might "optimize" the error 
check away.
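
To make the point concrete, a minimal sketch of keeping the error check, assuming the size field is unsigned as the warning text suggests. fd_size() is a hypothetical helper, not existing winsys code; the idea is simply to test the lseek result while it is still a signed off_t and only then store it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns false on error and leaves *size_out
 * untouched, otherwise stores the size of the object behind fd.
 */
static bool
fd_size(int fd, size_t *size_out)
{
   assert(size_out != NULL);

   /* Keep the result in the signed off_t that lseek returns, so the
    * comparison against -1 is meaningful and cannot be folded away.
    */
   off_t end = lseek(fd, 0, SEEK_END);
   if (end == (off_t)-1)
      return false;

   /* Rewind, as the existing code does after measuring. */
   if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
      return false;

   *size_out = (size_t)end;
   return true;
}
```

The caller can then FREE() and return NULL when fd_size() fails, which keeps the behaviour the removed block was meant to have.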


Nicolai


-
lseek(fd, 0, SEEK_SET);

list_add(&kms_sw_dt->link, &kms_sw->bo_list);



Re: [Mesa-dev] [Piglit] [PATCH 1/2] egl: Add sanity test for EGL_EXT_device_query (v3)

2016-09-06 Thread Kyle Brenneman



On 09/06/2016 05:29 AM, Emil Velikov wrote:

* Interaction of ^^ with EGL device extension(s) - update existing
extensions/introduce new ones
  ** Should EGL_EXT_device_enumeration expose one/multiple SW devices
  - no: we need alternative glvnd EGL interface for such cases
  - yes: implementing EGL_EXT_output_drm on EGL implementations
supporting both HW and SW devices is close to impossible barring spec
update


(Trimming other bullet points for readability)

GLVND itself can support EGL_EXT_device_enumeration, but it doesn't 
require any vendor library to support it. It'll advertise 
EGL_EXT_device_enumeration to the application if and only if at least 
one vendor advertises it, and then for eglQueryDevicesEXT, it will just 
call into each vendor library that supports it and concatenate the 
vendor's lists together. If a vendor doesn't support 
EGL_EXT_device_enumeration, then GLVND will just skip it and it won't be 
included in the eglQueryDevicesEXT list.


From a driver's perspective, the only requirement that GLVND adds is 
that the EGLDeviceEXT handles have to be pointers to some address that 
the vendor library somehow controls. That's only to ensure that the 
handles are unique between vendors, so GLVND doesn't care what (if 
anything) it actually points to. Other than that, the same 
implementation of eglQueryDevicesEXT should work with or without GLVND.


-Kyle


[Mesa-dev] [PATCH] spirv/nir: Add support for OpAtomicLoad/Store

2016-09-06 Thread Lionel Landwerlin

Fixes new CTS tests :

dEQP-VK.spirv_assembly.instruction.compute.opatomic.load
dEQP-VK.spirv_assembly.instruction.compute.opatomic.store

Signed-off-by: Lionel Landwerlin 
Cc: Jason Ekstrand 
---
 src/compiler/spirv/spirv_to_nir.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index fda38f9..104b74f 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1913,6 +1913,32 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, 
SpvOp opcode,
nir_builder_instr_insert(&b->nb, &atomic->instr);
 }
 
+static void
+vtn_handle_atomic_load_store(struct vtn_builder *b, bool load,
+ const uint32_t *w, unsigned count)
+{
+   struct vtn_access_chain *chain =
+  load ?
+  vtn_value(b, w[3], vtn_value_type_access_chain)->access_chain :
+  vtn_value(b, w[1], vtn_value_type_access_chain)->access_chain;
+
+   switch (chain->var->mode) {
+   case vtn_variable_mode_image:
+   case vtn_variable_mode_ssbo:
+   case vtn_variable_mode_ubo:
+  if (load) {
+ struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
+ val->ssa = vtn_variable_load(b, chain);
+  } else {
+ struct vtn_ssa_value *src = vtn_ssa_value(b, w[4]);
+ vtn_variable_store(b, src, chain);
+  }
+  break;
+   default:
+  unreachable("invalid block variable");
+   }
+}
+
 static nir_alu_instr *
 create_vec(nir_shader *shader, unsigned num_components, unsigned bit_size)
 {
@@ -2649,6 +2675,13 @@ vtn_handle_body_instruction(struct vtn_builder *b, SpvOp 
opcode,
   vtn_handle_variables(b, opcode, w, count);
   break;
 
+   case SpvOpAtomicLoad:
+  vtn_handle_atomic_load_store(b, true, w, count);
+  break;
+   case SpvOpAtomicStore:
+  vtn_handle_atomic_load_store(b, false, w, count);
+  break;
+
case SpvOpFunctionCall:
   vtn_handle_function_call(b, opcode, w, count);
   break;
-- 
2.9.3


Re: [Mesa-dev] [PATCH] spirv/nir: Add support for OpAtomicLoad/Store

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 9:16 AM, Lionel Landwerlin 
wrote:

> Fixes new CTS tests :
>
> dEQP-VK.spirv_assembly.instruction.compute.opatomic.load
> dEQP-VK.spirv_assembly.instruction.compute.opatomic.store
>
> Signed-off-by: Lionel Landwerlin 
> Cc: Jason Ekstrand 
> ---
>  src/compiler/spirv/spirv_to_nir.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c
> b/src/compiler/spirv/spirv_to_nir.c
> index fda38f9..104b74f 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -1913,6 +1913,32 @@ vtn_handle_ssbo_or_shared_atomic(struct
> vtn_builder *b, SpvOp opcode,
> nir_builder_instr_insert(&b->nb, &atomic->instr);
>  }
>
> +static void
> +vtn_handle_atomic_load_store(struct vtn_builder *b, bool load,
> + const uint32_t *w, unsigned count)
> +{
> +   struct vtn_access_chain *chain =
> +  load ?
> +  vtn_value(b, w[3], vtn_value_type_access_chain)->access_chain :
> +  vtn_value(b, w[1], vtn_value_type_access_chain)->access_chain;
> +
> +   switch (chain->var->mode) {
> +   case vtn_variable_mode_image:
> +   case vtn_variable_mode_ssbo:
> +   case vtn_variable_mode_ubo:
> +  if (load) {
> + struct vtn_value *val = vtn_push_value(b, w[2],
> vtn_value_type_ssa);
> + val->ssa = vtn_variable_load(b, chain);
> +  } else {
> + struct vtn_ssa_value *src = vtn_ssa_value(b, w[4]);
> + vtn_variable_store(b, src, chain);
>

This may work for UBOs and SSBOs but it will not work for images.  They
need to go through the image load store paths.


> +  }
> +  break;
> +   default:
> +  unreachable("invalid block variable");
> +   }
> +}
> +
>  static nir_alu_instr *
>  create_vec(nir_shader *shader, unsigned num_components, unsigned bit_size)
>  {
> @@ -2649,6 +2675,13 @@ vtn_handle_body_instruction(struct vtn_builder *b,
> SpvOp opcode,
>vtn_handle_variables(b, opcode, w, count);
>break;
>
> +   case SpvOpAtomicLoad:
> +  vtn_handle_atomic_load_store(b, true, w, count);
> +  break;
> +   case SpvOpAtomicStore:
> +  vtn_handle_atomic_load_store(b, false, w, count);
> +  break;
> +
> case SpvOpFunctionCall:
>vtn_handle_function_call(b, opcode, w, count);
>break;
> --
> 2.9.3
>

Re: [Mesa-dev] [PATCH 00/33] anv: Use blorp for most blits and clears

2016-09-06 Thread Nanley Chery

On Mon, Sep 05, 2016 at 09:38:59AM -0700, Jason Ekstrand wrote:
> Everything is now reviewed.  I've also sent two additional blorp patches
> which, when placed prior to the anv patches, make the series
> regression-free and they've been reviewed by Topi.
> 
> Nanley, if you don't mind, I would still like your review on the "use blorp
> to implement vkFoo" patches.  They're a complete replacement of the old
> blit2d or direct meta entrypoints and it's entirely possible that I've
> missed things.  You know the dirty little details of those APIs better than
> about anyone else (including myself) so I'll feel much more comfortable if
> you've looked over it.
> 

Sure, I'll take a look at them today.

- Nanley

> --Jason
> 
> On Wed, Aug 31, 2016 at 2:22 PM, Jason Ekstrand 
> wrote:
> 
> > This little (hah!) series does a bit more blorp reworking and then converts
> > most of the Vulkan driver to using blorp for its blit and copy operations.
> >
> > As it currently stands, it adds a few CTS "regressions" which are really
> > because the CTS makes assumptions about rounding in scaled blit operations
> > that aren't quite valid.
> >
> > This series also doesn't 100% remove vulkan meta.  As it stands, blorp
> > can't do depth/stencil clears so those are left in meta.  Support could be
> > added to blorp (and we could use it from GL) but that's a bit more work and
> > I wanted to get this sent out earlier rather than later.
> >
> > For review, I would recommend that Topi review the first 19 or so and
> > Nanley review patch 26+.  For 20-25, I'm not sure.  Kristian may be the
> > best one to review since he understands it all but Jordan or Nanley could
> > take a crack at it too.
> >
> > Cc: Topi Pohjolainen 
> > Cc: Jordan Justen 
> > Cc: Nanley Chery 
> > Cc: Chad Versace 
> > Cc: Kristian Høgsberg 
> >
> > Jason Ekstrand (33):
> >   intel/isl: Add an isl_swizzle structure and use it for isl_view
> > swizzles
> >   intel/blorp: Take an isl_swizzle instead of a SWIZZLE
> >   intel/blorp: Take a destination swizzle in blorp_blit
> >   intel/blorp: Don't assume R8_UINT in convert_to_single_slice
> >   intel/blorp: Use the surface format for computing offsets
> >   intel/blorp: Fix the early return condition in convert_to_single_slice
> >   intel/isl: Fix an assert in get_intratile_offset_sa
> >   intel/blorp: Handle 3D surfaces in convert_to_single_slice
> >   intel/isl: Add a helper for getting the size of an interleaved pixel
> >   intel/blorp: Use isl_get_interleaved_msaa_px_size_sa
> >   intel/blorp: Use fake_interleaved_msaa in retile_w_to_y
> >   intel/blorp: Stop using the X/YOffset field of RENDER_SURFACE_STATE
> >   intel/blorp: Pull the guts of blorp_blit into a helper
> >   intel/blorp: Add an entrypoint for doing bit-for-bit copies
> >   intel/blorp: Add support for RGB destinations in copies
> >   intel/blorp: Add support for clearing R9G9B9E5 surfaces
> >   intel/blorp: Make color_write_disable const and optional
> >   intel/blorp: Add a swizzle parameter to blorp_clear
> >   intel/blorp: Rework alloc_binding_table
> >   intel/blorp: Use #defines for all __gen_ helpers
> >   anv/pipeline: Roll compute_urb_partition into emit_urb_setup
> >   anv: Generalize emit_urb_setup
> >   intel/anv: Use #defines for all __gen_ helpers
> >   anv: Add initial blorp support
> >   anv: Make image_get_surface_for_aspect_mask const
> >   anv: Use blorp to implement VkBlitImage
> >   anv: Use blorp for CopyImageToBuffer
> >   anv: Use blorp for CopyBufferToImage
> >   anv: Use blorp for CopyImage
> >   anv: Use blorp for CopyBuffer and UpdateBuffer
> >   anv: Delete meta_blit2d
> >   anv: Use blorp for ClearColorImage
> >   anv: Use blorp for doing MSAA resolves
> >
> >  src/intel/blorp/blorp.c  |7 +-
> >  src/intel/blorp/blorp.h  |   19 +-
> >  src/intel/blorp/blorp_blit.c |  570 +---
> >  src/intel/blorp/blorp_clear.c|   24 +-
> >  src/intel/blorp/blorp_genX_exec.h|   25 +-
> >  src/intel/blorp/blorp_priv.h |   26 +-
> >  src/intel/isl/isl.c  |   22 +-
> >  src/intel/isl/isl.h  |   23 +-
> >  src/intel/isl/isl_surface_state.c|8 +-
> >  src/intel/vulkan/Makefile.am |1 +
> >  src/intel/vulkan/Makefile.sources|9 +-
> >  src/intel/vulkan/anv_blorp.c |  846 
> >  src/intel/vulkan/anv_device.c|4 +
> >  src/intel/vulkan/anv_formats.c   |9 +-
> >  src/intel/vulkan/anv_genX.h  |9 +
> >  src/intel/vulkan/anv_image.c |   27 +-
> >  src/intel/vulkan/anv_meta.c  |   21 -
> >  src/intel/vulkan/anv_meta.h  |   44 -
> >  src/intel/vulkan/anv_meta_blit.c |  739 ---
> >  src/intel/vulkan/anv_meta_blit2d.c

Re: [Mesa-dev] [v2 3/6] i965: Track non-compressible sampling of renderbuffers

2016-09-06 Thread Pohjolainen, Topi

On Tue, Sep 06, 2016 at 08:24:11AM -0700, Jason Ekstrand wrote:
>On Tue, Sep 6, 2016 at 8:16 AM, Pohjolainen, Topi
><topi.pohjolai...@gmail.com> wrote:
> 
>  On Tue, Sep 06, 2016 at 07:54:16AM -0700, Jason Ekstrand wrote:
>  >On Tue, Sep 6, 2016 at 12:28 AM, Topi Pohjolainen
>  ><topi.pohjolai...@gmail.com> wrote:
>  >
>  >  Signed-off-by: Topi Pohjolainen
>  <topi.pohjolai...@intel.com>
> 
>>  ---
>>   src/mesa/drivers/dri/i965/brw_context.c  | 16
>>  
>>   src/mesa/drivers/dri/i965/brw_context.h  | 10
>++
>>   src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12
>++--
>>   3 files changed, 36 insertions(+), 2 deletions(-)
>>  diff --git a/src/mesa/drivers/dri/i965/brw_context.c
>>  b/src/mesa/drivers/dri/i965/brw_context.c
>>  index b880b4f..c5c6fdd 100644
>>  --- a/src/mesa/drivers/dri/i965/brw_context.c
>>  +++ b/src/mesa/drivers/dri/i965/brw_context.c
>>  @@ -197,6 +197,22 @@ intel_texture_view_requires_resolve(struct
>>  brw_context *brw,
>> _mesa_get_format_name(intel_tex->_Format),
>> _mesa_get_format_name(intel_tex->mt->format));
>>  +   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
>>  +   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
>>  +  const struct intel_renderbuffer *irb =
>>  + intel_renderbuffer(fb->_ColorDrawBuffers[i]);
>>  +
>>  +  /* In case the same surface is also used for rendering
>one
>>  needs to
>>  +   * disable the compression.
>>  +   */
>>  +  brw->draw_aux_buffer_disabled[i] = intel_tex->mt->bo ==
>>  irb->mt->bo;
> 
>  This loop goes through all render surfaces and explicitly sets the
>  flag. In
>  other words all flags are reset before uploading state - no need for
>  separate memset.
> 
>Ugh... then it is busted if you have multiple render targets *or*
>multiple textures where the one being rendered to isn't the last one.
>Only the render buffer for the last bound texture will get the
>draw_aux_buffer_disabled[] bit set.  We really need a reset and
>set-of-disable-needed model.

Fully agree, I started re-reading and realized the same. I wonder how I even
got this working...
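
A minimal sketch of the reset-and-set model, assuming buffer objects can be compared by pointer. The function name, the MAX_DRAW_BUFFERS constant and the flat arrays are illustrative and do not match the i965 data structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DRAW_BUFFERS 8   /* illustrative limit, not the driver's constant */

/* Buffer objects are modelled as opaque pointers; NULL means "unbound". */
static void
update_aux_disabled(bool disabled[MAX_DRAW_BUFFERS],
                    const void *const *tex_bos, unsigned num_tex,
                    const void *const *rb_bos, unsigned num_rb)
{
   assert(num_rb <= MAX_DRAW_BUFFERS);

   /* Reset exactly once, before looking at any texture. */
   memset(disabled, 0, MAX_DRAW_BUFFERS * sizeof(disabled[0]));

   /* After that, flags are only ever set, so a later texture cannot
    * clear a match found for an earlier one. */
   for (unsigned t = 0; t < num_tex; t++) {
      for (unsigned i = 0; i < num_rb; i++) {
         if (tex_bos[t] != NULL && tex_bos[t] == rb_bos[i])
            disabled[i] = true;
      }
   }
}
```

With two textures where only the first aliases a render target, the flag for that render target stays set even though the second texture does not match it.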

[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97542

--- Comment #7 from Dennis Schridde  ---
(In reply to Daniel Exner from comment #6)
> Well, I just wanted to bump llvm to the _released_ 3.9.0 and hit this bug
> when trying to recompile mesa 12.0.2.
> 
> Perhaps it should be noted somewhere that mesa minor version releases only
> support the llvm release of the major they belong to.

Shouldn't ./configure check whether the installed version of LLVM is supported?
That's what Rust does.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [PATCH] vl/rbsp: match initial escaped bits with valid in the buffer

2016-09-06 Thread Leo Liu

Otherwise the check for the three bytes will not make sense.

Signed-off-by: Leo Liu 
---
 src/gallium/auxiliary/vl/vl_rbsp.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h
index 160b2f8..4d90c2d 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -56,8 +56,6 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct 
vl_vlc *nal, unsign
/* copy the position */
rbsp->nal = *nal;
 
-   rbsp->escaped = 16;
-
/* search for the end of the NAL unit */
while (vl_vlc_search_byte(nal, num_bits, 0x00)) {
   if (vl_vlc_peekbits(nal, 24) == 0x01 ||
@@ -76,6 +74,10 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct 
vl_vlc *nal, unsign
  i += 8;
   }
}
+
+   valid = vl_vlc_valid_bits(&rbsp->nal);
+
+   rbsp->escaped = (valid >= 16) ? 16 : ((valid >= 8) ? 8 : 0);
 }
 
 /**
-- 
2.7.4


Re: [Mesa-dev] [Piglit] [PATCH 1/2] egl: Add sanity test for EGL_EXT_device_query (v3)

2016-09-06 Thread Mathias Fröhlich

Hi Emil,

On Tuesday, 6 September 2016 12:29:01 CEST Emil Velikov wrote:
>  * SW EGL implementations - do we have any vendors/implementations
> apart from Mesa ?
I don't think so.
But let's put that aside. The reason for me talking about swrast here was to 
fulfill that part of the EGL_EXT_device_enumeration spec that says that there 
is always at least a single EGLDeviceEXT returned. But see below.

>  * Interaction of ^^ with EGL device extension(s) - update existing
> extensions/introduce new ones
>  ** Should EGL_EXT_device_enumeration expose one/multiple SW devices
>  - no: we need alternative glvnd EGL interface for such cases
>  - yes: implementing EGL_EXT_output_drm on EGL implementations
> supporting both HW and SW devices is close to impossible barring spec
> update
> 
>  ** EGL_EXT_output_drm
>  *** Using/exposing the card or render node
>  - Extension is designed with EGL streams in mind (using the
> primary/card node) while people expect to use it to select the rendering
> device.
>  - Elaborate on the spec and/or introduce EGL_EXT_output{,_drm}_render ?
>  *** Exposing EGL_EXT_output{,_drm}{,_render} on EGL implementations
> supporting both SW and HW devices
>  - Elaborate on the spec(s), add new one for SW devices and/or error
> type to distinguish between the current errors and SW devices
I do not care about anything built on top of EGL_EXT_output_base or 
EGL_*_stream_*. From my point of view this is beside the point.


What I do care about is EGL_EXT_platform_device.

>  * Systems with fb only, disabled render nodes and/or alike.
> EGL implementations (in our case the libdrm API provides all the info
> about available DRM devices) can effectively detect the presence of
> HW/SW devices and expose relevant extensions.
> Note: The presence does not and _cannot_ imply that one will always
> succeed using each device.
So you are saying, on a system without a DRM device we should return a more or 
less dead single EGLDeviceEXT from eglQueryDevicesEXT(...).
Then after that in the call to
eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, EGLDeviceEXT,...)
we just return EGL_NO_DISPLAY.
Which should be ok because of EGL_EXT_platform_base.txt:

--
Additions to the EGL 1.4 Specification
[...]
eglGetPlatformDisplayEXT(platform, native_display, ...)
[...]
If <platform> is valid but no display matching <native_display> is
available, then EGL_NO_DISPLAY is returned; no error condition is raised
in this case.
--

That seems to work without the need to play with a software rasterizer to 
fulfill the spec.
And yes, I was under the impression that once I have an EGLDeviceEXT in my 
hands I should get a valid EGLDisplay via eglGetPlatformDisplayEXT - which is 
obviously not the case. Of course all that with the presence of the required 
extensions.

Thanks

Mathias

Re: [Mesa-dev] [PATCH 30/33] anv: Use blorp for CopyBuffer and UpdateBuffer

2016-09-06 Thread Anuj Phogat

On Sat, Sep 3, 2016 at 8:53 AM, Jason Ekstrand  wrote:
>
>
> On Fri, Sep 2, 2016 at 12:05 PM, Anuj Phogat  wrote:
>>
>> On Wed, Aug 31, 2016 at 2:22 PM, Jason Ekstrand 
>> wrote:
>> > ---
>> >  src/intel/vulkan/Makefile.sources |   1 -
>> >  src/intel/vulkan/anv_blorp.c  | 184
>> > ++
>> >  src/intel/vulkan/anv_meta_copy.c  | 180
>> > -
>> >  3 files changed, 184 insertions(+), 181 deletions(-)
>> >  delete mode 100644 src/intel/vulkan/anv_meta_copy.c
>> >
>> > diff --git a/src/intel/vulkan/Makefile.sources
>> > b/src/intel/vulkan/Makefile.sources
>> > index 35e15f6..6c9853b 100644
>> > --- a/src/intel/vulkan/Makefile.sources
>> > +++ b/src/intel/vulkan/Makefile.sources
>> > @@ -35,7 +35,6 @@ VULKAN_FILES := \
>> > anv_meta.h \
>> > anv_meta_blit2d.c \
>> > anv_meta_clear.c \
>> > -   anv_meta_copy.c \
>> > anv_meta_resolve.c \
>> > anv_nir.h \
>> > anv_nir_apply_dynamic_offsets.c \
>> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
>> > index 89ff3b3..e2b6672 100644
>> > --- a/src/intel/vulkan/anv_blorp.c
>> > +++ b/src/intel/vulkan/anv_blorp.c
>> > @@ -479,3 +479,187 @@ void anv_CmdBlitImage(
>> >
>> > blorp_batch_finish(&batch);
>> >  }
>> > +
>> > +static void
>> > +do_buffer_copy(struct blorp_batch *batch,
>> > +   struct anv_bo *src, uint64_t src_offset,
>> > +   struct anv_bo *dst, uint64_t dst_offset,
>> > +   int width, int height, int block_size)
>> > +{
>> > +   struct anv_device *device = batch->blorp->driver_ctx;
>> > +
>> > +   /* The actual format we pick doesn't matter as blorp will throw it
>> > away.
>> > +* The only thing that actually matters is the size.
>> > +*/
>> > +   enum isl_format format;
>> > +   switch (block_size) {
>> > +   case 1:  format = ISL_FORMAT_R8_UINT;  break;
>> > +   case 2:  format = ISL_FORMAT_R8G8_UINT;break;
>> > +   case 4:  format = ISL_FORMAT_R8G8B8A8_UNORM;   break;
>> > +   case 8:  format = ISL_FORMAT_R16G16B16A16_UNORM;   break;
>> > +   case 16: format = ISL_FORMAT_R32G32B32A32_UINT;break;
>> > +   default:
>> > +  unreachable("Not a power-of-two format size");
>> > +   }
>> > +
>> > +   struct isl_surf surf;
>> > +   isl_surf_init(&device->isl_dev, &surf,
>> > + .dim = ISL_SURF_DIM_2D,
>> > + .format = format,
>> > + .width = width,
>> > + .height = height,
>> > + .depth = 1,
>> > + .levels = 1,
>> > + .array_len = 1,
>> > + .samples = 1,
>> > + .usage = ISL_SURF_USAGE_TEXTURE_BIT,
>> > + .tiling_flags = ISL_TILING_LINEAR_BIT);
>> > +   assert(surf.row_pitch == width * block_size);
>> > +
>> > +   struct blorp_surf src_blorp_surf = {
>> > +  .surf = &surf,
>> > +  .addr = {
>> > + .buffer = src,
>> > + .offset = src_offset,
>> > +  },
>> > +   };
>> > +
>> > +   struct blorp_surf dst_blorp_surf = {
>> > +  .surf = &surf,
>> > +  .addr = {
>> > + .buffer = dst,
>> > + .offset = dst_offset,
>> > +  },
>> > +   };
>> > +
>> > +   blorp_copy(batch, &src_blorp_surf, 0, 0, &dst_blorp_surf, 0, 0,
>> > +  0, 0, 0, 0, width, height);
>> > +}
>> > +
>> > +void anv_CmdCopyBuffer(
>> > +VkCommandBuffer commandBuffer,
>> > +VkBuffersrcBuffer,
>> > +VkBufferdstBuffer,
>> > +uint32_tregionCount,
>> > +const VkBufferCopy* pRegions)
>> > +{
>> > +   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
>> > +   ANV_FROM_HANDLE(anv_buffer, src_buffer, srcBuffer);
>> > +   ANV_FROM_HANDLE(anv_buffer, dst_buffer, dstBuffer);
>> > +
>> > +   struct blorp_batch batch;
>> > +   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer);
>> > +
>> > +   for (unsigned r = 0; r < regionCount; r++) {
>> > +  uint64_t src_offset = src_buffer->offset + pRegions[r].srcOffset;
>> > +  uint64_t dst_offset = dst_buffer->offset + pRegions[r].dstOffset;
>> > +  uint64_t copy_size = pRegions[r].size;
>> > +
>> > +  /* First, we compute the biggest format that can be used with the
>> > +   * given offsets and size.
>> > +   */
>> > +  int bs = 16;
>> > +
>> > +  int fs = ffs(src_offset) - 1;
>> > +  if (fs != -1)
>> > + bs = MIN2(bs, 1 << fs);
>> > +  assert(src_offset % bs == 0);
>> > +
>> > +  fs = ffs(dst_offset) - 1;
>> > +  if (fs != -1)
>> > + bs = MIN2(bs, 1 << fs);
>> > +  assert(dst_offset % bs == 0);
>> > +
>> > +  fs = ffs(pRegions[r].size) - 1;
>> > +  if (fs != -1)
>> > + bs = MIN2(bs, 1 << fs);
>> > +  assert(pRegions[r].size % bs == 0);
>>
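
The block-size selection in the quoted hunk can be sketched on its own as below. This is an illustration, not the anv code: the function names are made up, and instead of ffs(), whose argument is an int, it tests divisibility directly on the 64-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two, capped at max_bs, that divides value.
 * Zero is divisible by everything, so the cap itself is returned. */
static unsigned
max_pow2_divisor(uint64_t value, unsigned max_bs)
{
   /* max_bs must itself be a non-zero power of two. */
   assert(max_bs != 0 && (max_bs & (max_bs - 1)) == 0);

   unsigned bs = max_bs;
   while (bs > 1 && (value % bs) != 0)
      bs >>= 1;
   return bs;
}

/* Biggest block size (in bytes, at most 16) usable for a copy with the
 * given source offset, destination offset and size. */
static unsigned
copy_block_size(uint64_t src_offset, uint64_t dst_offset, uint64_t size)
{
   unsigned bs = 16;
   bs = max_pow2_divisor(src_offset, bs);
   bs = max_pow2_divisor(dst_offset, bs);
   bs = max_pow2_divisor(size, bs);

   assert(src_offset % bs == 0);
   assert(dst_offset % bs == 0);
   assert(size % bs == 0);
   return bs;
}
```

For example, copy_block_size(4, 16, 32) yields 4, which corresponds to the 4-byte case in the format switch of do_buffer_copy.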

[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97542

--- Comment #8 from Vedran Miletić  ---
(In reply to Dennis Schridde from comment #7)
> Shouldn't ./configure check whether the installed version of LLVM is
> supported? That's what Rust does.

There are basically two options:
1) Fail explicitly on an LLVM version that is not released, even if a certain
snapshot of LLVM can work.
2) Allow using Mesa with a snapshot that happens to carry the same version
number but different API.

Mesa does 2) now, and I prefer it to 1).


[Mesa-dev] [PATCH 3/3] i965/nir: Use GCM and GVN

2016-09-06 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_nir.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index e8dafae..9071384 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -392,6 +392,7 @@ nir_optimize(nir_shader *nir, bool is_scalar)
   OPT(nir_opt_constant_folding);
   OPT(nir_opt_dead_cf);
   OPT(nir_opt_remove_phis);
+  OPT(nir_opt_gcm, true);
   OPT(nir_opt_undef);
   OPT_V(nir_lower_doubles, nir_lower_drcp |
nir_lower_dsqrt |
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 2/3] nir/gcm: Add value numbering support

2016-09-06 Thread Jason Ekstrand

---
 src/compiler/nir/nir.h |  2 +-
 src/compiler/nir/nir_opt_gcm.c | 29 -
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index c1cf940..ff7c422 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2587,7 +2587,7 @@ bool nir_opt_dce(nir_shader *shader);
 
 bool nir_opt_dead_cf(nir_shader *shader);
 
-void nir_opt_gcm(nir_shader *shader);
+bool nir_opt_gcm(nir_shader *shader, bool value_number);
 
 bool nir_opt_peephole_select(nir_shader *shader);
 
diff --git a/src/compiler/nir/nir_opt_gcm.c b/src/compiler/nir/nir_opt_gcm.c
index 02a9348..77eb8e6 100644
--- a/src/compiler/nir/nir_opt_gcm.c
+++ b/src/compiler/nir/nir_opt_gcm.c
@@ -26,6 +26,7 @@
  */
 
 #include "nir.h"
+#include "nir_instr_set.h"
 
 /*
  * Implements Global Code Motion.  A description of GCM can be found in
@@ -451,8 +452,8 @@ gcm_place_instr(nir_instr *instr, struct gcm_state *state)
block_info->last_instr = instr;
 }
 
-static void
-opt_gcm_impl(nir_function_impl *impl)
+static bool
+opt_gcm_impl(nir_function_impl *impl, bool value_number)
 {
struct gcm_state state;
 
@@ -470,6 +471,18 @@ opt_gcm_impl(nir_function_impl *impl)
   gcm_pin_instructions_block(block, &state);
}
 
+   bool progress = false;
+   if (value_number) {
+  struct set *gvn_set = nir_instr_set_create(NULL);
+  foreach_list_typed_safe(nir_instr, instr, node, &state.instrs) {
+ if (nir_instr_set_add_or_rewrite(gvn_set, instr)) {
+nir_instr_remove(instr);
+progress = true;
+ }
+  }
+  nir_instr_set_destroy(gvn_set);
+   }
+
foreach_list_typed(nir_instr, instr, node, &state.instrs)
   gcm_schedule_early_instr(instr, &state);
 
@@ -486,13 +499,19 @@ opt_gcm_impl(nir_function_impl *impl)
 
nir_metadata_preserve(impl, nir_metadata_block_index |
nir_metadata_dominance);
+
+   return progress;
 }
 
-void
-nir_opt_gcm(nir_shader *shader)
+bool
+nir_opt_gcm(nir_shader *shader, bool value_number)
 {
+   bool progress = false;
+
nir_foreach_function(function, shader) {
   if (function->impl)
- opt_gcm_impl(function->impl);
+ progress |= opt_gcm_impl(function->impl, value_number);
}
+
+   return progress;
 }
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 0/3] nir: Add value number to GCM

2016-09-06 Thread Jason Ekstrand

This little series adds value numbering support to the global code motion
pass.  All of the patches have my name on them but most of the credit goes
to Connor and his instruction set.  The instruction set was originally
written for GVN but we merged it a while ago because we can also use it for
regular CSE and it is much more efficient than the old list-based approach.
Now that it's merged GVN is just a couple of lines.
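
For readers who have not looked at the instruction set, here is a toy sketch of the idea, under loud assumptions: the struct instr layout, the fixed-size open-addressing table and every function name below are invented for illustration and are not the nir_instr_set API. It only handles a flat list of two-operand instructions and ignores things a real implementation has to deal with, such as commutative operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SET_SIZE 64   /* power of two; toy capacity for the sketch */

struct instr {
   int op;           /* opcode */
   int src0, src1;   /* value numbers of the operands */
   int value;        /* value number this instruction defines */
};

struct instr_set {
   const struct instr *slots[SET_SIZE];
};

/* FNV-1a style hash over the opcode and operands. */
static uint32_t
hash_instr(const struct instr *i)
{
   uint32_t h = 2166136261u;
   h = (h ^ (uint32_t)i->op) * 16777619u;
   h = (h ^ (uint32_t)i->src0) * 16777619u;
   h = (h ^ (uint32_t)i->src1) * 16777619u;
   return h;
}

/* Returns an existing equivalent instruction, or NULL after inserting. */
static const struct instr *
set_add_or_find(struct instr_set *set, const struct instr *instr)
{
   uint32_t idx = hash_instr(instr) & (SET_SIZE - 1);
   for (unsigned probe = 0; probe < SET_SIZE; probe++) {
      const struct instr *slot = set->slots[idx];
      if (slot == NULL) {
         set->slots[idx] = instr;
         return NULL;
      }
      if (slot->op == instr->op &&
          slot->src0 == instr->src0 && slot->src1 == instr->src1)
         return slot;
      idx = (idx + 1) & (SET_SIZE - 1);   /* linear probing */
   }
   assert(!"instruction set is full");
   return NULL;
}

/* One value-numbering pass. remap[v] ends up holding the canonical value
 * for v, removed[i] marks redundant instructions. Returns how many
 * instructions were found redundant. */
static unsigned
value_number(struct instr *instrs, size_t n,
             int *remap, size_t num_values, bool *removed)
{
   assert(n <= SET_SIZE);
   struct instr_set set = { { NULL } };
   unsigned num_removed = 0;

   for (size_t v = 0; v < num_values; v++)
      remap[v] = (int)v;

   for (size_t i = 0; i < n; i++) {
      assert((size_t)instrs[i].src0 < num_values);
      assert((size_t)instrs[i].src1 < num_values);

      /* Rewrite operands first so chains of duplicates collapse. */
      instrs[i].src0 = remap[instrs[i].src0];
      instrs[i].src1 = remap[instrs[i].src1];

      const struct instr *match = set_add_or_find(&set, &instrs[i]);
      removed[i] = (match != NULL);
      if (match != NULL) {
         remap[instrs[i].value] = match->value;
         num_removed++;
      }
   }
   return num_removed;
}
```

Each lookup is a hash probe rather than a walk over every earlier instruction, which is where the efficiency gain over a list-based approach comes from.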

I'm not going to recommend that we merge patch 3 yet.  There are still some
regressions with it and we should try and get that sorted out.  That said,
at some point, we just need to say "good enough", eat the costs, and enjoy
the overall benefit.  That time may not yet have come though, so I won't
push for turning it on.

That said, I would like to merge the first two patches so they aren't
floating around in my branch having to be rebased anymore.

Jason Ekstrand (3):
  nir/gcm: Call nir_metadata_preserve
  nir/gcm: Add value numbering support
  i965/nir: Use GCM and GVN

 src/compiler/nir/nir.h  |  2 +-
 src/compiler/nir/nir_opt_gcm.c  | 32 +++-
 src/mesa/drivers/dri/i965/brw_nir.c |  1 +
 3 files changed, 29 insertions(+), 6 deletions(-)

-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 1/3] nir/gcm: Call nir_metadata_preserve

2016-09-06 Thread Jason Ekstrand

---
 src/compiler/nir/nir_opt_gcm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_opt_gcm.c b/src/compiler/nir/nir_opt_gcm.c
index 84e32ef..02a9348 100644
--- a/src/compiler/nir/nir_opt_gcm.c
+++ b/src/compiler/nir/nir_opt_gcm.c
@@ -483,6 +483,9 @@ opt_gcm_impl(nir_function_impl *impl)
}
 
ralloc_free(state.blocks);
+
+   nir_metadata_preserve(impl, nir_metadata_block_index |
+   nir_metadata_dominance);
 }
 
 void
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH] glsl: use hash instead of exec_list in copy propagation

2016-09-06 Thread Eric Anholt

Tapani Pälli  writes:

> This change makes copy propagation pass faster. Complete link time
> spent in test case attached to bug 94477 goes down to ~400 secs from
> over 500 secs on my HSW machine. Does not fix the actual issue but
> brings down the total. No regressions seen in CI.
>
> Signed-off-by: Tapani Pälli 
> ---
>
> Next I'll attempt to make similar change to opt_copy_propagation_elements.

This all looks correct, though it makes me sad to see work being done on
these passes instead of just removing them from being called by NIR
drivers.

Reviewed-by: Eric Anholt 



Re: [Mesa-dev] [PATCH 09/10] st/vdpau: implement the new DMA-buf based interop

2016-09-06 Thread Christian König


Am 06.09.2016 um 16:23 schrieb Ilia Mirkin:

On Mon, Sep 5, 2016 at 2:48 AM, Michel Dänzer  wrote:

On 05/09/16 04:37 AM, Ilia Mirkin wrote:

On Tue, Mar 8, 2016 at 7:21 AM, Christian König  wrote:

@@ -80,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
 res_tmpl.depth0 = 1;
 res_tmpl.array_size = 1;
 res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
-   PIPE_BIND_LINEAR;
+   PIPE_BIND_LINEAR | PIPE_BIND_SHARED;

Hi Christian,

This change appears to have semi-broken vdpau on nouveau. Whenever I
flip on the OSD in mplayer, the rendering becomes *extremely* slow.
However regular up-scaling without the OSD is plenty fast. This
effectively is forcing the output surfaces to live in GART instead of
VRAM.

Strictly speaking, they'd only need to be forced to GART while they're
actually being shared between different GPUs. That's how it works with
the amdgpu and radeon kernel drivers.

Any suggestions on how to handle this? Perhaps reallocate + copy the
surface in st/vdpau when actual dmabuf sharing is requested?

To be clear - with this change, vdpau with nouveau is unusable in the
presence of an OSD in mplayer. The OSD comes up whenever you seek
around in the video, so in effect, it's unusable. Used to work great.


Well I think you should clearly figure out why adding PIPE_BIND_SHARED 
has such dramatic effect.


We not only need this for DMA-buf based interop, but also for the DRI3 
based sharing of buffers with X.


So that clearly sounds like a bug in nouveau to me.

Regards,
Christian.



   -ilia




Re: [Mesa-dev] [PATCH] glsl: use hash instead of exec_list in copy propagation

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 11:14 AM, Eric Anholt  wrote:

> Tapani Pälli  writes:
>
> > This change makes copy propagation pass faster. Complete link time
> > spent in test case attached to bug 94477 goes down to ~400 secs from
> > over 500 secs on my HSW machine. Does not fix the actual issue but
> > brings down the total. No regressions seen in CI.
> >
> > Signed-off-by: Tapani Pälli 
> > ---
> >
> > Next I'll attempt to make similar change to
> opt_copy_propagation_elements.
>
> This all looks correct, though it makes me sad to see work being done on
> these passes instead of just removing them from being called by NIR
> drivers.
>

Sadly, I think copy-prop is one of the ones that will never go away since
we need it for evaluating GLSL stuff that we have to be able to validate at
link time.  That said, if the 500 seconds is coming because of channel
expressions, that one needs to die.


> Reviewed-by: Eric Anholt 
>

Re: [Mesa-dev] [PATCH 09/10] st/vdpau: implement the new DMA-buf based interop

2016-09-06 Thread Ilia Mirkin

On Tue, Sep 6, 2016 at 2:22 PM, Christian König  wrote:
> Am 06.09.2016 um 16:23 schrieb Ilia Mirkin:
>>
>> On Mon, Sep 5, 2016 at 2:48 AM, Michel Dänzer  wrote:
>>>
>>> On 05/09/16 04:37 AM, Ilia Mirkin wrote:

>>>> On Tue, Mar 8, 2016 at 7:21 AM, Christian König wrote:
>>>>>
>>>>> @@ -80,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
>>>>>  res_tmpl.depth0 = 1;
>>>>>  res_tmpl.array_size = 1;
>>>>>  res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
>>>>> -   PIPE_BIND_LINEAR;
>>>>> +   PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
>>>>
>>>> Hi Christian,
>>>>
>>>> This change appears to have semi-broken vdpau on nouveau. Whenever I
>>>> flip on the OSD in mplayer, the rendering becomes *extremely* slow.
>>>> However regular up-scaling without the OSD is plenty fast. This
>>>> effectively is forcing the output surfaces to live in GART instead of
>>>> VRAM.
>>>
>>> Strictly speaking, they'd only need to be forced to GART while they're
>>> actually being shared between different GPUs. That's how it works with
>>> the amdgpu and radeon kernel drivers.
>>
>> Any suggestions on how to handle this? Perhaps reallocate + copy the
>> surface in st/vdpau when actual dmabuf sharing is requested?
>>
>> To be clear - with this change, vdpau with nouveau is unusable in the
>> presence of an OSD in mplayer. The OSD comes up whenever you seek
>> around in the video, so in effect, it's unusable. Used to work great.
>
>
> Well I think you should clearly figure out why adding PIPE_BIND_SHARED has
> such dramatic effect.

Because the buffer goes into GART. And then you try to blend on it,
which involves readback from GART (that's how the functions OSD is
based on work, I believe). We normally don't allocate renderable
surfaces or textures in GART.

>
> We not only need this for DMA-buf based interop, but also for the DRI3 based
> sharing of buffers with X.
>
> So that clearly sounds like a bug in nouveau to me.

OK, so SHARED != GART? With nouveau, buffers are placed statically in
either VRAM or GART, so I think that if it's shared it has to end up
in GART, no?

I'm pretty weak on all these concepts, as well as how the DRI3 stuff
works, unfortunately.

  -ilia

[Mesa-dev] [Bug 96770] include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’

2016-09-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=96770

--- Comment #2 from Vinson Lee  ---
(In reply to Emil Velikov from comment #1)
> Yay, I broke things ;-)
> 
> There's a couple of routes we can take:
>  - Bring back the EGL/GLX includes - I would strongly advise against that.
>  - guard the typedefs with ifdef macros - fragile, we'll also need to ensure
> that the header is included after the EGL/GLX ones.
>  - open-code/replace the existing typedefs with the respective original
> types - a tad nasty, yet it seems like the better option.
>  - other ?
> 
> Vinson, let me know which one you'd prefer and I'll whip a patch... Unless
> you beat me to it.

Emil, I don't have a preference but I tested that undoing the changes in
8472045b16b3e4621553fe451a20a9ba9f0d44b6 fixes the build.

diff --git a/include/GL/mesa_glinterop.h b/include/GL/mesa_glinterop.h
index 383d7f9..c6a967e 100644
--- a/include/GL/mesa_glinterop.h
+++ b/include/GL/mesa_glinterop.h
@@ -52,15 +52,12 @@

 #include 
 #include 
+#include 

 #ifdef __cplusplus
 extern "C" {
 #endif

-/* Forward declarations to avoid inclusion of GL/glx.h */
-typedef struct _XDisplay Display;
-typedef struct __GLXcontextRec *GLXContext;
-
 /* Forward declarations to avoid inclusion of EGL/egl.h */
 typedef void *EGLDisplay;
 typedef void *EGLContext;

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [PATCH] glsl: Add a citation for uniform precision matching.

2016-09-06 Thread Kenneth Graunke

Ian added this check in commit 259fc505454ea6a67aeacf6cdebf1398d9947759.
While reviewing the rules, I found a citation which spells this out
clearly, so I figured I'd send a patch to add it as a comment.

Cc: i...@freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/linker.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index c95edf3..78c9ea8 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1154,6 +1154,14 @@ cross_validate_globals(struct gl_shader_program *prog,
 return;
  }
 
+ /* The GLSL ES 3.2 specification says:
+  *
+  *"Uniforms in shaders all share a single global name space when
+  * linked into a program or separable program. Hence, the types,
+  * precisions and any location specifiers of all declared uniform
+  * variables with the same name must match across shaders that
+  * are linked into a single program."
+  */
  if (prog->IsES && existing->data.precision != var->data.precision) {
 linker_error(prog, "declarations for %s `%s` have "
  "mismatching precision qualifiers\n",
-- 
2.9.3


Re: [Mesa-dev] [PATCH 09/10] st/vdpau: implement the new DMA-buf based interop

2016-09-06 Thread Christian König


On 06.09.2016 at 21:05, Ilia Mirkin wrote:
> On Tue, Sep 6, 2016 at 2:22 PM, Christian König  wrote:
>> On 06.09.2016 at 16:23, Ilia Mirkin wrote:
>>> On Mon, Sep 5, 2016 at 2:48 AM, Michel Dänzer  wrote:
>>>> On 05/09/16 04:37 AM, Ilia Mirkin wrote:
>>>>> On Tue, Mar 8, 2016 at 7:21 AM, Christian König wrote:
>>>>>> @@ -80,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
>>>>>>  res_tmpl.depth0 = 1;
>>>>>>  res_tmpl.array_size = 1;
>>>>>>  res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
>>>>>> -   PIPE_BIND_LINEAR;
>>>>>> +   PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
>>>>>
>>>>> Hi Christian,
>>>>>
>>>>> This change appears to have semi-broken vdpau on nouveau. Whenever I
>>>>> flip on the OSD in mplayer, the rendering becomes *extremely* slow.
>>>>> However regular up-scaling without the OSD is plenty fast. This
>>>>> effectively is forcing the output surfaces to live in GART instead of
>>>>> VRAM.
>>>>
>>>> Strictly speaking, they'd only need to be forced to GART while they're
>>>> actually being shared between different GPUs. That's how it works with
>>>> the amdgpu and radeon kernel drivers.
>>>
>>> Any suggestions on how to handle this? Perhaps reallocate + copy the
>>> surface in st/vdpau when actual dmabuf sharing is requested?
>>>
>>> To be clear - with this change, vdpau with nouveau is unusable in the
>>> presence of an OSD in mplayer. The OSD comes up whenever you seek
>>> around in the video, so in effect, it's unusable. Used to work great.
>>
>> Well I think you should clearly figure out why adding PIPE_BIND_SHARED has
>> such dramatic effect.
>
> Because the buffer goes into GART. And then you try to blend on it,
> which involves readback from GART (that's how the functions OSD is
> based on work, I believe). We normally don't allocate renderable
> surfaces or textures in GART.
>
>> We not only need this for DMA-buf based interop, but also for the DRI3 based
>> sharing of buffers with X.
>>
>> So that clearly sounds like a bug in nouveau to me.
>
> OK, so SHARED != GART? With nouveau, buffers are placed statically in
> either VRAM or GART, so I think that if it's shared it has to end up
> in GART, no?


As far as I understand it, no. Shared just means that we can share it 
between applications, doesn't it? Or does it mean the buffer should be 
shareable between GPUs?


Could be that my understanding was wrong, so if it's the latter feel 
free to provide a patch to just remove the flag.



> I'm pretty weak on all these concepts, as well as how the DRI3 stuff
> works, unfortunately.


I have to confess I'm not so deeply into this stuff either. Marek, 
Michel what exactly is the meaning of the flag?


Regards,
Christian.



>   -ilia




Re: [Mesa-dev] [PATCH] glsl: use hash instead of exec_list in copy propagation

2016-09-06 Thread Kenneth Graunke

On Tuesday, September 6, 2016 10:17:57 AM PDT Tapani Pälli wrote:
> This change makes copy propagation pass faster. Complete link time
> spent in test case attached to bug 94477 goes down to ~400 secs from
> over 500 secs on my HSW machine. Does not fix the actual issue but
> brings down the total. No regressions seen in CI.
> 
> Signed-off-by: Tapani Pälli 
> ---
> 
> Next I'll attempt to make similar change to opt_copy_propagation_elements.
> 
>  src/compiler/glsl/opt_copy_propagation.cpp | 92 
> +-
>  1 file changed, 41 insertions(+), 51 deletions(-)

A couple of drive by comments (since I see Eric already reviewed it):

- I was wondering if program/symbol_table would be useful, since we tend
  to make new tables at nested scopes...but I'm not sure if that works
  well for the kill set.  Maybe not a great fit.

- At least in the old code...the kill set could contain piles of
  duplicate entries.  kill(var) would just push a new node on the list,
  rather than seeing if var is already in the kill set.  Pretty lame :(
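
To illustrate the second point, here is a minimal sketch of a kill set that
checks for membership before adding. Everything in it (sketch_kill_set,
sketch_kill, the fixed capacity) is hypothetical and not taken from
opt_copy_propagation.cpp; a linear scan stands in for the hash table the
patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_KILLS 64

/* Hypothetical kill set: each variable appears at most once. */
struct sketch_kill_set {
   const void *vars[SKETCH_MAX_KILLS];
   size_t count;
};

/* Linear membership test; a real pass would use a hash lookup instead. */
static bool sketch_is_killed(const struct sketch_kill_set *set, const void *var)
{
   for (size_t i = 0; i < set->count; i++) {
      if (set->vars[i] == var)
         return true;
   }
   return false;
}

/* Returns true if var was newly added, false if it was already killed. */
static bool sketch_kill(struct sketch_kill_set *set, const void *var)
{
   if (sketch_is_killed(set, var))
      return false;
   assert(set->count < SKETCH_MAX_KILLS);
   set->vars[set->count++] = var;
   return true;
}
```

Killing the same variable twice leaves the set with a single entry, so the
set cannot grow beyond the number of distinct variables.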



Re: [Mesa-dev] [PATCH 4/6] EGL: Call the EGL_KHR_debug callback on errors.

2016-09-06 Thread Adam Jackson

On Wed, 2016-07-06 at 10:33 -0600, Kyle Brenneman wrote:

> @@ -292,6 +292,24 @@ _eglError(EGLint errCode, const char *msg)
> return EGL_FALSE;
>  }
>  
> +EGLBoolean
> +_eglError(EGLint errCode, const char *msg)
> +{
> +   if (errCode != EGL_SUCCESS) {
> +  EGLint type;
> +  if (errCode == EGL_BAD_ALLOC) {
> + type = EGL_DEBUG_MSG_CRITICAL_KHR;
> +  } else {
> + type = EGL_DEBUG_MSG_ERROR_KHR;
> +  }
> +
> +  _eglDebugReport(errCode, NULL, msg, type, NULL, NULL);
> +   } else {
> +  _eglInternalError(errCode, msg);
> +   }
> +   return EGL_FALSE;
> +}

I don't think this can be right? _eglDebugReport ends with:

   if (type == EGL_DEBUG_MSG_CRITICAL_KHR || type == EGL_DEBUG_MSG_ERROR_KHR) {
  _eglError(error, command);
   }

So this looks like it could mutually recurse until you run out of stack
space and crash. I'll try to write a test to prove the point but maybe
I'm missing something about how this is meant to work.

- ajax

Re: [Mesa-dev] [PATCH] glsl: Add a citation for uniform precision matching.

2016-09-06 Thread Eric Anholt

Kenneth Graunke  writes:

> Ian added this check in commit 259fc505454ea6a67aeacf6cdebf1398d9947759.
> While reviewing the rules, I found a citation which spells this out
> clearly, so I figured I'd send a patch to add it as a comment.
>
> Cc: i...@freedesktop.org
> Signed-off-by: Kenneth Graunke 

Reviewed-by: Eric Anholt 

I was wondering about this change, because glmark2 is failing to compile
its terrain shaders now.



Re: [Mesa-dev] [PATCH 07/12] i965/blorp: Instruct vertex fetcher to provide prim instance id

2016-09-06 Thread Kristian Høgsberg

On Thu, Sep 1, 2016 at 8:43 AM, Jason Ekstrand  wrote:
> On Aug 31, 2016 9:06 AM, "Topi Pohjolainen" 
> wrote:
>>
>> This will indicate target layer (Render Target Array Index) needed
>> for layered clears.
>>
>> v2: Use 3DSTATE_VF_SGVS for gen8+
>>
>> Signed-off-by: Topi Pohjolainen 
>> ---
>>  src/intel/blorp/blorp_genX_exec.h | 26 ++
>>  1 file changed, 22 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/intel/blorp/blorp_genX_exec.h
>> b/src/intel/blorp/blorp_genX_exec.h
>> index f44076e..7312847 100644
>> --- a/src/intel/blorp/blorp_genX_exec.h
>> +++ b/src/intel/blorp/blorp_genX_exec.h
>> @@ -298,8 +298,10 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>>  * the URB. This is controlled by the 3DSTATE_VERTEX_BUFFERS and
>>  * 3DSTATE_VERTEX_ELEMENTS packets below. The VUE contents are as
>> follows:
>>  *   dw0: Reserved, MBZ.
>> -*   dw1: Render Target Array Index. The HiZ op does not use indexed
>> -*vertices, so set the dword to 0.
>> +*   dw1: Render Target Array Index. Below vertex fetcher gets
>> programmed
>> +*to assign this with primitive instance identifier which will
>> be
>> +*used for layered clears. All other renders have only one
>> instance
>> +*and therefore the value will be effectively zero.
>>  *   dw2: Viewport Index. The HiZ op disables viewport mapping and
>>  *scissoring, so set the dword to 0.
>>  *   dw3: Point Width: The HiZ op does not emit the POINTLIST
>> primitive,
>> @@ -318,7 +320,7 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>>  * "Vertex URB Entry (VUE) Formats".
>>  *
>>  * Only vertex position X and Y are going to be variable, Z is fixed
>> to
>> -* zero and W to one. Header words dw0-3 are all zero. There is no
>> need to
>> +* zero and W to one. Header words dw0,2,3 are zero. There is no need
>> to
>>  * include the fixed values in the vertex buffer. Vertex fetcher can
>> be
>>  * instructed to fill vertex elements with constant values of one and
>> zero
>>  * instead of reading them from the buffer.
>> @@ -332,7 +334,16 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>> ve[0].SourceElementFormat = ISL_FORMAT_R32G32B32A32_FLOAT;
>> ve[0].SourceElementOffset = 0;
>> ve[0].Component0Control = VFCOMP_STORE_0;
>> +
>> +   /* From Gen8 onwards hardware is no more instructed to overwrite
>> components
>> +* using an element specifier. Instead one has separate
>> 3DSTATE_VF_SGVS
>> +* (System Generated Value Setup) state packet for it.
>> +*/
>> +#if GEN_GEN >= 8
>> ve[0].Component1Control = VFCOMP_STORE_0;
>> +#else
>> +   ve[0].Component1Control = VFCOMP_STORE_IID;
>> +#endif
>> ve[0].Component2Control = VFCOMP_STORE_0;
>> ve[0].Component3Control = VFCOMP_STORE_0;
>>
>> @@ -366,7 +377,14 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>> }
>>
>>  #if GEN_GEN >= 8
>> -   blorp_emit(batch, GENX(3DSTATE_VF_SGVS), sgvs);
>> +   /* Overwrite Render Target Array Index (2nd dword) in the VUE header
>> with
>> +* primitive instance identifier. This is used for layered clears.
>> +*/
>> +   blorp_emit(batch, GENX(3DSTATE_VF_SGVS), sgvs) {
>> +  sgvs.InstanceIDEnable = true;
>> +  sgvs.InstanceIDComponentNumber = COMP_1;
>> +  sgvs.InstanceIDElementOffset = 0;
>> +   }
>
> I love the fact that we can use SGVS this way.  I cc'd Kristian so he can
> enjoy it too!
>
> Reviewed-by: Jason Ekstrand 

Yeah, nice - we already did that in meta though.

>> for (unsigned i = 0; i < num_elements; i++) {
>>blorp_emit(batch, GENX(3DSTATE_VF_INSTANCING), vf) {
>> --
>> 2.5.5
>>

Re: [Mesa-dev] [PATCH 4/6] EGL: Call the EGL_KHR_debug callback on errors.

2016-09-06 Thread Kyle Brenneman

The patch also changes _eglDebugReport so that it calls 
_eglInternalError at the end instead of _eglError.
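
For illustration only, a minimal sketch of that call structure. The names
report_error, debug_report and internal_error are made-up stand-ins, not the
Mesa functions, and the debug callback is reduced to a counter:

```c
#include <assert.h>

/* Hypothetical stand-ins for the EGL error path; not the Mesa code. */
enum { SKETCH_SUCCESS = 0, SKETCH_BAD_ALLOC = 1, SKETCH_BAD_MATCH = 2 };
enum sketch_msg_type { SKETCH_MSG_CRITICAL, SKETCH_MSG_ERROR };

static int last_error = SKETCH_SUCCESS;
static int callback_count = 0;

/* Leaf: only records the error; never calls back into the reporter. */
static void internal_error(int err)
{
   last_error = err;
}

/* Pretends to run the debug callback, then records the error via the leaf. */
static void debug_report(int err, enum sketch_msg_type type)
{
   assert(err != SKETCH_SUCCESS);
   callback_count++;
   if (type == SKETCH_MSG_CRITICAL || type == SKETCH_MSG_ERROR)
      internal_error(err);
}

/* Entry point: real errors take the debug path, success only resets state. */
static int report_error(int err)
{
   if (err != SKETCH_SUCCESS) {
      debug_report(err, err == SKETCH_BAD_ALLOC ? SKETCH_MSG_CRITICAL
                                                : SKETCH_MSG_ERROR);
   } else {
      internal_error(err);
   }
   return 0;
}
```

Because debug_report only ever calls the leaf, one report_error call results
in exactly one callback and cannot recurse.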


-Kyle

On 09/06/2016 01:53 PM, Adam Jackson wrote:

> On Wed, 2016-07-06 at 10:33 -0600, Kyle Brenneman wrote:
>
>> @@ -292,6 +292,24 @@ _eglError(EGLint errCode, const char *msg)
>>     return EGL_FALSE;
>>  }
>>
>> +EGLBoolean
>> +_eglError(EGLint errCode, const char *msg)
>> +{
>> +   if (errCode != EGL_SUCCESS) {
>> +      EGLint type;
>> +      if (errCode == EGL_BAD_ALLOC) {
>> +         type = EGL_DEBUG_MSG_CRITICAL_KHR;
>> +      } else {
>> +         type = EGL_DEBUG_MSG_ERROR_KHR;
>> +      }
>> +
>> +      _eglDebugReport(errCode, NULL, msg, type, NULL, NULL);
>> +   } else {
>> +      _eglInternalError(errCode, msg);
>> +   }
>> +   return EGL_FALSE;
>> +}
>
> I don't think this can be right? _eglDebugReport ends with:
>
>    if (type == EGL_DEBUG_MSG_CRITICAL_KHR || type == EGL_DEBUG_MSG_ERROR_KHR) {
>       _eglError(error, command);
>    }
>
> So this looks like it could mutually recurse until you run out of stack
> space and crash. I'll try to write a test to prove the point but maybe
> I'm missing something about how this is meant to work.
>
> - ajax



[Mesa-dev] [PATCH] spirv/nir: Add support for OpAtomicLoad/Store

2016-09-06 Thread Lionel Landwerlin

Fixes new CTS tests :

dEQP-VK.spirv_assembly.instruction.compute.opatomic.load
dEQP-VK.spirv_assembly.instruction.compute.opatomic.store

v2: don't handle images like ssbo/ubo (Jason)

Signed-off-by: Lionel Landwerlin 
Cc: Jason Ekstrand 
---
 src/compiler/spirv/spirv_to_nir.c | 124 ++
 1 file changed, 113 insertions(+), 11 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index fda38f9..1fcd70f 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1640,6 +1640,18 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
   image = *vtn_value(b, w[3], vtn_value_type_image_pointer)->image;
   break;
 
+   case SpvOpAtomicLoad: {
+  image.image =
+ vtn_value(b, w[3], vtn_value_type_access_chain)->access_chain;
+  break;
+   }
+
+   case SpvOpAtomicStore: {
+  image.image =
+ vtn_value(b, w[1], vtn_value_type_access_chain)->access_chain;
+  break;
+   }
+
case SpvOpImageQuerySize:
   image.image =
  vtn_value(b, w[3], vtn_value_type_access_chain)->access_chain;
@@ -1685,6 +1697,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
OP(ImageQuerySize, size)
OP(ImageRead,  load)
OP(ImageWrite, store)
+   OP(AtomicLoad, load)
+   OP(AtomicStore,store)
OP(AtomicExchange, atomic_exchange)
OP(AtomicCompareExchange,  atomic_comp_swap)
OP(AtomicIIncrement,   atomic_add)
@@ -1723,9 +1737,13 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
}
 
switch (opcode) {
+   case SpvOpAtomicLoad:
case SpvOpImageQuerySize:
case SpvOpImageRead:
   break;
+   case SpvOpAtomicStore:
+  intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[4])->def);
+  break;
case SpvOpImageWrite:
   intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[3])->def);
   break;
@@ -1784,6 +1802,8 @@ static nir_intrinsic_op
 get_ssbo_nir_atomic_op(SpvOp opcode)
 {
switch (opcode) {
+   case SpvOpAtomicLoad:  return nir_intrinsic_load_ssbo;
+   case SpvOpAtomicStore: return nir_intrinsic_store_ssbo;
 #define OP(S, N) case SpvOp##S: return nir_intrinsic_ssbo_##N;
OP(AtomicExchange, atomic_exchange)
OP(AtomicCompareExchange,  atomic_comp_swap)
@@ -1808,6 +1828,8 @@ static nir_intrinsic_op
 get_shared_nir_atomic_op(SpvOp opcode)
 {
switch (opcode) {
+   case SpvOpAtomicLoad:  return nir_intrinsic_load_var;
+   case SpvOpAtomicStore: return nir_intrinsic_store_var;
 #define OP(S, N) case SpvOp##S: return nir_intrinsic_var_##N;
OP(AtomicExchange, atomic_exchange)
OP(AtomicCompareExchange,  atomic_comp_swap)
@@ -1873,10 +1895,38 @@ static void
 vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, SpvOp opcode,
  const uint32_t *w, unsigned count)
 {
-   struct vtn_access_chain *chain =
-  vtn_value(b, w[3], vtn_value_type_access_chain)->access_chain;
+   struct vtn_access_chain *chain;
nir_intrinsic_instr *atomic;
 
+   switch (opcode) {
+   case SpvOpAtomicLoad:
+   case SpvOpAtomicExchange:
+   case SpvOpAtomicCompareExchange:
+   case SpvOpAtomicCompareExchangeWeak:
+   case SpvOpAtomicIIncrement:
+   case SpvOpAtomicIDecrement:
+   case SpvOpAtomicIAdd:
+   case SpvOpAtomicISub:
+   case SpvOpAtomicSMin:
+   case SpvOpAtomicUMin:
+   case SpvOpAtomicSMax:
+   case SpvOpAtomicUMax:
+   case SpvOpAtomicAnd:
+   case SpvOpAtomicOr:
+   case SpvOpAtomicXor:
+  chain =
+ vtn_value(b, w[3], vtn_value_type_access_chain)->access_chain;
+  break;
+
+   case SpvOpAtomicStore:
+  chain =
+ vtn_value(b, w[1], vtn_value_type_access_chain)->access_chain;
+  break;
+
+   default:
+  unreachable("Invalid SPIR-V atomic");
+   }
+
/*
SpvScope scope = w[4];
SpvMemorySemanticsMask semantics = w[5];
@@ -1897,18 +1947,58 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, 
SpvOp opcode,
   nir_intrinsic_op op = get_ssbo_nir_atomic_op(opcode);
 
   atomic = nir_intrinsic_instr_create(b->nb.shader, op);
-  atomic->src[0] = nir_src_for_ssa(index);
-  atomic->src[1] = nir_src_for_ssa(offset);
-  fill_common_atomic_sources(b, opcode, w, &atomic->src[2]);
+
+  switch (opcode) {
+  case SpvOpAtomicLoad:
+ atomic->num_components = glsl_get_vector_elements(type->type);
+ atomic->src[0] = nir_src_for_ssa(index);
+ atomic->src[1] = nir_src_for_ssa(offset);
+ break;
+
+  case SpvOpAtomicStore:
+ atomic->num_components = glsl_get_vector_elements(type->type);
+ nir_intrinsic_set_write_mask(atomic, (1 << atomic->num_components) - 
1);
+ atomic->src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[4])->def);
+ atomic->src[1] = nir_src_for_ssa(index);
+ atomic->src[2] = nir_src_for_ssa(offset);
+ break;
+
+  case SpvOpAtomicExch

Re: [Mesa-dev] [PATCH 1/3] nir/gcm: Call nir_metadata_preserve

2016-09-06 Thread Eric Anholt

Jason Ekstrand  writes:

> ---
>  src/compiler/nir/nir_opt_gcm.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_gcm.c b/src/compiler/nir/nir_opt_gcm.c
> index 84e32ef..02a9348 100644
> --- a/src/compiler/nir/nir_opt_gcm.c
> +++ b/src/compiler/nir/nir_opt_gcm.c
> @@ -483,6 +483,9 @@ opt_gcm_impl(nir_function_impl *impl)
> }
>  
> ralloc_free(state.blocks);
> +
> +   nir_metadata_preserve(impl, nir_metadata_block_index |
> +   nir_metadata_dominance);
>  }

Reviewed-by: Eric Anholt 



Re: [Mesa-dev] [PATCH 2/3] nir/gcm: Add value numbering support

2016-09-06 Thread Eric Anholt

Jason Ekstrand  writes:

> ---

This seems fine, but the commit message needs some expansion.  Questions
I had:

Does this help compared to an implementation with nir CSE already?

Does nir CSE still get us anything that this doesn't?



Re: [Mesa-dev] [Piglit] Black screen / hang in piglit's spec@ext_texture_lod_bias@lodbias test on radionsi with ubuntu 16.04

2016-09-06 Thread Ilia Mirkin

[-piglit,+mesa-dev]

On Tue, Sep 6, 2016 at 5:03 PM, Dan Kegel  wrote:
> Hi all,
> happily running piglit on Ubuntu 16.04 with an AMD W600 card.  No
> system crashes so far :-)
> But I do have an X hang.  Black screen, test hung, but still available via 
> ssh.
>
> I looked in 
> https://bugs.freedesktop.org/buglist.cgi?component=Drivers%2FGallium%2Fradeonsi&product=Mesa
> but didn't see this hang mentioned offhand; I assume that's the right place.
>
> Here's what I would file there.  Does this look about right?  Or is it
> frowned upon to report
> hangs from ubuntu releases rather than tip?

piglit@ is for ... piglit. The issue you have isn't with piglit but
rather with the driver. Use either mesa-dev@ or dri-devel@ for that.
The bug tracker link you found is the proper place to file a bug.

While I am in no way connected to the AMD team, I can pretty much
guarantee someone will ask you what a W600 card is, so include lspci
-nn output, and your radeonsi_dri.so doesn't include any debug
symbols, making it harder to see what's going on.

Graphics development pace tends to be quick (which is a polite way of
saying that the drivers suck in various ways but are being actively
improved), so testing with the latest and greatest is definitely
encouraged. Different teams provide different levels of support for
various "old" things, not sure where AMD is at exactly in that regard.

Cheers,

  -ilia

[Mesa-dev] [PATCH] st/va: enable vbr rate control for vaapi encode

2016-09-06 Thread boyuan.zhang

From: Boyuan Zhang 

This patch enables variable bit-rate for vaapi encoding. According to va.h,
the target bit-rate equals the maximum bit-rate multiplied by
target_percentage, which is a percentage, so the product is divided by 100.
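
As a small worked example of that arithmetic (sketch_target_bitrate is a
hypothetical helper, not part of st/va, and the 64-bit intermediate is an
addition of this sketch rather than something the patch does):

```c
#include <assert.h>
#include <stdint.h>

/* target_percentage is a percentage of bits_per_second (the maximum rate),
 * so the product is divided by 100.  The multiplication is done in 64 bits
 * so that large bit-rates cannot overflow before the division.
 */
static uint32_t sketch_target_bitrate(uint32_t bits_per_second,
                                      uint32_t target_percentage)
{
   assert(target_percentage <= 100);
   return (uint32_t)((uint64_t)bits_per_second * target_percentage / 100);
}
```

With a 10 Mbit/s maximum and target_percentage of 70, this gives a target
of 7 Mbit/s.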

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/config.c  | 2 +-
 src/gallium/state_trackers/va/picture.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 84bf913..4052316 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -120,7 +120,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
profile, VAEntrypoint en
  value = VA_RT_FORMAT_YUV420;
  break;
   case VAConfigAttribRateControl:
- value = VA_RC_CQP | VA_RC_CBR;
+ value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
  break;
   default:
  value = VA_ATTRIB_NOT_SUPPORTED;
diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index a283e83..7f3d96d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -322,7 +322,7 @@ handleVAEncMiscParameterTypeRateControl(vlVaContext 
*context, VAEncMiscParameter
PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT)
   context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second;
else
-  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * 
rc->target_percentage;
+  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * 
rc->target_percentage / 100;
context->desc.h264enc.rate_ctrl.peak_bitrate = rc->bits_per_second;
if (context->desc.h264enc.rate_ctrl.target_bitrate < 200)
   context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
MIN2((context->desc.h264enc.rate_ctrl.target_bitrate * 2.75), 200);
-- 
2.7.4


Re: [Mesa-dev] [PATCH 2/3] nir/gcm: Add value numbering support

2016-09-06 Thread Jason Ekstrand

On Sep 6, 2016 1:50 PM, "Eric Anholt"  wrote:
>
> Jason Ekstrand  writes:
>
> > ---
>
> This seems fine, but the commit message needs some expansion.  Questions
> I had:
>
> Does this help compared to an implementation with nir CSE already?

Yes, value numbering is capable of detecting common values even if one does
not dominate the other.  For instance, if you have

if (...) {
   ssa_1 = ssa_0 + 7;
   /* use ssa_1 */
} else {
   ssa_2 = ssa_0 + 7;
   /* use ssa_2 */
}

Global value numbering doesn't care about dominance relationships so it
figures out that ssa_1 and ssa_2 are the same and converts this to

if (...) {
   ssa_1 = ssa_0 + 7;
   /* use ssa_1 */
} else {
   /* use ssa_1 */
}

Obviously, we just broke SSA form which is bad.  Global code motion,
however, will repair this for us by turning this into

ssa_1 = ssa_0 + 7;
if (...) {
   /* use ssa_1 */
} else {
   /* use ssa_1 */
}
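
The value-numbering half of this can be sketched in a few lines. This is an
illustration only, not the nir_opt_gcm code: sketch_instr and
sketch_value_number are hypothetical, instructions are restricted to
"dst = src + imm", and a linear search stands in for a hash table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction: dst = src + imm. */
struct sketch_instr {
   int block;   /* basic block the instruction lives in (ignored below) */
   int src;     /* SSA index of the source operand */
   int imm;     /* constant added to the source */
};

/* For each instruction, store the index of the first instruction that
 * computes the same value.  Block membership is deliberately not compared,
 * which is what lets the two arms of an if/else share one value.
 */
static void sketch_value_number(const struct sketch_instr *instrs, size_t n,
                                size_t *leader)
{
   assert(n == 0 || (instrs && leader));
   for (size_t i = 0; i < n; i++) {
      leader[i] = i;
      for (size_t j = 0; j < i; j++) {
         if (instrs[j].src == instrs[i].src &&
             instrs[j].imm == instrs[i].imm) {
            leader[i] = leader[j];
            break;
         }
      }
   }
}
```

Given the two "ssa_0 + 7" instructions from the example in different blocks,
both end up with the same leader; placing that leader somewhere that
dominates every use is then the code-motion pass's job.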

> Does nir CSE still get us anything that this doesn't?

It's still useful primarily because it's less of a scorched-earth approach
and doesn't require GCM.  This makes it a bit more appropriate for use as a
clean-up in a late optimization run.

I can add some of that to the commit message.

[Mesa-dev] [PATCH] mapi: add gl32.h to the list of GLES3 headers for installation

2016-09-06 Thread Ilia Mirkin

This was missed when I added the updated (and new) Khronos headers.

Signed-off-by: Ilia Mirkin 
---
 src/mapi/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
index 68a28a2..3b0a9de 100644
--- a/src/mapi/Makefile.am
+++ b/src/mapi/Makefile.am
@@ -207,6 +207,7 @@ GLES3_includedir = $(includedir)/GLES3
 GLES3_include_HEADERS = \
$(top_srcdir)/include/GLES3/gl3.h \
$(top_srcdir)/include/GLES3/gl31.h \
+   $(top_srcdir)/include/GLES3/gl32.h \
$(top_srcdir)/include/GLES3/gl3ext.h \
$(top_srcdir)/include/GLES3/gl3platform.h
 
-- 
2.7.3


Re: [Mesa-dev] [PATCH] mapi: add gl32.h to the list of GLES3 headers for installation

2016-09-06 Thread Mark Janes

Reviewed-by: Mark Janes 
Tested-by: Mark Janes 

Ilia Mirkin  writes:

> This was missed when I added the updated (and new) Khronos headers.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mapi/Makefile.am | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
> index 68a28a2..3b0a9de 100644
> --- a/src/mapi/Makefile.am
> +++ b/src/mapi/Makefile.am
> @@ -207,6 +207,7 @@ GLES3_includedir = $(includedir)/GLES3
>  GLES3_include_HEADERS = \
>   $(top_srcdir)/include/GLES3/gl3.h \
>   $(top_srcdir)/include/GLES3/gl31.h \
> + $(top_srcdir)/include/GLES3/gl32.h \
>   $(top_srcdir)/include/GLES3/gl3ext.h \
>   $(top_srcdir)/include/GLES3/gl3platform.h
>  
> -- 
> 2.7.3
>

Re: [Mesa-dev] [PATCH 1/2] st/va: enable dual instances encode by sync surface

2016-09-06 Thread Mark Thompson


Hi,

This patch (applied as 
)
 changes the meaning of vaSyncSurface() in a way I don't think is quite right.

The way it is implemented appears to make vaSyncSurface() mean "given a surface 
with an outstanding rendering operation we haven't yet synced, wait for that 
operation to complete or consume the notification that it already has".

I think this should really be "given any surface, if the surface has an 
outstanding rendering operation, wait for that operation to complete".
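
A minimal sketch of the semantics I mean, using made-up state rather than
the real st/va structures (sketch_surface and sketch_sync_surface are
hypothetical, and the wait is reduced to a counter):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical surface state; not the st/va structures. */
struct sketch_surface {
   bool has_pending_op;  /* an operation was submitted and not yet waited on */
   int  waits_done;      /* how many times a wait was actually needed */
};

/* Sync that is valid on any surface: wait only if something is outstanding,
 * otherwise succeed immediately.  Calling it twice in a row is harmless.
 */
static int sketch_sync_surface(struct sketch_surface *surf)
{
   assert(surf);
   if (surf->has_pending_op) {
      surf->waits_done++;            /* stand-in for blocking on the hardware */
      surf->has_pending_op = false;
   }
   return 0;                         /* stand-in for a success status */
}
```

Under this definition, syncing a never-rendered surface and syncing the same
surface twice both succeed without waiting.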

The immediate case where this is visible is if you call vaSyncSurface() on a 
surface which has never been rendered to, you get an error.  (For why you might 
want to do this, consider code which is uploading/downloading surfaces - you 
want to sync before copying, but that should be as late as possible and your 
code there doesn't want to keep track of what surfaces are being used for; for 
the code where I am hitting this have a look at 
.)

So, I want to suggest something like:

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 3ee1cdd..aad296e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -112,12 +112,7 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
render_target)
}

context = handle_table_get(drv->htab, surf->ctx);
-   if (!context) {
-  pipe_mutex_unlock(drv->mutex);
-  return VA_STATUS_ERROR_INVALID_CONTEXT;
-   }
-
-   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
+   if (context && context->decoder->entrypoint == 
PIPE_VIDEO_ENTRYPOINT_ENCODE) {
   int frame_diff;
   if (context->desc.h264enc.frame_num_cnt > surf->frame_num_cnt)
  frame_diff = context->desc.h264enc.frame_num_cnt - 
surf->frame_num_cnt;

However, this doesn't quite work.  Now you can upload once to a surface which 
hasn't been touched before in order to pass it to an encoder, but any following 
upload to that same surface (when you reuse it) is running into the fact that 
it has already been synced to by the encoder (to wait for the bitstream 
output).  That's now undefined behaviour by the definition above (there isn't 
an outstanding rendering operation).

(It segfaults in rcve_get_feedback() because surf->feedback is destroyed the 
first time and therefore null the second time.  You can also get this case by 
calling vaSyncSurface() twice on any surface which has been rendered to.)

Some simplistic attempts to fix this by checking surf->feedback in 
vlVaSyncSurface() before calling get_feedback() didn't yield anything useful 
(seems to spin in the second sync operation), and I don't really know enough 
about this code to do much more.

So, would you mind having another look at this patch, and clarifying what you 
think vaSyncSurface() should mean?

Thanks,

- Mark

[Mesa-dev] [PATCH 1/3] nir/spirv: Swap the argument order for AtomicCompareExchange

2016-09-06 Thread Jason Ekstrand

SPIR-V has the two arguments in the opposite order from GLSL.  NIR uses the
GLSL order so we had them backwards.

Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex

Signed-off-by: Jason Ekstrand 
Cc: "12.0" 
---
 src/compiler/spirv/spirv_to_nir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index fda38f9..4c0c794 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1847,8 +1847,8 @@ fill_common_atomic_sources(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpAtomicCompareExchange:
-  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
-  src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
+  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
+  src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
   break;
   /* Fall through */
 
-- 
2.5.0.400.gff86faf
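
For readers following the word indices, here is a minimal sketch of the
operand mapping this patch fixes.  It works on plain integers rather than the
real vtn/NIR types.  The word layout follows the SPIR-V definition of
OpAtomicCompareExchange (two Memory Semantics words, then Value, then
Comparator); the struct and function names are hypothetical and exist only
for illustration.

```c
#include <stdint.h>

/* Word indices inside an OpAtomicCompareExchange instruction:
 * w[1] result type, w[2] result id, w[3] pointer, w[4] scope,
 * w[5] "equal" semantics, w[6] "unequal" semantics,
 * w[7] value, w[8] comparator.
 */
enum {
   CMPXCHG_WORD_VALUE = 7,
   CMPXCHG_WORD_COMPARATOR = 8,
};

/* Hypothetical stand-in for the two NIR sources, in GLSL
 * atomicCompSwap(mem, compare, data) order.
 */
struct toy_cmpxchg_sources {
   uint32_t compare;   /* corresponds to src[0] in the patch */
   uint32_t data;      /* corresponds to src[1] in the patch */
};

static struct toy_cmpxchg_sources
toy_cmpxchg_sources_from_spirv(const uint32_t *w)
{
   struct toy_cmpxchg_sources s;

   /* SPIR-V lists the value first and the comparator second, so the two
    * words are swapped when producing GLSL-ordered sources.
    */
   s.compare = w[CMPXCHG_WORD_COMPARATOR];
   s.data = w[CMPXCHG_WORD_VALUE];
   return s;
}
```

Passing an instruction whose w[7] is the new value and w[8] is the comparator
yields compare == w[8] and data == w[7], which matches the "+" lines above.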


[Mesa-dev] [PATCH 3/3] nir/spirv: Use fill_common_atomic_sources for image atomics

2016-09-06 Thread Jason Ekstrand

We had two almost identical copies of this code and they were both broken
but in different ways.  The previous two commits fixed both of them.  This
one just unifies them so that it's easier to handle in the future.

Signed-off-by: Jason Ekstrand 
---
 src/compiler/spirv/spirv_to_nir.c | 99 +--
 1 file changed, 43 insertions(+), 56 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index 0d6a70e..e91a7b2 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1589,6 +1589,47 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
nir_builder_instr_insert(&b->nb, &instr->instr);
 }
 
+static void
+fill_common_atomic_sources(struct vtn_builder *b, SpvOp opcode,
+   const uint32_t *w, nir_src *src)
+{
+   switch (opcode) {
+   case SpvOpAtomicIIncrement:
+  src[0] = nir_src_for_ssa(nir_imm_int(&b->nb, 1));
+  break;
+
+   case SpvOpAtomicIDecrement:
+  src[0] = nir_src_for_ssa(nir_imm_int(&b->nb, -1));
+  break;
+
+   case SpvOpAtomicISub:
+  src[0] =
+ nir_src_for_ssa(nir_ineg(&b->nb, vtn_ssa_value(b, w[6])->def));
+  break;
+
+   case SpvOpAtomicCompareExchange:
+  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
+  src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
+  break;
+  /* Fall through */
+
+   case SpvOpAtomicExchange:
+   case SpvOpAtomicIAdd:
+   case SpvOpAtomicSMin:
+   case SpvOpAtomicUMin:
+   case SpvOpAtomicSMax:
+   case SpvOpAtomicUMax:
+   case SpvOpAtomicAnd:
+   case SpvOpAtomicOr:
+   case SpvOpAtomicXor:
+  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[6])->def);
+  break;
+
+   default:
+  unreachable("Invalid SPIR-V atomic");
+   }
+}
+
 static nir_ssa_def *
 get_image_coord(struct vtn_builder *b, uint32_t value)
 {
@@ -1729,13 +1770,9 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
case SpvOpImageWrite:
   intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[3])->def);
   break;
+
case SpvOpAtomicIIncrement:
-  intrin->src[2] = nir_src_for_ssa(nir_imm_int(&b->nb, 1));
-  break;
case SpvOpAtomicIDecrement:
-  intrin->src[2] = nir_src_for_ssa(nir_imm_int(&b->nb, -1));
-  break;
-
case SpvOpAtomicExchange:
case SpvOpAtomicIAdd:
case SpvOpAtomicSMin:
@@ -1745,16 +1782,7 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
case SpvOpAtomicAnd:
case SpvOpAtomicOr:
case SpvOpAtomicXor:
-  intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[6])->def);
-  break;
-
-   case SpvOpAtomicCompareExchange:
-  intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
-  intrin->src[3] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
-  break;
-
-   case SpvOpAtomicISub:
-  intrin->src[2] = nir_src_for_ssa(nir_ineg(&b->nb, vtn_ssa_value(b, w[6])->def));
+  fill_common_atomic_sources(b, opcode, w, &intrin->src[2]);
   break;
 
default:
@@ -1829,47 +1857,6 @@ get_shared_nir_atomic_op(SpvOp opcode)
 }
 
 static void
-fill_common_atomic_sources(struct vtn_builder *b, SpvOp opcode,
-   const uint32_t *w, nir_src *src)
-{
-   switch (opcode) {
-   case SpvOpAtomicIIncrement:
-  src[0] = nir_src_for_ssa(nir_imm_int(&b->nb, 1));
-  break;
-
-   case SpvOpAtomicIDecrement:
-  src[0] = nir_src_for_ssa(nir_imm_int(&b->nb, -1));
-  break;
-
-   case SpvOpAtomicISub:
-  src[0] =
- nir_src_for_ssa(nir_ineg(&b->nb, vtn_ssa_value(b, w[6])->def));
-  break;
-
-   case SpvOpAtomicCompareExchange:
-  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
-  src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
-  break;
-  /* Fall through */
-
-   case SpvOpAtomicExchange:
-   case SpvOpAtomicIAdd:
-   case SpvOpAtomicSMin:
-   case SpvOpAtomicUMin:
-   case SpvOpAtomicSMax:
-   case SpvOpAtomicUMax:
-   case SpvOpAtomicAnd:
-   case SpvOpAtomicOr:
-   case SpvOpAtomicXor:
-  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[6])->def);
-  break;
-
-   default:
-  unreachable("Invalid SPIR-V atomic");
-   }
-}
-
-static void
 vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, SpvOp opcode,
  const uint32_t *w, unsigned count)
 {
-- 
2.5.0.400.gff86faf
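
The unification works because the helper only ever writes src[0] (and src[1]
for compare-exchange), so each caller can point it at whichever slot of its
own source array the atomic operands start in.  The image path above passes
&intrin->src[2].  Below is a toy sketch of that pattern with plain integers;
all names are hypothetical and the opcode set is reduced to four cases.

```c
#include <stdint.h>

enum toy_atomic_op {
   TOY_ATOMIC_INC,
   TOY_ATOMIC_DEC,
   TOY_ATOMIC_ADD,
   TOY_ATOMIC_COMP_SWAP,
};

/* Fills src[0], plus src[1] for compare-and-swap, from instruction words w.
 * The caller decides where in its own array "src" points.
 */
static void
toy_fill_atomic_sources(enum toy_atomic_op op, const int32_t *w, int32_t *src)
{
   switch (op) {
   case TOY_ATOMIC_INC:
      src[0] = 1;
      break;
   case TOY_ATOMIC_DEC:
      src[0] = -1;
      break;
   case TOY_ATOMIC_ADD:
      src[0] = w[6];
      break;
   case TOY_ATOMIC_COMP_SWAP:
      src[0] = w[8];   /* comparator */
      src[1] = w[7];   /* new value */
      break;
   }
}

/* Image-style caller: slots 0 and 1 hold other data (coordinate and sample
 * in this toy), so the atomic operands start at slot 2.
 */
static void
toy_image_atomic(enum toy_atomic_op op, const int32_t *w, int32_t srcs[4])
{
   toy_fill_atomic_sources(op, w, &srcs[2]);
}
```

A caller with a different source layout would pass a different starting
slot, which is why one helper is enough for both code paths.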


[Mesa-dev] [PATCH 2/3] nir/spirv: Use the correct sources for CompareExchange on images

2016-09-06 Thread Jason Ekstrand

The CompareExchange operation has two "Memory Semantics" parameters instead
of one so the real arguments start at w[7] instead of w[6].

Signed-off-by: Jason Ekstrand 
Cc: "12.0" 
---
 src/compiler/spirv/spirv_to_nir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index 4c0c794..0d6a70e 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1749,8 +1749,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpAtomicCompareExchange:
-  intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
-  intrin->src[3] = nir_src_for_ssa(vtn_ssa_value(b, w[6])->def);
+  intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
+  intrin->src[3] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
   break;
 
case SpvOpAtomicISub:
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH] glsl: Add a citation for uniform precision matching.

2016-09-06 Thread Kenneth Graunke

On Tuesday, September 6, 2016 1:04:43 PM PDT Eric Anholt wrote:
> Kenneth Graunke  writes:
> 
> > Ian added this check in commit 259fc505454ea6a67aeacf6cdebf1398d9947759.
> > While reviewing the rules, I found a citation which spells this out
> > clearly, so I figured I'd send a patch to add it as a comment.
> >
> > Cc: i...@freedesktop.org
> > Signed-off-by: Kenneth Graunke 
> 
> Reviewed-by: Eric Anholt 
> 
> I was wondering about this change, because glmark2 is failing to compile
> its terrain shaders now.

Really?  GLBenchmark 2.7 also fails to compile:

https://bugs.freedesktop.org/show_bug.cgi?id=97532

I'm beginning to doubt whether any other vendor implements this part of
the spec, or if they have some variation of it.


Re: [Mesa-dev] [PATCH 1/3] nir/spirv: Swap the argument order for AtomicCompareExchange

2016-09-06 Thread Dave Airlie

For the series:

Reviewed-by: Dave Airlie 

On 7 September 2016 at 08:17, Jason Ekstrand  wrote:
> SPIR-V has the two arguments in the opposite order from GLSL.  NIR uses the
> GLSL order so we had them backwards.
>
> Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex
>
> Signed-off-by: Jason Ekstrand 
> Cc: "12.0" 
> ---
>  src/compiler/spirv/spirv_to_nir.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
> index fda38f9..4c0c794 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -1847,8 +1847,8 @@ fill_common_atomic_sources(struct vtn_builder *b, SpvOp opcode,
>break;
>
> case SpvOpAtomicCompareExchange:
> -  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
> -  src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
> +  src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
> +  src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
>break;
>/* Fall through */
>
> --
> 2.5.0.400.gff86faf
>

Re: [Mesa-dev] [PATCH 14/33] intel/blorp: Add an entrypoint for doing bit-for-bit copies

2016-09-06 Thread Nanley Chery

On Wed, Aug 31, 2016 at 02:22:33PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/blorp/blorp.h  |  10 
>  src/intel/blorp/blorp_blit.c | 133 +++
>  2 files changed, 143 insertions(+)
> 
> diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> index c1e93fd..6574124 100644
> --- a/src/intel/blorp/blorp.h
> +++ b/src/intel/blorp/blorp.h
> @@ -109,6 +109,16 @@ blorp_blit(struct blorp_batch *batch,
> uint32_t filter, bool mirror_x, bool mirror_y);
>  
>  void
> +blorp_copy(struct blorp_batch *batch,
> +   const struct blorp_surf *src_surf,
> +   unsigned src_level, unsigned src_layer,
> +   const struct blorp_surf *dst_surf,
> +   unsigned dst_level, unsigned dst_layer,
> +   uint32_t src_x, uint32_t src_y,
> +   uint32_t dst_x, uint32_t dst_y,
> +   uint32_t src_width, uint32_t src_height);
> +
> +void
>  blorp_fast_clear(struct blorp_batch *batch,
>   const struct blorp_surf *surf,
>   uint32_t level, uint32_t layer, enum isl_format format,
> diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
> index 3ab39a3..42a502c 100644
> --- a/src/intel/blorp/blorp_blit.c
> +++ b/src/intel/blorp/blorp_blit.c
> @@ -1685,3 +1685,136 @@ blorp_blit(struct blorp_batch *batch,
>   dst_x0, dst_y0, dst_x1, dst_y1,
>   mirror_x, mirror_y);
>  }
> +
> +static enum isl_format
> +get_copy_format_for_bpb(unsigned bpb)
> +{
> +   /* The choice of UNORM and UINT formats is very intentional here.  Most of
> +* the time, we want to use a UINT format to avoid any rounding error in
> +* the blit.  For stencil blits, R8_UINT is required by the hardware.
> +* (It's the only format allowed in conjunction with W-tiling.)  Also we
> +* intentionally use the 4-channel formats whenever we can.  This is so
> +* that, when we do a RGB <-> RGBX copy, the two formats will line up even
> +* though one of them is 3/4 the size of the other.  The choice of UNORM
> +* vs. UINT is also very intentional because Haswell doesn't handle 8 or
> +* 16-bit RGB UINT formats at all so we have to use UNORM there.
> +* Fortunately, the only time we should ever use two different formats in
> +* the table below is for RGB -> RGBA blits and so we will never have any
> +* UNORM/UINT mismatch.
> +*/
> +   switch (bpb) {
> +   case 8:  return ISL_FORMAT_R8_UINT;
> +   case 16: return ISL_FORMAT_R8G8_UINT;
> +   case 24: return ISL_FORMAT_R8G8B8_UNORM;
> +   case 32: return ISL_FORMAT_R8G8B8A8_UNORM;
> +   case 48: return ISL_FORMAT_R16G16B16_UNORM;
> +   case 64: return ISL_FORMAT_R16G16B16A16_UNORM;
> +   case 96: return ISL_FORMAT_R32G32B32_UINT;
> +   case 128:return ISL_FORMAT_R32G32B32A32_UINT;
> +   default:
> +  unreachable("Unknown format bpb");
> +   }
> +}
> +
> +static void
> +surf_convert_to_uncompressed(const struct isl_device *isl_dev,
> + struct brw_blorp_surface_info *info,
> + uint32_t *x, uint32_t *y,
> + uint32_t *width, uint32_t *height)
> +{
> +   const struct isl_format_layout *fmtl =
> +  isl_format_get_layout(info->surf.format);
> +
> +   assert(fmtl->bw > 1 || fmtl->bh > 1);
> +
> +   /* This is a compressed surface.  We need to convert it to a single
> +* slice (becase compressed layouts don't perfectly match uncompressed
> +* ones with the same bpb) and divide x, y, width, and height by the
> +* block size.
> +*/
> +   surf_convert_to_single_slice(isl_dev, info);
> +
> +   if (width || height) {
> +  assert(*width % fmtl->bw == 0 ||
> + *x + *width == info->surf.logical_level0_px.width);
> +  assert(*height % fmtl->bh == 0 ||
> + *y + *height == info->surf.logical_level0_px.height);
> +  *width = DIV_ROUND_UP(*width, fmtl->bw);
> +  *height = DIV_ROUND_UP(*height, fmtl->bh);
> +   }
> +
> +   assert(*x % fmtl->bw == 0);
> +   assert(*y % fmtl->bh == 0);
> +   *x /= fmtl->bw;
> +   *y /= fmtl->bh;
> +
> +   info->surf.logical_level0_px.width =
> +  DIV_ROUND_UP(info->surf.logical_level0_px.width, fmtl->bw);
> +   info->surf.logical_level0_px.height =
> +  DIV_ROUND_UP(info->surf.logical_level0_px.height, fmtl->bh);
> +
> +   assert(info->surf.phys_level0_sa.width % fmtl->bw == 0);
> +   assert(info->surf.phys_level0_sa.height % fmtl->bh == 0);
> +   info->surf.phys_level0_sa.width /= fmtl->bw;
> +   info->surf.phys_level0_sa.height /= fmtl->bh;
> +
> +   assert(info->tile_x_sa % fmtl->bw == 0);
> +   assert(info->tile_y_sa % fmtl->bh == 0);
> +   info->tile_x_sa /= fmtl->bw;
> +   info->tile_y_sa /= fmtl->bh;
> +
> +   /* It's now an uncompressed surface so we need an uncompressed format */
> +   info->surf.format = get_copy_format_for_bpb(fmtl->bpb);
> +}
> +
> +void
> +blorp_copy(struct blorp_batch *batch,
> +   const struc
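
The block-size arithmetic in the quoted surf_convert_to_uncompressed() can
be sketched in isolation as follows.  This is a toy version under stated
assumptions: it works on a bare rectangle instead of brw_blorp_surface_info,
and it returns an error code where the patch uses assert().  The struct and
function names are hypothetical.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

struct toy_rect {
   uint32_t x, y, width, height;
};

/* Convert a rectangle from pixels to compression blocks of bw x bh pixels.
 * The origin must be block-aligned.  The size must be block-aligned too,
 * unless the rectangle runs up to the edge of the surface, where a partial
 * block is allowed.  Returns 0 on success and -1 if a rule is violated.
 */
static int
toy_rect_px_to_blocks(struct toy_rect *r, uint32_t bw, uint32_t bh,
                      uint32_t surf_width, uint32_t surf_height)
{
   if (r->x % bw != 0 || r->y % bh != 0)
      return -1;
   if (r->width % bw != 0 && r->x + r->width != surf_width)
      return -1;
   if (r->height % bh != 0 && r->y + r->height != surf_height)
      return -1;

   /* Offsets divide exactly; sizes round up to cover a partial edge block. */
   r->x /= bw;
   r->y /= bh;
   r->width = DIV_ROUND_UP(r->width, bw);
   r->height = DIV_ROUND_UP(r->height, bh);
   return 0;
}
```

For a 10x10 surface with 4x4 blocks, the 2x2 pixel rectangle at (8, 8)
becomes a 1x1 block rectangle at (2, 2), because it touches the surface
edge and the partial block is rounded up.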

[Mesa-dev] [PATCH] doc: fix typo of GALLIUM_HUD_TOGGLE_SIGNAL

2016-09-06 Thread Christoph Haag

From: Christoph Haag 

The misspelled name was also used in the original commit message of 56a1c10:
- env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal
---
 docs/envvars.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/envvars.html b/docs/envvars.html
index 6d79398..789f5e9 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -215,7 +215,7 @@ Mesa EGL supports different sets of environment variables.  See the
 GALLIUM_HUD_TOGGLE_SIGNAL - toggle visibility via user specified signal.
 Especially useful to toggle hud at specific points of application and
 disable for unencumbered viewing the rest of the time. For example, set
-GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_SIGNAL_TOGGLE to 10 (SIGUSR1).
+GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
 Use kill -10 <pid> to toggle the hud as desired.
 GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
 rather than stderr.
-- 
2.9.3
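
As background for the documented behaviour, here is a minimal sketch of
toggling a visibility flag from a signal handler.  It is not the HUD's
actual implementation; the flag and function names are hypothetical.  Note
that 10 is the Linux number for SIGUSR1 on x86 and ARM, and other platforms
may number it differently.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical visibility flag; sig_atomic_t is safe to write in a handler. */
static volatile sig_atomic_t toy_hud_visible = 0;

static void
toy_toggle_handler(int signo)
{
   /* ISO C allows the disposition to reset to SIG_DFL on delivery, so
    * re-arm the handler before doing anything else.
    */
   signal(signo, toy_toggle_handler);
   toy_hud_visible = !toy_hud_visible;
}

/* Install the toggle handler for the user-chosen signal number. */
static bool
toy_install_toggle_signal(int signo)
{
   return signal(signo, toy_toggle_handler) != SIG_ERR;
}
```

After toy_install_toggle_signal(SIGUSR1), each delivery of SIGUSR1 (for
example from kill, or from raise() in the same process) flips the flag.
Production code would more likely use sigaction(), which gives well-defined
handler persistence without the re-arm step.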


Re: [Mesa-dev] [PATCH 14/33] intel/blorp: Add an entrypoint for doing bit-for-bit copies

2016-09-06 Thread Jason Ekstrand

On Tue, Sep 6, 2016 at 4:12 PM, Nanley Chery  wrote:

> On Wed, Aug 31, 2016 at 02:22:33PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/blorp/blorp.h  |  10 
> >  src/intel/blorp/blorp_blit.c | 133 +++
> >  2 files changed, 143 insertions(+)
> >
> > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> > index c1e93fd..6574124 100644
> > --- a/src/intel/blorp/blorp.h
> > +++ b/src/intel/blorp/blorp.h
> > @@ -109,6 +109,16 @@ blorp_blit(struct blorp_batch *batch,
> > uint32_t filter, bool mirror_x, bool mirror_y);
> >
> >  void
> > +blorp_copy(struct blorp_batch *batch,
> > +   const struct blorp_surf *src_surf,
> > +   unsigned src_level, unsigned src_layer,
> > +   const struct blorp_surf *dst_surf,
> > +   unsigned dst_level, unsigned dst_layer,
> > +   uint32_t src_x, uint32_t src_y,
> > +   uint32_t dst_x, uint32_t dst_y,
> > +   uint32_t src_width, uint32_t src_height);
> > +
> > +void
> >  blorp_fast_clear(struct blorp_batch *batch,
> >   const struct blorp_surf *surf,
> >   uint32_t level, uint32_t layer, enum isl_format format,
> > diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
> > index 3ab39a3..42a502c 100644
> > --- a/src/intel/blorp/blorp_blit.c
> > +++ b/src/intel/blorp/blorp_blit.c
> > @@ -1685,3 +1685,136 @@ blorp_blit(struct blorp_batch *batch,
> >   dst_x0, dst_y0, dst_x1, dst_y1,
> >   mirror_x, mirror_y);
> >  }
> > +
> > +static enum isl_format
> > +get_copy_format_for_bpb(unsigned bpb)
> > +{
> > +   /* The choice of UNORM and UINT formats is very intentional here.  Most of
> > +* the time, we want to use a UINT format to avoid any rounding error in
> > +* the blit.  For stencil blits, R8_UINT is required by the hardware.
> > +* (It's the only format allowed in conjunction with W-tiling.)  Also we
> > +* intentionally use the 4-channel formats whenever we can.  This is so
> > +* that, when we do a RGB <-> RGBX copy, the two formats will line up even
> > +* though one of them is 3/4 the size of the other.  The choice of UNORM
> > +* vs. UINT is also very intentional because Haswell doesn't handle 8 or
> > +* 16-bit RGB UINT formats at all so we have to use UNORM there.
> > +* Fortunately, the only time we should ever use two different formats in
> > +* the table below is for RGB -> RGBA blits and so we will never have any
> > +* UNORM/UINT mismatch.
> > +*/
> > +   switch (bpb) {
> > +   case 8:  return ISL_FORMAT_R8_UINT;
> > +   case 16: return ISL_FORMAT_R8G8_UINT;
> > +   case 24: return ISL_FORMAT_R8G8B8_UNORM;
> > +   case 32: return ISL_FORMAT_R8G8B8A8_UNORM;
> > +   case 48: return ISL_FORMAT_R16G16B16_UNORM;
> > +   case 64: return ISL_FORMAT_R16G16B16A16_UNORM;
> > +   case 96: return ISL_FORMAT_R32G32B32_UINT;
> > +   case 128:return ISL_FORMAT_R32G32B32A32_UINT;
> > +   default:
> > +  unreachable("Unknown format bpb");
> > +   }
> > +}
> > +
> > +static void
> > +surf_convert_to_uncompressed(const struct isl_device *isl_dev,
> > + struct brw_blorp_surface_info *info,
> > + uint32_t *x, uint32_t *y,
> > + uint32_t *width, uint32_t *height)
> > +{
> > +   const struct isl_format_layout *fmtl =
> > +  isl_format_get_layout(info->surf.format);
> > +
> > +   assert(fmtl->bw > 1 || fmtl->bh > 1);
> > +
> > +   /* This is a compressed surface.  We need to convert it to a single
> > +* slice (becase compressed layouts don't perfectly match uncompressed
> > +* ones with the same bpb) and divide x, y, width, and height by the
> > +* block size.
> > +*/
> > +   surf_convert_to_single_slice(isl_dev, info);
> > +
> > +   if (width || height) {
> > +  assert(*width % fmtl->bw == 0 ||
> > + *x + *width == info->surf.logical_level0_px.width);
> > +  assert(*height % fmtl->bh == 0 ||
> > + *y + *height == info->surf.logical_level0_px.height);
> > +  *width = DIV_ROUND_UP(*width, fmtl->bw);
> > +  *height = DIV_ROUND_UP(*height, fmtl->bh);
> > +   }
> > +
> > +   assert(*x % fmtl->bw == 0);
> > +   assert(*y % fmtl->bh == 0);
> > +   *x /= fmtl->bw;
> > +   *y /= fmtl->bh;
> > +
> > +   info->surf.logical_level0_px.width =
> > +  DIV_ROUND_UP(info->surf.logical_level0_px.width, fmtl->bw);
> > +   info->surf.logical_level0_px.height =
> > +  DIV_ROUND_UP(info->surf.logical_level0_px.height, fmtl->bh);
> > +
> > +   assert(info->surf.phys_level0_sa.width % fmtl->bw == 0);
> > +   assert(info->surf.phys_level0_sa.height % fmtl->bh == 0);
> > +   info->surf.phys_level0_sa.width /= fmtl->bw;
> > +   info->surf.phys_level0_sa.height /= fmtl->bh;
> > +
> > +   assert(info->tile_x_sa % fmtl->bw == 0);
> > +   assert(info->tile_y_sa % fmtl->bh == 0);
> > +   i

Re: [Mesa-dev] [PATCH 3/4] radeonsi: add more unlikely() uses into si_draw_vbo

2016-09-06 Thread Michel Dänzer

On 06/09/16 08:33 PM, Marek Olšák wrote:
> On Sep 6, 2016 12:03 PM, "Michel Dänzer" wrote:
>> On 06/09/16 06:04 PM, Marek Olšák wrote:
>> > On Tue, Sep 6, 2016 at 3:54 AM, Michel Dänzer wrote:
>> >> On 06/09/16 07:46 AM, Marek Olšák wrote:
>> >>> From: Marek Olšák <marek.ol...@amd.com>
>> >>
>> >> Did you measure any significant performance boost with this change?
>> >
>> > I didn't measure anything.
>> >
>> >> Otherwise, using (un)likely can be bad because it can defeat the CPU's
>> >> branch prediction, which tends to be pretty good these days.
>> >
>> > I'm not an expert on that, but it doesn't seem to be the case
>> > according to other people's comments here.
>>
>> My main point (which Gustaw seems to agree with) is that (un)likely
>> should only be used when measurements show that they have a positive effect.
> 
> I agree with you, but do you always measure the effect of unlikely? I
> almost never do and I just use it instinctively like most people do. Due
> to our manpower constraints, we can't even afford to measure performance
> for much bigger changes than this.

So let's spend our manpower on more important things than (un)likely
annotations. :)


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH 3/4] radeonsi: add more unlikely() uses into si_draw_vbo

2016-09-06 Thread Rob Clark

On Tue, Sep 6, 2016 at 8:23 PM, Michel Dänzer  wrote:
> On 06/09/16 08:33 PM, Marek Olšák wrote:
>> On Sep 6, 2016 12:03 PM, "Michel Dänzer" wrote:
>>> On 06/09/16 06:04 PM, Marek Olšák wrote:
>>> > On Tue, Sep 6, 2016 at 3:54 AM, Michel Dänzer wrote:
>>> >> On 06/09/16 07:46 AM, Marek Olšák wrote:
>>> >>> From: Marek Olšák <marek.ol...@amd.com>
>>> >>
>>> >> Did you measure any significant performance boost with this change?
>>> >
>>> > I didn't measure anything.
>>> >
>>> >> Otherwise, using (un)likely can be bad because it can defeat the CPU's
>>> >> branch prediction, which tends to be pretty good these days.
>>> >
>>> > I'm not an expert on that, but it doesn't seem to be the case
>>> > according to other people's comments here.
>>>
>>> My main point (which Gustaw seems to agree with) is that (un)likely
>>> should only be used when measurements show that they have a positive effect.
>>
>> I agree with you, but do you always measure the effect of unlikely? I
>> almost never do and I just use it instinctively like most people do. Due
>> to our manpower constraints, we can't even afford to measure performance
>> for much bigger changes than this.
>
> So let's spend our manpower on more important things than (un)likely
> annotations. :)
>

I guess it is still reasonable to use annotations in places that should
really be GL edge cases, where we don't want to penalize the if(false) case
by jumping over multiple cache lines of instructions (i.e. I still plan to
use it for the if(unlikely(samplerExternalOES)) stuff to deal with YUV
external images).  But in cases where there is more doubt, it is probably
not worth using them.

BR,
-R
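
For reference, the kind of annotation under discussion can be sketched like
this.  The macro below is a hypothetical stand-in written for this note, not
Mesa's own definition, and it assumes a compiler that provides
__builtin_expect (GCC and Clang do); elsewhere it degrades to a plain
condition.

```c
#include <stdbool.h>

#if defined(__GNUC__) || defined(__clang__)
/* Hint that the condition is expected to be false most of the time. */
#define toy_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define toy_unlikely(x) (x)
#endif

/* Toy draw function: the external-image path is the rare edge case, so the
 * hint lets the compiler keep its code out of the common fall-through path.
 * The annotation never changes the result, only the code layout.
 */
static int
toy_draw(int vertex_count, bool uses_external_image)
{
   int work = vertex_count;

   if (toy_unlikely(uses_external_image)) {
      /* Rare path: pretend YUV conversion costs three extra units. */
      work += 3 * vertex_count;
   }

   return work;
}
```

Whether the hint helps on a given branch is an empirical question, which is
the point being argued above; the sketch only shows the mechanism.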
