Re: [Mesa-dev] [PATCH 20/20] anv: Implement VK_ANDROID_native_buffer (v2)

2017-09-13 Thread Tapani Pälli



On 09/14/2017 02:03 AM, Chad Versace wrote:

From: Chad Versace 

This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

   Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
   fd into a VkSemaphore or VkFence, the driver instead simply blocks on
   the sync fd, then puts the VkSemaphore or VkFence into the signalled
   state. Thanks to implicit sync, this produces correct behavior (with
   extra latency overhead, perhaps) despite its ugliness.

   Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
   a collection of wait semaphores as a sync fd, the driver instead
   submits the semaphores to the queue, then returns sync fd -1, which
   informs the caller that no additional synchronization is needed.
   Again, thanks to implicit sync, this produces correct behavior (with
   extra batch submission overhead) despite its ugliness.

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

   Reason 1. I've already tested this patch with dEQP and with demos
   apps. It works. I wanted to get the tested patches into the tree now,
   and polish the implementation afterwards.

   Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
   supports neither Android's sync_fence, nor upstream's sync_file, nor
   drm_syncobj. Again, I tested these patches on Android with a 3.18
   kernel and they work.

I plan to quickly follow-up with patches that remove the shortcuts and
properly import/export the sync fds.

Non-Testing
===
I did not test at all using the Android.mk buildsystem. I probably
broke it. Please test and review that.

Testing
===
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass:

   a little spinning cube demo APK
   dEQP-VK.info.*
   dEQP-VK.api.smoke.*
   dEQP-VK.api.info.instance.*
   dEQP-VK.api.info.device.*
   dEQP-VK.api.wsi.android.*

v2:
   - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
 the VkImage.
   - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
 during vkCreateImage. Instead, directly import its dma-buf during
 vkCreateImage with anv_bo_cache_import(). [for jekstrand]
   - Rebase onto Tapani's VK_EXT_debug_report changes.
   - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
 exist.
---
  src/intel/Makefile.sources  |   3 +
  src/intel/Makefile.vulkan.am|   2 +
  src/intel/vulkan/anv_android.c  | 245 
  src/intel/vulkan/anv_device.c   |  12 +-
  src/intel/vulkan/anv_entrypoints_gen.py |  10 +-
  src/intel/vulkan/anv_extensions.py  |   1 +
  src/intel/vulkan/anv_image.c| 141 --
  src/intel/vulkan/anv_private.h  |   1 +
  8 files changed, 405 insertions(+), 10 deletions(-)
  create mode 100644 src/intel/vulkan/anv_android.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 8ca50ff622b..6f2dfa91e20 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -229,6 +229,9 @@ VULKAN_FILES := \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
  
+VULKAN_ANDROID_FILES := \

+   vulkan/anv_android.c
+
  VULKAN_WSI_WAYLAND_FILES := \
vulkan/anv_wsi_wayland.c
  
diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am

index d1b1132ed2e..e9c824f717b 100644
--- a/src/intel/Makefile.vulkan.am
+++ b/src/intel/Makefile.vulkan.am
@@ -147,8 +147,10 @@ VULKAN_LIB_DEPS = \
-lm
  
  if HAVE_PLATFORM_ANDROID

+VULKAN_CPPFLAGS += $(ANDROID_CPPFLAGS)
  VULKAN_CFLAGS += $(ANDROID_CFLAGS)
  VULKAN_LIB_DEPS += $(ANDROID_LIBS)
+VULKAN_SOURCES += $(VULKAN_ANDROID_FILES)
  endif
  
  if HAVE_PLATFORM_X11

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
new file mode 100644
index 000..6b19ace4d2d
--- /dev/null
+++ b/src/intel/vulkan/anv_android.c
@@ -0,0 +1,245 @@
+/*
+ * Copyright 2017 Google
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR 

Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-13 Thread Denis Pauk
Thank you.

I have closed https://bugs.freedesktop.org/show_bug.cgi?id=102552

Best regards,
  Denis.

On Sep 14, 2017 2:01 AM, "Marek Olšák"  wrote:

> I pushed the patch. Thanks.
>
> Marek
>
> On Wed, Sep 13, 2017 at 9:27 PM, Denis Pauk  wrote:
> > Could you please commit this changes?
> >
> > I have not found what i need to do next after recieve "approve" for
> changes
> > in https://www.mesa3d.org/submittingpatches.html#reviewing.
> > Do i need to send new mail with "Reviewed-by: " mark?
> >
> > On Wed, Sep 13, 2017 at 6:36 PM, Marek Olšák  wrote:
> >>
> >> I've changed my mind. The patch is OK:
> >>
> >> Reviewed-by: Marek Olšák 
> >>
> >> Marek
> >>
> >> On Wed, Sep 13, 2017 at 4:45 PM, Denis Pauk 
> wrote:
> >> > What do you think about replace all checks in patch to asserts?
> >> >
> >> >
> >> > Best regards,
> >> >   Denis.
> >> >
> >> > On Sep 13, 2017 1:00 PM, "Marek Olšák"  wrote:
> >> >>
> >> >> On Wed, Sep 13, 2017 at 6:54 AM, Денис Паук 
> >> >> wrote:
> >> >> > Do you mean delete check in u_format.c:: util_format_is_supported?
> >> >> > Could
> >> >> > you
> >> >> > please explain more?
> >> >>
> >> >> I mean delete everything. Invalid formats shouldn't be passed to
> >> >> pipe_screen::is_format_supported. The code doing it needs to be
> fixed.
> >> >>
> >> >> Marek
> >
> >
> >
> >
> > --
> > Best regards,
> >   Denis.
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] glsl: check if induction var incremented before use in terminator

2017-09-13 Thread Timothy Arceri
do-while loops can increment the starting value before the
condition is checked. e.g.

  do {
ndx++;
  } while (ndx < 3);

This commit changes the code to detect this and reduces the
iteration count by 1 if found.
---
 src/compiler/glsl/loop_analysis.cpp | 38 +
 1 file changed, 38 insertions(+)

diff --git a/src/compiler/glsl/loop_analysis.cpp 
b/src/compiler/glsl/loop_analysis.cpp
index b9bae43536..8a0425d185 100644
--- a/src/compiler/glsl/loop_analysis.cpp
+++ b/src/compiler/glsl/loop_analysis.cpp
@@ -25,20 +25,54 @@
 #include "loop_analysis.h"
 #include "ir_hierarchical_visitor.h"
 
 static bool is_loop_terminator(ir_if *ir);
 
 static bool all_expression_operands_are_loop_constant(ir_rvalue *,
  hash_table *);
 
 static ir_rvalue *get_basic_induction_increment(ir_assignment *, hash_table *);
 
+static bool
+incremented_before_termionator(ir_loop *loop, ir_variable *var,
+   ir_if *terminator)
+{
+   for (exec_node *node = loop->body_instructions.get_head();
+!node->is_tail_sentinel();
+node = node->get_next()) {
+  ir_instruction *ir = (ir_instruction *) node;
+
+  switch (ir->ir_type) {
+  case ir_type_if:
+ if (ir->as_if() == terminator)
+return false;
+ break;
+
+  case ir_type_assignment: {
+ ir_assignment *assign = ir->as_assignment();
+ ir_variable *assignee = assign->lhs->whole_variable_referenced();
+
+ if (assignee == var) {
+assert(assign->condition == NULL);
+return true;
+ }
+
+ break;
+  }
+
+  default:
+ break;
+  }
+   }
+
+   unreachable("Unable to find induction variable");
+}
 
 /**
  * Record the fact that the given loop variable was referenced inside the loop.
  *
  * \arg in_assignee is true if the reference was on the LHS of an assignment.
  *
  * \arg in_conditional_code_or_nested_loop is true if the reference occurred
  * inside an if statement or a nested loop.
  *
  * \arg current_assignment is the ir_assignment node that the loop variable is
@@ -436,20 +470,24 @@ loop_analysis::visit_leave(ir_loop *ir)
 
 ir_variable *var = counter->variable_referenced();
 
 ir_rvalue *init = find_initial_value(ir, var);
 
  loop_variable *lv = ls->get(var);
  if (lv != NULL && lv->is_induction_var()) {
 t->iterations = calculate_iterations(init, limit, lv->increment,
  cmp);
 
+if (incremented_before_termionator(ir, var, t->ir)) {
+   t->iterations--;
+}
+
 if (t->iterations >= 0 &&
 (ls->limiting_terminator == NULL ||
  t->iterations < ls->limiting_terminator->iterations)) {
ls->limiting_terminator = t;
 }
  }
  break;
   }
 
   default:
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] glsl: make loop unrolling more like the nir unrolling path

2017-09-13 Thread Timothy Arceri
The old code incorrectly assumed that loop terminators will always
be at the start of the loop. It really seems to be just luck that
we haven't triggered any bugs here, for example if there is a loop
terminator at the start of the loop we might actually ignore any
other terminators that might be later in the loop because we break
before checking all the instructions. Ignoring the other
terminators might result in unrolling loops that we shouldn't be,
or the wrong number of iterations being calculated etc.

Incorrect analysis can also result in loops not being
unrolled at all. For example the current code would unroll:

  int j = 0;
  do {
 if (j > 5)
break;

 ... do stuff ...

 j++;
  } while (j < 4);

But would fail to unroll the following as no iteration limit was
calculated because it failed to find the terminator:

  int j = 0;
  do {
 ... do stuff ...

 j++;
  } while (j < 4);

Also we would fail to unroll the following as we ended up
calculating the iteration limit as 6 rather than 4. The unroll
code then assumed we had 3 terminators rather the 2 as it
wasn't able to determine that "if (j > 5)" was redundant.

  int j = 0;
  do {
 if (j > 5)
break;

 ... do stuff ...

 if (bool(i))
break;

 j++;
  } while (j < 4);

This patch changes this pass to be more like the NIR unrolling pass.
With this change we handle loop terminators correctly and also
handle cases where the terminators have instructions in their
branches other than a break.
---
 src/compiler/glsl/loop_analysis.cpp |   7 +-
 src/compiler/glsl/loop_unroll.cpp   | 177 
 2 files changed, 141 insertions(+), 43 deletions(-)

diff --git a/src/compiler/glsl/loop_analysis.cpp 
b/src/compiler/glsl/loop_analysis.cpp
index 8a0425d185..53372183dd 100644
--- a/src/compiler/glsl/loop_analysis.cpp
+++ b/src/compiler/glsl/loop_analysis.cpp
@@ -324,22 +324,20 @@ loop_analysis::visit_leave(ir_loop *ir)
foreach_in_list(ir_instruction, node, >body_instructions) {
   /* Skip over declarations at the start of a loop.
*/
   if (node->as_variable())
 continue;
 
   ir_if *if_stmt = ((ir_instruction *) node)->as_if();
 
   if ((if_stmt != NULL) && is_loop_terminator(if_stmt))
 ls->insert(if_stmt);
-  else
-break;
}
 
 
foreach_in_list_safe(loop_variable, lv, >variables) {
   /* Move variables that are already marked as being loop constant to
* a separate list.  These trivially don't need to be tested.
*/
   if (lv->is_loop_constant()) {
 lv->remove();
 ls->constants.push_tail(lv);
@@ -644,25 +642,22 @@ get_basic_induction_increment(ir_assignment *ir, 
hash_table *var_hash)
 /**
  * Detect whether an if-statement is a loop terminating condition
  *
  * Detects if-statements of the form
  *
  *  (if (expression bool ...) (break))
  */
 bool
 is_loop_terminator(ir_if *ir)
 {
-   if (!ir->else_instructions.is_empty())
-  return false;
-
ir_instruction *const inst =
-  (ir_instruction *) ir->then_instructions.get_head();
+  (ir_instruction *) ir->then_instructions.get_tail();
if (inst == NULL)
   return false;
 
if (inst->ir_type != ir_type_loop_jump)
   return false;
 
ir_loop_jump *const jump = (ir_loop_jump *) inst;
if (jump->mode != ir_loop_jump::jump_break)
   return false;
 
diff --git a/src/compiler/glsl/loop_unroll.cpp 
b/src/compiler/glsl/loop_unroll.cpp
index 7f601295a1..9aa2b975f7 100644
--- a/src/compiler/glsl/loop_unroll.cpp
+++ b/src/compiler/glsl/loop_unroll.cpp
@@ -35,21 +35,23 @@ public:
const struct gl_shader_compiler_options *options)
{
   this->state = state;
   this->progress = false;
   this->options = options;
}
 
virtual ir_visitor_status visit_leave(ir_loop *ir);
void simple_unroll(ir_loop *ir, int iterations);
void complex_unroll(ir_loop *ir, int iterations,
-   bool continue_from_then_branch);
+   bool continue_from_then_branch,
+   bool limiting_term_first,
+   bool lt_continue_from_then_branch);
void splice_post_if_instructions(ir_if *ir_if, exec_list *splice_dest);
 
loop_state *state;
 
bool progress;
const struct gl_shader_compiler_options *options;
 };
 
 } /* anonymous namespace */
 
@@ -169,20 +171,65 @@ private:
  * (loop (...) ...instrs...)
  *
  * And the iteration count is 3, the output will be:
  *
  * ...instrs... ...instrs... ...instrs...
  */
 void
 loop_unroll_visitor::simple_unroll(ir_loop *ir, int iterations)
 {
void *const mem_ctx = ralloc_parent(ir);
+   loop_variable_state *const ls = this->state->get(ir);
+
+   ir_instruction *first_ir =
+  (ir_instruction *) ir->body_instructions.get_head();
+
+   if (!first_ir) {
+  /* The loop is empty remove it and return */
+  ir->remove();
+  return;
+   }
+
+   

[Mesa-dev] [PATCH 1/3] glsl: don't drop intructions from unreachable terminators continue branch

2017-09-13 Thread Timothy Arceri
These instruction will be executed on every iteration of the loop
we cannot drop them.
---
 src/compiler/glsl/loop_analysis.h   |  7 +++
 src/compiler/glsl/loop_controls.cpp | 15 +++
 src/compiler/glsl/loop_unroll.cpp   |  7 ---
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/loop_analysis.h 
b/src/compiler/glsl/loop_analysis.h
index 2894c6359b..0e1bfd8142 100644
--- a/src/compiler/glsl/loop_analysis.h
+++ b/src/compiler/glsl/loop_analysis.h
@@ -27,20 +27,27 @@
 
 #include "ir.h"
 #include "util/hash_table.h"
 
 /**
  * Analyze and classify all variables used in all loops in the instruction list
  */
 extern class loop_state *
 analyze_loop_variables(exec_list *instructions);
 
+static inline bool
+is_break(ir_instruction *ir)
+{
+   return ir != NULL && ir->ir_type == ir_type_loop_jump &&
+  ((ir_loop_jump *) ir)->is_break();
+}
+
 
 /**
  * Fill in loop control fields
  *
  * Based on analysis of loop variables, this function tries to remove
  * redundant sequences in the loop of the form
  *
  *  (if (expression bool ...) (break))
  *
  * For example, if it is provable that one loop exit condition will
diff --git a/src/compiler/glsl/loop_controls.cpp 
b/src/compiler/glsl/loop_controls.cpp
index 895954fc2d..2dff26aec0 100644
--- a/src/compiler/glsl/loop_controls.cpp
+++ b/src/compiler/glsl/loop_controls.cpp
@@ -215,21 +215,36 @@ loop_control_visitor::visit_leave(ir_loop *ir)
 * that are associated with a fixed iteration count, except for the one
 * associated with the limiting terminator--that one needs to stay, since
 * it terminates the loop.  Exception: if the loop still has a normative
 * bound, then that terminates the loop, so we don't even need the limiting
 * terminator.
 */
foreach_in_list(loop_terminator, t, >terminators) {
   if (t->iterations < 0)
  continue;
 
+  exec_list *branch_instructions;
   if (t != ls->limiting_terminator) {
+ ir_instruction *ir_if_last = (ir_instruction *)
+   t->ir->then_instructions.get_tail();
+ if (is_break(ir_if_last)) {
+branch_instructions = >ir->else_instructions;
+ } else {
+branch_instructions = >ir->then_instructions;
+assert(is_break(ir_if_last));
+ }
+
+ exec_list copy_list;
+ copy_list.make_empty();
+ clone_ir_list(ir, _list, branch_instructions);
+
+ t->ir->insert_before(_list);
  t->ir->remove();
 
  assert(ls->num_loop_jumps > 0);
  ls->num_loop_jumps--;
 
  this->progress = true;
   }
}
 
return visit_continue;
diff --git a/src/compiler/glsl/loop_unroll.cpp 
b/src/compiler/glsl/loop_unroll.cpp
index dbb3fa2fa5..7f601295a1 100644
--- a/src/compiler/glsl/loop_unroll.cpp
+++ b/src/compiler/glsl/loop_unroll.cpp
@@ -46,27 +46,20 @@ public:
void splice_post_if_instructions(ir_if *ir_if, exec_list *splice_dest);
 
loop_state *state;
 
bool progress;
const struct gl_shader_compiler_options *options;
 };
 
 } /* anonymous namespace */
 
-static bool
-is_break(ir_instruction *ir)
-{
-   return ir != NULL && ir->ir_type == ir_type_loop_jump
-&& ((ir_loop_jump *) ir)->is_break();
-}
-
 class loop_unroll_count : public ir_hierarchical_visitor {
 public:
int nodes;
bool unsupported_variable_indexing;
bool array_indexed_by_induction_var_with_exact_iterations;
/* If there are nested loops, the node count will be inaccurate. */
bool nested_loop;
 
loop_unroll_count(exec_list *list, loop_variable_state *ls,
  const struct gl_shader_compiler_options *options)
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] radeonsi: allow out-of-order rasterization in commutative blending cases

2017-09-13 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

Greetings,
Dieter

Am 09.09.2017 12:43, schrieb Nicolai Hähnle:

From: Nicolai Hähnle 

We do not enable this by default for additive blending, since it 
slightly

breaks OpenGL invariance guarantees due to non-determinism.

Still, there may be some applications can benefit from white-listing
via the radeonsi_commutative_blend_add drirc setting without any real
visible artifacts.
---
 src/gallium/drivers/radeonsi/driinfo_radeonsi.h |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c  |  2 +
 src/gallium/drivers/radeonsi/si_pipe.h  |  1 +
 src/gallium/drivers/radeonsi/si_state.c | 67 
+++--

 src/gallium/drivers/radeonsi/si_state.h |  1 +
 src/util/xmlpool/t_options.h|  5 ++
 6 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
index 8be85289a0c..989e5175cc0 100644
--- a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
+++ b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
@@ -1,5 +1,6 @@
 // DriConf options specific to radeonsi
 DRI_CONF_SECTION_PERFORMANCE
 DRI_CONF_RADEONSI_ENABLE_SISCHED("false")
 DRI_CONF_RADEONSI_ASSUME_NO_Z_FIGHTS("false")
+DRI_CONF_RADEONSI_COMMUTATIVE_BLEND_ADD("false")
 DRI_CONF_SECTION_END
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index b4972be739c..c44ea3be740 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -1043,20 +1043,22 @@ struct pipe_screen
*radeonsi_screen_create(struct radeon_winsys *ws,
(sscreen->b.chip_class == SI &&
 sscreen->b.info.pfp_fw_version >= 79 &&
 sscreen->b.info.me_fw_version >= 142);

sscreen->has_ds_bpermute = sscreen->b.chip_class >= VI;
sscreen->has_out_of_order_rast = sscreen->b.chip_class >= VI &&
 sscreen->b.info.max_se >= 2 &&
 !(sscreen->b.debug_flags & 
DBG_NO_OUT_OF_ORDER);
sscreen->assume_no_z_fights =
driQueryOptionb(config->options, "radeonsi_assume_no_z_fights");
+   sscreen->commutative_blend_add =
+   driQueryOptionb(config->options, 
"radeonsi_commutative_blend_add");
 	sscreen->has_msaa_sample_loc_bug = (sscreen->b.family >= 
CHIP_POLARIS10 &&

sscreen->b.family <= 
CHIP_POLARIS12) ||
   sscreen->b.family == CHIP_VEGA10 ||
   sscreen->b.family == CHIP_RAVEN;
sscreen->dpbb_allowed = sscreen->b.chip_class >= GFX9 &&
!(sscreen->b.debug_flags & DBG_NO_DPBB);
sscreen->dfsm_allowed = sscreen->dpbb_allowed &&
!(sscreen->b.debug_flags & DBG_NO_DFSM);

/* While it would be nice not to have this flag, we are constrained
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
b/src/gallium/drivers/radeonsi/si_pipe.h
index d200c9f571f..27e2bc81172 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -90,20 +90,21 @@ struct u_suballocator;
 struct si_screen {
struct r600_common_screen   b;
unsignedgs_table_depth;
unsignedtess_offchip_block_dw_size;
boolhas_clear_state;
boolhas_distributed_tess;
boolhas_draw_indirect_multi;
boolhas_ds_bpermute;
boolhas_out_of_order_rast;
boolassume_no_z_fights;
+   boolcommutative_blend_add;
boolhas_msaa_sample_loc_bug;
booldpbb_allowed;
booldfsm_allowed;
boolllvm_has_working_vgpr_indexing;

/* Whether shaders are monolithic (1-part) or separate (3-part). */
booluse_monolithic_shaders;
boolrecord_llvm_ir;

mtx_t   shader_parts_mutex;
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index a8af5752771..6a063e7f7a6 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -370,20 +370,62 @@ static uint32_t
si_translate_blend_opt_factor(int blend_fact, bool is_alpha)
case PIPE_BLENDFACTOR_INV_SRC_ALPHA:
return V_028760_BLEND_OPT_PRESERVE_A0_IGNORE_A1;
case PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE:
return is_alpha ? 

Re: [Mesa-dev] [PATCH 0/5] Resend of string buffer series

2017-09-13 Thread Dieter Nützel

Even for this series, too:

Tested-by: Dieter Nützel 

You 'lost' all of my former tb's.

Dieter

Am 11.09.2017 22:21, schrieb Thomas Helland:

I think I should have addressed all review feedback pointed out
to me by Nicolai, fixed the build issues with the tests (thanks
to Eric and Emil) and I've discovered another bug thanks to
strengthening the gtest test even more. I've also given it some
more polishing and slight modifications to make comments clearer
and more precise, and variable names better explaining their intet
in a couple of places. Details of changes in each patch. A big thanks
to those who have helped review this series =)

Thomas Helland (5):
  util: Add a string buffer implementation
  util: Add tests for the string buffer
  glsl: Change the parser to use the string buffer
  glcpp: Use string_buffer for line continuation removal
  glcpp: Avoid unnecessary call to strlen

 configure.ac   |   1 +
 src/compiler/glsl/glcpp/glcpp-lex.l|   9 +-
 src/compiler/glsl/glcpp/glcpp-parse.y  | 195 
+---

 src/compiler/glsl/glcpp/glcpp.h|   8 +-
 src/compiler/glsl/glcpp/pp.c   |  64 +++---
 src/compiler/glsl/glsl_lexer.ll|  32 ++-
 src/compiler/glsl/glsl_parser.yy   | 246 
++---

 src/util/Makefile.am   |   5 +-
 src/util/Makefile.sources  |   2 +
 src/util/string_buffer.c   | 147 
 src/util/string_buffer.h   |  99 +
 src/util/tests/string_buffer/Makefile.am   |  40 
 .../tests/string_buffer/string_buffer_test.cpp | 119 ++
 13 files changed, 665 insertions(+), 302 deletions(-)
 create mode 100644 src/util/string_buffer.c
 create mode 100644 src/util/string_buffer.h
 create mode 100644 src/util/tests/string_buffer/Makefile.am
 create mode 100644 src/util/tests/string_buffer/string_buffer_test.cpp

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97957] Awful screen tearing in a separate X server with DRI3

2017-09-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97957

--- Comment #25 from Michel Dänzer  ---
(In reply to 赵九胤 from comment #24)
> I ran the S3(the computer suspends into memory) test 500 times and my Linux
> system crashed, [...]

That doesn't sound related to this bug report at all. Please file your own
report.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] glcpp: Avoid unnecessary call to strlen

2017-09-13 Thread Ian Romanick
On 09/11/2017 01:21 PM, Thomas Helland wrote:
> @@ -621,12 +636,17 @@ u64vec4 KEYWORD_WITH_ALT(0, 0, 0, 0, 
> yyextra->ARB_gpu_shader_int64_enable, U64V
>  [_a-zA-Z][_a-zA-Z0-9]*   {
>   struct _mesa_glsl_parse_state *state = yyextra;
>   void *ctx = state->linalloc;
> - if (state->es_shader && strlen(yytext) > 1024) {
> + if (state->es_shader && yyleng + 1 > 1024) {

Also... I don't think this is right.  Shouldn't this just be 'yylen > 1024'?

>  _mesa_glsl_error(yylloc, state,
>   "Identifier `%s' exceeds 1024 
> characters",
>   yytext);
>   } else {
> -   yylval->identifier = linear_strdup(ctx, yytext);
> +   /* We're not doing linear_strdup here, to avoid 
> an implicit call
> +* on strlen() for the length of the string, as 
> this is already
> +* found by flex and stored in yyleng
> +*/
> +   yylval->identifier = (char *) 
> linear_alloc_child(ctx, yyleng + 1);
> +   memcpy(yylval->identifier, yytext, yyleng + 1);
>   }
>   return classify_identifier(state, yytext);
>   }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] glcpp: Avoid unnecessary call to strlen

2017-09-13 Thread Ian Romanick
On 09/11/2017 01:21 PM, Thomas Helland wrote:
> Length of the token was already calculated by flex and stored in yyleng,
> no need to implicitly call strlen() via linear_strdup().
> 
> Reviewed-by: Nicolai Hähnle 
> Reviewed-by: Timothy Arceri 
> 
> V2: Also convert this pattern in glsl_lexer.ll
> 
> V3: Remove a misplaced comment
> Fix compile warning from V2
> ---
>  src/compiler/glsl/glcpp/glcpp-lex.l |   9 +-
>  src/compiler/glsl/glsl_lexer.ll |  32 -
>  src/compiler/glsl/glsl_parser.yy| 246 
> ++--
>  3 files changed, 157 insertions(+), 130 deletions(-)
> 
> diff --git a/src/compiler/glsl/glcpp/glcpp-lex.l 
> b/src/compiler/glsl/glcpp/glcpp-lex.l
> index 381b97364a..9cfcc12022 100644
> --- a/src/compiler/glsl/glcpp/glcpp-lex.l
> +++ b/src/compiler/glsl/glcpp/glcpp-lex.l
> @@ -101,7 +101,14 @@ void glcpp_set_column (int  column_no , yyscan_t 
> yyscanner);
>  #define RETURN_STRING_TOKEN(token)   \
>   do {\
>   if (! parser->skipping) {   \
> - yylval->str = linear_strdup(yyextra->linalloc, yytext); 
> \
> + /* We're not doing linear_strdup here, to avoid \
> +  * an implicit call on strlen() for the length  \
> +  * of the string, as this is already found by   \
> +  * flex and stored in yyleng */ \
> + void *mem_ctx = yyextra->linalloc;  \
> + yylval->str = linear_alloc_child(mem_ctx,   \
> +  yyleng + 1);   \
> + memcpy(yylval->str, yytext, yyleng + 1);\
>   RETURN_TOKEN_NEVER_SKIP (token);\
>   }   \
>   } while(0)
> diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler/glsl/glsl_lexer.ll
> index 7c41455d98..3a67f0ea40 100644
> --- a/src/compiler/glsl/glsl_lexer.ll
> +++ b/src/compiler/glsl/glsl_lexer.ll
> @@ -81,8 +81,13 @@ static int classify_identifier(struct 
> _mesa_glsl_parse_state *, const char *);
> "illegal use of reserved word `%s'", yytext); \
>return ERROR_TOK;  \
>} else {   
> \
> -  void *mem_ctx = yyextra->linalloc; 
> \
> -  yylval->identifier = linear_strdup(mem_ctx, yytext);   \
> +  /* We're not doing linear_strdup here, to avoid an implicit\
> +   * call on strlen() for the length of the string, as this is   \
> +   * already found by flex and stored in yyleng */   \
> +  void *mem_ctx = yyextra->linalloc; \
> +  yylval->identifier = (char *) linear_alloc_child(mem_ctx,  \
> +   yyleng + 1);  \
> +  memcpy(yylval->identifier, yytext, yyleng + 1);\

Could this (here and below) be implemented as:

char *id = (char *) linear_alloc_child(mem_ctx, yyleng + 1);
memcpy(id, yytext, yyleng + 1);
yylval->identifier = id;

Then the type of identifier doesn't have to change, and the last hunk is
unnecessary.

>return classify_identifier(yyextra, yytext);   \
>}  
> \
> } while (0)
> @@ -261,8 +266,13 @@ HASH ^{SPC}#{SPC}
>  [ \t\r]* { }
>  :return COLON;
>  [_a-zA-Z][_a-zA-Z0-9]*   {
> +/* We're not doing linear_strdup here, to 
> avoid an implicit call
> + * on strlen() for the length of the string, 
> as this is already
> + * found by flex and stored in yyleng
> + */
>  void *mem_ctx = yyextra->linalloc;
> -yylval->identifier = linear_strdup(mem_ctx, 
> yytext);
> +yylval->identifier = (char *) 
> linear_alloc_child(mem_ctx, yyleng + 1);
> +memcpy(yylval->identifier, yytext, yyleng + 
> 1);
>  return IDENTIFIER;
>   }
>  [1-9][0-9]*  {
> @@ -449,8 +459,13 @@ layout   {
>|| yyextra->ARB_tessellation_shader_enable) {
> return LAYOUT_TOK;
>  } else {
> +   /* We're not doing linear_strdup here, to avoid an 
> implicit call
> +* on strlen() for the length of the string, as this is 
> already
> + 

Re: [Mesa-dev] [PATCH 12/27] i965: add support for cached shaders with xfb qualifiers

2017-09-13 Thread Timothy Arceri



On 14/09/17 10:04, Matt Turner wrote:

On 08/19, Jordan Justen wrote:

From: Timothy Arceri 

For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.
---


If I understand correctly, at this point we'll never cache a shader that
has transform feedback enabled by the API, and with that knowledge this
patch adds code to skip looking on disk for one.

I think I'd just let it search in vain and avoid the complexity. I can
imagine implementing support for caching those shaders and then getting
pretty frustrated trying to figure out why they weren't being found
later.


I believe the problem is it *can* find something. e.g If the shader is 
used without xfb, then xfb is enable and the program is relinked.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE

2017-09-13 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Sep 13, 2017 at 6:53 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> To be able to properly distinguish between GL_ANY_SAMPLES_PASSED
> and GL_ANY_SAMPLES_PASSED_CONSERVATIVE.
>
> This patch goes through all drivers, having them treat the two
> query types identically, except:
>
> 1. radeon incorrectly enabled conservative mode on
>PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only
>on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE.
> 2. st/mesa uses the new query type.
>
> Fixes dEQP-GLES31.functional.fbo.no_attachments.*
> ---
>  src/gallium/auxiliary/util/u_dump_defines.c  |  1 +
>  src/gallium/auxiliary/util/u_inlines.h   |  1 +
>  src/gallium/docs/source/context.rst  |  6 ++
>  src/gallium/drivers/freedreno/a3xx/fd3_query.c   |  8 
>  src/gallium/drivers/freedreno/a4xx/fd4_query.c   |  8 
>  src/gallium/drivers/freedreno/a5xx/fd5_query.c   | 10 ++
>  src/gallium/drivers/freedreno/freedreno_query.h  |  1 +
>  src/gallium/drivers/llvmpipe/lp_query.c  |  3 +++
>  src/gallium/drivers/llvmpipe/lp_rast.c   |  2 ++
>  src/gallium/drivers/llvmpipe/lp_setup.c  |  3 +++
>  src/gallium/drivers/nouveau/nv30/nv30_query.c|  4 +++-
>  src/gallium/drivers/nouveau/nv50/nv50_query.c|  1 +
>  src/gallium/drivers/nouveau/nv50/nv50_query_hw.c |  4 
>  src/gallium/drivers/nouveau/nvc0/nvc0_query.c|  1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c |  7 ++-
>  src/gallium/drivers/r300/r300_query.c|  7 +--
>  src/gallium/drivers/radeon/r600_pipe_common.c|  3 ++-
>  src/gallium/drivers/radeon/r600_query.c  | 19 ++-
>  src/gallium/drivers/softpipe/sp_query.c  |  4 
>  src/gallium/drivers/svga/svga_pipe_query.c   |  7 ++-
>  src/gallium/drivers/swr/swr_query.cpp|  1 +
>  src/gallium/drivers/trace/tr_dump_state.c|  1 +
>  src/gallium/include/pipe/p_defines.h |  2 ++
>  src/mesa/state_tracker/st_cb_queryobj.c  |  5 -
>  24 files changed, 97 insertions(+), 12 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_dump_defines.c 
> b/src/gallium/auxiliary/util/u_dump_defines.c
> index 5032974a880..e87e5301600 100644
> --- a/src/gallium/auxiliary/util/u_dump_defines.c
> +++ b/src/gallium/auxiliary/util/u_dump_defines.c
> @@ -358,20 +358,21 @@ util_tex_filter_short_names[] = {
> "linear"
>  };
>
>  DEFINE_UTIL_STR_CONTINUOUS(tex_filter)
>
>
>  static const char *
>  util_query_type_names[] = {
> "PIPE_QUERY_OCCLUSION_COUNTER",
> "PIPE_QUERY_OCCLUSION_PREDICATE",
> +   "PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE",
> "PIPE_QUERY_TIMESTAMP",
> "PIPE_QUERY_TIMESTAMP_DISJOINT",
> "PIPE_QUERY_TIME_ELAPSED",
> "PIPE_QUERY_PRIMITIVES_GENERATED",
> "PIPE_QUERY_PRIMITIVES_EMITTED",
> "PIPE_QUERY_SO_STATISTICS",
> "PIPE_QUERY_SO_OVERFLOW_PREDICATE",
> "PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE",
> "PIPE_QUERY_GPU_FINISHED",
> "PIPE_QUERY_PIPELINE_STATISTICS",
> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
> b/src/gallium/auxiliary/util/u_inlines.h
> index e0ed594c9fe..79f62c32266 100644
> --- a/src/gallium/auxiliary/util/u_inlines.h
> +++ b/src/gallium/auxiliary/util/u_inlines.h
> @@ -529,20 +529,21 @@ util_get_min_point_size(const struct 
> pipe_rasterizer_state *state)
> return !state->point_quad_rasterization &&
>!state->point_smooth &&
>!state->multisample ? 1.0f : 0.0f;
>  }
>
>  static inline void
>  util_query_clear_result(union pipe_query_result *result, unsigned type)
>  {
> switch (type) {
> case PIPE_QUERY_OCCLUSION_PREDICATE:
> +   case PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE:
> case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
> case PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE:
> case PIPE_QUERY_GPU_FINISHED:
>result->b = FALSE;
>break;
> case PIPE_QUERY_OCCLUSION_COUNTER:
> case PIPE_QUERY_TIMESTAMP:
> case PIPE_QUERY_TIME_ELAPSED:
> case PIPE_QUERY_PRIMITIVES_GENERATED:
> case PIPE_QUERY_PRIMITIVES_EMITTED:
> diff --git a/src/gallium/docs/source/context.rst 
> b/src/gallium/docs/source/context.rst
> index 6ac45819a66..ba7fef8301d 100644
> --- a/src/gallium/docs/source/context.rst
> +++ b/src/gallium/docs/source/context.rst
> @@ -387,20 +387,26 @@ are written to the framebuffer without being culled by
>  The result is an unsigned 64-bit integer.
>  This query can be used with ``render_condition``.
>
>  In cases where a boolean result of an occlusion query is enough,
>  ``PIPE_QUERY_OCCLUSION_PREDICATE`` should be used. It is just like
>  ``PIPE_QUERY_OCCLUSION_COUNTER`` except that the result is a boolean
>  value of FALSE for cases where COUNTER would result in 0 and TRUE
>  for all other cases.
>  This query can be used with ``render_condition``.
>
> +In cases where 

Re: [Mesa-dev] [PATCH v2 03/15] i965: Use a WC map and memcpy for the batch instead of pwrite.

2017-09-13 Thread Kenneth Graunke
On Wednesday, September 13, 2017 3:51:03 PM PDT Matt Turner wrote:
> On 09/13, Kenneth Graunke wrote:
> >We'd like to eliminate the malloc'd shadow copy eventually, but there
> >are still unresolved performance problems.  In the meantime, let's at
> >least get rid of pwrite.
> 
> I don't know much about this. What is wrong with pwrite, and why is WC
> better?

Mainly, I just wanted to get rid of pwrite.  Chris and I talked about
getting rid of our remaining pwrite uses a while ago - I just hadn't
done it yet.  Using pwrite is a bit scary these days.  It can write
through the CPU caches, and relies on the kernel knowing about domains
and when it's safe to bypass domain changes and the associated flushing
and waiting.

These days, we no longer inform the kernel about domain changes, and
track dirty data in the CPU caches ourselves.  We also never use the
CPU caches for writing - only for read-only maps.  So, using an ioctl
that trusts in the kernel to figure out how to do this efficiently
doesn't mix with our "do everything ourselves in Mesa" strategy.

WC maps should work just as well, anyway, as we're memcpying large
sections of data in sequential order.

> Does this change performance?

Apparently it does - on Apollolake, OglBatch6 improves by 1.53581% +/-
0.269589% (n=108).  That's a bit surprising, but I'll take it...

--Ken

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/27] i965: add initial implementation of on disk shader cache

2017-09-13 Thread Timothy Arceri



On 14/09/17 09:53, Matt Turner wrote:

On 08/19, Jordan Justen wrote:

From: Timothy Arceri 

This uses the recently-added disk_cache.c to write out the final
linked binary for vertex and fragment shader programs.

This is based off the initial implementation done by Carl Worth.

[jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
[jordan.l.jus...@intel.com: brw_shader_cache.c => brw_disk_cache.c]
[jordan.l.jus...@intel.com: don't map to write program when LLC is 
present]
[jordan.l.jus...@intel.com: set program_written_to_cache on read from 
cache]
[jordan.l.jus...@intel.com: only try cache when status is 
linking_skipped]

Signed-off-by: Jordan Justen 
---
src/mesa/drivers/dri/i965/Makefile.sources |   1 +
src/mesa/drivers/dri/i965/brw_disk_cache.c | 395 
+

src/mesa/drivers/dri/i965/brw_state.h  |   5 +
3 files changed, 401 insertions(+)
create mode 100644 src/mesa/drivers/dri/i965/brw_disk_cache.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources

index 425c883de8..6e21010bae 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -14,6 +14,7 @@ i965_FILES = \
brw_cs.h \
brw_curbe.c \
brw_defines.h \
+    brw_disk_cache.cpp \
brw_draw.c \
brw_draw.h \
brw_draw_upload.c \
diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c

new file mode 100644
index 00..b56e561e14
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -0,0 +1,395 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person 
obtaining a
+ * copy of this software and associated documentation files (the 
"Software"),
+ * to deal in the Software without restriction, including without 
limitation
+ * the rights to use, copy, modify, merge, publish, distribute, 
sublicense,

+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including 
the next
+ * paragraph) shall be included in all copies or substantial portions 
of the

+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES 
OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
OTHER DEALINGS

+ * IN THE SOFTWARE.
+ */
+
+#include "compiler/glsl/blob.h"
+#include "compiler/glsl/ir_uniform.h"
+#include "compiler/glsl/shader_cache.h"
+#include "main/mtypes.h"
+#include "util/disk_cache.h"
+#include "util/macros.h"
+#include "util/mesa-sha1.h"
+
+#include "brw_context.h"
+#include "brw_state.h"
+#include "brw_vs.h"
+#include "brw_wm.h"
+
+static uint64_t
+ptr_to_uint64_t(void *ptr)
+{
+   uint64_t ptr_int = (uint64_t) ptr;
+#if __i386__
+   ptr_int &= 0x;
+#endif
+   return ptr_int;
+}


This function can be as simple as

static uint64_t
ptr_to_uint64_t(void *ptr)
{
   return (uintptr_t) ptr;
}

But I have a question. We don't expect to be able to load a shader with
a 64-bit Mesa that was generated on a 32-bit Mesa (or vice versa)
because we're using the build id to determine compatibility.

Do we need to write out all pointers as 64-bits, or should we really
just be writing out uintptr_t? Maybe having the binary format the same
keeps something simple?


People insisted this be stored as 64-bit even though I argued that it 
was a bad idea to share cache items between the Mesa builds, that simple 
fact that we will never test this path in our automated tests is as good 
a reason as any not to do it. Currently the disk_cache util uses ptr 
size as part of the drivers key so we should never share cache items 
between builds currently.  I changed it to always store as 64bit here 
anyway in order to move the conversation along and try to get the series 
reviewed. I think at this point we could drop it.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH 5/5] amd/common: add workaround for cube map array layer clamping

2017-09-13 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Wed, Sep 13, 2017 at 7:04 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Fixes dEQP-GLES31.functional.texture.filtering.cube_array.*
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/amd/common/ac_llvm_build.c | 31 +--
>  1 file changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 6c010e8c3a6..8a329515b57 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -491,22 +491,49 @@ ac_prepare_cube_coords(struct ac_llvm_context *ctx,
>LLVMValueRef *coords_arg,
>LLVMValueRef *derivs_arg)
>  {
>
> LLVMBuilderRef builder = ctx->builder;
> struct cube_selection_coords selcoords;
> LLVMValueRef coords[3];
> LLVMValueRef invma;
>
> if (is_array && !is_lod) {
> -   coords_arg[3] = ac_build_intrinsic(ctx, "llvm.rint.f32", 
> ctx->f32,
> -  _arg[3], 1, 0);
> +   LLVMValueRef tmp = coords_arg[3];
> +   tmp = ac_build_intrinsic(ctx, "llvm.rint.f32", ctx->f32, 
> , 1, 0);
> +
> +   /* Section 8.9 (Texture Functions) of the GLSL 4.50 spec says:
> +*
> +*"For Array forms, the array layer used will be
> +*
> +*   max(0, min(d−1, floor(layer+0.5)))
> +*
> +* where d is the depth of the texture array and layer
> +* comes from the component indicated in the tables below.
> +* Workaroudn for an issue where the layer is taken from a
> +* helper invocation which happens to fall on a different
> +* layer due to extrapolation."
> +*
> +* VI and earlier attempt to implement this in hardware by
> +* clamping the value of coords[2] = (8 * layer) + face.
> +* Unfortunately, this means that the we end up with the wrong
> +* face when clamping occurs.
> +*
> +* Clamp the layer earlier to work around the issue.
> +*/
> +   if (ctx->chip_class <= VI) {
> +   LLVMValueRef ge0;
> +   ge0 = LLVMBuildFCmp(builder, LLVMRealOGE, tmp, 
> ctx->f32_0, "");
> +   tmp = LLVMBuildSelect(builder, ge0, tmp, ctx->f32_0, 
> "");
> +   }
> +
> +   coords_arg[3] = tmp;
> }
>
> build_cube_intrinsic(ctx, coords_arg, );
>
> invma = ac_build_intrinsic(ctx, "llvm.fabs.f32",
> ctx->f32, , 1, AC_FUNC_ATTR_READNONE);
> invma = ac_build_fdiv(ctx, LLVMConstReal(ctx->f32, 1.0), invma);
>
> for (int i = 0; i < 2; ++i)
> coords[i] = LLVMBuildFMul(builder, selcoords.stc[i], invma, 
> "");
> --
> 2.11.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB

2017-09-13 Thread Matt Turner

On 08/17, Jeremy Huddleston Sequoia wrote:

Signed-off-by: Jeremy Huddleston Sequoia 
CC: Nicolai Hähnle 
CC: Matt Turner 
CC: Ian Romanick 
CC: Brian Paul 


This looks fine to me.

Reviewed-by: Matt Turner 


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] radeonsi: allow out-of-order rasterization in commutative blending cases

2017-09-13 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Sat, Sep 9, 2017 at 12:43 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> We do not enable this by default for additive blending, since it slightly
> breaks OpenGL invariance guarantees due to non-determinism.
>
> Still, there may be some applications can benefit from white-listing
> via the radeonsi_commutative_blend_add drirc setting without any real
> visible artifacts.
> ---
>  src/gallium/drivers/radeonsi/driinfo_radeonsi.h |  1 +
>  src/gallium/drivers/radeonsi/si_pipe.c  |  2 +
>  src/gallium/drivers/radeonsi/si_pipe.h  |  1 +
>  src/gallium/drivers/radeonsi/si_state.c | 67 
> +++--
>  src/gallium/drivers/radeonsi/si_state.h |  1 +
>  src/util/xmlpool/t_options.h|  5 ++
>  6 files changed, 73 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h 
> b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
> index 8be85289a0c..989e5175cc0 100644
> --- a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
> +++ b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
> @@ -1,5 +1,6 @@
>  // DriConf options specific to radeonsi
>  DRI_CONF_SECTION_PERFORMANCE
>  DRI_CONF_RADEONSI_ENABLE_SISCHED("false")
>  DRI_CONF_RADEONSI_ASSUME_NO_Z_FIGHTS("false")
> +DRI_CONF_RADEONSI_COMMUTATIVE_BLEND_ADD("false")
>  DRI_CONF_SECTION_END
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index b4972be739c..c44ea3be740 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -1043,20 +1043,22 @@ struct pipe_screen *radeonsi_screen_create(struct 
> radeon_winsys *ws,
> (sscreen->b.chip_class == SI &&
>  sscreen->b.info.pfp_fw_version >= 79 &&
>  sscreen->b.info.me_fw_version >= 142);
>
> sscreen->has_ds_bpermute = sscreen->b.chip_class >= VI;
> sscreen->has_out_of_order_rast = sscreen->b.chip_class >= VI &&
>  sscreen->b.info.max_se >= 2 &&
>  !(sscreen->b.debug_flags & 
> DBG_NO_OUT_OF_ORDER);
> sscreen->assume_no_z_fights =
> driQueryOptionb(config->options, 
> "radeonsi_assume_no_z_fights");
> +   sscreen->commutative_blend_add =
> +   driQueryOptionb(config->options, 
> "radeonsi_commutative_blend_add");
> sscreen->has_msaa_sample_loc_bug = (sscreen->b.family >= 
> CHIP_POLARIS10 &&
> sscreen->b.family <= 
> CHIP_POLARIS12) ||
>sscreen->b.family == CHIP_VEGA10 ||
>sscreen->b.family == CHIP_RAVEN;
> sscreen->dpbb_allowed = sscreen->b.chip_class >= GFX9 &&
> !(sscreen->b.debug_flags & DBG_NO_DPBB);
> sscreen->dfsm_allowed = sscreen->dpbb_allowed &&
> !(sscreen->b.debug_flags & DBG_NO_DFSM);
>
> /* While it would be nice not to have this flag, we are constrained
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
> b/src/gallium/drivers/radeonsi/si_pipe.h
> index d200c9f571f..27e2bc81172 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.h
> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> @@ -90,20 +90,21 @@ struct u_suballocator;
>  struct si_screen {
> struct r600_common_screen   b;
> unsignedgs_table_depth;
> unsignedtess_offchip_block_dw_size;
> boolhas_clear_state;
> boolhas_distributed_tess;
> boolhas_draw_indirect_multi;
> boolhas_ds_bpermute;
> boolhas_out_of_order_rast;
> boolassume_no_z_fights;
> +   boolcommutative_blend_add;
> boolhas_msaa_sample_loc_bug;
> booldpbb_allowed;
> booldfsm_allowed;
> boolllvm_has_working_vgpr_indexing;
>
> /* Whether shaders are monolithic (1-part) or separate (3-part). */
> booluse_monolithic_shaders;
> boolrecord_llvm_ir;
>
> mtx_t   shader_parts_mutex;
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index a8af5752771..6a063e7f7a6 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -370,20 +370,62 @@ static uint32_t si_translate_blend_opt_factor(int 
> blend_fact, bool 

Re: [Mesa-dev] [PATCH 26/27] i965: Initialize disk shader cache when INTEL_SHADER_CACHE env var is set

2017-09-13 Thread Matt Turner

On 08/19, Jordan Justen wrote:

We use the build-id of i965_dri.so for the timestamp, and the name
from i965_pci_ids.h for the device name.

Signed-off-by: Jordan Justen 
---
src/mesa/drivers/dri/i965/brw_context.c|  2 ++
src/mesa/drivers/dri/i965/brw_disk_cache.c | 45 ++
src/mesa/drivers/dri/i965/brw_state.h  |  1 +
3 files changed, 48 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 3904bc708b..5d7d555f79 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1014,6 +1014,8 @@ brwCreateContext(gl_api api,
 brw->dri_config_options_sha1);
   brw->ctx.Const.dri_config_options_sha1 = brw->dri_config_options_sha1;

+   brw_disk_cache_init(brw);
+
   return true;
}

diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c
index 036b56225e..ef5380126a 100644
--- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -29,6 +29,8 @@
#include "main/mtypes.h"
#include "main/shaderobj.h"
#include "program/program.h"
+#include "util/build_id.h"
+#include "util/debug.h"
#include "util/disk_cache.h"
#include "util/macros.h"
#include "util/mesa-sha1.h"
@@ -652,3 +654,46 @@ brw_disk_cache_write_compute_program(struct brw_context 
*brw)
 MESA_SHADER_COMPUTE);
   }
}
+
+void
+brw_disk_cache_init(struct brw_context *brw)
+{
+#ifdef ENABLE_SHADER_CACHE
+   if (!env_var_as_boolean("INTEL_SHADER_CACHE", false))
+  return;
+
+   char *renderer = NULL;
+   int len = asprintf(, "i965_%04x", brw->screen->deviceID);
+   if (len < 0) {
+  renderer = strdup("i965");
+   }
+   if (renderer == NULL)
+   return;
+
+   const struct build_id_note *note = build_id_find_nhdr("i965_dri.so");


This will require a small rebase onto Chad's commit 5c98d3825ccbe

Also, we should just force the build-id to be present rather than having
code to handle the case when it isn't.

Just add -Wl,--build-id=sha1 to i965_dri.so's LDFLAGS. See
src/intel/Makefile.vulkan.am for example.

With that, this code becomes a lot simpler.


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/27] i965: add support for cached shaders with xfb qualifiers

2017-09-13 Thread Matt Turner

On 08/19, Jordan Justen wrote:

From: Timothy Arceri 

For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.
---


If I understand correctly, at this point we'll never cache a shader that
has transform feedback enabled by the API, and with that knowledge this
patch adds code to skip looking on disk for one.

I think I'd just let it search in vain and avoid the complexity. I can
imagine implementing support for caching those shaders and then getting
pretty frustrated trying to figure out why they weren't being found
later.


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/27] i965: Add shader cache support for vertex and fragment stages

2017-09-13 Thread Matt Turner

On 08/19, Jordan Justen wrote:

From: Timothy Arceri 

This enables the cache on vertex and fragment shaders only.

[jordan.l.jus...@intel.com: reword subject]
[jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
Signed-off-by: Jordan Justen 
---
src/mesa/drivers/dri/i965/brw_state_upload.c |  2 ++
src/mesa/drivers/dri/i965/brw_vs.c   | 21 ++---
src/mesa/drivers/dri/i965/brw_wm.c   | 24 +++-
3 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 1ae45ba2ac..140146970b 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -434,6 +434,8 @@ brw_upload_programs(struct brw_context *brw,
 brw_upload_clip_prog(brw);
 brw_upload_sf_prog(brw);
  }
+
+  brw_disk_cache_write_program(brw);
   } else if (pipeline == BRW_COMPUTE_PIPELINE) {
  brw_upload_cs_prog(brw);
   }
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 055ea9d742..7190fc4a65 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -364,13 +364,20 @@ brw_upload_vs_prog(struct brw_context *brw)

   brw_vs_populate_key(brw, );

-   if (!brw_search_cache(>cache, BRW_CACHE_VS_PROG,
-, sizeof(key),
->vs.base.prog_offset, >vs.base.prog_data)) {
-  bool success = brw_codegen_vs_prog(brw, vp, );
-  (void) success;
-  assert(success);


Please just mark success as UNUSED (or MAYBE_UNUSED if you like typing
or find UNUSED too confusing). Same thing below and in the next patch.


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/27] i965: add initial implementation of on disk shader cache

2017-09-13 Thread Matt Turner

On 08/19, Jordan Justen wrote:

From: Timothy Arceri 

This uses the recently-added disk_cache.c to write out the final
linked binary for vertex and fragment shader programs.

This is based off the initial implementation done by Carl Worth.

[jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
[jordan.l.jus...@intel.com: brw_shader_cache.c => brw_disk_cache.c]
[jordan.l.jus...@intel.com: don't map to write program when LLC is present]
[jordan.l.jus...@intel.com: set program_written_to_cache on read from cache]
[jordan.l.jus...@intel.com: only try cache when status is linking_skipped]
Signed-off-by: Jordan Justen 
---
src/mesa/drivers/dri/i965/Makefile.sources |   1 +
src/mesa/drivers/dri/i965/brw_disk_cache.c | 395 +
src/mesa/drivers/dri/i965/brw_state.h  |   5 +
3 files changed, 401 insertions(+)
create mode 100644 src/mesa/drivers/dri/i965/brw_disk_cache.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 425c883de8..6e21010bae 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -14,6 +14,7 @@ i965_FILES = \
brw_cs.h \
brw_curbe.c \
brw_defines.h \
+   brw_disk_cache.cpp \
brw_draw.c \
brw_draw.h \
brw_draw_upload.c \
diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c
new file mode 100644
index 00..b56e561e14
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -0,0 +1,395 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "compiler/glsl/blob.h"
+#include "compiler/glsl/ir_uniform.h"
+#include "compiler/glsl/shader_cache.h"
+#include "main/mtypes.h"
+#include "util/disk_cache.h"
+#include "util/macros.h"
+#include "util/mesa-sha1.h"
+
+#include "brw_context.h"
+#include "brw_state.h"
+#include "brw_vs.h"
+#include "brw_wm.h"
+
+static uint64_t
+ptr_to_uint64_t(void *ptr)
+{
+   uint64_t ptr_int = (uint64_t) ptr;
+#if __i386__
+   ptr_int &= 0x;
+#endif
+   return ptr_int;
+}


This function can be as simple as

static uint64_t
ptr_to_uint64_t(void *ptr)
{
  return (uintptr_t) ptr;
}

But I have a question. We don't expect to be able to load a shader with
a 64-bit Mesa that was generated on a 32-bit Mesa (or vice versa)
because we're using the build id to determine compatibility.

Do we need to write out all pointers as 64-bits, or should we really
just be writing out uintptr_t? Maybe having the binary format the same
keeps something simple?


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Record NOS dependencies for shader programs and only check those.

2017-09-13 Thread Matt Turner

On 08/22, Kenneth Graunke wrote:

Previously, the state upload code listened to a broad set of dirty flags
that corresponded to all possible state dependencies for a shader stage.
This is somewhat overkill.  For example, if a shader has no textures,
there is no need to listen to _NEW_TEXTURE.  Although these extra
dependencies are harmless for correctness, they cause us to recompute
the program keys and search the program cache more often than necessary.

This patch introduces a new brw_program::nos field, containing the dirty
flags which cover a shader's non-orthogonal state dependencies.  We look
at what the shader actually requires, and record that at link time (or
fixed function program generation time).  Then, the state upload code
simply checks the current dirty flags against those.

This should reduce CPU overhead when drawing with the same shaders
multiple times, but changing state or resources (such as binding new
textures or changing uniforms).

Improves performance in GFXBench4's gl_driver2_off on Apollolake at
1280x720 by 3.06834% +/- 0.722141% (n=25).


That all makes sense and seems like an obviously good idea.

Reviewed-by: Matt Turner 


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 01/15] i965: Delete a batch size assertion that isn't very useful.

2017-09-13 Thread Matt Turner

The series looks good to me. I had a question on 03/15, mostly for
clarification in the commit message. The series is

Reviewed-by: Matt Turner 


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/glsl_to_tgsi: fix theoretical memory leak

2017-09-13 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

On 14/09/17 03:05, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

It can't *really* happen since we don't use subroutines.

CID: 1417491
---
  src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 374393bb86e..76c198e165b 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -610,20 +610,21 @@ static void dump_instruction(int line, prog_scope *scope,
   */
  bool
  get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
int ntemps, struct lifetime *lifetimes)
  {
 int line = 0;
 int loop_id = 0;
 int if_id = 0;
 int switch_id = 0;
 bool is_at_end = false;
+   bool ok = true;
 int n_scopes = 1;
  
 /* Count scopes to allocate the needed space without the need for

  * re-allocation
  */
 foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
if (inst->op == TGSI_OPCODE_BGNLOOP ||
inst->op == TGSI_OPCODE_SWITCH ||
inst->op == TGSI_OPCODE_CASE ||
inst->op == TGSI_OPCODE_IF ||
@@ -751,21 +752,22 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
   break;
}
case TGSI_OPCODE_CAL:
case TGSI_OPCODE_RET:
   /* These opcodes are not supported and if a subroutine would
* be called in a shader, then the lifetime tracking would have
* to follow that call to see which registers are used there.
* Since this is not done, we have to bail out here and signal
* that no register merge will take place.
*/
- return false;
+ ok = false;
+ goto out;
default: {
   for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
  const st_src_reg& src = inst->src[j];
  if (src.file == PROGRAM_TEMPORARY)
 acc[src.index].record_read(line, cur_scope, src.swizzle);
   }
   for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
  const st_src_reg& src = inst->tex_offsets[j];
  if (src.file == PROGRAM_TEMPORARY)
 acc[src.index].record_read(line, cur_scope, src.swizzle);
@@ -790,22 +792,23 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
  
 RENAME_DEBUG(cerr << "= lifetimes ==\n");

 for(int i = 0; i < ntemps; ++i) {
RENAME_DEBUG(cerr << setw(4) << i);
lifetimes[i] = acc[i].get_required_lifetime();
RENAME_DEBUG(cerr << ": [" << lifetimes[i].begin << ", "
  << lifetimes[i].end << "]\n");
 }
 RENAME_DEBUG(cerr << "==\n\n");
  
+out:

 delete[] acc;
-   return true;
+   return ok;
  }
  
  /* Find the next register between [start, end) that has a life time starting

   * at or after bound by using a binary search.
   * start points at the beginning of the search range,
   * end points at the element past the end of the search range, and
   * the array comprising [start, end) must be sorted in ascending order.
   */
  static access_record*
  find_next_rename(access_record* start, access_record* end, int bound)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 15/20] anv/image: Refactor creation of aux surfaces (v2)

2017-09-13 Thread Chad Versace
From: Chad Versace 

Creation of hiz, ccs, and mcs surfaces was encoded in a large, deep 'if'
tree at the tail of make_surface(). This patch extracts that 'if' tree
into one function per aux type:

try_make_hiz_surface()
try_make_ccs_surface()
try_make_mcs_surface()

For clarity, also rename make_surface() to make_main_surface().

No behavioral change is intended.

v2: Rebase onto Tapani's VK_EXT_debug_report changes.
---
 src/intel/vulkan/anv_image.c | 230 ++-
 1 file changed, 142 insertions(+), 88 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 03cecb1f08a..ba09cc4c585 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -33,6 +33,10 @@
 
 #include "vk_format_info.h"
 
+static void
+add_fast_clear_state_buffer(struct anv_image *image,
+const struct anv_device *device);
+
 /**
  * Exactly one bit must be set in \a aspect.
  */
@@ -131,6 +135,133 @@ add_surface(struct anv_image *image, struct anv_surface 
*surf)
image->alignment = MAX2(image->alignment, surf->isl.alignment);
 }
 
+static void
+try_make_hiz_surface(const struct anv_device *dev,
+ const VkImageCreateInfo *base_info,
+ struct anv_image *image)
+{
+   bool ok UNUSED;
+
+   assert(image->aux_surface.isl.size == 0);
+
+   /* We don't advertise that depth buffers could be used as storage
+* images.
+*/
+   assert(!(image->usage & VK_IMAGE_USAGE_STORAGE_BIT));
+
+   /* Allow the user to control HiZ enabling through environment variables.
+* Disable by default on gen7 because resolves are not currently
+* implemented pre-BDW.
+*/
+   if (!(image->usage & VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT)) {
+  /* It will never be used as an attachment, HiZ is pointless. */
+   } else if (dev->info.gen == 7) {
+  anv_perf_warn(dev->instance, image, "Implement gen7 HiZ");
+   } else if (base_info->mipLevels > 1) {
+  anv_perf_warn(dev->instance, image, "Enable multi-LOD HiZ");
+   } else if (base_info->arrayLayers > 1) {
+  anv_perf_warn(dev->instance, image, "Implement multi-arrayLayer HiZ 
clears and resolves");
+   } else if (dev->info.gen == 8 && base_info->samples > 1) {
+  anv_perf_warn(dev->instance, image, "Enable gen8 multisampled HiZ");
+   } else if (!unlikely(INTEL_DEBUG & DEBUG_NO_HIZ)) {
+  ok = isl_surf_get_hiz_surf(>isl_dev, >depth_surface.isl,
+ >aux_surface.isl);
+  assert(ok);
+  add_surface(image, >aux_surface);
+  image->aux_usage = ISL_AUX_USAGE_HIZ;
+   }
+}
+
+static void
+try_make_ccs_surface(const struct anv_device *dev,
+ const VkImageCreateInfo *base_info,
+ struct anv_image *image)
+{
+   bool ok;
+
+   assert(image->aux_surface.isl.size == 0);
+
+   if (unlikely(INTEL_DEBUG & DEBUG_NO_RBC))
+  return;
+
+   ok = isl_surf_get_ccs_surf(>isl_dev, >color_surface.isl,
+  >aux_surface.isl, 0);
+   if (!ok)
+  return;
+
+   /* Disable CCS when it is not useful (i.e., when you can't render
+* to the image with CCS enabled).
+*/
+   if (!isl_format_supports_rendering(>info,
+  image->color_surface.isl.format)) {
+  /* While it may be technically possible to enable CCS for this
+   * image, we currently don't have things hooked up to get it
+   * working.
+   */
+  anv_perf_warn(dev->instance, image,
+"This image format doesn't support rendering. "
+"Not allocating an CCS buffer.");
+  image->aux_surface.isl.size = 0;
+  return;
+   }
+
+   add_surface(image, >aux_surface);
+   add_fast_clear_state_buffer(image, dev);
+
+   /* For images created without MUTABLE_FORMAT_BIT set, we know that
+* they will always be used with the original format.  In
+* particular, they will always be used with a format that
+* supports color compression.  If it's never used as a storage
+* image, then it will only be used through the sampler or the as
+* a render target.  This means that it's safe to just leave
+* compression on at all times for these formats.
+*/
+   if (!(base_info->usage & VK_IMAGE_USAGE_STORAGE_BIT) &&
+   !(base_info->flags & VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT) &&
+   isl_format_supports_ccs_e(>info, image->color_surface.isl.format)) 
{
+  image->aux_usage = ISL_AUX_USAGE_CCS_E;
+   }
+}
+
+static void
+try_make_mcs_surface(const struct anv_device *dev,
+ const VkImageCreateInfo *base_info,
+ struct anv_image *image)
+{
+   bool ok;
+
+   assert(image->aux_surface.isl.size == 0);
+   assert(!(base_info->usage & VK_IMAGE_USAGE_STORAGE_BIT));
+
+   ok = isl_surf_get_mcs_surf(>isl_dev, >color_surface.isl,
+  >aux_surface.isl);
+   if (!ok)
+  

[Mesa-dev] [PATCH 20/20] anv: Implement VK_ANDROID_native_buffer (v2)

2017-09-13 Thread Chad Versace
From: Chad Versace 

This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

  Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
  fd into a VkSemaphore or VkFence, the driver instead simply blocks on
  the sync fd, then puts the VkSemaphore or VkFence into the signalled
  state. Thanks to implicit sync, this produces correct behavior (with
  extra latency overhead, perhaps) despite its ugliness.

  Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
  a collection of wait semaphores as a sync fd, the driver instead
  submits the semaphores to the queue, then returns sync fd -1, which
  informs the caller that no additional synchronization is needed.
  Again, thanks to implicit sync, this produces correct behavior (with
  extra batch submission overhead) despite its ugliness.

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

  Reason 1. I've already tested this patch with dEQP and with demos
  apps. It works. I wanted to get the tested patches into the tree now,
  and polish the implementation afterwards.

  Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
  supports neither Android's sync_fence, nor upstream's sync_file, nor
  drm_syncobj. Again, I tested these patches on Android with a 3.18
  kernel and they work.

I plan to quickly follow-up with patches that remove the shortcuts and
properly import/export the sync fds.

Non-Testing
===
I did not test at all using the Android.mk buildsystem. I probably
broke it. Please test and review that.

Testing
===
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass:

  a little spinning cube demo APK
  dEQP-VK.info.*
  dEQP-VK.api.smoke.*
  dEQP-VK.api.info.instance.*
  dEQP-VK.api.info.device.*
  dEQP-VK.api.wsi.android.*

v2:
  - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
the VkImage.
  - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
during vkCreateImage. Instead, directly import its dma-buf during
vkCreateImage with anv_bo_cache_import(). [for jekstrand]
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
exist.
---
 src/intel/Makefile.sources  |   3 +
 src/intel/Makefile.vulkan.am|   2 +
 src/intel/vulkan/anv_android.c  | 245 
 src/intel/vulkan/anv_device.c   |  12 +-
 src/intel/vulkan/anv_entrypoints_gen.py |  10 +-
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_image.c| 141 --
 src/intel/vulkan/anv_private.h  |   1 +
 8 files changed, 405 insertions(+), 10 deletions(-)
 create mode 100644 src/intel/vulkan/anv_android.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 8ca50ff622b..6f2dfa91e20 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -229,6 +229,9 @@ VULKAN_FILES := \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
 
+VULKAN_ANDROID_FILES := \
+   vulkan/anv_android.c
+
 VULKAN_WSI_WAYLAND_FILES := \
vulkan/anv_wsi_wayland.c
 
diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
index d1b1132ed2e..e9c824f717b 100644
--- a/src/intel/Makefile.vulkan.am
+++ b/src/intel/Makefile.vulkan.am
@@ -147,8 +147,10 @@ VULKAN_LIB_DEPS = \
-lm
 
 if HAVE_PLATFORM_ANDROID
+VULKAN_CPPFLAGS += $(ANDROID_CPPFLAGS)
 VULKAN_CFLAGS += $(ANDROID_CFLAGS)
 VULKAN_LIB_DEPS += $(ANDROID_LIBS)
+VULKAN_SOURCES += $(VULKAN_ANDROID_FILES)
 endif
 
 if HAVE_PLATFORM_X11
diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
new file mode 100644
index 000..6b19ace4d2d
--- /dev/null
+++ b/src/intel/vulkan/anv_android.c
@@ -0,0 +1,245 @@
+/*
+ * Copyright 2017 Google
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF 

[Mesa-dev] [PATCH 11/20] intel: Add simple logging façade for Android

2017-09-13 Thread Chad Versace
From: Chad Versace 

I'm bringing up Vulkan in the Android container of Chrome OS (ARC++).

On Android, stdio goes to /dev/null. On Android, remote gdb is even more
painful than the usual remote gdb. On Android, nothing works like you
expect and debugging is hell. I need logging.

This patch introduces a small, simple logging API that can easily wrap
Android's API. On non-Android platforms, this logger does nothing fancy.
It follows the time-honored Unix tradition of spewing everything to
stderr with minimal fuss.

My goal here is not perfection. My goal is to make a minimal, clean API,
that people hate merely a little instead of a lot, and that's good
enough to let me bring up Android Vulkan.  And it needs to be fast,
which means it must be small. No one wants to their game to miss frames
while aiming a flaming bow into the jaws of an angry robot t-rex, and
thus become t-rex breakfast, because some fool had too much fun desiging
a bloated, ideal logging API.

If people like it, perhaps we should quickly promote it to src/util.

The API looks like this:

#define INTEL_LOG_TAG "intel-vulkan"
#define DEBUG

intel_logd("try hard thing with foo=%d", foo);

n = try_foo(...);
if (n < 0) {
intel_loge("%s:%d: foo failed bigtime", __FILE__, __LINE__);
return VK_ERROR_DEVICE_LOST;
}

And produces this on non-Android:

intel-vulkan: debug: try hard thing with foo=93
intel-vulkan: error: anv_device.c:182: foo failed bigtime
---
 src/intel/Makefile.sources   |  4 +-
 src/intel/common/intel_log.c | 87 
 src/intel/common/intel_log.h | 82 +
 3 files changed, 172 insertions(+), 1 deletion(-)
 create mode 100644 src/intel/common/intel_log.c
 create mode 100644 src/intel/common/intel_log.h

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 200713b06ee..8ca50ff622b 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -18,7 +18,9 @@ COMMON_FILES = \
common/gen_l3_config.c \
common/gen_l3_config.h \
common/gen_urb_config.c \
-   common/gen_sample_positions.h
+   common/gen_sample_positions.h \
+   common/intel_log.c \
+   common/intel_log.h
 
 COMPILER_FILES = \
compiler/brw_cfg.cpp \
diff --git a/src/intel/common/intel_log.c b/src/intel/common/intel_log.c
new file mode 100644
index 000..03d6dc72a8d
--- /dev/null
+++ b/src/intel/common/intel_log.c
@@ -0,0 +1,87 @@
+/*
+ * Copyright 2017 Google
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+
+#ifdef ANDROID
+#include 
+#else
+#include 
+#endif
+
+#include "intel_log.h"
+
+#ifdef ANDROID
+static inline android_LogPriority
+level_to_android(enum intel_log_level l)
+{
+   switch (l) {
+   case INTEL_LOG_ERROR: return ANDROID_LOG_ERROR;
+   case INTEL_LOG_WARN: return ANDROID_LOG_WARN;
+   case INTEL_LOG_INFO: return ANDROID_LOG_INFO;
+   case INTEL_LOG_DEBUG: return ANDROID_LOG_DEBUG;
+   }
+
+   unreachable("bad intel_log_level");
+}
+#endif
+
+#ifndef ANDROID
+static inline const char *
+level_to_str(enum intel_log_level l)
+{
+   switch (l) {
+   case INTEL_LOG_ERROR: return "error";
+   case INTEL_LOG_WARN: return "warning";
+   case INTEL_LOG_INFO: return "info";
+   case INTEL_LOG_DEBUG: return "debug";
+   }
+
+   unreachable("bad intel_log_level");
+}
+#endif
+
+void
+intel_log(enum intel_log_level level, const char *tag, const char *format, ...)
+{
+   va_list va;
+
+   va_start(va, format);
+   intel_log_v(level, tag, format, va);
+   va_end(va);
+}
+
+void
+intel_log_v(enum intel_log_level level, const char *tag, const char *format,
+va_list va)
+{
+#ifdef ANDROID
+   __android_log_vprint(level_to_android(level), tag, format, va);
+#else
+   flockfile(stderr);
+   fprintf(stderr, "%s: %s: ", tag, level_to_str(level));
+   

[Mesa-dev] [PATCH 18/20] anv: Add sizeless anv_bo_cache_import()

2017-09-13 Thread Chad Versace
From: Chad Versace 

The patch renames anv_bo_cache_import() to
anv_bo_cache_import_with_size(); and adds a new variant,
anv_bo_cache_import(), that lacks the 'size' parameter. Same as
pre-patch, anv_bo_cache_import_with_size() continues to validate the
imported size.

The patch is essentially a refactor patch.  It should change no existing
behavior.

This split makes it easier to implement VK_ANDROID_native_buffer. When
the user imports a gralloc hande into a VkImage using
VK_ANDROID_native_buffer, the user provides no size. The driver must
infer the size from the internals of the gralloc buffer.
---
 src/intel/vulkan/anv_allocator.c | 118 +--
 src/intel/vulkan/anv_device.c|   7 ++-
 src/intel/vulkan/anv_intel.c |   4 +-
 src/intel/vulkan/anv_private.h   |   3 +
 src/intel/vulkan/anv_queue.c |   5 +-
 5 files changed, 88 insertions(+), 49 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 2cf1130bf29..756f49f3103 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -1269,63 +1269,97 @@ anv_bo_cache_alloc(struct anv_device *device,
return VK_SUCCESS;
 }
 
+static VkResult
+anv_bo_cache_import_locked(struct anv_device *device,
+   struct anv_bo_cache *cache,
+   int fd, struct anv_bo **bo_out)
+{
+   uint32_t gem_handle = anv_gem_fd_to_handle(device, fd);
+   if (!gem_handle)
+  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
+
+   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle);
+   if (bo) {
+  __sync_fetch_and_add(>refcount, 1);
+  return VK_SUCCESS;
+   }
+
+   off_t size = lseek(fd, 0, SEEK_END);
+   if (size == (off_t)-1) {
+  anv_gem_close(device, gem_handle);
+  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
+   }
+
+   bo = vk_alloc(>alloc, sizeof(struct anv_cached_bo), 8,
+ VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+   if (!bo) {
+  anv_gem_close(device, gem_handle);
+  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+   }
+
+   bo->refcount = 1;
+   anv_bo_init(>bo, gem_handle, size);
+   _mesa_hash_table_insert(cache->bo_map, (void *)(uintptr_t)gem_handle, bo);
+   *bo_out = >bo;
+   return VK_SUCCESS;
+}
+
 VkResult
 anv_bo_cache_import(struct anv_device *device,
 struct anv_bo_cache *cache,
-int fd, uint64_t size, struct anv_bo **bo_out)
+int fd, struct anv_bo **bo_out)
 {
+   VkResult result;
+
pthread_mutex_lock(>mutex);
+   result = anv_bo_cache_import_locked(device, cache, fd, bo_out);
+   pthread_mutex_unlock(>mutex);
+
+   return result;
+}
+
+VkResult
+anv_bo_cache_import_with_size(struct anv_device *device,
+  struct anv_bo_cache *cache,
+  int fd, uint64_t size,
+  struct anv_bo **bo_out)
+{
+   struct anv_bo *bo = NULL;
+   VkResult result;
 
/* The kernel is going to give us whole pages anyway */
size = align_u64(size, 4096);
 
-   uint32_t gem_handle = anv_gem_fd_to_handle(device, fd);
-   if (!gem_handle) {
+   pthread_mutex_lock(>mutex);
+
+   result = anv_bo_cache_import_locked(device, cache, fd, );
+   if (result != VK_SUCCESS) {
   pthread_mutex_unlock(>mutex);
-  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
+  return result;
}
 
-   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle);
-   if (bo) {
-  if (bo->bo.size != size) {
- pthread_mutex_unlock(>mutex);
- return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
-  }
-  __sync_fetch_and_add(>refcount, 1);
-   } else {
-  /* For security purposes, we reject BO imports where the size does not
-   * match exactly.  This prevents a malicious client from passing a
-   * buffer to a trusted client, lying about the size, and telling the
-   * trusted client to try and texture from an image that goes
-   * out-of-bounds.  This sort of thing could lead to GPU hangs or worse
-   * in the trusted client.  The trusted client can protect itself against
-   * this sort of attack but only if it can trust the buffer size.
-   */
-  off_t import_size = lseek(fd, 0, SEEK_END);
-  if (import_size == (off_t)-1 || import_size != size) {
- anv_gem_close(device, gem_handle);
- pthread_mutex_unlock(>mutex);
- return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
-  }
-
-  bo = vk_alloc(>alloc, sizeof(struct anv_cached_bo), 8,
-VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
-  if (!bo) {
- anv_gem_close(device, gem_handle);
- pthread_mutex_unlock(>mutex);
- return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
-  }
-
-  bo->refcount = 1;
-
-  anv_bo_init(>bo, gem_handle, size);
-
-  _mesa_hash_table_insert(cache->bo_map, 

[Mesa-dev] [PATCH 19/20] anv: Add func anv_gem_get_tiling()

2017-09-13 Thread Chad Versace
From: Chad Versace 

Will use in VK_ANDROID_native_buffer.
---
 src/intel/vulkan/anv_gem.c | 16 
 src/intel/vulkan/anv_private.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 3994c6b66c2..34c09891086 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -192,6 +192,22 @@ anv_gem_execbuffer(struct anv_device *device,
   return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
 }
 
+/** Return -1 on error. */
+int
+anv_gem_get_tiling(struct anv_device *device, uint32_t gem_handle)
+{
+   struct drm_i915_gem_get_tiling get_tiling = {
+  .handle = gem_handle,
+   };
+
+   if (anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_GET_TILING, _tiling)) {
+  assert(!"Failed to get BO tiling");
+  return -1;
+   }
+
+   return get_tiling.tiling_mode;
+}
+
 int
 anv_gem_set_tiling(struct anv_device *device,
uint32_t gem_handle, uint32_t stride, uint32_t tiling)
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index d7cce24f5c1..eca8fb94b59 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -930,6 +930,7 @@ int anv_gem_destroy_context(struct anv_device *device, int 
context);
 int anv_gem_get_context_param(int fd, int context, uint32_t param,
   uint64_t *value);
 int anv_gem_get_param(int fd, uint32_t param);
+int anv_gem_get_tiling(struct anv_device *device, uint32_t gem_handle);
 bool anv_gem_get_bit6_swizzle(int fd, uint32_t tiling);
 int anv_gem_get_aperture(int fd, uint64_t *size);
 bool anv_gem_supports_48b_addresses(int fd);
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/20] anv: Move close(fd) from anv_bo_cache_import to its callers

2017-09-13 Thread Chad Versace
From: Chad Versace 

This will allow us to implement VK_ANDROID_native_buffer without dup'ing
the fd. We must close the fd in VK_KHR_external_memory_fd, but we should
not in VK_ANDROID_native_buffer.
---
 src/intel/vulkan/anv_allocator.c | 12 
 src/intel/vulkan/anv_device.c| 11 +++
 src/intel/vulkan/anv_intel.c |  2 ++
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index be750adeb52..2cf1130bf29 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -1324,18 +1324,6 @@ anv_bo_cache_import(struct anv_device *device,
}
 
pthread_mutex_unlock(>mutex);
-
-   /* From the Vulkan spec:
-*
-*"Importing memory from a file descriptor transfers ownership of
-*the file descriptor from the application to the Vulkan
-*implementation. The application must not perform any operations on
-*the file descriptor after a successful import."
-*
-* If the import fails, we leave the file descriptor open.
-*/
-   close(fd);
-
*bo_out = >bo;
 
return VK_SUCCESS;
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 8c645707a90..7fea4188986 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1531,6 +1531,17 @@ VkResult anv_AllocateMemory(
>bo);
   if (result != VK_SUCCESS)
  goto fail;
+
+  /* From the Vulkan spec:
+   *
+   *"Importing memory from a file descriptor transfers ownership of
+   *the file descriptor from the application to the Vulkan
+   *implementation. The application must not perform any operations on
+   *the file descriptor after a successful import."
+   *
+   * If the import fails, we leave the file descriptor open.
+   */
+  close(fd_info->fd);
} else {
   result = anv_bo_cache_alloc(device, >bo_cache,
   pAllocateInfo->allocationSize,
diff --git a/src/intel/vulkan/anv_intel.c b/src/intel/vulkan/anv_intel.c
index 991a93542d2..be6568468e1 100644
--- a/src/intel/vulkan/anv_intel.c
+++ b/src/intel/vulkan/anv_intel.c
@@ -79,6 +79,8 @@ VkResult anv_CreateDmaBufImageINTEL(
   }},
   pAllocator, _h);
 
+   close(pCreateInfo->fd);
+
image = anv_image_from_handle(image_h);
image->bo = mem->bo;
image->offset = 0;
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 16/20] anv: Add field anv_image::bo_is_owned

2017-09-13 Thread Chad Versace
From: Chad Versace 

If this flag is set, then the image and it's bo have the same
lifetime. vkDestroyImage will release the bo.

We need this for VK_ANDROID_native_buffer, because that extension
creates the VkImage and imports its memory in the same call,
vkCreateImage.
---
 src/intel/vulkan/anv_image.c   | 9 -
 src/intel/vulkan/anv_private.h | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index ba09cc4c585..d6691014552 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -451,8 +451,12 @@ anv_image_create(VkDevice _device,
return VK_SUCCESS;
 
 fail:
-   if (image)
+   if (image) {
+  if (image->bo && image->bo_is_owned)
+ anv_bo_cache_release(device, >bo_cache, image->bo);
+
   vk_free2(>alloc, alloc, image);
+   }
 
return r;
 }
@@ -481,6 +485,9 @@ anv_DestroyImage(VkDevice _device, VkImage _image,
if (!image)
   return;
 
+   if (image->bo && image->bo_is_owned)
+  anv_bo_cache_release(device, >bo_cache, image->bo);
+
vk_free2(>alloc, pAllocator, image);
 }
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 16670f2ac3b..7a7f23141af 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2224,6 +2224,7 @@ struct anv_image {
/* Set when bound */
struct anv_bo *bo;
VkDeviceSize offset;
+   bool bo_is_owned; /**< When destroying the image, also free its bo. */
 
/**
 * Image subsurfaces
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/20] intel: Move definition of LOG_TAG from header into Makefiles

2017-09-13 Thread Chad Versace
From: Chad Versace 

This patch prevents compilation failures in upcoming Android Vulkan
patches, failures due to redefinition of LOG_TAG in Android system
headers.

This patch does not change the value of LOG_TAG. It remains
"INTEL-MESA". (I don't like it, though. The all-caps smells like
FORTRAN).

Only one Intel header defined LOG_TAG: gen_debug.h. I believe I defined
it correctly in all the necessary Automake and Android.mk files, but one
can never truly know when touching build systems.

This patch is ugly and too big. If someone knows a better way, please
speak up.
---
 src/intel/Android.blorp.mk|  2 ++
 src/intel/Android.common.mk   |  2 ++
 src/intel/Android.vulkan.mk   | 21 -
 src/intel/Makefile.am |  3 ++-
 src/intel/common/gen_debug.h  |  1 -
 src/mesa/drivers/dri/i965/Android.mk  | 23 +--
 src/mesa/drivers/dri/i965/Makefile.am |  1 +
 7 files changed, 36 insertions(+), 17 deletions(-)

diff --git a/src/intel/Android.blorp.mk b/src/intel/Android.blorp.mk
index e1aa908706e..0fc7e1ed090 100644
--- a/src/intel/Android.blorp.mk
+++ b/src/intel/Android.blorp.mk
@@ -31,6 +31,8 @@ LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(BLORP_FILES)
 
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
+
 LOCAL_C_INCLUDES := \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
$(MESA_TOP)/src/gallium/auxiliary \
diff --git a/src/intel/Android.common.mk b/src/intel/Android.common.mk
index 12cea6e5472..931800a596f 100644
--- a/src/intel/Android.common.mk
+++ b/src/intel/Android.common.mk
@@ -31,6 +31,8 @@ LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(COMMON_FILES)
 
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
+
 LOCAL_C_INCLUDES := \
external/zlib \
$(MESA_TOP)/src/gallium/include \
diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
index 17ae4b071b3..990bdb52fde 100644
--- a/src/intel/Android.vulkan.mk
+++ b/src/intel/Android.vulkan.mk
@@ -51,6 +51,8 @@ LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 intermediates := $(call local-generated-sources-dir)
 
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
+
 LOCAL_C_INCLUDES := \
$(VULKAN_COMMON_INCLUDES)
 
@@ -90,7 +92,8 @@ LOCAL_MODULE := libmesa_anv_gen7
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(VULKAN_GEN7_FILES)
-LOCAL_CFLAGS := -DGEN_VERSIONx10=70
+
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\" -DGEN_VERSIONx10=70
 
 LOCAL_C_INCLUDES := $(ANV_INCLUDES)
 
@@ -110,7 +113,8 @@ LOCAL_MODULE := libmesa_anv_gen75
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(VULKAN_GEN75_FILES)
-LOCAL_CFLAGS := -DGEN_VERSIONx10=75
+
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\" -DGEN_VERSIONx10=75
 
 LOCAL_C_INCLUDES := $(ANV_INCLUDES)
 
@@ -130,7 +134,8 @@ LOCAL_MODULE := libmesa_anv_gen8
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(VULKAN_GEN8_FILES)
-LOCAL_CFLAGS := -DGEN_VERSIONx10=80
+
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\" -DGEN_VERSIONx10=80
 
 LOCAL_C_INCLUDES := $(ANV_INCLUDES)
 
@@ -150,7 +155,8 @@ LOCAL_MODULE := libmesa_anv_gen9
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(VULKAN_GEN9_FILES)
-LOCAL_CFLAGS := -DGEN_VERSIONx10=90
+
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\" -DGEN_VERSIONx10=90
 
 LOCAL_C_INCLUDES := $(ANV_INCLUDES)
 
@@ -170,7 +176,8 @@ LOCAL_MODULE := libmesa_anv_gen10
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 
 LOCAL_SRC_FILES := $(VULKAN_GEN10_FILES)
-LOCAL_CFLAGS := -DGEN_VERSIONx10=100
+
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\" -DGEN_VERSIONx10=100
 
 LOCAL_C_INCLUDES := $(ANV_INCLUDES)
 
@@ -193,6 +200,8 @@ intermediates := $(call local-generated-sources-dir)
 
 LOCAL_SRC_FILES := $(VULKAN_FILES)
 
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
+
 LOCAL_C_INCLUDES := \
$(ANV_INCLUDES) \
$(MESA_TOP)/src/compiler
@@ -237,6 +246,8 @@ include $(CLEAR_VARS)
 LOCAL_MODULE := libvulkan_intel
 LOCAL_MODULE_CLASS := SHARED_LIBRARIES
 
+LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
+
 LOCAL_LDFLAGS += -Wl,--build-id=sha1
 
 LOCAL_SRC_FILES := \
diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
index a34e3014497..2872ffc1e89 100644
--- a/src/intel/Makefile.am
+++ b/src/intel/Makefile.am
@@ -40,7 +40,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/gallium/include \
$(VALGRIND_CFLAGS) \
$(LIBDRM_CFLAGS) \
-   $(DEFINES)
+   $(DEFINES) \
+   -DLOG_TAG=\"INTEL-MESA\"
 
 AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
diff --git a/src/intel/common/gen_debug.h b/src/intel/common/gen_debug.h
index d290303682e..a38ee0844eb 100644
--- a/src/intel/common/gen_debug.h
+++ b/src/intel/common/gen_debug.h
@@ -84,7 +84,6 @@ extern uint64_t INTEL_DEBUG;
 #define DEBUG_COLOR   (1ull << 40)
 
 #ifdef HAVE_ANDROID_PLATFORM
-#define LOG_TAG "INTEL-MESA"
 #include 
 #ifndef ALOGW
 #define ALOGW LOGW
diff --git a/src/mesa/drivers/dri/i965/Android.mk 

[Mesa-dev] [PATCH 13/20] anv/image: Better var names for image-create-info structs

2017-09-13 Thread Chad Versace
From: Chad Versace 

Upcoming patches teach vkCreateImage to understand
VkNativeBufferANDROID. And much later patches will do the same for
whatever VkImageCreateInfo-chained struct imports dma-bufs. And then
again for the struct that will be introduced by the inevitable extension
VK_ANDROID_external_memory_whatever. So let's reduce ambiguity now by
choosing better variable names in make_surface() and
anv_image_create_info().

Rename:
  struct anv_image_create_info *create_info -> anv_info
  VkImageCreateInfo *pCreateInfo-> base_info
  VkImageCreateInfo *vk_info-> base_info
---
 src/intel/vulkan/anv_image.c | 74 ++--
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 492b341303a..78b4508dea8 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -204,7 +204,7 @@ make_surface(const struct anv_device *dev,
  const struct anv_image_create_info *anv_info,
  VkImageAspectFlags aspect)
 {
-   const VkImageCreateInfo *vk_info = anv_info->vk_info;
+   const VkImageCreateInfo *base_info = anv_info->vk_info;
bool ok UNUSED;
 
static const enum isl_surf_dim vk_to_isl_surf_dim[] = {
@@ -217,7 +217,7 @@ make_surface(const struct anv_device *dev,
 * result with an optionally provided ISL tiling argument.
 */
isl_tiling_flags_t tiling_flags =
-  (vk_info->tiling == VK_IMAGE_TILING_LINEAR) ?
+  (base_info->tiling == VK_IMAGE_TILING_LINEAR) ?
   ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK;
 
if (anv_info->isl_tiling_flags)
@@ -227,25 +227,25 @@ make_surface(const struct anv_device *dev,
 
struct anv_surface *anv_surf = get_surface(image, aspect);
 
-   image->extent = anv_sanitize_image_extent(vk_info->imageType,
- vk_info->extent);
+   image->extent = anv_sanitize_image_extent(base_info->imageType,
+ base_info->extent);
 
-   enum isl_format format = anv_get_isl_format(>info, vk_info->format,
-   aspect, vk_info->tiling);
+   enum isl_format format = anv_get_isl_format(>info, base_info->format,
+   aspect, base_info->tiling);
assert(format != ISL_FORMAT_UNSUPPORTED);
 
ok = isl_surf_init(>isl_dev, _surf->isl,
-  .dim = vk_to_isl_surf_dim[vk_info->imageType],
+  .dim = vk_to_isl_surf_dim[base_info->imageType],
   .format = format,
   .width = image->extent.width,
   .height = image->extent.height,
   .depth = image->extent.depth,
-  .levels = vk_info->mipLevels,
-  .array_len = vk_info->arrayLayers,
-  .samples = vk_info->samples,
+  .levels = base_info->mipLevels,
+  .array_len = base_info->arrayLayers,
+  .samples = base_info->samples,
   .min_alignment = 0,
   .row_pitch = anv_info->stride,
-  .usage = choose_isl_surf_usage(vk_info->flags, image->usage, aspect),
+  .usage = choose_isl_surf_usage(base_info->flags, image->usage, aspect),
   .tiling_flags = tiling_flags);
 
/* isl_surf_init() will fail only if provided invalid input. Invalid input
@@ -270,12 +270,12 @@ make_surface(const struct anv_device *dev,
  /* It will never be used as an attachment, HiZ is pointless. */
   } else if (dev->info.gen == 7) {
  anv_perf_warn(dev->instance, image, "Implement gen7 HiZ");
-  } else if (vk_info->mipLevels > 1) {
+  } else if (base_info->mipLevels > 1) {
  anv_perf_warn(dev->instance, image, "Enable multi-LOD HiZ");
-  } else if (vk_info->arrayLayers > 1) {
+  } else if (base_info->arrayLayers > 1) {
  anv_perf_warn(dev->instance, image,
"Implement multi-arrayLayer HiZ clears and resolves");
-  } else if (dev->info.gen == 8 && vk_info->samples > 1) {
+  } else if (dev->info.gen == 8 && base_info->samples > 1) {
  anv_perf_warn(dev->instance, image, "Enable gen8 multisampled HiZ");
   } else if (!unlikely(INTEL_DEBUG & DEBUG_NO_HIZ)) {
  assert(image->aux_surface.isl.size == 0);
@@ -285,7 +285,7 @@ make_surface(const struct anv_device *dev,
  add_surface(image, >aux_surface);
  image->aux_usage = ISL_AUX_USAGE_HIZ;
   }
-   } else if (aspect == VK_IMAGE_ASPECT_COLOR_BIT && vk_info->samples == 1) {
+   } else if (aspect == VK_IMAGE_ASPECT_COLOR_BIT && base_info->samples == 1) {
   if (!unlikely(INTEL_DEBUG & DEBUG_NO_RBC)) {
  assert(image->aux_surface.isl.size == 0);
  ok = isl_surf_get_ccs_surf(>isl_dev, _surf->isl,
@@ -318,16 +318,16 @@ make_surface(const struct anv_device *dev,
  * a render target.  This means that it's safe to just leave
  * compression on at all times for these formats.
  */
-if (!(vk_info->usage & 

[Mesa-dev] [PATCH 07/20] anv: Feed vk_android_native_buffer.xml to generators

2017-09-13 Thread Chad Versace
From: Chad Versace 

Feed the XML to anv_extensions.py and anv_entrypoints_gen.py.
Do it on all platforms, not just Android. Tested on Android and Fedora.

We always parse the Android XML, regardless of target platform, to
help reduce the chance that people working on non-Android break the
Android build.
---
 src/intel/Makefile.vulkan.am   | 17 +
 src/intel/vulkan/anv_extensions.py | 12 +++-
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
index fa9b6ba7245..8a19f96096a 100644
--- a/src/intel/Makefile.vulkan.am
+++ b/src/intel/Makefile.vulkan.am
@@ -23,18 +23,27 @@
 # rules must be outside of any AM_CONDITIONALs. Otherwise they will be 
commented
 # out and we'll fail at `make dist'
 vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
+vk_android_native_buffer_xml = 
$(top_srcdir)/src/vulkan/registry/vk_android_native_buffer.xml
 
 vulkan/anv_entrypoints.c: vulkan/anv_entrypoints_gen.py \
- vulkan/anv_extensions.py $(vulkan_api_xml)
+ vulkan/anv_extensions.py \
+ $(vulkan_api_xml) \
+ $(vk_android_native_buffer_xml)
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/vulkan/anv_entrypoints_gen.py \
-   --xml $(vulkan_api_xml) --outdir $(builddir)/vulkan
+   --xml $(vulkan_api_xml) \
+   --xml $(vk_android_native_buffer_xml) \
+   --outdir $(builddir)/vulkan
 vulkan/anv_entrypoints.h: vulkan/anv_entrypoints.c
 
-vulkan/anv_extensions.c: vulkan/anv_extensions.py $(vulkan_api_xml)
+vulkan/anv_extensions.c: vulkan/anv_extensions.py \
+$(vulkan_api_xml) \
+$(vk_android_native_buffer_xml)
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/vulkan/anv_extensions.py \
-   --xml $(vulkan_api_xml) --out $@
+   --xml $(vulkan_api_xml) \
+   --xml $(vk_android_native_buffer_xml) \
+   --out $@
 
 BUILT_SOURCES += $(VULKAN_GENERATED_FILES)
 CLEANFILES += \
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index d995f9f177c..316e1f04e79 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -134,8 +134,18 @@ def _init_exts_from_xml(xml):
 ext_name = ext_elem.attrib['name']
 if ext_name not in ext_name_map:
 continue
-ext = ext_name_map[ext_name]
 
+# Workaround for VK_ANDROID_native_buffer. Its  element in
+# vk.xml lists it as supported="disabled" and provides only a stub
+# definition.  Its  element in Mesa's custom
+# vk_android_native_buffer.xml, though, lists it as
+# supported='android-vendor' and fully defines the extension. We want
+# to skip the  element in vk.xml.
+if ext_elem.attrib['supported'] == 'disabled':
+assert ext_name == 'VK_ANDROID_native_buffer'
+continue
+
+ext = ext_name_map[ext_name]
 ext.type = ext_elem.attrib['type']
 
 _TEMPLATE = Template(COPYRIGHT + """
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/20] anv/image: Refactor how tiling is chosen

2017-09-13 Thread Chad Versace
From: Chad Versace 

The code that restricts the VkImage's tiling flags, extract it into
a new function named choose_isl_tiling_flags().  This reduces the diff
in upcoming patches for VK_ANDROID_native_buffer.
---
 src/intel/vulkan/anv_image.c | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 78b4508dea8..03cecb1f08a 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -106,6 +106,21 @@ get_surface(struct anv_image *image, VkImageAspectFlags 
aspect)
}
 }
 
+static VkResult
+choose_isl_tiling_flags(const struct anv_image_create_info *anv_info,
+isl_tiling_flags_t *restrict flags)
+{
+   *flags = ISL_TILING_ANY_MASK;
+
+   if (anv_info->vk_info->tiling == VK_IMAGE_TILING_LINEAR)
+  *flags &= ISL_TILING_LINEAR_BIT;
+
+   if (anv_info->isl_tiling_flags)
+  *flags &= anv_info->isl_tiling_flags;
+
+   return VK_SUCCESS;
+}
+
 static void
 add_surface(struct anv_image *image, struct anv_surface *surf)
 {
@@ -205,6 +220,7 @@ make_surface(const struct anv_device *dev,
  VkImageAspectFlags aspect)
 {
const VkImageCreateInfo *base_info = anv_info->vk_info;
+   VkResult result;
bool ok UNUSED;
 
static const enum isl_surf_dim vk_to_isl_surf_dim[] = {
@@ -213,17 +229,10 @@ make_surface(const struct anv_device *dev,
   [VK_IMAGE_TYPE_3D] = ISL_SURF_DIM_3D,
};
 
-   /* Translate the Vulkan tiling to an equivalent ISL tiling, then filter the
-* result with an optionally provided ISL tiling argument.
-*/
-   isl_tiling_flags_t tiling_flags =
-  (base_info->tiling == VK_IMAGE_TILING_LINEAR) ?
-  ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK;
-
-   if (anv_info->isl_tiling_flags)
-  tiling_flags &= anv_info->isl_tiling_flags;
-
-   assert(tiling_flags);
+   isl_tiling_flags_t tiling_flags;
+   result = choose_isl_tiling_flags(anv_info, _flags);
+   if (result != VK_SUCCESS)
+  return result;
 
struct anv_surface *anv_surf = get_surface(image, aspect);
 
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/20] anv: Better support for Android logging (v2)

2017-09-13 Thread Chad Versace
From: Chad Versace 

In src/intel/vulkan/*, redirect all instances of printf, vk_error,
anv_loge, anv_debug, anv_finishme, anv_perf_warn, anv_assert, and their
many variants to the new intel_log functions. I believe I caught them
all.

The other subdirs of src/intel are left for a future exercise.

v2:
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop unused #include .
---
 src/intel/vulkan/anv_device.c  | 11 +--
 src/intel/vulkan/anv_private.h | 12 +---
 src/intel/vulkan/anv_util.c| 21 +++--
 src/intel/vulkan/genX_cmd_buffer.c |  4 ++--
 4 files changed, 15 insertions(+), 33 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 8e2ed9eac45..8c645707a90 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -50,7 +50,7 @@ compiler_perf_log(void *data, const char *fmt, ...)
va_start(args, fmt);
 
if (unlikely(INTEL_DEBUG & DEBUG_PERF))
-  vfprintf(stderr, fmt, args);
+  intel_logd_v(fmt, args);
 
va_end(args);
 }
@@ -294,11 +294,11 @@ anv_physical_device_init(struct anv_physical_device 
*device,
}
 
if (device->info.is_haswell) {
-  fprintf(stderr, "WARNING: Haswell Vulkan support is incomplete\n");
+  intel_logw("Haswell Vulkan support is incomplete");
} else if (device->info.gen == 7 && !device->info.is_baytrail) {
-  fprintf(stderr, "WARNING: Ivy Bridge Vulkan support is incomplete\n");
+  intel_logw("Ivy Bridge Vulkan support is incomplete");
} else if (device->info.gen == 7 && device->info.is_baytrail) {
-  fprintf(stderr, "WARNING: Bay Trail Vulkan support is incomplete\n");
+  intel_logw("Bay Trail Vulkan support is incomplete");
} else if (device->info.gen >= 8 && device->info.gen <= 9) {
   /* Broadwell, Cherryview, Skylake, Broxton, Kabylake is as fully
* supported as anything */
@@ -365,8 +365,7 @@ anv_physical_device_init(struct anv_physical_device *device,
* many platforms, but otherwise, things will just work.
*/
   if (device->subslice_total < 1 || device->eu_total < 1) {
- fprintf(stderr, "WARNING: Kernel 4.1 required to properly"
- " query GPU properties.\n");
+ intel_logw("Kernel 4.1 required to properly query GPU properties");
   }
} else if (device->info.gen == 7) {
   device->subslice_total = 1 << (device->info.gt - 1);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index df8de8d9407..16670f2ac3b 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -74,6 +74,7 @@ struct gen_l3_config;
 #include "isl/isl.h"
 
 #include "common/gen_debug.h"
+#include "common/intel_log.h"
 #include "wsi_common.h"
 
 /* Allowing different clear colors requires us to perform a depth resolve at
@@ -314,11 +315,9 @@ VkResult __vk_errorf(struct anv_instance *instance, const 
void *object,
 #define vk_errorf(instance, obj, error, format, ...)\
 __vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\
 __FILE__, __LINE__, format, ## __VA_ARGS__);
-#define anv_debug(format, ...) fprintf(stderr, "debug: " format, ##__VA_ARGS__)
 #else
 #define vk_error(error) error
 #define vk_errorf(instance, obj, error, format, ...) error
-#define anv_debug(format, ...)
 #endif
 
 /**
@@ -337,10 +336,8 @@ VkResult __vk_errorf(struct anv_instance *instance, const 
void *object,
  *defined by extensions supported by that component.
  */
 #define anv_debug_ignored_stype(sType) \
-   anv_debug("%s: ignored VkStructureType %u\n", __func__, (sType))
+   intel_logd("%s: ignored VkStructureType %u\n", __func__, (sType))
 
-void __anv_finishme(const char *file, int line, const char *format, ...)
-   anv_printflike(3, 4);
 void __anv_perf_warn(struct anv_instance *instance, const void *object,
  VkDebugReportObjectTypeEXT type, const char *file,
  int line, const char *format, ...)
@@ -364,7 +361,8 @@ void anv_debug_report(struct anv_instance *instance,
do { \
   static bool reported = false; \
   if (!reported) { \
- __anv_finishme(__FILE__, __LINE__, format, ##__VA_ARGS__); \
+ intel_logw("%s:%d: FINISHME: " format, __FILE__, __LINE__, \
+##__VA_ARGS__); \
  reported = true; \
   } \
} while (0)
@@ -386,7 +384,7 @@ void anv_debug_report(struct anv_instance *instance,
 #ifdef DEBUG
 #define anv_assert(x) ({ \
if (unlikely(!(x))) \
-  fprintf(stderr, "%s:%d ASSERT: %s\n", __FILE__, __LINE__, #x); \
+  intel_loge("%s:%d ASSERT: %s", __FILE__, __LINE__, #x); \
 })
 #else
 #define anv_assert(x)
diff --git a/src/intel/vulkan/anv_util.c b/src/intel/vulkan/anv_util.c
index ec61f7355ef..59f893492b7 100644
--- a/src/intel/vulkan/anv_util.c
+++ b/src/intel/vulkan/anv_util.c
@@ -47,22 +47,7 @@ anv_loge(const char *format, ...)
 void
 anv_loge_v(const 

[Mesa-dev] [PATCH 08/20] anv/android: Disable surface and swapchain extensions (v2)

2017-09-13 Thread Chad Versace
From: Chad Versace 

Android's Vulkan loader implements VK_KHR_surface and VK_KHR_swapchain,
and applications cannot access the driver's implementation. Moreoever,
if the driver exposes the those extension strings, then tests
dEQP-VK.api.info.instance.extensions and dEQP-VK.api.info.device fail
due to the duplicated strings.

v2: Replace !ANDROID with ANV_HAS_SURFACE. (for jekstrand)
---
 src/intel/vulkan/anv_extensions.py | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 316e1f04e79..f484e10fb93 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -44,6 +44,11 @@ class Extension:
 else:
 self.enable = enable;
 
+# On Android, we disable all surface and swapchain extensions. Android's Vulkan
+# loader implements VK_KHR_surface and VK_KHR_swapchain, and applications
+# cannot access the driver's implementation. Moreoever, if the driver exposes
+# the those extension strings, then tests dEQP-VK.api.info.instance.extensions
+# and dEQP-VK.api.info.device fail due to the duplicated strings.
 EXTENSIONS = [
 Extension('VK_KHR_dedicated_allocation',  1, True),
 Extension('VK_KHR_descriptor_update_template',1, True),
@@ -60,7 +65,7 @@ EXTENSIONS = [
 Extension('VK_KHR_external_semaphore_fd', 1, True),
 Extension('VK_KHR_get_memory_requirements2',  1, True),
 Extension('VK_KHR_get_physical_device_properties2',   1, True),
-Extension('VK_KHR_get_surface_capabilities2', 1, True),
+Extension('VK_KHR_get_surface_capabilities2', 1, 
'ANV_HAS_SURFACE'),
 Extension('VK_KHR_incremental_present',   1, True),
 Extension('VK_KHR_maintenance1',  1, True),
 Extension('VK_KHR_push_descriptor',   1, True),
@@ -68,8 +73,8 @@ EXTENSIONS = [
 Extension('VK_KHR_sampler_mirror_clamp_to_edge',  1, True),
 Extension('VK_KHR_shader_draw_parameters',1, True),
 Extension('VK_KHR_storage_buffer_storage_class',  1, True),
-Extension('VK_KHR_surface',  25, True),
-Extension('VK_KHR_swapchain',68, True),
+Extension('VK_KHR_surface',  25, 
'ANV_HAS_SURFACE'),
+Extension('VK_KHR_swapchain',68, 
'ANV_HAS_SURFACE'),
 Extension('VK_KHR_variable_pointers', 1, True),
 Extension('VK_KHR_wayland_surface',   6, 
'VK_USE_PLATFORM_WAYLAND_KHR'),
 Extension('VK_KHR_xcb_surface',   6, 
'VK_USE_PLATFORM_XCB_KHR'),
@@ -163,6 +168,18 @@ _TEMPLATE = Template(COPYRIGHT + """
 #endif
 %endfor
 
+/* And ANDROID too */
+#ifdef ANDROID
+#   undef ANDROID
+#   define ANDROID true
+#else
+#   define ANDROID false
+#endif
+
+#define ANV_HAS_SURFACE (VK_USE_PLATFORM_WAYLAND_KHR || \\
+ VK_USE_PLATFORM_XCB_KHR || \\
+ VK_USE_PLATFORM_XLIB_KHR)
+
 bool
 anv_instance_extension_supported(const char *name)
 {
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/20] anv: Link to Android libraries in the autotools build

2017-09-13 Thread Chad Versace
From: Chad Versace 

A first step to supporting Vulkan on ARC++. Mesa on ARC++ uses
Autotools, not Android.mk.

Doing this now, even before VK_ANDROID_native_buffer is implemented,
allows us to incrementally add Android support to the Autotools build.
---
 src/intel/Makefile.vulkan.am | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
index 8a19f96096a..d1b1132ed2e 100644
--- a/src/intel/Makefile.vulkan.am
+++ b/src/intel/Makefile.vulkan.am
@@ -146,6 +146,11 @@ VULKAN_LIB_DEPS = \
$(DLOPEN_LIBS) \
-lm
 
+if HAVE_PLATFORM_ANDROID
+VULKAN_CFLAGS += $(ANDROID_CFLAGS)
+VULKAN_LIB_DEPS += $(ANDROID_LIBS)
+endif
+
 if HAVE_PLATFORM_X11
 VULKAN_CPPFLAGS += \
$(XCB_DRI3_CFLAGS) \
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/20] vulkan/util: Teach gen_enum_to_str.py to parse mutliple XML files

2017-09-13 Thread Chad Versace
From: Chad Versace 

To give the script multiple XML files, call it like so:

gen_enum_to_str.py --xml a.xml --xml b.xml --xml c.xml ...

The script parses the XML files in the given order.

This will allow us to feed the script XML files for extensions that are
missing from the official vk.xml, such as VK_ANDROID_native_buffer.
---
 src/vulkan/util/gen_enum_to_str.py | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/vulkan/util/gen_enum_to_str.py 
b/src/vulkan/util/gen_enum_to_str.py
index ef37972c20d..bc72c189943 100644
--- a/src/vulkan/util/gen_enum_to_str.py
+++ b/src/vulkan/util/gen_enum_to_str.py
@@ -121,13 +121,12 @@ class VkEnum(object):
 self.values = values or []
 
 
-def xml_parser(filename):
-"""Parse the XML file and return parsed data.
+def parse_xml(efactory, filename):
+"""Parse the XML file. Accumulate results into the efactory.
 
 This parser is a memory efficient iterative XML parser that returns a list
 of VkEnum objects.
 """
-efactory = EnumFactory(VkEnum)
 
 with open(filename, 'rb') as f:
 context = iter(et.iterparse(f, events=('start', 'end')))
@@ -153,25 +152,29 @@ def xml_parser(filename):
 
 root.clear()
 
-return efactory.registry.values()
-
 
 def main():
 parser = argparse.ArgumentParser()
-parser.add_argument('--xml', help='Vulkan API XML file.', required=True)
+parser.add_argument('--xml', required=True,
+help='Vulkan API XML files',
+action='append',
+dest='xml_files')
 parser.add_argument('--outdir',
 help='Directory to put the generated files in',
 required=True)
 
 args = parser.parse_args()
 
-enums = xml_parser(args.xml)
+efactory = EnumFactory(VkEnum)
+for filename in args.xml_files:
+parse_xml(efactory, filename)
+
 for template, file_ in [(C_TEMPLATE, os.path.join(args.outdir, 
'vk_enum_to_str.c')),
 (H_TEMPLATE, os.path.join(args.outdir, 
'vk_enum_to_str.h'))]:
 with open(file_, 'wb') as f:
 f.write(template.render(
 file=os.path.basename(__file__),
-enums=enums,
+enums=efactory.registry.values(),
 copyright=COPYRIGHT))
 
 
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/20] vulkan/registry: Feed vk_android_native_buffer.xml to gen_enum_to_str.py

2017-09-13 Thread Chad Versace
From: Chad Versace 

Tested on Android and Fedora.
---
 src/vulkan/Android.mk  |  9 +++--
 src/vulkan/Makefile.am |  9 +++--
 src/vulkan/util/gen_enum_to_str.py | 20 +++-
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/src/vulkan/Android.mk b/src/vulkan/Android.mk
index e19a33dd4f9..47c5c510819 100644
--- a/src/vulkan/Android.mk
+++ b/src/vulkan/Android.mk
@@ -44,11 +44,16 @@ LOCAL_GENERATED_SOURCES := $(addprefix $(intermediates)/, \
 LOCAL_SRC_FILES := $(VULKAN_UTIL_FILES)
 
 vulkan_api_xml = $(MESA_TOP)/src/vulkan/registry/vk.xml
+vk_android_native_buffer_xml = 
$(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml
 
-$(LOCAL_GENERATED_SOURCES): $(MESA_TOP)/src/vulkan/util/gen_enum_to_str.py 
$(vulkan_api_xml)
+$(LOCAL_GENERATED_SOURCES): $(MESA_TOP)/src/vulkan/util/gen_enum_to_str.py \
+   $(vulkan_api_xml) $(vk_android_native_buffer_xml)
@echo "target Generated: $(PRIVATE_MODULE) <= $(notdir $(@))"
@mkdir -p $(dir $@)
-   $(hide) $(MESA_PYTHON2) $(MESA_TOP)/src/vulkan/util/gen_enum_to_str.py 
--xml $(vulkan_api_xml) --outdir $(dir $@)
+   $(hide) $(MESA_PYTHON2) $(MESA_TOP)/src/vulkan/util/gen_enum_to_str.py \
+   --xml $(vulkan_api_xml) \
+   --xml $(vk_android_native_buffer_xml) \
+   --outdir $(dir $@)
 
 LOCAL_EXPORT_C_INCLUDE_DIRS := \
 $(intermediates)
diff --git a/src/vulkan/Makefile.am b/src/vulkan/Makefile.am
index c897a07d6a8..c92748cb8a1 100644
--- a/src/vulkan/Makefile.am
+++ b/src/vulkan/Makefile.am
@@ -4,6 +4,7 @@ noinst_LTLIBRARIES = libvulkan_wsi.la libvulkan_util.la
 
 vulkan_includedir = $(includedir)/vulkan
 vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
+vk_android_native_buffer_xml = 
$(top_srcdir)/src/vulkan/registry/vk_android_native_buffer.xml
 
 MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
 PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)
@@ -18,9 +19,13 @@ VULKAN_UTIL_SOURCES = \
 BUILT_SOURCES = \
$(VULKAN_UTIL_GENERATED_FILES)
 
-util/vk_enum_to_str.c util/vk_enum_to_str.h: util/gen_enum_to_str.py 
$(vulkan_api_xml)
+util/vk_enum_to_str.c util/vk_enum_to_str.h: util/gen_enum_to_str.py \
+   $(vulkan_api_xml) $(vk_android_native_buffer_xml)
$(MKDIR_GEN)
-   $(PYTHON_GEN) $(srcdir)/util/gen_enum_to_str.py --xml $(vulkan_api_xml) 
--outdir $(top_builddir)/src/vulkan/util
+   $(PYTHON_GEN) $(srcdir)/util/gen_enum_to_str.py \
+   --xml $(vulkan_api_xml) \
+   --xml $(vk_android_native_buffer_xml) \
+   --outdir $(top_builddir)/src/vulkan/util
 
 libvulkan_util_la_SOURCES = $(VULKAN_UTIL_SOURCES)
 
diff --git a/src/vulkan/util/gen_enum_to_str.py 
b/src/vulkan/util/gen_enum_to_str.py
index bc72c189943..df326d0a998 100644
--- a/src/vulkan/util/gen_enum_to_str.py
+++ b/src/vulkan/util/gen_enum_to_str.py
@@ -58,6 +58,7 @@ C_TEMPLATE = Template(textwrap.dedent(u"""\
  */
 
 #include 
+#include 
 #include "util/macros.h"
 #include "vk_enum_to_str.h"
 
@@ -68,8 +69,17 @@ C_TEMPLATE = Template(textwrap.dedent(u"""\
 {
 switch(input) {
 % for v in enum.values:
+% if v in FOREIGN_ENUM_VALUES:
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wswitch"
+% endif
 case ${v}:
 return "${v}";
+% if v in FOREIGN_ENUM_VALUES:
+#pragma GCC diagnostic pop
+
+% endif
 % endfor
 default:
 unreachable("Undefined enum value.");
@@ -89,6 +99,7 @@ H_TEMPLATE = Template(textwrap.dedent(u"""\
 #define MESA_VK_ENUM_TO_STR_H
 
 #include 
+#include 
 
 % for enum in enums:
 const char * vk_${enum.name[2:]}_to_str(${enum.name} input);
@@ -97,6 +108,12 @@ H_TEMPLATE = Template(textwrap.dedent(u"""\
 #endif"""),
 output_encoding='utf-8')
 
+# These enums are defined outside their respective enum blocks, and thus cause
+# -Wswitch warnings.
+FOREIGN_ENUM_VALUES = [
+"VK_STRUCTURE_TYPE_NATIVE_BUFFER_ANDROID",
+]
+
 
 class EnumFactory(object):
 """Factory for creating enums."""
@@ -175,7 +192,8 @@ def main():
 f.write(template.render(
 file=os.path.basename(__file__),
 enums=efactory.registry.values(),
-copyright=COPYRIGHT))
+copyright=COPYRIGHT,
+FOREIGN_ENUM_VALUES=FOREIGN_ENUM_VALUES))
 
 
 if __name__ == '__main__':
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/20] vulkan/registry: Add VK_ANDROID_native_buffer

2017-09-13 Thread Chad Versace
From: Chad Versace 

The VK_ANDROID_native_buffer extension is missing from the official
vk.xml. This patch defines the extension in a separate, minimal XML
file: vk_android_native_buffer.xml.

I chose to add the extension to a new XML file instead of adding it to
the official vk.xml in order to avoid conflicts each time we sync the
vk.xml from Khronos.

This should be only a temporary solution until Jesse Hall is persuaded
to add it to the official vk.xml.
---
 src/Makefile.am  |  2 +
 src/vulkan/registry/vk_android_native_buffer.xml | 52 
 2 files changed, 54 insertions(+)
 create mode 100644 src/vulkan/registry/vk_android_native_buffer.xml

diff --git a/src/Makefile.am b/src/Makefile.am
index e055cb8e701..7ca3c0a12d9 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -72,7 +72,9 @@ endif
 if HAVE_VULKAN_COMMON
 SUBDIRS += vulkan
 endif
+
 EXTRA_DIST += vulkan/registry/vk.xml
+EXTRA_DIST += vulkan/registry/vk_android_native_buffer.xml
 
 if HAVE_AMD_DRIVERS
 SUBDIRS += amd
diff --git a/src/vulkan/registry/vk_android_native_buffer.xml 
b/src/vulkan/registry/vk_android_native_buffer.xml
new file mode 100644
index 000..2738908aa71
--- /dev/null
+++ b/src/vulkan/registry/vk_android_native_buffer.xml
@@ -0,0 +1,52 @@
+
+
+
+
+VkStructureType 
sType
+const void* pNext
+buffer_handle_t handle
+int stride
+int format
+int usage
+
+
+
+
+VkResult 
vkGetSwapchainGrallocUsageANDROID
+VkDevice device
+VkFormat format
+VkImageUsageFlags 
imageUsage
+int* grallocUsage
+
+
+VkResult 
vkAcquireImageANDROID
+VkDevice device
+VkImage image
+int nativeFenceFd
+VkSemaphore semaphore
+VkFence fence
+
+
+VkResult 
vkQueueSignalReleaseImageANDROID
+VkQueue queue
+uint32_t 
waitSemaphoreCount
+const VkSemaphore* 
pWaitSemaphores
+VkImage image
+int* pNativeFenceFd
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/20] vulkan: Add #ifdef hack to vk_android_native_buffer.h

2017-09-13 Thread Chad Versace
From: Chad Versace 

This patch consolidates many potential `#ifdef ANDROID` messes
throughout src/vulkan and src/intel/vulkan into a simple, localized
hack. The hack is an `#ifdef ANDROID` in vk_android_native_buffer.h
that, on non-Android platorms, avoids including the Android platform
headers and typedefs any Android-specific types to void*.

This hack doesn't remove *all* the `#ifdef ANDROID`s in upcoming
patches, but it does remove a lot.

I first tried implementing VK_ANDROID_native_buffer without this hack,
but eventually gave up when the yak shaving became too much.
---
 include/vulkan/vk_android_native_buffer.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/vulkan/vk_android_native_buffer.h 
b/include/vulkan/vk_android_native_buffer.h
index d0ebf81353f..a658b49243d 100644
--- a/include/vulkan/vk_android_native_buffer.h
+++ b/include/vulkan/vk_android_native_buffer.h
@@ -17,8 +17,13 @@
 #ifndef __VK_ANDROID_NATIVE_BUFFER_H__
 #define __VK_ANDROID_NATIVE_BUFFER_H__
 
+/* MESA: A hack to avoid #ifdefs in driver code. */
+#ifdef ANDROID
 #include 
 #include 
+#else
+typedef void *buffer_handle_t;
+#endif
 
 #ifdef __cplusplus
 extern "C" {
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/20] vulkan: Import vk_android_native_buffer.h

2017-09-13 Thread Chad Versace
From: Chad Versace 

Just as Mesa imports the Khronos Vulkan headers, it should import this
Android-private Vulkan header too. This guarantees that Mesa will
continue to build even when upstream Android breaks header
compatibility.

This header is only for *implementers* of Vulkan, not for consumers of
Vulkan.

Imported from tag 'android-7.1.1_r28' in aosp/frameworks/native.

References: 
https://android.googlesource.com/platform/frameworks/native/+/android-7.1.1_r28/vulkan/include/vulkan/vk_android_native_buffer.h
---
 include/vulkan/vk_android_native_buffer.h | 91 +++
 1 file changed, 91 insertions(+)
 create mode 100644 include/vulkan/vk_android_native_buffer.h

diff --git a/include/vulkan/vk_android_native_buffer.h 
b/include/vulkan/vk_android_native_buffer.h
new file mode 100644
index 000..d0ebf81353f
--- /dev/null
+++ b/include/vulkan/vk_android_native_buffer.h
@@ -0,0 +1,91 @@
+/*
+ * Copyright 2015 The Android Open Source Project
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef __VK_ANDROID_NATIVE_BUFFER_H__
+#define __VK_ANDROID_NATIVE_BUFFER_H__
+
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define VK_ANDROID_native_buffer 1
+
+#define VK_ANDROID_NATIVE_BUFFER_EXTENSION_NUMBER 11
+#define VK_ANDROID_NATIVE_BUFFER_SPEC_VERSION 5
+#define VK_ANDROID_NATIVE_BUFFER_EXTENSION_NAME   "VK_ANDROID_native_buffer"
+
+#define VK_ANDROID_NATIVE_BUFFER_ENUM(type,id)((type)(10 + (1000 * 
(VK_ANDROID_NATIVE_BUFFER_EXTENSION_NUMBER - 1)) + (id)))
+#define VK_STRUCTURE_TYPE_NATIVE_BUFFER_ANDROID   
VK_ANDROID_NATIVE_BUFFER_ENUM(VkStructureType, 0)
+
+typedef struct {
+VkStructureType sType; // must be 
VK_STRUCTURE_TYPE_NATIVE_BUFFER_ANDROID
+const void* pNext;
+
+// Buffer handle and stride returned from gralloc alloc()
+buffer_handle_t handle;
+int stride;
+
+// Gralloc format and usage requested when the buffer was allocated.
+int format;
+int usage;
+} VkNativeBufferANDROID;
+
+typedef VkResult (VKAPI_PTR *PFN_vkGetSwapchainGrallocUsageANDROID)(VkDevice 
device, VkFormat format, VkImageUsageFlags imageUsage, int* grallocUsage);
+typedef VkResult (VKAPI_PTR *PFN_vkAcquireImageANDROID)(VkDevice device, 
VkImage image, int nativeFenceFd, VkSemaphore semaphore, VkFence fence);
+typedef VkResult (VKAPI_PTR *PFN_vkQueueSignalReleaseImageANDROID)(VkQueue 
queue, uint32_t waitSemaphoreCount, const VkSemaphore* pWaitSemaphores, VkImage 
image, int* pNativeFenceFd);
+
+#ifndef VK_NO_PROTOTYPES
+VKAPI_ATTR VkResult VKAPI_CALL vkGetSwapchainGrallocUsageANDROID(
+VkDevicedevice,
+VkFormatformat,
+VkImageUsageFlags   imageUsage,
+int*grallocUsage
+);
+VKAPI_ATTR VkResult VKAPI_CALL vkAcquireImageANDROID(
+VkDevicedevice,
+VkImage image,
+int nativeFenceFd,
+VkSemaphore semaphore,
+VkFence fence
+);
+VKAPI_ATTR VkResult VKAPI_CALL vkQueueSignalReleaseImageANDROID(
+VkQueue queue,
+uint32_twaitSemaphoreCount,
+const VkSemaphore*  pWaitSemaphores,
+VkImage image,
+int*pNativeFenceFd
+);
+// -- DEPRECATED --
+VKAPI_ATTR VkResult VKAPI_CALL vkImportNativeFenceANDROID(
+VkDevicedevice,
+VkSemaphore semaphore,
+int nativeFenceFd
+);
+VKAPI_ATTR VkResult VKAPI_CALL vkQueueSignalNativeFenceANDROID(
+VkQueue queue,
+int*pNativeFenceFd
+);
+// 
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif // __VK_ANDROID_NATIVE_BUFFER_H__
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/20] anv: Teach generator scripts how to parse mutliple XML files

2017-09-13 Thread Chad Versace
From: Chad Versace 

The taught scripts are anv_extensions.py and anv_entrypoints_gen.py.  To
give a script multiple XML files, call it like so:

anv_extensions.py --xml a.xml --xml b.xml --xml c.xml ...

The scripts parse the XML files in the given order.

This will allow us to feed the scripts XML files for extensions that are
missing from the official vk.xml, such as VK_ANDROID_native_buffer.
---
 src/intel/vulkan/anv_entrypoints_gen.py | 18 +-
 src/intel/vulkan/anv_extensions.py  | 15 ++-
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/src/intel/vulkan/anv_entrypoints_gen.py 
b/src/intel/vulkan/anv_entrypoints_gen.py
index 3240704a0d8..983be09a39b 100644
--- a/src/intel/vulkan/anv_entrypoints_gen.py
+++ b/src/intel/vulkan/anv_entrypoints_gen.py
@@ -253,7 +253,7 @@ def cal_hash(name):
 lambda h, c: (h * PRIME_FACTOR + ord(c)) & U32_MASK, name, 0)
 
 
-def get_entrypoints(doc, entrypoints_to_defines):
+def get_entrypoints(doc, entrypoints_to_defines, start_index):
 """Extract the entry points from the registry."""
 entrypoints = []
 
@@ -275,7 +275,7 @@ def get_entrypoints(doc, entrypoints_to_defines):
 for command in extension.findall('./require/command'):
 enabled_commands.add(command.attrib['name'])
 
-index = 0
+index = start_index
 for command in doc.findall('./commands/command'):
 type = command.find('./proto/type').text
 fullname = command.find('./proto/name').text
@@ -344,11 +344,19 @@ def main():
 parser = argparse.ArgumentParser()
 parser.add_argument('--outdir', help='Where to write the files.',
 required=True)
-parser.add_argument('--xml', help='Vulkan API XML file.', required=True)
+parser.add_argument('--xml',
+help='Vulkan API XML file.',
+required=True,
+action='append',
+dest='xml_files')
 args = parser.parse_args()
 
-doc = et.parse(args.xml)
-entrypoints = get_entrypoints(doc, get_entrypoints_defines(doc))
+entrypoints = []
+
+for filename in args.xml_files:
+doc = et.parse(filename)
+entrypoints += get_entrypoints(doc, get_entrypoints_defines(doc),
+   start_index=len(entrypoints))
 
 # Manually add CreateDmaBufImageINTEL for which we don't have an extension
 # defined.
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index acec785959b..d995f9f177c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -138,9 +138,6 @@ def _init_exts_from_xml(xml):
 
 ext.type = ext_elem.attrib['type']
 
-for ext in EXTENSIONS:
-assert ext.type == 'instance' or ext.type == 'device'
-
 _TEMPLATE = Template(COPYRIGHT + """
 #include "anv_private.h"
 
@@ -232,10 +229,18 @@ VkResult anv_EnumerateDeviceExtensionProperties(
 if __name__ == '__main__':
 parser = argparse.ArgumentParser()
 parser.add_argument('--out', help='Output C file.', required=True)
-parser.add_argument('--xml', help='Vulkan API XML file.', required=True)
+parser.add_argument('--xml',
+help='Vulkan API XML file.',
+required=True,
+action='append',
+dest='xml_files')
 args = parser.parse_args()
 
-_init_exts_from_xml(args.xml)
+for filename in args.xml_files:
+_init_exts_from_xml(filename)
+
+for ext in EXTENSIONS:
+assert ext.type == 'instance' or ext.type == 'device'
 
 template_env = {
 'MAX_API_VERSION': MAX_API_VERSION,
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/23] anv: Implement VK_ANDROID_native_buffer (v2)

2017-09-13 Thread Chad Versace
From: Chad Versace 

This series adds Android support to Anvil. And Android requires
VK_ANDROID_native_buffer.


I tested the series on 64-bit ARC++ on a Skylake Chromebook with a 3.18
kernel. (ARC++ is the Android container in Chrome OS). (Yes, I said 3.18.
That's not a typo).


Here's my test results:

  - A little, spinning cube demo APK works, proving that the window
system integration is, if not fully correct, at least good enough.

  - On Linux (non-Android), this patch series causes no regressions in
VK-GL-CTS (master@dfcb8e87) under Wayland.

  - On Android, in patchset v1, dEQP from
android-cts-7.1_r8-linux_x86-x86 had 1 failure. I'm still waiting
for the results for patchset v2.

 Passed: 66889/81423 (82.2%)
 Failed: 1/81423 (0.0%)
 Not supported:  14529/81423 (17.8%)
 Warnings:   4/81423 (0.0%)

 The sole failure was
 
dEQP-VK.pipeline.image_view.view_type.cube_array.format.r16g16_sint.component_swizzle.one_r_g_zero


The exact code tested, and this patch series, live at:
tag  chadv/review/arc-vulkan-v02
url  git://git.kiwitree.net/~chadv/mesa
cgit 
http://git.kiwitree.net/cgit/~chadv/mesa/tag/?h=chadv/review/arc-vulkan-v02


Note that Mesa's ARC++ build uses Autotools, not Android.mk. This series
probably broke the Android.mk files. Someone, please review and test
that.


Each patch but the last is just prep. You should read the last patch
first to better understand the goal, then afterwards review the series
linearly from the beginning.


v2:
  - A few patches have already landed on master.
  - Drop the ugly flag params from anv_bo_cache_import(), for jekstrand.
  - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
the VkImage.
  - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
during vkCreateImage. Instead, directly import its dma-buf during
vkCreateImage with anv_bo_cache_import(). [for jekstrand]
  - Rebase onto Tapani's VK_EXT_debug_report changes.


Chad Versace (20):
  vulkan: Import vk_android_native_buffer.h
  vulkan: Add #ifdef hack to vk_android_native_buffer.h
  vulkan/registry: Add VK_ANDROID_native_buffer
  vulkan/util: Teach gen_enum_to_str.py to parse mutliple XML files
  vulkan/registry: Feed vk_android_native_buffer.xml to
gen_enum_to_str.py
  anv: Teach generator scripts how to parse mutliple XML files
  anv: Feed vk_android_native_buffer.xml to generators
  anv/android: Disable surface and swapchain extensions (v2)
  anv: Link to Android libraries in the autotools build
  intel: Move definition of LOG_TAG from header into Makefiles
  intel: Add simple logging façade for Android
  anv: Better support for Android logging (v2)
  anv/image: Better var names for image-create-info structs
  anv/image: Refactor how tiling is chosen
  anv/image: Refactor creation of aux surfaces (v2)
  anv: Add field anv_image::bo_is_owned
  anv: Move close(fd) from anv_bo_cache_import to its callers
  anv: Add sizeless anv_bo_cache_import()
  anv: Add func anv_gem_get_tiling()
  anv: Implement VK_ANDROID_native_buffer (v2)

 include/vulkan/vk_android_native_buffer.h|  96 +
 src/Makefile.am  |   2 +
 src/intel/Android.blorp.mk   |   2 +
 src/intel/Android.common.mk  |   2 +
 src/intel/Android.vulkan.mk  |  21 +-
 src/intel/Makefile.am|   3 +-
 src/intel/Makefile.sources   |   7 +-
 src/intel/Makefile.vulkan.am |  24 +-
 src/intel/common/gen_debug.h |   1 -
 src/intel/common/intel_log.c |  87 +
 src/intel/common/intel_log.h |  82 +
 src/intel/vulkan/anv_allocator.c | 126 ---
 src/intel/vulkan/anv_android.c   | 245 +
 src/intel/vulkan/anv_device.c|  41 ++-
 src/intel/vulkan/anv_entrypoints_gen.py  |  28 +-
 src/intel/vulkan/anv_extensions.py   |  51 ++-
 src/intel/vulkan/anv_gem.c   |  16 +
 src/intel/vulkan/anv_image.c | 443 ---
 src/intel/vulkan/anv_intel.c |   6 +-
 src/intel/vulkan/anv_private.h   |  18 +-
 src/intel/vulkan/anv_queue.c |   5 +-
 src/intel/vulkan/anv_util.c  |  21 +-
 src/intel/vulkan/genX_cmd_buffer.c   |   4 +-
 src/mesa/drivers/dri/i965/Android.mk |  23 +-
 src/mesa/drivers/dri/i965/Makefile.am|   1 +
 src/vulkan/Android.mk|   9 +-
 src/vulkan/Makefile.am   |   9 +-
 src/vulkan/registry/vk_android_native_buffer.xml |  52 +++
 src/vulkan/util/gen_enum_to_str.py   |  39 +-
 29 files changed, 1197 insertions(+), 267 deletions(-)
 create mode 100644 

Re: [Mesa-dev] [PATCH] glsl: do not set the 'smooth' qualifier by default on ES shaders

2017-09-13 Thread Timothy Arceri



On 14/09/17 01:22, Nicolai Hähnle wrote:

On 12.09.2017 03:33, Timothy Arceri wrote:



On 12/09/17 00:57, Nicolai Hähnle wrote:

On 11.09.2017 16:47, Ilia Mirkin wrote:
On Mon, Sep 11, 2017 at 10:44 AM, Nicolai Hähnle 
 wrote:

From: Nicolai Hähnle 

It breaks integer inputs and outputs on vertex processing stages
(e.g. geometry stages). Instead, rely on the driver to choose
smooth interpolation by default.
---

How about this instead? I haven't fully thought it through, but
that's where the INTERP_MODE_SMOOTH is coming from.

To be honest I'm not 100% sure about what INTERP_MODE_NONE can
mean, but I think it's definitely better than removing the (very
valid!) assertion in the packing code.


I couldn't understand what that assertion was asserting, esp in the
presence of struct in/out variables.


structs can't have per-member interpolation qualifiers. So a struct 
that contains integer values must have FLAT or NONE interpolation as 
a whole.


That clashes pretty hard with that ES rule, though ("When no 
interpolation qualifier is present, smooth interpolation is used.").
 From what I recall, NONE is just a placeholder so that the the driver 
knows to use the value set by glShadeModel(). ES doesn't have this and 
so we set this up front. I believe removing it will cause a bunch of 
tests to fail validation as they expect the default to be set and 
cross validated against stage interfaces.


So in a way, NONE just means "use the API default". The API default is 
set by glShadeModel() in compatibility profiles. In both ES and core 
profiles it's always "smooth" -- but only ES has this particular 
language about using "smooth" by default in its shading language spec.


Am I really the only one weirded out by seeing integer I/O that claims 
to use smooth interpolation?


I'd much rather not set smooth, as my patch proposes, and if there are 
tests that trigger linking errors, relax those to allow "smooth" and 
"none" to match.


They are CTS tests. Feel free to do it an alternative way but I don't 
think this patch alone is enough, the tests seem valid enough to me.






I'm also a bit confused as to what the assert() is trying to catch.


It's catching the weirdness of integer I/O that claims to use smooth 
interpolation. Admittedly, the comment is wrong and this is not related 
to varying packing, but I guarantee that people will be confused about 
this in the future.


Cheers,
Nicolai






On second thought, that assertion is actually irrelevant for varying 
packing itself, but it's still pretty disturbing that it's broken by 
this particular shader.


Cheers,
Nicolai






---
  src/compiler/glsl/ast_to_hir.cpp | 11 ---
  1 file changed, 11 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp 
b/src/compiler/glsl/ast_to_hir.cpp

index 98d2f94e129..cb42041642d 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -3119,31 +3119,20 @@ interpret_interpolation_qualifier(const 
struct ast_type_qualifier *qual,
    struct _mesa_glsl_parse_state 
*state,

    YYLTYPE *loc)
  {
 glsl_interp_mode interpolation;
 if (qual->flags.q.flat)
    interpolation = INTERP_MODE_FLAT;
 else if (qual->flags.q.noperspective)
    interpolation = INTERP_MODE_NOPERSPECTIVE;
 else if (qual->flags.q.smooth)
    interpolation = INTERP_MODE_SMOOTH;
-   else if (state->es_shader &&
-    ((mode == ir_var_shader_in &&
-  state->stage != MESA_SHADER_VERTEX) ||
- (mode == ir_var_shader_out &&
-  state->stage != MESA_SHADER_FRAGMENT)))
-  /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says:
-   *
-   *    "When no interpolation qualifier is present, smooth 
interpolation

-   *    is used."
-   */
-  interpolation = INTERP_MODE_SMOOTH;
 else
    interpolation = INTERP_MODE_NONE;

 validate_interpolation_qualifier(state, loc,
  interpolation,
  qual, var_type, mode);

 return interpolation;
  }

--
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-13 Thread Marek Olšák
I pushed the patch. Thanks.

Marek

On Wed, Sep 13, 2017 at 9:27 PM, Denis Pauk  wrote:
> Could you please commit this changes?
>
> I have not found what i need to do next after recieve "approve" for changes
> in https://www.mesa3d.org/submittingpatches.html#reviewing.
> Do i need to send new mail with "Reviewed-by: " mark?
>
> On Wed, Sep 13, 2017 at 6:36 PM, Marek Olšák  wrote:
>>
>> I've changed my mind. The patch is OK:
>>
>> Reviewed-by: Marek Olšák 
>>
>> Marek
>>
>> On Wed, Sep 13, 2017 at 4:45 PM, Denis Pauk  wrote:
>> > What do you think about replace all checks in patch to asserts?
>> >
>> >
>> > Best regards,
>> >   Denis.
>> >
>> > On Sep 13, 2017 1:00 PM, "Marek Olšák"  wrote:
>> >>
>> >> On Wed, Sep 13, 2017 at 6:54 AM, Денис Паук 
>> >> wrote:
>> >> > Do you mean delete check in u_format.c:: util_format_is_supported?
>> >> > Could
>> >> > you
>> >> > please explain more?
>> >>
>> >> I mean delete everything. Invalid formats shouldn't be passed to
>> >> pipe_screen::is_format_supported. The code doing it needs to be fixed.
>> >>
>> >> Marek
>
>
>
>
> --
> Best regards,
>   Denis.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 03/15] i965: Use a WC map and memcpy for the batch instead of pwrite.

2017-09-13 Thread Matt Turner

On 09/13, Kenneth Graunke wrote:

We'd like to eliminate the malloc'd shadow copy eventually, but there
are still unresolved performance problems.  In the meantime, let's at
least get rid of pwrite.


I don't know much about this. What is wrong with pwrite, and why is WC
better?

Does this change performance?


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102573] fails to build on armel

2017-09-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102573

--- Comment #2 from Shmerl  ---
(In reply to Emil Velikov from comment #1)
> I don't have a armel setup handy, so if
> anyone anyone can polish/fix the test and send a patch, that would be
> appreciated.

It might be possible to set up without hardware, just using Qemu.

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/19] anv: Use GNU C empty brace initializer

2017-09-13 Thread Ian Romanick
On 08/29/2017 10:19 AM, Emil Velikov wrote:
> On 29 August 2017 at 18:10, Matt Turner  wrote:
>> On Tue, Aug 29, 2017 at 3:35 AM, Emil Velikov  
>> wrote:
>>> On 29 August 2017 at 11:11, Eric Engestrom  
>>> wrote:
 On Monday, 2017-08-28 14:57:13 -0700, Matt Turner wrote:
> Avoids Clang's warning about the current code:
>
>warning: suggest braces around initialization of subobject

 I'm not sure about this patch [1], but everything else in this series is:
>>> IIRC "{}" is the least likely way to avoid the warnings, across
>>> GCC/Clang versions.
>>> It's not part of the C spec, but since it works - not sure how much to 
>>> bother.
>>
>> I'm surprised to hear you say that. I've tested with gcc-4.9.4, 6.4.0,
>> 7.1.0; and clang-4.0, and none of them warn about {}
>>
> Eric corrected me on IRC, but I forgot to mention here - "s/least/most/"
> 
>> I think when we went through this before (in NIR) we couldn't use {}
>> because it needs to compile with MSVC, and MSVC doesn't allow that in
>> C.
>>
> Precisely. The "most" portable solution that I know of is memset().

Except memset generates different code.  That will clear the padding
area between fields, and x = {} (and variations there of) do not.

>>> Would be great to have the issue reported (and fixed) in Clang though.
>>> AFAICT both {0} and {0,} are valid, if memory serves me right.
>>
>> I don't really understand the nuances of {0}, {0,}, {{0}}, etc, and I
>> get the sense that that's the case for nearly everyone else as well.
>>
> AFAICT {0} and {0,} should just work everywhere, modulo bugs.

And I believe most of these generate warnings if you enable other
warning options (see also this patch series
https://patchwork.freedesktop.org/patch/68141/).

> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
> [2] https://bugs.llvm.org/show_bug.cgi?id=21689
> 
>> The most prominent bug I see in all of this is that {} isn't part of
>> standard C (it is part of standard C++!)
> Agreed. Not sure how easy it will be to convince the committee.
> 
> -Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] intel/blorp/hiz: Always set sample number

2017-09-13 Thread Matt Turner

On 09/11, Topi Pohjolainen wrote:

Signed-off-by: Topi Pohjolainen 


Patches 2-3 (because I don't know a lot about this code) are

Acked-by: Matt Turner 

Chromium has been disabling HiZ on Braswell for stability reasons
(trying to find out more information, but I currently don't have
permissions to view their bug tracker). Not sure if they have the
bandwidth to test these patches, but Cc'ing Chad in case they do.


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] i965/gen8: Remove unused gen8_emit_3dstate_multisample()

2017-09-13 Thread Matt Turner

On 09/11, Topi Pohjolainen wrote:

Signed-off-by: Topi Pohjolainen 


Reviewed-by: Matt Turner 


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH shader-db 3/3] run: add extension_in_string() helper

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 22:13, Matt Turner  wrote:
> On Mon, Aug 21, 2017 at 3:27 AM, Emil Velikov  
> wrote:
>> From: Emil Velikov 
>>
>> memmem() does not attribute what the character after the searched string
>> is. Thus it will flag even when haystack is "foobar" while we're looking
>> for "foo".
>>
>> Pull a small helper (based on piglit) that correctly handles this and use
>> it.
>>
>> Note: when parsing through the shader we have a non-zero terminated
>> needle, let's keep the memmem in there for now.
>>
>> Signed-off-by: Emil Velikov 
>> ---
>>  run.c | 43 +++
>>  1 file changed, 35 insertions(+), 8 deletions(-)
>>
>> diff --git a/run.c b/run.c
>> index d0d4598..0691d33 100644
>> --- a/run.c
>> +++ b/run.c
>> @@ -69,6 +69,35 @@ struct shader {
>>  int type;
>>  };
>>
>> +static bool
>> +extension_in_string(const char *haystack, const char *needle)
>> +{
>> +const unsigned needle_len = strlen(needle);
>> +
>> +if (needle_len == 0)
>> +return false;
>> +
>> +while (true) {
>> +const char *const s = strstr(haystack, needle);
>> +
>> +if (s == NULL)
>> +return false;
>> +
>> +if (s[needle_len] == ' ' || s[needle_len] == '\0')
>> +return true;
>> +
>> +/* strstr found an extension whose name begins with
>> + * needle, but whose name is not equal to needle.
>> + * Restart the search at s + needle_len so that we
>> + * don't just find the same extension again and go
>> + * into an infinite loop.
>> + */
>> +haystack = s + needle_len;
>> +}
>> +
>> +return false;
>> +}
>> +
>>  static struct shader *
>>  get_shaders(const struct context_info *core, const struct context_info 
>> *compat,
>>  const char *text, size_t text_size,
>> @@ -141,8 +170,8 @@ get_shaders(const struct context_info *core, const 
>> struct context_info *compat,
>>  extension_text += 1;
>>  const char *newline = memchr(extension_text, '\n',
>>   end_text - extension_text);
>> -if (memmem(info->extension_string, info->extension_string_len,
>> -   extension_text, newline - extension_text) == NULL) {
>> +if (memmem(info->extension_string, info->extension_string_len,
>> +   extension_text, newline - extension_text) == NULL) {
>>  fprintf(stderr, "SKIP: %s requires unavailable extension 
>> %.*s\n",
>>  shader_name, (int)(newline - extension_text), 
>> extension_text);
>>  return NULL;
>> @@ -415,7 +444,7 @@ main(int argc, char **argv)
>>  return -1;
>>  }
>>
>> -if (!strstr(client_extensions, "EGL_MESA_platform_gbm")) {
>> +if (!extension_in_string(client_extensions, "EGL_MESA_platform_gbm")) {
>>  fprintf(stderr, "ERROR: Missing EGL_MESA_platform_gbm\n");
>>  return -1;
>>  }
>> @@ -458,7 +487,7 @@ main(int argc, char **argv)
>>  };
>>  for (int i = 0; i < ARRAY_SIZE(egl_extension); i++) {
>>  const char *extension_string = eglQueryString(egl_dpy, 
>> EGL_EXTENSIONS);
>> -if (strstr(extension_string, egl_extension[i]) == NULL) {
>> +if (!extension_in_string(extension_string, egl_extension[i])) {
>>  fprintf(stderr, "ERROR: Missing %s\n", egl_extension[i]);
>>  ret = -1;
>>  goto egl_terminate;
>> @@ -530,8 +559,7 @@ main(int argc, char **argv)
>>
>>  core.max_glsl_version = get_glsl_version();
>>
>> -if (memmem(core.extension_string, core.extension_string_len,
>> -   "GL_KHR_debug", strlen("GL_KHR_debug")) == NULL) {
>> +if (!extension_in_string(core.extension_string, "GL_KHR_debug")) {
>
> I don't know what happened between this patch and getting committed,
> but what was committed contained a copy-and-paste mistake here (used
> compat.extension_string, which is NULL rather than
> core.extension_string) so run always segfaulted!
>
> I pushed a fix. Kind of irritating that this wasn't fixed in the week
> since it was reported.
Sincere apologies for that one - must have fat fingered while rebasing.

I was staring at it the other day and could spot it.
Even asked Tim for a sample shader.

-Emil (looking for a brown paper bag)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 23/23] anv: Implement VK_ANDROID_native_buffer

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 22:15, Chad Versace  wrote:
>> > +
>> > +static void UNUSED
>> > +static_asserts(void)
>> > +{
>> > +   STATIC_ASSERT(HWVULKAN_DISPATCH_MAGIC == ICD_LOADER_MAGIC);
>> > +}
>
>> Seems like a left over? With that gone, there should be no need for vk_icd.h.
>
> I intended to leave this in because throughout anvil we do
>
>   api_object->_loader_data.loaderMagic = ICD_LOADER_MAGIC
>
> and that needs to work correctly even on Android. The static assertion
> here will give us ample warning if the two values ever change.
>
Right, my bad.

>> > +PUBLIC struct hwvulkan_module_t HAL_MODULE_INFO_SYM = {
>> Unrelated: any ideas why these are not const? All the HAL exports I've
>> seen always lack the notation.
>
> Surprisingly, the HAL API expects it to be non-const in
> hw_module_method_t. The 'device' output parameter to
> hw_module_method_t::open() is non-const, and it contains a reference to
> hw_module_t. Declaring HAL_MODULE_INFO_SYM as const produces compiler
> warnings (or errors, depending on CFLAGS) in the body of anv_hal_open().
>
> As precedent, Chrome OS's gralloc HAL (not written by me) also declares
> HAL_MODULE_INFO_SYM as non-const.
>
Right. I'm a little concerned since the HALs contain a bunch of
function pointers, which one can override and perhaps abuse. That's a
topic for another day though.

>
>>
>> > +VkResult
>> > +anv_QueueSignalReleaseImageANDROID(
>> > +  VkQueue queue,
>> > +  uint32_twaitSemaphoreCount,
>> > +  const VkSemaphore*  pWaitSemaphores,
>> > +  VkImage image,
>> > +  int*pNativeFenceFd)
>> > +{
>> > +   VkResult result;
>> > +
>> > +   if (waitSemaphoreCount == 0)
>> > +  goto done;
>> > +
>> > +   result = anv_QueueSubmit(queue,
>> > +  1, (VkSubmitInfo[]) {
>> Nit: Please stay consistent with other anv_QueueSubmit users.
>>
>> Namely, move "1," on the previous line and use a pointer
>> &(VkSubmitInfo), dropping the sentinel.
>
> Huh. My style used to be the dominant style in anvil, before the meta
> paths were deleted. And it continues to be pervasive in radv.
>
> I don't care right now, though. I'll change it.
>
Apologies, did not mean to rock the boat :-\

> By the way, I don't know what sentinel you refer to here. I don't see
> a sentinel.
>
My lack of formal CS education (Electronics Engineer here) shows...

>
>> > + {
>> > +.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
>> > +.waitSemaphoreCount = 1,
>> > +.pWaitSemaphores = pWaitSemaphores,
>> > + },
>> > +  },

>> > +  (VkFence) VK_NULL_HANDLE);
... I meant this instance here.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa 2/6] dri/radeon: use ARRAY_SIZE macro

2017-09-13 Thread Ian Romanick
Reviewed-by: Ian Romanick 

On 09/07/2017 03:21 AM, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom 
> ---
>  src/mesa/drivers/dri/radeon/radeon_tcl.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/radeon/radeon_tcl.c 
> b/src/mesa/drivers/dri/radeon/radeon_tcl.c
> index 3e2f426160..61ff2311e9 100644
> --- a/src/mesa/drivers/dri/radeon/radeon_tcl.c
> +++ b/src/mesa/drivers/dri/radeon/radeon_tcl.c
> @@ -39,6 +39,8 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
> SOFTWARE.
>  #include "main/enums.h"
>  #include "main/state.h"
>  
> +#include "util/macros.h"
> +
>  #include "vbo/vbo.h"
>  #include "tnl/tnl.h"
>  #include "tnl/t_pipeline.h"
> @@ -297,7 +299,7 @@ static GLuint radeonEnsureEmitSize( struct gl_context * 
> ctx , GLuint inputs )
>  VERT_BIT_FOG
>};
>/* predict number of aos to emit */
> -  for (i=0; i < sizeof(flags_to_check)/sizeof(flags_to_check[0]); ++i)
> +  for (i=0; i < ARRAY_SIZE(flags_to_check); ++i)
>{
>  if (inputs & flags_to_check[i])
>++nr_aos;
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] automake: enable libunwind in `make distcheck'

2017-09-13 Thread Matt Turner

On 09/11, Emil Velikov wrote:

From: Emil Velikov 

Enable the toggle to catch when the library is missing from the link
path. Better to test, fail and address before releasing Mesa ;-)

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Emil Velikov 
---


Yes, please!

Reviewed-by: Matt Turner 


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 20:27, Denis Pauk  wrote:
> Could you please commit this changes?
>
> I have not found what i need to do next after recieve "approve" for changes
> in https://www.mesa3d.org/submittingpatches.html#reviewing.
> Do i need to send new mail with "Reviewed-by: " mark?
>
Hmm nicely spotted - I thought we had a section about it. How about
something like the following.
Can you please beat it into shape and send as a patch?

Pushing patches

After the patch is reviewed, ideally, there will be $PERIOD (open for
discussion) for other developers to review the patch.
Patches that fix serious regressions (fails to build on multiple
platforms, most of piglit/CTS fails, etc.) are likely to be pushed on
a shorter notice.

Reviewed patches, are pushed to the upstream repository $REF
(https://www.mesa3d.org/repository.html#anonymous) by developers with
commit access $REF (https://www.mesa3d.org/repository.html#developer).
If you do not have access, clearly state so - normally after the ---
line as part of the patch submission.

In such a case the reviewer should push the patch.

Do poke developers if you notice that your patch has not been pushed
after $PERIOD of being reviewed.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Implement another VF cache invalidate workaround on Gen8+.

2017-09-13 Thread Anuj Phogat
Ken, Looks like this patch never landed upstream? We also need it for CNL.

On Mon, Jan 30, 2017 at 10:59 AM, Pohjolainen, Topi
 wrote:
> On Sun, Jan 29, 2017 at 08:24:16PM -0800, Kenneth Graunke wrote:
>> ...and provide a better citation for the existing one.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 32 
>> +---
>>  1 file changed, 24 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
>> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> index b8f740640f2..3e08841e0a9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> @@ -118,14 +118,30 @@ brw_emit_pipe_control_flush(struct brw_context *brw, 
>> uint32_t flags)
>>if (brw->gen == 8)
>>   gen8_add_cs_stall_workaround_bits();
>>
>> -  if (brw->gen == 9 &&
>> -  (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE)) {
>> - /* Hardware workaround: SKL
>> -  *
>> -  * Emit Pipe Control with all bits set to zero before emitting
>> -  * a Pipe Control with VF Cache Invalidate set.
>> -  */
>> - brw_emit_pipe_control_flush(brw, 0);
>> +  if (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE) {
>> + if (brw->gen >= 9) {
>> +/* The PIPE_CONTROL "VF Cache Invalidation Enable" bit 
>> description
>> + * lists several workarounds:
>> + *
>> + * "Projects: SKL, KBL, BXT
>> + *  If the VF Cache Invalidation Enable is set to a 1 in a
>> + *  PIPE_CONTROL, a separate Null PIPE_CONTROL, all bitfields 
>> sets
>> + *  to 0, with the VF Cache Invalidation Enable set to 0 needs 
>> to
>> + *  be sent prior to the PIPE_CONTROL with VF Cache Invalidation
>> + *  Enable set to a 1."
>> + */
>> +brw_emit_pipe_control_flush(brw, 0);
>> +
>> +/* "Projects: BDW+
>> + *  When VF Cache Invalidate is set ???Post Sync Operation??? 
>> must
>> + *  be enabled to ???Write Immediate Data??? or ???Write PS 
>> Depth Count???
>> + *  or ???Write Timestamp???.
>> + */
>> +brw_emit_pipe_control_write(brw,
>> +flags | 
>> PIPE_CONTROL_WRITE_IMMEDIATE,
>> +brw->workaround_bo, 0, 0, 0);
>
> Your title says gen8+, this is still within the gen9+ block. Did you mean to
> have it also for gen8?
>
>> +return;
>> + }
>>}
>>
>>BEGIN_BATCH(6);
>> --
>> 2.11.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 23/23] anv: Implement VK_ANDROID_native_buffer

2017-09-13 Thread Chad Versace
On Wed 13 Sep 2017, Emil Velikov wrote:
> Hi Chad,
> 
> On the topic of keep this in the driver vs separate module - I think
> it makes sense to keep it internal.
> 
> Sure radv/others won't be able to reuse the code [that easily], at the
> same time it gives greater control and one could make more optimal
> decisions.
> 
> On 2 September 2017 at 09:17, Chad Versace  wrote:
> 
> > diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> > index f6a69f65455..e4a371cbe3f 100644
> > --- a/src/intel/Makefile.sources
> > +++ b/src/intel/Makefile.sources
> > @@ -228,6 +228,9 @@ VULKAN_FILES := \
> > vulkan/anv_wsi.c \
> > vulkan/vk_format_info.h
> >
> > +VULKAN_ANDROID_FILES := \
> > +   vulkan/anv_android.c
> > +
> >  VULKAN_WSI_WAYLAND_FILES := \
> > vulkan/anv_wsi_wayland.c
> >
> > diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
> > index d1b1132ed2e..bc4b12768ad 100644
> > --- a/src/intel/Makefile.vulkan.am
> > +++ b/src/intel/Makefile.vulkan.am
> > @@ -147,8 +147,12 @@ VULKAN_LIB_DEPS = \
> > -lm
> >
> >  if HAVE_PLATFORM_ANDROID
> > +VULKAN_CPPFLAGS += \
> > +   -I$(top_srcdir)/include/android \
> > +   $(ANDROID_CPPFLAGS)
> >  VULKAN_CFLAGS += $(ANDROID_CFLAGS)
> >  VULKAN_LIB_DEPS += $(ANDROID_LIBS)
> > +VULKAN_SOURCES += $(VULKAN_ANDROID_FILES)
> >  endif
> >
> I think we'll need a simialar hunk for Android. Once people are
> settled on the approach myself or Tapani can propose something.
> 
> 
> >  if HAVE_PLATFORM_X11
> > diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
> > new file mode 100644
> > index 000..24274510cd6
> > --- /dev/null
> > +++ b/src/intel/vulkan/anv_android.c
> 
> > +#include 
> 
> > +
> > +static void UNUSED
> > +static_asserts(void)
> > +{
> > +   STATIC_ASSERT(HWVULKAN_DISPATCH_MAGIC == ICD_LOADER_MAGIC);
> > +}

> Seems like a left over? With that gone, there should be no need for vk_icd.h.

I intended to leave this in because throughout anvil we do

  api_object->_loader_data.loaderMagic = ICD_LOADER_MAGIC

and that needs to work correctly even on Android. The static assertion
here will give us ample warning if the two values ever change.

> > +PUBLIC struct hwvulkan_module_t HAL_MODULE_INFO_SYM = {
> Unrelated: any ideas why these are not const? All the HAL exports I've
> seen always lack the notation.

Surprisingly, the HAL API expects it to be non-const in
hw_module_method_t. The 'device' output parameter to
hw_module_method_t::open() is non-const, and it contains a reference to
hw_module_t. Declaring HAL_MODULE_INFO_SYM as const produces compiler
warnings (or errors, depending on CFLAGS) in the body of anv_hal_open().

As precedent, Chrome OS's gralloc HAL (not written by me) also declares
HAL_MODULE_INFO_SYM as non-const.


> 
> > +VkResult
> > +anv_QueueSignalReleaseImageANDROID(
> > +  VkQueue queue,
> > +  uint32_twaitSemaphoreCount,
> > +  const VkSemaphore*  pWaitSemaphores,
> > +  VkImage image,
> > +  int*pNativeFenceFd)
> > +{
> > +   VkResult result;
> > +
> > +   if (waitSemaphoreCount == 0)
> > +  goto done;
> > +
> > +   result = anv_QueueSubmit(queue,
> > +  1, (VkSubmitInfo[]) {
> Nit: Please stay consistent with other anv_QueueSubmit users.
> 
> Namely, move "1," on the previous line and use a pointer
> &(VkSubmitInfo), dropping the sentinel.

Huh. My style used to be the dominant style in anvil, before the meta
paths were deleted. And it continues to be pervasive in radv.

I don't care right now, though. I'll change it.

By the way, I don't know what sentinel you refer to here. I don't see
a sentinel.


> > + {
> > +.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
> > +.waitSemaphoreCount = 1,
> > +.pWaitSemaphores = pWaitSemaphores,
> > + },
> > +  },
> > +  (VkFence) VK_NULL_HANDLE);
> 
> 
> > @@ -1440,10 +1440,20 @@ VkResult anv_AllocateMemory(
> > struct anv_device_memory *mem;
> > VkResult result = VK_SUCCESS;
> >
> > +   /* VK_ANDROID_native_buffer defines VkNativeBufferANDROID as an 
> > extension
> > +* of VkImageCreateInfo. We abuse the struct by chaining it to
> > +* VkMemoryAllocateInfo in the implementation of vkCreateImage.
> > +*/
> > +   const VkNativeBufferANDROID *gralloc_info = NULL;
> > +#ifdef ANDROID
> 
> On one hand I'm thinking that we could drop the ifdef guard here and
> the rest of the patch.
> At the same time any solution will lead to a few annoying issues:
> 
>  - The -I$(top_srcdir)/include/android is Android only -> breaking the
> !Android builds

The -I line was a mistake remaining from an old branch. It doesn't break
the build, though. The compiler silently ignores -I dirs that don't
exist.

>  - The header within ^^ is unconditionally included

Yes, I intentionally included the header unconditionally. It reduces the
#ifdef mess in 

Re: [Mesa-dev] [PATCH shader-db 3/3] run: add extension_in_string() helper

2017-09-13 Thread Matt Turner
On Mon, Aug 21, 2017 at 3:27 AM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> memmem() does not attribute what the character after the searched string
> is. Thus it will flag even when haystack is "foobar" while we're looking
> for "foo".
>
> Pull a small helper (based on piglit) that correctly handles this and use
> it.
>
> Note: when parsing through the shader we have a non-zero terminated
> needle, let's keep the memmem in there for now.
>
> Signed-off-by: Emil Velikov 
> ---
>  run.c | 43 +++
>  1 file changed, 35 insertions(+), 8 deletions(-)
>
> diff --git a/run.c b/run.c
> index d0d4598..0691d33 100644
> --- a/run.c
> +++ b/run.c
> @@ -69,6 +69,35 @@ struct shader {
>  int type;
>  };
>
> +static bool
> +extension_in_string(const char *haystack, const char *needle)
> +{
> +const unsigned needle_len = strlen(needle);
> +
> +if (needle_len == 0)
> +return false;
> +
> +while (true) {
> +const char *const s = strstr(haystack, needle);
> +
> +if (s == NULL)
> +return false;
> +
> +if (s[needle_len] == ' ' || s[needle_len] == '\0')
> +return true;
> +
> +/* strstr found an extension whose name begins with
> + * needle, but whose name is not equal to needle.
> + * Restart the search at s + needle_len so that we
> + * don't just find the same extension again and go
> + * into an infinite loop.
> + */
> +haystack = s + needle_len;
> +}
> +
> +return false;
> +}
> +
>  static struct shader *
>  get_shaders(const struct context_info *core, const struct context_info 
> *compat,
>  const char *text, size_t text_size,
> @@ -141,8 +170,8 @@ get_shaders(const struct context_info *core, const struct 
> context_info *compat,
>  extension_text += 1;
>  const char *newline = memchr(extension_text, '\n',
>   end_text - extension_text);
> -if (memmem(info->extension_string, info->extension_string_len,
> -   extension_text, newline - extension_text) == NULL) {
> +if (memmem(info->extension_string, info->extension_string_len,
> +   extension_text, newline - extension_text) == NULL) {
>  fprintf(stderr, "SKIP: %s requires unavailable extension %.*s\n",
>  shader_name, (int)(newline - extension_text), 
> extension_text);
>  return NULL;
> @@ -415,7 +444,7 @@ main(int argc, char **argv)
>  return -1;
>  }
>
> -if (!strstr(client_extensions, "EGL_MESA_platform_gbm")) {
> +if (!extension_in_string(client_extensions, "EGL_MESA_platform_gbm")) {
>  fprintf(stderr, "ERROR: Missing EGL_MESA_platform_gbm\n");
>  return -1;
>  }
> @@ -458,7 +487,7 @@ main(int argc, char **argv)
>  };
>  for (int i = 0; i < ARRAY_SIZE(egl_extension); i++) {
>  const char *extension_string = eglQueryString(egl_dpy, 
> EGL_EXTENSIONS);
> -if (strstr(extension_string, egl_extension[i]) == NULL) {
> +if (!extension_in_string(extension_string, egl_extension[i])) {
>  fprintf(stderr, "ERROR: Missing %s\n", egl_extension[i]);
>  ret = -1;
>  goto egl_terminate;
> @@ -530,8 +559,7 @@ main(int argc, char **argv)
>
>  core.max_glsl_version = get_glsl_version();
>
> -if (memmem(core.extension_string, core.extension_string_len,
> -   "GL_KHR_debug", strlen("GL_KHR_debug")) == NULL) {
> +if (!extension_in_string(core.extension_string, "GL_KHR_debug")) {

I don't know what happened between this patch and getting committed,
but what was committed contained a copy-and-paste mistake here (used
compat.extension_string, which is NULL rather than
core.extension_string) so run always segfaulted!

I pushed a fix. Kind of irritating that this wasn't fixed in the week
since it was reported.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/eu/validate: Look up types on demand in execution_type()

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 21:35, Matt Turner  wrote:
> On 09/13, Emil Velikov wrote:
>>
>> As Andres pointed out this commit depends on 4fab67a4415 et al, which
>> seems to be missing in 17.1 and 17.2.
>>
>> I've ported this to 17.2 by moving the brw_inst_src[01]_reg_type()
>> calls as in the original patch.
>> The brw_inst_src[01]_reg_file() ones are left as-is.
>
>
> Without commit 4fab67a4415, this patch isn't needed at all. Feel free to
> drop it from the 17.2 branch.

Done, thanks for the confirmation Matt.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Is there a mesa Docker image available like on travis-ci? (was Re: [PATCH 1/2] gallium/targets/dri: Add libunwind to linker flags)

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 11:07, Gert Wollny  wrote:
> Am Dienstag, den 12.09.2017, 17:15 +0100 schrieb Eric Engestrom:
>> On Tuesday, 2017-09-12 17:26:43 +0200, Gert Wollny wrote:
>> > Am Dienstag, den 12.09.2017, 12:26 +0100 schrieb Emil Velikov:
>> > > On 12 September 2017 at 11:17, Gert Wollny 
>> > > wrote:
>> > > >
>> > > >
>> > > > Is there a docker image available that resembles the travis-ci
>> > > > build environment as it is set up in mesa/.travis.yml (i.e.
>> > > > Ubuntu/Trusty with all the manually compiled packages)?
>> > > >
>> > >
>> > > There isn't one that I know of. They are using stock Ubuntu
>> > > (afaict)
>> > > so one should be able to reproduce/create one.
>> >
>> > I tried to avoid that ;)
>> >
>> > >
>> > > That aside, something like the following should help - do tweak
>> > > as
>> > > applicable. Don't forget to trim down the targets - no need to
>> > > build
>> > > unrelated stuff ;-)
>> >
>> > Thanks to your suggestion I was able to get some output [1], but
>> > the link command actually contains -lunwind and it is listed after
>> > libgallium.a, so I have no idea what's going wrong.
>> >
>> > Maybe it's a bug in that libtool version ...
>>
>> I wouldn't be surprised; try slibtool? We've had to use it internally
>> instead of libtool to build mesa for a while now.
>
> Well it turned out to be a different problem: I got random results:
> sometimes it built, sometimes not, and it turned out that for some
> reason the build randomly picked up either llvm-3.5 or llvm-3.9 and the
> first one build and the other one not. Interestingly, forcing either
> version resulted in an error in configure, probably because the default
> version that is installed doesn't have a version number in the llvm-
> config script file name.
>
> If I add the llvm-toolchain-trusty-3.9 toolchain and force the
> versionbuild seems to pass consistently.
>
Hmm the LLVM packaging is broken in more ways than expected :-\

> Emil, is there a reason why in the "make Gallium ST Other" build no
> llvm is specified ? - It is definitely picked up and used.
>
"ST Other" should address the minimum required LLVM version - 3.3.
Such early versions had quite a lot of bugs (wrt packaging at least),
so I did not set it explicitly.

If you can find a way convince Travis to use 3.3, patches will be appreciated.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 14/15] i965: Disentangle batch and state buffer flushing.

2017-09-13 Thread Kenneth Graunke
We now flush the batch when either the batchbuffer or statebuffer
reaches the original intended batch size, instead of when the sum of
the two reaches a certain size (which makes no sense now that they're
separate buffers).

With this change, we also need to update our "are we near the end?"
estimate to require separate batch and state buffer space.  I obtained
these estimates by looking at the size of draw calls in the Unreal 4
Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).

This will significantly impact the size of our batches.  I've adjusted
both down to try and be roughly similar to what we had been doing.  On
various benchmarks, a 20kB batch and 16kB statebuffer seemed to about
right, but we may need to adjust this further.  I tried a 16kB batch,
but that regressed Synmark OglMultithread performance by a fair bit.
32kB for both would have significantly increased our batch sizes.
---
 src/mesa/drivers/dri/i965/brw_compute.c   | 18 -
 src/mesa/drivers/dri/i965/brw_draw.c  | 18 -
 src/mesa/drivers/dri/i965/brw_state.h |  1 +
 src/mesa/drivers/dri/i965/genX_blorp_exec.c   |  4 ++--
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 29 ++-
 5 files changed, 30 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
b/src/mesa/drivers/dri/i965/brw_compute.c
index 1bad7ac7a0c..7f0278ac92b 100644
--- a/src/mesa/drivers/dri/i965/brw_compute.c
+++ b/src/mesa/drivers/dri/i965/brw_compute.c
@@ -167,7 +167,6 @@ static void
 brw_dispatch_compute_common(struct gl_context *ctx)
 {
struct brw_context *brw = brw_context(ctx);
-   int estimated_buffer_space_needed;
bool fail_next = false;
 
if (!_mesa_check_conditional_render(ctx))
@@ -180,20 +179,11 @@ brw_dispatch_compute_common(struct gl_context *ctx)
 
brw_predraw_resolve_inputs(brw);
 
-   const int sampler_state_size = 16; /* 16 bytes */
-   estimated_buffer_space_needed = 512; /* batchbuffer commands */
-   estimated_buffer_space_needed += (BRW_MAX_TEX_UNIT *
- (sampler_state_size +
-  sizeof(struct 
gen5_sampler_default_color)));
-   estimated_buffer_space_needed += 1024; /* push constants */
-   estimated_buffer_space_needed += 512; /* misc. pad */
-
-   /* Flush the batch if it's approaching full, so that we don't wrap while
-* we've got validated state that needs to be in the same batch as the
-* primitives.
+   /* Flush the batch if the batch/state buffers are nearly full.  We can
+* grow them if needed, but this is not free, so we'd like to avoid it.
 */
-   intel_batchbuffer_require_space(brw, estimated_buffer_space_needed,
-   RENDER_RING);
+   intel_batchbuffer_require_space(brw, 600, RENDER_RING);
+   brw_require_statebuffer_space(brw, 2500);
intel_batchbuffer_save_state(brw);
 
  retry:
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index d1ec2e3f09d..06c6ed72c98 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -669,26 +669,16 @@ brw_try_draw_prims(struct gl_context *ctx,
brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
 
for (i = 0; i < nr_prims; i++) {
-  int estimated_max_prim_size;
-  const int sampler_state_size = 16;
-
-  estimated_max_prim_size = 512; /* batchbuffer commands */
-  estimated_max_prim_size += BRW_MAX_TEX_UNIT *
- (sampler_state_size + sizeof(struct gen5_sampler_default_color));
-  estimated_max_prim_size += 1024; /* gen6 VS push constants */
-  estimated_max_prim_size += 1024; /* gen6 WM push constants */
-  estimated_max_prim_size += 512; /* misc. pad */
-
   /* Flag BRW_NEW_DRAW_CALL on every draw.  This allows us to have
* atoms that happen on every draw call.
*/
   brw->ctx.NewDriverState |= BRW_NEW_DRAW_CALL;
 
-  /* Flush the batch if it's approaching full, so that we don't wrap while
-   * we've got validated state that needs to be in the same batch as the
-   * primitives.
+  /* Flush the batch if the batch/state buffers are nearly full.  We can
+   * grow them if needed, but this is not free, so we'd like to avoid it.
*/
-  intel_batchbuffer_require_space(brw, estimated_max_prim_size, 
RENDER_RING);
+  intel_batchbuffer_require_space(brw, 1500, RENDER_RING);
+  brw_require_statebuffer_space(brw, 2400);
   intel_batchbuffer_save_state(brw);
 
   if (brw->num_instances != prims[i].num_instances ||
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index c8b71e72de5..9718739dea9 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -185,6 +185,7 @@ void brw_destroy_caches( struct brw_context *brw );
 void brw_print_program_cache(struct brw_context *brw);
 
 /* intel_batchbuffer.c 

[Mesa-dev] [PATCH v2 15/15] i965: Print size of validation and relocation lists in INTEL_DEBUG=flush

2017-09-13 Thread Kenneth Graunke
It's nice to have this information.  While we're at it, tweak the
formatting to try and vertically align numbers in the common case.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index fddc84fcf9b..b6ce8174d57 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -886,10 +886,15 @@ _intel_batchbuffer_flush_fence(struct brw_context *brw,
if (unlikely(INTEL_DEBUG & (DEBUG_BATCH | DEBUG_SUBMIT))) {
   int bytes_for_commands = 4 * USED_BATCH(brw->batch);
   int bytes_for_state = brw->batch.state_used;
-  fprintf(stderr, "%s:%d: Batchbuffer flush with %4db (%0.1f%%) (pkt) + "
-  "%4db (%0.1f%%) (state)\n", file, line,
+  fprintf(stderr, "%19s:%-3d: Batchbuffer flush with %5db (%0.1f%%) (pkt),"
+  " %5db (%0.1f%%) (state), %4d BOs (%0.1fMb aperture),"
+  " %4d batch relocs, %4d state relocs\n", file, line,
   bytes_for_commands, 100.0f * bytes_for_commands / BATCH_SZ,
-  bytes_for_state, 100.0f * bytes_for_state / STATE_SZ);
+  bytes_for_state, 100.0f * bytes_for_state / STATE_SZ,
+  brw->batch.exec_count,
+  (float) brw->batch.aperture_space / (1024 * 1024),
+  brw->batch.batch_relocs.reloc_count,
+  brw->batch.state_relocs.reloc_count);
}
 
brw_finish_batch(brw);
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 07/15] i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.

2017-09-13 Thread Kenneth Graunke
brw_batch_reloc emits a relocation from the batchbuffer to elsewhere.
brw_state_reloc emits a relocation from the statebuffer to elsewhere.

For now, they do the same thing, but when we actually split the two
buffers, we'll change brw_state_reloc to use the state buffer.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 46 
 src/mesa/drivers/dri/i965/genX_blorp_exec.c  | 10 +++---
 src/mesa/drivers/dri/i965/genX_state_upload.c|  7 ++--
 src/mesa/drivers/dri/i965/intel_batchbuffer.c| 39 ++--
 src/mesa/drivers/dri/i965/intel_batchbuffer.h| 19 ++
 5 files changed, 73 insertions(+), 48 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 1f89b723544..d110482cc8e 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -176,9 +176,9 @@ brw_emit_surface_state(struct brw_context *brw,
  surf_offset);
 
isl_surf_fill_state(>isl_dev, state, .surf = >surf, .view = ,
-   .address = brw_emit_reloc(>batch,
- *surf_offset + 
brw->isl_dev.ss.addr_offset,
- mt->bo, offset, reloc_flags),
+   .address = brw_state_reloc(>batch,
+  *surf_offset + 
brw->isl_dev.ss.addr_offset,
+  mt->bo, offset, reloc_flags),
.aux_surf = aux_surf, .aux_usage = aux_usage,
.aux_address = aux_offset,
.mocs = mocs, .clear_color = clear_color,
@@ -194,11 +194,11 @@ brw_emit_surface_state(struct brw_context *brw,
*/
   assert((aux_offset & 0xfff) == 0);
   uint32_t *aux_addr = state + brw->isl_dev.ss.aux_addr_offset;
-  *aux_addr = brw_emit_reloc(>batch,
- *surf_offset +
- brw->isl_dev.ss.aux_addr_offset,
- aux_bo, *aux_addr,
- reloc_flags);
+  *aux_addr = brw_state_reloc(>batch,
+  *surf_offset +
+  brw->isl_dev.ss.aux_addr_offset,
+  aux_bo, *aux_addr,
+  reloc_flags);
}
 }
 
@@ -607,10 +607,10 @@ brw_emit_buffer_surface_state(struct brw_context *brw,
 
isl_buffer_fill_state(>isl_dev, dw,
  .address = !bo ? buffer_offset :
-brw_emit_reloc(>batch,
-   *out_offset + 
brw->isl_dev.ss.addr_offset,
-   bo, buffer_offset,
-   reloc_flags),
+brw_state_reloc(>batch,
+*out_offset + 
brw->isl_dev.ss.addr_offset,
+bo, buffer_offset,
+reloc_flags),
  .size = buffer_size,
  .format = surface_format,
  .stride = pitch,
@@ -777,8 +777,8 @@ brw_update_sol_surface(struct brw_context *brw,
   BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
   surface_format << BRW_SURFACE_FORMAT_SHIFT |
   BRW_SURFACE_RC_READ_WRITE;
-   surf[1] = brw_emit_reloc(>batch,
-*out_offset + 4, bo, offset_bytes, RELOC_WRITE);
+   surf[1] = brw_state_reloc(>batch,
+ *out_offset + 4, bo, offset_bytes, RELOC_WRITE);
surf[2] = (width << BRW_SURFACE_WIDTH_SHIFT |
  height << BRW_SURFACE_HEIGHT_SHIFT);
surf[3] = (depth << BRW_SURFACE_DEPTH_SHIFT |
@@ -870,9 +870,9 @@ emit_null_surface_state(struct brw_context *brw,
 
surf[0] = (BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
  ISL_FORMAT_B8G8R8A8_UNORM << BRW_SURFACE_FORMAT_SHIFT);
-   surf[1] = brw_emit_reloc(>batch, *out_offset + 4,
-brw->wm.multisampled_null_render_target_bo,
-0, RELOC_WRITE);
+   surf[1] = brw_state_reloc(>batch, *out_offset + 4,
+ brw->wm.multisampled_null_render_target_bo,
+ 0, RELOC_WRITE);
 
surf[2] = ((width - 1) << BRW_SURFACE_WIDTH_SHIFT |
   (height - 1) << BRW_SURFACE_HEIGHT_SHIFT);
@@ -940,12 +940,12 @@ gen4_update_renderbuffer_surface(struct brw_context *brw,
 
/* reloc */
assert(mt->offset % mt->cpp == 0);
-   surf[1] = brw_emit_reloc(>batch, offset + 4, mt->bo,
-mt->offset +
-intel_renderbuffer_get_tile_offsets(irb,
- 

[Mesa-dev] [PATCH v2 12/15] i965: Make BLORP properly avoid batch wrapping.

2017-09-13 Thread Kenneth Graunke
We need to set brw->no_batch_wrap to actually avoid flushing in the
middle of our BLORP operation, and instead grow the batchbuffer.
---
 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index feb87923ccb..5bff7eaff59 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -224,9 +224,7 @@ genX(blorp_exec)(struct blorp_batch *batch,
 retry:
intel_batchbuffer_require_space(brw, estimated_max_batch_usage, 
RENDER_RING);
intel_batchbuffer_save_state(brw);
-   struct brw_bo *saved_bo = brw->batch.bo;
-   uint32_t saved_used = USED_BATCH(brw->batch);
-   uint32_t saved_state_used = brw->batch.state_used;
+   brw->no_batch_wrap = true;
 
 #if GEN_GEN == 6
/* Emit workaround flushes when we switch from drawing to blorping. */
@@ -254,17 +252,7 @@ retry:
 
blorp_exec(batch, params);
 
-   /* Make sure we didn't wrap the batch unintentionally, and make sure we
-* reserved enough space that a wrap will never happen.
-*/
-   assert(brw->batch.bo == saved_bo);
-   assert((USED_BATCH(brw->batch) - saved_used) * 4 +
-  (brw->batch.state_used - saved_state_used) <
-  estimated_max_batch_usage);
-   /* Shut up compiler warnings on release build */
-   (void)saved_bo;
-   (void)saved_used;
-   (void)saved_state_used;
+   brw->no_batch_wrap = false;
 
/* Check if the blorp op we just did would make our batch likely to fail to
 * map all the BOs into the GPU at batch exec time later.  If so, flush the
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 11/15] i965: Grow the batch/state buffers if we need space and can't flush.

2017-09-13 Thread Kenneth Graunke
Previously, we would just assert fail and die in this case.  The only
safeguard is the "estimated max prim size" checks when starting a draw
(or compute dispatch or BLORP operation)...which are woefully broken.

Growing is fairly straightforward:

1. Allocate a new larger BO.
2. memcpy the existing contents over to the new buffer
3. Set the new BO to the same GTT offset as the old BO.  When emitting
   relocations, we write the presumed GTT offset of the target BO.  If
   we changed it, we'd have to update all the existing values (by
   walking the relocation list and looking at offsets), which is more
   expensive.  With the old BO freed, ideally the kernel could simply
   place the new BO at that offset anyway.
4. Update the validation list to contain the new BO.
5. Update the relocation list to have the GEM handle for the new BO
   (which we can skip if using I915_EXEC_HANDLE_LUT).

v2: Update to handle malloc'd shadow buffers.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 136 +-
 1 file changed, 131 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 074bb74f99f..9a22b8297f2 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -40,9 +40,27 @@
 
 #define FILE_DEBUG_FLAG DEBUG_BUFMGR
 
+/**
+ * Target sizes of the batch and state buffers.  We create the initial
+ * buffers at these sizes, and flush when they're nearly full.  If we
+ * underestimate how close we are to the end, and suddenly need more space
+ * in the middle of a draw, we can grow the buffers, and finish the draw.
+ * At that point, we'll be over our target size, so the next operation
+ * should flush.  Each time we flush the batch, we recreate both buffers
+ * at the original target size, so it doesn't grow without bound.
+ */
 #define BATCH_SZ (8192*sizeof(uint32_t))
 #define STATE_SZ (8192*sizeof(uint32_t))
 
+/* The kernel assumes batchbuffers are smaller than 256kB. */
+#define MAX_BATCH_SIZE (256 * 1024)
+
+/* 3DSTATE_BINDING_TABLE_POINTERS has a U16 offset from Surface State Base
+ * Address, which means that we can't put binding tables beyond 64kB.  This
+ * effectively limits the maximum statebuffer size to 64kB.
+ */
+#define MAX_STATE_SIZE (64 * 1024)
+
 static void
 intel_batchbuffer_reset(struct intel_batchbuffer *batch,
 struct intel_screen *screen);
@@ -252,6 +270,93 @@ intel_batchbuffer_free(struct intel_batchbuffer *batch)
   _mesa_hash_table_destroy(batch->state_batch_sizes, NULL);
 }
 
+static void
+replace_bo_in_reloc_list(struct brw_reloc_list *rlist,
+ uint32_t old_handle, uint32_t new_handle)
+{
+   for (int i = 0; i < rlist->reloc_count; i++) {
+  if (rlist->relocs[i].target_handle == old_handle)
+ rlist->relocs[i].target_handle = new_handle;
+   }
+}
+
+/**
+ * Grow either the batch or state buffer to a new larger size.
+ *
+ * We can't actually grow buffers, so we allocate a new one, copy over
+ * the existing contents, and update our lists to refer to the new one.
+ *
+ * Note that this is only temporary - each new batch recreates the buffers
+ * at their original target size (BATCH_SZ or STATE_SZ).
+ */
+static void
+grow_buffer(struct brw_context *brw,
+struct brw_bo **bo_ptr,
+uint32_t **map_ptr,
+uint32_t **cpu_map_ptr,
+unsigned existing_bytes,
+unsigned new_size)
+{
+   struct intel_batchbuffer *batch = >batch;
+   struct brw_bufmgr *bufmgr = brw->bufmgr;
+
+   uint32_t *old_map = *map_ptr;
+   struct brw_bo *old_bo = *bo_ptr;
+
+   struct brw_bo *new_bo = brw_bo_alloc(bufmgr, old_bo->name, new_size, 4096);
+   uint32_t *new_map;
+
+   perf_debug("Growing %s - ran out of space\n", old_bo->name);
+
+   /* Copy existing data to the new larger buffer */
+   if (*cpu_map_ptr) {
+  *cpu_map_ptr = new_map = realloc(*cpu_map_ptr, new_size);
+   } else {
+  new_map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE);
+  memcpy(new_map, old_map, existing_bytes);
+   }
+
+   /* Try to put the new BO at the same GTT offset as the old BO (which
+* we're throwing away, so it doesn't need to be there).
+*
+* This guarantees that our relocations continue to work: values we've
+* already written into the buffer, values we're going to write into the
+* buffer, and the validation/relocation lists all will match.
+*/
+   new_bo->gtt_offset = old_bo->gtt_offset;
+   new_bo->index = old_bo->index;
+
+   /* Batch/state buffers are per-context, and if we've run out of space,
+* we must have actually used them before, so...they will be in the list.
+*/
+   assert(old_bo->index < batch->exec_count);
+   assert(batch->exec_bos[old_bo->index] == old_bo);
+
+   /* Update the validation list to use the new BO. */
+   batch->exec_bos[old_bo->index] = new_bo;
+   

[Mesa-dev] [PATCH v2 09/15] i965: Pass screen to intel_batchbuffer_reset().

2017-09-13 Thread Kenneth Graunke
This will let us access screen->kernel_features in the next patch.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index d9b157b324c..8080e77b251 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -44,8 +44,7 @@
 
 static void
 intel_batchbuffer_reset(struct intel_batchbuffer *batch,
-struct brw_bufmgr *bufmgr,
-bool has_llc);
+struct intel_screen *screen);
 
 static bool
 uint_key_compare(const void *a, const void *b)
@@ -72,7 +71,6 @@ void
 intel_batchbuffer_init(struct intel_screen *screen,
struct intel_batchbuffer *batch)
 {
-   struct brw_bufmgr *bufmgr = screen->bufmgr;
const struct gen_device_info *devinfo = >devinfo;
 
if (!devinfo->has_llc) {
@@ -103,7 +101,7 @@ intel_batchbuffer_init(struct intel_screen *screen,
if (devinfo->gen == 6)
   batch->valid_reloc_flags |= EXEC_OBJECT_NEEDS_GTT;
 
-   intel_batchbuffer_reset(batch, bufmgr, devinfo->has_llc);
+   intel_batchbuffer_reset(batch, screen);
 }
 
 #define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))
@@ -151,9 +149,11 @@ add_exec_bo(struct intel_batchbuffer *batch, struct brw_bo 
*bo)
 
 static void
 intel_batchbuffer_reset(struct intel_batchbuffer *batch,
-struct brw_bufmgr *bufmgr,
-bool has_llc)
+struct intel_screen *screen)
 {
+   struct brw_bufmgr *bufmgr = screen->bufmgr;
+   const struct gen_device_info *devinfo = >devinfo;
+
if (batch->last_bo != NULL) {
   brw_bo_unreference(batch->last_bo);
   batch->last_bo = NULL;
@@ -161,7 +161,7 @@ intel_batchbuffer_reset(struct intel_batchbuffer *batch,
batch->last_bo = batch->bo;
 
batch->bo = brw_bo_alloc(bufmgr, "batchbuffer", BATCH_SZ, 4096);
-   if (has_llc) {
+   if (devinfo->has_llc) {
   batch->map = brw_bo_map(NULL, batch->bo, MAP_READ | MAP_WRITE);
}
batch->map_next = batch->map;
@@ -186,9 +186,7 @@ intel_batchbuffer_reset(struct intel_batchbuffer *batch,
 static void
 intel_batchbuffer_reset_and_clear_render_cache(struct brw_context *brw)
 {
-   const struct gen_device_info *devinfo = >screen->devinfo;
-
-   intel_batchbuffer_reset(>batch, brw->bufmgr, devinfo->has_llc);
+   intel_batchbuffer_reset(>batch, brw->screen);
brw_render_cache_set_clear(brw);
 }
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 13/15] i965: Delete BATCH_RESERVED handling.

2017-09-13 Thread Kenneth Graunke
Now that we can grom the batchbuffer if we absolutely need the extra
space, we don't need to reserve space for the final do-or-die ending
commands.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 11 +++
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 26 --
 2 files changed, 3 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 9a22b8297f2..5aa34e74293 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -204,7 +204,6 @@ intel_batchbuffer_reset(struct intel_batchbuffer *batch,
add_exec_bo(batch, batch->bo);
assert(batch->bo->index == 0);
 
-   batch->reserved_space = BATCH_RESERVED;
batch->needs_sol_reset = false;
batch->state_base_address_emitted = false;
 
@@ -372,8 +371,7 @@ intel_batchbuffer_require_space(struct brw_context *brw, 
GLuint sz,
 
/* For now, flush as if the batch and state buffers still shared a BO */
const unsigned batch_used = USED_BATCH(*batch) * 4;
-   if (batch_used + sz >=
-   BATCH_SZ - batch->reserved_space - batch->state_used) {
+   if (batch_used + sz >= BATCH_SZ - batch->state_used) {
   if (!brw->no_batch_wrap) {
  intel_batchbuffer_flush(brw);
   } else {
@@ -382,8 +380,7 @@ intel_batchbuffer_require_space(struct brw_context *brw, 
GLuint sz,
  grow_buffer(brw, >bo, >map, >batch_cpu_map,
  batch_used, new_size);
  batch->map_next = (void *) batch->map + batch_used;
- assert(batch_used + sz <
-batch->bo->size - batch->reserved_space - batch->state_used);
+ assert(batch_used + sz < batch->bo->size - batch->state_used);
   }
}
 
@@ -896,8 +893,6 @@ _intel_batchbuffer_flush_fence(struct brw_context *brw,
   bytes_for_state, 100.0f * bytes_for_state / STATE_SZ);
}
 
-   brw->batch.reserved_space = 0;
-
brw_finish_batch(brw);
 
/* Mark the end of the buffer. */
@@ -1032,7 +1027,7 @@ brw_state_batch(struct brw_context *brw,
uint32_t offset = ALIGN(batch->state_used, alignment);
 
/* For now, follow the old flushing behavior. */
-   int batch_space = batch->reserved_space + USED_BATCH(*batch) * 4;
+   int batch_space = USED_BATCH(*batch) * 4;
 
if (offset + size >= STATE_SZ - batch_space) {
   if (!brw->no_batch_wrap) {
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index 8a2e3cfc9bb..c02cafed521 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -10,32 +10,6 @@
 extern "C" {
 #endif
 
-/**
- * Number of bytes to reserve for commands necessary to complete a batch.
- *
- * This includes:
- * - MI_BATCHBUFFER_END (4 bytes)
- * - Optional MI_NOOP for ensuring the batch length is qword aligned (4 bytes)
- * - Any state emitted by vtbl->finish_batch():
- *   - Gen4-5 record ending occlusion query values (4 * 4 = 16 bytes)
- *   - Disabling OA counters on Gen6+ (3 DWords = 12 bytes)
- *   - Ending MI_REPORT_PERF_COUNT on Gen5+, plus associated PIPE_CONTROLs:
- * - Two sets of PIPE_CONTROLs, which become 4 PIPE_CONTROLs each on SNB,
- *   which are 5 DWords each ==> 2 * 4 * 5 * 4 = 160 bytes
- * - 3 DWords for MI_REPORT_PERF_COUNT itself on Gen6+.  ==> 12 bytes.
- *   On Ironlake, it's 6 DWords, but we have some slack due to the lack of
- *   Sandybridge PIPE_CONTROL madness.
- *   - CC_STATE workaround on HSW (17 * 4 = 68 bytes)
- * - 10 dwords for initial mi_flush
- * - 2 dwords for CC state setup
- * - 5 dwords for the required pipe control at the end
- *   - Restoring L3 configuration: (24 dwords = 96 bytes)
- * - 2*6 dwords for two PIPE_CONTROL flushes.
- * - 7 dwords for L3 configuration set-up.
- * - 5 dwords for L3 atomic set-up (on HSW).
- */
-#define BATCH_RESERVED 308
-
 struct intel_batchbuffer;
 
 void intel_batchbuffer_init(struct intel_screen *screen,
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 10/15] i965: Use a separate state buffer, but avoid changing flushing behavior.

2017-09-13 Thread Kenneth Graunke
Previously, we emitted GPU commands and indirect state into the same
buffer, using a stack/heap like system where we filled in commands from
the start of the buffer, and state from the end of the buffer.  We then
flushed before the two met in the middle.

Meeting in the middle is fatal, so you have to be certain that you
reserve the correct amount of space before emitting commands or state
for a draw.  Currently, we will assert !no_batch_wrap and die if the
estimate is ever too small.  This has been mercifully obscure, but has
happened on a number of occasions, and could in theory happen to any
application that issues a large draw at just the wrong time.

Estimating the amount of batch space required is painful - it's hard to
get right, and getting it right involves a lot of code that would burn
CPU time, and also be painful to maintain.  Rolling back to a saved
state and retrying is also painful - failing to save/restore all the
required state will break things, and redoing state emission burns a
lot of CPU.  memcpy'ing to a new batch and continuing is painful,
because commands we issue for a draw depend on earlier commands as well
(such as STATE_BASE_ADDRESS, or the GPU being in a pirtacular state).

The best plan is to never run out of space, which is totally doable but
pretty wasteful - a pessimal draw requires a huge amount of space, and
rarely occurs.  Instead, we'd like to grow the batch buffer if we need
more space and can't safely flush.

We can't grow with a meet in the middle approach - we'd have to move the
state to the end, which would mean updating every offset from dynamic
state base address.  Using separate batch and state buffers, where both
fill starting at the beginning, makes it easy to grow either as needed.

This patch separates the two concepts.  We create a separate state
buffer, with a second relocation list, and use that for brw_state_batch.

However, this patch tries to retain the original flushing behavior - it
adds the amount of batch and state space together, as if they were still
co-existing in a single buffer.  The hope is to flush at the same time
as before.  This is necessary to avoid provoking bugs caused by broken
batch wrap handling (which we'll fix shortly).  It also avoids suddenly
increasing the size of the batch (due to state not taking up space),
which could have a significant performance impact.  We'll tune it later.

v2:
- Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught
  by Chris).  Unfortunately, we lose the ability to capture state data
  on older kernels.
- Continue to support the malloc'd shadow buffers.
---
 src/mesa/drivers/dri/i965/brw_context.h   |  10 ++-
 src/mesa/drivers/dri/i965/brw_misc_state.c|  26 +++---
 src/mesa/drivers/dri/i965/gen4_blorp_exec.h   |   2 +-
 src/mesa/drivers/dri/i965/genX_blorp_exec.c   |  22 --
 src/mesa/drivers/dri/i965/genX_state_upload.c |  31 +---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 109 +-
 src/mesa/drivers/dri/i965/intel_batchbuffer.h |  23 ++
 7 files changed, 141 insertions(+), 82 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 89dacf3874c..6968366d6c8 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -447,6 +447,8 @@ struct intel_batchbuffer {
struct brw_bo *bo;
/** Last BO submitted to the hardware.  Used for glFinish(). */
struct brw_bo *last_bo;
+   /** Current statebuffer being queued up. */
+   struct brw_bo *state_bo;
 
 #ifdef DEBUG
uint16_t emit, total;
@@ -454,15 +456,18 @@ struct intel_batchbuffer {
uint16_t reserved_space;
uint32_t *map_next;
uint32_t *map;
-   uint32_t *cpu_map;
+   uint32_t *batch_cpu_map;
+   uint32_t *state_cpu_map;
+   uint32_t *state_map;
+   uint32_t state_used;
 
-   uint32_t state_batch_offset;
enum brw_gpu_ring ring;
bool use_batch_first;
bool needs_sol_reset;
bool state_base_address_emitted;
 
struct brw_reloc_list batch_relocs;
+   struct brw_reloc_list state_relocs;
unsigned int valid_reloc_flags;
 
/** The validation list */
@@ -477,6 +482,7 @@ struct intel_batchbuffer {
struct {
   uint32_t *map_next;
   int batch_reloc_count;
+  int state_reloc_count;
   int exec_count;
} saved;
 
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 9b8ae70f103..53137cc4524 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -65,15 +65,15 @@ upload_pipelined_state_pointers(struct brw_context *brw)
 
BEGIN_BATCH(7);
OUT_BATCH(_3DSTATE_PIPELINED_POINTERS << 16 | (7 - 2));
-   OUT_RELOC(brw->batch.bo, 0, brw->vs.base.state_offset);
+   OUT_RELOC(brw->batch.state_bo, 0, brw->vs.base.state_offset);
if (brw->ff_gs.prog_active)
-  OUT_RELOC(brw->batch.bo, 0, brw->ff_gs.state_offset | 1);
+  

[Mesa-dev] [PATCH v2 06/15] i965: Refactor relocs into a brw_reloc_list structure.

2017-09-13 Thread Kenneth Graunke
I'm planning on splitting batch and state into separate buffers, at
which point we'll need two relocation lists.  In preparation for that,
this patch refactors the relocation stuff into a structure we can
replicate...which looks a lot like anv_reloc_list.

Reviewed-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/brw_context.h   | 12 ++---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 39 ---
 2 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 92fc16de136..89dacf3874c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -436,6 +436,12 @@ enum brw_gpu_ring {
BLT_RING,
 };
 
+struct brw_reloc_list {
+   struct drm_i915_gem_relocation_entry *relocs;
+   int reloc_count;
+   int reloc_array_size;
+};
+
 struct intel_batchbuffer {
/** Current batchbuffer being queued up. */
struct brw_bo *bo;
@@ -456,9 +462,7 @@ struct intel_batchbuffer {
bool needs_sol_reset;
bool state_base_address_emitted;
 
-   struct drm_i915_gem_relocation_entry *relocs;
-   int reloc_count;
-   int reloc_array_size;
+   struct brw_reloc_list batch_relocs;
unsigned int valid_reloc_flags;
 
/** The validation list */
@@ -472,7 +476,7 @@ struct intel_batchbuffer {
 
struct {
   uint32_t *map_next;
-  int reloc_count;
+  int batch_reloc_count;
   int exec_count;
} saved;
 
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 194b11a429d..ae27ec5445b 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -59,6 +59,15 @@ uint_key_hash(const void *key)
return (uintptr_t) key;
 }
 
+static void
+init_reloc_list(struct brw_reloc_list *rlist, int count)
+{
+   rlist->reloc_count = 0;
+   rlist->reloc_array_size = count;
+   rlist->relocs = malloc(rlist->reloc_array_size *
+  sizeof(struct drm_i915_gem_relocation_entry));
+}
+
 void
 intel_batchbuffer_init(struct intel_screen *screen,
struct intel_batchbuffer *batch)
@@ -72,10 +81,8 @@ intel_batchbuffer_init(struct intel_screen *screen,
   batch->map_next = batch->cpu_map;
}
 
-   batch->reloc_count = 0;
-   batch->reloc_array_size = 250;
-   batch->relocs = malloc(batch->reloc_array_size *
-  sizeof(struct drm_i915_gem_relocation_entry));
+   init_reloc_list(>batch_relocs, 250);
+
batch->exec_count = 0;
batch->exec_array_size = 100;
batch->exec_bos =
@@ -189,7 +196,7 @@ void
 intel_batchbuffer_save_state(struct brw_context *brw)
 {
brw->batch.saved.map_next = brw->batch.map_next;
-   brw->batch.saved.reloc_count = brw->batch.reloc_count;
+   brw->batch.saved.batch_reloc_count = brw->batch.batch_relocs.reloc_count;
brw->batch.saved.exec_count = brw->batch.exec_count;
 }
 
@@ -200,7 +207,7 @@ intel_batchbuffer_reset_to_saved(struct brw_context *brw)
 i < brw->batch.exec_count; i++) {
   brw_bo_unreference(brw->batch.exec_bos[i]);
}
-   brw->batch.reloc_count = brw->batch.saved.reloc_count;
+   brw->batch.batch_relocs.reloc_count = brw->batch.saved.batch_reloc_count;
brw->batch.exec_count = brw->batch.saved.exec_count;
 
brw->batch.map_next = brw->batch.saved.map_next;
@@ -216,7 +223,7 @@ intel_batchbuffer_free(struct intel_batchbuffer *batch)
for (int i = 0; i < batch->exec_count; i++) {
   brw_bo_unreference(batch->exec_bos[i]);
}
-   free(batch->relocs);
+   free(batch->batch_relocs.relocs);
free(batch->exec_bos);
free(batch->validation_list);
 
@@ -444,7 +451,7 @@ brw_new_batch(struct brw_context *brw)
   brw_bo_unreference(brw->batch.exec_bos[i]);
   brw->batch.exec_bos[i] = NULL;
}
-   brw->batch.reloc_count = 0;
+   brw->batch.batch_relocs.reloc_count = 0;
brw->batch.exec_count = 0;
brw->batch.aperture_space = 0;
 
@@ -666,8 +673,8 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, 
int *out_fence_fd)
 
   struct drm_i915_gem_exec_object2 *entry = >validation_list[0];
   assert(entry->handle == batch->bo->gem_handle);
-  entry->relocation_count = batch->reloc_count;
-  entry->relocs_ptr = (uintptr_t) batch->relocs;
+  entry->relocation_count = batch->batch_relocs.reloc_count;
+  entry->relocs_ptr = (uintptr_t) batch->batch_relocs.relocs;
 
   if (batch->use_batch_first) {
  flags |= I915_EXEC_BATCH_FIRST | I915_EXEC_HANDLE_LUT;
@@ -792,12 +799,14 @@ brw_emit_reloc(struct intel_batchbuffer *batch, uint32_t 
batch_offset,
struct brw_bo *target, uint32_t target_offset,
unsigned int reloc_flags)
 {
+   struct brw_reloc_list *rlist = >batch_relocs;
+
assert(target != NULL);
 
-   if (batch->reloc_count == batch->reloc_array_size) {
-  batch->reloc_array_size *= 2;
-  

[Mesa-dev] [PATCH v2 08/15] i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.

2017-09-13 Thread Kenneth Graunke
We'll need to read from both buffers when decoding state.

This also drops the "failed to map" fallback - it's completely useless
on LLC systems where we write directly to the mapped BO.  It's not that
useful on non-LLC systems either.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 110 +-
 1 file changed, 54 insertions(+), 56 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 0e8ca184392..d9b157b324c 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -302,23 +302,23 @@ do_batch_dump(struct brw_context *brw)
if (batch->ring != RENDER_RING)
   return;
 
-   void *map = brw_bo_map(brw, batch->bo, MAP_READ);
-   if (map == NULL) {
-  fprintf(stderr,
-  "WARNING: failed to map batchbuffer, "
-  "dumping uploaded data instead.\n");
+   uint32_t *batch_data = brw_bo_map(brw, batch->bo, MAP_READ);
+   uint32_t *state = batch_data;
+   if (batch == NULL || state == NULL) {
+  fprintf(stderr, "WARNING: failed to map batchbuffer/statebuffer\n");
+  return;
}
 
-   uint32_t *data = map ? map : batch->map;
-   uint32_t *end = data + USED_BATCH(*batch);
-   uint32_t gtt_offset = map ? batch->bo->gtt_offset : 0;
+   uint32_t *end = batch_data + USED_BATCH(*batch);
+   uint32_t batch_gtt_offset = batch->bo->gtt_offset;
+   uint32_t state_gtt_offset = batch->bo->gtt_offset;
int length;
 
bool color = INTEL_DEBUG & DEBUG_COLOR;
const char *header_color = color ? BLUE_HEADER : "";
const char *reset_color  = color ? NORMAL : "";
 
-   for (uint32_t *p = data; p < end; p += length) {
+   for (uint32_t *p = batch_data; p < end; p += length) {
   struct gen_group *inst = gen_spec_find_instruction(spec, p);
   length = gen_group_get_length(inst, p);
   assert(inst == NULL || length > 0);
@@ -328,7 +328,7 @@ do_batch_dump(struct brw_context *brw)
  continue;
   }
 
-  uint64_t offset = gtt_offset + 4 * (p - data);
+  uint64_t offset = batch_gtt_offset + 4 * (p - batch_data);
 
   fprintf(stderr, "%s0x%08"PRIx64":  0x%08x:  %-80s%s\n", header_color,
   offset, p[0], gen_group_get_name(inst), reset_color);
@@ -338,26 +338,26 @@ do_batch_dump(struct brw_context *brw)
   switch (gen_group_get_opcode(inst) >> 16) {
   case _3DSTATE_PIPELINED_POINTERS:
  /* Note: these Gen4-5 pointers are full relocations rather than
-  * offsets from the start of the batch.  So we need to subtract
-  * gtt_offset (the start of the batch) to obtain an offset we
+  * offsets from the start of the statebuffer.  So we need to subtract
+  * gtt_offset (the start of the statebuffer) to obtain an offset we
   * can add to the map and get at the data.
   */
- decode_struct(brw, spec, "VS_STATE", data, gtt_offset,
-   (p[1] & ~0x1fu) - gtt_offset, color);
+ decode_struct(brw, spec, "VS_STATE", state, state_gtt_offset,
+   (p[1] & ~0x1fu) - state_gtt_offset, color);
  if (p[2] & 1) {
-decode_struct(brw, spec, "GS_STATE", data, gtt_offset,
-  (p[2] & ~0x1fu) - gtt_offset, color);
+decode_struct(brw, spec, "GS_STATE", state, state_gtt_offset,
+  (p[2] & ~0x1fu) - state_gtt_offset, color);
  }
  if (p[3] & 1) {
-decode_struct(brw, spec, "CLIP_STATE", data, gtt_offset,
-  (p[3] & ~0x1fu) - gtt_offset, color);
+decode_struct(brw, spec, "CLIP_STATE", state, state_gtt_offset,
+  (p[3] & ~0x1fu) - state_gtt_offset, color);
  }
- decode_struct(brw, spec, "SF_STATE", data, gtt_offset,
-   (p[4] & ~0x1fu) - gtt_offset, color);
- decode_struct(brw, spec, "WM_STATE", data, gtt_offset,
-   (p[5] & ~0x1fu) - gtt_offset, color);
- decode_struct(brw, spec, "COLOR_CALC_STATE", data, gtt_offset,
-   (p[6] & ~0x3fu) - gtt_offset, color);
+ decode_struct(brw, spec, "SF_STATE", state, state_gtt_offset,
+   (p[4] & ~0x1fu) - state_gtt_offset, color);
+ decode_struct(brw, spec, "WM_STATE", state, state_gtt_offset,
+   (p[5] & ~0x1fu) - state_gtt_offset, color);
+ decode_struct(brw, spec, "COLOR_CALC_STATE", state, state_gtt_offset,
+   (p[6] & ~0x3fu) - state_gtt_offset, color);
  break;
   case _3DSTATE_BINDING_TABLE_POINTERS_VS:
   case _3DSTATE_BINDING_TABLE_POINTERS_HS:
@@ -371,11 +371,11 @@ do_batch_dump(struct brw_context *brw)
 
  uint32_t bt_offset = p[1] & ~0x1fu;
  int bt_entries = brw_state_batch_size(brw, bt_offset) / 4;
- uint32_t *bt_pointers = [bt_offset / 4];
+ uint32_t *bt_pointers = 

[Mesa-dev] [PATCH v2 02/15] i965: Use batch->bo->size in brw_emit_reloc assertion.

2017-09-13 Thread Kenneth Graunke
This makes the assertion safe against batchbuffers growing.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index cd118f6c6fc..276fe458a13 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -806,7 +806,7 @@ brw_emit_reloc(struct intel_batchbuffer *batch, uint32_t 
batch_offset,
}
 
/* Check args */
-   assert(batch_offset <= BATCH_SZ - sizeof(uint32_t));
+   assert(batch_offset <= batch->bo->size - sizeof(uint32_t));
 
unsigned int index = add_exec_bo(batch, target);
struct drm_i915_gem_exec_object2 *entry = >validation_list[index];
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 03/15] i965: Use a WC map and memcpy for the batch instead of pwrite.

2017-09-13 Thread Kenneth Graunke
We'd like to eliminate the malloc'd shadow copy eventually, but there
are still unresolved performance problems.  In the meantime, let's at
least get rid of pwrite.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 276fe458a13..9cd491b5ace 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -629,18 +629,16 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, 
int *out_fence_fd)
struct intel_batchbuffer *batch = >batch;
int ret = 0;
 
-   if (devinfo->has_llc) {
-  brw_bo_unmap(batch->bo);
-   } else {
-  ret = brw_bo_subdata(batch->bo, 0, 4 * USED_BATCH(*batch), batch->map);
-  if (ret == 0 && batch->state_batch_offset != batch->bo->size) {
- ret = brw_bo_subdata(batch->bo,
-  batch->state_batch_offset,
-  batch->bo->size - batch->state_batch_offset,
-  (char *)batch->map + batch->state_batch_offset);
-  }
+   if (batch->cpu_map) {
+  void *bo_map = brw_bo_map(brw, batch->bo, MAP_WRITE);
+  memcpy(bo_map, batch->cpu_map, 4 * USED_BATCH(*batch));
+  memcpy(bo_map + batch->state_batch_offset,
+ (char *) batch->cpu_map + batch->state_batch_offset,
+ batch->bo->size - batch->state_batch_offset);
}
 
+   brw_bo_unmap(batch->bo);
+
if (!brw->screen->no_hw) {
   /* The requirement for using I915_EXEC_NO_RELOC are:
*
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 05/15] i965: Move brw_state_batch code to intel_batchbuffer.c

2017-09-13 Thread Kenneth Graunke
The batch buffer and state buffer code is fairly tied together,
and having it in one .c file will make refactoring easier.

Also, drop some commentary above brw_state_batch.  The "aperture
checking performance hacks" are long since gone, so that paragraph
makes little sense at this point.
---
 src/mesa/drivers/dri/i965/Makefile.sources|  1 -
 src/mesa/drivers/dri/i965/brw_state.h |  4 +-
 src/mesa/drivers/dri/i965/brw_state_batch.c   | 93 ---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 46 +
 4 files changed, 47 insertions(+), 97 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/brw_state_batch.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 9687eb957e1..e33dea07128 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -41,7 +41,6 @@ i965_FILES = \
brw_queryobj.c \
brw_reset.c \
brw_sf.c \
-   brw_state_batch.c \
brw_state.h \
brw_state_upload.c \
brw_structs.h \
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 1cbddaba786..c8b71e72de5 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -184,9 +184,7 @@ void brw_destroy_caches( struct brw_context *brw );
 
 void brw_print_program_cache(struct brw_context *brw);
 
-/***
- * brw_state_batch.c
- */
+/* intel_batchbuffer.c */
 void *brw_state_batch(struct brw_context *brw,
   int size, int alignment, uint32_t *out_offset);
 uint32_t brw_state_batch_size(struct brw_context *brw, uint32_t offset);
diff --git a/src/mesa/drivers/dri/i965/brw_state_batch.c 
b/src/mesa/drivers/dri/i965/brw_state_batch.c
deleted file mode 100644
index 5b6f3af93d8..000
--- a/src/mesa/drivers/dri/i965/brw_state_batch.c
+++ /dev/null
@@ -1,93 +0,0 @@
-/*
- Copyright (C) Intel Corp.  2006.  All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **/
- /*
-  * Authors:
-  *   Keith Whitwell 
-  */
-
-#include "brw_state.h"
-#include "intel_batchbuffer.h"
-#include "main/imports.h"
-#include "util/hash_table.h"
-#include "util/ralloc.h"
-
-uint32_t
-brw_state_batch_size(struct brw_context *brw, uint32_t offset)
-{
-   struct hash_entry *entry =
-  _mesa_hash_table_search(brw->batch.state_batch_sizes,
-  (void *) (uintptr_t) offset);
-   return entry ? (uintptr_t) entry->data : 0;
-}
-
-/**
- * Allocates a block of space in the batchbuffer for indirect state.
- *
- * We don't want to allocate separate BOs for every bit of indirect
- * state in the driver.  It means overallocating by a significant
- * margin (4096 bytes, even if the object is just a 20-byte surface
- * state), and more buffers to walk and count for aperture size checking.
- *
- * However, due to the restrictions imposed by the aperture size
- * checking performance hacks, we can't have the batch point at a
- * separate indirect state buffer, because once the batch points at
- * it, no more relocations can be added to it.  So, we sneak these
- * buffers in at the top of the batchbuffer.
- */
-void *
-brw_state_batch(struct brw_context *brw,
-int size,
-int alignment,
-uint32_t *out_offset)
-{
-   struct intel_batchbuffer *batch = >batch;
-   uint32_t offset;
-
-   assert(size < batch->bo->size);
-   offset = ROUND_DOWN_TO(batch->state_batch_offset - size, alignment);
-
-   /* If allocating from the top would wrap below the batchbuffer, or
-* if the batch's used space (plus the reserved pad) collides with our
-* space, then flush 

[Mesa-dev] [PATCH v2 01/15] i965: Delete a batch size assertion that isn't very useful.

2017-09-13 Thread Kenneth Graunke
This assertion prevents you from doing intel_batchbuffer_require_space
with a size so huge it won't fit in the batchbuffer.  This doesn't seem
like a common mistake, and I've never seen the assert to be useful.

Soon, I hope to have batches grow, at which point this won't make sense.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 515b595bc92..cd118f6c6fc 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -238,9 +238,6 @@ intel_batchbuffer_require_space(struct brw_context *brw, 
GLuint sz,
   intel_batchbuffer_flush(brw);
}
 
-#ifdef DEBUG
-   assert(sz < BATCH_SZ - BATCH_RESERVED);
-#endif
if (intel_batchbuffer_space(>batch) < sz)
   intel_batchbuffer_flush(brw);
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 04/15] i965: Drop a useless ret == 0 check.

2017-09-13 Thread Kenneth Graunke
Prior to the previous patch, we would pwrite the batchbuffer contents,
and wanted to skip the execbuffer if that failed.  Now that we memcpy,
we don't set ret != 0 on failure anymore, so it will always be 0.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 42 +--
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 9cd491b5ace..95ce4be3c11 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -662,31 +662,29 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, 
int *out_fence_fd)
   if (batch->needs_sol_reset)
  flags |= I915_EXEC_GEN7_SOL_RESET;
 
-  if (ret == 0) {
- uint32_t hw_ctx = batch->ring == RENDER_RING ? brw->hw_ctx : 0;
-
- struct drm_i915_gem_exec_object2 *entry = >validation_list[0];
- assert(entry->handle == batch->bo->gem_handle);
- entry->relocation_count = batch->reloc_count;
- entry->relocs_ptr = (uintptr_t) batch->relocs;
-
- if (batch->use_batch_first) {
-flags |= I915_EXEC_BATCH_FIRST | I915_EXEC_HANDLE_LUT;
- } else {
-/* Move the batch to the end of the validation list */
-struct drm_i915_gem_exec_object2 tmp;
-const unsigned index = batch->exec_count - 1;
-
-tmp = *entry;
-*entry = batch->validation_list[index];
-batch->validation_list[index] = tmp;
- }
+  uint32_t hw_ctx = batch->ring == RENDER_RING ? brw->hw_ctx : 0;
+
+  struct drm_i915_gem_exec_object2 *entry = >validation_list[0];
+  assert(entry->handle == batch->bo->gem_handle);
+  entry->relocation_count = batch->reloc_count;
+  entry->relocs_ptr = (uintptr_t) batch->relocs;
 
- ret = execbuffer(dri_screen->fd, batch, hw_ctx,
-  4 * USED_BATCH(*batch),
-  in_fence_fd, out_fence_fd, flags);
+  if (batch->use_batch_first) {
+ flags |= I915_EXEC_BATCH_FIRST | I915_EXEC_HANDLE_LUT;
+  } else {
+ /* Move the batch to the end of the validation list */
+ struct drm_i915_gem_exec_object2 tmp;
+ const unsigned index = batch->exec_count - 1;
+
+ tmp = *entry;
+ *entry = batch->validation_list[index];
+ batch->validation_list[index] = tmp;
   }
 
+  ret = execbuffer(dri_screen->fd, batch, hw_ctx,
+   4 * USED_BATCH(*batch),
+   in_fence_fd, out_fence_fd, flags);
+
   throttle(brw);
}
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH 4/4] ac/surface: match Z and stencil tile config

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 20:48, Nicolai Hähnle  wrote:
> On 13.09.2017 20:28, Emil Velikov wrote:
>>
>> Hi Nicolai,
>>
>> I'm doing a final run through the lists for 17.2.1 and doesn't seem
>> like this landed in master.
>> Just a friendly poke, in case it fell through the cracks.
>
>
> Are you absolutely sure about that? ;-)
>
> Admittedly it's only been ~3 hours since I pushed this.
>
My bad - seems like should have pulled before hitting send.

There were a few trivial conflicts due to the lack of 2b7e85562ae, but
that one is not required, right?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/eu/validate: Look up types on demand in execution_type()

2017-09-13 Thread Matt Turner

On 09/13, Emil Velikov wrote:

As Andres pointed out this commit depends on 4fab67a4415 et al, which
seems to be missing in 17.1 and 17.2.

I've ported this to 17.2 by moving the brw_inst_src[01]_reg_type()
calls as in the original patch.
The brw_inst_src[01]_reg_file() ones are left as-is.


Without commit 4fab67a4415, this patch isn't needed at all. Feel free to
drop it from the 17.2 branch.


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] amd/common: remove has_ds_bpermute argument from ac_build_ddxy

2017-09-13 Thread Nicolai Hähnle

On 13.09.2017 21:48, Connor Abbott wrote:

On Wed, Sep 13, 2017 at 3:46 PM, Nicolai Hähnle  wrote:

On 13.09.2017 20:45, Connor Abbott wrote:


Not sure if we'll want to do this, since we'll need to need to
effectively revert it anyways when we implement derivatives with DPP
(although we'll have to rename has_ds_bpermute to has_dpp...).



Is there a reason for not deriving has_dpp from chip_class?


Yeah, good point. Maybe chip_class wasn't available when the field was added?


Right :)

I think this code has moved and been re-shaped quite a bit over the years...






Cheers,
Nicolai




On Wed, Sep 13, 2017 at 1:04 PM, Nicolai Hähnle 
wrote:


From: Nicolai Hähnle 

---
   src/amd/common/ac_llvm_build.c   | 3 +--
   src/amd/common/ac_llvm_build.h   | 1 -
   src/amd/common/ac_nir_to_llvm.c  | 5 +
   src/gallium/drivers/radeonsi/si_pipe.c   | 1 -
   src/gallium/drivers/radeonsi/si_pipe.h   | 1 -
   src/gallium/drivers/radeonsi/si_shader.c | 3 +--
   6 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c
b/src/amd/common/ac_llvm_build.c
index 4077bd81bbc..6c010e8c3a6 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -965,29 +965,28 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
* So, masking the TID with 0xfffc yields the TID of the top left
pixel
* of the quad, masking with 0xfffd yields the TID of the top pixel
of
* the current pixel's column, and masking with 0xfffe yields the
TID
* of the left pixel of the current pixel's row.
*
* Adding 1 yields the TID of the pixel to the right of the left pixel,
and
* adding 2 yields the TID of the pixel below the top pixel.
*/
   LLVMValueRef
   ac_build_ddxy(struct ac_llvm_context *ctx,
- bool has_ds_bpermute,
uint32_t mask,
int idx,
LLVMValueRef val)
   {
  LLVMValueRef tl, trbl, args[2];
  LLVMValueRef result;

-   if (has_ds_bpermute) {
+   if (ctx->chip_class >= VI) {
  LLVMValueRef thread_id, tl_tid, trbl_tid;
  thread_id = ac_get_thread_id(ctx);

  tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
LLVMConstInt(ctx->i32, mask,
false), "");

  trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
  LLVMConstInt(ctx->i32, idx,
false), "");

  args[0] = LLVMBuildMul(ctx->builder, tl_tid,
diff --git a/src/amd/common/ac_llvm_build.h
b/src/amd/common/ac_llvm_build.h
index b6434893cfa..3f93551330c 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -187,21 +187,20 @@ LLVMValueRef ac_build_buffer_load_format(struct
ac_llvm_context *ctx,

   LLVMValueRef
   ac_get_thread_id(struct ac_llvm_context *ctx);

   #define AC_TID_MASK_TOP_LEFT 0xfffc
   #define AC_TID_MASK_TOP  0xfffd
   #define AC_TID_MASK_LEFT 0xfffe

   LLVMValueRef
   ac_build_ddxy(struct ac_llvm_context *ctx,
- bool has_ds_bpermute,
uint32_t mask,
int idx,
LLVMValueRef val);

   #define AC_SENDMSG_GS 2
   #define AC_SENDMSG_GS_DONE 3

   #define AC_SENDMSG_GS_OP_NOP  (0 << 4)
   #define AC_SENDMSG_GS_OP_CUT  (1 << 4)
   #define AC_SENDMSG_GS_OP_EMIT (2 << 4)
diff --git a/src/amd/common/ac_nir_to_llvm.c
b/src/amd/common/ac_nir_to_llvm.c
index c0c4441022a..bf4b3ca6521 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1407,40 +1407,37 @@ static LLVMValueRef emit_unpack_half_2x16(struct
ac_llvm_context *ctx,
  return result;
   }

   static LLVMValueRef emit_ddxy(struct ac_nir_context *ctx,
nir_op op,
LLVMValueRef src0)
   {
  unsigned mask;
  int idx;
  LLVMValueRef result;
-   bool has_ds_bpermute = ctx->abi->chip_class >= VI;

  if (op == nir_op_fddx_fine || op == nir_op_fddx)
  mask = AC_TID_MASK_LEFT;
  else if (op == nir_op_fddy_fine || op == nir_op_fddy)
  mask = AC_TID_MASK_TOP;
  else
  mask = AC_TID_MASK_TOP_LEFT;

  /* for DDX we want to next X pixel, DDY next Y pixel. */
  if (op == nir_op_fddx_fine ||
  op == nir_op_fddx_coarse ||
  op == nir_op_fddx)
  idx = 1;
  else
  idx = 2;

-   result = ac_build_ddxy(>ac, has_ds_bpermute,
- mask, idx,
- src0);
+   result = ac_build_ddxy(>ac, mask, idx, src0);
  return result;
   }

   /*
* this takes an I,J coordinate pair,
* and works out the X and Y derivatives.
* it returns DDX(I), DDX(J), DDY(I), DDY(J).
*/
   

Re: [Mesa-dev] [PATCH 2/5] amd/common: round cube array slice in ac_prepare_cube_coords

2017-09-13 Thread Nicolai Hähnle

On 13.09.2017 22:20, Dave Airlie wrote:

On 14 September 2017 at 03:04, Nicolai Hähnle  wrote:

From: Nicolai Hähnle 

The NIR-to-LLVM pass already does this; now the same fix covers
radeonsi as well.

Fixes various tests of
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*


Nice to see you guys catch up, I think I asked about a lot of these workarounds
when I wrote them and didn't get much response :-), other than radeonsi doesn't
need them, looks like it did after all.


Indeed. In our defense, the only documentation I could find about a lot 
of these things is that they were fixed in gfx9 :)




Reviewed-by: Dave Airlie 


Thanks!





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] amd/common: remove has_ds_bpermute argument from ac_build_ddxy

2017-09-13 Thread Nicolai Hähnle

On 13.09.2017 20:45, Connor Abbott wrote:

Not sure if we'll want to do this, since we'll need to need to
effectively revert it anyways when we implement derivatives with DPP
(although we'll have to rename has_ds_bpermute to has_dpp...).


Is there a reason for not deriving has_dpp from chip_class?

Cheers,
Nicolai



On Wed, Sep 13, 2017 at 1:04 PM, Nicolai Hähnle  wrote:

From: Nicolai Hähnle 

---
  src/amd/common/ac_llvm_build.c   | 3 +--
  src/amd/common/ac_llvm_build.h   | 1 -
  src/amd/common/ac_nir_to_llvm.c  | 5 +
  src/gallium/drivers/radeonsi/si_pipe.c   | 1 -
  src/gallium/drivers/radeonsi/si_pipe.h   | 1 -
  src/gallium/drivers/radeonsi/si_shader.c | 3 +--
  6 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 4077bd81bbc..6c010e8c3a6 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -965,29 +965,28 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
   * So, masking the TID with 0xfffc yields the TID of the top left pixel
   * of the quad, masking with 0xfffd yields the TID of the top pixel of
   * the current pixel's column, and masking with 0xfffe yields the TID
   * of the left pixel of the current pixel's row.
   *
   * Adding 1 yields the TID of the pixel to the right of the left pixel, and
   * adding 2 yields the TID of the pixel below the top pixel.
   */
  LLVMValueRef
  ac_build_ddxy(struct ac_llvm_context *ctx,
- bool has_ds_bpermute,
   uint32_t mask,
   int idx,
   LLVMValueRef val)
  {
 LLVMValueRef tl, trbl, args[2];
 LLVMValueRef result;

-   if (has_ds_bpermute) {
+   if (ctx->chip_class >= VI) {
 LLVMValueRef thread_id, tl_tid, trbl_tid;
 thread_id = ac_get_thread_id(ctx);

 tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
   LLVMConstInt(ctx->i32, mask, false), "");

 trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
 LLVMConstInt(ctx->i32, idx, false), 
"");

 args[0] = LLVMBuildMul(ctx->builder, tl_tid,
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index b6434893cfa..3f93551330c 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -187,21 +187,20 @@ LLVMValueRef ac_build_buffer_load_format(struct 
ac_llvm_context *ctx,

  LLVMValueRef
  ac_get_thread_id(struct ac_llvm_context *ctx);

  #define AC_TID_MASK_TOP_LEFT 0xfffc
  #define AC_TID_MASK_TOP  0xfffd
  #define AC_TID_MASK_LEFT 0xfffe

  LLVMValueRef
  ac_build_ddxy(struct ac_llvm_context *ctx,
- bool has_ds_bpermute,
   uint32_t mask,
   int idx,
   LLVMValueRef val);

  #define AC_SENDMSG_GS 2
  #define AC_SENDMSG_GS_DONE 3

  #define AC_SENDMSG_GS_OP_NOP  (0 << 4)
  #define AC_SENDMSG_GS_OP_CUT  (1 << 4)
  #define AC_SENDMSG_GS_OP_EMIT (2 << 4)
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index c0c4441022a..bf4b3ca6521 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1407,40 +1407,37 @@ static LLVMValueRef emit_unpack_half_2x16(struct 
ac_llvm_context *ctx,
 return result;
  }

  static LLVMValueRef emit_ddxy(struct ac_nir_context *ctx,
   nir_op op,
   LLVMValueRef src0)
  {
 unsigned mask;
 int idx;
 LLVMValueRef result;
-   bool has_ds_bpermute = ctx->abi->chip_class >= VI;

 if (op == nir_op_fddx_fine || op == nir_op_fddx)
 mask = AC_TID_MASK_LEFT;
 else if (op == nir_op_fddy_fine || op == nir_op_fddy)
 mask = AC_TID_MASK_TOP;
 else
 mask = AC_TID_MASK_TOP_LEFT;

 /* for DDX we want to next X pixel, DDY next Y pixel. */
 if (op == nir_op_fddx_fine ||
 op == nir_op_fddx_coarse ||
 op == nir_op_fddx)
 idx = 1;
 else
 idx = 2;

-   result = ac_build_ddxy(>ac, has_ds_bpermute,
- mask, idx,
- src0);
+   result = ac_build_ddxy(>ac, mask, idx, src0);
 return result;
  }

  /*
   * this takes an I,J coordinate pair,
   * and works out the X and Y derivatives.
   * it returns DDX(I), DDX(J), DDY(I), DDY(J).
   */
  static LLVMValueRef emit_ddxy_interp(
 struct ac_nir_context *ctx,
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index ca2e055a90e..bb1362f1cfc 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -1037,21 +1037,20 @@ struct pipe_screen *radeonsi_screen_create(struct 

Re: [Mesa-dev] [PATCH 2/5] amd/common: round cube array slice in ac_prepare_cube_coords

2017-09-13 Thread Dave Airlie
On 14 September 2017 at 03:04, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> The NIR-to-LLVM pass already does this; now the same fix covers
> radeonsi as well.
>
> Fixes various tests of
> dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*

Nice to see you guys catch up, I think I asked about a lot of these workarounds
when I wrote them and didn't get much response :-), other than radeonsi doesn't
need them, looks like it did after all.

Reviewed-by: Dave Airlie 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv/ac: bump params array for image atomic comp swap

2017-09-13 Thread Dave Airlie
On 13 September 2017 at 23:36, Andres Gomez  wrote:
> Hi Dave,
>
> This patch landed tagged for 17.2 only. Was it, then, not nominated for
> 17.1 intentionally ?

Actually good point, this one should be in 17.1 as well.

Thanks,
Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH 4/4] ac/surface: match Z and stencil tile config

2017-09-13 Thread Nicolai Hähnle

On 13.09.2017 20:28, Emil Velikov wrote:

Hi Nicolai,

I'm doing a final run through the lists for 17.2.1 and doesn't seem
like this landed in master.
Just a friendly poke, in case it fell through the cracks.


Are you absolutely sure about that? ;-)

Admittedly it's only been ~3 hours since I pushed this.

Cheers,
Nicolai


-Emil




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] amd/common: remove has_ds_bpermute argument from ac_build_ddxy

2017-09-13 Thread Connor Abbott
On Wed, Sep 13, 2017 at 3:46 PM, Nicolai Hähnle  wrote:
> On 13.09.2017 20:45, Connor Abbott wrote:
>>
>> Not sure if we'll want to do this, since we'll need to need to
>> effectively revert it anyways when we implement derivatives with DPP
>> (although we'll have to rename has_ds_bpermute to has_dpp...).
>
>
> Is there a reason for not deriving has_dpp from chip_class?

Yeah, good point. Maybe chip_class wasn't available when the field was added?

>
> Cheers,
> Nicolai
>
>
>>
>> On Wed, Sep 13, 2017 at 1:04 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> From: Nicolai Hähnle 
>>>
>>> ---
>>>   src/amd/common/ac_llvm_build.c   | 3 +--
>>>   src/amd/common/ac_llvm_build.h   | 1 -
>>>   src/amd/common/ac_nir_to_llvm.c  | 5 +
>>>   src/gallium/drivers/radeonsi/si_pipe.c   | 1 -
>>>   src/gallium/drivers/radeonsi/si_pipe.h   | 1 -
>>>   src/gallium/drivers/radeonsi/si_shader.c | 3 +--
>>>   6 files changed, 3 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/src/amd/common/ac_llvm_build.c
>>> b/src/amd/common/ac_llvm_build.c
>>> index 4077bd81bbc..6c010e8c3a6 100644
>>> --- a/src/amd/common/ac_llvm_build.c
>>> +++ b/src/amd/common/ac_llvm_build.c
>>> @@ -965,29 +965,28 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
>>>* So, masking the TID with 0xfffc yields the TID of the top left
>>> pixel
>>>* of the quad, masking with 0xfffd yields the TID of the top pixel
>>> of
>>>* the current pixel's column, and masking with 0xfffe yields the
>>> TID
>>>* of the left pixel of the current pixel's row.
>>>*
>>>* Adding 1 yields the TID of the pixel to the right of the left pixel,
>>> and
>>>* adding 2 yields the TID of the pixel below the top pixel.
>>>*/
>>>   LLVMValueRef
>>>   ac_build_ddxy(struct ac_llvm_context *ctx,
>>> - bool has_ds_bpermute,
>>>uint32_t mask,
>>>int idx,
>>>LLVMValueRef val)
>>>   {
>>>  LLVMValueRef tl, trbl, args[2];
>>>  LLVMValueRef result;
>>>
>>> -   if (has_ds_bpermute) {
>>> +   if (ctx->chip_class >= VI) {
>>>  LLVMValueRef thread_id, tl_tid, trbl_tid;
>>>  thread_id = ac_get_thread_id(ctx);
>>>
>>>  tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
>>>LLVMConstInt(ctx->i32, mask,
>>> false), "");
>>>
>>>  trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
>>>  LLVMConstInt(ctx->i32, idx,
>>> false), "");
>>>
>>>  args[0] = LLVMBuildMul(ctx->builder, tl_tid,
>>> diff --git a/src/amd/common/ac_llvm_build.h
>>> b/src/amd/common/ac_llvm_build.h
>>> index b6434893cfa..3f93551330c 100644
>>> --- a/src/amd/common/ac_llvm_build.h
>>> +++ b/src/amd/common/ac_llvm_build.h
>>> @@ -187,21 +187,20 @@ LLVMValueRef ac_build_buffer_load_format(struct
>>> ac_llvm_context *ctx,
>>>
>>>   LLVMValueRef
>>>   ac_get_thread_id(struct ac_llvm_context *ctx);
>>>
>>>   #define AC_TID_MASK_TOP_LEFT 0xfffc
>>>   #define AC_TID_MASK_TOP  0xfffd
>>>   #define AC_TID_MASK_LEFT 0xfffe
>>>
>>>   LLVMValueRef
>>>   ac_build_ddxy(struct ac_llvm_context *ctx,
>>> - bool has_ds_bpermute,
>>>uint32_t mask,
>>>int idx,
>>>LLVMValueRef val);
>>>
>>>   #define AC_SENDMSG_GS 2
>>>   #define AC_SENDMSG_GS_DONE 3
>>>
>>>   #define AC_SENDMSG_GS_OP_NOP  (0 << 4)
>>>   #define AC_SENDMSG_GS_OP_CUT  (1 << 4)
>>>   #define AC_SENDMSG_GS_OP_EMIT (2 << 4)
>>> diff --git a/src/amd/common/ac_nir_to_llvm.c
>>> b/src/amd/common/ac_nir_to_llvm.c
>>> index c0c4441022a..bf4b3ca6521 100644
>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>> @@ -1407,40 +1407,37 @@ static LLVMValueRef emit_unpack_half_2x16(struct
>>> ac_llvm_context *ctx,
>>>  return result;
>>>   }
>>>
>>>   static LLVMValueRef emit_ddxy(struct ac_nir_context *ctx,
>>>nir_op op,
>>>LLVMValueRef src0)
>>>   {
>>>  unsigned mask;
>>>  int idx;
>>>  LLVMValueRef result;
>>> -   bool has_ds_bpermute = ctx->abi->chip_class >= VI;
>>>
>>>  if (op == nir_op_fddx_fine || op == nir_op_fddx)
>>>  mask = AC_TID_MASK_LEFT;
>>>  else if (op == nir_op_fddy_fine || op == nir_op_fddy)
>>>  mask = AC_TID_MASK_TOP;
>>>  else
>>>  mask = AC_TID_MASK_TOP_LEFT;
>>>
>>>  /* for DDX we want to next X pixel, DDY next Y pixel. */
>>>  if (op == nir_op_fddx_fine ||
>>>  op == nir_op_fddx_coarse ||
>>>  op == nir_op_fddx)
>>>  idx = 1;
>>>  else
>>>  idx = 2;
>>>
>>> -   result = ac_build_ddxy(>ac, has_ds_bpermute,
>>> - mask, idx,

Re: [Mesa-dev] [PATCH 01/23] vulkan: Import vk_android_native_buffer.h

2017-09-13 Thread Chad Versace
On Wed 13 Sep 2017, Emil Velikov wrote:
> On 2 September 2017 at 09:17, Chad Versace  wrote:
> > Just as Mesa imports the Khronos Vulkan headers, it should import this
> > Android-private Vulkan header too. This guarantees that Mesa will
> > continue to build even when upstream Android breaks header
> > compatibility.
> >
> That's the thing - upstream should not break. Although I share your
> concern/focus to keep Mesa building/working.

I agree, it shouldn't. But Android has done that before; it broke
backwards compatibility in frameworks headers between major releases (L
-> N -> O). (Though I don't recall the details).

> I'm wondering when upstream changes the API/ABI in a non-backward
> compatible manner.
> There will be interesting corruption/crashes + we'll need to still "fix" Mesa.

Right. But at least Mesa will remain bisectable because it will continue
to build.

> There are a few other concerns though. See below.
> 
> > --- /dev/null
> > +++ b/include/vulkan/vk_android_native_buffer.h
> 
> > +#ifndef __VK_ANDROID_NATIVE_BUFFER_H__
> > +#define __VK_ANDROID_NATIVE_BUFFER_H__
> > +
> > +#include 
> 
> As mentioned in 23/23 as-is this will pull the above header as
> dependency for normal Linux builds.
> Regardless of my input in said patch, one could use a local
> redeclaration for "buffer_handle_t", which is seemingly the only
> reason why this include exists?

Patch 2 in the series fixes this problem with a small hack.

> > +#include 
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#define VK_ANDROID_native_buffer 1
> > +
> > +#define VK_ANDROID_NATIVE_BUFFER_EXTENSION_NUMBER 11
> > +#define VK_ANDROID_NATIVE_BUFFER_SPEC_VERSION 5
> > +#define VK_ANDROID_NATIVE_BUFFER_EXTENSION_NAME   
> > "VK_ANDROID_native_buffer"
> > +

> Version does not match the one in vk.xml, fortunately the rest do.

The version matches that in the vk_android_native_buffer.xml that I add
in patch 3.

> None of the remainder of this file has equivalent in vk.xml. That
> combined with the deprecated notations indicates that the API/ABI is
> not stable yet. Thus my earlier concern.

Yes, this API seems unstable to me too. That's why it's so important to
import the header.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] util/build_id: Include

2017-09-13 Thread Matt Turner
Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-13 Thread Denis Pauk
Could you please commit this changes?

I have not found what i need to do next after recieve "approve" for changes
in https://www.mesa3d.org/submittingpatches.html#reviewing.
Do i need to send new mail with "Reviewed-by: " mark?

On Wed, Sep 13, 2017 at 6:36 PM, Marek Olšák  wrote:

> I've changed my mind. The patch is OK:
>
> Reviewed-by: Marek Olšák 
>
> Marek
>
> On Wed, Sep 13, 2017 at 4:45 PM, Denis Pauk  wrote:
> > What do you think about replace all checks in patch to asserts?
> >
> >
> > Best regards,
> >   Denis.
> >
> > On Sep 13, 2017 1:00 PM, "Marek Olšák"  wrote:
> >>
> >> On Wed, Sep 13, 2017 at 6:54 AM, Денис Паук 
> wrote:
> >> > Do you mean delete check in u_format.c:: util_format_is_supported?
> Could
> >> > you
> >> > please explain more?
> >>
> >> I mean delete everything. Invalid formats shouldn't be passed to
> >> pipe_screen::is_format_supported. The code doing it needs to be fixed.
> >>
> >> Marek
>



-- 
Best regards,
  Denis.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102710] vkCmdBlitImage with arrayLayers > 1 fails

2017-09-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102710

--- Comment #1 from Bas Nieuwenhuizen  ---
Do you have a testcase?

Could you also tell a bit more about what you do and what happens? Do you use
VK_REMAINING_ARRAY_LAYERS? Is the result only the first layer being written to,
all layers containing layer 0 of the src, or some other recognizable pattern?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] util/build_id: Include

2017-09-13 Thread Chad Versace
Fix the build for Android Nougat.

The dladdr(3) manpage says that  is required. On Linux, the
build succeeded without it because build_id.c includes  which
includes . On Android, we must include  directly.

Fixes: 5c98d382 "util: Query build-id by symbol address, not library name"
Cc: Matt Turner 
---
 src/util/build_id.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/util/build_id.c b/src/util/build_id.c
index 6280b4a54e..536c74360e 100644
--- a/src/util/build_id.c
+++ b/src/util/build_id.c
@@ -22,6 +22,7 @@
  */
 
 #ifdef HAVE_DL_ITERATE_PHDR
+#include 
 #include 
 #include 
 #include 
-- 
2.13.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa] swr: use ARRAY_SIZE macro

2017-09-13 Thread Cherniak, Bruce
Regardless of making it a v2, etc. :-)  Sorry for the complication of not being 
able to just accept the original.

Reviewed-by: Bruce Cherniak  

> On Sep 12, 2017, at 8:03 AM, Eric Engestrom  wrote:
> 
> Signed-off-by: Eric Engestrom 
> ---
> src/gallium/drivers/swr/rasterizer/memory/StoreTile.h | 10 ++
> 1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/memory/StoreTile.h 
> b/src/gallium/drivers/swr/rasterizer/memory/StoreTile.h
> index c3d14e9509..67bcf94f00 100644
> --- a/src/gallium/drivers/swr/rasterizer/memory/StoreTile.h
> +++ b/src/gallium/drivers/swr/rasterizer/memory/StoreTile.h
> @@ -40,6 +40,8 @@
> #include 
> #include 
> 
> +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
> +
> // Function pointer to different storing functions for color, depth, and 
> stencil based on incoming formats.
> typedef void(*PFN_STORE_TILES)(uint8_t*, SWR_SURFACE_STATE*, uint32_t, 
> uint32_t, uint32_t);
> 
> @@ -1523,7 +1525,7 @@ struct OptStoreRasterTile< TilingTraits 64>, SrcFormat, DstFormat
> 
> pSrc += KNOB_SIMD16_WIDTH * SRC_BYTES_PER_PIXEL;
> 
> -for (uint32_t i = 0; i < sizeof(ppDsts) / sizeof(ppDsts[0]); i 
> += 1)
> +for (uint32_t i = 0; i < ARRAY_SIZE(ppDsts); i += 1)
> {
> ppDsts[i] += dy;
> }
> @@ -1641,7 +1643,7 @@ struct OptStoreRasterTile< TilingTraits 128>, SrcFormat, DstForma
> 
> pSrc += KNOB_SIMD16_WIDTH * SRC_BYTES_PER_PIXEL;
> 
> -for (uint32_t i = 0; i < sizeof(ppDsts) / sizeof(ppDsts[0]); i 
> += 1)
> +for (uint32_t i = 0; i < ARRAY_SIZE(ppDsts); i += 1)
> {
> ppDsts[i] += dy;
> }
> @@ -2124,7 +2126,7 @@ struct OptStoreRasterTile< 
> TilingTraits, SrcFormat, Ds
> 
> pSrc += KNOB_SIMD16_WIDTH * SRC_BYTES_PER_PIXEL;
> 
> -for (uint32_t i = 0; i < sizeof(ppDsts) / sizeof(ppDsts[0]); i 
> += 1)
> +for (uint32_t i = 0; i < ARRAY_SIZE(ppDsts); i += 1)
> {
> ppDsts[i] += dy;
> }
> @@ -2253,7 +2255,7 @@ struct OptStoreRasterTile< 
> TilingTraits, SrcFormat, D
> 
> pSrc += KNOB_SIMD16_WIDTH * SRC_BYTES_PER_PIXEL;
> 
> -for (uint32_t i = 0; i < sizeof(ppDsts) / sizeof(ppDsts[0]); i 
> += 1)
> +for (uint32_t i = 0; i < ARRAY_SIZE(ppDsts); i += 1)
> {
> ppDsts[i] += dy;
> }
> -- 
> Cheers,
>  Eric
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102710] vkCmdBlitImage with arrayLayers > 1 fails

2017-09-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102710

Bug ID: 102710
   Summary: vkCmdBlitImage with arrayLayers > 1 fails
   Product: Mesa
   Version: 17.2
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: mais...@archlinux.us
QA Contact: mesa-dev@lists.freedesktop.org

vkCmdBlitImage with number of layers > 1 seems to fail. The workaround is to do
N individual vkCmdBlitImages.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] amd/common: remove has_ds_bpermute argument from ac_build_ddxy

2017-09-13 Thread Connor Abbott
Not sure if we'll want to do this, since we'll need to need to
effectively revert it anyways when we implement derivatives with DPP
(although we'll have to rename has_ds_bpermute to has_dpp...).

On Wed, Sep 13, 2017 at 1:04 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/amd/common/ac_llvm_build.c   | 3 +--
>  src/amd/common/ac_llvm_build.h   | 1 -
>  src/amd/common/ac_nir_to_llvm.c  | 5 +
>  src/gallium/drivers/radeonsi/si_pipe.c   | 1 -
>  src/gallium/drivers/radeonsi/si_pipe.h   | 1 -
>  src/gallium/drivers/radeonsi/si_shader.c | 3 +--
>  6 files changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 4077bd81bbc..6c010e8c3a6 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -965,29 +965,28 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
>   * So, masking the TID with 0xfffc yields the TID of the top left pixel
>   * of the quad, masking with 0xfffd yields the TID of the top pixel of
>   * the current pixel's column, and masking with 0xfffe yields the TID
>   * of the left pixel of the current pixel's row.
>   *
>   * Adding 1 yields the TID of the pixel to the right of the left pixel, and
>   * adding 2 yields the TID of the pixel below the top pixel.
>   */
>  LLVMValueRef
>  ac_build_ddxy(struct ac_llvm_context *ctx,
> - bool has_ds_bpermute,
>   uint32_t mask,
>   int idx,
>   LLVMValueRef val)
>  {
> LLVMValueRef tl, trbl, args[2];
> LLVMValueRef result;
>
> -   if (has_ds_bpermute) {
> +   if (ctx->chip_class >= VI) {
> LLVMValueRef thread_id, tl_tid, trbl_tid;
> thread_id = ac_get_thread_id(ctx);
>
> tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
>   LLVMConstInt(ctx->i32, mask, false), 
> "");
>
> trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
> LLVMConstInt(ctx->i32, idx, false), 
> "");
>
> args[0] = LLVMBuildMul(ctx->builder, tl_tid,
> diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
> index b6434893cfa..3f93551330c 100644
> --- a/src/amd/common/ac_llvm_build.h
> +++ b/src/amd/common/ac_llvm_build.h
> @@ -187,21 +187,20 @@ LLVMValueRef ac_build_buffer_load_format(struct 
> ac_llvm_context *ctx,
>
>  LLVMValueRef
>  ac_get_thread_id(struct ac_llvm_context *ctx);
>
>  #define AC_TID_MASK_TOP_LEFT 0xfffc
>  #define AC_TID_MASK_TOP  0xfffd
>  #define AC_TID_MASK_LEFT 0xfffe
>
>  LLVMValueRef
>  ac_build_ddxy(struct ac_llvm_context *ctx,
> - bool has_ds_bpermute,
>   uint32_t mask,
>   int idx,
>   LLVMValueRef val);
>
>  #define AC_SENDMSG_GS 2
>  #define AC_SENDMSG_GS_DONE 3
>
>  #define AC_SENDMSG_GS_OP_NOP  (0 << 4)
>  #define AC_SENDMSG_GS_OP_CUT  (1 << 4)
>  #define AC_SENDMSG_GS_OP_EMIT (2 << 4)
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index c0c4441022a..bf4b3ca6521 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -1407,40 +1407,37 @@ static LLVMValueRef emit_unpack_half_2x16(struct 
> ac_llvm_context *ctx,
> return result;
>  }
>
>  static LLVMValueRef emit_ddxy(struct ac_nir_context *ctx,
>   nir_op op,
>   LLVMValueRef src0)
>  {
> unsigned mask;
> int idx;
> LLVMValueRef result;
> -   bool has_ds_bpermute = ctx->abi->chip_class >= VI;
>
> if (op == nir_op_fddx_fine || op == nir_op_fddx)
> mask = AC_TID_MASK_LEFT;
> else if (op == nir_op_fddy_fine || op == nir_op_fddy)
> mask = AC_TID_MASK_TOP;
> else
> mask = AC_TID_MASK_TOP_LEFT;
>
> /* for DDX we want to next X pixel, DDY next Y pixel. */
> if (op == nir_op_fddx_fine ||
> op == nir_op_fddx_coarse ||
> op == nir_op_fddx)
> idx = 1;
> else
> idx = 2;
>
> -   result = ac_build_ddxy(>ac, has_ds_bpermute,
> - mask, idx,
> - src0);
> +   result = ac_build_ddxy(>ac, mask, idx, src0);
> return result;
>  }
>
>  /*
>   * this takes an I,J coordinate pair,
>   * and works out the X and Y derivatives.
>   * it returns DDX(I), DDX(J), DDY(I), DDY(J).
>   */
>  static LLVMValueRef emit_ddxy_interp(
> struct ac_nir_context *ctx,
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index ca2e055a90e..bb1362f1cfc 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -1037,21 +1037,20 @@ struct pipe_screen 

Re: [Mesa-dev] [PATCH] intel/eu/validate: Look up types on demand in execution_type()

2017-09-13 Thread Emil Velikov
Hi all,

On 31 August 2017 at 23:48, Jason Ekstrand  wrote:
> We are looking up the execution type prior to checking how many sources
> we have.  This leads to looking for a type for src1 on MOV instructions
> which is bogus.  On BDW+, the src1 register type overlaps with the
> 64-bit immediate and causes us problems.
> ---
>  src/intel/compiler/brw_eu_validate.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/src/intel/compiler/brw_eu_validate.c 
> b/src/intel/compiler/brw_eu_validate.c
> index 249342f..d72c0a0 100644
> --- a/src/intel/compiler/brw_eu_validate.c
> +++ b/src/intel/compiler/brw_eu_validate.c
> @@ -300,15 +300,13 @@ execution_type(const struct gen_device_info *devinfo, 
> const brw_inst *inst)
>  {
> unsigned num_sources = num_sources_from_inst(devinfo, inst);
> enum brw_reg_type src0_exec_type, src1_exec_type;
> -   enum brw_reg_type src0_type = brw_inst_src0_type(devinfo, inst);
> -   enum brw_reg_type src1_type = brw_inst_src1_type(devinfo, inst);
>
> /* Execution data type is independent of destination data type, except in
>  * mixed F/HF instructions on CHV and SKL+.
>  */
> enum brw_reg_type dst_exec_type = brw_inst_dst_type(devinfo, inst);
>
> -   src0_exec_type = execution_type_for_type(src0_type);
> +   src0_exec_type = execution_type_for_type(brw_inst_src0_type(devinfo, 
> inst));
> if (num_sources == 1) {
>if ((devinfo->gen >= 9 || devinfo->is_cherryview) &&
>src0_exec_type == BRW_REGISTER_TYPE_HF) {
> @@ -317,7 +315,7 @@ execution_type(const struct gen_device_info *devinfo, 
> const brw_inst *inst)
>return src0_exec_type;
> }
>
> -   src1_exec_type = execution_type_for_type(src1_type);
> +   src1_exec_type = execution_type_for_type(brw_inst_src1_type(devinfo, 
> inst));

As Andres pointed out this commit depends on 4fab67a4415 et al, which
seems to be missing in 17.1 and 17.2.

I've ported this to 17.2 by moving the brw_inst_src[01]_reg_type()
calls as in the original patch.
The brw_inst_src[01]_reg_file() ones are left as-is.

Let me know if extra porting is required.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH 4/4] ac/surface: match Z and stencil tile config

2017-09-13 Thread Emil Velikov
Hi Nicolai,

I'm doing a final run through the lists for 17.2.1 and doesn't seem
like this landed in master.
Just a friendly poke, in case it fell through the cracks.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] amd/common: add chip_class to ac_llvm_context

2017-09-13 Thread Nicolai Hähnle

On 13.09.2017 19:25, Emil Velikov wrote:

Hi Nicolai,

On 13 September 2017 at 18:04, Nicolai Hähnle  wrote:

From: Nicolai Hähnle 

---
  src/amd/common/ac_llvm_build.c  | 5 -
  src/amd/common/ac_llvm_build.h  | 7 ++-
  src/amd/common/ac_nir_to_llvm.c | 4 ++--
  src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 2 +-
  4 files changed, 13 insertions(+), 5 deletions(-)


What's the difference between these three?

ac_llvm_context::chip_class


This one is newly introduced.


ac_nir_compiler_options::chip_class


This one is used to communicate the chip_class from radv to amd/common 
and will most likely stay.



ac_shader_abi::chip_class


This one is (among other things) used to communicate the chip_class from 
radv/radeonsi to the driver-independent parts of ac_nir_to_llvm, though 
we can phase it out in favor of ac_llvm_context::chip_class.


Cheers,
Nicolai



Is it be possible to remove the unneeded/duplicated ones?





-Emil




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] amd/common: add chip_class to ac_llvm_context

2017-09-13 Thread Emil Velikov
Hi Nicolai,

On 13 September 2017 at 18:04, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/amd/common/ac_llvm_build.c  | 5 -
>  src/amd/common/ac_llvm_build.h  | 7 ++-
>  src/amd/common/ac_nir_to_llvm.c | 4 ++--
>  src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 2 +-
>  4 files changed, 13 insertions(+), 5 deletions(-)
>
What's the difference between these three?

ac_llvm_context::chip_class
ac_nir_compiler_options::chip_class
ac_shader_abi::chip_class

Is it be possible to remove the unneeded/duplicated ones?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/eu/validate: Look up types on demand in execution_type()

2017-09-13 Thread Matt Turner
On Wed, Sep 13, 2017 at 6:03 AM, Andres Gomez  wrote:
> Jason, this patch landed tagged for mesa-stable but without mentioning
> any specific branch.
>
> For 17.1 it depends on earlier commit 4fab67a4415 which did not land in
> the branch so I'm keeping it out.
>
> I hope this is OK with you.

Yes, thank you.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/st/tests: Correct build flags and force -std=c++11

2017-09-13 Thread Emil Velikov
On 13 September 2017 at 13:42, Gert Wollny  wrote:
> Am Mittwoch, den 13.09.2017, 13:11 +0100 schrieb Emil Velikov:
>> Hi Gert,
>>
>> On 13 September 2017 at 09:32, Gert Wollny 
>> wrote:
>> > Include src/gallium/Automake.inc, correct the build flags
>> > accordingly, and force -std=c++11 because only when building
>> > against llvm and only with llvm >= 3.9 LLVM_CXXFLAGS
>> > actually provides -std=c++11.
>> >
>>
>> Is the one character change is the only reason behind the c++11
>> toggle?
>
> No, the code uses c++11 initializer lists quite heavily.
>
Right, those could be reworked rather easily, ala "s/const
vector/const LocalClass[]/"
But that's for someone who feels strongly about them ;-)

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >