[Mesa-dev] [PATCH 06/12] nir: Lower flrp(#a, #b, c) differently

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

If the magnitudes of #a and #b are such that (b-a) won't lose too much
precision, lower as a+c(b-a).
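
A minimal sketch of the magnitude test described above, assuming plain single-precision operands rather than NIR constant values. The helper names `similar_magnitudes` and `flrp_fast` are illustrative and do not appear in the patch; the 23 / 2 threshold mirrors the split used in the diff below.

```c
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when the binary exponents of a and b differ by
 * at most 23 / 2, so that (b - a) keeps a reasonable share of the mantissa.
 */
static bool
similar_magnitudes(float a, float b)
{
   int exp_a;
   int exp_b;

   /* Only the exponents are needed; the returned mantissas are ignored. */
   frexpf(a, &exp_a);
   frexpf(b, &exp_b);

   return abs(exp_a - exp_b) <= (23 / 2);
}

/* flrp(a, b, c) lowered as a + c(b - a). */
static float
flrp_fast(float a, float b, float c)
{
   return a + c * (b - a);
}
```

With these definitions, 1.0 and 2.0 count as similar, while 1.0 and 1.0e10 do not.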

I have CC'ed everyone responsible for drivers that set lower_flrp32
or lower_flrp64.

No changes on any other Intel platforms.

Iron Lake
total instructions in shared programs: 7754921 -> 7754801 (<.01%)
instructions in affected programs: 18457 -> 18337 (-0.65%)
helped: 68
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.76 x̃: 1
helped stats (rel) min: 0.19% max: 7.89% x̄: 1.09% x̃: 0.43%
95% mean confidence interval for instructions value: -2.48 -1.05
95% mean confidence interval for instructions %-change: -1.56% -0.63%
Instructions are helped.

total cycles in shared programs: 177924080 -> 177923500 (<.01%)
cycles in affected programs: 744664 -> 744084 (-0.08%)
helped: 62
HURT: 0
helped stats (abs) min: 4 max: 60 x̄: 9.35 x̃: 6
helped stats (rel) min: 0.02% max: 4.84% x̄: 0.27% x̃: 0.06%
95% mean confidence interval for cycles value: -12.37 -6.34
95% mean confidence interval for cycles %-change: -0.48% -0.06%
Cycles are helped.

GM45
total instructions in shared programs: 4784265 -> 4784205 (<.01%)
instructions in affected programs: 9352 -> 9292 (-0.64%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.76 x̃: 1
helped stats (rel) min: 0.19% max: 7.50% x̄: 1.08% x̃: 0.43%
95% mean confidence interval for instructions value: -2.80 -0.73
95% mean confidence interval for instructions %-change: -1.74% -0.41%
Instructions are helped.

total cycles in shared programs: 122042268 -> 122041920 (<.01%)
cycles in affected programs: 426790 -> 426442 (-0.08%)
helped: 31
HURT: 0
helped stats (abs) min: 6 max: 60 x̄: 11.23 x̃: 6
helped stats (rel) min: 0.02% max: 4.84% x̄: 0.29% x̃: 0.06%
95% mean confidence interval for cycles value: -16.33 -6.12
95% mean confidence interval for cycles %-change: -0.63% 0.04%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Rob Clark 
Cc: Eric Anholt 
Cc: Dave Airlie 
Cc: Timothy Arceri 
---
 src/compiler/nir/nir_lower_flrp.c | 68 +++
 1 file changed, 68 insertions(+)

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index 3240445e18f..0c7e803b20f 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -20,6 +20,7 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
  * IN THE SOFTWARE.
  */
+#include <math.h>
 #include "nir.h"
 #include "nir_builder.h"
 #include "util/u_vector.h"
@@ -136,6 +137,58 @@ replace_with_fast(struct nir_builder *bld, struct u_vector *dead_flrp,
append_flrp_to_dead_list(dead_flrp, alu);
 }
 
+static bool
+sources_are_constants_with_similar_magnitudes(const nir_alu_instr *instr)
+{
+   nir_const_value *val0 = nir_src_as_const_value(instr->src[0].src);
+   nir_const_value *val1 = nir_src_as_const_value(instr->src[1].src);
+
+   if (val0 == NULL || val1 == NULL)
+  return false;
+
+   const uint8_t *const swizzle0 = instr->src[0].swizzle;
+   const uint8_t *const swizzle1 = instr->src[1].swizzle;
+   const unsigned num_components = nir_dest_num_components(instr->dest.dest);
+
+   if (instr->dest.dest.ssa.bit_size == 32) {
+  for (unsigned i = 0; i < num_components; i++) {
+ int exp0;
+ int exp1;
+
+ frexpf(val0->f32[swizzle0[i]], &exp0);
+ frexpf(val1->f32[swizzle1[i]], &exp1);
+
+ /* If the difference between exponents is >= 24, then A+B will always
+  * have the value of whichever of A and B has the larger absolute
+  * value.  So, [0, 23] is the valid range.  The smaller the limit
+  * value, the more precision will be maintained at a potential
+  * performance cost.  Somewhat arbitrarily split the range in half.
+  */
+ if (abs(exp0 - exp1) > (23 / 2))
+return false;
+  }
+   } else {
+  for (unsigned i = 0; i < num_components; i++) {
+ int exp0;
+ int exp1;
+
+ frexp(val0->f64[swizzle0[i]], &exp0);
+ frexp(val1->f64[swizzle1[i]], &exp1);
+
+ /* If the difference between exponents is >= 53, then A+B will always
+  * have the value of whichever of A and B has the larger absolute
+  * value.  So, [0, 52] is the valid range.  The smaller the limit
+  * value, the more precision will be maintained at a potential
+  * performance cost.  Somewhat arbitrarily split the range in half.
+  */
+ if (abs(exp0 - exp1) > (52 / 2))
+return false;
+  }
+   }
+
+   return true;
+}
+
 static void
 convert_flrp_instruction(nir_builder *bld,
  struct u_vector *dead_flrp,
@@ -197,6 +250,21 @@ convert_flrp_instruction(nir_builder *bld,
   return;
}
 
+   /*
+* - If x and y are both immediates and the relative magnitude of the
+*   values is similar (such that x-y does not lose too much precision):
+*
+  

[Mesa-dev] [PATCH 10/12] nir: Lower flrp(a, b, #c) differently

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

This doesn't help on Intel GPUs now because we always take the
"always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".

I have CC'ed everyone responsible for drivers that set lower_flrp32
or lower_flrp64.

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Rob Clark 
Cc: Eric Anholt 
Cc: Dave Airlie 
Cc: Timothy Arceri 
---
 src/compiler/nir/nir_lower_flrp.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index 1a3c55d07a2..24282c3cbcf 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -555,6 +555,23 @@ convert_flrp_instruction(nir_builder *bld,
   }
}
 
+   /*
+* - If t is constant:
+*
+*x(1 - t) + yt
+*
+*   The cost is three instructions without FMA or two instructions with
+*   FMA.  This is the same cost as the imprecise lowering, but it gives
+*   the instruction scheduler a little more freedom.
+*
+*   There is no need to handle t = 0.5 specially.  nir_opt_algebraic
+*   already has optimizations to convert 0.5x + 0.5y to 0.5(x + y).
+*/
+   if (alu->src[2].src.ssa->parent_instr->type == nir_instr_type_load_const) {
+  replace_with_strict(bld, dead_flrp, alu);
+  return;
+   }
+
/*
 * - Otherwise
 *
-- 
2.14.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
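
A minimal sketch of the constant-t case from the patch above, assuming t is known at compile time. The macro `FLRP_T` and the function `flrp_const_t` are illustrative names only. Because (1 - t) folds to a constant, what remains is two independent multiplies and one add, which is where the extra scheduling freedom comes from.

```c
/* Hypothetical compile-time interpolation factor. */
#define FLRP_T 0.25f

/* x(1 - t) + yt with t constant. */
static float
flrp_const_t(float x, float y)
{
   /* A compiler can fold this subtraction, so it costs no instruction. */
   const float one_minus_t = 1.0f - FLRP_T;

   /* The two products do not depend on each other. */
   return x * one_minus_t + y * FLRP_T;
}
```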


[Mesa-dev] [PATCH 07/12] nir: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

I have CC'ed everyone responsible for drivers that set lower_flrp32
or lower_flrp64.
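
For reference, the a = ±1 case in the subject line follows from expanding a(1 - c) + bc: with a = 1 it becomes (bc - c) + 1, and with a = -1 it becomes (bc + c) - 1. A minimal sketch on plain floats, assuming only the first-source case; the function names are illustrative and are not the ones used in the diff.

```c
/* Reference form: a(1 - c) + bc. */
static float
flrp_strict(float a, float b, float c)
{
   return a * (1.0f - c) + b * c;
}

/* (b*c ± c) + a.  Only valid when a is exactly +1 (subtract c) or -1
 * (add c); any other value of a is outside what this sketch covers.
 */
static float
flrp_a_is_unit(float a, float b, float c)
{
   const float b_times_c = b * c;
   const float inner_sum = (a == 1.0f) ? b_times_c - c : b_times_c + c;

   return inner_sum + a;
}
```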

No changes on any other Intel platforms.

Iron Lake
total instructions in shared programs: 7752306 -> 7716901 (-0.46%)
instructions in affected programs: 1160861 -> 1125456 (-3.05%)
helped: 4020
HURT: 10
helped stats (abs) min: 1 max: 40 x̄: 8.81 x̃: 9
helped stats (rel) min: 0.20% max: 86.96% x̄: 4.99% x̃: 3.05%
HURT stats (abs)   min: 1 max: 2 x̄: 1.20 x̃: 1
HURT stats (rel)   min: 1.06% max: 3.92% x̄: 1.62% x̃: 1.06%
95% mean confidence interval for instructions value: -8.93 -8.64
95% mean confidence interval for instructions %-change: -5.15% -4.79%
Instructions are helped.

total cycles in shared programs: 177868254 -> 177689740 (-0.10%)
cycles in affected programs: 26413132 -> 26234618 (-0.68%)
helped: 3927
HURT: 72
helped stats (abs) min: 2 max: 646 x̄: 45.66 x̃: 48
helped stats (rel) min: <.01% max: 94.58% x̄: 2.38% x̃: 0.88%
HURT stats (abs)   min: 2 max: 406 x̄: 10.75 x̃: 6
HURT stats (rel)   min: <.01% max: 2.77% x̄: 0.19% x̃: 0.02%
95% mean confidence interval for cycles value: -45.58 -43.70
95% mean confidence interval for cycles %-change: -2.47% -2.20%
Cycles are helped.

LOST:   3
GAINED: 35

GM45
total instructions in shared programs: 4760579 -> 4741934 (-0.39%)
instructions in affected programs: 643230 -> 624585 (-2.90%)
helped: 2165
HURT: 9
helped stats (abs) min: 1 max: 40 x̄: 8.62 x̃: 9
helped stats (rel) min: 0.20% max: 86.96% x̄: 4.74% x̃: 2.87%
HURT stats (abs)   min: 1 max: 2 x̄: 1.11 x̃: 1
HURT stats (rel)   min: 1.06% max: 3.77% x̄: 1.36% x̃: 1.06%
95% mean confidence interval for instructions value: -8.77 -8.38
95% mean confidence interval for instructions %-change: -4.95% -4.48%
Instructions are helped.

total cycles in shared programs: 121648572 -> 121542280 (-0.09%)
cycles in affected programs: 16923170 -> 16816878 (-0.63%)
helped: 2114
HURT: 51
helped stats (abs) min: 2 max: 646 x̄: 50.61 x̃: 50
helped stats (rel) min: <.01% max: 93.33% x̄: 2.39% x̃: 0.90%
HURT stats (abs)   min: 4 max: 406 x̄: 13.84 x̃: 6
HURT stats (rel)   min: <.01% max: 2.77% x̄: 0.19% x̃: 0.01%
95% mean confidence interval for cycles value: -50.56 -47.63
95% mean confidence interval for cycles %-change: -2.52% -2.14%
Cycles are helped.

LOST:   38
GAINED: 38

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Rob Clark 
Cc: Eric Anholt 
Cc: Dave Airlie 
Cc: Timothy Arceri 
---
 src/compiler/nir/nir_lower_flrp.c | 134 ++
 1 file changed, 134 insertions(+)

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index 0c7e803b20f..b86f5b5f2df 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -137,6 +137,89 @@ replace_with_fast(struct nir_builder *bld, struct u_vector *dead_flrp,
append_flrp_to_dead_list(dead_flrp, alu);
 }
 
+/**
+ * Replace flrp(a, b, c) with (b*c ± c) + a
+ */
+static void
+replace_with_expanded_ffma_and_add(struct nir_builder *bld,
+   struct u_vector *dead_flrp,
+   struct nir_alu_instr *alu, bool subtract_c)
+{
+   nir_ssa_def *const a = nir_ssa_for_alu_src(bld, alu, 0);
+   nir_ssa_def *const b = nir_ssa_for_alu_src(bld, alu, 1);
+   nir_ssa_def *const c = nir_ssa_for_alu_src(bld, alu, 2);
+
+   nir_ssa_def *const b_times_c = nir_fmul(bld, b, c);
+   nir_instr_as_alu(b_times_c->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *inner_sum;
+
+   if (subtract_c) {
+  nir_ssa_def *const neg_c = nir_fneg(bld, c);
+  nir_instr_as_alu(neg_c->parent_instr)->exact = alu->exact;
+
+  inner_sum = nir_fadd(bld, b_times_c, neg_c);
+   } else {
+  inner_sum = nir_fadd(bld, b_times_c, c);
+   }
+
+   nir_instr_as_alu(inner_sum->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *const outer_sum = nir_fadd(bld, inner_sum, a);
+   nir_instr_as_alu(outer_sum->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def_rewrite_uses(&alu->dest.dest.ssa, nir_src_for_ssa(outer_sum));
+
+   /* DO NOT REMOVE the original flrp yet.  Many of the lowering choices are
+* based on other uses of the sources.  Removing the flrp may cause the
+* last flrp in a sequence to make a different, incorrect choice.
+*/
+   append_flrp_to_dead_list(dead_flrp, alu);
+}
+
+/**
+ * Determines whether a swizzled source is constant w/ all components the same.
+ *
+ * The value of the constant is stored in \c result.
+ *
+ * \return
+ * True if all components of the swizzled source are the same constant.
+ * Otherwise false is returned.
+ */
+static bool
+all_same_constant(const nir_alu_instr *instr, unsigned src, double *result)
+{
+   nir_const_value *val = nir_src_as_const_value(instr->src[src].src);
+
+   if (!val)
+  return false;
+
+   const uint8_t *const swizzle = instr->src[src].swizzle;
+   const unsigned num_components = nir_dest_num_components(instr->dest.dest);
+
+   if (instr->dest.dest.ssa.bit_size == 32) {
+  

[Mesa-dev] [PATCH 09/12] nir: Lower flrp(a, b, c) differently if another flrp(_, b, c) exists

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

There is little effect on Intel GPUs now because we almost always take
the "always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".

I have CC'ed everyone responsible for drivers that set lower_flrp32
or lower_flrp64.

No changes on any other Intel platforms.

Iron Lake
total cycles in shared programs: 178115276 -> 178115260 (<.01%)
cycles in affected programs: 14692 -> 14676 (-0.11%)
helped: 4
HURT: 0
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.09% max: 0.13% x̄: 0.11% x̃: 0.11%
95% mean confidence interval for cycles value: -4.00 -4.00
95% mean confidence interval for cycles %-change: -0.13% -0.09%
Cycles are helped.

GM45
total cycles in shared programs: 122015822 -> 122015806 (<.01%)
cycles in affected programs: 14692 -> 14676 (-0.11%)
helped: 4
HURT: 0
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.09% max: 0.13% x̄: 0.11% x̃: 0.11%
95% mean confidence interval for cycles value: -4.00 -4.00
95% mean confidence interval for cycles %-change: -0.13% -0.09%
Cycles are helped.

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Rob Clark 
Cc: Eric Anholt 
Cc: Dave Airlie 
Cc: Timothy Arceri 
---
 src/compiler/nir/nir_lower_flrp.c | 57 ++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index 87245c8ec62..1a3c55d07a2 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -69,6 +69,39 @@ replace_with_strict_ffma(struct nir_builder *bld, struct u_vector *dead_flrp,
append_flrp_to_dead_list(dead_flrp, alu);
 }
 
+/**
+ * Replace flrp(a, b, c) with ffma(a, (1 - c), bc)
+ */
+static void
+replace_with_single_ffma(struct nir_builder *bld, struct u_vector *dead_flrp,
+ struct nir_alu_instr *alu)
+{
+   nir_ssa_def *const a = nir_ssa_for_alu_src(bld, alu, 0);
+   nir_ssa_def *const b = nir_ssa_for_alu_src(bld, alu, 1);
+   nir_ssa_def *const c = nir_ssa_for_alu_src(bld, alu, 2);
+
+   nir_ssa_def *const neg_c = nir_fneg(bld, c);
+   nir_instr_as_alu(neg_c->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *const one_minus_c =
+  nir_fadd(bld, nir_imm_float(bld, 1.0f), neg_c);
+   nir_instr_as_alu(one_minus_c->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *const b_times_c = nir_fmul(bld, b, c);
+   nir_instr_as_alu(b_times_c->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *const final_ffma = nir_ffma(bld, a, one_minus_c, b_times_c);
+   nir_instr_as_alu(final_ffma->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def_rewrite_uses(&alu->dest.dest.ssa, nir_src_for_ssa(final_ffma));
+
+   /* DO NOT REMOVE the original flrp yet.  Many of the lowering choices are
+* based on other uses of the sources.  Removing the flrp may cause the
+* last flrp in a sequence to make a different, incorrect choice.
+*/
+   append_flrp_to_dead_list(dead_flrp, alu);
+}
+
 /**
  * Replace flrp(a, b, c) with a(1-c) + bc.
  */
@@ -476,6 +509,20 @@ convert_flrp_instruction(nir_builder *bld,
  replace_with_strict_ffma(bld, dead_flrp, alu);
  return;
   }
+
+  /*
+   * - If FMA is supported and another flrp(_, y, t) exists:
+   *
+   *fma(x, (1 - t), yt)
+   *
+   *   The hope is that the (1 - t) and the yt will be shared with the
+   *   other lowered flrp.  This results in 3 instructions for the first
+   *   flrp and 1 for each additional flrp.
+   */
+  if (st.src1_and_src2 > 0) {
+ replace_with_single_ffma(bld, dead_flrp, alu);
+ return;
+  }
} else {
   if (always_precise) {
  replace_with_strict(bld, dead_flrp, alu);
@@ -490,11 +537,19 @@ convert_flrp_instruction(nir_builder *bld,
*   The hope is that the x(1 - t) will be shared with the other lowered
*   flrp.  This results in 4 instructions for the first flrp and 2 for
*   each additional flrp.
+   *
+   * - If FMA is not supported and another flrp(_, y, t) exists:
+   *
+   *x(1 - t) + yt
+   *
+   *   The hope is that the (1 - t) and the yt will be shared with the
+   *   other lowered flrp.  This results in 4 instructions for the first
+   *   flrp and 2 for each additional flrp.
*/
   struct similar_flrp_stats st;
 
  get_similar_flrp_stats(alu, &st);
-  if (st.src0_and_src2 > 0) {
+  if (st.src0_and_src2 > 0 || st.src1_and_src2 > 0) {
  replace_with_strict(bld, dead_flrp, alu);
  return;
   }
-- 
2.14.4
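
A minimal sketch of the sharing this patch hopes for when several flrp(_, y, t) instructions exist, written on plain floats rather than NIR. The struct `shared_terms` and both function names are illustrative only. The two shared values are computed once; each additional flrp then costs a single multiply-add.

```c
/* Values common to every flrp(_, y, t) with the same y and t. */
struct shared_terms {
   float one_minus_t;
   float y_times_t;
};

/* Computed once for the whole group: (1 - t) and yt. */
static struct shared_terms
shared_for(float y, float t)
{
   struct shared_terms s;

   s.one_minus_t = 1.0f - t;
   s.y_times_t = y * t;
   return s;
}

/* Per-flrp work: x * (1 - t) + yt.  On hardware with FMA this is one
 * instruction; here it is written as a plain multiply and add.
 */
static float
flrp_from_shared(float x, struct shared_terms s)
{
   return x * s.one_minus_t + s.y_times_t;
}
```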



[Mesa-dev] [PATCH 01/12] nir: Pull common addition out of flrp arguments

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

Skylake
total instructions in shared programs: 14304116 -> 14303686 (<.01%)
instructions in affected programs: 49036 -> 48606 (-0.88%)
helped: 209
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.06 x̃: 1
helped stats (rel) min: 0.23% max: 6.67% x̄: 1.19% x̃: 0.56%
95% mean confidence interval for instructions value: -2.27 -1.85
95% mean confidence interval for instructions %-change: -1.35% -1.04%
Instructions are helped.

total cycles in shared programs: 527552861 -> 527549152 (<.01%)
cycles in affected programs: 895530 -> 891821 (-0.41%)
helped: 142
HURT: 83
helped stats (abs) min: 2 max: 210 x̄: 58.15 x̃: 32
helped stats (rel) min: 0.05% max: 12.18% x̄: 2.98% x̃: 1.75%
HURT stats (abs)   min: 1 max: 224 x̄: 54.81 x̃: 14
HURT stats (rel)   min: 0.02% max: 28.02% x̄: 3.16% x̃: 0.95%
95% mean confidence interval for cycles value: -28.25 -4.72
95% mean confidence interval for cycles %-change: -1.40% -0.04%
Cycles are helped.

Broadwell
total instructions in shared programs: 14615782 -> 14615353 (<.01%)
instructions in affected programs: 49130 -> 48701 (-0.87%)
helped: 209
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.05 x̃: 1
helped stats (rel) min: 0.23% max: 6.67% x̄: 1.19% x̃: 0.56%
95% mean confidence interval for instructions value: -2.26 -1.84
95% mean confidence interval for instructions %-change: -1.34% -1.03%
Instructions are helped.

total cycles in shared programs: 554541474 -> 554552501 (<.01%)
cycles in affected programs: 938374 -> 949401 (1.18%)
helped: 72
HURT: 164
helped stats (abs) min: 1 max: 162 x̄: 40.78 x̃: 19
helped stats (rel) min: 0.08% max: 12.02% x̄: 2.46% x̃: 1.44%
HURT stats (abs)   min: 1 max: 248 x̄: 85.14 x̃: 27
HURT stats (rel)   min: 0.02% max: 12.48% x̄: 4.15% x̃: 1.74%
95% mean confidence interval for cycles value: 34.39 59.06
95% mean confidence interval for cycles %-change: 1.50% 2.77%
Cycles are HURT.

Haswell
total instructions in shared programs: 13005162 -> 13004752 (<.01%)
instructions in affected programs: 49860 -> 49450 (-0.82%)
helped: 209
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.23% max: 6.67% x̄: 1.16% x̃: 0.54%
95% mean confidence interval for instructions value: -2.15 -1.77
95% mean confidence interval for instructions %-change: -1.32% -1.00%
Instructions are helped.

total cycles in shared programs: 407143838 -> 407142554 (<.01%)
cycles in affected programs: 900617 -> 899333 (-0.14%)
helped: 116
HURT: 99
helped stats (abs) min: 1 max: 272 x̄: 30.32 x̃: 16
helped stats (rel) min: 0.04% max: 12.64% x̄: 1.53% x̃: 0.73%
HURT stats (abs)   min: 1 max: 198 x̄: 22.56 x̃: 10
HURT stats (rel)   min: <.01% max: 10.30% x̄: 1.32% x̃: 0.54%
95% mean confidence interval for cycles value: -12.79 0.85
95% mean confidence interval for cycles %-change: -0.56% 0.13%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11709181 -> 11708722 (<.01%)
instructions in affected programs: 61839 -> 61380 (-0.74%)
helped: 226
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 2.03 x̃: 2
helped stats (rel) min: 0.18% max: 8.33% x̄: 1.07% x̃: 0.54%
95% mean confidence interval for instructions value: -2.21 -1.85
95% mean confidence interval for instructions %-change: -1.23% -0.92%
Instructions are helped.

total cycles in shared programs: 254741126 -> 254744080 (<.01%)
cycles in affected programs: 966459 -> 969413 (0.31%)
helped: 117
HURT: 108
helped stats (abs) min: 1 max: 129 x̄: 20.26 x̃: 11
helped stats (rel) min: 0.02% max: 12.30% x̄: 1.15% x̃: 0.50%
HURT stats (abs)   min: 1 max: 214 x̄: 49.31 x̃: 14
HURT stats (rel)   min: <.01% max: 48.42% x̄: 2.79% x̃: 0.80%
95% mean confidence interval for cycles value: 5.05 21.21
95% mean confidence interval for cycles %-change: 0.14% 1.34%
Cycles are HURT.

Sandy Bridge
total instructions in shared programs: 10488075 -> 10487724 (<.01%)
instructions in affected programs: 30308 -> 29957 (-1.16%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 2.83 x̃: 3
helped stats (rel) min: 0.35% max: 8.00% x̄: 1.59% x̃: 1.29%
95% mean confidence interval for instructions value: -3.08 -2.58
95% mean confidence interval for instructions %-change: -1.81% -1.37%
Instructions are helped.

total cycles in shared programs: 150260469 -> 150259141 (<.01%)
cycles in affected programs: 328692 -> 327364 (-0.40%)
helped: 75
HURT: 50
helped stats (abs) min: 1 max: 284 x̄: 45.91 x̃: 13
helped stats (rel) min: 0.04% max: 9.16% x̄: 1.47% x̃: 0.68%
HURT stats (abs)   min: 1 max: 728 x̄: 42.30 x̃: 11
HURT stats (rel)   min: 0.03% max: 12.62% x̄: 1.36% x̃: 0.60%
95% mean confidence interval for cycles value: -28.01 6.76
95% mean confidence interval for cycles %-change: -0.79% 0.11%
Inconclusive result (value mean confidence interval includes 0).

Iron Lake
total instructions in shared programs: 7781330 -> 7780824 (<.01%)
instructions in affected programs: 17370 -> 16864 (-2.91%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 7.23 x̃: 7

[Mesa-dev] [PATCH 04/12] nir: Use the flrp lowering pass instead of nir_opt_algebraic

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

I tried to be very careful while updating all the various drivers, but I
don't have any of that hardware for testing. :(  I have CC'ed everyone
responsible for drivers that set lower_flrp32 or lower_flrp64.

i965 is the only platform that sets always_precise = true, and it is
only set true for fragment shaders.  Gen4 and Gen5 both set lower_flrp32
only for vertex shaders.  For fragment shaders, nir_op_flrp is lowered
during code generation as a(1-c)+bc.  On all other platforms 64-bit
nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old
nir_opt_algebraic method.
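
A minimal sketch of why the precise form a(1-c)+bc matters, assuming IEEE single precision. The function names are illustrative, and the `volatile` temporaries are only there to force each intermediate to be rounded to float. With a = 1.0e8, b = 1 and c = 1, the imprecise form a + c(b - a) loses b entirely because b - a rounds to -1.0e8, while the precise form returns b exactly.

```c
/* Imprecise form: a + c(b - a).  The subtraction can round b away. */
static float
flrp_imprecise(float a, float b, float c)
{
   volatile float diff = b - a;

   return a + c * diff;
}

/* Precise form: a(1 - c) + bc.  Exact at c = 0 and at c = 1. */
static float
flrp_precise(float a, float b, float c)
{
   volatile float lhs = a * (1.0f - c);
   volatile float rhs = b * c;

   return lhs + rhs;
}
```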

No changes on any other Intel platforms.

Iron Lake
total instructions in shared programs: 7778140 -> 7777710 (<.01%)
instructions in affected programs: 32146 -> 31716 (-1.34%)
helped: 146
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.95 x̃: 2
helped stats (rel) min: 0.54% max: 2.86% x̄: 1.53% x̃: 1.07%
95% mean confidence interval for instructions value: -3.11 -2.78
95% mean confidence interval for instructions %-change: -1.66% -1.40%
Instructions are helped.

total cycles in shared programs: 177866442 -> 177865148 (<.01%)
cycles in affected programs: 1147918 -> 1146624 (-0.11%)
helped: 158
HURT: 0
helped stats (abs) min: 2 max: 16 x̄: 8.19 x̃: 9
helped stats (rel) min: 0.04% max: 0.86% x̄: 0.15% x̃: 0.13%
95% mean confidence interval for cycles value: -8.86 -7.52
95% mean confidence interval for cycles %-change: -0.17% -0.13%
Cycles are helped.

GM45
total instructions in shared programs: 4798115 -> 4797685 (<.01%)
instructions in affected programs: 32146 -> 31716 (-1.34%)
helped: 146
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.95 x̃: 2
helped stats (rel) min: 0.54% max: 2.86% x̄: 1.53% x̃: 1.07%
95% mean confidence interval for instructions value: -3.11 -2.78
95% mean confidence interval for instructions %-change: -1.66% -1.40%
Instructions are helped.

total cycles in shared programs: 122042296 -> 122041002 (<.01%)
cycles in affected programs: 1147924 -> 1146630 (-0.11%)
helped: 158
HURT: 0
helped stats (abs) min: 2 max: 16 x̄: 8.19 x̃: 9
helped stats (rel) min: 0.04% max: 0.86% x̄: 0.15% x̃: 0.13%
95% mean confidence interval for cycles value: -8.86 -7.52
95% mean confidence interval for cycles %-change: -0.17% -0.13%
Cycles are helped.

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Rob Clark 
Cc: Eric Anholt 
Cc: Dave Airlie 
Cc: Timothy Arceri 
---
 src/amd/vulkan/radv_shader.c | 24 
 src/broadcom/compiler/nir_to_vir.c   | 22 ++
 src/compiler/nir/nir_opt_algebraic.py|  2 --
 src/gallium/drivers/freedreno/ir3/ir3_nir.c  | 17 +
 src/gallium/drivers/radeonsi/si_shader_nir.c | 23 +++
 src/gallium/drivers/vc4/vc4_program.c| 21 +
 src/intel/compiler/brw_nir.c | 22 ++
 src/mesa/state_tracker/st_glsl_to_nir.cpp| 23 +++
 8 files changed, 152 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 207e5b050eb..bfccd84d677 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -122,6 +122,8 @@ void
 radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively)
 {
 bool progress;
+bool need_to_lower_flrp =
+shader->options->lower_flrp32 || shader->options->lower_flrp64;
 
 do {
 progress = false;
@@ -146,6 +148,28 @@ radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively)
 NIR_PASS(progress, shader, nir_opt_peephole_select, 8);
 NIR_PASS(progress, shader, nir_opt_algebraic);
 NIR_PASS(progress, shader, nir_opt_constant_folding);
+
+if (need_to_lower_flrp) {
+bool lower_flrp_progress;
+NIR_PASS(lower_flrp_progress,
+ shader,
+ nir_lower_flrp,
+ shader->options->lower_flrp32,
+ shader->options->lower_flrp64,
+ false /* always_precise */,
+ shader->options->lower_ffma);
+if (lower_flrp_progress) {
+NIR_PASS(progress, shader,
+ nir_opt_constant_folding);
+progress = true;
+}
+
+/* Nothing should rematerialize any flrps, so we only
+ * need to do this lowering once.
+ */
+need_to_lower_flrp = false;
+}
+
 NIR_PASS(progress, shader, nir_opt_undef);
 NIR_PASS(progress, shader, nir_opt_conditional_discard);
 if 

[Mesa-dev] [PATCH 08/12] nir: Lower flrp(a, b, c) differently if another flrp(a, _, c) exists

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

This doesn't help on Intel GPUs now because we always take the
"always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".

I have CC'ed everyone responsible for drivers that set lower_flrp32
or lower_flrp64.

No changes on any other Intel platforms.

Iron Lake
total cycles in shared programs: 178115276 -> 178115260 (<.01%)
cycles in affected programs: 14692 -> 14676 (-0.11%)
helped: 4
HURT: 0
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.09% max: 0.13% x̄: 0.11% x̃: 0.11%
95% mean confidence interval for cycles value: -4.00 -4.00
95% mean confidence interval for cycles %-change: -0.13% -0.09%
Cycles are helped.

GM45
total cycles in shared programs: 122015822 -> 122015806 (<.01%)
cycles in affected programs: 14692 -> 14676 (-0.11%)
helped: 4
HURT: 0
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.09% max: 0.13% x̄: 0.11% x̃: 0.11%
95% mean confidence interval for cycles value: -4.00 -4.00
95% mean confidence interval for cycles %-change: -0.13% -0.09%
Cycles are helped.

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Rob Clark 
Cc: Eric Anholt 
Cc: Dave Airlie 
Cc: Timothy Arceri 
---
 src/compiler/nir/nir_lower_flrp.c | 89 +++
 1 file changed, 89 insertions(+)

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index b86f5b5f2df..87245c8ec62 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -272,6 +272,59 @@ sources_are_constants_with_similar_magnitudes(const nir_alu_instr *instr)
return true;
 }
 
+/**
+ * Counts of similar types of nir_op_flrp instructions
+ *
+ * If a similar instruction fits into more than one category, it will only be
+ * counted once.  The assumption is that no other instruction will have all
+ * sources the same, or CSE would have removed one of the instructions.
+ */
+struct similar_flrp_stats {
+   unsigned src2;
+   unsigned src0_and_src2;
+   unsigned src1_and_src2;
+};
+
+/**
+ * Collect counts of similar FLRP instructions.
+ *
+ * This function only cares about similar instructions that have src2 in
+ * common.
+ */
+static void
+get_similar_flrp_stats(nir_alu_instr *alu, struct similar_flrp_stats *st)
+{
+   memset(st, 0, sizeof(*st));
+
+   nir_foreach_use(other_use, alu->src[2].src.ssa) {
+  /* Is the use also a flrp? */
+  nir_instr *const other_instr = other_use->parent_instr;
+  if (other_instr->type != nir_instr_type_alu)
+ continue;
+
+  /* Eh-hem... don't match the instruction with itself. */
+  if (other_instr == &alu->instr)
+ continue;
+
+  nir_alu_instr *const other_alu = nir_instr_as_alu(other_instr);
+  if (other_alu->op != nir_op_flrp)
+ continue;
+
+  /* Does the other flrp use source 2 from the first flrp as its source 2
+   * as well?
+   */
+  if (!nir_alu_srcs_equal(alu, other_alu, 2, 2))
+ continue;
+
+  if (nir_alu_srcs_equal(alu, other_alu, 0, 0))
+ st->src0_and_src2++;
+  else if (nir_alu_srcs_equal(alu, other_alu, 1, 1))
+ st->src1_and_src2++;
+  else
+ st->src2++;
+   }
+}
+
 static void
 convert_flrp_instruction(nir_builder *bld,
  struct u_vector *dead_flrp,
@@ -404,11 +457,47 @@ convert_flrp_instruction(nir_builder *bld,
  replace_with_strict_ffma(bld, dead_flrp, alu);
  return;
   }
+
+  /*
+   * - If FMA is supported and another flrp(x, _, t) exists:
+   *
+   *fma(y, t, fma(-x, t, x))
+   *
+   *   The hope is that the inner FMA calculation will be shared with the
+   *   other lowered flrp.  This results in two FMA instructions for the
+   *   first flrp and one FMA instruction for each additional flrp.  It
+   *   also means that the live range for x might be complete after the
+   *   inner ffma instead of after the last flrp.
+   */
+  struct similar_flrp_stats st;
+
+  get_similar_flrp_stats(alu, &st);
+  if (st.src0_and_src2 > 0) {
+ replace_with_strict_ffma(bld, dead_flrp, alu);
+ return;
+  }
} else {
   if (always_precise) {
  replace_with_strict(bld, dead_flrp, alu);
  return;
   }
+
+  /*
+   * - If FMA is not supported and another flrp(x, _, t) exists:
+   *
+   *x(1 - t) + yt
+   *
+   *   The hope is that the x(1 - t) will be shared with the other lowered
+   *   flrp.  This results in 4 instructions for the first flrp and 2 for
+   *   each additional flrp.
+   */
+  struct similar_flrp_stats st;
+
+  get_similar_flrp_stats(alu, &st);
+  if (st.src0_and_src2 > 0) {
+ replace_with_strict(bld, dead_flrp, alu);
+ return;
+  }
}
 
/*
-- 
2.14.4
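
A minimal sketch of the fma(y, t, fma(-x, t, x)) lowering from the patch above, on plain floats. The helper `mad` is a stand-in that multiplies and adds with two roundings, so it is not a true fused operation; all names here are illustrative. The inner term depends only on x and t, so every flrp(x, _, t) can reuse it.

```c
/* Stand-in for a fused multiply-add; a real FMA rounds only once. */
static float
mad(float a, float b, float c)
{
   return a * b + c;
}

/* Inner term x - xt, shared by every flrp(x, _, t). */
static float
flrp_inner(float x, float t)
{
   return mad(-x, t, x);
}

/* Outer step: one multiply-add per flrp once the inner term exists. */
static float
flrp_outer(float y, float t, float inner)
{
   return mad(y, t, inner);
}
```

For x = 2 and t = 0.25 the shared inner term is 1.5; flrp(2, 6, 0.25) and flrp(2, 10, 0.25) then each need one more multiply-add.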


[Mesa-dev] [PATCH 12/12] RFC: intel/compiler: Don't always require precise lowering of flrp

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

No changes on any other Intel platforms.

Iron Lake
total instructions in shared programs: 7727356 -> 7699517 (-0.36%)
instructions in affected programs: 3152595 -> 3124756 (-0.88%)
helped: 12836
HURT: 48
helped stats (abs) min: 1 max: 30 x̄: 2.18 x̃: 1
helped stats (rel) min: 0.03% max: 10.77% x̄: 1.14% x̃: 0.94%
HURT stats (abs)   min: 1 max: 4 x̄: 1.75 x̃: 1
HURT stats (rel)   min: 0.27% max: 5.26% x̄: 1.94% x̃: 1.26%
95% mean confidence interval for instructions value: -2.20 -2.12
95% mean confidence interval for instructions %-change: -1.14% -1.11%
Instructions are helped.

total cycles in shared programs: 177982922 -> 177855192 (-0.07%)
cycles in affected programs: 67514274 -> 67386544 (-0.19%)
helped: 11752
HURT: 507
helped stats (abs) min: 2 max: 600 x̄: 11.13 x̃: 6
helped stats (rel) min: <.01% max: 5.48% x̄: 0.47% x̃: 0.26%
HURT stats (abs)   min: 2 max: 62 x̄: 6.03 x̃: 4
HURT stats (rel)   min: 0.01% max: 3.08% x̄: 0.19% x̃: 0.07%
95% mean confidence interval for cycles value: -10.77 -10.07
95% mean confidence interval for cycles %-change: -0.45% -0.43%
Cycles are helped.

LOST:   0
GAINED: 13

GM45
total instructions in shared programs: 4755025 -> 4740052 (-0.31%)
instructions in affected programs: 1703728 -> 1688755 (-0.88%)
helped: 6578
HURT: 24
helped stats (abs) min: 1 max: 87 x̄: 2.28 x̃: 1
helped stats (rel) min: 0.03% max: 10.45% x̄: 1.12% x̃: 0.93%
HURT stats (abs)   min: 1 max: 4 x̄: 1.75 x̃: 1
HURT stats (rel)   min: 0.27% max: 5.00% x̄: 1.89% x̃: 1.25%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.12% -1.08%
Instructions are helped.

total cycles in shared programs: 121824644 -> 121742888 (-0.07%)
cycles in affected programs: 41846900 -> 41765144 (-0.20%)
helped: 6119
HURT: 175
helped stats (abs) min: 2 max: 600 x̄: 13.54 x̃: 6
helped stats (rel) min: <.01% max: 5.48% x̄: 0.49% x̃: 0.28%
HURT stats (abs)   min: 2 max: 48 x̄: 6.23 x̃: 6
HURT stats (rel)   min: 0.01% max: 3.08% x̄: 0.18% x̃: 0.07%
95% mean confidence interval for cycles value: -13.56 -12.42
95% mean confidence interval for cycles %-change: -0.49% -0.46%
Cycles are helped.

total spills in shared programs: 72 -> 61 (-15.28%)
spills in affected programs: 66 -> 55 (-16.67%)
helped: 1
HURT: 0

total fills in shared programs: 108 -> 94 (-12.96%)
fills in affected programs: 96 -> 82 (-14.58%)
helped: 1
HURT: 0

LOST:   13
GAINED: 13

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_nir.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index bd08e1e1c65..1c773882f0f 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -575,13 +575,10 @@ brw_nir_optimize(nir_shader *nir, const struct 
brw_compiler *compiler,
   OPT(nir_opt_constant_folding);
 
   if (need_to_lower_flrp) {
- /* To match the old behavior, set always_precise only for scalar
-  * shader stages.
-  */
  if (OPT(nir_lower_flrp,
  nir->options->lower_flrp32,
  nir->options->lower_flrp64,
- is_scalar /* always_precise */,
+ false /* always_precise */,
  compiler->devinfo->gen >= 6)) {
 OPT(nir_opt_constant_folding);
  }
-- 
2.14.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/12] Do better for flrp on platforms that lack flrp instruction, take 2

2018-08-24 Thread Ian Romanick
This series replaces my previous series
(https://patchwork.freedesktop.org/series/48138/) that did similar
things.  Rather than trying to do everything with only local information
in nir_opt_algebraic, this series adds a new optimization pass.  This
new pass looks at how various parameters of a nir_op_flrp are used in
other nir_op_flrp instructions to make better choices.

You can compare the results across this whole series, below, with the
similar results in the previous series.  At least for Intel GPUs, this
series does quite a bit better.  There are a couple extra loops and a
couple lost SIMD16 shaders on Iron Lake, but the trade-off is an extra
35 (+32 overall) SIMD16 shaders on Iron Lake and an extra 38 SIMD16
shaders on GM45.

It also shouldn't break the whole universe for non-Intel GPUs. :)
Neither of the tested Intel GPUs has an FMA instruction, so I don't
know how this series will affect GPUs that lack LRP but have FMA.  I did
some testing on Ice Lake, which lacks LRP but has FMA, and the results
were mixed but generally positive.  There is some driver support missing
for the FMA on Ice Lake, so those results are not likely to be
representative of the final result.

I have another series waiting to go out that improves LRP and FMA
generation for all of the Intel platforms that support LRP and FMA.
That series caused a bunch of regressions on the non-LRP platforms, so
this series needs to land first.

This series, along with a few patches that didn't pan out, is available
at:

https://cgit.freedesktop.org/~idr/mesa/log/?h=lower-flrp

Iron Lake
total instructions in shared programs: 7774533 -> 7676404 (-1.26%)
instructions in affected programs: 4436203 -> 4338074 (-2.21%)
helped: 19242
HURT: 924
helped stats (abs) min: 1 max: 155 x̄: 5.16 x̃: 3
helped stats (rel) min: 0.11% max: 86.96% x̄: 2.85% x̃: 1.89%
HURT stats (abs)   min: 1 max: 8 x̄: 1.29 x̃: 1
HURT stats (rel)   min: 0.08% max: 6.52% x̄: 1.22% x̃: 1.01%
95% mean confidence interval for instructions value: -4.95 -4.79
95% mean confidence interval for instructions %-change: -2.71% -2.61%
Instructions are helped.

total cycles in shared programs: 177757572 -> 177292132 (-0.26%)
cycles in affected programs: 100318508 -> 99853068 (-0.46%)
helped: 18971
HURT: 1407
helped stats (abs) min: 2 max: 930 x̄: 25.18 x̃: 12
helped stats (rel) min: <.01% max: 94.58% x̄: 1.24% x̃: 0.55%
HURT stats (abs)   min: 2 max: 98 x̄: 8.71 x̃: 6
HURT stats (rel)   min: <.01% max: 4.11% x̄: 0.33% x̃: 0.14%
95% mean confidence interval for cycles value: -23.31 -22.37
95% mean confidence interval for cycles %-change: -1.16% -1.09%
Cycles are helped.

total loops in shared programs: 850 -> 854 (0.47%)
loops in affected programs: 0 -> 4
helped: 0
HURT: 4
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.00% max: 0.00% x̄: 0.00% x̃: 0.00%
95% mean confidence interval for loops value: 1.00 1.00
95% mean confidence interval for loops %-change: 0.00% 0.00%
Loops are HURT.

LOST:   3
GAINED: 52

GM45
total instructions in shared programs: 4766579 -> 4715273 (-1.08%)
instructions in affected programs: 2372653 -> 2321347 (-2.16%)
helped: 9929
HURT: 467
helped stats (abs) min: 1 max: 155 x̄: 5.23 x̃: 3
helped stats (rel) min: 0.11% max: 86.96% x̄: 2.78% x̃: 1.86%
HURT stats (abs)   min: 1 max: 8 x̄: 1.28 x̃: 1
HURT stats (rel)   min: 0.08% max: 6.25% x̄: 1.19% x̃: 1.00%
95% mean confidence interval for instructions value: -5.05 -4.82
95% mean confidence interval for instructions %-change: -2.67% -2.54%
Instructions are helped.

total cycles in shared programs: 121450872 -> 121161308 (-0.24%)
cycles in affected programs: 61826744 -> 61537180 (-0.47%)
helped: 9846
HURT: 759
helped stats (abs) min: 2 max: 930 x̄: 30.12 x̃: 12
helped stats (rel) min: <.01% max: 93.33% x̄: 1.30% x̃: 0.58%
HURT stats (abs)   min: 2 max: 98 x̄: 9.27 x̃: 6
HURT stats (rel)   min: <.01% max: 4.11% x̄: 0.33% x̃: 0.13%
95% mean confidence interval for cycles value: -28.06 -26.55
95% mean confidence interval for cycles %-change: -1.23% -1.14%
Cycles are helped.

total loops in shared programs: 629 -> 631 (0.32%)
loops in affected programs: 0 -> 2
helped: 0
HURT: 2

total fills in shared programs: 93 -> 94 (1.08%)
fills in affected programs: 81 -> 82 (1.23%)
helped: 0
HURT: 1

LOST:   55
GAINED: 55



[Mesa-dev] [PATCH 02/12] nir: Pull common multiplication out of flrp arguments

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

Skylake, Broadwell, Haswell, and Sandy Bridge had similar results. (Skylake 
shown)
total instructions in shared programs: 14303686 -> 14300269 (-0.02%)
instructions in affected programs: 121704 -> 118287 (-2.81%)
helped: 694
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 4.92 x̃: 3
helped stats (rel) min: 0.22% max: 16.00% x̄: 4.36% x̃: 3.75%
95% mean confidence interval for instructions value: -5.17 -4.67
95% mean confidence interval for instructions %-change: -4.57% -4.14%
Instructions are helped.

total cycles in shared programs: 527549152 -> 527532160 (<.01%)
cycles in affected programs: 858131 -> 841139 (-1.98%)
helped: 505
HURT: 151
helped stats (abs) min: 1 max: 4880 x̄: 45.10 x̃: 16
helped stats (rel) min: 0.04% max: 35.69% x̄: 3.36% x̃: 2.06%
HURT stats (abs)   min: 1 max: 294 x̄: 38.31 x̃: 16
HURT stats (rel)   min: 0.06% max: 29.17% x̄: 3.08% x̃: 2.03%
95% mean confidence interval for cycles value: -41.14 -10.66
95% mean confidence interval for cycles %-change: -2.25% -1.52%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11707456 -> 11704465 (-0.03%)
instructions in affected programs: 129460 -> 126469 (-2.31%)
helped: 676
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 4.42 x̃: 3
helped stats (rel) min: 0.22% max: 19.23% x̄: 3.67% x̃: 3.03%
95% mean confidence interval for instructions value: -4.65 -4.20
95% mean confidence interval for instructions %-change: -3.87% -3.47%
Instructions are helped.

total cycles in shared programs: 254736691 -> 254717085 (<.01%)
cycles in affected programs: 789785 -> 770179 (-2.48%)
helped: 510
HURT: 155
helped stats (abs) min: 1 max: 419 x̄: 51.67 x̃: 18
helped stats (rel) min: 0.02% max: 35.22% x̄: 4.75% x̃: 2.72%
HURT stats (abs)   min: 1 max: 371 x̄: 43.51 x̃: 19
HURT stats (rel)   min: 0.02% max: 28.77% x̄: 4.31% x̃: 3.00%
95% mean confidence interval for cycles value: -35.47 -23.50
95% mean confidence interval for cycles %-change: -3.14% -2.13%
Cycles are helped.

LOST:   3
GAINED: 9

Iron Lake
total instructions in shared programs: 7780824 -> 7778140 (-0.03%)
instructions in affected programs: 149333 -> 146649 (-1.80%)
helped: 622
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 4.32 x̃: 3
helped stats (rel) min: 0.24% max: 7.55% x̄: 2.26% x̃: 2.45%
95% mean confidence interval for instructions value: -4.54 -4.10
95% mean confidence interval for instructions %-change: -2.36% -2.17%
Instructions are helped.

total cycles in shared programs: 177876974 -> 177866442 (<.01%)
cycles in affected programs: 3215150 -> 3204618 (-0.33%)
helped: 510
HURT: 63
helped stats (abs) min: 4 max: 66 x̄: 22.26 x̃: 18
helped stats (rel) min: 0.05% max: 4.62% x̄: 0.98% x̃: 0.88%
HURT stats (abs)   min: 4 max: 18 x̄: 13.02 x̃: 12
HURT stats (rel)   min: 0.10% max: 1.02% x̄: 0.71% x̃: 0.78%
95% mean confidence interval for cycles value: -19.85 -16.91
95% mean confidence interval for cycles %-change: -0.88% -0.72%
Cycles are helped.

GM45
total instructions in shared programs: 4799457 -> 4798115 (-0.03%)
instructions in affected programs: 76257 -> 74915 (-1.76%)
helped: 311
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 4.32 x̃: 3
helped stats (rel) min: 0.24% max: 7.02% x̄: 2.22% x̃: 2.42%
95% mean confidence interval for instructions value: -4.63 -4.00
95% mean confidence interval for instructions %-change: -2.36% -2.09%
Instructions are helped.

total cycles in shared programs: 122048862 -> 122042296 (<.01%)
cycles in affected programs: 1947358 -> 1940792 (-0.34%)
helped: 263
HURT: 26
helped stats (abs) min: 6 max: 66 x̄: 26.53 x̃: 18
helped stats (rel) min: 0.05% max: 4.62% x̄: 1.05% x̃: 1.09%
HURT stats (abs)   min: 14 max: 18 x̄: 15.85 x̃: 14
HURT stats (rel)   min: 0.78% max: 1.02% x̄: 0.86% x̃: 0.79%
95% mean confidence interval for cycles value: -25.06 -20.38
95% mean confidence interval for cycles %-change: -0.99% -0.76%
Cycles are helped.
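
For readers who want to convince themselves that the rewrites in this patch are algebraically sound, here is a minimal sketch. It assumes the usual definition flrp(a, b, c) = a(1-c) + bc and uses exact rationals, so it says nothing about floating-point rounding (the rules carry the '~' inexact marker for that reason). The helper names flrp and check_identities are illustrative and not part of Mesa.

```python
from fractions import Fraction


def flrp(a, b, c):
    """Reference linear interpolation: a*(1 - c) + b*c."""
    return a * (1 - c) + b * c


def check_identities(a, b, c, d):
    """Return True if both factorings hold exactly for these inputs."""
    # flrp(a, a*b, c) == a * flrp(1, b, c)
    first = flrp(a, a * b, c) == a * flrp(1, b, c)
    # flrp(a*b, a*c, d) == a * flrp(b, c, d)
    second = flrp(a * b, a * c, d) == a * flrp(b, c, d)
    return first and second


# Exact rationals, so no rounding is involved.
samples = [Fraction(n, 7) for n in (3, -5, 11, 2)]
print(check_identities(*samples))
```

In both cases the common factor a moves outside the flrp, which saves one multiply in the second rule and exposes a constant 1.0 operand in the first.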

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 01e3843334a..2e7d9e8bbaf 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -123,6 +123,11 @@ optimizations = [
(('~flrp@32', ('fadd', a, b), ('fadd', a, c), d), ('fadd', ('flrp', b, c, 
d), a), 'options->lower_flrp32'),
(('~flrp@64', ('fadd', a, b), ('fadd', a, c), d), ('fadd', ('flrp', b, c, 
d), a), 'options->lower_flrp64'),
 
+   (('~flrp@32', a, ('fmul(is_used_once)', a, b), c), ('fmul', ('flrp', 1.0, 
b, c), a), 'options->lower_flrp32'),
+   (('~flrp@64', a, ('fmul(is_used_once)', a, b), c), ('fmul', ('flrp', 1.0, 
b, c), a), 'options->lower_flrp64'),
+
+   (('~flrp', ('fmul(is_used_once)', a, b), ('fmul(is_used_once)', a, c), d), 
('fmul', ('flrp', b, c, d), a)),
+
(('~flrp', a, b, ('b2f', c)), ('bcsel', c, b, a), 'options->lower_flrp32'),
(('~flrp', a, 0.0, c), ('fadd', ('fmul', ('fneg', a), c), a)),
(('flrp@32', 

[Mesa-dev] [PATCH 03/12] nir: Add new lowering pass for flrp instructions

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

This pass will soon grow to include some optimizations that are
difficult or impossible to implement correctly within nir_opt_algebraic.
It also includes the ability to generate strictly correct code, which
the current nir_opt_algebraic lowering lacks (though that could be
changed).
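
One reading of "strictly correct" is that the lowered code returns exactly a at c = 0 and exactly b at c = 1. The sketch below illustrates why the a+c(b-a) form can miss that. It emulates 32-bit floats by rounding through struct after every operation, and it models ffma(b, c, ffma(-a, c, a)) with separate multiply and add steps, which is an assumption (a real FMA rounds once per instruction). The function names are illustrative only.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def flrp_fast(a, b, c):
    """a + c*(b - a), rounding to binary32 after every operation."""
    return f32(a + f32(c * f32(b - a)))


def flrp_strict(a, b, c):
    """b*c + (a - a*c): unfused stand-in for ffma(b, c, ffma(-a, c, a))."""
    inner = f32(f32(-a * c) + a)
    return f32(f32(b * c) + inner)


a, b = 1.0e8, 1.0
# With c = 1 the interpolation should return b.  In the fast form,
# (b - a) rounds to -1e8, so the final add cancels to zero.
print(flrp_fast(a, b, 1.0))
print(flrp_strict(a, b, 1.0))
```

With these inputs the fast form yields 0.0 while the strict form yields 1.0, because 1 - 1e8 is not representable in binary32 and rounds back to -1e8.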

Signed-off-by: Ian Romanick 
---
 src/compiler/Makefile.sources |   1 +
 src/compiler/nir/meson.build  |   1 +
 src/compiler/nir/nir.h|   4 +
 src/compiler/nir/nir_lower_flrp.c | 281 ++
 4 files changed, 287 insertions(+)
 create mode 100644 src/compiler/nir/nir_lower_flrp.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index d3b06564832..54a06f9623f 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -236,6 +236,7 @@ NIR_FILES = \
nir/nir_lower_constant_initializers.c \
nir/nir_lower_double_ops.c \
nir/nir_lower_drawpixels.c \
+   nir/nir_lower_flrp.c \
nir/nir_lower_global_vars_to_local.c \
nir/nir_lower_gs_intrinsics.c \
nir/nir_lower_load_const_to_scalar.c \
diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index 090aa7a628f..b57df0c157a 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -119,6 +119,7 @@ files_libnir = files(
   'nir_lower_constant_initializers.c',
   'nir_lower_double_ops.c',
   'nir_lower_drawpixels.c',
+  'nir_lower_flrp.c',
   'nir_lower_global_vars_to_local.c',
   'nir_lower_gs_intrinsics.c',
   'nir_lower_load_const_to_scalar.c',
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 45a8c2c64cc..1cd5bd4c8a2 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2731,6 +2731,10 @@ bool nir_lower_vec_to_movs(nir_shader *shader);
 void nir_lower_alpha_test(nir_shader *shader, enum compare_func func,
   bool alpha_to_one);
 bool nir_lower_alu(nir_shader *shader);
+
+bool nir_lower_flrp(nir_shader *shader, bool lower_flrp32, bool lower_flrp64,
+bool always_precise, bool have_ffma);
+
 bool nir_lower_alu_to_scalar(nir_shader *shader);
 bool nir_lower_load_const_to_scalar(nir_shader *shader);
 bool nir_lower_read_invocation_to_scalar(nir_shader *shader);
diff --git a/src/compiler/nir/nir_lower_flrp.c 
b/src/compiler/nir/nir_lower_flrp.c
new file mode 100644
index 000..3240445e18f
--- /dev/null
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -0,0 +1,281 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+#include "nir.h"
+#include "nir_builder.h"
+#include "util/u_vector.h"
+
+/**
+ * Lower flrp instructions.
+ *
+ * Unlike the lowerings that are possible in nir_opt_algebraic, this pass can
+ * examine more global information to determine a possibly more efficient
+ * lowering for each flrp.
+ */
+
+static void
+append_flrp_to_dead_list(struct u_vector *dead_flrp, struct nir_alu_instr *alu)
+{
+   struct nir_alu_instr **tail = u_vector_add(dead_flrp);
+   *tail = alu;
+}
+
+/**
+ * Replace flrp(a, b, c) with ffma(b, c, ffma(-a, c, a)).
+ */
+static void
+replace_with_strict_ffma(struct nir_builder *bld, struct u_vector *dead_flrp,
+ struct nir_alu_instr *alu)
+{
+   nir_ssa_def *const a = nir_ssa_for_alu_src(bld, alu, 0);
+   nir_ssa_def *const b = nir_ssa_for_alu_src(bld, alu, 1);
+   nir_ssa_def *const c = nir_ssa_for_alu_src(bld, alu, 2);
+
+   nir_ssa_def *const neg_a = nir_fneg(bld, a);
+   nir_instr_as_alu(neg_a->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *const inner_ffma = nir_ffma(bld, neg_a, c, a);
+   nir_instr_as_alu(inner_ffma->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def *const outer_ffma = nir_ffma(bld, b, c, inner_ffma);
+   nir_instr_as_alu(outer_ffma->parent_instr)->exact = alu->exact;
+
+   nir_ssa_def_rewrite_uses(>dest.dest.ssa, 

[Mesa-dev] [PATCH 05/12] intel/compiler: Use the flrp lowering pass for all stages on Gen4 and Gen5

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

Previously lower_flrp32 was only set for vertex shaders.  Fragment
shaders performed a(1-c)+bc lowering during code generation.

The shaders with loops hurt are SIMD8 and SIMD16 shaders for a
text-identical fragment shader.

Iron Lake
total instructions in shared programs: 7772319 -> 7748384 (-0.31%)
instructions in affected programs: 2504225 -> 2480290 (-0.96%)
helped: 9795
HURT: 1288
helped stats (abs) min: 1 max: 155 x̄: 2.77 x̃: 2
helped stats (rel) min: 0.11% max: 35.48% x̄: 1.62% x̃: 1.08%
HURT stats (abs)   min: 1 max: 13 x̄: 2.52 x̃: 1
HURT stats (rel)   min: 0.21% max: 13.64% x̄: 1.58% x̃: 1.01%
95% mean confidence interval for instructions value: -2.24 -2.08
95% mean confidence interval for instructions %-change: -1.29% -1.21%
Instructions are helped.

total cycles in shared programs: 177771352 -> 177662742 (-0.06%)
cycles in affected programs: 68726036 -> 68617426 (-0.16%)
helped: 11138
HURT: 1944
helped stats (abs) min: 2 max: 930 x̄: 11.96 x̃: 6
helped stats (rel) min: <.01% max: 44.61% x̄: 0.65% x̃: 0.24%
HURT stats (abs)   min: 2 max: 128 x̄: 12.64 x̃: 8
HURT stats (rel)   min: <.01% max: 6.54% x̄: 0.46% x̃: 0.18%
95% mean confidence interval for cycles value: -8.70 -7.91
95% mean confidence interval for cycles %-change: -0.50% -0.46%
Cycles are helped.

total loops in shared programs: 850 -> 854 (0.47%)
loops in affected programs: 0 -> 4
helped: 0
HURT: 4
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.00% max: 0.00% x̄: 0.00% x̃: 0.00%
95% mean confidence interval for loops value: 1.00 1.00
95% mean confidence interval for loops %-change: 0.00% 0.00%
Loops are HURT.

LOST:   1
GAINED: 12

GM45
total instructions in shared programs: 4789512 -> 4777554 (-0.25%)
instructions in affected programs: 1308042 -> 1296084 (-0.91%)
helped: 4945
HURT: 647
helped stats (abs) min: 1 max: 155 x̄: 2.76 x̃: 2
helped stats (rel) min: 0.10% max: 34.38% x̄: 1.58% x̃: 1.04%
HURT stats (abs)   min: 1 max: 89 x̄: 2.65 x̃: 1
HURT stats (rel)   min: 0.20% max: 12.50% x̄: 1.56% x̃: 0.98%
95% mean confidence interval for instructions value: -2.26 -2.02
95% mean confidence interval for instructions %-change: -1.28% -1.16%
Instructions are helped.

total cycles in shared programs: 121863964 -> 121794274 (-0.06%)
cycles in affected programs: 43211788 -> 43142098 (-0.16%)
helped: 5837
HURT: 973
helped stats (abs) min: 2 max: 930 x̄: 14.41 x̃: 8
helped stats (rel) min: <.01% max: 41.03% x̄: 0.67% x̃: 0.25%
HURT stats (abs)   min: 2 max: 128 x̄: 14.82 x̃: 8
HURT stats (rel)   min: <.01% max: 6.54% x̄: 0.48% x̃: 0.18%
95% mean confidence interval for cycles value: -10.88 -9.59
95% mean confidence interval for cycles %-change: -0.54% -0.48%
Cycles are helped.

total loops in shared programs: 629 -> 631 (0.32%)
loops in affected programs: 0 -> 2
helped: 0
HURT: 2

total spills in shared programs: 61 -> 72 (18.03%)
spills in affected programs: 55 -> 66 (20.00%)
helped: 0
HURT: 1

total fills in shared programs: 93 -> 108 (16.13%)
fills in affected programs: 81 -> 96 (18.52%)
helped: 0
HURT: 1

LOST:   13
GAINED: 13

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_compiler.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_compiler.c 
b/src/intel/compiler/brw_compiler.c
index 6df9621fe42..730819476e4 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -161,7 +161,7 @@ brw_compiler_create(void *mem_ctx, const struct 
gen_device_info *devinfo)
 
   if (is_scalar) {
  compiler->glsl_compiler_options[i].NirOptions =
-devinfo->gen < 11 ? &scalar_nir_options : &scalar_nir_options_gen11;
+devinfo->gen < 11 && devinfo->gen > 5 ? &scalar_nir_options : &scalar_nir_options_gen11;
   } else {
  compiler->glsl_compiler_options[i].NirOptions =
 devinfo->gen < 6 ? &vector_nir_options : &vector_nir_options_gen6;
-- 
2.14.4



[Mesa-dev] [PATCH 11/12] nir: Reassociate open-coded flrp(1, b, c)

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

In a previous version of this patch, Jason commented,

   "Re-associating based on whether or not something has a constant
   value of 1.0 seems a bit sneaky.  I think it's well within the rules
   but it seems like something that could bite you."

That is possibly true.  The reassociation will generate different
results if fabs(b) >= 2**24 and fabs(c) < 0.5.  The delta increases as
fabs(c) approaches 0.
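
As a concrete illustration of that condition, here is a small sketch comparing the open-coded form 1 + c(b - 1) with the reassociated form (1 - c) + bc in emulated 32-bit arithmetic. The emulation rounds through struct after each operation; the function names and the sample values b = 2**24 + 2, c = 0.375 are chosen for illustration and do not come from any shader.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def open_coded(b, c):
    """1 + c*(b - 1): the pattern matched by the new rule."""
    return f32(1.0 + f32(c * f32(b - 1.0)))


def reassociated(b, c):
    """(1 - c) + b*c: the replacement emitted by the new rule."""
    return f32(f32(1.0 - c) + f32(b * c))


b, c = 16777218.0, 0.375   # fabs(b) >= 2**24 and fabs(c) < 0.5
print(open_coded(b, c), reassociated(b, c))

# For small b, every intermediate value is exact and the forms agree.
print(open_coded(3.0, 0.375), reassociated(3.0, 0.375))
```

For the large b, b - 1 is not representable and rounds to 2**24, so the two forms land on 6291457.0 and 6291457.5 respectively (the exact value is 6291457.375). For b = 3.0 both give 1.75.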

However, i965 has done this same reassociation indirectly for years.
We would previously allow nir_op_flrp on all pre-Gen11 hardware even
though Gen4 and Gen5 do not have a LRP instruction.  Optimizations in
nir_opt_algebraic would convert expressions like a+c(b-a) into flrp(a,
b, c).  On Gen7+, the hardware performs the same arithmetic as
a(1-c)+bc.  Gen6 seems to implement LRP as a+c(b-a).  On Gen4 and
Gen5, we would lower LRP to a sequence of instructions that implement
a(1-c)+bc.  The lowering happens after all constant folding, so we
would literally generate a 1+(-1) instruction sequence in this
scenario: one instruction to load either 1 or -1 in a register, and
another instruction to add either -1 or 1 to it.

This patch just cuts out the middle man.  Do the reassociation that
we've always done, but do it explicitly at a time when we can benefit
from other optimizations.

A few cases that were hurt by "nir: Lower flrp(±1, b, c) and flrp(a,
±1, c) differently" are restored by this patch.  This includes a few
shaders in ET:QW.

I tried a similar thing for open-coded flrp(-1, b, c), and it hurt
instructions on 35 shaders for ILK without helping any.  The helped /
hurt cycles was about even.

No changes on any other Intel platforms.

Iron Lake
total instructions in shared programs: 7735001 -> 7727356 (-0.10%)
instructions in affected programs: 1094100 -> 1086455 (-0.70%)
helped: 3281
HURT: 64
helped stats (abs) min: 1 max: 6 x̄: 2.35 x̃: 2
helped stats (rel) min: 0.13% max: 12.00% x̄: 1.15% x̃: 0.83%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 0.64% x̄: 0.39% x̃: 0.38%
95% mean confidence interval for instructions value: -2.32 -2.25
95% mean confidence interval for instructions %-change: -1.16% -1.08%
Instructions are helped.

total cycles in shared programs: 178021114 -> 177982922 (-0.02%)
cycles in affected programs: 20360622 -> 20322430 (-0.19%)
helped: 3022
HURT: 489
helped stats (abs) min: 2 max: 142 x̄: 13.33 x̃: 12
helped stats (rel) min: 0.01% max: 6.37% x̄: 0.52% x̃: 0.24%
HURT stats (abs)   min: 2 max: 328 x̄: 4.26 x̃: 4
HURT stats (rel)   min: 0.02% max: 1.55% x̄: 0.14% x̃: 0.11%
95% mean confidence interval for cycles value: -11.26 -10.50
95% mean confidence interval for cycles %-change: -0.45% -0.41%
Cycles are helped.

LOST:   7
GAINED: 0

GM45
total instructions in shared programs: 4762494 -> 4758409 (-0.09%)
instructions in affected programs: 628390 -> 624305 (-0.65%)
helped: 1751
HURT: 32
helped stats (abs) min: 1 max: 6 x̄: 2.35 x̃: 2
helped stats (rel) min: 0.12% max: 11.11% x̄: 1.08% x̃: 0.73%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 0.60% x̄: 0.39% x̃: 0.37%
95% mean confidence interval for instructions value: -2.34 -2.24
95% mean confidence interval for instructions %-change: -1.11% -1.01%
Instructions are helped.

total cycles in shared programs: 121921660 -> 121897144 (-0.02%)
cycles in affected programs: 13682798 -> 13658282 (-0.18%)
helped: 1683
HURT: 360
helped stats (abs) min: 2 max: 142 x̄: 15.48 x̃: 14
helped stats (rel) min: 0.01% max: 6.37% x̄: 0.51% x̃: 0.22%
HURT stats (abs)   min: 2 max: 328 x̄: 4.25 x̃: 2
HURT stats (rel)   min: 0.02% max: 1.55% x̄: 0.14% x̃: 0.11%
95% mean confidence interval for cycles value: -12.60 -11.40
95% mean confidence interval for cycles %-change: -0.43% -0.37%
Cycles are helped.

LOST:   7
GAINED: 7

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 1db6d7a2bfe..60b97e14b1a 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -890,6 +890,9 @@ late_optimizations = [
(('b2f(is_used_more_than_once)', ('inot', a)), ('bcsel', a, 0.0, 1.0)),
(('fneg(is_used_more_than_once)', ('b2f', ('inot', a))), ('bcsel', a, -0.0, 
-1.0)),
 
+   (('~fadd@32', 1.0, ('fmul(is_used_once)', c , ('fadd', b, -1.0 ))), 
('fadd', ('fadd', 1.0, ('fneg', c)), ('fmul', b, c)), 'options->lower_flrp32'),
+   (('~fadd@64', 1.0, ('fmul(is_used_once)', c , ('fadd', b, -1.0 ))), 
('fadd', ('fadd', 1.0, ('fneg', c)), ('fmul', b, c)), 'options->lower_flrp64'),
+
# we do these late so that we don't get in the way of creating ffmas
(('fmin', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', 
b)), ('fadd', c, ('fmin', a, b))),
(('fmax', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', 
b)), ('fadd', c, ('fmax', a, b))),
-- 
2.14.4


[Mesa-dev] [PATCH 4/5] i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for imageAtomicAdd of +1 or -1

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

No changes on any other Intel platforms.

Skylake
total instructions in shared programs: 14304261 -> 14304241 (<.01%)
instructions in affected programs: 1625 -> 1605 (-1.23%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5
helped stats (rel) min: 1.01% max: 14.29% x̄: 5.86% x̃: 4.07%
95% mean confidence interval for instructions value: -10.66 0.66
95% mean confidence interval for instructions %-change: -15.91% 4.19%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 527531226 -> 527531194 (<.01%)
cycles in affected programs: 92204 -> 92172 (-0.03%)
helped: 2
HURT: 0

Haswell and Broadwell had similar results. (Broadwell shown)
total instructions in shared programs: 14615730 -> 14615710 (<.01%)
instructions in affected programs: 1838 -> 1818 (-1.09%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5
helped stats (rel) min: 0.89% max: 13.04% x̄: 5.37% x̃: 3.78%
95% mean confidence interval for instructions value: -10.66 0.66
95% mean confidence interval for instructions %-change: -14.59% 3.85%
Inconclusive result (value mean confidence interval includes 0).
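
The selection logic in this patch can be summarized with a short sketch. It is a simplified model, not the driver code: the string constants stand in for the BRW_AOP_* enum values, select_atomic_add is a hypothetical helper, and const_value is assumed to be None whenever the addend is not a compile-time constant.

```python
# Stand-ins for the real BRW_AOP_* enum values (illustrative only).
BRW_AOP_ADD, BRW_AOP_INC, BRW_AOP_DEC = "ADD", "INC", "DEC"


def select_atomic_add(const_value, num_srcs):
    """Pick the atomic opcode and source count for an atomic add.

    const_value is the constant addend, or None if it is not constant.
    INC and DEC take no data operand, so one source is dropped.
    """
    if const_value == 1:
        return BRW_AOP_INC, num_srcs - 1
    if const_value == -1:
        return BRW_AOP_DEC, num_srcs - 1
    return BRW_AOP_ADD, num_srcs


print(select_atomic_add(1, 4))
print(select_atomic_add(-1, 4))
print(select_atomic_add(None, 4))
```

Dropping the data source is where the saving comes from: the constant no longer has to be loaded into a register and sent with the message.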

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_fs_nir.cpp | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 40889579d5c..7d2ba247d69 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3895,11 +3895,28 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
   var->data.image.write_only ? GL_NONE : format);
   } else {
  int op;
+ unsigned num_srcs = info->num_srcs;
 
  switch (instr->intrinsic) {
- case nir_intrinsic_image_deref_atomic_add:
+ case nir_intrinsic_image_deref_atomic_add: {
+assert(num_srcs == 4);
+
 op = BRW_AOP_ADD;
+
+const nir_const_value *const val =
+   nir_src_as_const_value(instr->src[3]);
+if (val != NULL) {
+   if (val->i32[0] == 1) {
+  op = BRW_AOP_INC;
+  num_srcs = 3;
+   } else if (val->i32[0] == -1) {
+  op = BRW_AOP_DEC;
+  num_srcs = 3;
+   }
+}
+
 break;
+ }
  case nir_intrinsic_image_deref_atomic_min:
 op = (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
  BRW_AOP_IMIN : BRW_AOP_UMIN);
@@ -3927,10 +3944,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 unreachable("Not reachable.");
  }
 
- const fs_reg src0 = (info->num_srcs >= 4 ?
+ const fs_reg src0 = (num_srcs >= 4 ?
   retype(get_nir_src(instr->src[3]), base_type) :
   fs_reg());
- const fs_reg src1 = (info->num_srcs >= 5 ?
+ const fs_reg src1 = (num_srcs >= 5 ?
   retype(get_nir_src(instr->src[4]), base_type) :
   fs_reg());
 
-- 
2.14.4



[Mesa-dev] [PATCH 2/5] i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

Funny story... a single shader was hurt for instructions, spills, fills.
That same shader was also the most helped for cycles.  #GPUsAreWeird

No changes on any other Intel platform.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14304116 -> 14304261 (<.01%)
instructions in affected programs: 12776 -> 12921 (1.13%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 2.32 x̃: 1
helped stats (rel) min: 0.05% max: 7.27% x̄: 0.92% x̃: 0.55%
HURT stats (abs)   min: 189 max: 189 x̄: 189.00 x̃: 189
HURT stats (rel)   min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87%
95% mean confidence interval for instructions value: -12.83 27.33
95% mean confidence interval for instructions %-change: -1.57% 0.31%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 527552861 -> 527531226 (<.01%)
cycles in affected programs: 1459195 -> 1437560 (-1.48%)
helped: 16
HURT: 2
helped stats (abs) min: 2 max: 21328 x̄: 1353.69 x̃: 6
helped stats (rel) min: 0.01% max: 5.29% x̄: 0.36% x̃: 0.03%
HURT stats (abs)   min: 12 max: 12 x̄: 12.00 x̃: 12
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -3699.81 1295.92
95% mean confidence interval for cycles %-change: -0.94% 0.30%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8025 -> 8033 (0.10%)
spills in affected programs: 208 -> 216 (3.85%)
helped: 1
HURT: 1

total fills in shared programs: 10989 -> 11040 (0.46%)
fills in affected programs: 444 -> 495 (11.49%)
helped: 1
HURT: 1

Ivy Bridge
total instructions in shared programs: 11709181 -> 11709153 (<.01%)
instructions in affected programs: 3505 -> 3477 (-0.80%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 9.33 x̃: 4
helped stats (rel) min: 0.11% max: 1.16% x̄: 0.63% x̃: 0.61%

total cycles in shared programs: 254741126 -> 254738801 (<.01%)
cycles in affected programs: 919067 -> 916742 (-0.25%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 2144 x̄: 775.00 x̃: 160
helped stats (rel) min: 0.03% max: 0.90% x̄: 0.32% x̃: 0.03%

total spills in shared programs: 4536 -> 4533 (-0.07%)
spills in affected programs: 40 -> 37 (-7.50%)
helped: 1
HURT: 0

total fills in shared programs: 4819 -> 4813 (-0.12%)
fills in affected programs: 94 -> 88 (-6.38%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_fs_nir.cpp | 38 --
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 9c9df5ac09f..a2c3d715380 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3659,9 +3659,20 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
   break;
}
 
-   case nir_intrinsic_shared_atomic_add:
-  nir_emit_shared_atomic(bld, BRW_AOP_ADD, instr);
+   case nir_intrinsic_shared_atomic_add: {
+  int op = BRW_AOP_ADD;
+  const nir_const_value *const val = nir_src_as_const_value(instr->src[1]);
+
+  if (val != NULL) {
+ if (val->i32[0] == 1)
+op = BRW_AOP_INC;
+ else if (val->i32[0] == -1)
+op = BRW_AOP_DEC;
+  }
+
+  nir_emit_shared_atomic(bld, op, instr);
   break;
+   }
case nir_intrinsic_shared_atomic_imin:
   nir_emit_shared_atomic(bld, BRW_AOP_IMIN, instr);
   break;
@@ -4377,9 +4388,20 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
   break;
}
 
-   case nir_intrinsic_ssbo_atomic_add:
-  nir_emit_ssbo_atomic(bld, BRW_AOP_ADD, instr);
+   case nir_intrinsic_ssbo_atomic_add: {
+  int op = BRW_AOP_ADD;
+  const nir_const_value *const val = nir_src_as_const_value(instr->src[2]);
+
+  if (val != NULL) {
+ if (val->i32[0] == 1)
+op = BRW_AOP_INC;
+ else if (val->i32[0] == -1)
+op = BRW_AOP_DEC;
+  }
+
+  nir_emit_ssbo_atomic(bld, op, instr);
   break;
+   }
case nir_intrinsic_ssbo_atomic_imin:
   nir_emit_ssbo_atomic(bld, BRW_AOP_IMIN, instr);
   break;
@@ -4888,7 +4910,9 @@ fs_visitor::nir_emit_ssbo_atomic(const fs_builder &bld,
}
 
fs_reg offset = get_nir_src(instr->src[1]);
-   fs_reg data1 = get_nir_src(instr->src[2]);
+   fs_reg data1;
+   if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
+  data1 = get_nir_src(instr->src[2]);
fs_reg data2;
if (op == BRW_AOP_CMPWR)
   data2 = get_nir_src(instr->src[3]);
@@ -4962,7 +4986,9 @@ fs_visitor::nir_emit_shared_atomic(const fs_builder &bld,
 
fs_reg surface = brw_imm_ud(GEN7_BTI_SLM);
fs_reg offset;
-   fs_reg data1 = get_nir_src(instr->src[1]);
+   fs_reg data1;
+   if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
+  data1 = get_nir_src(instr->src[1]);
fs_reg data2;
if (op == BRW_AOP_CMPWR)
   data2 = 

[Mesa-dev] [PATCH 1/5] intel/compiler: Silence unused parameter warnings in brw_eu.h

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

All of the other brw_*_desc functions take a devinfo parameter, and all
of the others at least have an assert that uses it.  Keep the parameter,
but mark it as unused.
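
As a rough illustration of the approach, the sketch below keeps an unread parameter in a signature while telling the compiler it is intentionally unused. The names SKETCH_UNUSED, sketch_device_info and sketch_pixel_interp_desc are hypothetical stand-ins, the attribute spelling assumes GCC or Clang, and the bit layout of the returned descriptor is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an UNUSED macro; assumes a GCC-compatible compiler. */
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

/* Hypothetical device description, not the real gen_device_info. */
struct sketch_device_info {
   int gen;
};

/* devinfo stays in the signature so this function matches its siblings,
 * but it is never read, so it is marked to silence -Wunused-parameter.
 */
static inline uint32_t
sketch_pixel_interp_desc(SKETCH_UNUSED const struct sketch_device_info *devinfo,
                         unsigned msg_type, int noperspective)
{
   /* Arbitrary layout for the sketch: 2 bits of msg_type at bit 12,
    * the noperspective flag at bit 14.
    */
   assert(msg_type < 4);
   return (msg_type << 12) | ((uint32_t)(noperspective != 0) << 14);
}
```

Callers keep passing a device pointer exactly as they do for the other descriptor helpers; only the warning goes away.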

Silences 37 warnings like:

In file included from src/intel/common/gen_disasm.c:27:0:
src/intel/compiler/brw_eu.h: In function ‘brw_pixel_interp_desc’:
src/intel/compiler/brw_eu.h:377:53: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_pixel_interp_desc(const struct gen_device_info *devinfo,
 ^~~

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_eu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
index 2228b022404..9f1ca769bd3 100644
--- a/src/intel/compiler/brw_eu.h
+++ b/src/intel/compiler/brw_eu.h
@@ -374,7 +374,7 @@ brw_dp_surface_desc(const struct gen_device_info *devinfo,
  * interpolator function controls.
  */
 static inline uint32_t
-brw_pixel_interp_desc(const struct gen_device_info *devinfo,
+brw_pixel_interp_desc(UNUSED const struct gen_device_info *devinfo,
   unsigned msg_type,
   bool noperspective,
   unsigned simd_mode,
-- 
2.14.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] i965/fs: Refactor image atomics to be a bit more like other atomics

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

This greatly simplifies the next patch.

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_fs_nir.cpp | 84 ---
 1 file changed, 44 insertions(+), 40 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index a2c3d715380..40889579d5c 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -1795,36 +1795,6 @@ get_image_base_type(const glsl_type *type)
}
 }
 
-/**
- * Get the appropriate atomic op for an image atomic intrinsic.
- */
-static unsigned
-get_image_atomic_op(nir_intrinsic_op op, const glsl_type *type)
-{
-   switch (op) {
-   case nir_intrinsic_image_deref_atomic_add:
-  return BRW_AOP_ADD;
-   case nir_intrinsic_image_deref_atomic_min:
-  return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
-  BRW_AOP_IMIN : BRW_AOP_UMIN);
-   case nir_intrinsic_image_deref_atomic_max:
-  return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
-  BRW_AOP_IMAX : BRW_AOP_UMAX);
-   case nir_intrinsic_image_deref_atomic_and:
-  return BRW_AOP_AND;
-   case nir_intrinsic_image_deref_atomic_or:
-  return BRW_AOP_OR;
-   case nir_intrinsic_image_deref_atomic_xor:
-  return BRW_AOP_XOR;
-   case nir_intrinsic_image_deref_atomic_exchange:
-  return BRW_AOP_MOV;
-   case nir_intrinsic_image_deref_atomic_comp_swap:
-  return BRW_AOP_CMPWR;
-   default:
-  unreachable("Not reachable.");
-   }
-}
-
 static fs_inst *
 emit_pixel_interpolater_send(const fs_builder &bld,
  enum opcode opcode,
@@ -3914,26 +3884,60 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
   const fs_reg image = get_nir_image_deref(deref);
   const fs_reg addr = retype(get_nir_src(instr->src[1]),
  BRW_REGISTER_TYPE_UD);
-  const fs_reg src0 = (info->num_srcs >= 4 ?
-   retype(get_nir_src(instr->src[3]), base_type) :
-   fs_reg());
-  const fs_reg src1 = (info->num_srcs >= 5 ?
-   retype(get_nir_src(instr->src[4]), base_type) :
-   fs_reg());
   fs_reg tmp;
 
   /* Emit an image load, store or atomic op. */
   if (instr->intrinsic == nir_intrinsic_image_deref_load)
  tmp = emit_image_load(bld, image, addr, surf_dims, arr_dims, format);
-
-  else if (instr->intrinsic == nir_intrinsic_image_deref_store)
+  else if (instr->intrinsic == nir_intrinsic_image_deref_store) {
+ const fs_reg src0 = retype(get_nir_src(instr->src[3]), base_type);
  emit_image_store(bld, image, addr, src0, surf_dims, arr_dims,
   var->data.image.write_only ? GL_NONE : format);
+  } else {
+ int op;
+
+ switch (instr->intrinsic) {
+ case nir_intrinsic_image_deref_atomic_add:
+op = BRW_AOP_ADD;
+break;
+ case nir_intrinsic_image_deref_atomic_min:
+op = (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
+ BRW_AOP_IMIN : BRW_AOP_UMIN);
+break;
+ case nir_intrinsic_image_deref_atomic_max:
+op = (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
+ BRW_AOP_IMAX : BRW_AOP_UMAX);
+break;
+ case nir_intrinsic_image_deref_atomic_and:
+op = BRW_AOP_AND;
+break;
+ case nir_intrinsic_image_deref_atomic_or:
+op = BRW_AOP_OR;
+break;
+ case nir_intrinsic_image_deref_atomic_xor:
+op = BRW_AOP_XOR;
+break;
+ case nir_intrinsic_image_deref_atomic_exchange:
+op = BRW_AOP_MOV;
+break;
+ case nir_intrinsic_image_deref_atomic_comp_swap:
+op = BRW_AOP_CMPWR;
+break;
+ default:
+unreachable("Not reachable.");
+ }
+
+ const fs_reg src0 = (info->num_srcs >= 4 ?
+  retype(get_nir_src(instr->src[3]), base_type) :
+  fs_reg());
+ const fs_reg src1 = (info->num_srcs >= 5 ?
+  retype(get_nir_src(instr->src[4]), base_type) :
+  fs_reg());
 
-  else
  tmp = emit_image_atomic(bld, image, addr, src0, src1,
  surf_dims, arr_dims, dest_components,
- get_image_atomic_op(instr->intrinsic, type));
+ op);
+  }
 
   /* Assign the result. */
   for (unsigned c = 0; c < dest_components; ++c) {
-- 
2.14.4


[Mesa-dev] [PATCH 0/5] Emit BRW_AOP_INC or BRW_AOP_DEC

2018-08-24 Thread Ian Romanick
I don't know why we never did this.  Almost every shader in shader-db
that uses atomicAdd or imageAtomicAdd uses it with a constant of 1 or
-1.
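
A minimal sketch of the selection logic the series applies, written as plain C rather than against the real NIR or BRW types. The enum values and both helper names are hypothetical; only the rule (constant +1 becomes an increment, constant -1 becomes a decrement, and neither needs a data operand) comes from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcodes standing in for BRW_AOP_ADD, BRW_AOP_INC, BRW_AOP_DEC. */
enum sketch_aop {
   SKETCH_AOP_ADD,
   SKETCH_AOP_INC,
   SKETCH_AOP_DEC
};

/* Pick the atomic opcode for an add whose operand may be a known constant. */
static enum sketch_aop
sketch_select_add_op(bool is_const, int32_t value)
{
   if (is_const) {
      if (value == 1)
         return SKETCH_AOP_INC;
      if (value == -1)
         return SKETCH_AOP_DEC;
   }

   /* Non-constant operands and every other constant stay a plain add. */
   return SKETCH_AOP_ADD;
}

/* INC and DEC carry no data operand, so the source need not be fetched. */
static bool
sketch_needs_data_operand(enum sketch_aop op)
{
   /* Only the three opcodes modeled in this sketch are valid here. */
   assert(op == SKETCH_AOP_ADD || op == SKETCH_AOP_INC || op == SKETCH_AOP_DEC);
   return op != SKETCH_AOP_INC && op != SKETCH_AOP_DEC;
}
```

Skipping the operand fetch is where the savings would come from: the register that held the constant 1 or -1 is no longer needed for the message.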

Results on Skylake across the whole series are below.  There is some
discussion in patch 2 about the 189 instructions (!) added.

total instructions in shared programs: 14304116 -> 14304241 (<.01%)
instructions in affected programs: 12811 -> 12936 (0.98%)
helped: 21
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 3.05 x̃: 1
helped stats (rel) min: 0.05% max: 14.29% x̄: 1.95% x̃: 0.69%
HURT stats (abs)   min: 189 max: 189 x̄: 189.00 x̃: 189
HURT stats (rel)   min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87%
95% mean confidence interval for instructions value: -12.55 23.91
95% mean confidence interval for instructions %-change: -3.27% -0.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 527552861 -> 527531194 (<.01%)
cycles in affected programs: 1459195 -> 1437528 (-1.48%)
helped: 18
HURT: 0
helped stats (abs) min: 2 max: 21328 x̄: 1203.72 x̃: 6
helped stats (rel) min: <.01% max: 5.29% x̄: 0.32% x̃: 0.03%
95% mean confidence interval for cycles value: -3701.36 1293.92
95% mean confidence interval for cycles %-change: -0.94% 0.29%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8025 -> 8033 (0.10%)
spills in affected programs: 208 -> 216 (3.85%)
helped: 1
HURT: 1

total fills in shared programs: 10989 -> 11040 (0.46%)
fills in affected programs: 444 -> 495 (11.49%)
helped: 1
HURT: 1



[Mesa-dev] [PATCH 5/5] i965/vec4: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1

2018-08-24 Thread Ian Romanick
From: Ian Romanick 

No shader-db changes on any Intel platform.

Signed-off-by: Ian Romanick 
---
 src/intel/compiler/brw_vec4_nir.cpp | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp
index 4c3a2d2e10a..124714b59de 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@ -709,9 +709,20 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   break;
}
 
-   case nir_intrinsic_ssbo_atomic_add:
-  nir_emit_ssbo_atomic(BRW_AOP_ADD, instr);
+   case nir_intrinsic_ssbo_atomic_add: {
+  int op = BRW_AOP_ADD;
+  const nir_const_value *const val = nir_src_as_const_value(instr->src[2]);
+
+  if (val != NULL) {
+ if (val->i32[0] == 1)
+op = BRW_AOP_INC;
+ else if (val->i32[0] == -1)
+op = BRW_AOP_DEC;
+  }
+
+  nir_emit_ssbo_atomic(op, instr);
   break;
+   }
case nir_intrinsic_ssbo_atomic_imin:
   nir_emit_ssbo_atomic(BRW_AOP_IMIN, instr);
   break;
@@ -937,7 +948,9 @@ vec4_visitor::nir_emit_ssbo_atomic(int op, nir_intrinsic_instr *instr)
}
 
src_reg offset = get_nir_src(instr->src[1], 1);
-   src_reg data1 = get_nir_src(instr->src[2], 1);
+   src_reg data1;
+   if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
+  data1 = get_nir_src(instr->src[2], 1);
src_reg data2;
if (op == BRW_AOP_CMPWR)
   data2 = get_nir_src(instr->src[3], 1);
-- 
2.14.4


Re: [Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-24 Thread Ilia Mirkin
On Fri, Aug 24, 2018 at 9:39 PM, Nanley Chery  wrote:
> On Fri, Aug 24, 2018 at 09:17:03PM -0400, Ilia Mirkin wrote:
>> On Fri, Aug 24, 2018 at 8:46 PM, Nanley Chery  wrote:
>> > According to internal docs, some gen9 platforms have a pixel shader push
>> > constant synchronization issue. Although not listed among said
>> > platforms, this issue seems to be present on the GeminiLake 2x6's we've
>> > tested.
>> >
>> > We consider the available workarounds to be too detrimental on
>> > performance. Instead, we mitigate the issue by applying part of one of
>> > the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
>> > (as suggested by Ken).
>> >
>> > Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
>> > following options:
>> > * 6 depth_draw small depthstencil
>> > * 8 stencil_draw small depthstencil
>> > * 6 stencil_draw small depthstencil
>> > * 8 depth_resolve small
>> > * 6 stencil_resolve small depthstencil
>> > * 4 stencil_draw small depthstencil
>> > * 16 stencil_draw small depthstencil
>> > * 16 depth_draw small depthstencil
>> > * 2 stencil_resolve small depthstencil
>> > * 6 stencil_draw small
>> > * all_samples stencil_draw small
>> > * 2 depth_draw small depthstencil
>> > * all_samples depth_draw small depthstencil
>> > * all_samples stencil_resolve small
>> > * 4 depth_draw small depthstencil
>> > * all_samples depth_draw small
>> > * all_samples stencil_draw small depthstencil
>> > * 4 stencil_resolve small depthstencil
>> > * 4 depth_resolve small depthstencil
>> > * all_samples stencil_resolve small depthstencil
>> >
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
>> > Cc: 
>> > ---
>> >  src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
>> >  1 file changed, 23 insertions(+)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
>> > b/src/mesa/drivers/dri/i965/gen7_urb.c
>> > index 2e5f8e60ba9..cb045251236 100644
>> > --- a/src/mesa/drivers/dri/i965/gen7_urb.c
>> > +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
>> > @@ -118,6 +118,28 @@ gen7_emit_push_constant_state(struct brw_context 
>> > *brw, unsigned vs_size,
>> > const struct gen_device_info *devinfo = &brw->screen->devinfo;
>> > unsigned offset = 0;
>> >
>> > +   /* From the SKL PRM, Workarounds section (#878):
>> > +*
>> > +*Push constant buffer corruption possible. WA: Insert 2 
>> > zero-length
>> > +*PushConst_PS before every intended PushConst_PS update, issue a
>> > +*NULLPRIM after each of the zero len PC update to make sure CS 
>> > commits
>> > +*them.
>> > +*
>> > +* This workaround is attempting to solve a pixel shader push constant
>> > +* synchronization issue.
>> > +*
>> > +* There's an unpublished WA that involves re-emitting
>> > +* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
>> > +* packets. Since our counting methods may not be reliable due to
>> > +* context-switching and pre-emption, we instead choose to approximate 
>> > this
>> > +* behavior by re-emitting the packet at the top of the batch.
>> > +*/
>> > +   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
>>
>> Did you want & here?
>>
>
> Using & would prevent push constant allocation on non-GLK 2x6 devices
> if we had a NEW_BATCH and NEW_GEOMETRY_PROGRAM, which I think we don't
> want.
>
> If the equality fails, we'll emit push constant allocation packets,
> which is what we want. This block basically filters out the cases in
> which we're emitting this packet unnecessarily due to adding the
> BRW_NEW_BATCH dirty flag below.

Got it. You want to bail if only NEW_BATCH is set and it's not GLK
2x6. Makes sense. Sorry for the noise!
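
The difference between the two tests can be sketched as follows. The flag values and the function name are hypothetical (the real BRW_NEW_* bits are defined by the driver), and the GLK 2x6 check is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits; not the real BRW_NEW_* values. */
#define SKETCH_NEW_CONTEXT          (1ull << 0)
#define SKETCH_NEW_BATCH            (1ull << 1)
#define SKETCH_NEW_GEOMETRY_PROGRAM (1ull << 2)

/* Returns true when the push constant allocation packets should be emitted. */
static bool
sketch_should_emit(uint64_t new_driver_state, bool is_glk_2x6)
{
   /* The atom only runs when at least one of its dirty bits is set. */
   assert(new_driver_state != 0);

   /* Equality, not '&': bail only when NEW_BATCH is the sole dirty bit,
    * i.e. when the atom was triggered purely by the workaround flag.
    */
   if (new_driver_state == SKETCH_NEW_BATCH && !is_glk_2x6)
      return false;

   return true;
}
```

With '&' in place of '==', a state of NEW_BATCH | NEW_GEOMETRY_PROGRAM on other hardware would also bail, and the allocation that the geometry program change requires would be lost.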

  -ilia

[Mesa-dev] [Bug 107680] OpenCL does not work on Ubuntu 18.04 with Nvidia

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107680

Devyn Collier Johnson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |NOTOURBUG

--- Comment #2 from Devyn Collier Johnson  ---
Good idea, Rhys. Thanks!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-24 Thread Nanley Chery
On Fri, Aug 24, 2018 at 09:17:03PM -0400, Ilia Mirkin wrote:
> On Fri, Aug 24, 2018 at 8:46 PM, Nanley Chery  wrote:
> > According to internal docs, some gen9 platforms have a pixel shader push
> > constant synchronization issue. Although not listed among said
> > platforms, this issue seems to be present on the GeminiLake 2x6's we've
> > tested.
> >
> > We consider the available workarounds to be too detrimental on
> > performance. Instead, we mitigate the issue by applying part of one of
> > the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
> > (as suggested by Ken).
> >
> > Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
> > following options:
> > * 6 depth_draw small depthstencil
> > * 8 stencil_draw small depthstencil
> > * 6 stencil_draw small depthstencil
> > * 8 depth_resolve small
> > * 6 stencil_resolve small depthstencil
> > * 4 stencil_draw small depthstencil
> > * 16 stencil_draw small depthstencil
> > * 16 depth_draw small depthstencil
> > * 2 stencil_resolve small depthstencil
> > * 6 stencil_draw small
> > * all_samples stencil_draw small
> > * 2 depth_draw small depthstencil
> > * all_samples depth_draw small depthstencil
> > * all_samples stencil_resolve small
> > * 4 depth_draw small depthstencil
> > * all_samples depth_draw small
> > * all_samples stencil_draw small depthstencil
> > * 4 stencil_resolve small depthstencil
> > * 4 depth_resolve small depthstencil
> > * all_samples stencil_resolve small depthstencil
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
> > Cc: 
> > ---
> >  src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> > b/src/mesa/drivers/dri/i965/gen7_urb.c
> > index 2e5f8e60ba9..cb045251236 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> > @@ -118,6 +118,28 @@ gen7_emit_push_constant_state(struct brw_context *brw, 
> > unsigned vs_size,
> > const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > unsigned offset = 0;
> >
> > +   /* From the SKL PRM, Workarounds section (#878):
> > +*
> > +*Push constant buffer corruption possible. WA: Insert 2 zero-length
> > +*PushConst_PS before every intended PushConst_PS update, issue a
> > +*NULLPRIM after each of the zero len PC update to make sure CS 
> > commits
> > +*them.
> > +*
> > +* This workaround is attempting to solve a pixel shader push constant
> > +* synchronization issue.
> > +*
> > +* There's an unpublished WA that involves re-emitting
> > +* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
> > +* packets. Since our counting methods may not be reliable due to
> > +* context-switching and pre-emption, we instead choose to approximate 
> > this
> > +* behavior by re-emitting the packet at the top of the batch.
> > +*/
> > +   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
> 
> Did you want & here?
> 

Using & would prevent push constant allocation on non-GLK 2x6 devices
if we had a NEW_BATCH and NEW_GEOMETRY_PROGRAM, which I think we don't
want.

If the equality fails, we'll emit push constant allocation packets,
which is what we want. This block basically filters out the cases in
which we're emitting this packet unnecessarily due to adding the
BRW_NEW_BATCH dirty flag below.

-Nanley

> > +   /* Only GLK 2x6 has demonstrated this issue thus far. */
> > +  if (!devinfo->is_geminilake || devinfo->num_subslices[0] != 2)
> > + return;
> > +   }
> > +
> > BEGIN_BATCH(10);
> > OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
> > OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
> > @@ -154,6 +176,7 @@ const struct brw_tracked_state gen7_push_constant_space 
> > = {
> > .dirty = {
> >.mesa = 0,
> >.brw = BRW_NEW_CONTEXT |
> > + BRW_NEW_BATCH | /* GLK workaround */
> >   BRW_NEW_GEOMETRY_PROGRAM |
> >   BRW_NEW_TESS_PROGRAMS,
> > },
> > --
> > 2.18.0
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] nir: Create sampler2D variables in nir_lower_{bitmap, drawpixels}.

2018-08-24 Thread Kenneth Graunke
This is needed for nir_gather_info to actually count the new textures,
since it operates solely on variables.
---
 src/compiler/nir/nir_lower_bitmap.c |  7 +++
 src/compiler/nir/nir_lower_drawpixels.c | 17 -
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_bitmap.c b/src/compiler/nir/nir_lower_bitmap.c
index a4d9498576c..03eb6273129 100644
--- a/src/compiler/nir/nir_lower_bitmap.c
+++ b/src/compiler/nir/nir_lower_bitmap.c
@@ -88,6 +88,13 @@ lower_bitmap(nir_shader *shader, nir_builder *b,
 
texcoord = nir_load_var(b, get_texcoord(shader));
 
+   const struct glsl_type *sampler2D =
+  glsl_sampler_type(GLSL_SAMPLER_DIM_2D, false, false, GLSL_TYPE_FLOAT);
+
+   nir_variable *tex_var =
+  nir_variable_create(shader, nir_var_uniform, sampler2D, "bitmap_tex");
+   tex_var->data.binding = options->sampler;
+
tex = nir_tex_instr_create(shader, 1);
tex->op = nir_texop_tex;
tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
diff --git a/src/compiler/nir/nir_lower_drawpixels.c b/src/compiler/nir/nir_lower_drawpixels.c
index 462b9c308b2..99eb646b245 100644
--- a/src/compiler/nir/nir_lower_drawpixels.c
+++ b/src/compiler/nir/nir_lower_drawpixels.c
@@ -35,7 +35,7 @@ typedef struct {
const nir_lower_drawpixels_options *options;
nir_shader   *shader;
nir_builder   b;
-   nir_variable *texcoord, *scale, *bias;
+   nir_variable *texcoord, *scale, *bias, *tex, *pixelmap;
 } lower_drawpixels_state;
 
 static nir_ssa_def *
@@ -125,6 +125,15 @@ lower_color(lower_drawpixels_state *state, nir_intrinsic_instr *intr)
 
texcoord = get_texcoord(state);
 
+   const struct glsl_type *sampler2D =
+  glsl_sampler_type(GLSL_SAMPLER_DIM_2D, false, false, GLSL_TYPE_FLOAT);
+
+   if (!state->tex) {
+  state->tex =
+ nir_variable_create(b->shader, nir_var_uniform, sampler2D, "drawpix");
+  state->tex->data.binding = state->options->drawpix_sampler;
+   }
+
/* replace load_var(gl_Color) w/ texture sample:
 *   TEX def, texcoord, drawpix_sampler, 2D
 */
@@ -151,6 +160,12 @@ lower_color(lower_drawpixels_state *state, nir_intrinsic_instr *intr)
}
 
if (state->options->pixel_maps) {
+  if (!state->pixelmap) {
+ state->pixelmap = nir_variable_create(b->shader, nir_var_uniform,
+   sampler2D, "pixelmap");
+ state->pixelmap->data.binding = state->options->pixelmap_sampler;
+  }
+
   /* do four pixel map look-ups with two TEX instructions: */
   nir_ssa_def *def_xy, *def_zw;
 
-- 
2.18.0


[Mesa-dev] [PATCH 3/3] st/nir: Call nir_gather_info in st_finalize_nir.

2018-08-24 Thread Kenneth Graunke
Several of the passes change varyings.  This is necessary for
inputs_read and outputs_written to be accurate.
---
 src/mesa/state_tracker/st_glsl_to_nir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index ae2c49960c9..7d4c20730c3 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -857,6 +857,8 @@ st_finalize_nir(struct st_context *st, struct gl_program *prog,
   NIR_PASS_V(nir, gl_nir_lower_samplers_as_deref, shader_program);
else
   NIR_PASS_V(nir, gl_nir_lower_samplers, shader_program);
+
+   nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir));
 }
 
 } /* extern "C" */
-- 
2.18.0


[Mesa-dev] [PATCH 2/3] nir: Create sampler variables in prog_to_nir.

2018-08-24 Thread Kenneth Graunke
This is needed for nir_gather_info to actually count the textures,
since it operates solely on variables.
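
The lazy, once-per-unit creation pattern can be sketched in isolation as below. The types are hypothetical stand-ins (sketch_var is not nir_variable), and the fixed storage array replaces the real allocator purely to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_UNITS 32  /* mirrors the 32-entry array added by the patch */

/* Hypothetical stand-in for a sampler variable. */
struct sketch_var {
   unsigned binding;
};

struct sketch_compile {
   struct sketch_var *sampler_vars[SKETCH_MAX_UNITS]; /* NULL until first use */
   struct sketch_var storage[SKETCH_MAX_UNITS];       /* backing store for the sketch */
   unsigned num_created;
};

/* Create the variable for a texture unit on first use; reuse it afterwards. */
static struct sketch_var *
sketch_get_sampler_var(struct sketch_compile *c, unsigned unit)
{
   assert(unit < SKETCH_MAX_UNITS);

   if (!c->sampler_vars[unit]) {
      struct sketch_var *var = &c->storage[c->num_created++];
      var->binding = unit;
      c->sampler_vars[unit] = var;
   }

   return c->sampler_vars[unit];
}
```

Because each unit gets at most one variable, a pass that counts textures by walking the variable list would see every unit that is sampled exactly once, no matter how many texture instructions reference it.
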
---
 src/mesa/program/prog_to_nir.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
index 14e57b6c6a1..1f0607542e8 100644
--- a/src/mesa/program/prog_to_nir.c
+++ b/src/mesa/program/prog_to_nir.c
@@ -52,6 +52,7 @@ struct ptn_compile {
nir_variable *parameters;
nir_variable *input_vars[VARYING_SLOT_MAX];
nir_variable *output_vars[VARYING_SLOT_MAX];
+   nir_variable *sampler_vars[32]; /* matches number of bits in TexSrcUnit */
nir_register **output_regs;
nir_register **temp_regs;
 
@@ -484,9 +485,10 @@ ptn_kil(nir_builder *b, nir_ssa_def **src)
 }
 
 static void
-ptn_tex(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src,
+ptn_tex(struct ptn_compile *c, nir_alu_dest dest, nir_ssa_def **src,
 struct prog_instruction *prog_inst)
 {
+   nir_builder *b = &c->build;
nir_tex_instr *instr;
nir_texop op;
unsigned num_srcs;
@@ -568,6 +570,15 @@ ptn_tex(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src,
   unreachable("can't reach");
}
 
+   if (!c->sampler_vars[prog_inst->TexSrcUnit]) {
+  const struct glsl_type *type =
+ glsl_sampler_type(instr->sampler_dim, false, false, GLSL_TYPE_FLOAT);
+  nir_variable *var =
+ nir_variable_create(b->shader, nir_var_uniform, type, "sampler");
+  var->data.binding = prog_inst->TexSrcUnit;
+  c->sampler_vars[prog_inst->TexSrcUnit] = var;
+   }
+
unsigned src_number = 0;
 
instr->src[src_number].src =
@@ -784,7 +795,7 @@ ptn_emit_instruction(struct ptn_compile *c, struct prog_instruction *prog_inst)
case OPCODE_TXD:
case OPCODE_TXL:
case OPCODE_TXP:
-  ptn_tex(b, dest, src, prog_inst);
+  ptn_tex(c, dest, src, prog_inst);
   break;
 
case OPCODE_SWZ:
-- 
2.18.0


Re: [Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-24 Thread Ilia Mirkin
On Fri, Aug 24, 2018 at 8:46 PM, Nanley Chery  wrote:
> According to internal docs, some gen9 platforms have a pixel shader push
> constant synchronization issue. Although not listed among said
> platforms, this issue seems to be present on the GeminiLake 2x6's we've
> tested.
>
> We consider the available workarounds to be too detrimental on
> performance. Instead, we mitigate the issue by applying part of one of
> the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
> (as suggested by Ken).
>
> Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
> following options:
> * 6 depth_draw small depthstencil
> * 8 stencil_draw small depthstencil
> * 6 stencil_draw small depthstencil
> * 8 depth_resolve small
> * 6 stencil_resolve small depthstencil
> * 4 stencil_draw small depthstencil
> * 16 stencil_draw small depthstencil
> * 16 depth_draw small depthstencil
> * 2 stencil_resolve small depthstencil
> * 6 stencil_draw small
> * all_samples stencil_draw small
> * 2 depth_draw small depthstencil
> * all_samples depth_draw small depthstencil
> * all_samples stencil_resolve small
> * 4 depth_draw small depthstencil
> * all_samples depth_draw small
> * all_samples stencil_draw small depthstencil
> * 4 stencil_resolve small depthstencil
> * 4 depth_resolve small depthstencil
> * all_samples stencil_resolve small depthstencil
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
> Cc: 
> ---
>  src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> b/src/mesa/drivers/dri/i965/gen7_urb.c
> index 2e5f8e60ba9..cb045251236 100644
> --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> @@ -118,6 +118,28 @@ gen7_emit_push_constant_state(struct brw_context *brw, 
> unsigned vs_size,
> const struct gen_device_info *devinfo = &brw->screen->devinfo;
> unsigned offset = 0;
>
> +   /* From the SKL PRM, Workarounds section (#878):
> +*
> +*Push constant buffer corruption possible. WA: Insert 2 zero-length
> +*PushConst_PS before every intended PushConst_PS update, issue a
> +*NULLPRIM after each of the zero len PC update to make sure CS 
> commits
> +*them.
> +*
> +* This workaround is attempting to solve a pixel shader push constant
> +* synchronization issue.
> +*
> +* There's an unpublished WA that involves re-emitting
> +* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
> +* packets. Since our counting methods may not be reliable due to
> +* context-switching and pre-emption, we instead choose to approximate 
> this
> +* behavior by re-emitting the packet at the top of the batch.
> +*/
> +   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {

Did you want & here?

> +   /* Only GLK 2x6 has demonstrated this issue thus far. */
> +  if (!devinfo->is_geminilake || devinfo->num_subslices[0] != 2)
> + return;
> +   }
> +
> BEGIN_BATCH(10);
> OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
> OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
> @@ -154,6 +176,7 @@ const struct brw_tracked_state gen7_push_constant_space = 
> {
> .dirty = {
>.mesa = 0,
>.brw = BRW_NEW_CONTEXT |
> + BRW_NEW_BATCH | /* GLK workaround */
>   BRW_NEW_GEOMETRY_PROGRAM |
>   BRW_NEW_TESS_PROGRAMS,
> },
> --
> 2.18.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-24 Thread Nanley Chery
According to internal docs, some gen9 platforms have a pixel shader push
constant synchronization issue. Although not listed among said
platforms, this issue seems to be present on the GeminiLake 2x6's we've
tested.

We consider the available workarounds to be too detrimental to
performance. Instead, we mitigate the issue by applying part of one of
the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
(as suggested by Ken).

Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
following options:
* 6 depth_draw small depthstencil
* 8 stencil_draw small depthstencil
* 6 stencil_draw small depthstencil
* 8 depth_resolve small
* 6 stencil_resolve small depthstencil
* 4 stencil_draw small depthstencil
* 16 stencil_draw small depthstencil
* 16 depth_draw small depthstencil
* 2 stencil_resolve small depthstencil
* 6 stencil_draw small
* all_samples stencil_draw small
* 2 depth_draw small depthstencil
* all_samples depth_draw small depthstencil
* all_samples stencil_resolve small
* 4 depth_draw small depthstencil
* all_samples depth_draw small
* all_samples stencil_draw small depthstencil
* 4 stencil_resolve small depthstencil
* 4 depth_resolve small depthstencil
* all_samples stencil_resolve small depthstencil

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
Cc: 
---
 src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c
index 2e5f8e60ba9..cb045251236 100644
--- a/src/mesa/drivers/dri/i965/gen7_urb.c
+++ b/src/mesa/drivers/dri/i965/gen7_urb.c
@@ -118,6 +118,28 @@ gen7_emit_push_constant_state(struct brw_context *brw, unsigned vs_size,
const struct gen_device_info *devinfo = &brw->screen->devinfo;
unsigned offset = 0;
 
+   /* From the SKL PRM, Workarounds section (#878):
+*
+*Push constant buffer corruption possible. WA: Insert 2 zero-length
+*PushConst_PS before every intended PushConst_PS update, issue a
+*NULLPRIM after each of the zero len PC update to make sure CS commits
+*them.
+*
+* This workaround is attempting to solve a pixel shader push constant
+* synchronization issue.
+*
+* There's an unpublished WA that involves re-emitting
+* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
+* packets. Since our counting methods may not be reliable due to
+* context-switching and pre-emption, we instead choose to approximate this
+* behavior by re-emitting the packet at the top of the batch.
+*/
+   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
+   /* Only GLK 2x6 has demonstrated this issue thus far. */
+  if (!devinfo->is_geminilake || devinfo->num_subslices[0] != 2)
+ return;
+   }
+
BEGIN_BATCH(10);
OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
@@ -154,6 +176,7 @@ const struct brw_tracked_state gen7_push_constant_space = {
.dirty = {
   .mesa = 0,
   .brw = BRW_NEW_CONTEXT |
+ BRW_NEW_BATCH | /* GLK workaround */
  BRW_NEW_GEOMETRY_PROGRAM |
  BRW_NEW_TESS_PROGRAMS,
},
-- 
2.18.0


Re: [Mesa-dev] [PATCH 1/4] intel/decoder: Clean up field iteration and fix sub-dword fields

2018-08-24 Thread Lionel Landwerlin

The whole series is:

Reviewed-by: Lionel Landwerlin 

Thanks for the great cleanup!

-
Lionel

On 24/08/2018 22:40, Jason Ekstrand wrote:

First of all, setting iter->name in iter_advance_field is unnecessary because
it gets set by gen_decode_field, which gets called immediately after
iter_advance_field in the one call-site.  Second, we weren't properly
initializing start_bit and end_bit in the initial condition of
gen_field_iterator_next, so the first field of a struct would get printed
wrong if it doesn't start on the first bit.  This is fixed by adding an
iter_start_field helper which sets the field and also sets up the other
bits we need.  This fixes decoding of 3DSTATE_SBE_SWIZ.
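
A simplified sketch of what such a helper does, assuming a flat group offset instead of the real iter_group_offset_bits() computation. All names are hypothetical; the point is only that selecting a field and computing its absolute bit range happen together, so no path can leave start_bit and end_bit stale.

```c
#include <assert.h>

/* Bit positions of a field, relative to the start of its group. */
struct sketch_field {
   int start, end;
};

struct sketch_iter {
   const struct sketch_field *field;
   int group_offset_bits;   /* stand-in for iter_group_offset_bits() */
   int start_bit, end_bit;  /* absolute bit range of the current field */
};

/* Select a field and derive its absolute bit range in one step. */
static void
sketch_start_field(struct sketch_iter *iter, const struct sketch_field *field)
{
   assert(field->start <= field->end);

   iter->field = field;
   iter->start_bit = iter->group_offset_bits + field->start;
   iter->end_bit = iter->group_offset_bits + field->end;
}
```

For a field occupying bits 4 to 7 of a group that begins at bit 32, the helper yields the absolute range 36 to 39, which is the value the initial-condition path previously never computed.
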
---
  src/intel/common/gen_decoder.c | 32 
  1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 39da3cadbf8..e9dabeae653 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -812,6 +812,18 @@ iter_more_groups(const struct gen_field_iterator *iter)
 }
  }
  
+static void
+iter_start_field(struct gen_field_iterator *iter, struct gen_field *field)
+{
+   iter->field = field;
+
+   int group_member_offset = iter_group_offset_bits(iter, iter->group_iter);
+
+   iter->start_bit = group_member_offset + iter->field->start;
+   iter->end_bit = group_member_offset + iter->field->end;
+   iter->struct_desc = NULL;
+}
+
  static void
  iter_advance_group(struct gen_field_iterator *iter)
  {
@@ -826,32 +838,20 @@ iter_advance_group(struct gen_field_iterator *iter)
}
 }
  
-   iter->field = iter->group->fields;
+   iter_start_field(iter, iter->group->fields);
  }
  
  static bool

  iter_advance_field(struct gen_field_iterator *iter)
  {
 if (iter_more_fields(iter)) {
-  iter->field = iter->field->next;
+  iter_start_field(iter, iter->field->next);
 } else {
if (!iter_more_groups(iter))
   return false;
  
iter_advance_group(iter);

 }
-
-   if (iter->field->name)
-  snprintf(iter->name, sizeof(iter->name), "%s", iter->field->name);
-   else
-  memset(iter->name, 0, sizeof(iter->name));
-
-   int group_member_offset = iter_group_offset_bits(iter, iter->group_iter);
-
-   iter->start_bit = group_member_offset + iter->field->start;
-   iter->end_bit = group_member_offset + iter->field->end;
-   iter->struct_desc = NULL;
-
 return true;
  }
  
@@ -1006,9 +1006,9 @@ gen_field_iterator_next(struct gen_field_iterator *iter)

 /* Initial condition */
 if (!iter->field) {
if (iter->group->fields)
- iter->field = iter->group->fields;
+ iter_start_field(iter, iter->group->fields);
else
- iter->field = iter->group->next->fields;
+ iter_start_field(iter, iter->group->next->fields);
  
bool result = iter_decode_field(iter);

if (iter->p_end)





[Mesa-dev] [Bug 107680] OpenCL does not work on Ubuntu 18.04 with Nvidia

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107680

Rhys Kidd  changed:

   What|Removed |Added

 Status|NEW |NEEDINFO

--- Comment #1 from Rhys Kidd  ---
Is your objective to utilize the proprietary NVIDIA drivers on Ubuntu 18.04 to
provide the OpenCL API? e.g. when you ran clinfo previously with your desired
output, it reported:

>  $ clinfo
>...
>Number of platforms   2
>  ...
>  Platform Name   NVIDIA CUDA
>  Platform Vendor NVIDIA Corporation

If so, you should request support through NVIDIA's proprietary driver support
channels (which it looks like you have [0]) or Ubuntu's distribution channels.

This bugtracker is for upstream open source Mesa drivers (including OpenCL
provided by the Mesa driver stack).


[0]
https://devtalk.nvidia.com/default/topic/1036967/linux/unable-to-use-opencl-cuda-on-ubuntu-18-04/

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 2/4] intel/decoder: Print ISL formats for vertex elements

2018-08-24 Thread Jason Ekstrand
---
 src/intel/common/gen_decoder.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index e9dabeae653..9e46f271633 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -971,7 +971,8 @@ iter_decode_field(struct gen_field_iterator *iter)
   int length = strlen(iter->value);
   snprintf(iter->value + length, sizeof(iter->value) - length,
" (%s)", enum_name);
-   } else if (strcmp(iter->name, "Surface Format") == 0) {
+   } else if (strcmp(iter->name, "Surface Format") == 0 ||
+  strcmp(iter->name, "Source Element Format") == 0) {
   if (isl_format_is_valid((enum isl_format)v.qw)) {
  const char *fmt_name = isl_format_get_name((enum isl_format)v.qw);
  int length = strlen(iter->value);
-- 
2.17.1



[Mesa-dev] [PATCH 3/4] intel/batch_decoder: Fix dynamic state printing

2018-08-24 Thread Jason Ekstrand
Instead of printing addresses like everyone else, we were accidentally
printing the offset from state base address.  Also, state_map is a void
pointer so we were incrementing in bytes instead of dwords and every
state other than the first was wrong.
---
 src/intel/common/gen_batch_decoder.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index f93f4df0066..a57bd93e0f0 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -665,10 +665,10 @@ decode_dynamic_state_pointers(struct gen_batch_decode_ctx 
*ctx,
 
for (int i = 0; i < count; i++) {
   fprintf(ctx->fp, "%s %d\n", struct_type, i);
-  ctx_print_group(ctx, state, state_offset, state_map);
+  ctx_print_group(ctx, state, state_addr, state_map);
 
   state_addr += state->dw_length * 4;
-  state_map += state->dw_length;
+  state_map += state->dw_length * 4;
}
 }
 
-- 
2.17.1



[Mesa-dev] [PATCH 1/4] intel/decoder: Clean up field iteration and fix sub-dword fields

2018-08-24 Thread Jason Ekstrand
First of all, setting iter->name in advance_field is unnecessary because
it gets set by iter_decode_field which gets called immediately after
iter_advance_field in the one call-site.  Second, we weren't properly
initializing start_bit and end_bit in the initial condition of
gen_field_iterator_next so the first field of a struct would get printed
wrong if it doesn't start on the first bit.  This is fixed by adding an
iter_start_field helper which sets the field and also sets up the other
bits we need.  This fixes decoding of 3DSTATE_SBE_SWIZ.
---
 src/intel/common/gen_decoder.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 39da3cadbf8..e9dabeae653 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -812,6 +812,18 @@ iter_more_groups(const struct gen_field_iterator *iter)
}
 }
 
+static void
+iter_start_field(struct gen_field_iterator *iter, struct gen_field *field)
+{
+   iter->field = field;
+
+   int group_member_offset = iter_group_offset_bits(iter, iter->group_iter);
+
+   iter->start_bit = group_member_offset + iter->field->start;
+   iter->end_bit = group_member_offset + iter->field->end;
+   iter->struct_desc = NULL;
+}
+
 static void
 iter_advance_group(struct gen_field_iterator *iter)
 {
@@ -826,32 +838,20 @@ iter_advance_group(struct gen_field_iterator *iter)
   }
}
 
-   iter->field = iter->group->fields;
+   iter_start_field(iter, iter->group->fields);
 }
 
 static bool
 iter_advance_field(struct gen_field_iterator *iter)
 {
if (iter_more_fields(iter)) {
-  iter->field = iter->field->next;
+  iter_start_field(iter, iter->field->next);
} else {
   if (!iter_more_groups(iter))
  return false;
 
   iter_advance_group(iter);
}
-
-   if (iter->field->name)
-  snprintf(iter->name, sizeof(iter->name), "%s", iter->field->name);
-   else
-  memset(iter->name, 0, sizeof(iter->name));
-
-   int group_member_offset = iter_group_offset_bits(iter, iter->group_iter);
-
-   iter->start_bit = group_member_offset + iter->field->start;
-   iter->end_bit = group_member_offset + iter->field->end;
-   iter->struct_desc = NULL;
-
return true;
 }
 
@@ -1006,9 +1006,9 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
/* Initial condition */
if (!iter->field) {
   if (iter->group->fields)
- iter->field = iter->group->fields;
+ iter_start_field(iter, iter->group->fields);
   else
- iter->field = iter->group->next->fields;
+ iter_start_field(iter, iter->group->next->fields);
 
   bool result = iter_decode_field(iter);
   if (iter->p_end)
-- 
2.17.1
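
To see why uninitialized start_bit and end_bit matter for a sub-dword field, here is a minimal sketch of extracting an inclusive bit range from a dword array. extract_bits() is a hypothetical helper written for this note, and it assumes the field does not straddle a dword boundary; the decoder's own field reader is more general.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the inclusive bit range [start_bit, end_bit]
 * from an array of dwords. Bit positions count from bit 0 of dword 0.
 * Assumes the whole field lives inside a single dword. */
static uint32_t
extract_bits(const uint32_t *dwords, int start_bit, int end_bit)
{
   assert(start_bit >= 0 && end_bit >= start_bit);
   assert(start_bit / 32 == end_bit / 32);

   uint32_t dw = dwords[start_bit / 32];
   int shift = start_bit % 32;
   int width = end_bit - start_bit + 1;

   /* Avoid shifting by 32, which is undefined for a 32-bit type. */
   uint32_t mask = width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;

   return (dw >> shift) & mask;
}
```

If a field occupies bits 8 to 23 but the iterator still holds a range of 0 to 0, the same dword yields a single unrelated bit instead of the field value, which is the kind of wrong output the patch addresses for the first field of a struct.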



[Mesa-dev] [PATCH 4/4] intel/batch_decoder: Print blend states properly

2018-08-24 Thread Jason Ekstrand
---
 src/intel/common/gen_batch_decoder.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index a57bd93e0f0..6884a999401 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -641,7 +641,6 @@ decode_dynamic_state_pointers(struct gen_batch_decode_ctx 
*ctx,
   int count)
 {
struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
-   struct gen_group *state = gen_spec_find_struct(ctx->spec, struct_type);
 
uint32_t state_offset = 0;
 
@@ -663,6 +662,22 @@ decode_dynamic_state_pointers(struct gen_batch_decode_ctx 
*ctx,
   return;
}
 
+   struct gen_group *state = gen_spec_find_struct(ctx->spec, struct_type);
+   if (strcmp(struct_type, "BLEND_STATE") == 0) {
+  /* Blend states are different from the others because they have a header
+   * struct called BLEND_STATE which is followed by a variable number of
+   * BLEND_STATE_ENTRY structs.
+   */
+  fprintf(ctx->fp, "%s\n", struct_type);
+  ctx_print_group(ctx, state, state_addr, state_map);
+
+  state_addr += state->dw_length * 4;
+  state_map += state->dw_length * 4;
+
+  struct_type = "BLEND_STATE_ENTRY";
+  state = gen_spec_find_struct(ctx->spec, struct_type);
+   }
+
for (int i = 0; i < count; i++) {
   fprintf(ctx->fp, "%s %d\n", struct_type, i);
   ctx_print_group(ctx, state, state_addr, state_map);
-- 
2.17.1
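
The layout the new comment describes, one header struct followed by a run of fixed-size entries, can be sketched as an offset computation. The sizes below are hypothetical placeholders; in the decoder they come from each struct's dw_length in the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes in dwords, for illustration only. */
#define HEADER_DW 1
#define ENTRY_DW  2

/* Return the byte offset of entry i in a buffer laid out as one header
 * followed by consecutive fixed-size entries. */
static size_t
blend_entry_offset(int i)
{
   assert(i >= 0);

   /* Skip the header once, then step over i whole entries. */
   return (size_t)HEADER_DW * 4 + (size_t)i * ENTRY_DW * 4;
}
```

This mirrors what the patch does by printing the header, advancing both the address and the map by the header length in bytes, and only then entering the per-entry loop.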



Re: [Mesa-dev] [PATCH] egl/android: do not indent HAVE_DRM_GRALLOC preprocessor directive

2018-08-24 Thread Mauro Rossi
Hi,

On Wed, 15 Aug 2018 at 15:13, Mauro Rossi wrote:
>
> Fixes: 3f7bca44d9 ("egl/android: #ifdef out flink name support")
> Fixes: c7bb82136b ("egl/android: Add DRM node probing and filtering")
> Signed-off-by: Mauro Rossi 
> ---
>  src/egl/drivers/dri2/platform_android.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/platform_android.c 
> b/src/egl/drivers/dri2/platform_android.c
> index 834bbd258e..f8c85f97cf 100644
> --- a/src/egl/drivers/dri2/platform_android.c
> +++ b/src/egl/drivers/dri2/platform_android.c
> @@ -1226,7 +1226,7 @@ droid_load_driver(_EGLDisplay *disp)
> dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == 
> DRM_NODE_RENDER;
>
> if (!dri2_dpy->is_render_node) {
> -   #ifdef HAVE_DRM_GRALLOC
> +#ifdef HAVE_DRM_GRALLOC
> /* Handle control nodes using __DRI_DRI2_LOADER extension and GEM 
> names
>  * for backwards compatibility with drm_gralloc. (Do not use on new
>  * systems.) */
> @@ -1235,10 +1235,10 @@ droid_load_driver(_EGLDisplay *disp)
>err = "DRI2: failed to load driver";
>goto error;
> }
> -   #else
> +#else
> err = "DRI2: handle is not for a render node";
> goto error;
> -   #endif
> +#endif
> } else {
> dri2_dpy->loader_extensions = droid_image_loader_extensions;
> if (!dri2_load_driver_dri3(disp)) {
> --
> 2.17.1
>

Please provide one R-b so that I can commit this to GitLab master
and propose it as a candidate for the Mesa 18.2.0 release.

Mauro


[Mesa-dev] [Bug 107680] OpenCL does not work on Ubuntu 18.04 with Nvidia

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107680

Bug ID: 107680
   Summary: OpenCL does not work on Ubuntu 18.04 with Nvidia
   Product: Mesa
   Version: 18.1
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: critical
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: devyncjohn...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

System Info
 - System: Ubuntu 18.04
 - Linux Kernel Version: I tried 4.15 through 4.17 (both custom and
standard repo kernels)
 - Nvidia Driver: 396
 - Graphics Card (GPU): Nvidia GeForce 1080
 - CPU: i7-8700K (Coffeelake)
 - Cuda Version: 9.1
 - ocl-icd-libopencl1 Version: 2.2.11-1ubuntu1
 - ocl-icd-libopencl1 Provides: libopencl-1.1-1, libopencl-1.2-1,
libopencl-2.0-1, libopencl-2.1-1, libopencl1
 - clinfo Output:

Number of platforms                                          1
  Platform Name                                              Clover
  Platform Vendor                                            Mesa
  Platform Version                                           OpenCL 1.1 Mesa 18.1.1
  Platform Profile                                           FULL_PROFILE
  Platform Extensions                                        cl_khr_icd
  Platform Extensions function suffix                        MESA
  Platform Name                                              Clover
Number of devices                                            0
NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)             Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)              Clover

  clCreateContext(NULL, ...) [default]                       No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)      No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)          No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)          No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)       No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)          No devices found in platform
ICD loader properties
  ICD loader Name                                            OpenCL ICD Loader
  ICD loader Vendor                                          OCL Icd free software
  ICD loader Version                                         2.2.11
  ICD loader Profile                                         OpenCL 2.1

I have been using OpenGL and OpenCL successfully for quite some time. However,
I have not been able to use OpenCL recently. I update my system monthly. I used
OpenCL successfully in May. As for June, I did not use OpenCL until the second
week of June. I assume an update that I got in the beginning of June is the
cause (but I could be wrong). I have tried downgrading the Linux kernel (and
trying various kernel versions), Nvidia drivers (390 and 396), and various
graphics-related libraries without success. The output of clinfo no longer sees
my Nvidia card (like it did in May).

I have tried reinstalling nvidia-cuda-toolkit as well as all other Nvidia and
Cuda packages. I also tried reinstalling intel-microcode and all Optimus/Prime
related packages. I also tried uninstalling such packages and reinstalling. I
have also tried uninstalling all intel-iGPU (such as Beignet and
intel-microcode) related packages. The ubuntu-additional-drivers program only
ever shows the Nvidia driver.

Interestingly, OpenGL is perfectly fine. I can still use the Nvidia card for
graphics rendering. The output of nvidia-smi shows my various processes running
on the graphics card in graphics-mode (but no compute-mode processes).

What am I overlooking in getting OpenCL to work again?

This detail may or may not be related or provide a hint, but the output of
eglinfo shows that dlopen cannot find vgem_dri.so.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v2 4/4] freedreno: Drop a bunch of duplicated gallium PIPE_CAP default code.

2018-08-24 Thread Rob Clark
On Thu, Aug 23, 2018 at 1:59 PM Eric Anholt  wrote:
>
> Now that we have the util function for the default values, we can get rid
> of the boilerplate.
>
> Cc: Rob Clark 

I like the idea of reducing the boilerplate..

Reviewed-by: Rob Clark 
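
The boilerplate reduction under review follows a common pattern: the driver lists only the capabilities it answers differently and defers everything else to a shared default. A minimal sketch of that pattern follows; the enum and both functions are hypothetical and do not reproduce gallium's actual names or values.

```c
#include <assert.h>

/* Hypothetical capability enum, for illustration only. */
enum cap { CAP_COMPUTE, CAP_MAX_VIEWPORTS, CAP_DOUBLES, CAP_COUNT };

/* Shared defaults: what a driver reports when it has nothing special to
 * say about a capability. */
static int
default_get_param(enum cap param)
{
   assert(param < CAP_COUNT);

   switch (param) {
   case CAP_MAX_VIEWPORTS:
      return 1;
   default:
      return 0;   /* unsupported unless a driver says otherwise */
   }
}

/* Driver hook: handle only the caps that differ, defer the rest. */
static int
driver_get_param(enum cap param)
{
   switch (param) {
   case CAP_COMPUTE:
      return 1;
   default:
      return default_get_param(param);
   }
}
```

A side effect of this design is that a newly added capability needs no change in drivers that do not support it.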

> ---
>  .../drivers/freedreno/freedreno_screen.c  | 103 +-
>  1 file changed, 2 insertions(+), 101 deletions(-)
>
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index d62f02e04f85..4e972aea1b06 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -210,11 +210,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_COMPUTE:
> return has_compute(screen);
>
> -   case PIPE_CAP_SHADER_STENCIL_EXPORT:
> -   case PIPE_CAP_TGSI_TEXCOORD:
> case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
> -   case PIPE_CAP_TEXTURE_MIRROR_CLAMP:
> -   case PIPE_CAP_QUERY_MEMORY_INFO:
> case PIPE_CAP_PCI_GROUP:
> case PIPE_CAP_PCI_BUS:
> case PIPE_CAP_PCI_DEVICE:
> @@ -247,8 +243,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_POLYGON_OFFSET_CLAMP:
> return is_a5xx(screen) || is_a6xx(screen);
>
> -   case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
> -   return 0;
> case PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT:
> if (is_a3xx(screen)) return 16;
> if (is_a4xx(screen)) return 32;
> @@ -298,79 +292,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> return 4;
> return 0;
>
> -   /* Unsupported features. */
> -   case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> -   case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
> -   case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
> -   case PIPE_CAP_USER_VERTEX_BUFFERS:
> -   case PIPE_CAP_QUERY_PIPELINE_STATISTICS:
> -   case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK:
> -   case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
> -   case PIPE_CAP_TEXTURE_GATHER_SM5:
> -   case PIPE_CAP_SAMPLE_SHADING:
> -   case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> -   case PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION:
> -   case PIPE_CAP_MULTI_DRAW_INDIRECT:
> -   case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
> -   case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
> -   case PIPE_CAP_MULTISAMPLE_Z_RESOLVE:
> -   case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
> -   case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
> -   case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
> -   case PIPE_CAP_DEPTH_BOUNDS_TEST:
> -   case PIPE_CAP_TGSI_TXQS:
> /* TODO if we need this, do it in nir/ir3 backend to avoid breaking 
> precompile: */
> case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
> -   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
> -   case PIPE_CAP_CLEAR_TEXTURE:
> -   case PIPE_CAP_DRAW_PARAMETERS:
> -   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
> -   case PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL:
> -   case PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL:
> -   case PIPE_CAP_GENERATE_MIPMAP:
> -   case PIPE_CAP_SURFACE_REINTERPRET_BLOCKS:
> -   case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
> -   case PIPE_CAP_CULL_DISTANCE:
> -   case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> -   case PIPE_CAP_TGSI_VOTE:
> -   case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> -   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
> -   case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
> -   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
> -   case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
> -   case PIPE_CAP_TGSI_FS_FBFETCH:
> -   case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> -   case PIPE_CAP_DOUBLES:
> -   case PIPE_CAP_INT64:
> -   case PIPE_CAP_INT64_DIVMOD:
> -   case PIPE_CAP_TGSI_TEX_TXF_LZ:
> -   case PIPE_CAP_TGSI_CLOCK:
> -   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
> -   case PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE:
> -   case PIPE_CAP_TGSI_BALLOT:
> -   case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT:
> -   case PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEX:
> +   return 0;
> +
> case PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION:
> -   case PIPE_CAP_POST_DEPTH_COVERAGE:
> -   case PIPE_CAP_BINDLESS_TEXTURE:
> -   case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
> -   case PIPE_CAP_QUERY_SO_OVERFLOW:
> -   case PIPE_CAP_MEMOBJ:
> -   case PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS:
> -   case PIPE_CAP_TILE_RASTER_ORDER:
> -   case PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES:
> -   case PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTS:
> -   case PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET:
> -   case PIPE_CAP_FENCE_SIGNAL:
> -   case PIPE_CAP_CONSTBUF0_FLAGS:
> -   case PIPE_CAP_PACKED_UNIFORMS:
> -   case 

Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-08-24 Thread Dylan Baker
Can we just change the script to write a file instead of sending its output
through the shell? That should fix any encoding problems since the shell won't
touch it and the LANG settings (no matter what they are) shouldn't matter.

Dylan

Quoting Mathieu Bridon (2018-08-24 07:58:21)
> Hi,
> 
> On Thu, 2018-08-23 at 23:23 -0400, Ilia Mirkin wrote:
> > This breaks the build for me. It selects python3 instead of python2,
> > and gen_xmlpool.py bails out when trying to print \xf3 to stdout with
> > a LANG=C locale.
> 
> In general though, Python 3 works very badly with LANG=C. Upstream
> Python recommends just not using LANG=C at all, and instead using a
> UTF8 locale, like C.UTF-8 instead.
> 
> In fact, starting with 3.7, Python will emit a big warning when it is
> run on a non-UTF8 locale, and try to fallback to C.UTF-8 if it can.
> 
> There might be something to fix in this case (I haven't had time to
> look at it yet), but I'd still advise you try and use a UTF8 locale
> when running Python scripts in the future, if at all possible.
> 
> 
> -- 
> Mathieu
> 




Re: [Mesa-dev] [PATCH 4/5] util/gen_xmlpool: Add a --meson switch

2018-08-24 Thread Dylan Baker
Quoting Emil Velikov (2018-08-24 08:57:29)
> On Fri, 24 Aug 2018 at 15:16, Dylan Baker  wrote:
> >
> > Meson won't put the .gmo files in the layout that python's
> > gettext.translation() expects, so we need to handle them differently,
> > this switch allows the script to load the files as meson lays them out
> 
> No obvious reason comes to mind why we want divergence here.
> Can you elaborate more what's happening here - what are the .gmo
> files, I though we're using .mo ones?

Meson uses .gmo to indicate that they are specifically GNU mo files, as opposed
to one of the other flavors like the Solaris mo files, which use a completely
different syntax. The real difference is that autotools generates a folder
hierarchy to place the .mo (or .gmo) files in, but meson doesn't guarantee
folder structures in the build directory, so mimicking the autotools behavior
isn't guaranteed to work or continue working.

> 
> If the only difference is a) extension and b) .mo file location - we
> could update the autoconf/others to follow the same pattern.

We certainly could change autotools to follow the same pattern if that's what
you wanted to do. I just don't have the expertise with autotools to do it
quickly, and I wanted to get something wired for meson.

Dylan




Re: [Mesa-dev] [PATCH] egl/android: rework device probing

2018-08-24 Thread Robert Foss

Hey Emil,

On 24/08/2018 14.21, Emil Velikov wrote:

From: Emil Velikov 

Unlike the other platforms, here we aim to guess if the device that we
somewhat arbitrarily picked is supported or not.

In particular: when a vendor is _not_ requested we loop through all
devices, picking the first one which can create a DRI screen.

When a vendor is requested - we use that and do _not_ fall-back to any
other device.

The former seems a bit fiddly, but considering EGL_EXT_explicit_device and
EGL_MESA_query_renderer are MIA, this is the best we can do for the
moment.

With those (proposed) extensions userspace will be able to create a
separate EGL display for each device, query device details and make the
conscious decision which one to use.

Cc: Robert Foss 
Cc: Tomasz Figa 
Signed-off-by: Emil Velikov 
---
Thanks for the clarification Tomasz. The original code was using a
fall-back even when a vendor was explicitly requested, confusing me a bit ;-)
---
  src/egl/drivers/dri2/platform_android.c | 71 +++--
  1 file changed, 43 insertions(+), 28 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index 1f9fe27ab85..5bf627dec7d 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -1420,13 +1420,32 @@ droid_filter_device(_EGLDisplay *disp, int fd, const 
char *vendor)
 return 0;
  }
  
+static int

+droid_probe_device(_EGLDisplay *disp)
+{
+  /* Check that the device is supported, by attempting to:
+   * - load the dri module
+   * - and, create a screen
+   */
+   if (!droid_load_driver(disp)) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to load driver");
+  return -1;
+   }
+
+   if (!dri2_create_screen(disp)) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to create screen");
+  return -1;
+   }
+   return 0;
+}
+
  static int
  droid_open_device(_EGLDisplay *disp)
  {
  #define MAX_DRM_DEVICES 32
 drmDevicePtr device, devices[MAX_DRM_DEVICES] = { NULL };
 int prop_set, num_devices;
-   int fd = -1, fallback_fd = -1;
+   int fd = -1;
  
 char *vendor_name = NULL;

 char vendor_buf[PROPERTY_VALUE_MAX];
@@ -1451,33 +1470,39 @@ droid_open_device(_EGLDisplay *disp)
   continue;
}
  
-  if (vendor_name && droid_filter_device(disp, fd, vendor_name)) {

- /* Match requested, but not found - set as fallback */
- if (fallback_fd == -1) {
-fallback_fd = fd;
- } else {
+  /* If a vendor is explicitly provided, we use only that.
+   * Otherwise we fall-back the first device that is supported.
+   */
+  if (vendor_name) {
+ if (droid_filter_device(disp, fd, vendor_name)) {
+/* Device does not match - try next device */
  close(fd);
  fd = -1;
+continue;
   }
-
+ /* If the requested device matches use it, regardless if
+  * init fails. Do not fall-back to any other device.
+  */
+ if (droid_probbe_device(disp)) {


Typo in function name.


+close(fd);
+fd = -1;
+ }


Isn't the above comment saying that the if statement just below it shouldn't
be there? Or am I misparsing something?


+ break;
+  }
+  /* No explicit request - attempt the next device */
+  if (droid_probbe_device(disp)) {


Typo in function name.


+ close(fd);
+ fd = -1;
   continue;
}
-  /* Found a device */
break;
 }
 drmFreeDevices(devices, num_devices);
  
-   if (fallback_fd < 0 && fd < 0) {

-  _eglLog(_EGL_WARNING, "Failed to open any DRM device");
-  return -1;
-   }
-
-   if (fd < 0) {
-  _eglLog(_EGL_WARNING, "Failed to open desired DRM device, using 
fallback");
-  return fallback_fd;
-   }
+   if (fd < 0)
+  _eglLog(_EGL_WARNING, "Failed to open %s DRM device",
+vendor_name ? "desired": "any");
  
-   close(fallback_fd);

 return fd;
  #undef MAX_DRM_DEVICES
  }
@@ -1519,16 +1544,6 @@ dri2_initialize_android(_EGLDriver *drv, _EGLDisplay 
*disp)
goto cleanup;
 }
  
-   if (!droid_load_driver(disp)) {

-  err = "DRI2: failed to load driver";
-  goto cleanup;
-   }
-
-   if (!dri2_create_screen(disp)) {
-  err = "DRI2: failed to create screen";
-  goto cleanup;
-   }
-
 if (!dri2_setup_extensions(disp)) {
err = "DRI2: failed to setup extensions";
goto cleanup;




[Mesa-dev] [Bug 107669] [bisected] wflinfo fails ctx->Const.MaxCombinedTextureImageUnits assertion

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107669

Kenneth Graunke  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #2 from Kenneth Graunke  ---
Yeah, that doesn't work...I basically tried that in
b03dcb1e5f507c5950d0de053a6f76e6306ee71f.  You need to include compute still in
MaxShaderStorageBufferBindings and MaxUniformBufferBindings.  But then there's
the issue 5 in OES_tessellation_shader that says we should include compute in
MAX_COMBINED_TEXTURE_IMAGE_UNITS, too...so even with that it still doesn't
work...

Reverted for now (9d670fd86cc13df0ddff5c6fcb0835926e9a8088), so closing this.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [ANNOUNCE] mesa 18.1.7

2018-08-24 Thread Dylan Baker
Hi List,

Mesa 18.1.7 is now available for general consumption. This release has been
rather small compared to the last few releases; there's just a handful of fixes
in total. Meson, radv, anv, gallium winsys, intel, i965, and r600 were the only
recipients of fixes this go around.

Dylan


git tag: mesa-18.1.7

https://mesa.freedesktop.org/archive/mesa-18.1.7.tar.gz
MD5:  0b0131b708b3e3b7ebb26c753dd59add  mesa-18.1.7.tar.gz
SHA1: ea94c9dd6db2e2deb7113b55ceb2569f0b69d2c4  mesa-18.1.7.tar.gz
SHA256: 0c3c240bcd1352d179e65993214f9d55a399beac852c3ab4433e8df9b6c51c83  
mesa-18.1.7.tar.gz
SHA512: 
6b6cd912ad3fd44ea213df5ff245378a105a5d942fcda374e4d755b3b77a43712dbe732cdfedb539ed687adcf9904468b68f3e29766cc7880583c0c46cdf8f6e
  mesa-18.1.7.tar.gz
PGP:  https://mesa.freedesktop.org/archive/mesa-18.1.7.tar.gz.sig

https://mesa.freedesktop.org/archive/mesa-18.1.7.tar.xz
MD5:  17d8a7e7ecbe146a7dc439e8b6eb02e9  mesa-18.1.7.tar.xz
SHA1: 8f86e16a1c03665e55bc284c0e4a5b0a953bcadc  mesa-18.1.7.tar.xz
SHA256: 655e3b32ce3b5e6e8768596e5d4bdef82d0dd37067c324cc4b2daa207306  
mesa-18.1.7.tar.xz
SHA512: 
697c4f441ae52bc867d9d73b103094a29102168c248a502c4ea0fc48f51bcb86b2e741da39e882f24131326d460cdb1416415604c6994d1b8c09fb8a153a5c77
  mesa-18.1.7.tar.xz
PGP:  https://mesa.freedesktop.org/archive/mesa-18.1.7.tar.xz.sig





Re: [Mesa-dev] [PATCH] gallivm: Detect VSX separately from Altivec

2018-08-24 Thread Vicki Pfau
Is there anything else I need to do on this? This is my first mesa patch 
so I'm not entirely clear what next steps are for getting it committed.



On 08/20/2018 02:44 PM, Roland Scheidegger wrote:

Alright, I guess it's ok then.
In theory the u_cpu_detect bits could be used in different places, for
instance the translate code emits its own sse code, and as long as a
feature was detected properly it may make sense to disable it only for
some users. Albeit llvm setup and the gallivm code need to agree
generally, and there's no good way to deal with this right now (I
suppose gallivm actually should use its own copy of the u_cpu bits). The
fiddling we do in lp_bld_init() wrt SSE (LP_FORCE_SSE2 and also avx
disabling) isn't a clean way either.
So this looks like as good a solution as others.

Reviewed-by: Roland Scheidegger 

On 20.08.2018 at 22:15, Vicki Pfau wrote:

I was mostly following what was done earlier in the file for Altivec. I
can move it but then ideally the Altivec check should also be moved.


Vicki


On 08/20/2018 08:53 AM, Roland Scheidegger wrote:

u_cpu_detect should detect what's really available, not what is used
(though indeed we actually disable u_cpu bits explicitly in gallivm for
some sse features, but this is a hack).
So I think it would be better if u_cpu_detect sets the has_vsx bit
regardless what the env var is and then enable it based on this bit and
the env var.
Otherwise looks good to me.

Roland
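
The separation suggested above, where detection records what the hardware has and the user override is applied only at the point of use, could look like the sketch below. struct cpu_caps and want_vsx() are hypothetical names; env_vsx is assumed to be the value of an override such as GALLIVM_VSX, or NULL when unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability record: filled in by CPU detection only, never
 * altered by environment overrides. */
struct cpu_caps {
   bool has_altivec;
   bool has_vsx;
};

/* Decide whether to enable VSX code generation. The hardware bit and the
 * user override are combined here rather than inside detection. */
static bool
want_vsx(const struct cpu_caps *caps, const char *env_vsx)
{
   assert(caps != NULL);

   /* A value starting with '0' is treated as an explicit disable. */
   bool user_disabled = env_vsx != NULL && env_vsx[0] == '0';

   return caps->has_vsx && !user_disabled;
}
```

Under this arrangement an override can switch VSX off on capable hardware but cannot switch it on where detection did not find it.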

On 19.08.2018 at 23:17, Vicki Pfau wrote:

Previously gallivm would attempt to use VSX instructions on all systems
where it detected that Altivec is supported; however, VSX was added to
POWER long after Altivec, causing lots of crashes on older POWER/PPC
hardware, e.g. PPC Macs. By detecting VSX separately from Altivec we can
automatically disable it on hardware that supports Altivec but not VSX.

Signed-off-by: Vicki Pfau 
---
   src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 21 +++
   src/gallium/auxiliary/util/u_cpu_detect.c | 14 -
   src/gallium/auxiliary/util/u_cpu_detect.h |  1 +
   3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 79dbedbb56..fcbdd5050f 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -650,26 +650,11 @@
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
   * which are fixed in LLVM 4.0.
   *
   * With LLVM 4.0 or higher:
-    * Make sure VSX instructions are ENABLED, unless
-    * a) the entire -mattr option is overridden via GALLIVM_MATTRS, or
-    * b) VSX instructions are explicitly enabled/disabled via
GALLIVM_VSX=1 or 0.
+    * Make sure VSX instructions are ENABLED (if supported), unless
+    * VSX instructions are explicitly enabled/disabled via
GALLIVM_VSX=1 or 0.
   */
  if (util_cpu_caps.has_altivec) {
-  char *env_mattrs = getenv("GALLIVM_MATTRS");
-  if (env_mattrs) {
- MAttrs.push_back(env_mattrs);
-  }
-  else {
- boolean enable_vsx = true;
- char *env_vsx = getenv("GALLIVM_VSX");
- if (env_vsx && env_vsx[0] == '0') {
-    enable_vsx = false;
- }
- if (enable_vsx)
-    MAttrs.push_back("+vsx");
- else
-    MAttrs.push_back("-vsx");
-  }
+  MAttrs.push_back(util_cpu_caps.has_vsx ? "+vsx" : "-vsx");
  }
   #endif
   #endif
diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c
b/src/gallium/auxiliary/util/u_cpu_detect.c
index 3c6ae4ea1a..14003aa769 100644
--- a/src/gallium/auxiliary/util/u_cpu_detect.c
+++ b/src/gallium/auxiliary/util/u_cpu_detect.c
@@ -133,6 +133,7 @@ check_os_altivec_support(void)
     signal(SIGILL, SIG_DFL);
  } else {
     boolean enable_altivec = TRUE;    /* Default: enable  if
available, and if not overridden */
+  boolean enable_vsx = TRUE;
   #ifdef DEBUG
     /* Disabling Altivec code generation is not the same as
disabling VSX code generation,
  * which can be done simply by passing -mattr=-vsx to the
LLVM compiler; cf.
@@ -144,6 +145,11 @@ check_os_altivec_support(void)
    enable_altivec = FALSE;
     }
   #endif
+  /* VSX instructions can be explicitly enabled/disabled via
GALLIVM_VSX=1 or 0 */
+  char *env_vsx = getenv("GALLIVM_VSX");
+  if (env_vsx && env_vsx[0] == '0') {
+ enable_vsx = FALSE;
+  }
     if (enable_altivec) {
    __lv_powerpc_canjump = 1;
   @@ -153,8 +159,13 @@ check_os_altivec_support(void)
    :
    : "r" (-1));
   - signal(SIGILL, SIG_DFL);
    util_cpu_caps.has_altivec = 1;
+
+ if (enable_vsx) {
+    __asm __volatile("xxland %vs0, %vs0, %vs0");
+    util_cpu_caps.has_vsx = 1;
+ }
+ signal(SIGILL, SIG_DFL);
     } else {
    util_cpu_caps.has_altivec = 0;
     }
@@ 

[Mesa-dev] [PATCH v4 47/49] appveyor: use msbuild instead of ninja

2018-08-24 Thread Liviu Prodea
While this works for a CI like Appveyor, it doesn't work in a somewhat more complex 
build environment where paths can be longer. Unfortunately, MSBuild doesn't 
handle paths that exceed MAX_PATH well. Here is an example:
- Mesa source code is in C:\Software\DEVELO~1\projects\mesa\mesa
- the build system is generated under the source directory in build\windows-x86_64.
When the build gets to zlib, it fails with a path-too-long error, notably due 
to the overly long folder path: 
C:\Software\Development\projects\mesa\mesa\build\windows-x86_64\subprojects\zlib-1.2.11\Windows
 resource for file 'subprojects__zlib-1.2.11__win32_zlib1.rc'@cus\Windows 
.F68FD0C4.tlog.
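
To make the limit concrete: MAX_PATH on Windows is 260 characters including the terminating NUL. A small Python sketch for spotting offending paths before a build is shown here; the function name exceeds_max_path is hypothetical, and the tool that actually hits the limit may fail earlier or later than this simple check suggests.

```python
MAX_PATH = 260  # classic Windows limit, counting the terminating NUL


def exceeds_max_path(path, limit=MAX_PATH):
    """Return True if the path cannot fit in a MAX_PATH-sized buffer."""
    # A path of N characters needs N + 1 slots once the NUL is counted.
    return len(path) + 1 > limit


def too_long(paths, limit=MAX_PATH):
    """Filter an iterable of paths down to the ones over the limit."""
    return [p for p in paths if exceeds_max_path(p, limit)]
```

Running too_long() over the generated build tree would list the intermediate files that need a shorter build directory.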


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 107477] [DXVK] Setting high shader quality in GTA V results in LLVM error

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107477

--- Comment #11 from Clément Guérin  ---
Here's the capture:
https://send.firefox.com/download/157438a5ba/#DzGe_CjcHthydTi1jPT-0A

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 4/5] util/gen_xmlpool: Add a --meson switch

2018-08-24 Thread Emil Velikov
On Fri, 24 Aug 2018 at 15:16, Dylan Baker  wrote:
>
> Meson won't put the .gmo files in the layout that python's
> gettext.translation() expects, so we need to handle them differently,
> this switch allows the script to load the files as meson lays them out

No obvious reason comes to mind why we want divergence here.
Can you elaborate on what's happening here? What are the .gmo
files? I thought we were using .mo ones.

If the only difference is a) extension and b) .mo file location - we
could update the autoconf/others to follow the same pattern.

> ---
>  src/util/xmlpool/gen_xmlpool.py | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/src/util/xmlpool/gen_xmlpool.py b/src/util/xmlpool/gen_xmlpool.py
> index b40f295738e..59d7a9bb84d 100644
> --- a/src/util/xmlpool/gen_xmlpool.py
> +++ b/src/util/xmlpool/gen_xmlpool.py
> @@ -8,17 +8,18 @@
>  #
>
>  from __future__ import print_function
> -
>  import argparse
> -import io
> -import sys
>  import gettext
> +import io
> +import os
>  import re
> +import sys
>
I would keep these in 2/5 which already does the cleanups.

>  parser = argparse.ArgumentParser()
>  parser.add_argument('template')
>  parser.add_argument('localedir')
>  parser.add_argument('languages', nargs='*')
> +parser.add_argument('--meson', action='store_true')
Please annotate the argument as required/optional as they are introduced.

>  args = parser.parse_args()
>
>  if sys.version_info < (3, 0):
> @@ -166,8 +167,14 @@ def expandMatches (matches, translations, end=None):
>  translations = [("en", gettext.NullTranslations())]
>  for lang in args.languages:
>  try:
> -trans = gettext.translation ("options", args.localedir, [lang])
> +if args.meson:
> +filename = os.path.join(args.localedir, '{}.gmo'.format(lang))
> +with io.open(filename, 'rb') as f:
> +trans = gettext.GNUTranslations(f)
> +else:
> +trans = gettext.translation ("options", args.localedir, [lang])
>  except IOError:
> +raise
>  sys.stderr.write ("Warning: language '%s' not found.\n" % lang)
>  continue
Something looks odd there - do we raise an exception, or continue
(while printing a warning to stderr)?

-Emil
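
For reference, the last point can be seen in isolation: with a bare `raise` as the first statement of the handler, the warning and the `continue` after it are never reached. The sketch below shows the warn-and-continue variant as a self-contained function. It is an illustration under assumptions, not the script itself: the function name load_translations is hypothetical, and appending (lang, trans) to the list is assumed from the shape of the surrounding code.

```python
import gettext
import io
import os
import sys


def load_translations(localedir, languages, meson=False):
    """Load one translation per language, skipping the ones not found."""
    translations = [("en", gettext.NullTranslations())]
    for lang in languages:
        try:
            if meson:
                # Layout assumed by the --meson switch: <localedir>/<lang>.gmo
                filename = os.path.join(localedir, "{}.gmo".format(lang))
                with io.open(filename, "rb") as f:
                    trans = gettext.GNUTranslations(f)
            else:
                trans = gettext.translation("options", localedir, [lang])
        except IOError:
            # Warn and move on; a `raise` here would make these lines dead.
            sys.stderr.write("Warning: language '%s' not found.\n" % lang)
            continue
        translations.append((lang, trans))
    return translations
```

Either behaviour is defensible, but only one of them can take effect: raising aborts the build on a missing language, while the warning path silently drops it.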


Re: [Mesa-dev] [PATCH 1/3] gitlab-ci: build Mesa using GitLab CI

2018-08-24 Thread Juan A. Suarez Romero
On Thu, 2018-08-09 at 19:35 -0700, Eric Anholt wrote:
> "Juan A. Suarez Romero"  writes:
> 
> > Creates different Docker images containing Mesa built with different
> > tools (autotools, meson, scons, etc).
> > 
> > The build is done in 3 levels: the first level creates a base image
> > with all the requirements to build Mesa.
> > 
> > The second level (based on the first one) builds different images with
> > different versions of LLVM. As Gallium drivers rely heavily on LLVM,
> > this will help to test the build with different LLVM versions.
> > 
> > Finally, the last level creates different images of Mesa.
> > The main difference is the tool used to build them: autotools, meson, scons,
> > building Gallium drivers with different LLVM versions, and so on.
> > 
> > As the purpose is just to test that everything can be built correctly,
> > all the images are discarded, except one (the autotools), which is
> > stored in the registry. Thus, anyone can just pull it locally and test
> > against their local system.
> > 
> > In order to build the images, Rocker is used. This is a tool that
> > extends Dockerfiles with new features that are quite interesting
> > here. The main features we use are the support for templating and the
> > support for mounting external directories during the image build.
> > This helps to use tools like ccache to improve the build speed.
> > 
> > Signed-off-by: Juan A. Suarez Romero 
> > ---
> >  .gitlab-ci.yml| 177 +
> >  gitlab-ci/Rockerfile.base | 199 ++
> >  gitlab-ci/Rockerfile.llvm |  57 +++
> >  gitlab-ci/Rockerfile.mesa | 145 +++
> >  4 files changed, 578 insertions(+)
> >  create mode 100644 .gitlab-ci.yml
> >  create mode 100644 gitlab-ci/Rockerfile.base
> >  create mode 100644 gitlab-ci/Rockerfile.llvm
> >  create mode 100644 gitlab-ci/Rockerfile.mesa
> > 
> > diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> > new file mode 100644
> > index 00000000000..5cee333dd45
> > --- /dev/null
> > +++ b/.gitlab-ci.yml
> > @@ -0,0 +1,177 @@
> > +image: docker:latest
> > +
> > +services:
> > +  - docker:dind
> > +
> > +stages:
> > +  - base
> > +  - llvm
> > +  - mesa
> > +
> > +variables:
> > +  DOCKER_IMAGE: $CI_REGISTRY_IMAGE
> > +  CCACHE_DIR: $CI_PROJECT_DIR/../ccache
> > +  LLVM: "6.0"
> > +
> > +cache:
> > +  paths:
> > +- ccache/
> > +  key: "$CI_JOB_STAGE"
> > +
> > +before_script:
> > +  - mkdir -p ccache
> > +  - rm -fr ../ccache
> > +  - mv ccache ../
> > +  - export MAKEFLAGS=-j$(nproc)
> > +  - apk --no-cache add libc6-compat
> > +  - wget 
> > https://github.com/grammarly/rocker/releases/download/1.3.1/rocker-1.3.1-linux_amd64.tar.gz
> > +  - tar xvf rocker-1.3.1-linux_amd64.tar.gz
> > +  - rm rocker-1.3.1-linux_amd64.tar.gz
> > +  - mv rocker ..
> > +  - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
> > +
> > +after_script:
> > +  - mv ../ccache ./
> > +
> > +.build_llvm: &build_llvm
> > +  stage: llvm
> > +  cache: {}
> > +  script:
> > +- ../rocker build -f gitlab-ci/Rockerfile.llvm --var LLVM=$LLVM
> > +- docker push $CI_REGISTRY_IMAGE:llvm-$LLVM
> > +
> > +.build_mesa: &build_mesa
> > +  stage: mesa
> > +  script:
> > +- ../rocker build -f gitlab-ci/Rockerfile.mesa --var BUILD=$BUILD 
> > --var LLVM=$LLVM --var TAG=$CI_COMMIT_REF_SLUG .
> > +
> > +base:
> > +  stage: base
> > +  script:
> > +- DOCKERFILE_SHA256=$(cat gitlab-ci/Rockerfile.base | sha256sum | cut 
> > -c-64)
> > +- IMAGE_DOCKERFILE_SHA256=$(./gitlab-ci/inspect-remote-image.sh 
> > gitlab-ci-token $CI_BUILD_TOKEN $CI_PROJECT_PATH "base" 
> > ".config.Labels[\"dockerfile.sha256\"]" || echo -n "")
> > +- if [ "$DOCKERFILE_SHA256" != "$IMAGE_DOCKERFILE_SHA256" ] ; then 
> > FORCE_BUILD=true ; fi
> > +- if [ "$FORCE_BUILD" ] ; then ../rocker build -f 
> > gitlab-ci/Rockerfile.base --var DOCKERFILE_SHA256=$DOCKERFILE_SHA256 ; fi
> > +- if [ "$FORCE_BUILD" ] ; then docker push $CI_REGISTRY_IMAGE:base ; fi
> 
> I think this patch file was a previous version, as patch 2 removes lines
> that aren't in this block and replaces them with these ones?
> 

Not sure which lines you mean. In patch 2 we remove two lines in the
"build_llvm" job, but we don't touch the "base" job.


> > +llvm:3.3:
> > +  variables:
> > +LLVM: "3.3"
> > +  <<: *build_llvm
> 
> How big do all these images end up being?  Do we have any size limits on
> what our CI can be uploading?
> 

According to the registry[1], the base image requires 226 MB, while each of the
LLVM images requires between 250 and 360 MB. But as Docker images share layers,
I'm not sure this is a good measure.

For the size limits, Daniel can answer. But I think space wasn't a big problem.

[1] https://gitlab.freedesktop.org/jasuarez/mesa/container_registry

> > diff --git a/gitlab-ci/Rockerfile.base b/gitlab-ci/Rockerfile.base
> > new file mode 100644
> > index 00000000000..a0cb5e5290d
> > --- /dev/null
> > 

Re: [Mesa-dev] [PATCH 1/4] mesa: use C99 initializer in get_gl_override()

2018-08-24 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 08/24/2018 06:05 AM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The overrides array contains entries indexed on the gl_api enum.
> Use a C99 initializer to make it a bit more obvious.
> 
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/main/version.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index 77ff51b6d9e..610ba2f08c5 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -64,10 +64,10 @@ get_gl_override(gl_api api, int *version, bool 
> *fwd_context,
>bool fc_suffix;
>bool compat_suffix;
> } override[] = {
> -  { -1, false, false},
> -  { -1, false, false},
> -  { -1, false, false},
> -  { -1, false, false},
> +  [API_OPENGL_COMPAT] = { -1, false, false},
> +  [API_OPENGLES]  = { -1, false, false},
> +  [API_OPENGLES2] = { -1, false, false},
> +  [API_OPENGL_CORE]   = { -1, false, false},
> };
>  
> STATIC_ASSERT(ARRAY_SIZE(override) == API_OPENGL_LAST + 1);


Re: [Mesa-dev] [PATCH 5/5] nir: Lower flrp differently when the alpha value is reused

2018-08-24 Thread Ian Romanick
On 08/23/2018 06:59 PM, Eric Anholt wrote:
> Ian Romanick  writes:
> 
>> From: Ian Romanick 
>>
>> For some reason, if I did not move the regular lowering to late
>> optimizations, the new lowering would never trigger.  This also means
>> that the fsub lowering had to be added to late optimizations, and this
>> requires "intel/compiler: Repeat nir_opt_algebraic_late until no more
>> progress".
>>
>> The loops removed by this patch are the same loops added by
>> "intel/compiler: Don't emit flrp for Gen4 or Gen5"
>>
>> I am CC'ing people who are responsible for drivers that set lower_flrp32
>> as this patch will likely affect shader-db results for those drivers.
>>
>> No changes on any Gen6+ platform.
> 
> No change on vc4 in the previous patch, but this patch seems to cause
> flrps to be left in my NIR, so a bunch of traces crash.

Yeah... it will break everyone except Intel.  I moved the lowering to an
algebraic pass that nobody else uses. :(

I have a new series that fixes this issue and is better overall.  I'm
hoping to get that out later today.



[Mesa-dev] [Bug 107670] Massive slowdown under specific memcpy implementations (32bit, no-SIMD, backward copy).

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107670

--- Comment #7 from i...@yahoo.com ---
(In reply to Grazvydas Ignotas from comment #4)
> What game/benchmark do you see this with?
> 
> Can you try calling _mesa_streaming_load_memcpy() there? It's for reading
> uncached memory, but by the looks of it it might be suitable for writing too.

I'm running Left4Dead2 under wine with Gallium Nine. The game has a `timedemo`
option that replays previously `record`-ed gameplay, so the
benchmark is consistent.
I run it in a window, so I can watch the terminal with `perf top`.
When the problem is present, memcpy() is always first with 25% usage, while
everything else is less than 2%.
I have to point out that I do run a 64-bit kernel; I just need the 32-bit
libraries, since the game is 32-bit.

_mesa_streaming_load_memcpy() is a little problematic to test, since it is
written with intrinsics and I'm compiling for i486 (that's what my distribution
does). The function also has a strong requirement for alignment of both src,
and could fall back to regular memcpy().
Still, its existence is proof that there is a need for such functionality.



Re: [Mesa-dev] [PATCH 0/3] Use GitLab CI to build Mesa

2018-08-24 Thread Juan A. Suarez Romero
On Wed, 2018-08-08 at 17:40 +0100, Daniel Stone wrote:
> Hi Juan,
> 
> On Wed, 8 Aug 2018 at 16:45, Juan A. Suarez Romero  
> wrote:
> > This is a first part of a more complete proposal to use GitLab CI to build 
> > and
> > test Mesa. This first part just adds the required pieces to build Mesa, 
> > using the
> > different supported tools (meson, autotools, and scons).
> 
> This is great - I'm super excited to see it happen!
> 
> > Unfortunately, Rocker is a tool that is not maintained anymore
> > (https://github.com/grammarly/rocker). We still use it because it provides
> > everything we need, and fortunately we don't need any more features.
> > 
> > Maybe there are other alternatives out there, but we were happy with this 
> > and
> > hence our proposal. If we want to use Docker rather than Rocker, then we 
> > could
> > use template tools like Jinja, and forget about caching during build, which 
> > will
> > impact on the build time.
> 
> This bit is a bit concerning, especially since it makes CI harder to
> approach for people who might want to work on it. There are quite a
> few alternate tools (buildah comes to mind, as well as umoci/skopeo)
> which might end up being better, particularly for scriptability. But
> I'm not volunteering to do that work, so take this with a grain of
> salt!
> 

Yes, I knew this could be a concerning point. There are two points to cover:

- Dockerfile templates, to allow complex building from single Dockerfile
- Allow external mount on Docker build

buildah indeed provides a way for the latter. Still, we need to fix the
former, unless we drop the Dockerfile and use whatever config syntax buildah
uses. Otherwise, something like a Jinja template or similar, and proper
tools, could fix the problem.

One of the reasons we decided to keep using Rocker is that this
tool fixes both problems at the same time. And the syntax of the Rockerfiles is
the same as the Dockerfiles, but with more options. So anyone familiar with
Docker will have no problems with the Rockerfiles.

A final reason for Rocker is how easy it is to install: it is a single executable
file. This means that if someone needs to build the containers locally, they
only need to download a single file and build the containers, no matter the
distro they are using.


Anyway, I'm not paid by Rocker :). So I'm fine if we need to search for an
alternative.



> > ## Involved stages
> > 
> > The dependencies required to build Mesa don't change very frequently, so
> > building them every time is a waste of time. As Docker allows creating images
> > based on the content of other images, we have defined the setup in several
> > stages.
> > 
> > In a first stage, a "base" image is built. This image contains almost all the
> > dependencies required to build Mesa. It is worth mentioning that libdrm is
> > excluded here, as this is a dependency that changes quite frequently, so we
> > postpone its installation to later stages.
> > 
> > Once we have the "base" image, we create different images with the different
> > LLVM compilers. This ensures that when using a specific image we only have
> > that LLVM version, and not any other.
> > 
> > An important point here is that, although these builds appear in the pipeline,
> > they are not actually built if not required. That is, in the case of the base
> > image, if the Rockerfile used to create the image has changed with respect to
> > the one used to create the image that is already in the registry, then the
> > image is rebuilt (as this means something changed, very likely some
> > dependency). But if the Rockerfile didn't change, then there is no need to
> > rebuild the image, and we just keep using the one already in the registry.
> > This is also done for the LLVM images. This helps to improve the speed, as
> > most of the time they don't need to be built again.
> 
> It would be nice to have a GitLab CI variable which can be used to
> force rebuilds of these regardless.
> 

Oh, right! In our system, what we do is to use the scheduled pipelines to force
this rebuild. The schedule is run once per week. I'll include this option then.
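
The rebuild decision discussed here (rebuild when the Rockerfile hash differs from the label stored on the registry image, or when a rebuild is forced) can be sketched in a few lines of Python. This is only an illustration of the logic, not the CI script: the function names are hypothetical, and how the label is fetched from the registry is left out.

```python
import hashlib


def dockerfile_sha256(contents):
    """Hex SHA-256 of the Rockerfile bytes, as sha256sum would print it."""
    return hashlib.sha256(contents).hexdigest()


def needs_rebuild(contents, image_label_sha256, force=False):
    """Decide whether the image has to be rebuilt.

    image_label_sha256 is the hash recorded on the existing image, or an
    empty string if no image could be inspected. force models a CI variable
    or a scheduled pipeline that rebuilds regardless.
    """
    if force:
        return True
    return dockerfile_sha256(contents) != image_label_sha256
```

A weekly scheduled pipeline would then simply call this with force=True, while regular pipelines rely on the hash comparison.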


> > The third stage is the one that builds Mesa itself. Here we just define 
> > which
> > tool to use and which LLVM version. This is done by passing the right 
> > parameters
> > to the `rocker build` tool. It will pick the right base image, install the
> > missing dependencies (mainly, libdrm), select which drivers should be built
> > (based on the LLVM version and parsing the configure.ac file), and create 
> > the
> > image.
> 
> You can eke out a little bit of a speed improvement by using the
> go-faster runes from before 'apt-get update' here:
> 
> https://gitlab.freedesktop.org/wayland/weston/blob/master/.gitlab-ci.yml#L6
> 

Cool! I think we can include it in the base image, so it gets propagated to the rest
of the builds.

> Anyway, this all looks great, and I'm really excited to 

Re: [Mesa-dev] [PATCH 2/2] glsl: remove execute bit and shebang from python tests

2018-08-24 Thread Emil Velikov
On Fri, 24 Aug 2018 at 15:43, Andres Gomez  wrote:
>
> Emil, I've done some trivial conflicts resolution upon cherry picking
> this commit.
>
> You can see it at (staging/18.2):
> https://gitlab.freedesktop.org/mesa/mesa/commit/f6dccf66865c31b13f48b50891a9f5a0d9949b1c
>
> Please, let me know if this is OK.
>
This as well as  ~1 and ~2 are spot on.

Thanks
Emil


Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-08-24 Thread Mathieu Bridon
Hi,

On Thu, 2018-08-23 at 23:23 -0400, Ilia Mirkin wrote:
> This breaks the build for me. It selects python3 instead of python2,
> and gen_xmlpool.py bails out when trying to print \xf3 to stdout with
> a LANG=C locale.

In general though, Python 3 works very badly with LANG=C. Upstream
Python recommends just not using LANG=C at all, and instead using a
UTF8 locale, like C.UTF-8 instead.

In fact, starting with 3.7, Python will emit a big warning when it is
run on a non-UTF8 locale, and try to fall back to C.UTF-8 if it can.

There might be something to fix in this case (I haven't had time to
look at it yet), but I'd still advise you try and use a UTF8 locale
when running Python scripts in the future, if at all possible.


-- 
Mathieu
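
The failure reported above is consistent with stdout using an ASCII codec: the character \xf3 (ó) simply has no ASCII encoding, while UTF-8 handles it. A minimal Python sketch of that difference follows; the helper name can_encode is hypothetical, and which codec stdout actually gets under LANG=C depends on the Python version and platform.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# U+00F3 is outside ASCII, so an ASCII stdout cannot print it.
ASCII_OK = can_encode("\xf3", "ascii")
# UTF-8 encodes it as the two bytes 0xC3 0xB3.
UTF8_OK = can_encode("\xf3", "utf-8")
```

Checking sys.stdout.encoding inside the failing build would confirm whether this is what is happening there.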



Re: [Mesa-dev] [PATCH v2 1/8] nir: evaluate if condition uses inside the if branches

2018-08-24 Thread Jason Ekstrand
On Fri, Aug 17, 2018 at 1:13 PM Jason Ekstrand  wrote:

> On Mon, Jul 23, 2018 at 3:03 AM Timothy Arceri 
> wrote:
>
>> Since we know what side of the branch we ended up on we can just
>> replace the use with a constant.
>>
>> All the spill changes in shader-db are from Dolphin uber shaders,
>> despite some small regressions the change is clearly positive.
>>
>> shader-db results IVB:
>>
>> total instructions in shared programs: 9999201 -> 9993483 (-0.06%)
>> instructions in affected programs: 163235 -> 157517 (-3.50%)
>> helped: 132
>> HURT: 2
>>
>> total cycles in shared programs: 231670754 -> 219476091 (-5.26%)
>> cycles in affected programs: 143424120 -> 131229457 (-8.50%)
>> helped: 115
>> HURT: 24
>>
>> total spills in shared programs: 4383 -> 4370 (-0.30%)
>> spills in affected programs: 1656 -> 1643 (-0.79%)
>> helped: 9
>> HURT: 18
>>
>> total fills in shared programs: 4610 -> 4581 (-0.63%)
>> fills in affected programs: 374 -> 345 (-7.75%)
>> helped: 6
>> HURT: 0
>> ---
>>  src/compiler/nir/nir_opt_if.c | 124 ++
>>  1 file changed, 124 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_if.c b/src/compiler/nir/nir_opt_if.c
>> index b3d0bf1decb..b3d5046a76e 100644
>> --- a/src/compiler/nir/nir_opt_if.c
>> +++ b/src/compiler/nir/nir_opt_if.c
>> @@ -369,6 +369,87 @@ opt_if_loop_terminator(nir_if *nif)
>> return true;
>>  }
>>
>> +static void
>> +replace_if_condition_use_with_const(nir_src *use, unsigned nir_boolean,
>> +void *mem_ctx, bool if_condition)
>> +{
>> +   /* Create const */
>> +   nir_load_const_instr *load = nir_load_const_instr_create(mem_ctx, 1,
>> 32);
>> +   load->value.u32[0] = nir_boolean;
>> +
>> +   if (if_condition) {
>> +  nir_instr_insert_before_cf(&use->parent_if->cf_node,
>> &load->instr);
>>
>
> If it was me, I'd probably use the builder but I think it's a wash in this
> case.
>
>
>> +   } else if (use->parent_instr->type == nir_instr_type_phi) {
>> +  nir_phi_instr *cond_phi = nir_instr_as_phi(use->parent_instr);
>> +
>> +  bool UNUSED found = false;
>> +  nir_foreach_phi_src(phi_src, cond_phi) {
>> + if (phi_src->src.ssa == use->ssa) {
>>
>
> You could also just use some sort of container_of macro to cast from the
> src to a phi_src.  It's a bit sneaky so maybe not a good idea for the tiny
> bit of perf.
>
>
>> +nir_instr_insert_before_block(phi_src->pred, &load->instr);
>>
>
> after_block_before_jump would work just as well and would put the
> load_const closer to its use.
>
>
>> +found = true;
>> +break;
>> + }
>> +  }
>> +  assert(found);
>> +   } else {
>> +  nir_instr_insert_before(use->parent_instr,  &load->instr);
>> +   }
>> +
>> +   /* Rewrite use to use const */
>> +   nir_src new_src = nir_src_for_ssa(&load->def);
>>
>
> Is there a good reason for the temporary variable?
>
>
>> +
>> +   if (if_condition)
>> +  nir_if_rewrite_condition(use->parent_if, new_src);
>> +   else
>> +  nir_instr_rewrite_src(use->parent_instr, use, new_src);
>> +}
>>
>
> Ok, enough nitpicking.  None of the above things are actually problems.
>
>
>> +
>> +static bool
>> +evaluate_condition_use(nir_if *nif, nir_src *use_src, void *mem_ctx,
>> +   bool if_condition)
>> +{
>> +   bool progress = false;
>> +
>> +   nir_block *use_block;
>> +   if (if_condition) {
>> +  use_block =
>> +
>>  nir_cf_node_as_block(nir_cf_node_prev(&use_src->parent_if->cf_node));
>> +   } else {
>> +  use_block = use_src->parent_instr->block;
>>
>
> Not true for phis!
>
>
>> +   }
>> +
>> +   if (nir_block_dominates(nir_if_first_then_block(nif), use_block)) {
>> +  replace_if_condition_use_with_const(use_src, NIR_TRUE, mem_ctx,
>> +  if_condition);
>> +  progress = true;
>> +   } else if (nir_block_dominates(nir_if_first_else_block(nif),
>> use_block)) {
>> +  replace_if_condition_use_with_const(use_src, NIR_FALSE, mem_ctx,
>> +  if_condition);
>> +  progress = true;
>> +   }
>> +
>> +   return progress;
>> +}
>>
>
> I think things would be more straightforward (and correct!) if you merged
> the above two functions and restructured them a bit as follows:
>
> static bool
> try_rewrite_if_use(nir_builder *b, nir_if *nif, nir_src *src, bool
> if_condition)
> {
>if (if_condition) {
>   b->cursor = nir_before_cf_node(>cf_node);
>} else if (src->parent_instr->type == nir_instr_type_phi) {
>   // Set the cursor and use_block to the predecessor block
>} else {
>   b->cursor = nir_before_instr(src->parent_instr);
>}
>nir_block *use_block = nir_cursor_current_block(b->cursor);
>
>nir_ssa_def *const_value;
>if (nir_block_dominates(nir_if_first_then_block(nif), use_block))
>   const_value = nir_imm_int(b, NIR_TRUE);
>else if (nir_block_dominates(nir_if_first_else_block(nif), use_block))
>   const_value = nir_imm_int(b, 

Re: [Mesa-dev] [PATCH 2/2] glsl: remove execute bit and shebang from python tests

2018-08-24 Thread Andres Gomez
Emil, I've done some trivial conflicts resolution upon cherry picking
this commit.

You can see it at (staging/18.2):
https://gitlab.freedesktop.org/mesa/mesa/commit/f6dccf66865c31b13f48b50891a9f5a0d9949b1c

Please, let me know if this is OK.


On Fri, 2018-08-17 at 12:11 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Just like the rest of the tree - these should be run either as part of
> the build system check target, or at the very least with an explicitly
> versioned python executable.
> 
> Cc: Dylan Baker 
> Fixes: db8cd8e3677 ("glcpp/tests: Convert shell scripts to a python script")
> Fixes: 97c28cb0823 ("glsl/tests: Convert optimization-test.sh to pure python")
> Fixes: 3b52d292273 ("glsl/tests: reimplement warnings-test in python")
> Signed-off-by: Emil Velikov 
> ---
>  src/compiler/glsl/glcpp/tests/glcpp_test.py  | 1 -
>  src/compiler/glsl/tests/optimization_test.py | 1 -
>  src/compiler/glsl/tests/warnings_test.py | 1 -
>  3 files changed, 3 deletions(-)
>  mode change 100755 => 100644 src/compiler/glsl/glcpp/tests/glcpp_test.py
>  mode change 100755 => 100644 src/compiler/glsl/tests/optimization_test.py
>  mode change 100755 => 100644 src/compiler/glsl/tests/warnings_test.py
> 
> diff --git a/src/compiler/glsl/glcpp/tests/glcpp_test.py 
> b/src/compiler/glsl/glcpp/tests/glcpp_test.py
> old mode 100755
> new mode 100644
> index 8ac5d7cb0a1..8c7552124a6
> --- a/src/compiler/glsl/glcpp/tests/glcpp_test.py
> +++ b/src/compiler/glsl/glcpp/tests/glcpp_test.py
> @@ -1,4 +1,3 @@
> -#!/usr/bin/env python2
>  # encoding=utf-8
>  # Copyright © 2018 Intel Corporation
>  
> diff --git a/src/compiler/glsl/tests/optimization_test.py 
> b/src/compiler/glsl/tests/optimization_test.py
> old mode 100755
> new mode 100644
> index 577d2dfc20f..f8518a168e0
> --- a/src/compiler/glsl/tests/optimization_test.py
> +++ b/src/compiler/glsl/tests/optimization_test.py
> @@ -1,4 +1,3 @@
> -#!/usr/bin/env python2
>  # encoding=utf-8
>  # Copyright © 2018 Intel Corporation
>  
> diff --git a/src/compiler/glsl/tests/warnings_test.py 
> b/src/compiler/glsl/tests/warnings_test.py
> old mode 100755
> new mode 100644
> index 2e0f23180f3..2c4fa5a0d5a
> --- a/src/compiler/glsl/tests/warnings_test.py
> +++ b/src/compiler/glsl/tests/warnings_test.py
> @@ -1,4 +1,3 @@
> -#!/usr/bin/env python
>  # encoding=utf-8
>  # Copyright © 2017 Intel Corporation
>  
-- 
Br,

Andres


[Mesa-dev] [PATCH] nir: Pull block_ends_in_jump into nir.h

2018-08-24 Thread Jason Ekstrand
We had two different implementations in different files.  May as well
have one and put it in nir.h.
---
 src/compiler/nir/nir.h  |  7 +++
 src/compiler/nir/nir_control_flow.c | 17 +
 src/compiler/nir/nir_opt_dead_cf.c  | 12 +---
 3 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 45a8c2c64cc..009a6d60371 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1753,6 +1753,13 @@ nir_block_last_instr(nir_block *block)
return exec_node_data(nir_instr, tail, node);
 }
 
+static inline bool
+nir_block_ends_in_jump(nir_block *block)
+{
+   return !exec_list_is_empty(&block->instr_list) &&
+  nir_block_last_instr(block)->type == nir_instr_type_jump;
+}
+
 #define nir_foreach_instr(instr, block) \
foreach_list_typed(nir_instr, instr, node, &(block)->instr_list)
 #define nir_foreach_instr_reverse(instr, block) \
diff --git a/src/compiler/nir/nir_control_flow.c 
b/src/compiler/nir/nir_control_flow.c
index 1622b35a6c9..3b0a0f1a5b0 100644
--- a/src/compiler/nir/nir_control_flow.c
+++ b/src/compiler/nir/nir_control_flow.c
@@ -45,13 +45,6 @@
  */
 /*@{*/
 
-static bool
-block_ends_in_jump(nir_block *block)
-{
-   return !exec_list_is_empty(&block->instr_list) &&
-  nir_block_last_instr(block)->type == nir_instr_type_jump;
-}
-
 static inline void
 block_add_pred(nir_block *block, nir_block *pred)
 {
@@ -117,12 +110,12 @@ link_non_block_to_block(nir_cf_node *node, nir_block 
*block)
   nir_block *last_then_block = nir_if_last_then_block(if_stmt);
   nir_block *last_else_block = nir_if_last_else_block(if_stmt);
 
-  if (!block_ends_in_jump(last_then_block)) {
+  if (!nir_block_ends_in_jump(last_then_block)) {
  unlink_block_successors(last_then_block);
  link_blocks(last_then_block, block, NULL);
   }
 
-  if (!block_ends_in_jump(last_else_block)) {
+  if (!nir_block_ends_in_jump(last_else_block)) {
  unlink_block_successors(last_else_block);
  link_blocks(last_else_block, block, NULL);
   }
@@ -339,7 +332,7 @@ split_block_end(nir_block *block)
new_block->cf_node.parent = block->cf_node.parent;
 exec_node_insert_after(&block->cf_node.node, &new_block->cf_node.node);
 
-   if (block_ends_in_jump(block)) {
+   if (nir_block_ends_in_jump(block)) {
   /* Figure out what successor block would've had if it didn't have a jump
* instruction, and make new_block have that successor.
*/
@@ -553,7 +546,7 @@ stitch_blocks(nir_block *before, nir_block *after)
 * TODO: special case when before is empty and after isn't?
 */
 
-   if (block_ends_in_jump(before)) {
+   if (nir_block_ends_in_jump(before)) {
   assert(exec_list_is_empty(>instr_list));
   if (after->successors[0])
  remove_phi_src(after->successors[0], after);
@@ -588,7 +581,7 @@ nir_cf_node_insert(nir_cursor cursor, nir_cf_node *node)
* already been setup with the correct successors, so we need to set
* up jumps here as the block is being inserted.
*/
-  if (block_ends_in_jump(block))
+  if (nir_block_ends_in_jump(block))
  nir_handle_add_jump(block);
 
   stitch_blocks(block, after);
diff --git a/src/compiler/nir/nir_opt_dead_cf.c 
b/src/compiler/nir/nir_opt_dead_cf.c
index a652bcd99bb..e224daa1fda 100644
--- a/src/compiler/nir/nir_opt_dead_cf.c
+++ b/src/compiler/nir/nir_opt_dead_cf.c
@@ -256,16 +256,6 @@ dead_cf_block(nir_block *block)
return true;
 }
 
-static bool
-ends_in_jump(nir_block *block)
-{
-   if (exec_list_is_empty(>instr_list))
-  return false;
-
-   nir_instr *instr = nir_block_last_instr(block);
-   return instr->type == nir_instr_type_jump;
-}
-
 static bool
 dead_cf_list(struct exec_list *list, bool *list_ends_in_jump)
 {
@@ -297,7 +287,7 @@ dead_cf_list(struct exec_list *list, bool 
*list_ends_in_jump)
 progress = true;
  }
 
- if (ends_in_jump(block)) {
+ if (nir_block_ends_in_jump(block)) {
 *list_ends_in_jump = true;
 
 if (!exec_node_is_tail_sentinel(cur->node.next)) {
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] radeonsi: add radeonsi_zerovram driconfig option

2018-08-24 Thread Michel Dänzer
On 2018-08-24 1:06 p.m., Timothy Arceri wrote:
> More and more games seem to require this, so let's make it a config
> option.
> ---
>  src/gallium/drivers/radeonsi/driinfo_radeonsi.h |  1 +
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c   | 10 +++---
>  src/util/xmlpool/t_options.h|  5 +
>  3 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h 
> b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
> index 7f57b4ea892..8c5078c13f3 100644
> --- a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
> +++ b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
> @@ -3,6 +3,7 @@ DRI_CONF_SECTION_PERFORMANCE
>  DRI_CONF_RADEONSI_ENABLE_SISCHED("false")
>  DRI_CONF_RADEONSI_ASSUME_NO_Z_FIGHTS("false")
>  DRI_CONF_RADEONSI_COMMUTATIVE_BLEND_ADD("false")
> +DRI_CONF_RADEONSI_ZERO_ALL_VRAM_ALLOCS("false")
>  DRI_CONF_SECTION_END
>  
>  [...]
>  
> @@ -414,3 +414,8 @@ DRI_CONF_OPT_END
>  DRI_CONF_OPT_BEGIN_B(radeonsi_clear_db_cache_before_clear, def) \
>  DRI_CONF_DESC(en,"Clear DB cache before fast depth clear") \
>  DRI_CONF_OPT_END
> +
> +#define DRI_CONF_RADEONSI_ZERO_ALL_VRAM_ALLOCS(def) \
> +DRI_CONF_OPT_BEGIN_B(radeonsi_zerovram, def) \
> +DRI_CONF_DESC(en,"Zero all vram allocations") \
> +DRI_CONF_OPT_END
> 

I'd name the option simply "zerovram", so it could be used by other
drivers as well.


BTW, AFAICT, currently this only affects BOs allocated from the kernel,
not those re-used from the BO cache. I wonder if that couldn't still
cause trouble with some apps.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Vedran Miletić  changed:

   What|Removed |Added

 Depends on||104809


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=104809
[Bug 104809] anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to
not having depthBoundsTest
-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 107477] [DXVK] Setting high shader quality in GTA V results in LLVM error

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107477

--- Comment #10 from Samuel Pitoiset  ---
I would prefer a renderdoc capture if you can provide that.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-08-24 Thread Dylan Baker
Quoting Emil Velikov (2018-08-24 02:34:04)
> On Fri, 24 Aug 2018 at 04:23, Ilia Mirkin  wrote:
> >
> > This breaks the build for me. It selects python3 instead of python2,
> > and gen_xmlpool.py bails out when trying to print \xf3 to stdout with
> > a LANG=C locale. Revert until scripts are fixed and try again?
> >
> Sure will revert in a moment. The concerning part is why meson "succeeds".
> 
> Having a look, it lacks the $(LANGS) argument when invoking gen_xmlpool.py.
> And the .mo and .po files (on which LANGS is based) are missing
> altogether in meson.
> 
> Mathieu, Dylan can you look into this?
> Once the meson build is updated, Ilia's concerns will become more obvious.
> 
> Thanks
> Emil

This (and my dog waking me up at 5 am so she could potty) got me looking into
the translations again. I've sent a series that addresses the lack of
translation handling in meson.

Dylan
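
For reference, the failure Ilia describes can be reproduced without the build system. This is a minimal sketch, assuming stdout ends up with an ASCII codec (as Python 3 could select under LANG=C at the time); the in-memory wrappers below only simulate that situation and are not taken from gen_xmlpool.py.

```python
from __future__ import print_function

import io

# Simulate a stdout whose encoding is ASCII (assumption: this is what
# Python 3 picked under LANG=C in the failing autotools build).
raw_ascii = io.BytesIO()
ascii_stdout = io.TextIOWrapper(raw_ascii, encoding="ascii")

try:
    # U+00F3 (o with acute accent) has no ASCII representation.
    print(u"\xf3", file=ascii_stdout)
    ascii_stdout.flush()
    failed = False
except UnicodeEncodeError:
    failed = True

# Writing through an explicit UTF-8 wrapper does not depend on the locale.
raw_utf8 = io.BytesIO()
utf8_stdout = io.TextIOWrapper(raw_utf8, encoding="utf-8")
print(u"\xf3", file=utf8_stdout)
utf8_stdout.flush()
encoded = raw_utf8.getvalue()
```

With an ASCII stream the print raises UnicodeEncodeError, while the UTF-8 stream accepts the same character. Whether the scripts should force UTF-8 output or rely on the environment is a separate decision for the series.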




[Mesa-dev] [PATCH 2/5] util/gen_xmlpool: use argparse for argument handling

2018-08-24 Thread Dylan Baker
This is a little cleaner than just looking at sys.argv, but it's also
going to allow us to handle the differences in the way meson and
autotools handle translations more cleanly.
---
 src/util/xmlpool/gen_xmlpool.py | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/util/xmlpool/gen_xmlpool.py b/src/util/xmlpool/gen_xmlpool.py
index 56a67bcab55..b40f295738e 100644
--- a/src/util/xmlpool/gen_xmlpool.py
+++ b/src/util/xmlpool/gen_xmlpool.py
@@ -9,25 +9,23 @@
 
 from __future__ import print_function
 
+import argparse
 import io
 import sys
 import gettext
 import re
 
+parser = argparse.ArgumentParser()
+parser.add_argument('template')
+parser.add_argument('localedir')
+parser.add_argument('languages', nargs='*')
+args = parser.parse_args()
 
 if sys.version_info < (3, 0):
 gettext_method = 'ugettext'
 else:
 gettext_method = 'gettext'
 
-# Path to t_options.h
-template_header_path = sys.argv[1]
-
-localedir = sys.argv[2]
-
-# List of supported languages
-languages = sys.argv[3:]
-
 # Escape special characters in C strings
 def escapeCString (s):
 escapeSeqs = {'\a' : '\\a', '\b' : '\\b', '\f' : '\\f', '\n' : '\\n',
@@ -166,9 +164,9 @@ def expandMatches (matches, translations, end=None):
 # Compile a list of translation classes to all supported languages.
 # The first translation is always a NullTranslations.
 translations = [("en", gettext.NullTranslations())]
-for lang in languages:
+for lang in args.languages:
 try:
-trans = gettext.translation ("options", localedir, [lang])
+trans = gettext.translation ("options", args.localedir, [lang])
 except IOError:
 sys.stderr.write ("Warning: language '%s' not found.\n" % lang)
 continue
@@ -188,7 +186,7 @@ 
print("/***\
 
 # Process the options template and generate options.h with all
 # translations.
-template = io.open (template_header_path, mode="rt", encoding='utf-8')
+template = io.open (args.template, mode="rt", encoding='utf-8')
 descMatches = []
 for line in template:
 if len(descMatches) > 0:
-- 
2.18.0
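
The new command-line interface can be exercised in isolation. The sketch below repeats the three positional arguments added by the patch; the argument values passed to parse_args() are made-up examples, not paths from the Mesa tree.

```python
import argparse

# Same parser layout as in the patch above.
parser = argparse.ArgumentParser()
parser.add_argument('template')
parser.add_argument('localedir')
parser.add_argument('languages', nargs='*')

# Hypothetical invocation with three languages.
args = parser.parse_args(['t_options.h', '/tmp/locale', 'ca', 'es', 'de'])

# nargs='*' also accepts an empty language list, which matches the old
# sys.argv[3:] behaviour when no languages were passed.
no_langs = parser.parse_args(['t_options.h', '/tmp/locale'])
```

Here args.languages is a list of the trailing arguments and no_langs.languages is an empty list, so the loop over languages keeps working in both cases.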



[Mesa-dev] [PATCH 0/5] Meson support for gettext translations

2018-08-24 Thread Dylan Baker
This is the last thing I know of that's outstanding for meson (that's not
related to windows). One patch in this series is a fix that needs to go into
18.1 and 18.2.

Dylan Baker (5):
  meson: Actually load translation files
  util/gen_xmlpool: use argparse for argument handling
  meson: add support for generating translation mo files
  util/gen_xmlpool: Add a --meson switch
  meson: use meson generated translation files

 src/util/xmlpool/LINGUAS|  1 +
 src/util/xmlpool/POTFILES   |  1 +
 src/util/xmlpool/gen_xmlpool.py | 33 +++--
 src/util/xmlpool/meson.build|  8 +++-
 4 files changed, 28 insertions(+), 15 deletions(-)
 create mode 100644 src/util/xmlpool/LINGUAS
 create mode 100644 src/util/xmlpool/POTFILES

-- 
2.18.0



[Mesa-dev] [PATCH 3/5] meson: add support for generating translation mo files

2018-08-24 Thread Dylan Baker
Meson has a handy built-in module for handling gettext called i18n. This
module works a bit differently than our autotools build does: it doesn't
automatically generate translations; instead it creates 3 new top-level
targets to run. These are:

xmlpool-pot
xmlpool-update-po
xmlpool-gmo

To use translations from meson you'll want to run:
ninja xmlpool-pot xmlpool-update-po xmlpool-gmo
which will generate the necessary files.
---
 src/util/xmlpool/LINGUAS | 1 +
 src/util/xmlpool/POTFILES| 1 +
 src/util/xmlpool/meson.build | 3 +++
 3 files changed, 5 insertions(+)
 create mode 100644 src/util/xmlpool/LINGUAS
 create mode 100644 src/util/xmlpool/POTFILES

diff --git a/src/util/xmlpool/LINGUAS b/src/util/xmlpool/LINGUAS
new file mode 100644
index 000..3620176519e
--- /dev/null
+++ b/src/util/xmlpool/LINGUAS
@@ -0,0 +1 @@
+ca es de nl sv fr
diff --git a/src/util/xmlpool/POTFILES b/src/util/xmlpool/POTFILES
new file mode 100644
index 000..d68d7009be4
--- /dev/null
+++ b/src/util/xmlpool/POTFILES
@@ -0,0 +1 @@
+src/util/xmlpool/t_options.h
diff --git a/src/util/xmlpool/meson.build b/src/util/xmlpool/meson.build
index 3d2de0cdc3a..9696a3d8a81 100644
--- a/src/util/xmlpool/meson.build
+++ b/src/util/xmlpool/meson.build
@@ -29,3 +29,6 @@ xmlpool_options_h = custom_target(
   capture : true,
   depend_files : files('ca.po', 'es.po', 'de.po', 'nl.po', 'sv.po', 'fr.po'),
 )
+
+i18n = import('i18n')
+i18n.gettext('xmlpool')
-- 
2.18.0



[Mesa-dev] [PATCH 5/5] meson: use meson generated translation files

2018-08-24 Thread Dylan Baker
This is a change from the current status quo: to get dri-conf
translations you now *must* run the meson generation scripts, even if
you're building from an autotools generated tarball (an official
release).
---
 src/util/xmlpool/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/util/xmlpool/meson.build b/src/util/xmlpool/meson.build
index 9696a3d8a81..8047f064a3b 100644
--- a/src/util/xmlpool/meson.build
+++ b/src/util/xmlpool/meson.build
@@ -23,7 +23,7 @@ xmlpool_options_h = custom_target(
   input : ['gen_xmlpool.py', 't_options.h'],
   output : 'options.h',
   command : [
-prog_python, '@INPUT@', meson.current_source_dir(),
+prog_python, '@INPUT@', '--meson', meson.current_build_dir(),
 'ca', 'es', 'de', 'nl', 'sv', 'fr',
   ],
   capture : true,
-- 
2.18.0



[Mesa-dev] [PATCH 1/5] meson: Actually load translation files

2018-08-24 Thread Dylan Baker
Currently we run the script but don't actually load any files, even in a
tarball where they exist.

Fixes: 3218056e0eb375eeda470058d06add1532acd6d4
   ("meson: Build i965 and dri stack")
---
 src/util/xmlpool/meson.build | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/util/xmlpool/meson.build b/src/util/xmlpool/meson.build
index 346b1956a55..3d2de0cdc3a 100644
--- a/src/util/xmlpool/meson.build
+++ b/src/util/xmlpool/meson.build
@@ -22,7 +22,10 @@ xmlpool_options_h = custom_target(
   'xmlpool_options.h',
   input : ['gen_xmlpool.py', 't_options.h'],
   output : 'options.h',
-  command : [prog_python, '@INPUT@', meson.current_source_dir()],
+  command : [
+prog_python, '@INPUT@', meson.current_source_dir(),
+'ca', 'es', 'de', 'nl', 'sv', 'fr',
+  ],
   capture : true,
   depend_files : files('ca.po', 'es.po', 'de.po', 'nl.po', 'sv.po', 'fr.po'),
 )
-- 
2.18.0



[Mesa-dev] [PATCH 4/5] util/gen_xmlpool: Add a --meson switch

2018-08-24 Thread Dylan Baker
Meson won't put the .gmo files in the layout that python's
gettext.translation() expects, so we need to handle them differently.
This switch allows the script to load the files as meson lays them out.
---
 src/util/xmlpool/gen_xmlpool.py | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/util/xmlpool/gen_xmlpool.py b/src/util/xmlpool/gen_xmlpool.py
index b40f295738e..59d7a9bb84d 100644
--- a/src/util/xmlpool/gen_xmlpool.py
+++ b/src/util/xmlpool/gen_xmlpool.py
@@ -8,17 +8,18 @@
 #
 
 from __future__ import print_function
-
 import argparse
-import io
-import sys
 import gettext
+import io
+import os
 import re
+import sys
 
 parser = argparse.ArgumentParser()
 parser.add_argument('template')
 parser.add_argument('localedir')
 parser.add_argument('languages', nargs='*')
+parser.add_argument('--meson', action='store_true')
 args = parser.parse_args()
 
 if sys.version_info < (3, 0):
@@ -166,8 +167,14 @@ def expandMatches (matches, translations, end=None):
 translations = [("en", gettext.NullTranslations())]
 for lang in args.languages:
 try:
-trans = gettext.translation ("options", args.localedir, [lang])
+if args.meson:
+filename = os.path.join(args.localedir, '{}.gmo'.format(lang))
+with io.open(filename, 'rb') as f:
+trans = gettext.GNUTranslations(f)
+else:
+trans = gettext.translation ("options", args.localedir, [lang])
 except IOError:
+raise
 sys.stderr.write ("Warning: language '%s' not found.\n" % lang)
 continue
 translations.append ((lang, trans))
-- 
2.18.0
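
The two lookup paths can be compared side by side. This is a sketch under assumptions: load_translation() is a hypothetical helper, the flat `<localedir>/<lang>.gmo` layout is inferred from the patch, and the empty catalog is a hand-built fixture. Unlike the patch as posted, which re-raises on IOError, the helper returns None for a missing language.

```python
import gettext
import io
import os
import shutil
import struct
import tempfile


def load_translation(localedir, lang, meson=False):
    """Return a translations object for lang, or None if it is missing."""
    try:
        if meson:
            # Assumed meson layout: <localedir>/<lang>.gmo
            filename = os.path.join(localedir, '{}.gmo'.format(lang))
            with io.open(filename, 'rb') as f:
                return gettext.GNUTranslations(f)
        # Standard gettext layout: <localedir>/<lang>/LC_MESSAGES/options.mo
        return gettext.translation('options', localedir, [lang])
    except IOError:
        return None


# Hypothetical fixture: an empty but well-formed GNU .mo catalog
# (magic number, revision 0, zero strings, both tables at offset 28).
EMPTY_CATALOG = struct.pack('<7I', 0x950412de, 0, 0, 28, 28, 0, 28)

tmpdir = tempfile.mkdtemp()
try:
    with io.open(os.path.join(tmpdir, 'de.gmo'), 'wb') as f:
        f.write(EMPTY_CATALOG)
    found = load_translation(tmpdir, 'de', meson=True)
    missing = load_translation(tmpdir, 'fr', meson=True)
    missing_classic = load_translation(tmpdir, 'fr')
finally:
    shutil.rmtree(tmpdir)
```

An empty catalog falls back to the untranslated string, so found.gettext() returns its argument unchanged; both missing lookups yield None instead of an exception.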



Re: [Mesa-dev] [PATCH 5/8] anv/android: support import/export of AHardwareBuffer objects

2018-08-24 Thread Jason Ekstrand
On Fri, Aug 24, 2018 at 12:12 AM Tapani Pälli 
wrote:

>
>
> On 08/22/2018 05:28 PM, Jason Ekstrand wrote:
> > On Tue, Aug 21, 2018 at 3:27 AM Tapani Pälli  > > wrote:
> >
> > v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB)
> > v3: properly handle usage bits when creating from image
> >
> > Signed-off-by: Tapani Pälli  > >
> > ---
> >   src/intel/vulkan/anv_android.c | 149
> > +
> >   src/intel/vulkan/anv_device.c  |  46 -
> >   src/intel/vulkan/anv_private.h |  18 +
> >   3 files changed, 212 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/vulkan/anv_android.c
> > b/src/intel/vulkan/anv_android.c
> > index 7d0eb588e2b..6f90649847d 100644
> > --- a/src/intel/vulkan/anv_android.c
> > +++ b/src/intel/vulkan/anv_android.c
> > @@ -195,6 +195,155 @@ anv_GetAndroidHardwareBufferPropertiesANDROID(
> >  return VK_SUCCESS;
> >   }
> >
> > +VkResult
> > +anv_GetMemoryAndroidHardwareBufferANDROID(
> > +   VkDevice device_h,
> > +   const VkMemoryGetAndroidHardwareBufferInfoANDROID *pInfo,
> > +   struct AHardwareBuffer **pBuffer)
> > +{
> > +   ANV_FROM_HANDLE(anv_device_memory, mem, pInfo->memory);
> > +
> > +   /* Some quotes from Vulkan spec:
> > +*
> > +* "If the device memory was created by importing an Android
> > hardware
> > +* buffer, vkGetMemoryAndroidHardwareBufferANDROID must return
> > that same
> > +* Android hardware buffer object."
> > +*
> > +*
> > "VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
> must
> > +* have been included in
> > VkExportMemoryAllocateInfoKHR::handleTypes when
> > +* memory was created."
> > +*/
> > +   if (mem->ahw) {
> > +  *pBuffer = mem->ahw;
> > +  /* Increase refcount. */
> > +  AHardwareBuffer_acquire(mem->ahw);
> > +  return VK_SUCCESS;
> > +   }
> > +
> > +   return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR;
> > +}
> > +
> > +/*
> > + * Called from anv_AllocateMemory when import AHardwareBuffer.
> > + */
> > +VkResult
> > +anv_import_ahw_memory(VkDevice device_h,
> > +  struct anv_device_memory *mem,
> > +  const
> > VkImportAndroidHardwareBufferInfoANDROID *info)
> > +{
> > +   ANV_FROM_HANDLE(anv_device, device, device_h);
> > +
> > +   /* Get a description of buffer contents. */
> > +   AHardwareBuffer_Desc desc;
> > +   AHardwareBuffer_describe(info->buffer, );
> > +   VkResult result = VK_SUCCESS;
> > +
> > +   /* Import from AHardwareBuffer to anv_device_memory. */
> > +   const native_handle_t *handle =
> > +  AHardwareBuffer_getNativeHandle(info->buffer);
> > +
> > +   int dma_buf = (handle && handle->numFds) ? handle->data[0] : -1;
> > +   if (dma_buf < 0)
> > +  return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR;
> > +
> > +   uint64_t bo_flags = 0;
> > +   if (device->instance->physicalDevice.supports_48bit_addresses)
> > +  bo_flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +   if (device->instance->physicalDevice.use_softpin)
> > +  bo_flags |= EXEC_OBJECT_PINNED;
> > +
> > +   result = anv_bo_cache_import(device, >bo_cache,
> > +dma_buf, bo_flags, >bo);
> > +   if (result != VK_SUCCESS)
> > +  return result;
> > +
> > +   /* "If the vkAllocateMemory command succeeds, the implementation
> > must
> > +* acquire a reference to the imported hardware buffer, which it
> > must
> > +* release when the device memory object is freed. If the
> > command fails,
> > +* the implementation must not retain a reference."
> > +*/
> > +   AHardwareBuffer_acquire(info->buffer);
> > +   mem->ahw = info->buffer;
> > +
> > +   return result;
> > +}
> > +
> > +VkResult
> > +anv_create_ahw_memory(VkDevice device_h,
> > +  struct anv_device_memory *mem,
> > +  const VkMemoryAllocateInfo *pAllocateInfo)
> > +{
> > +   ANV_FROM_HANDLE(anv_device, dev, device_h);
> > +
> > +   const VkMemoryDedicatedAllocateInfo *dedicated_info =
> > +  vk_find_struct_const(pAllocateInfo->pNext,
> > +   MEMORY_DEDICATED_ALLOCATE_INFO);
> > +
> > +   uint32_t w = 0;
> > +   uint32_t h = 1;
> > +   uint32_t format = 0;
> > +   uint64_t usage = 0;
> > +
> > +   /* If caller passed dedicated information. */
> > +   if (dedicated_info && dedicated_info->image) {
> > +  ANV_FROM_HANDLE(anv_image, image, dedicated_info->image);
> > +  w = 

Re: [Mesa-dev] [PATCH 3/8] anv/android: add GetAndroidHardwareBufferPropertiesANDROID

2018-08-24 Thread Jason Ekstrand
On Fri, Aug 24, 2018 at 12:08 AM Tapani Pälli 
wrote:

>
>
> On 08/22/2018 05:18 PM, Jason Ekstrand wrote:
> > On Tue, Aug 21, 2018 at 3:27 AM Tapani Pälli  > > wrote:
> >
> > When adding YUV support, we need to figure out implementation-defined
> > external format identifier.
> >
> > Signed-off-by: Tapani Pälli  > >
> > ---
> >   src/intel/vulkan/anv_android.c | 99
> > ++
> >   1 file changed, 99 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_android.c
> > b/src/intel/vulkan/anv_android.c
> > index 46c41d57861..7d0eb588e2b 100644
> > --- a/src/intel/vulkan/anv_android.c
> > +++ b/src/intel/vulkan/anv_android.c
> > @@ -29,6 +29,8 @@
> >   #include 
> >
> >   #include "anv_private.h"
> > +#include "vk_format_info.h"
> > +#include "vk_util.h"
> >
> >   static int anv_hal_open(const struct hw_module_t* mod, const char*
> > id, struct hw_device_t** dev);
> >   static int anv_hal_close(struct hw_device_t *dev);
> > @@ -96,6 +98,103 @@ anv_hal_close(struct hw_device_t *dev)
> >  return -1;
> >   }
> >
> > +static VkResult
> > +get_ahw_buffer_format_properties(
> > +   VkDevice device_h,
> > +   const struct AHardwareBuffer *buffer,
> > +   VkAndroidHardwareBufferFormatPropertiesANDROID *pProperties)
> > +{
> > +   ANV_FROM_HANDLE(anv_device, device, device_h);
> > +
> > +   /* Get a description of buffer contents . */
> > +   AHardwareBuffer_Desc desc;
> > +   AHardwareBuffer_describe(buffer, );
> > +
> > +   /* Verify description. */
> > +   uint64_t gpu_usage =
> > +  AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE |
> > +  AHARDWAREBUFFER_USAGE_GPU_COLOR_OUTPUT |
> > +  AHARDWAREBUFFER_USAGE_GPU_DATA_BUFFER;
> > +
> > +   /* "Buffer must be a valid Android hardware buffer object with
> > at least
> > +* one of the AHARDWAREBUFFER_USAGE_GPU_* usage flags."
> > +*/
> > +   if (!(desc.usage & (gpu_usage)))
> > +  return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR;
> > +
> > +   /* Fill properties fields based on description. */
> > +   VkAndroidHardwareBufferFormatPropertiesANDROID *p = pProperties;
> > +
> > +   p->pNext = NULL;
> >
> >
> > You shouldn't be overwriting pNext.  That's used by the client to let
> > them chain in multiple structs to fill out in case Google ever extends
> > this extension.  Also, while we're here, it'd be good to throw in an
> > assert that p->sType is the right thing.
>
> Yes of course, will remove.
>
> > +   p->format = vk_format_from_android(desc.format);
> > +   p->externalFormat = 1; /* XXX */
> > +
> > +   const struct anv_format *anv_format = anv_get_format(p->format);
> > +   struct anv_physical_device *physical_device =
> > +  >instance->physicalDevice;
> > +   const struct gen_device_info *devinfo = _device->info;
> >
> >
> > If all you need is devinfo, that's available in the device; you don't
> > need to get the physical device for it.  Should be device->info.
>
> OK
>
> > +
> > +   p->formatFeatures =
> > +  anv_get_image_format_features(devinfo, p->format,
> > +anv_format,
> > VK_IMAGE_TILING_OPTIMAL);
> > +
> > +   /* "The formatFeatures member *must* include
> > +*  VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT and at least one of
> > +*  VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT or
> > +*  VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT"
> > +*/
> > +   p->formatFeatures |=
> > +  VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT;
> >
> >
> > Uh... Why not just throw in SAMPLED_BIT?  For that matter, all of the
> > formats you have in your conversion helpers support sampling.  Maybe
> > just replace that with an assert for now.
>
> Yeah, VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT is there. Well thing is that
> dEQP checks explicitly that either
> VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT or
> VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT exists (independent of
> surface format). TBH I'm not sure what to do about that.
>

That's annoying...  Is that in a Khronos CTS test or a Google test?  Either
way, we should try to get it corrected.  If you don't know how to do that,
I can send some e-mails.

--Jason


> > +
> > +   p->samplerYcbcrConversionComponents.r =
> > VK_COMPONENT_SWIZZLE_IDENTITY;
> > +   p->samplerYcbcrConversionComponents.g =
> > VK_COMPONENT_SWIZZLE_IDENTITY;
> > +   p->samplerYcbcrConversionComponents.b =
> > VK_COMPONENT_SWIZZLE_IDENTITY;
> > +   p->samplerYcbcrConversionComponents.a =
> > VK_COMPONENT_SWIZZLE_IDENTITY;
> > +
> > +   p->suggestedYcbcrModel =
> > VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY;
> > +   p->suggestedYcbcrRange = 

Re: [Mesa-dev] [PATCH] docs/relnotes: Mark NV_fragment_shader_interlock support in i965

2018-08-24 Thread Jason Ekstrand
Acked and pushed.  Thanks!

On Fri, Aug 24, 2018 at 1:01 AM  wrote:

> From: Kevin Rogovin 
>
> ---
>  docs/relnotes/18.3.0.html | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/docs/relnotes/18.3.0.html b/docs/relnotes/18.3.0.html
> index 594b0624a5..afcb044817 100644
> --- a/docs/relnotes/18.3.0.html
> +++ b/docs/relnotes/18.3.0.html
> @@ -59,6 +59,7 @@ Note: some of the new features are only available with
> certain drivers.
>  GL_EXT_vertex_attrib_64bit on i965, nvc0, radeonsi.
>  GL_EXT_window_rectangles on radeonsi.
>  GL_KHR_texture_compression_astc_sliced_3d on radeonsi.
> +GL_NV_fragment_shader_interlock on i965.
>  
>
>  Bug fixes
> --
> 2.17.1
>
>


[Mesa-dev] [PATCH] st/dri: implement the __DRI_DRIVER_VTABLE extension

2018-08-24 Thread Emil Velikov
From: Emil Velikov 

As the comment above globalDriverAPI (in dri_util.c) says, if the loader
is unaware of createNewScreen2 there is a race condition.

globalDriverAPI is set in the driver's driDriverGetExtensions* function
and used in createNewScreen(). If we call another driver's
driDriverGetExtensions, createNewScreen will use the latter's API
instead of the former's.

To make it more convoluted, the driver _must_ also expose
__DRI_DRIVER_VTABLE, as that one exposes the correct API.

The race also occurs for loaders which use the pre-megadrivers
driDriverGetExtensions entrypoint.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/dri/dri2.c   | 21 +
 src/gallium/state_trackers/dri/dri_screen.h |  1 +
 src/gallium/state_trackers/dri/drisw.c  |  6 ++
 src/gallium/targets/dri/target.c|  2 +-
 4 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 3cbca4e5dc3..b21e6815796 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -2318,11 +2318,32 @@ const struct __DriverAPIRec dri_kms_driver_api = {
.ReleaseBuffer  = dri2_release_buffer,
 };
 
+static const struct __DRIDriverVtableExtensionRec gallium_drm_vtable = {
+   .base = { __DRI_DRIVER_VTABLE, 1 },
+   .vtable = _driver_api,
+};
+
+static const struct __DRIDriverVtableExtensionRec dri_kms_vtable = {
+   .base = { __DRI_DRIVER_VTABLE, 1 },
+   .vtable = _kms_driver_api,
+};
+
 /* This is the table of extensions that the loader will dlsym() for. */
 const __DRIextension *galliumdrm_driver_extensions[] = {
 ,
 ,
 ,
+_drm_vtable.base,
+_config_options.base,
+NULL
+};
+
+/* This is the table of extensions that the loader will dlsym() for. */
+const __DRIextension *dri_kms_driver_extensions[] = {
+,
+,
+,
+_kms_vtable.base,
 _config_options.base,
 NULL
 };
diff --git a/src/gallium/state_trackers/dri/dri_screen.h 
b/src/gallium/state_trackers/dri/dri_screen.h
index 8d2d9c02892..fde3b4088a7 100644
--- a/src/gallium/state_trackers/dri/dri_screen.h
+++ b/src/gallium/state_trackers/dri/dri_screen.h
@@ -147,6 +147,7 @@ void
 dri_destroy_screen(__DRIscreen * sPriv);
 
 extern const struct __DriverAPIRec dri_kms_driver_api;
+extern const __DRIextension *dri_kms_driver_extensions[];
 
 extern const struct __DriverAPIRec galliumdrm_driver_api;
 extern const __DRIextension *galliumdrm_driver_extensions[];
diff --git a/src/gallium/state_trackers/dri/drisw.c 
b/src/gallium/state_trackers/dri/drisw.c
index 1fba71bdd97..76a06b36664 100644
--- a/src/gallium/state_trackers/dri/drisw.c
+++ b/src/gallium/state_trackers/dri/drisw.c
@@ -513,11 +513,17 @@ const struct __DriverAPIRec galliumsw_driver_api = {
.CopySubBuffer = drisw_copy_sub_buffer,
 };
 
+static const struct __DRIDriverVtableExtensionRec galliumsw_vtable = {
+   .base = { __DRI_DRIVER_VTABLE, 1 },
+   .vtable = _driver_api,
+};
+
 /* This is the table of extensions that the loader will dlsym() for. */
 const __DRIextension *galliumsw_driver_extensions[] = {
 ,
 ,
 ,
+_vtable.base,
 _config_options.base,
 NULL
 };
diff --git a/src/gallium/targets/dri/target.c b/src/gallium/targets/dri/target.c
index 835d125f21e..e943cae6a16 100644
--- a/src/gallium/targets/dri/target.c
+++ b/src/gallium/targets/dri/target.c
@@ -28,7 +28,7 @@ const __DRIextension 
**__driDriverGetExtensions_kms_swrast(void);
 PUBLIC const __DRIextension **__driDriverGetExtensions_kms_swrast(void)
 {
globalDriverAPI = _kms_driver_api;
-   return galliumdrm_driver_extensions;
+   return dri_kms_driver_extensions;
 }
 
 #endif
-- 
2.18.0
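
The ordering problem from the commit message can be modelled in a few lines. This is a language-neutral sketch written in Python, not Mesa code: Driver, global_driver_api and both create_new_screen functions are hypothetical stand-ins, and the fallback in create_new_screen2() is an assumption about how a vtable-aware loader path behaves.

```python
# Stand-in for the process-wide globalDriverAPI pointer.
global_driver_api = None


class Driver(object):
    def __init__(self, name):
        self.name = name

    def get_extensions(self):
        """Model of a driDriverGetExtensions* entrypoint."""
        global global_driver_api
        # Side effect: the last driver queried wins.
        global_driver_api = self.name
        # The returned table also carries the driver's own vtable,
        # which is what __DRI_DRIVER_VTABLE provides.
        return {'vtable': self.name}


def create_new_screen_legacy():
    # A loader unaware of the vtable extension sees only the global.
    return global_driver_api


def create_new_screen2(extensions):
    # A vtable-aware loader prefers the per-driver entry.
    return extensions.get('vtable', global_driver_api)


drm = Driver('galliumdrm')
kms = Driver('dri_kms')
drm_ext = drm.get_extensions()
kms_ext = kms.get_extensions()   # overwrites the global

legacy = create_new_screen_legacy()   # picks up the kms API for drm
fixed = create_new_screen2(drm_ext)   # uses drm's own vtable
```

After both drivers have been queried, the legacy path resolves to the second driver's API even when a screen is created for the first, while the vtable carried in the extension list still identifies the right one.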



[Mesa-dev] [PATCH 4/4] st/dri: make swrast_no_present member of dri_screen

2018-08-24 Thread Emil Velikov
From: Emil Velikov 

Just like the dri2 options, this is better suited in the dri_screen
struct.

Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/dri/dri_screen.h | 2 ++
 src/gallium/state_trackers/dri/drisw.c  | 7 +++
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri_screen.h 
b/src/gallium/state_trackers/dri/dri_screen.h
index e410aa9c2f2..8d2d9c02892 100644
--- a/src/gallium/state_trackers/dri/dri_screen.h
+++ b/src/gallium/state_trackers/dri/dri_screen.h
@@ -78,6 +78,8 @@ struct dri_screen
boolean has_reset_status_query;
enum pipe_texture_target target;
 
+   boolean swrast_no_present;
+
/* hooks filled in by dri2 & drisw */
__DRIimage * (*lookup_egl_image)(struct dri_screen *ctx, void *handle);
 
diff --git a/src/gallium/state_trackers/dri/drisw.c 
b/src/gallium/state_trackers/dri/drisw.c
index e24fcba3869..1fba71bdd97 100644
--- a/src/gallium/state_trackers/dri/drisw.c
+++ b/src/gallium/state_trackers/dri/drisw.c
@@ -42,7 +42,6 @@
 #include "dri_query_renderer.h"
 
 DEBUG_GET_ONCE_BOOL_OPTION(swrast_no_present, "SWRAST_NO_PRESENT", FALSE);
-static boolean swrast_no_present = FALSE;
 
 static inline void
 get_drawable_info(__DRIdrawable *dPriv, int *x, int *y, int *w, int *h)
@@ -195,7 +194,7 @@ drisw_present_texture(__DRIdrawable *dPriv,
struct dri_drawable *drawable = dri_drawable(dPriv);
struct dri_screen *screen = dri_screen(drawable->sPriv);
 
-   if (swrast_no_present)
+   if (screen->swrast_no_present)
   return;
 
screen->base.screen->flush_frontbuffer(screen->base.screen, ptex, 0, 0, 
drawable, sub_box);
@@ -338,7 +337,7 @@ drisw_allocate_textures(struct dri_context *stctx,
   dri_drawable_get_format(drawable, statts[i], , );
 
   /* if we don't do any present, no need for display targets */
-  if (statts[i] != ST_ATTACHMENT_DEPTH_STENCIL && !swrast_no_present)
+  if (statts[i] != ST_ATTACHMENT_DEPTH_STENCIL && 
!screen->swrast_no_present)
  bind |= PIPE_BIND_DISPLAY_TARGET;
 
   if (format == PIPE_FORMAT_NONE)
@@ -443,7 +442,7 @@ drisw_init_screen(__DRIscreen * sPriv)
screen->sPriv = sPriv;
screen->fd = -1;
 
-   swrast_no_present = debug_get_option_swrast_no_present();
+   screen->swrast_no_present = debug_get_option_swrast_no_present();
 
sPriv->driverPrivate = (void *)screen;
sPriv->extensions = drisw_screen_extensions;
-- 
2.18.0



[Mesa-dev] [PATCH 3/4] st/dri: inline dri2_buffer.h within dri2.c

2018-08-24 Thread Emil Velikov
From: Emil Velikov 

The header was used only by dri2.c, containing a two-member struct and cast 
wrapper.
Just inline it where it's used/needed.

Signed-off-by: Emil Velikov 
---
 .../state_trackers/dri/Makefile.sources   |  3 +--
 src/gallium/state_trackers/dri/dri2.c | 15 -
 src/gallium/state_trackers/dri/dri2_buffer.h  | 22 ---
 src/gallium/state_trackers/dri/meson.build|  2 +-
 4 files changed, 16 insertions(+), 26 deletions(-)
 delete mode 100644 src/gallium/state_trackers/dri/dri2_buffer.h

diff --git a/src/gallium/state_trackers/dri/Makefile.sources 
b/src/gallium/state_trackers/dri/Makefile.sources
index 36d5d47bb33..a610293bb11 100644
--- a/src/gallium/state_trackers/dri/Makefile.sources
+++ b/src/gallium/state_trackers/dri/Makefile.sources
@@ -11,8 +11,7 @@ common_SOURCES := \
dri_screen.h
 
 dri2_SOURCES := \
-   dri2.c \
-   dri2_buffer.h
+   dri2.c
 
 drisw_SOURCES := \
drisw.c
diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 2ac32205d9a..3cbca4e5dc3 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -45,15 +45,28 @@
 #include "main/bufferobj.h"
 #include "main/texobj.h"
 
+#include "dri_util.h"
+
 #include "dri_helpers.h"
 #include "dri_drawable.h"
 #include "dri_query_renderer.h"
-#include "dri2_buffer.h"
 
 #ifndef DRM_FORMAT_MOD_INVALID
 #define DRM_FORMAT_MOD_INVALID ((1ULL<<56) - 1)
 #endif
 
+struct dri2_buffer
+{
+   __DRIbuffer base;
+   struct pipe_resource *resource;
+};
+
+static inline struct dri2_buffer *
+dri2_buffer(__DRIbuffer * driBufferPriv)
+{
+   return (struct dri2_buffer *) driBufferPriv;
+}
+
 static const int fourcc_formats[] = {
__DRI_IMAGE_FOURCC_ARGB2101010,
__DRI_IMAGE_FOURCC_XRGB2101010,
diff --git a/src/gallium/state_trackers/dri/dri2_buffer.h 
b/src/gallium/state_trackers/dri/dri2_buffer.h
deleted file mode 100644
index 0cee4e906e6..000
--- a/src/gallium/state_trackers/dri/dri2_buffer.h
+++ /dev/null
@@ -1,22 +0,0 @@
-#ifndef DRI2_BUFFER_H
-#define DRI2_BUFFER_H
-
-#include "dri_util.h"
-
-struct pipe_surface;
-
-struct dri2_buffer
-{
-   __DRIbuffer base;
-   struct pipe_resource *resource;
-};
-
-static inline struct dri2_buffer *
-dri2_buffer(__DRIbuffer * driBufferPriv)
-{
-   return (struct dri2_buffer *) driBufferPriv;
-}
-
-#endif
-
-/* vim: set sw=3 ts=8 sts=3 expandtab: */
diff --git a/src/gallium/state_trackers/dri/meson.build 
b/src/gallium/state_trackers/dri/meson.build
index dfc37fcd81c..4bb41157e42 100644
--- a/src/gallium/state_trackers/dri/meson.build
+++ b/src/gallium/state_trackers/dri/meson.build
@@ -38,7 +38,7 @@ if with_dri
 endif
 
 if with_dri2
-  files_libdri += files('dri2.c', 'dri2_buffer.h')
+  files_libdri += files('dri2.c')
 endif
 
 libdri_c_args = []
-- 
2.18.0
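
For readers unfamiliar with the idiom being inlined above, here is a minimal sketch of the "base struct as first member plus cast wrapper" pattern. All demo_* names are hypothetical stand-ins, not the real __DRIbuffer or pipe_resource types, and the static assertion is an addition for illustration that the patch itself does not contain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the public base type (__DRIbuffer in the patch). */
struct demo_base {
   unsigned attachment;
   unsigned name;
};

struct demo_resource;   /* opaque here, like struct pipe_resource */

/* Private wrapper: the public struct first, then driver-private data. */
struct demo_buffer {
   struct demo_base base;            /* must stay the first member */
   struct demo_resource *resource;   /* driver-private payload */
};

/* The cast below is only valid while 'base' sits at offset 0. */
_Static_assert(offsetof(struct demo_buffer, base) == 0,
               "base must be the first member");

/* Cast wrapper: recover the private struct from a pointer to its base. */
static inline struct demo_buffer *
demo_buffer(struct demo_base *base)
{
   return (struct demo_buffer *) base;
}
```

Callers hand out &buf->base to generic code and use demo_buffer() to get the wrapper back, which is why a two-member struct and a one-line helper do not really need a header of their own.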



[Mesa-dev] [PATCH 1/4] mesa: use C99 initializer in get_gl_override()

2018-08-24 Thread Emil Velikov
From: Emil Velikov 

The overrides array contains entries indexed on the gl_api enum.
Use a C99 initializer to make it a bit more obvious.

Signed-off-by: Emil Velikov 
---
 src/mesa/main/version.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 77ff51b6d9e..610ba2f08c5 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -64,10 +64,10 @@ get_gl_override(gl_api api, int *version, bool *fwd_context,
   bool fc_suffix;
   bool compat_suffix;
} override[] = {
-  { -1, false, false},
-  { -1, false, false},
-  { -1, false, false},
-  { -1, false, false},
+  [API_OPENGL_COMPAT] = { -1, false, false},
+  [API_OPENGLES]  = { -1, false, false},
+  [API_OPENGLES2] = { -1, false, false},
+  [API_OPENGL_CORE]   = { -1, false, false},
};
 
STATIC_ASSERT(ARRAY_SIZE(override) == API_OPENGL_LAST + 1);
-- 
2.18.0
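
As a standalone illustration of the change above, the following sketch shows an array indexed by an enum and filled with C99 designated initializers. The demo_* names are hypothetical and only mirror the shape of the override[] table; the accessor function is an assumption added so the table can be exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum standing in for Mesa's gl_api. */
enum demo_api {
   DEMO_API_COMPAT,
   DEMO_API_ES,
   DEMO_API_ES2,
   DEMO_API_CORE,
   DEMO_API_LAST = DEMO_API_CORE
};

struct demo_override {
   int version;
   bool fc_suffix;
   bool compat_suffix;
};

/* Each entry names the enum value it belongs to, so reordering the
 * enum cannot silently pair an entry with the wrong API. */
static struct demo_override demo_overrides[] = {
   [DEMO_API_COMPAT] = { -1, false, false },
   [DEMO_API_ES]     = { -1, false, false },
   [DEMO_API_ES2]    = { -1, false, false },
   [DEMO_API_CORE]   = { -1, false, false },
};

/* Same idea as the STATIC_ASSERT in the patch context: one entry per API. */
_Static_assert(sizeof(demo_overrides) / sizeof(demo_overrides[0]) ==
               DEMO_API_LAST + 1, "one entry per API");

/* Look up the version override for an API; -1 means "not overridden". */
static int
demo_override_version(enum demo_api api)
{
   return demo_overrides[api].version;
}
```

With positional initializers the reader has to count rows to know which API an entry belongs to; with designators the index is spelled out next to the value.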



[Mesa-dev] [PATCH 2/4] st/xa: remove unused xa_screen::d[s]_depth_bits_last

2018-08-24 Thread Emil Velikov
From: Emil Velikov 

Unused since the initial import.

Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/xa/xa_priv.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_priv.h 
b/src/gallium/state_trackers/xa/xa_priv.h
index 13a0e86f66d..c513b8d9629 100644
--- a/src/gallium/state_trackers/xa/xa_priv.h
+++ b/src/gallium/state_trackers/xa/xa_priv.h
@@ -74,8 +74,6 @@ struct xa_surface {
 struct xa_tracker {
 enum xa_formats *supported_formats;
 unsigned int format_map[XA_LAST_SURFACE_TYPE][2];
-int d_depth_bits_last;
-int ds_depth_bits_last;
 struct pipe_loader_device *dev;
 struct pipe_screen *screen;
 struct xa_context *default_ctx;
-- 
2.18.0



[Mesa-dev] [Bug 107670] Massive slowdown under specific memcpy implementations (32bit, no-SIMD, backward copy).

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107670

--- Comment #6 from Eero Tamminen  ---
(In reply to Timothy Arceri from comment #1)
> There already is asm optimized version of memcpy() in glibc. Why would we
> want to reinvent that in Mesa?
> 
> glibc should pick the right implementation for your system.

How would memcpy() know that the destination is mapped to PCI-E address space
i.e. gets transparently transferred over the PCI-E bus (which has its own
performance constraints)?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 107670] Massive slowdown under specific memcpy implementations (32bit, no-SIMD, backward copy).

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107670

--- Comment #5 from i...@yahoo.com ---
(In reply to Roland Scheidegger from comment #3)
> Isn't this mapped as WC?
> In this case I'd expect the direction to make little difference, since write
> combine of any decent cpu should be able to combine the writes regardless
> the order?
> Although if it's UC I suppose someone needs to ensure that the maximum
> possible size is picked...

The theory that this is a caching issue has merit, since the distribution
version and my build seem to use the exact same memcpy(), one that goes
backwards, yet the distribution one does not trigger the massive slowdown.
The memmove() uses `rep movsb` and the direction flag.

The question is, what controls the cache? How does userland Mesa3D control the
PAT cache flags? I am just changing the libraries, without rebooting the
machine or restarting Xorg; I don't even stop the Steam client. This means that
the MTRR registers are not changed and the exact same kernel module and
configuration are used.

I do use a modified build script that disables support for hardware I don't
have, like Intel and NVIDIA. Some of my options might cause the cache problem,
but I need to know what I am looking for.
BTW, the system libdrm is the latest version (2.4.92).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-08-24 Thread Dylan Baker
Quoting Emil Velikov (2018-08-24 02:34:04)
> On Fri, 24 Aug 2018 at 04:23, Ilia Mirkin  wrote:
> >
> > This breaks the build for me. It selects python3 instead of python2,
> > and gen_xmlpool.py bails out when trying to print \xf3 to stdout with
> > a LANG=C locale. Revert until scripts are fixed and try again?
> >
> Sure will revert in a moment. The concerning part is why meson "succeeds".
> 
> Having a look, it lacks the $(LANGS) argument when invoking gen_xmlpool.py.
> And the .mo and .po files (on which LANGS is based) are missing
> altogether in meson.
> 
> Mathieu, Dylan can you look into this?
> Once the meson build is updated, Ilia's concerns will become more obvious.

Meson has built-ins for handling gettext, but I haven't wired them up yet. They
work somewhat differently than the way mesa does (they're a separate build step
you run before building the 'all' target). If you're building from a release
tarball meson will pick up the ones generated by autotools for the tarball.

Dylan




Re: [Mesa-dev] [PATCH 6/5] egl/android: continue to next device if dri2_create_screen fails

2018-08-24 Thread Emil Velikov
On Thu, 23 Aug 2018 at 04:27, Tomasz Figa  wrote:
>
> On Thu, Aug 23, 2018 at 1:44 AM Emil Velikov  wrote:
> >
> > Hi Tomasz,
> >
> > On 21 August 2018 at 14:54, Tomasz Figa  wrote:
> > > Hi Emil,
> > >
> > > On Tue, Aug 14, 2018 at 2:05 AM Emil Velikov  
> > > wrote:
> > >>
> > >> From: Emil Velikov 
> > >>
> > >> Unlike the other platforms, here we aim to guess if the device that we
> > >> somewhat arbitrarily picked is supported or not.
> > >>
> > >> It seems a bit fiddly, but considering EGL_EXT_explicit_device and
> > >> EGL_MESA_query_renderer are MIA, this is the best we can do for the
> > >> moment.
> > >>
> > >> With those (proposed) extensions userspace will be able to create a
> > >> separate EGL display for each device, query device details and make the
> > >> conscious decision which one to use.
> > >>
> > >> Cc: Robert Foss 
> > >> Cc: Tomasz Figa 
> > >> Signed-off-by: Emil Velikov 
> > >> ---
> > >>  src/egl/drivers/dri2/platform_android.c | 29 -
> > >>  1 file changed, 19 insertions(+), 10 deletions(-)
> > >>
> > >> diff --git a/src/egl/drivers/dri2/platform_android.c 
> > >> b/src/egl/drivers/dri2/platform_android.c
> > >> index 50dd7a5e1b4..cac59847b89 100644
> > >> --- a/src/egl/drivers/dri2/platform_android.c
> > >> +++ b/src/egl/drivers/dri2/platform_android.c
> > >> @@ -1295,6 +1295,25 @@ droid_open_device(_EGLDisplay *disp)
> > >>   continue;
> > >>}
> > >>/* Found a device */
> > >> +
> > >> +  /* Check that the device is supported, by attempting to:
> > >> +   * - load the dri module
> > >> +   * - and, create a screen
> > >> +   */
> > >> +  if (!droid_load_driver(disp)) {
> > >> + _eglLog(_EGL_WARNING, "DRI2: failed to load driver");
> > >> + close(fd);
> > >> + fd = -1;
> > >> + continue;
> > >> +  }
> > >> +
> > >> +  if (!dri2_create_screen(disp)) {
> > >> + _eglLog(_EGL_WARNING, "DRI2: failed to create screen");
> > >> + close(fd);
> > >> + fd = -1;
> > >> + continue;
> > >> +  }
> > >
> > > Don't we also need to do these tests when determining if the device is
> > > a suitable fallback? The fallback fd is set much earlier, in the same
> > > block as the continue statement, so the code below wouldn't execute.
> > >
> > Let me see if I got this correctly:
> >  - when a "vendor" is requested we use that, falling back to the first
> > other driver where screen creation succeeds
> >  - if one isn't requested, we pick the first device that can create a screen
> >
> > Is that right?
>
> Yes, seems to match my idea.
>
> Just to make sure we're not going to be masking some failures with
> fallbacks, I think we should make sure that if "vendor" is requested
> and found, we don't fallback, even if the matched device fails to
> create a screen.
>
The original code _is_ using a fallback when a vendor is requested,
which got to me.
Thanks a lot for the clarification. Updated patch should be in your inbox + ML.

Cheers,
Emil

[1] https://lists.freedesktop.org/archives/mesa-dev/2018-August/203533.html


[Mesa-dev] [PATCH] egl/android: rework device probing

2018-08-24 Thread Emil Velikov
From: Emil Velikov 

Unlike the other platforms, here we aim to guess if the device that we
somewhat arbitrarily picked is supported or not.

In particular: when a vendor is _not_ requested we loop through all
devices, picking the first one which can create a DRI screen.

When a vendor is requested - we use that and do _not_ fall-back to any
other device.

The former seems a bit fiddly, but considering EGL_EXT_explicit_device and
EGL_MESA_query_renderer are MIA, this is the best we can do for the
moment.

With those (proposed) extensions userspace will be able to create a
separate EGL display for each device, query device details and make the
conscious decision which one to use.

Cc: Robert Foss 
Cc: Tomasz Figa 
Signed-off-by: Emil Velikov 
---
Thanks for the clarification Tomasz. The original code was using a
fall-back even when a vendor was explicitly requested, confusing me a bit ;-)
---
 src/egl/drivers/dri2/platform_android.c | 71 +++--
 1 file changed, 43 insertions(+), 28 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index 1f9fe27ab85..5bf627dec7d 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -1420,13 +1420,32 @@ droid_filter_device(_EGLDisplay *disp, int fd, const 
char *vendor)
return 0;
 }
 
+static int
+droid_probe_device(_EGLDisplay *disp)
+{
+  /* Check that the device is supported, by attempting to:
+   * - load the dri module
+   * - and, create a screen
+   */
+   if (!droid_load_driver(disp)) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to load driver");
+  return -1;
+   }
+
+   if (!dri2_create_screen(disp)) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to create screen");
+  return -1;
+   }
+   return 0;
+}
+
 static int
 droid_open_device(_EGLDisplay *disp)
 {
 #define MAX_DRM_DEVICES 32
drmDevicePtr device, devices[MAX_DRM_DEVICES] = { NULL };
int prop_set, num_devices;
-   int fd = -1, fallback_fd = -1;
+   int fd = -1;
 
char *vendor_name = NULL;
char vendor_buf[PROPERTY_VALUE_MAX];
@@ -1451,33 +1470,39 @@ droid_open_device(_EGLDisplay *disp)
  continue;
   }
 
-  if (vendor_name && droid_filter_device(disp, fd, vendor_name)) {
- /* Match requested, but not found - set as fallback */
- if (fallback_fd == -1) {
-fallback_fd = fd;
- } else {
+  /* If a vendor is explicitly provided, we use only that.
+   * Otherwise we fall back to the first device that is supported.
+   */
+  if (vendor_name) {
+ if (droid_filter_device(disp, fd, vendor_name)) {
+/* Device does not match - try next device */
 close(fd);
 fd = -1;
+continue;
  }
-
+ /* If the requested device matches use it, regardless if
+  * init fails. Do not fall-back to any other device.
+  */
+ if (droid_probe_device(disp)) {
+close(fd);
+fd = -1;
+ }
+ break;
+  }
+  /* No explicit request - attempt the next device */
+  if (droid_probe_device(disp)) {
+ close(fd);
+ fd = -1;
  continue;
   }
-  /* Found a device */
   break;
}
drmFreeDevices(devices, num_devices);
 
-   if (fallback_fd < 0 && fd < 0) {
-  _eglLog(_EGL_WARNING, "Failed to open any DRM device");
-  return -1;
-   }
-
-   if (fd < 0) {
-  _eglLog(_EGL_WARNING, "Failed to open desired DRM device, using 
fallback");
-  return fallback_fd;
-   }
+   if (fd < 0)
+  _eglLog(_EGL_WARNING, "Failed to open %s DRM device",
+vendor_name ? "desired": "any");
 
-   close(fallback_fd);
return fd;
 #undef MAX_DRM_DEVICES
 }
@@ -1519,16 +1544,6 @@ dri2_initialize_android(_EGLDriver *drv, _EGLDisplay 
*disp)
   goto cleanup;
}
 
-   if (!droid_load_driver(disp)) {
-  err = "DRI2: failed to load driver";
-  goto cleanup;
-   }
-
-   if (!dri2_create_screen(disp)) {
-  err = "DRI2: failed to create screen";
-  goto cleanup;
-   }
-
if (!dri2_setup_extensions(disp)) {
   err = "DRI2: failed to setup extensions";
   goto cleanup;
-- 
2.18.0
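
The selection policy described in the commit message above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the Android EGL code: demo_device and demo_pick_device are hypothetical, and the 'usable' flag stands in for "the DRI module loads and a screen can be created".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device record; 'usable' models a successful probe. */
struct demo_device {
   const char *vendor;
   bool usable;
};

/* Returns the index of the chosen device, or -1 if none qualifies.
 *
 * - vendor != NULL: only the first device matching the vendor is
 *   considered; if its probe fails there is no fall-back.
 * - vendor == NULL: the first device whose probe succeeds wins.
 */
static int
demo_pick_device(const struct demo_device *devs, size_t count,
                 const char *vendor)
{
   for (size_t i = 0; i < count; i++) {
      if (vendor) {
         if (strcmp(devs[i].vendor, vendor) != 0)
            continue;                        /* not the requested vendor */
         return devs[i].usable ? (int) i : -1;
      }
      if (devs[i].usable)
         return (int) i;                     /* no request: first usable */
   }
   return -1;
}
```

The asymmetry is the point: an explicit request that fails should surface as a failure rather than be masked by silently picking a different device.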



[Mesa-dev] [PATCH 2/2] meson: Enable radeon vulkan driver on arm, aarch64

2018-08-24 Thread Guido Günther
This is similar to what the Debian package does so it looks like a sane
default.

Signed-off-by: Guido Günther 
---
 meson.build | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/meson.build b/meson.build
index ccc0ed6a0ea..bf8f0770915 100644
--- a/meson.build
+++ b/meson.build
@@ -179,6 +179,8 @@ if _vulkan_drivers.contains('auto')
   if system_has_kms_drm
 if host_machine.cpu_family().startswith('x86')
   _vulkan_drivers = ['amd', 'intel']
+elif ['arm', 'aarch64'].contains(host_machine.cpu_family())
+  _vulkan_drivers = ['amd']
 else
   error('Unknown architecture @0@. Please pass -Dvulkan-drivers to set 
driver options. Patches gladly accepted to fix this.'.format(
 host_machine.cpu_family()))
-- 
2.18.0



[Mesa-dev] [PATCH 1/2] meson: Be a bit more helpful when arch or OS is unknown

2018-08-24 Thread Guido Günther
Signed-off-by: Guido Günther 
---
 meson.build | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/meson.build b/meson.build
index 1b3dfa221c9..ccc0ed6a0ea 100644
--- a/meson.build
+++ b/meson.build
@@ -102,13 +102,15 @@ if _drivers.contains('auto')
 elif ['arm', 'aarch64'].contains(host_machine.cpu_family())
   _drivers = []
 else
-  error('Unknown architecture. Please pass -Ddri-drivers to set driver 
options. Patches gladly accepted to fix this.')
+  error('Unknown architecture @0@. Please pass -Ddri-drivers to set driver 
options. Patches gladly accepted to fix this.'.format(
+host_machine.cpu_family()))
 endif
   elif ['darwin', 'windows', 'cygwin', 'haiku'].contains(host_machine.system())
 # only swrast would make sense here, but gallium swrast is a much better 
default
 _drivers = []
   else
-error('Unknown OS. Please pass -Ddri-drivers to set driver options. 
Patches gladly accepted to fix this.')
+error('Unknown OS @0@. Please pass -Ddri-drivers to set driver options. 
Patches gladly accepted to fix this.'.format(
+  host_machine.system()))
   endif
 endif
 
@@ -135,12 +137,14 @@ if _drivers.contains('auto')
 'tegra', 'virgl', 'swrast',
   ]
 else
-  error('Unknown architecture. Please pass -Dgallium-drivers to set driver 
options. Patches gladly accepted to fix this.')
+  error('Unknown architecture @0@. Please pass -Dgallium-drivers to set 
driver options. Patches gladly accepted to fix this.'.format(
+host_machine.cpu_family()))
 endif
   elif ['darwin', 'windows', 'cygwin', 'haiku'].contains(host_machine.system())
 _drivers = ['swrast']
   else
-error('Unknown OS. Please pass -Dgallium-drivers to set driver options. 
Patches gladly accepted to fix this.')
+error('Unknown OS @0@. Please pass -Dgallium-drivers to set driver 
options. Patches gladly accepted to fix this.'.format(
+  host_machine.system()))
   endif
 endif
 with_gallium_pl111 = _drivers.contains('pl111')
@@ -176,13 +180,15 @@ if _vulkan_drivers.contains('auto')
 if host_machine.cpu_family().startswith('x86')
   _vulkan_drivers = ['amd', 'intel']
 else
-  error('Unknown architecture. Please pass -Dvulkan-drivers to set driver 
options. Patches gladly accepted to fix this.')
+  error('Unknown architecture @0@. Please pass -Dvulkan-drivers to set 
driver options. Patches gladly accepted to fix this.'.format(
+host_machine.cpu_family()))
 endif
   elif ['darwin', 'windows', 'cygwin', 'haiku'].contains(host_machine.system())
 # No vulkan driver supports windows or macOS currently
 _vulkan_drivers = []
   else
-error('Unknown OS. Please pass -Dvulkan-drivers to set driver options. 
Patches gladly accepted to fix this.')
+error('Unknown OS @0@. Please pass -Dvulkan-drivers to set driver options. 
Patches gladly accepted to fix this.'.format(
+  host_machine.system()))
   endif
 endif
 
@@ -230,7 +236,8 @@ if _platforms.contains('auto')
   elif ['haiku'].contains(host_machine.system())
 _platforms = ['haiku']
   else
-error('Unknown OS. Please pass -Dplatforms to set platforms. Patches 
gladly accepted to fix this.')
+error('Unknown OS @0@. Please pass -Dplatforms to set platforms. Patches 
gladly accepted to fix this.'.format(
+  host_machine.system()))
   endif
 endif
 
-- 
2.18.0



[Mesa-dev] [Bug 107524] Broken packDouble2x32 at llvmpipe

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107524

--- Comment #4 from Matwey V. Kornilov  ---
Any news?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] egl/wayland: do not leak wl_buffer when it is locked

2018-08-24 Thread Juan A. Suarez Romero
On Wed, 2018-08-08 at 17:49 +0100, Daniel Stone wrote:
> Hi Juan,
> 
> On Wed, 8 Aug 2018 at 17:40, Juan A. Suarez Romero  
> wrote:
> > On Wed, 2018-08-08 at 17:21 +0100, Daniel Stone wrote:
> > > On Thu, 2 Aug 2018 at 10:02, Juan A. Suarez Romero  
> > > wrote:
> > > > If color buffer is locked, do not set its wayland buffer to NULL;
> > > > otherwise it can not be freed later.
> > > 
> > > It can: see the 'if (i == ARRAY_SIZE(...))' branch inside 
> > > wl_buffer_release.
> > 
> > I think I explained it wrongly :)
> > 
> > If color buffer is locked, we set color_buffer.wl_buffer to NULL, and thus 
> > we
> > can't free wl_buffer later.
> 
> If a surface is resized, we will orphan all its wl_buffers by clearing
> their pointers to NULL, hence losing the ability to directly free
> them. But when a new buffer has been attached and displayed, a
> 'release' event will be delivered for the old buffer (handled by
> wl_buffer_release as an event listener), which will detect that the
> wl_buffer is not in the list and should be immediately destroyed.
> 
> In fact, I have to rescind my R-b since I'm pretty sure this breaks resizing:
>   - buffer 1 is allocated with original size, committed to server
> (locked == true)
>   - user resizes surface, release_buffers() is called but leaves
> wl_buffer intact in color_buffers list
>   - buffers 2..4 are allocated with new size and committed to server
> (locked == true)
>   - release event for buffer 1 is delivered, locked = false
>   - get_back_bo() finds buffer 1 has a wl_buffer and is not locked,
> but dri_image is NULL so a new image is created

I understand this situation happens because the release event just set
locked=false, but does not free the wl_buffer.

What about this? Within my proposed fix, add a flag to tell the release event if
the wl_buffer should be released or not. By default this is false, so the
release event only unlocks the buffer. In release_buffers() if the buffer is
locked, then we set the flag to true, so when the release event is invoked we
free the buffer and set to NULL, so get_back_bo() won't reuse the old buffer.
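
To make the idea concrete, here is a rough model of the flag I have in mind. It is only a sketch under assumptions: the struct, the wl_release field name and the demo_* functions are hypothetical, an int handle stands in for the real struct wl_buffer pointer, and a counter replaces the actual destroy call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one color buffer slot. */
struct demo_color_buffer {
   int wl_buffer;     /* 0 means "no buffer"; stands in for a wl_buffer pointer */
   bool locked;       /* true while the compositor still holds the buffer */
   bool wl_release;   /* assumed flag: destroy on the next release event */
};

static int demo_destroy_count;

/* Stand-in for destroying the wl_buffer and clearing the pointer. */
static void
demo_destroy(struct demo_color_buffer *buf)
{
   buf->wl_buffer = 0;
   demo_destroy_count++;
}

/* Called on resize: unlocked buffers go away immediately, locked ones
 * are only marked so the release event can finish the job. */
static void
demo_release_buffers(struct demo_color_buffer *buf)
{
   if (!buf->wl_buffer)
      return;
   if (buf->locked)
      buf->wl_release = true;
   else
      demo_destroy(buf);
}

/* Release event handler: by default it only unlocks; if the buffer was
 * marked during a resize it is destroyed so it cannot be reused. */
static void
demo_on_release_event(struct demo_color_buffer *buf)
{
   if (buf->wl_release) {
      demo_destroy(buf);
      buf->wl_release = false;
   }
   buf->locked = false;
}
```

In this model the pointer is never cleared while the buffer is locked, so it cannot be lost, and a stale buffer is never seen as reusable once the release event has run.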


J.A.



>   - swap_buffers_with_damage() finds buffer 1 (with new DRIimage)
> still has wl_buffer (old) and attaches that buffer
> 
> So we need to either find a way to still destroy the wl_buffer inside
> wl_buffer_release(), or at least do it later, e.g. in get_back_bo().
> 
> Cheers,
> Daniel
> 



Re: [Mesa-dev] [PATCH] anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2()

2018-08-24 Thread Lionel Landwerlin

On 24/08/2018 12:09, Samuel Iglesias Gonsálvez wrote:
> VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1.
>
> Fixes Vulkan CTS CL#2849.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 

Reviewed-by: Lionel Landwerlin 

> ---
>   src/intel/vulkan/anv_device.c | 7 +++
>   1 file changed, 7 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index d85615caaed..4cb9cc453e6 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1158,6 +1158,13 @@ void anv_GetPhysicalDeviceProperties2(
>    break;
> }
>   
> +  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_PROPERTIES: {
> + VkPhysicalDeviceProtectedMemoryProperties *props =
> +(VkPhysicalDeviceProtectedMemoryProperties *)ext;
> + props->protectedNoFault = false;
> + break;
> +  }
> +
> default:
>    anv_debug_ignored_stype(ext->sType);
>    break;


Re: [Mesa-dev] [PATCH] egl/wayland: do not leak wl_buffer when it is locked

2018-08-24 Thread Juan A. Suarez Romero
On Wed, 2018-08-08 at 17:49 +0100, Daniel Stone wrote:
> Hi Juan,
> 
> On Wed, 8 Aug 2018 at 17:40, Juan A. Suarez Romero  
> wrote:
> > On Wed, 2018-08-08 at 17:21 +0100, Daniel Stone wrote:
> > > On Thu, 2 Aug 2018 at 10:02, Juan A. Suarez Romero  
> > > wrote:
> > > > If color buffer is locked, do not set its wayland buffer to NULL;
> > > > otherwise it can not be freed later.
> > > 
> > > It can: see the 'if (i == ARRAY_SIZE(...))' branch inside 
> > > wl_buffer_release.
> > 
> > I think I explained it wrongly :)
> > 
> > If color buffer is locked, we set color_buffer.wl_buffer to NULL, and thus 
> > we
> > can't free wl_buffer later.
> 
> If a surface is resized, we will orphan all its wl_buffers by clearing
> their pointers to NULL, hence losing the ability to directly free
> them. But when a new buffer has been attached and displayed, a
> 'release' event will be delivered for the old buffer (handled by
> wl_buffer_release as an event listener), which will detect that the
> wl_buffer is not in the list and should be immediately destroyed.
> 
> In fact, I have to rescind my R-b since I'm pretty sure this breaks resizing:
>   - buffer 1 is allocated with original size, committed to server
> (locked == true)
>   - user resizes surface, release_buffers() is called but leaves
> wl_buffer intact in color_buffers list
>   - buffers 2..4 are allocated with new size and committed to server
> (locked == true)
>   - release event for buffer 1 is delivered, locked = false
>   - get_back_bo() finds buffer 1 has a wl_buffer and is not locked,
> but dri_image is NULL so a new image is created
>   - swap_buffers_with_damage() finds buffer 1 (with new DRIimage)
> still has wl_buffer (old) and attaches that buffer
> 
> So we need to either find a way to still destroy the wl_buffer inside
> wl_buffer_release(), or at least do it later, e.g. in get_back_bo().
> 


Investigating a bit more, I found the following:


1) dri2_wl_release_buffers() is invoked; one of the color_buffers is locked, so
the proper wl_buffer is not destroyed, but set to NULL. This will be destroyed
later through the release event.

2) Test is finished, and dri2_wl_destroy_surface() is invoked. This function
destroys all wl_buffers, no matter if they are locked or not. But can't destroy
the above one, because it is NULL. 
More important, dri2_wl_destroy_surface() frees the surface itself, including
the event queue [wl_event_queue_destroy(dri2_surf->wl_queue);]. So when the
release event for the wl_buffer arrives, there's no wl_queue.


A quick check of what would happen if the queue is not destroyed revealed that
the tests pass without any problem. Of course, I don't think this is a proper
solution, as we would be leaking the queue itself.

J.A.




> Cheers,
> Daniel
> 



[Mesa-dev] [PATCH] anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2()

2018-08-24 Thread Samuel Iglesias Gonsálvez
VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1.

Fixes Vulkan CTS CL#2849.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_device.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d85615caaed..4cb9cc453e6 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1158,6 +1158,13 @@ void anv_GetPhysicalDeviceProperties2(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_PROPERTIES: {
+ VkPhysicalDeviceProtectedMemoryProperties *props =
+(VkPhysicalDeviceProtectedMemoryProperties *)ext;
+ props->protectedNoFault = false;
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
-- 
2.17.1
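
For context, the new case in the patch above slots into the usual "walk the pNext chain and switch on sType" loop. The sketch below models that loop with hypothetical demo_* types so it builds without the Vulkan headers; it is not the anv implementation, and the 'ignored' counter merely stands in for the debug helper called in the default case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical structure-type tags. */
enum demo_stype {
   DEMO_STYPE_PROTECTED_MEMORY = 1,
   DEMO_STYPE_OTHER = 2
};

/* Common header every chained structure starts with. */
struct demo_base_out {
   enum demo_stype sType;
   struct demo_base_out *pNext;
};

/* Extension structure: header first, then its own fields. */
struct demo_protected_memory_props {
   struct demo_base_out base;
   bool protectedNoFault;
};

/* Walk the chain, fill in the structures we know about, and return how
 * many were ignored because their type is unknown to this "driver". */
static int
demo_fill_properties(struct demo_base_out *chain)
{
   int ignored = 0;

   for (struct demo_base_out *ext = chain; ext != NULL; ext = ext->pNext) {
      switch (ext->sType) {
      case DEMO_STYPE_PROTECTED_MEMORY: {
         struct demo_protected_memory_props *props =
            (struct demo_protected_memory_props *) ext;
         props->protectedNoFault = false;   /* feature not supported */
         break;
      }
      default:
         ignored++;
         break;
      }
   }
   return ignored;
}
```

Without a matching case the structure falls through to the default branch and is left untouched, so the caller would read back whatever value it had put there.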



[Mesa-dev] [PATCH 4/4] radeonsi: enable radeonsi_zerovram for No Mans Sky

2018-08-24 Thread Timothy Arceri
---
 src/util/00-mesa-defaults.conf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/util/00-mesa-defaults.conf b/src/util/00-mesa-defaults.conf
index ad59efba50b..5d15b3819fb 100644
--- a/src/util/00-mesa-defaults.conf
+++ b/src/util/00-mesa-defaults.conf
@@ -322,5 +322,8 @@ TODO: document the other workarounds.
 
 
 
+
+
+
 
 
-- 
2.17.1



[Mesa-dev] [PATCH 3/4] radeonsi: add radeonsi_zerovram driconfig option

2018-08-24 Thread Timothy Arceri
More and more games seem to require this so let's make it a config
option.
---
 src/gallium/drivers/radeonsi/driinfo_radeonsi.h |  1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c   | 10 +++---
 src/util/xmlpool/t_options.h|  5 +
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h 
b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
index 7f57b4ea892..8c5078c13f3 100644
--- a/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
+++ b/src/gallium/drivers/radeonsi/driinfo_radeonsi.h
@@ -3,6 +3,7 @@ DRI_CONF_SECTION_PERFORMANCE
 DRI_CONF_RADEONSI_ENABLE_SISCHED("false")
 DRI_CONF_RADEONSI_ASSUME_NO_Z_FIGHTS("false")
 DRI_CONF_RADEONSI_COMMUTATIVE_BLEND_ADD("false")
+DRI_CONF_RADEONSI_ZERO_ALL_VRAM_ALLOCS("false")
 DRI_CONF_SECTION_END
 
 DRI_CONF_SECTION_DEBUG
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index 882f500bc69..dcbc075e3c5 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -32,6 +32,7 @@
 
 #include "util/u_hash_table.h"
 #include "util/hash_table.h"
+#include "util/xmlconfig.h"
 #include 
 #include 
 #include 
@@ -49,7 +50,9 @@ static simple_mtx_t dev_tab_mutex = 
_SIMPLE_MTX_INITIALIZER_NP;
 DEBUG_GET_ONCE_BOOL_OPTION(all_bos, "RADEON_ALL_BOS", false)
 
 /* Helper function to do the ioctls needed for setup and init. */
-static bool do_winsys_init(struct amdgpu_winsys *ws, int fd)
+static bool do_winsys_init(struct amdgpu_winsys *ws,
+   const struct pipe_screen_config *config,
+   int fd)
 {
if (!ac_query_gpu_info(fd, ws->dev, &ws->info, &ws->amdinfo))
   goto fail;
@@ -63,7 +66,8 @@ static bool do_winsys_init(struct amdgpu_winsys *ws, int fd)
ws->check_vm = strstr(debug_get_option("R600_DEBUG", ""), "check_vm") != 
NULL;
ws->debug_all_bos = debug_get_option_all_bos();
ws->reserve_vmid = strstr(debug_get_option("R600_DEBUG", ""), 
"reserve_vmid") != NULL;
-   ws->zero_all_vram_allocs = strstr(debug_get_option("R600_DEBUG", ""), 
"zerovram") != NULL;
+   ws->zero_all_vram_allocs = strstr(debug_get_option("R600_DEBUG", ""), 
"zerovram") != NULL ||
+  driQueryOptionb(config->options, "radeonsi_zerovram");
 
return true;
 
@@ -279,7 +283,7 @@ amdgpu_winsys_create(int fd, const struct 
pipe_screen_config *config,
ws->info.drm_major = drm_major;
ws->info.drm_minor = drm_minor;
 
-   if (!do_winsys_init(ws, fd))
+   if (!do_winsys_init(ws, config, fd))
   goto fail_alloc;
 
/* Create managers. */
diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h
index ecf495a2f29..945d0e60f90 100644
--- a/src/util/xmlpool/t_options.h
+++ b/src/util/xmlpool/t_options.h
@@ -414,3 +414,8 @@ DRI_CONF_OPT_END
 DRI_CONF_OPT_BEGIN_B(radeonsi_clear_db_cache_before_clear, def) \
 DRI_CONF_DESC(en,"Clear DB cache before fast depth clear") \
 DRI_CONF_OPT_END
+
+#define DRI_CONF_RADEONSI_ZERO_ALL_VRAM_ALLOCS(def) \
+DRI_CONF_OPT_BEGIN_B(radeonsi_zerovram, def) \
+DRI_CONF_DESC(en,"Zero all vram allocations") \
+DRI_CONF_OPT_END
-- 
2.17.1
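
The decision the patch above makes, reduced to its core, is "enable zeroing if either the debug string or the per-application config asks for it". A minimal sketch, assuming the caller has already fetched both inputs (the function name and parameters are hypothetical; the real code reads R600_DEBUG and the driconf option itself):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* debug_flags:      contents of a comma-separated debug variable, or NULL
 * driconf_zerovram: value of the boolean config option
 *
 * Either source is enough to turn the behaviour on. */
static bool
demo_zero_all_vram(const char *debug_flags, bool driconf_zerovram)
{
   bool from_env = debug_flags != NULL &&
                   strstr(debug_flags, "zerovram") != NULL;

   return from_env || driconf_zerovram;
}
```

Keeping the environment variable as an override means developers can still force the behaviour for any application, while the config option lets it ship enabled for the specific games that need it.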



[Mesa-dev] [PATCH 2/4] radeonsi: enable GL 4.5 in compat profile

2018-08-24 Thread Timothy Arceri
---
 src/gallium/drivers/radeonsi/si_get.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index 47368fb7c91..f4c61a7e408 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -184,8 +184,7 @@ static int si_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_GLSL_FEATURE_LEVEL:
case PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY:
if (sscreen->info.has_indirect_compute_dispatch)
-   return param == PIPE_CAP_GLSL_FEATURE_LEVEL ?
-   450 : 440;
+   return 450;
return 420;
 
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-- 
2.17.1



[Mesa-dev] [PATCH 1/4] mesa: enable ARB_direct_state_access in compat for GL3.1+

2018-08-24 Thread Timothy Arceri
We could enable it for lower versions of GL but this allows us
to just use the existing version/extension checks that are already
used by the core profile.
---
 src/mapi/glapi/gen/apiexec.py    | 194 +++
 src/mesa/main/extensions_table.h |   2 +-
 src/mesa/main/fbobject.c         |  13 ++-
 3 files changed, 105 insertions(+), 104 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
index b163d88549b..e2fc124be22 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -152,103 +152,103 @@ functions = {
 
 # OpenGL 4.5 / GL_ARB_direct_state_access.   Mesa can expose the extension
 # with core profile.
-"CreateTransformFeedbacks": exec_info(compatibility=45, core=31),
-"TransformFeedbackBufferBase": exec_info(compatibility=45, core=31),
-"TransformFeedbackBufferRange": exec_info(compatibility=45, core=31),
-"GetTransformFeedbackiv": exec_info(compatibility=45, core=31),
-"GetTransformFeedbacki_v": exec_info(compatibility=45, core=31),
-"GetTransformFeedbacki64_v": exec_info(compatibility=45, core=31),
-"CreateBuffers": exec_info(compatibility=45, core=31),
-"NamedBufferStorage": exec_info(compatibility=45, core=31),
-"NamedBufferData": exec_info(compatibility=45, core=31),
-"NamedBufferSubData": exec_info(compatibility=45, core=31),
-"CopyNamedBufferSubData": exec_info(compatibility=45, core=31),
-"ClearNamedBufferData": exec_info(compatibility=45, core=31),
-"ClearNamedBufferSubData": exec_info(compatibility=45, core=31),
-"MapNamedBuffer": exec_info(compatibility=45, core=31),
-"MapNamedBufferRange": exec_info(compatibility=45, core=31),
-"UnmapNamedBuffer": exec_info(compatibility=45, core=31),
-"FlushMappedNamedBufferRange": exec_info(compatibility=45, core=31),
-"GetNamedBufferParameteriv": exec_info(compatibility=45, core=31),
-"GetNamedBufferParameteri64v": exec_info(compatibility=45, core=31),
-"GetNamedBufferPointerv": exec_info(compatibility=45, core=31),
-"GetNamedBufferSubData": exec_info(compatibility=45, core=31),
-"CreateFramebuffers": exec_info(compatibility=45, core=31),
-"NamedFramebufferRenderbuffer": exec_info(compatibility=45, core=31),
-"NamedFramebufferParameteri": exec_info(compatibility=45, core=31),
-"NamedFramebufferTexture": exec_info(compatibility=45, core=31),
-"NamedFramebufferTextureLayer": exec_info(compatibility=45, core=31),
-"NamedFramebufferDrawBuffer": exec_info(compatibility=45, core=31),
-"NamedFramebufferDrawBuffers": exec_info(compatibility=45, core=31),
-"NamedFramebufferReadBuffer": exec_info(compatibility=45, core=31),
-"InvalidateNamedFramebufferData": exec_info(compatibility=45, core=31),
-"InvalidateNamedFramebufferSubData": exec_info(compatibility=45, core=31),
-"ClearNamedFramebufferiv": exec_info(compatibility=45, core=31),
-"ClearNamedFramebufferuiv": exec_info(compatibility=45, core=31),
-"ClearNamedFramebufferfv": exec_info(compatibility=45, core=31),
-"ClearNamedFramebufferfi": exec_info(compatibility=45, core=31),
-"BlitNamedFramebuffer": exec_info(compatibility=45, core=31),
-"CheckNamedFramebufferStatus": exec_info(compatibility=45, core=31),
-"GetNamedFramebufferParameteriv": exec_info(compatibility=45, core=31),
-"GetNamedFramebufferAttachmentParameteriv": exec_info(compatibility=45, core=31),
-"CreateRenderbuffers": exec_info(compatibility=45, core=31),
-"NamedRenderbufferStorage": exec_info(compatibility=45, core=31),
-"NamedRenderbufferStorageMultisample": exec_info(compatibility=45, core=31),
-"GetNamedRenderbufferParameteriv": exec_info(compatibility=45, core=31),
-"CreateTextures": exec_info(compatibility=45, core=31),
-"TextureBuffer": exec_info(compatibility=45, core=31),
-"TextureBufferRange": exec_info(compatibility=45, core=31),
-"TextureStorage1D": exec_info(compatibility=45, core=31),
-"TextureStorage2D": exec_info(compatibility=45, core=31),
-"TextureStorage3D": exec_info(compatibility=45, core=31),
-"TextureStorage2DMultisample": exec_info(compatibility=45, core=31),
-"TextureStorage3DMultisample": exec_info(compatibility=45, core=31),
-"TextureSubImage1D": exec_info(compatibility=45, core=31),
-"TextureSubImage2D": exec_info(compatibility=45, core=31),
-"TextureSubImage3D": exec_info(compatibility=45, core=31),
-"CompressedTextureSubImage1D": exec_info(compatibility=45, core=31),
-"CompressedTextureSubImage2D": exec_info(compatibility=45, core=31),
-"CompressedTextureSubImage3D": exec_info(compatibility=45, core=31),
-"CopyTextureSubImage1D": exec_info(compatibility=45, core=31),
-"CopyTextureSubImage2D": exec_info(compatibility=45, core=31),
-"CopyTextureSubImage3D": exec_info(compatibility=45, core=31),
-"TextureParameterf": exec_info(compatibility=45, core=31),
-"TextureParameterfv": 

Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-08-24 Thread Emil Velikov
On Fri, 24 Aug 2018 at 04:23, Ilia Mirkin  wrote:
>
> This breaks the build for me. It selects python3 instead of python2,
> and gen_xmlpool.py bails out when trying to print \xf3 to stdout with
> a LANG=C locale. Revert until scripts are fixed and try again?
>
Sure, will revert in a moment. The concerning part is why meson "succeeds".

Having a look, it lacks the $(LANGS) argument when invoking gen_xmlpool.py.
And the .mo and .po files (on which LANGS is based) are missing
altogether in meson.

Mathieu, Dylan, can you look into this?
Once the meson build is updated, Ilia's concerns will become more obvious.

Thanks
Emil


[Mesa-dev] [Bug 107670] Massive slowdown under specific memcpy implementations (32bit, no-SIMD, backward copy).

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107670

--- Comment #4 from Grazvydas Ignotas  ---
What game/benchmark do you see this with?

Can you try calling _mesa_streaming_load_memcpy() there? It's for reading
uncached memory, but by the looks of it, it might be suitable for writing too.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2] glsl/linker: Allow unused in blocks which are not declared on previous stage

2018-08-24 Thread Alejandro Piñeiro
CCing Timothy just in case he still thinks that the original comment
should remain as it is. In any case, it looks good to me, so:

Reviewed-by: Alejandro Piñeiro 


On 23/08/18 12:12, vadym.shovkoplias wrote:
> From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:
>
> "Only the input variables that are actually read need to be written
>  by the previous stage; it is allowed to have superfluous
>  declarations of input variables."
>
> Fixes:
> * interstage-multiple-shader-objects.shader_test
>
> v2:
>   Update comment in ir.h since the usage of "used" field
>   has been extended.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247
> Signed-off-by: Vadym Shovkoplias 
> ---
>  src/compiler/glsl/ir.h  | 4 ++--
>  src/compiler/glsl/link_interface_blocks.cpp | 8 +++-
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
> index 67b38f48ef..d05d1998a5 100644
> --- a/src/compiler/glsl/ir.h
> +++ b/src/compiler/glsl/ir.h
> @@ -667,8 +667,8 @@ public:
> * variable has been used.  For example, it is an error to redeclare a
> * variable as invariant after it has been used.
> *
> -   * This is only maintained in the ast_to_hir.cpp path, not in
> -   * Mesa's fixed function or ARB program paths.
> +   * This is maintained in the ast_to_hir.cpp path and during linking,
> +   * but not in Mesa's fixed function or ARB program paths.
> */
>unsigned used:1;
>  
> diff --git a/src/compiler/glsl/link_interface_blocks.cpp 
> b/src/compiler/glsl/link_interface_blocks.cpp
> index e5eca9460e..801fbcd5d9 100644
> --- a/src/compiler/glsl/link_interface_blocks.cpp
> +++ b/src/compiler/glsl/link_interface_blocks.cpp
> @@ -417,9 +417,15 @@ validate_interstage_inout_blocks(struct 
> gl_shader_program *prog,
> * write to any of the pre-defined outputs (e.g. if the vertex shader
> * does not write to gl_Position, etc), which is allowed and results in
> * undefined behavior.
> +   *
> +   * From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:
> +   *
> +   *"Only the input variables that are actually read need to be 
> written
> +   * by the previous stage; it is allowed to have superfluous
> +   * declarations of input variables."
> */
>if (producer_def == NULL &&
> -  !is_builtin_gl_in_block(var, consumer->Stage)) {
> +  !is_builtin_gl_in_block(var, consumer->Stage) && var->data.used) {
>   linker_error(prog, "Input block `%s' is not an output of "
>"the previous stage\n", 
> var->get_interface_type()->name);
>   return;

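The condition changed by the quoted hunk can be modelled on its own. This is a minimal sketch, not the linker's code: struct input_block and input_block_is_link_error() are hypothetical, with two bool fields standing in for is_builtin_gl_in_block() and var->data.used, and a possibly-NULL pointer standing in for producer_def.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a consumer-stage input block. */
struct input_block {
   const char *name;
   bool is_builtin_gl_in; /* stands in for is_builtin_gl_in_block() */
   bool used;             /* stands in for var->data.used */
};

/* Returns true when linking should fail: the consumer reads an input block
 * that the previous stage does not declare. With the patch, an unused
 * (superfluous) declaration no longer triggers the error. */
bool input_block_is_link_error(const struct input_block *consumer_block,
                               const void *producer_def)
{
   return producer_def == NULL &&
          !consumer_block->is_builtin_gl_in &&
          consumer_block->used;
}
```

In this model, a declared-but-unread block with no matching producer output passes, which matches the GLSL 1.50 wording quoted in the patch; the same block marked as used still fails.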


[Mesa-dev] [Bug 106231] llvmpipe blends produce bad code after llvm patch https://reviews.llvm.org/D44785

2018-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106231

Roland Scheidegger  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Roland Scheidegger  ---
Fixed by 8e1be9a34ac8ce6f115eaf2ab0d99b6a0ce37630.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] docs/relnotes: Mark NV_fragment_shader_interlock support in i965

2018-08-24 Thread kevin . rogovin
From: Kevin Rogovin 

---
 docs/relnotes/18.3.0.html | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/relnotes/18.3.0.html b/docs/relnotes/18.3.0.html
index 594b0624a5..afcb044817 100644
--- a/docs/relnotes/18.3.0.html
+++ b/docs/relnotes/18.3.0.html
@@ -59,6 +59,7 @@ Note: some of the new features are only available with certain drivers.
 GL_EXT_vertex_attrib_64bit on i965, nvc0, radeonsi.
 GL_EXT_window_rectangles on radeonsi.
 GL_KHR_texture_compression_astc_sliced_3d on radeonsi.
+GL_NV_fragment_shader_interlock on i965.
 
 
 Bug fixes
-- 
2.17.1


