date:20160614

Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled

2016-06-14 Thread Axel Davy


On 15/06/2016 03:04, Roland Scheidegger wrote:

Am 15.06.2016 um 01:08 schrieb Axel Davy:

On 15/06/2016 00:21, Roland Scheidegger wrote:

Am 14.06.2016 um 23:33 schrieb Axel Davy:

diff --git a/src/gallium/include/pipe/p_state.h
b/src/gallium/include/pipe/p_state.h
index 396f563..7dce80a 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -139,6 +139,13 @@ struct pipe_rasterizer_state
  unsigned clip_halfz:1;
/**
+* When true do not scale offset_units and use same rules for unorm and
+* float depth buffers (D3D9). When false use GL/D3D1X behaviour.
+* This depends on PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.
+*/
+   unsigned offset_units_unscaled;
+
+   /**
   * Enable bits for clipping half-spaces.
   * This applies to both user clip planes and shader clip distances.
   * Note that if the bound shader exports any clip distances, these


I don't like this. Generally, for unorm formats, you can easily enough
translate this from d3d9 to gl (or d3d10) rules (but yes, obviously it's
going to be format dependent). (With one big caveat: in general, not all
gl drivers agree on the minimum resolvable difference. It might range
from 2^-22 to 2^-24 for 24bit unorm depth, for instance, and I don't
think it's quite consistent across gallium drivers either.)

You are right, though, that for float depth the formula is different and
you can't translate it. But do you really need float depth buffer support?
AFAIK no d3d9 app really depends on it, everything can fall back to d24.

Roland
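
For reference, the unorm translation discussed above can be sketched in C. This is only an illustration under stated assumptions: the function names are made up for this note (they are not Gallium or Nine API), the GL rule is taken as o = m * factor + r * units with r the minimum resolvable difference, the D3D9 rule as o = m * slope_scale + bias, and r is assumed to be exactly 2^-bits, which per the caveat above does not hold for every driver.

```c
#include <assert.h>

/* GL / D3D10 rule for unorm depth: o = m * factor + r * units,
 * where r (mrd) is the minimum resolvable difference. */
double gl_offset(double max_slope, double factor, double units, double mrd)
{
    return max_slope * factor + mrd * units;
}

/* D3D9 rule: the bias is added as-is, without scaling. */
double d3d9_offset(double max_slope, double slope_scale, double bias)
{
    return max_slope * slope_scale + bias;
}

/* Hypothetical helper: convert a D3D9 bias into GL offset units.
 * Assumption: r == 2^-depth_bits for a unorm depth buffer. */
double d3d9_bias_to_gl_units(double bias, unsigned depth_bits)
{
    assert(depth_bits >= 1 && depth_bits <= 32);
    double mrd = 1.0 / (double)(1ULL << depth_bits);
    return bias / mrd;
}
```

For a float depth buffer, r in the GL/D3D10 rule depends on the exponent of the primitive's depth values, so no fixed scale factor of this kind exists.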


Hi,


That's true, float depth buffers do not seem to be widely used in d3d9.

The two float depth buffers available in d3d9, as far as I know, are
D32F_LOCKABLE and D24FS8.

We can see the support for those and other depth buffers here (note that
these are mainly old cards):

http://zp.amsnet.pl/cdragan/query.php?dxversion=9&feature=formats&featuregroup=selected&adaptergroup=all&featureselected[]=45&featureselected[]=44&featureselected[]=41&featureselected[]=42&featureselected[]=43&featureselected[]=40&featureselected[]=39&featureselected[]=46&resource=SURFACE&usage=DEPTHSTENCIL&orientation=horizontal


It is likely not a requirement for any game to support these formats.


We could ignore these formats, and add to gallium a way to get the
minimum resolvable difference per depth buffer format from drivers. We
considered this option.


That said, the driver is the best location to know about the minimum
resolvable difference, and we made the choice to let the driver do the
scaling instead of doing it based on some driver query in the state
tracker.

As for floating point depth buffer behaviour, I understand it may be
harder to implement for some drivers than for others.

However, that doesn't seem to be a reason to drop floating depth buffer
support in Gallium Nine. D32F_LOCKABLE is particularly useful for
debugging: being lockable, it can be used to show depth buffer content
after some draw calls for d3d on windows, and compare with nine. And some
apps may use it for some particular effects.

I'd be ok if we make the float depth buffer part of
offset_units_unscaled optional, given how rarely the combination float
depth buffers + depth bias is likely to be used. However, if hw can do
it, I see no reason why we wouldn't support the capability.

On second look, it doesn't really look too bad (and fwiw we actually
could probably put it to use here if we'd support it in llvmpipe).
Albeit,
unsigned offset_units_unscaled;
needs to be
unsigned offset_units_unscaled:1;

Good catch, this was the reason I had put it in this place of the structure,
but somehow forgot the :1 ...


I'm just very sceptical when it comes to capabilities solely to the
benefit of fringe state trackers (and everything not st/mesa counts
here). It usually means driver authors aren't going to bother. And you
probably can't implement it in all drivers yourselves even if the hw
could do it.

That said, I'm ok with this if there are no objections from others.

Roland



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2

2016-06-14 Thread Martin Peres


On 14/06/16 18:47, Martin Peres wrote:

On 14/06/16 17:58, Juha-Pekka Heikkila wrote:

Here is a fixed version of this ralloc set. I got to run this on many
different machines thanks to Mark Janes, and no regressions showed up
on different gen hw. On my IVB I've also been running many different
traces with Apitrace under Valgrind, and Valgrind seemed happy with my
changes.

As a performance test I did shader-db compile runs 10 times and compared
the timing results against what Mesa master does on my IVB. To my surprise
this brings a reasonable gain which also seems to be repeatable: on my
IVB shader compile time is around 5% faster with these changes.


On my SKL gt2, I only get a 0.35% improvement (10 runs also).

Ministat says that there is no difference proven at 95% confidence and
0.35% at 90%. Adding 100 more runs overnight, we'll see what we get in
the end.


With n=110: Difference at 95.0% confidence
-0.133665% +/- 0.0945651%
(Student's t, pooled s = 0.447829)

Martin
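
The ministat result above comes from a two-sample Student's t comparison with a pooled standard deviation. A minimal sketch of that kind of test in C follows, assuming the caller supplies the critical t value for the chosen confidence level and degrees of freedom. All function names here are hypothetical. To stay free of libm, the check compares squared quantities instead of taking a square root.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Sum of squared deviations from the mean m. */
double sum_sq_dev(const double *x, size_t n, double m)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += (x[i] - m) * (x[i] - m);
    return sum;
}

/* Returns 1 if the two means differ by more than the confidence
 * half-width t_crit * s_pooled * sqrt(1/na + 1/nb), else 0. */
int means_differ(const double *a, size_t na,
                 const double *b, size_t nb, double t_crit)
{
    assert(na > 1 && nb > 1);
    double ma = sample_mean(a, na);
    double mb = sample_mean(b, nb);
    double pooled_var = (sum_sq_dev(a, na, ma) + sum_sq_dev(b, nb, mb)) /
                        (double)(na + nb - 2);
    double diff = ma - mb;
    double half_width_sq = t_crit * t_crit * pooled_var *
                           (1.0 / (double)na + 1.0 / (double)nb);
    return diff * diff > half_width_sq;
}
```

With only 10 runs per side the half-width is wide, so a small real difference can go undetected; adding runs narrows it.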

[Mesa-dev] [PATCH 14/18] nir/glsl: add double packing support to vs and fs

2016-06-14 Thread Timothy Arceri

---
 src/compiler/glsl/link_varyings.cpp | 16 +---
 src/compiler/nir/nir_lower_io.c | 16 
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 22dc2d8..7c0d93a 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -1992,10 +1992,11 @@ set_num_packed_components(struct gl_shader *shader, 
ir_variable_mode io_mode,
   var->type->without_array()->is_matrix())
  continue;
 
+  unsigned dfrac = var->type->without_array()->is_double() ? 2 : 1;
   if (var->type->is_array()) {
  const glsl_type *type = get_varying_type(var, shader->Stage);
  unsigned array_components = type->without_array()->vector_elements +
-var->data.location_frac;
+var->data.location_frac / dfrac;
  assert(type->arrays_of_arrays_size() + idx <=
 ARRAY_SIZE(num_components));
  for (unsigned i = idx; i < type->arrays_of_arrays_size(); i++) {
@@ -2003,7 +2004,7 @@ set_num_packed_components(struct gl_shader *shader, 
ir_variable_mode io_mode,
  }
   } else {
  unsigned comps = var->type->vector_elements +
-var->data.location_frac;
+var->data.location_frac / dfrac;
  num_components[idx] = MAX2(comps, num_components[idx]);
   }
}
@@ -2031,7 +2032,16 @@ set_num_packed_components(struct gl_shader *shader, 
ir_variable_mode io_mode,
 c = MAX2(c, num_components[i]);
  }
   } else {
- c = num_components[idx];
+ /* Handle special case of packing dvec3 with a double. The only
+  * valid scenario is packing a double in the 4th component of the
+  * double vector.
+  */
+ if (var->type->is_double() && var->type->vector_elements == 3 &&
+ num_components[idx+1] == 2) {
+c = 4;
+ } else {
+c = num_components[idx];
+ }
   }
   var->data.num_packed_components = c;
}
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index b966348..5566c83 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -104,6 +104,22 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
  if (locations[idx][var->data.index] == -1) {
 var->data.driver_location = location;
 locations[idx][var->data.index] = location;
+
+/* A dvec3 can be packed with a double we need special handling
+ * for this as we are packing across two locations.
+ */
+if (glsl_get_base_type(var->type) == GLSL_TYPE_DOUBLE &&
+glsl_get_vector_elements(var->type) == 3) {
+   /* Hack around type_size functions that expect vectors to be
+* padded out to vec4.
+*/
+   unsigned dsize = type_size(glsl_double_type());
+   unsigned offset =
+  dsize == type_size(glsl_float_type()) ? dsize : dsize * 2;
+
+   locations[idx + 1][var->data.index] = location + offset;
+}
+
 location += type_size(var->type) +
calc_type_size_offset(var->data.num_packed_components,
  var->type, type_size);
-- 
2.5.5
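
A note on the dfrac arithmetic in the link_varyings.cpp hunks above, as a standalone C sketch. It assumes location_frac counts 32-bit components within a vec4 slot and that a double occupies two of them. The function name is hypothetical and the code is a simplification, not the actual Mesa implementation.

```c
#include <assert.h>

/* Components consumed in one slot, counted in units of the variable's
 * own element size. location_frac is in 32-bit components, so for
 * doubles it is halved before being added. */
unsigned packed_components(unsigned vector_elements,
                           unsigned location_frac, int is_double)
{
    assert(location_frac < 4);
    unsigned dfrac = is_double ? 2 : 1;
    return vector_elements + location_frac / dfrac;
}
```

Under these assumptions a double placed at component 2 yields 1 + 2 / 2 = 2, which is the value the dvec3 special case compares num_components[idx+1] against.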


[Mesa-dev] [PATCH 11/18] i965: add indirect packing support to gs load inputs

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 4eaf5ea..75737c1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2135,14 +2135,26 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
   } else {
  /* Indirect indexing - use per-slot offsets as well. */
  const fs_reg srcs[] = { icp_handle, indirect_offset };
+ unsigned read_components = num_components + first_component;
+ fs_reg tmp = bld.vgrf(dst.type, read_components);
  fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
  bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
-
- inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp_dst, 
payload);
+ if (first_component != 0) {
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
+payload);
+inst->regs_written = read_components;
+for (unsigned i = 0; i < num_components; i++) {
+   bld.MOV(offset(tmp_dst, bld, i),
+   offset(tmp, bld, i + first_component));
+}
+ } else {
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp_dst,
+ payload);
+inst->regs_written = num_components * type_sz(tmp_dst.type) / 4;
+ }
  inst->offset = base_offset;
  inst->base_mrf = -1;
  inst->mlen = 2;
- inst->regs_written = num_components * type_sz(tmp_dst.type) / 4;
   }
 
   if (type_sz(dst.type) == 8) {
-- 
2.5.5


[Mesa-dev] [PATCH 17/18] i965: enable ARB_enhanced_layouts for gen8+

2016-06-14 Thread Timothy Arceri

---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 5be4787..d61692d 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -388,6 +388,7 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (brw->gen >= 8) {
+  ctx->Extensions.ARB_enhanced_layouts = true;
   ctx->Extensions.ARB_shader_precision = true;
   ctx->Extensions.ARB_stencil_texturing = true;
   ctx->Extensions.ARB_texture_stencil8 = true;
-- 
2.5.5


[Mesa-dev] [PATCH 08/18] nir: add glsl_dvec_type() helper

2016-06-14 Thread Timothy Arceri

---
 src/compiler/nir_types.cpp | 6 ++
 src/compiler/nir_types.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 4ea7a2f..835d53b 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -257,6 +257,12 @@ glsl_vec_type(unsigned n)
 }
 
 const glsl_type *
+glsl_dvec_type(unsigned n)
+{
+   return glsl_type::dvec(n);
+}
+
+const glsl_type *
 glsl_vec4_type(void)
 {
return glsl_type::vec4_type;
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index 7d9917f..f7147a9 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -118,6 +118,7 @@ bool glsl_sampler_type_is_array(const struct glsl_type 
*type);
 const struct glsl_type *glsl_void_type(void);
 const struct glsl_type *glsl_float_type(void);
 const struct glsl_type *glsl_vec_type(unsigned n);
+const struct glsl_type *glsl_dvec_type(unsigned n);
 const struct glsl_type *glsl_vec4_type(void);
 const struct glsl_type *glsl_int_type(void);
 const struct glsl_type *glsl_uint_type(void);
-- 
2.5.5


[Mesa-dev] [PATCH 13/18] nir: add glsl_double_type() helper

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir_types.cpp | 6 ++
 src/compiler/nir_types.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 835d53b..f694a84 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -251,6 +251,12 @@ glsl_float_type(void)
 }
 
 const glsl_type *
+glsl_double_type(void)
+{
+   return glsl_type::double_type;
+}
+
+const glsl_type *
 glsl_vec_type(unsigned n)
 {
return glsl_type::vec(n);
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index f7147a9..6b4f646 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -117,6 +117,7 @@ bool glsl_sampler_type_is_array(const struct glsl_type 
*type);
 
 const struct glsl_type *glsl_void_type(void);
 const struct glsl_type *glsl_float_type(void);
+const struct glsl_type *glsl_double_type(void);
 const struct glsl_type *glsl_vec_type(unsigned n);
 const struct glsl_type *glsl_dvec_type(unsigned n);
 const struct glsl_type *glsl_vec4_type(void);
-- 
2.5.5


[Mesa-dev] [PATCH 07/18] i965: add component packing support for tcs

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 6033e5e..587549f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2680,6 +2680,9 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
   fs_reg tmp =
  fs_reg(VGRF, alloc.allocate(2 * iter_components), value.type);
 
+  unsigned first_component = nir_intrinsic_component(instr);
+  mask = mask << first_component;
+
   for (unsigned iter = 0; iter < num_iterations; iter++) {
  if (!is_64bit && mask != WRITEMASK_XYZW) {
 srcs[header_regs++] = brw_imm_ud(mask << 16);
@@ -2717,11 +2720,12 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder 
&bld,
  }
 
  for (unsigned i = 0; i < iter_components; i++) {
-if (!(mask & (1 << i)))
+if (!(mask & (1 << (i + first_component))))
continue;
 
 if (!is_64bit) {
-   srcs[header_regs + i] = offset(value, bld, BRW_GET_SWZ(swiz, 
i));
+   srcs[header_regs + i + first_component] =
+  offset(value, bld, BRW_GET_SWZ(swiz, i));
 } else {
/* We need to shuffle the 64-bit data to match the layout
 * expected by our 32-bit URB write messages. We use a temporary
@@ -2744,7 +2748,8 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
  }
 
  unsigned mlen =
-header_regs + (is_64bit ? 2 * iter_components : iter_components);
+header_regs + (is_64bit ? 2 * iter_components : iter_components) +
+first_component;
  fs_reg payload =
 bld.vgrf(BRW_REGISTER_TYPE_UD, mlen);
  bld.LOAD_PAYLOAD(payload, srcs, mlen, header_regs);
-- 
2.5.5


[Mesa-dev] [PATCH 16/18] i965: add double packing support to tess stages

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 9f890ca..bd37a51 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2407,8 +2407,10 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
*/
   unsigned num_iterations = 1;
   unsigned num_components = instr->num_components;
+  unsigned first_component = nir_intrinsic_component(instr);
   fs_reg orig_dst = dst;
   if (type_sz(dst.type) == 8) {
+ first_component = first_component / 2;
  if (instr->num_components > 2) {
 num_iterations = 2;
 num_components = 2;
@@ -2418,7 +2420,6 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
  dst = tmp;
   }
 
-  unsigned first_component = nir_intrinsic_component(instr);
   for (unsigned iter = 0; iter < num_iterations; iter++) {
  if (indirect_offset.file == BAD_FILE) {
 /* Constant indexing - use global offset. */
@@ -2459,7 +2460,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
 inst->mlen = 2;
  }
  inst->regs_written =
-(num_components * type_sz(dst.type) / 4) + first_component;
+((num_components + first_component) * type_sz(dst.type) / 4);
 
  /* If we are reading 64-bit data using 32-bit read messages we need
   * build proper 64-bit data elements by shuffling the low and high
@@ -2720,9 +2721,13 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
*/
   unsigned num_iterations = 1;
   unsigned iter_components = num_components;
-  if (is_64bit && instr->num_components > 2) {
- num_iterations = 2;
- iter_components = 2;
+  unsigned first_component = nir_intrinsic_component(instr);
+  if (is_64bit) {
+ first_component = first_component / 2;
+ if (instr->num_components > 2) {
+num_iterations = 2;
+iter_components = 2;
+ }
   }
 
   /* 64-bit data needs to me shuffled before we can write it to the URB.
@@ -2732,7 +2737,6 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
   fs_reg tmp =
  fs_reg(VGRF, alloc.allocate(2 * iter_components), value.type);
 
-  unsigned first_component = nir_intrinsic_component(instr);
   mask = mask << first_component;
 
   for (unsigned iter = 0; iter < num_iterations; iter++) {
@@ -2794,14 +2798,15 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder 
&bld,
unsigned idx = 2 * i;
bld.MOV(dest, offset(tmp, bld, idx));
bld.MOV(offset(dest, bld, 1), offset(tmp, bld, idx + 1));
-   srcs[header_regs + idx] = dest;
-   srcs[header_regs + idx + 1] = offset(dest, bld, 1);
+   srcs[header_regs + idx + first_component * 2] = dest;
+   srcs[header_regs + idx + 1 + first_component * 2] =
+  offset(dest, bld, 1);
 }
  }
 
  unsigned mlen =
 header_regs + (is_64bit ? 2 * iter_components : iter_components) +
-first_component;
+(is_64bit ? 2 * first_component : first_component);
  fs_reg payload =
 bld.vgrf(BRW_REGISTER_TYPE_UD, mlen);
  bld.LOAD_PAYLOAD(payload, srcs, mlen, header_regs);
@@ -2898,6 +2903,10 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld,
   unsigned imm_offset = instr->const_index[0];
   unsigned first_component = nir_intrinsic_component(instr);
 
+  if (type_sz(dest.type) == 8) {
+ first_component = first_component / 2;
+  }
+
   fs_inst *inst;
   if (indirect_offset.file == BAD_FILE) {
  /* Arbitrarily only push up to 32 vec4 slots worth of data,
-- 
2.5.5
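
The regs_written change in this patch can be checked with a small standalone C sketch. It assumes the NIR component index is given in 32-bit units, so it is halved for 64-bit types, and that the result is counted in 32-bit units because of the division by 4. The function name is hypothetical and nothing here is i965 API.

```c
#include <assert.h>

/* Size of a URB read that starts at a component offset.
 * type_size is the element size in bytes (4 or 8). */
unsigned urb_read_regs_written(unsigned num_components,
                               unsigned component, unsigned type_size)
{
    assert(type_size == 4 || type_size == 8);
    /* For doubles, convert the 32-bit component index to a 64-bit one. */
    unsigned first_component = type_size == 8 ? component / 2 : component;
    return (num_components + first_component) * type_size / 4;
}
```

For a single double at component 2 this gives (1 + 1) * 8 / 4 = 4, whereas the older expression (num_components * type_sz / 4) + first_component would give 3 after the halving.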


[Mesa-dev] [PATCH 06/18] i965: add component packing support for tes

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 38 +++-
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 6d695f1..6033e5e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2405,10 +2405,21 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder 
&bld,
  dst = tmp;
   }
 
+  unsigned first_component = nir_intrinsic_component(instr);
   for (unsigned iter = 0; iter < num_iterations; iter++) {
  if (indirect_offset.file == BAD_FILE) {
 /* Constant indexing - use global offset. */
-inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle);
+if (first_component != 0) {
+   unsigned read_components = num_components + first_component;
+   fs_reg tmp = bld.vgrf(dst.type, read_components);
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, icp_handle);
+   for (unsigned i = 0; i < num_components; i++) {
+  bld.MOV(offset(dst, bld, i),
+  offset(tmp, bld, i + first_component));
+   }
+} else {
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle);
+}
 inst->offset = imm_offset;
 inst->mlen = 1;
 inst->base_mrf = -1;
@@ -2423,7 +2434,8 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
 inst->base_mrf = -1;
 inst->mlen = 2;
  }
- inst->regs_written = num_components * type_sz(dst.type) / 4;
+ inst->regs_written =
+(num_components * type_sz(dst.type) / 4) + first_component;
 
  /* If we are reading 64-bit data using 32-bit read messages we need
   * build proper 64-bit data elements by shuffling the low and high
@@ -2827,6 +2839,7 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld,
case nir_intrinsic_load_per_vertex_input: {
   fs_reg indirect_offset = get_indirect_offset(instr);
   unsigned imm_offset = instr->const_index[0];
+  unsigned first_component = nir_intrinsic_component(instr);
 
   fs_inst *inst;
   if (indirect_offset.file == BAD_FILE) {
@@ -2837,7 +2850,8 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld,
  if (imm_offset < max_push_slots) {
 fs_reg src = fs_reg(ATTR, imm_offset / 2, dest.type);
 for (int i = 0; i < instr->num_components; i++) {
-   unsigned comp = 16 / type_sz(dest.type) * (imm_offset % 2) + i;
+   unsigned comp = 16 / type_sz(dest.type) * (imm_offset % 2) +
+  i + first_component;
bld.MOV(offset(dest, bld, i), component(src, comp));
 }
 tes_prog_data->base.urb_read_length =
@@ -2851,11 +2865,25 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder 
&bld,
 fs_reg patch_handle = bld.vgrf(BRW_REGISTER_TYPE_UD, 1);
 bld.LOAD_PAYLOAD(patch_handle, srcs, ARRAY_SIZE(srcs), 0);
 
-inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dest, patch_handle);
+if (first_component != 0) {
+   unsigned read_components =
+  instr->num_components + first_component;
+   fs_reg tmp = bld.vgrf(dest.type, read_components);
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp,
+   patch_handle);
+   inst->regs_written = read_components;
+   for (unsigned i = 0; i < instr->num_components; i++) {
+  bld.MOV(offset(dest, bld, i),
+  offset(tmp, bld, i + first_component));
+   }
+} else {
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dest,
+   patch_handle);
+   inst->regs_written = instr->num_components;
+}
 inst->mlen = 1;
 inst->offset = imm_offset;
 inst->base_mrf = -1;
-inst->regs_written = instr->num_components;
  }
   } else {
  /* Indirect indexing - use per-slot offsets as well. */
-- 
2.5.5
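
The read-then-shift pattern used in the hunks above can be illustrated on plain arrays. This C sketch assumes a slot holds four 32-bit components. load_packed and SLOT_COMPONENTS are hypothetical names, and the memcpy only stands in for the URB read message.

```c
#include <assert.h>
#include <string.h>

enum { SLOT_COMPONENTS = 4 };

/* Read first_component + num_components values into a temporary, then
 * copy the wanted ones down so that they start at dst[0]. */
void load_packed(const float slot[SLOT_COMPONENTS],
                 unsigned first_component, unsigned num_components,
                 float *dst)
{
    unsigned read_components = num_components + first_component;
    assert(read_components <= SLOT_COMPONENTS);

    float tmp[SLOT_COMPONENTS];
    memcpy(tmp, slot, read_components * sizeof(float));

    for (unsigned i = 0; i < num_components; i++)
        dst[i] = tmp[i + first_component];
}
```

When first_component is 0 the temporary is unnecessary, which is why the patch keeps the direct read in the else branch.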


[Mesa-dev] [PATCH 18/18] docs: mark ARB_enhanced_layouts as DONE for i965

2016-06-14 Thread Timothy Arceri

---
 docs/GL3.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 0204695..b0573c8 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -193,11 +193,11 @@ GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE   DONE (all drivers)
   GL_ARB_buffer_storage DONE (i965, nv50, 
nvc0, r600, radeonsi)
   GL_ARB_clear_texture  DONE (i965, nv50, nvc0)
-  GL_ARB_enhanced_layouts   in progress (Timothy)
+  GL_ARB_enhanced_layouts   DONE (i965)
   - compile-time constant expressions   DONE
   - explicit byte offsets for blocksDONE
   - forced alignment within blocks  DONE
-  - specified vec4-slot component numbers   in progress
+  - specified vec4-slot component numbers   DONE (i965)
   - specified transform/feedback layout DONE
   - input/output block locationsDONE
   GL_ARB_multi_bind DONE (all drivers)
-- 
2.5.5


[Mesa-dev] [PATCH 12/18] i965: add component packing support for load_output intrinsics

2016-06-14 Thread Timothy Arceri

---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 38 +++-
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 75737c1..c18e7b6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2507,6 +2507,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
case nir_intrinsic_load_per_vertex_output: {
   fs_reg indirect_offset = get_indirect_offset(instr);
   unsigned imm_offset = instr->const_index[0];
+  unsigned first_component = nir_intrinsic_component(instr);
 
   fs_inst *inst;
   if (indirect_offset.file == BAD_FILE) {
@@ -2590,11 +2591,25 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder 
&bld,
 }
 bld.LOAD_PAYLOAD(dst, srcs, num_components, 0);
  } else {
-inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, patch_handle);
+if (first_component != 0) {
+   unsigned read_components =
+  instr->num_components + first_component;
+   fs_reg tmp = bld.vgrf(dst.type, read_components);
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp,
+   patch_handle);
+   inst->regs_written = read_components;
+   for (unsigned i = 0; i < instr->num_components; i++) {
+  bld.MOV(offset(dst, bld, i),
+  offset(tmp, bld, i + first_component));
+   }
+} else {
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst,
+   patch_handle);
+   inst->regs_written = instr->num_components;
+}
 inst->offset = imm_offset;
 inst->mlen = 1;
 inst->base_mrf = -1;
-inst->regs_written = instr->num_components;
  }
   } else {
  /* Indirect indexing - use per-slot offsets as well. */
@@ -2604,12 +2619,25 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder 
&bld,
  };
  fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
  bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
-
- inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload);
+ if (first_component != 0) {
+unsigned read_components =
+   instr->num_components + first_component;
+fs_reg tmp = bld.vgrf(dst.type, read_components);
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
+payload);
+inst->regs_written = read_components;
+for (unsigned i = 0; i < instr->num_components; i++) {
+   bld.MOV(offset(dst, bld, i),
+   offset(tmp, bld, i + first_component));
+}
+ } else {
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst,
+payload);
+inst->regs_written = instr->num_components;
+ }
  inst->offset = imm_offset;
  inst->mlen = 2;
  inst->base_mrf = -1;
- inst->regs_written = instr->num_components;
   }
   break;
}
-- 
2.5.5


[Mesa-dev] [PATCH 15/18] i965: add double support packing support to gs inputs

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index c18e7b6..9f890ca 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2110,6 +2110,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
   }
   fs_reg tmp = fs_reg(VGRF, alloc.allocate(4), dst.type);
   tmp_dst = tmp;
+  first_component = first_component / 2;
}
 
for (unsigned iter = 0; iter < num_iterations; iter++) {
@@ -2119,7 +2120,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
 unsigned read_components = num_components + first_component;
 fs_reg tmp = bld.vgrf(dst.type, read_components);
 inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, icp_handle);
-inst->regs_written = read_components;
+inst->regs_written = read_components * type_sz(tmp_dst.type) / 4;
 for (unsigned i = 0; i < num_components; i++) {
bld.MOV(offset(tmp_dst, bld, i),
offset(tmp, bld, i + first_component));
@@ -2142,7 +2143,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
  if (first_component != 0) {
 inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
 payload);
-inst->regs_written = read_components;
+inst->regs_written = read_components * type_sz(tmp_dst.type) / 4;
 for (unsigned i = 0; i < num_components; i++) {
bld.MOV(offset(tmp_dst, bld, i),
offset(tmp, bld, i + first_component));
-- 
2.5.5


[Mesa-dev] [PATCH 10/18] i965: add indirect packing support for tcs and tes

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 33 
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 587549f..4eaf5ea 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2428,8 +2428,19 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
 const fs_reg srcs[] = { icp_handle, indirect_offset };
 fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
 bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
-
-inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, 
payload);
+if (first_component != 0) {
+   unsigned read_components = num_components + first_component;
+   fs_reg tmp = bld.vgrf(dst.type, read_components);
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
+   payload);
+   for (unsigned i = 0; i < num_components; i++) {
+  bld.MOV(offset(dst, bld, i),
+  offset(tmp, bld, i + first_component));
+   }
+} else {
+   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst,
+   payload);
+}
 inst->offset = imm_offset;
 inst->base_mrf = -1;
 inst->mlen = 2;
@@ -2899,11 +2910,25 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder 
&bld,
  fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
  bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
 
- inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dest, payload);
+ if (first_component != 0) {
+unsigned read_components =
+instr->num_components + first_component;
+fs_reg tmp = bld.vgrf(dest.type, read_components);
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
+payload);
+inst->regs_written = read_components;
+for (unsigned i = 0; i < instr->num_components; i++) {
+   bld.MOV(offset(dest, bld, i),
+   offset(tmp, bld, i + first_component));
+}
+ } else {
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dest,
+payload);
+inst->regs_written = instr->num_components;
+ }
  inst->mlen = 2;
  inst->offset = imm_offset;
  inst->base_mrf = -1;
- inst->regs_written = instr->num_components;
   }
   break;
}
-- 
2.5.5


[Mesa-dev] [PATCH 01/18] nir: add new intrinsic field for storing component offset

2016-06-14 Thread Timothy Arceri

This offset is used for packing.

Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/nir.h|  6 ++
 src/compiler/nir/nir_intrinsics.h | 12 ++--
 src/compiler/nir/nir_lower_io.c   |  8 
 src/compiler/nir/nir_print.c  |  3 +++
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index ec7b0c7..d5e4733 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -987,6 +987,11 @@ typedef enum {
 */
NIR_INTRINSIC_BINDING = 7,
 
+   /**
+* Component offset.
+*/
+   NIR_INTRINSIC_COMPONENT = 8,
+
NIR_INTRINSIC_NUM_INDEX_FLAGS,
 
 } nir_intrinsic_index_flag;
@@ -1053,6 +1058,7 @@ INTRINSIC_IDX_ACCESSORS(ucp_id, UCP_ID, unsigned)
 INTRINSIC_IDX_ACCESSORS(range, RANGE, unsigned)
 INTRINSIC_IDX_ACCESSORS(desc_set, DESC_SET, unsigned)
 INTRINSIC_IDX_ACCESSORS(binding, BINDING, unsigned)
+INTRINSIC_IDX_ACCESSORS(component, COMPONENT, unsigned)
 
 /**
  * \group texture information
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index 6f86c9f..19df191 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -336,15 +336,15 @@ LOAD(uniform, 1, 2, BASE, RANGE, xx, 
NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC
 /* src[] = { buffer_index, offset }. No const_index */
 LOAD(ubo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { offset }. const_index[] = { base } */
-LOAD(input, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
+LOAD(input, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { vertex, offset }. const_index[] = { base } */
-LOAD(per_vertex_input, 2, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
+LOAD(per_vertex_input, 2, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE 
| NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { buffer_index, offset }. No const_index */
 LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
 /* src[] = { offset }. const_index[] = { base } */
-LOAD(output, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
+LOAD(output, 1, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
 /* src[] = { vertex, offset }. const_index[] = { base } */
-LOAD(per_vertex_output, 2, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
+LOAD(per_vertex_output, 2, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
 /* src[] = { offset }. const_index[] = { base } */
 LOAD(shared, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
 /* src[] = { offset }. const_index[] = { base, range } */
@@ -362,9 +362,9 @@ LOAD(push_constant, 1, 2, BASE, RANGE, xx,
INTRINSIC(store_##name, srcs, ARR(0, 1, 1, 1), false, 0, 0, num_indices, 
idx0, idx1, idx2, flags)
 
 /* src[] = { value, offset }. const_index[] = { base, write_mask } */
-STORE(output, 2, 2, BASE, WRMASK, xx, 0)
+STORE(output, 2, 3, BASE, WRMASK, COMPONENT, 0)
 /* src[] = { value, vertex, offset }. const_index[] = { base, write_mask } */
-STORE(per_vertex_output, 3, 2, BASE, WRMASK, xx, 0)
+STORE(per_vertex_output, 3, 3, BASE, WRMASK, COMPONENT, 0)
 /* src[] = { value, block_index, offset }. const_index[] = { write_mask } */
 STORE(ssbo, 3, 1, WRMASK, xx, xx, 0)
 /* src[] = { value, offset }. const_index[] = { base, write_mask } */
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index a839924..72f1b05 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -274,6 +274,10 @@ nir_lower_io_block(nir_block *block,
 
  nir_intrinsic_set_base(load,
 intrin->variables[0]->var->data.driver_location);
+ if (mode == nir_var_shader_in || mode == nir_var_shader_out) {
+nir_intrinsic_set_component(load,
+   intrin->variables[0]->var->data.location_frac);
+ }
 
  if (load->intrinsic == nir_intrinsic_load_uniform) {
 nir_intrinsic_set_range(load,
@@ -322,6 +326,10 @@ nir_lower_io_block(nir_block *block,
 
  nir_intrinsic_set_base(store,
 intrin->variables[0]->var->data.driver_location);
+ if (mode == nir_var_shader_out) {
+nir_intrinsic_set_component(store,
+   intrin->variables[0]->var->data.location_frac);
+ }
  nir_intrinsic_set_write_mask(store, nir_intrinsic_write_mask(intrin));
 
  if (per_vertex)
diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 36176ec..bca8a35 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -570,6 +570,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, 
print_state *state)
   [NIR_INTRINSIC_RANGE] = "range",
   [NIR_INTRINSIC_DESC_SET] = "desc-set",
   [NIR_INTRINSIC_BINDING] = "binding",
+  [NIR_INTRINSIC_COMPONENT] = "component",
};
for (unsigned idx = 1; idx < NIR_INTRINSIC_NUM_INDEX_FLAGS; idx++) {
   if (!info->index_map[id

[Mesa-dev] [PATCH 03/18] glsl/nir: add new num_packed_components field

2016-06-14 Thread Timothy Arceri

This will be used to store the total number of components used at this location
when packing via ARB_enhanced_layouts.
---
 src/compiler/glsl/glsl_to_nir.cpp   |  1 +
 src/compiler/glsl/ir.h  |  5 +++
 src/compiler/glsl/link_varyings.cpp | 74 -
 src/compiler/glsl/linker.cpp|  2 +
 src/compiler/glsl/linker.h  |  4 ++
 src/compiler/nir/nir.h  |  5 +++
 6 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index daf237e..0663c69 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -375,6 +375,7 @@ nir_visitor::visit(ir_variable *ir)
var->data.explicit_binding = ir->data.explicit_binding;
var->data.has_initializer = ir->data.has_initializer;
var->data.location_frac = ir->data.location_frac;
+   var->data.num_packed_components = ir->data.num_packed_components;
 
switch (ir->data.depth_layout) {
case ir_depth_layout_none:
diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
index 3629356..4248e62 100644
--- a/src/compiler/glsl/ir.h
+++ b/src/compiler/glsl/ir.h
@@ -763,6 +763,11 @@ public:
   unsigned location_frac:2;
 
   /**
+   * The total number of components packed into this location.
+   */
+  unsigned num_packed_components:4;
+
+  /**
* Layout of the matrix.  Uses glsl_matrix_layout values.
*/
   unsigned matrix_layout:2;
diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 534393a..22dc2d8 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -1972,6 +1972,70 @@ reserved_varying_slot(struct gl_shader *stage, 
ir_variable_mode io_mode)
return slots;
 }
 
+void
+set_num_packed_components(struct gl_shader *shader, ir_variable_mode io_mode,
+  unsigned base_offset)
+{
+   /* Find the max number of components used at this location */
+   unsigned num_components[MAX_VARYINGS_INCL_PATCH] = { 0 };
+
+   foreach_in_list(ir_instruction, node, shader->ir) {
+  ir_variable *const var = node->as_variable();
+
+  if (var == NULL || var->data.mode != io_mode ||
+  !var->data.explicit_location)
+ continue;
+
+  int idx = var->data.location - base_offset;
+  if (idx < 0 || idx >= MAX_VARYINGS_INCL_PATCH ||
+  var->type->without_array()->is_record() ||
+  var->type->without_array()->is_matrix())
+ continue;
+
+  if (var->type->is_array()) {
+ const glsl_type *type = get_varying_type(var, shader->Stage);
+ unsigned array_components = type->without_array()->vector_elements +
+var->data.location_frac;
+ assert(type->arrays_of_arrays_size() + idx <=
+ARRAY_SIZE(num_components));
+ for (unsigned i = idx; i < type->arrays_of_arrays_size(); i++) {
+num_components[i] = MAX2(array_components, num_components[i]);
+ }
+  } else {
+ unsigned comps = var->type->vector_elements +
+var->data.location_frac;
+ num_components[idx] = MAX2(comps, num_components[idx]);
+  }
+   }
+
+   foreach_in_list(ir_instruction, node, shader->ir) {
+  ir_variable *const var = node->as_variable();
+
+  if (var == NULL || var->data.mode != io_mode ||
+  !var->data.explicit_location)
+ continue;
+
+  int idx = var->data.location - base_offset;
+  if (idx < 0 || idx >= MAX_VARYINGS_INCL_PATCH ||
+  var->type->without_array()->is_record() ||
+  var->type->without_array()->is_matrix())
+ continue;
+
+  /* For arrays we need to check all elements in order to find the max
+   * number of components used.
+   */
+  unsigned c = 0;
+  if (var->type->is_array()) {
+ const glsl_type *type = get_varying_type(var, shader->Stage);
+ for (unsigned i = idx; i < type->arrays_of_arrays_size(); i++) {
+c = MAX2(c, num_components[i]);
+ }
+  } else {
+ c = num_components[idx];
+  }
+  var->data.num_packed_components = c;
+   }
+}
 
 /**
  * Assign locations for all variables that are produced in one pipeline stage
@@ -2087,11 +2151,17 @@ assign_varying_locations(struct gl_context *ctx,
 * 4. Mark input variables in the consumer that do not have locations as
 *not being inputs.  This lets the optimizer eliminate them.
 */
-   if (consumer)
+   if (consumer) {
   canonicalize_shader_io(consumer->ir, ir_var_shader_in);
+  set_num_packed_components(consumer, ir_var_shader_in,
+VARYING_SLOT_VAR0);
+   }
 
-   if (producer)
+   if (producer) {
   canonicalize_shader_io(producer->ir, ir_var_shader_out);
+  set_num_packed_components(producer, ir_var_shader_out,
+VARYING_SLOT_VAR0);
+   }
 
if (consumer)
   linke

[Mesa-dev] [PATCH 09/18] i965: add support for packing arrays

2016-06-14 Thread Timothy Arceri

Here we add a new helper function calc_type_size_offset() to help
calculate the size of a varying once packing is taken into account.
---
 src/compiler/nir/nir_lower_io.c | 55 +++--
 1 file changed, 48 insertions(+), 7 deletions(-)

diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index c25790a..b966348 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -41,6 +41,36 @@ struct lower_io_state {
nir_variable_mode modes;
 };
 
+/**
+ * Calculates the offset for a type by allowing for other components that are
+ * packed into the same location.
+ */
+static unsigned
+calc_type_size_offset(unsigned num_packed_components,
+  const struct glsl_type *type,
+  int (*type_size)(const struct glsl_type *))
+{
+   unsigned base_size;
+   const struct glsl_type *wa = glsl_without_array(type);
+   int comp_diff = num_packed_components - glsl_get_vector_elements(wa);
+
+   /* If there is no difference in component sizes or the type_size function
+* being used treats everything as a vec4 return.
+*/
+   if (comp_diff <= 0 ||
+   type_size(glsl_float_type()) == type_size(glsl_double_type()))
+  return 0;
+
+   if (glsl_get_base_type(wa) == GLSL_TYPE_DOUBLE) {
+  base_size = type_size(glsl_dvec_type(comp_diff));
+   } else {
+  base_size = type_size(glsl_vec_type(comp_diff));
+   }
+
+   return glsl_type_is_array(type) ? base_size * glsl_get_aoa_size(type) :
+  base_size;
+}
+
 void
 nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
  unsigned base_offset,
@@ -74,13 +104,17 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
  if (locations[idx][var->data.index] == -1) {
 var->data.driver_location = location;
 locations[idx][var->data.index] = location;
-location += type_size(var->type);
+location += type_size(var->type) +
+   calc_type_size_offset(var->data.num_packed_components,
+ var->type, type_size);
  } else {
 var->data.driver_location = locations[idx][var->data.index];
  }
   } else {
  var->data.driver_location = location;
- location += type_size(var->type);
+ location += type_size(var->type) +
+calc_type_size_offset(var->data.num_packed_components, var->type,
+  type_size);
   }
}
 
@@ -113,7 +147,8 @@ is_per_vertex_output(struct lower_io_state *state, 
nir_variable *var)
 static nir_ssa_def *
 get_io_offset(nir_builder *b, nir_deref_var *deref,
   nir_ssa_def **vertex_index,
-  int (*type_size)(const struct glsl_type *))
+  int (*type_size)(const struct glsl_type *),
+  unsigned num_packed_components)
 {
nir_deref *tail = &deref->deref;
 
@@ -141,7 +176,9 @@ get_io_offset(nir_builder *b, nir_deref_var *deref,
 
   if (tail->deref_type == nir_deref_type_array) {
  nir_deref_array *deref_array = nir_deref_as_array(tail);
- unsigned size = type_size(tail->type);
+ unsigned size = type_size(tail->type) +
+calc_type_size_offset(num_packed_components, tail->type,
+  type_size);
 
  offset = nir_iadd(b, offset,
nir_imm_int(b, size * deref_array->base_offset));
@@ -289,7 +326,9 @@ nir_lower_io_block(nir_block *block,
 
  offset = get_io_offset(b, intrin->variables[0],
 per_vertex ? &vertex_index : NULL,
-state->type_size);
+state->type_size,
+intrin->variables[0]->var->
+   data.num_packed_components);
 
  nir_intrinsic_instr *load =
 nir_intrinsic_instr_create(state->mem_ctx,
@@ -339,7 +378,9 @@ nir_lower_io_block(nir_block *block,
 
  offset = get_io_offset(b, intrin->variables[0],
 per_vertex ? &vertex_index : NULL,
-state->type_size);
+state->type_size,
+intrin->variables[0]->var->
+   data.num_packed_components);
 
  nir_intrinsic_instr *store =
 nir_intrinsic_instr_create(state->mem_ctx,
@@ -381,7 +422,7 @@ nir_lower_io_block(nir_block *block,
  nir_ssa_def *offset;
 
  offset = get_io_offset(b, intrin->variables[0],
-NULL, state->type_size);
+NULL, state->type_size, 0);
 
  nir_intrinsic_instr *atomic =
 nir_intrinsic_instr_create(state->mem_ctx,
-- 
2.5.5
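
The arithmetic in calc_type_size_offset() can be tried in isolation. The sketch below is a simplified model, not the NIR code: struct toy_type, toy_size_scalar() and toy_size_vec4() are hypothetical stand-ins for glsl_type and the driver's type_size callbacks, and the vec4 callback deliberately ignores doubles taking two slots. It keeps the two decisions from the patch: return 0 when nothing else is packed into the location or when the callback counts whole vec4 slots, otherwise return the size of the extra components, multiplied by the array length for arrays.

```c
#include <stdbool.h>

/* Hypothetical, much reduced model of a GLSL type. */
struct toy_type {
   unsigned vector_elements; /* 1..4 */
   bool is_double;
   unsigned array_len;       /* 0 means "not an array" */
};

typedef int (*type_size_fn)(const struct toy_type *);

/* Counts 32-bit scalar components, like a scalar backend would. */
static int
toy_size_scalar(const struct toy_type *t)
{
   int n = (int)t->vector_elements * (t->is_double ? 2 : 1);
   return t->array_len ? n * (int)t->array_len : n;
}

/* Counts whole slots: every vector takes one slot (simplification). */
static int
toy_size_vec4(const struct toy_type *t)
{
   return t->array_len ? (int)t->array_len : 1;
}

static unsigned
toy_type_size_offset(unsigned num_packed_components,
                     const struct toy_type *type, type_size_fn type_size)
{
   const struct toy_type f = { 1, false, 0 };
   const struct toy_type d = { 1, true, 0 };
   int comp_diff = (int)num_packed_components - (int)type->vector_elements;

   /* Nothing extra is packed here, or the callback treats float and
    * double alike, which is how the patch detects vec4-style sizing. */
   if (comp_diff <= 0 || type_size(&f) == type_size(&d))
      return 0;

   /* Size of the components that belong to other variables. */
   const struct toy_type pad = { (unsigned)comp_diff, type->is_double, 0 };
   unsigned base_size = (unsigned)type_size(&pad);

   return type->array_len ? base_size * type->array_len : base_size;
}
```

With a vec2[3] that shares its locations with two more components (num_packed_components = 4), the scalar callback gives 6 for the array and the helper adds another 6, so the variable advances the location counter by 12, i.e. three full vec4s.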


[Mesa-dev] [PATCH 04/18] i965: enable component packing for vs and fs

2016-06-14 Thread Timothy Arceri

---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 20 
 src/mesa/drivers/dri/i965/brw_fs.h   |  5 +++--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 29 -
 3 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8774f25..1fdb654 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1127,7 +1127,8 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, 
const char *name,
const glsl_type *type,
glsl_interp_qualifier 
interpolation_mode,
int *location, bool mod_centroid,
-   bool mod_sample)
+   bool mod_sample,
+   unsigned num_packed_components)
 {
assert(stage == MESA_SHADER_FRAGMENT);
brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data;
@@ -1149,22 +1150,26 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, 
const char *name,
 
   for (unsigned i = 0; i < length; i++) {
  emit_general_interpolation(attr, name, elem_type, interpolation_mode,
-location, mod_centroid, mod_sample);
+location, mod_centroid, mod_sample,
+num_packed_components);
   }
} else if (type->is_record()) {
   for (unsigned i = 0; i < type->length; i++) {
  const glsl_type *field_type = type->fields.structure[i].type;
  emit_general_interpolation(attr, name, field_type, interpolation_mode,
-location, mod_centroid, mod_sample);
+location, mod_centroid, mod_sample,
+num_packed_components);
   }
} else {
   assert(type->is_scalar() || type->is_vector());
+  unsigned num_components = num_packed_components ?
+ num_packed_components : type->vector_elements;
 
   if (prog_data->urb_setup[*location] == -1) {
  /* If there's no incoming setup data for this slot, don't
   * emit interpolation for it.
   */
- *attr = offset(*attr, bld, type->vector_elements);
+ *attr = offset(*attr, bld, num_components);
  (*location)++;
  return;
   }
@@ -1176,7 +1181,6 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, 
const char *name,
   * handed us defined values in only the constant offset
   * field of the setup reg.
   */
- unsigned vector_elements = type->vector_elements;
 
  /* Data starts at suboffet 3 in 32-bit units (12 bytes), so it is not
   * 64-bit aligned and the current implementation fails to read the
@@ -1184,10 +1188,10 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, 
const char *name,
   * read it as vector of floats with twice the number of components.
   */
  if (attr->type == BRW_REGISTER_TYPE_DF) {
-vector_elements *= 2;
+num_components *= 2;
 attr->type = BRW_REGISTER_TYPE_F;
  }
- for (unsigned int i = 0; i < vector_elements; i++) {
+ for (unsigned int i = 0; i < num_components; i++) {
 struct brw_reg interp = interp_reg(*location, i);
 interp = suboffset(interp, 3);
 interp.type = attr->type;
@@ -1196,7 +1200,7 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, 
const char *name,
  }
   } else {
  /* Smooth/noperspective interpolation case. */
- for (unsigned int i = 0; i < type->vector_elements; i++) {
+ for (unsigned int i = 0; i < num_components; i++) {
 struct brw_reg interp = interp_reg(*location, i);
 if (devinfo->needs_unlit_centroid_workaround && mod_centroid) {
/* Get the pixel/sample mask into f0 so that we know
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 4237197..fc85206 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -181,7 +181,7 @@ public:
const glsl_type *type,
glsl_interp_qualifier interpolation_mode,
int *location, bool mod_centroid,
-   bool mod_sample);
+   bool mod_sample, unsigned num_components);
fs_reg *emit_vs_system_value(int location);
void emit_interpolation_setup_gen4();
void emit_interpolation_setup_gen6();
@@ -200,7 +200,8 @@ public:
void emit_nir_code();
void nir_setup_inputs();
void nir_setup_single_output_varying(fs_reg *reg, const glsl_type *type,
-unsigned

[Mesa-dev] [PATCH 05/18] i965: add component packing support for gs

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 22 ++
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index fc85206..0c72802 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -266,7 +266,7 @@ public:
void emit_gs_thread_end();
void emit_gs_input_load(const fs_reg &dst, const nir_src &vertex_src,
unsigned base_offset, const nir_src &offset_src,
-   unsigned num_components);
+   unsigned num_components, unsigned first_component);
void emit_cs_terminate();
fs_reg *emit_cs_work_group_id_setup();
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index b90cc8b..6d695f1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1980,7 +1980,8 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
const nir_src &vertex_src,
unsigned base_offset,
const nir_src &offset_src,
-   unsigned num_components)
+   unsigned num_components,
+   unsigned first_component)
 {
struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) 
prog_data;
 
@@ -2114,11 +2115,23 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
for (unsigned iter = 0; iter < num_iterations; iter++) {
   if (offset_const) {
  /* Constant indexing - use global offset. */
- inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp_dst, icp_handle);
+ if (first_component != 0) {
+unsigned read_components = num_components + first_component;
+fs_reg tmp = bld.vgrf(dst.type, read_components);
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, icp_handle);
+inst->regs_written = read_components;
+for (unsigned i = 0; i < num_components; i++) {
+   bld.MOV(offset(tmp_dst, bld, i),
+   offset(tmp, bld, i + first_component));
+}
+ } else {
+inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp_dst,
+icp_handle);
+inst->regs_written = num_components * type_sz(tmp_dst.type) / 4;
+ }
  inst->offset = base_offset + offset_const->u32[0];
  inst->base_mrf = -1;
  inst->mlen = 1;
- inst->regs_written = num_components * type_sz(tmp_dst.type) / 4;
   } else {
  /* Indirect indexing - use per-slot offsets as well. */
  const fs_reg srcs[] = { icp_handle, indirect_offset };
@@ -2891,7 +2904,8 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder &bld,
 
case nir_intrinsic_load_per_vertex_input:
   emit_gs_input_load(dest, instr->src[0], instr->const_index[0],
- instr->src[1], instr->num_components);
+ instr->src[1], instr->num_components,
+ nir_intrinsic_component(instr));
   break;
 
case nir_intrinsic_emit_vertex_with_counter:
-- 
2.5.5


[Mesa-dev] [PATCH 02/18] nir: use the same driver location for packed varyings

2016-06-14 Thread Timothy Arceri

Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/nir.h|  4 ++--
 src/compiler/nir/nir_lower_io.c   | 28 ++--
 src/mesa/drivers/dri/i965/brw_nir.c   |  8 +---
 src/mesa/drivers/dri/i965/brw_program.c   |  4 ++--
 src/mesa/state_tracker/st_glsl_to_nir.cpp |  3 +++
 5 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index d5e4733..4ade03a 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2310,8 +2310,8 @@ void nir_lower_io_to_temporaries(nir_shader *shader, 
nir_function *entrypoint,
 
 void nir_shader_gather_info(nir_shader *shader, nir_function_impl *entrypoint);
 
-void nir_assign_var_locations(struct exec_list *var_list,
-  unsigned *size,
+void nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
+  unsigned base_offset,
   int (*type_size)(const struct glsl_type *));
 
 void nir_lower_io(nir_shader *shader,
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index 72f1b05..c25790a 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -43,10 +43,18 @@ struct lower_io_state {
 
 void
 nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
+ unsigned base_offset,
  int (*type_size)(const struct glsl_type *))
 {
unsigned location = 0;
 
+   /* There are 32 regular and 32 patch varyings allowed */
+   int locations[64][2];
+   for (unsigned i = 0; i < 64; i++) {
+  for (unsigned j = 0; j < 2; j++)
+ locations[i][j] = -1;
+   }
+
nir_foreach_variable(var, var_list) {
   /*
* UBO's have their own address spaces, so don't count them towards the
@@ -56,8 +64,24 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
   var->interface_type != NULL)
  continue;
 
-  var->data.driver_location = location;
-  location += type_size(var->type);
+  /* Make sure we give the same location to varyings packed with
+   * ARB_enhanced_layouts.
+   */
+  int idx = var->data.location - base_offset;
+  if (base_offset && idx >= 0) {
+ assert(idx < ARRAY_SIZE(locations));
+
+ if (locations[idx][var->data.index] == -1) {
+var->data.driver_location = location;
+locations[idx][var->data.index] = location;
+location += type_size(var->type);
+ } else {
+var->data.driver_location = locations[idx][var->data.index];
+ }
+  } else {
+ var->data.driver_location = location;
+ location += type_size(var->type);
+  }
}
 
*size = location;
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index d8cf12d..6c3e1d1 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -282,7 +282,8 @@ brw_nir_lower_tes_inputs(nir_shader *nir, const struct 
brw_vue_map *vue_map)
 void
 brw_nir_lower_fs_inputs(nir_shader *nir)
 {
-   nir_assign_var_locations(&nir->inputs, &nir->num_inputs, type_size_scalar);
+   nir_assign_var_locations(&nir->inputs, &nir->num_inputs, VARYING_SLOT_VAR0,
+type_size_scalar);
nir_lower_io(nir, nir_var_shader_in, type_size_scalar);
 }
 
@@ -292,6 +293,7 @@ brw_nir_lower_vue_outputs(nir_shader *nir,
 {
if (is_scalar) {
   nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
+   VARYING_SLOT_VAR0,
type_size_scalar);
   nir_lower_io(nir, nir_var_shader_out, type_size_scalar);
} else {
@@ -330,14 +332,14 @@ void
 brw_nir_lower_fs_outputs(nir_shader *nir)
 {
nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
-type_size_scalar);
+FRAG_RESULT_DATA0, type_size_scalar);
nir_lower_io(nir, nir_var_shader_out, type_size_scalar);
 }
 
 void
 brw_nir_lower_cs_shared(nir_shader *nir)
 {
-   nir_assign_var_locations(&nir->shared, &nir->num_shared,
+   nir_assign_var_locations(&nir->shared, &nir->num_shared, 0,
 type_size_scalar_bytes);
nir_lower_io(nir, nir_var_shared, type_size_scalar_bytes);
 }
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index a1a8116..2eec7fc 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -51,11 +51,11 @@ static void
 brw_nir_lower_uniforms(nir_shader *nir, bool is_scalar)
 {
if (is_scalar) {
-  nir_assign_var_locations(&nir->uniforms, &nir->num_uniforms,
+  nir_assign_var_locations(&nir->uniforms, &nir->num_uniforms, 0,
type_size_scalar_bytes);
   nir_lower_io(nir, nir_var_uniform, type_size_scalar_bytes);
} else {
-  nir_as
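
The location sharing that this patch adds to nir_assign_var_locations() can be sketched as a small standalone C function. This is a simplified model under stated assumptions: struct toy_var and toy_assign_locations() are hypothetical, the per-variable size is supplied directly instead of coming from a type_size callback, and the second table dimension indexed by var->data.index in the patch is left out. The point it illustrates is that the first variable seen at a given location claims a driver location, and later variables packed into the same location reuse it without advancing the counter.

```c
#include <assert.h>

#define TOY_MAX_SLOTS 64

/* Hypothetical, reduced model of a shader variable. */
struct toy_var {
   int location;             /* API-level slot */
   unsigned size;            /* what type_size() would return */
   unsigned driver_location; /* filled in by the pass */
};

/* Returns the total size consumed, like *size in the patch. */
static unsigned
toy_assign_locations(struct toy_var *vars, unsigned count, int base_offset)
{
   int seen[TOY_MAX_SLOTS];
   unsigned next = 0;

   for (unsigned i = 0; i < TOY_MAX_SLOTS; i++)
      seen[i] = -1;

   for (unsigned v = 0; v < count; v++) {
      int idx = vars[v].location - base_offset;

      if (base_offset && idx >= 0) {
         assert(idx < TOY_MAX_SLOTS);

         if (seen[idx] == -1) {
            /* First variable at this location: claim a new offset. */
            seen[idx] = (int)next;
            vars[v].driver_location = next;
            next += vars[v].size;
         } else {
            /* Packed with an earlier variable: share its offset. */
            vars[v].driver_location = (unsigned)seen[idx];
         }
      } else {
         /* No base offset given: one fresh offset per variable. */
         vars[v].driver_location = next;
         next += vars[v].size;
      }
   }

   return next;
}
```

The base value passed in is arbitrary in this model; the patch passes VARYING_SLOT_VAR0 or FRAG_RESULT_DATA0 for varyings and 0 for uniforms and shared variables, which disables the sharing path.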

[Mesa-dev] V3 ARB_enhanced_layouts packing support for i965 Gen8+

2016-06-14 Thread Timothy Arceri

V3:
- Rewrite patch 9 (add support for packing arrays) to not add
hacks to the type_size() functions.
- Add packing support for the load_output intrinsics (patch 12)
- Add glsl_dvec_type() helper (patch 8)

V2:
- Validation fixes in patches 1-2.
- Added support for packing doubles now that the explicit location
  fixes have landed.
- Fixed various issues with Intel debug output with the new COMPONENT const index.

This adds component packing support for Gen8+.

Series can be found in my component_packing_backend6 branch:

https://github.com/tarceri/Mesa_arrays_of_arrays.git


Re: [Mesa-dev] [PATCH] i965: remove type_size_vec4_times_4()

2016-06-14 Thread Kenneth Graunke

On Tuesday, June 14, 2016 4:53:22 PM PDT Timothy Arceri wrote:
> type_size_vec4_times_4() was introduced as a fix in 8dcf807cb43383;
> however, since 3810c1561 we can just use type_size_scalar() and
> get the actual number of outputs we need.
> 
> Cc: Kenneth Graunke 
> ---
>  Hi Ken,
> 
>  I'm looking into the other suggestions you made on IRC so this may all just
>  go away but seems like a good idea to clean this up in the meantime.
> 
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 13 -
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
>  src/mesa/drivers/dri/i965/brw_nir.c  |  4 ++--
>  src/mesa/drivers/dri/i965/brw_shader.h   |  1 -
>  4 files changed, 3 insertions(+), 17 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 0347b0a..8774f25 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -505,19 +505,6 @@ type_size_scalar(const struct glsl_type *type)
> return 0;
>  }
>  
> -/**
> - * Returns the number of scalar components needed to store type, assuming
> - * that vectors are padded out to vec4.
> - *
> - * This has the packing rules of type_size_vec4(), but counts components
> - * similar to type_size_scalar().
> - */
> -extern "C" int
> -type_size_vec4_times_4(const struct glsl_type *type)
> -{
> -   return 4 * type_size_vec4(type);
> -}
> -
>  /* Attribute arrays are loaded as one vec4 per element (or matrix column),
>   * except for double-precision types, which are loaded as one dvec4.
>   */
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index a956f9d..b811953 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -108,7 +108,7 @@ fs_visitor::nir_setup_single_output_varying(fs_reg *reg,
>for (unsigned count = 0; count < num_elements; count += 4) {
>   this->outputs[*location] = *reg;
>   this->output_components[*location] = MIN2(4, num_elements - count);
> - *reg = offset(*reg, bld, 4);
> + *reg = offset(*reg, bld, this->output_components[*location]);
>   (*location)++;
>}
> }
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index e01f160..d8cf12d 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -292,8 +292,8 @@ brw_nir_lower_vue_outputs(nir_shader *nir,
>  {
> if (is_scalar) {
>nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
> -   type_size_vec4_times_4);
> -  nir_lower_io(nir, nir_var_shader_out, type_size_vec4_times_4);
> +   type_size_scalar);
> +  nir_lower_io(nir, nir_var_shader_out, type_size_scalar);
> } else {
>nir_foreach_variable(var, &nir->outputs)
>   var->data.driver_location = var->data.location;
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> b/src/mesa/drivers/dri/i965/brw_shader.h
> index 656dc89..9300f20 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -294,7 +294,6 @@ struct gl_shader *brw_new_shader(struct gl_context *ctx, 
> GLuint name, GLuint typ
>  int type_size_scalar(const struct glsl_type *type);
>  int type_size_vec4(const struct glsl_type *type);
>  int type_size_dvec4(const struct glsl_type *type);
> -int type_size_vec4_times_4(const struct glsl_type *type);
>  int type_size_vs_input(const struct glsl_type *type);
>  
>  unsigned tesslevel_outer_components(GLenum tes_primitive_mode);
> 

I was skeptical, but this looks correct.  This only applies to shadowed
outputs, and just controls the packing within the fs_reg we allocate for
those outputs.  The URB layout remains the same.

It appears that we only needed this prior to the commit you referenced
because the old code was buggy.  Now that it's fixed, it doesn't matter.

I think this is fine, then.  Presumably you've run it through Jenkins
and everything was happy?

Reviewed-by: Kenneth Graunke 
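
The difference between the old and new stepping in nir_setup_single_output_varying() can be checked with a small model. This is only a sketch of the loop in the quoted hunk; toy_output_footprint() and TOY_MIN2 are hypothetical names, and register offsets are counted in scalar components. It shows that both variants visit the same number of slots, while only the padded variant reserves four components per slot in the shadow register.

```c
#define TOY_MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Walks num_elements scalar components in vec4-sized steps and returns
 * the register offset reached at the end.
 *
 * pad_to_vec4 != 0 models the old "*reg = offset(*reg, bld, 4)".
 * pad_to_vec4 == 0 models the new
 *    "*reg = offset(*reg, bld, this->output_components[*location])".
 *
 * If num_slots is non-NULL it receives the number of slots visited. */
static unsigned
toy_output_footprint(unsigned num_elements, int pad_to_vec4,
                     unsigned *num_slots)
{
   unsigned reg_offset = 0;
   unsigned slots = 0;

   for (unsigned count = 0; count < num_elements; count += 4) {
      unsigned components = TOY_MIN2(4u, num_elements - count);

      reg_offset += pad_to_vec4 ? 4u : components;
      slots++;
   }

   if (num_slots)
      *num_slots = slots;

   return reg_offset;
}
```

For six components, the padded walk ends at offset 8 and the tight walk at offset 6, and both cover two slots.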



Re: [Mesa-dev] [PATCH 07/11] mesa: Fix incorrect "see also" comments

2016-06-14 Thread Timothy Arceri

On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> Signed-off-by: Ian Romanick 

Reviewed-by: Timothy Arceri 

> ---
>  src/compiler/glsl/ir.h | 2 +-
>  src/mesa/main/mtypes.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
> index 3629356..cd17f69 100644
> --- a/src/compiler/glsl/ir.h
> +++ b/src/compiler/glsl/ir.h
> @@ -679,7 +679,7 @@ public:
>    /**
> * Interpolation mode for shader inputs / outputs
> *
> -   * \sa ir_variable_interpolation
> +   * \sa glsl_interp_qualifier
> */
>    unsigned interpolation:2;
>  
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 471d41d..88702cb 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2615,7 +2615,7 @@ struct gl_shader_variable
> /**
>  * Interpolation mode for shader inputs / outputs
>  *
> -* \sa ir_variable_interpolation
> +* \sa glsl_interp_qualifier
>  */
> unsigned interpolation:2;
>  

Re: [Mesa-dev] [PATCH 06/11] mesa: Silence unused parameter warning

2016-06-14 Thread Timothy Arceri

On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> main/pipelineobj.c: In function ‘delete_pipelineobj_cb’:
> main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter]
>  delete_pipelineobj_cb(GLuint id, void *data, void *userData)
>   ^
> 
> Signed-off-by: Ian Romanick 
> ---
>  src/mesa/main/pipelineobj.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/pipelineobj.c
> b/src/mesa/main/pipelineobj.c
> index 9ecbcc9..8483752 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -107,7 +107,7 @@ _mesa_init_pipeline(struct gl_context *ctx)
>   * Callback for deleting a pipeline object.  Called by
> _mesa_HashDeleteAll().
>   */
>  static void
> -delete_pipelineobj_cb(GLuint id, void *data, void *userData)
> +delete_pipelineobj_cb(UNUSED GLuint id, void *data, void *userData)

It doesn't look like this has been used in core Mesa before, but I'm
fine with it as long as others are OK with it.

Reviewed-by: Timothy Arceri 

>  {
> struct gl_pipeline_object *obj = (struct gl_pipeline_object *)
> data;
> struct gl_context *ctx = (struct gl_context *) userData;
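
For readers unfamiliar with this kind of annotation, here is a minimal sketch of how such a macro typically works. It assumes a GCC- or Clang-compatible compiler; TOY_UNUSED and toy_delete_cb() are hypothetical names used for illustration and are not the definitions from the Mesa tree. The callback keeps the three-argument signature its caller requires while telling the compiler that ignoring `id` is intentional.

```c
/* Hypothetical macro: expands to the unused attribute where supported
 * and to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#define TOY_UNUSED __attribute__((unused))
#else
#define TOY_UNUSED
#endif

/* Callback with a fixed signature: `id` is part of the prototype the
 * caller expects, but this implementation does not need it. */
static int
toy_delete_cb(TOY_UNUSED unsigned id, void *data, void *user_data)
{
   int *deleted = data;
   int *total = user_data;

   *deleted = 1;
   (*total)++;

   return *total;
}
```

Built with -Wunused-parameter, the annotated parameter no longer triggers the warning quoted above, and the function behaves the same as before.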

Re: [Mesa-dev] [PATCH 05/11] glsl: Don't monkey about with the interpolation modes

2016-06-14 Thread Timothy Arceri

On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> Previously we'd munge the interpolation mode so that later checks in the
> GLSL linker would pass.  This caused problems for similar checks in SSO
> IO validation.  Instead, make the check smarter, use the same check in
> both places, and don't modify the interpolation mode.
> 
> Signed-off-by: Ian Romanick 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
> Cc: "12.0" 
> Cc: Gregory Hainaut 
> Cc: Ilia Mirkin 
> ---
>  src/compiler/glsl/ast_to_hir.cpp| 11 --
>  src/compiler/glsl/link_varyings.cpp | 41
> +
>  src/compiler/glsl/link_varyings.h   |  7 +++
>  src/mesa/main/shader_query.cpp  |  6 +-
>  4 files changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp
> b/src/compiler/glsl/ast_to_hir.cpp
> index 7da734c..d675dfa 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -2991,17 +2991,6 @@ interpret_interpolation_qualifier(const struct
> ast_type_qualifier *qual,
>    interpolation = INTERP_QUALIFIER_NOPERSPECTIVE;
> else if (qual->flags.q.smooth)
>    interpolation = INTERP_QUALIFIER_SMOOTH;
> -   else if (state->es_shader &&
> -((mode == ir_var_shader_in &&
> -  state->stage != MESA_SHADER_VERTEX) ||
> - (mode == ir_var_shader_out &&
> -  state->stage != MESA_SHADER_FRAGMENT)))
> -  /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec
> says:
> -   *
> -   *"When no interpolation qualifier is present, smooth
> interpolation
> -   *is used."
> -   */
> -  interpolation = INTERP_QUALIFIER_SMOOTH;
> else
>    interpolation = INTERP_QUALIFIER_NONE;
>  
> diff --git a/src/compiler/glsl/link_varyings.cpp
> b/src/compiler/glsl/link_varyings.cpp
> index 534393a..54491fc 100644
> --- a/src/compiler/glsl/link_varyings.cpp
> +++ b/src/compiler/glsl/link_varyings.cpp
> @@ -201,6 +201,37 @@ anonymous_struct_type_matches(const glsl_type
> *output_type,
> to_match->record_compare(output_type);
>  }
>  
> +bool
> +interpolation_compatible(gl_shader_stage producer_stage,
> + gl_shader_stage consumer_stage,
> + enum glsl_interp_qualifier producer_interp,
> + enum glsl_interp_qualifier consumer_interp,
> + bool is_builtin_variable)
> +{
> +   if (producer_interp == consumer_interp)
> +  return true;
> +
> +   if (is_builtin_variable)
> +  return false;
> +
> +   /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says:
> +*
> +*When no interpolation qualifier is present, smooth
> interpolation is
> +*used.
> +*/

Note: the last time I looked at this I couldn't find this text in the
desktop spec, so I don't think the following code can be applied to
desktop GL.

> +   if (producer_stage == MESA_SHADER_VERTEX &&
> +   producer_interp == INTERP_QUALIFIER_NONE &&
> +   consumer_interp == INTERP_QUALIFIER_SMOOTH)
> +  return true;
> +
> +   if (consumer_stage == MESA_SHADER_FRAGMENT &&
> +   consumer_interp == INTERP_QUALIFIER_NONE &&
> +   producer_interp == INTERP_QUALIFIER_SMOOTH)
> +  return true;

Are you sure this is enough? What about a fragment shader with smooth
and a geom shader with none? Shouldn't that return true also?
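
To make the question concrete, here is a minimal standalone sketch that
mirrors the checks in the quoted hunk. The enum and function names are
hypothetical stand-ins (not Mesa's gl_shader_stage or
glsl_interp_qualifier), so treat it as an illustration of the logic only.

```cpp
// Hypothetical stand-ins for the Mesa enums; only the values needed here.
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT };
enum interp_qualifier { INTERP_NONE, INTERP_SMOOTH, INTERP_FLAT };

// Mirrors the order of the checks in the posted patch.
static bool
interp_compatible(shader_stage producer_stage,
                  shader_stage consumer_stage,
                  interp_qualifier producer_interp,
                  interp_qualifier consumer_interp,
                  bool is_builtin_variable)
{
   if (producer_interp == consumer_interp)
      return true;

   if (is_builtin_variable)
      return false;

   // An unqualified vertex shader output is treated as smooth.
   if (producer_stage == STAGE_VERTEX &&
       producer_interp == INTERP_NONE &&
       consumer_interp == INTERP_SMOOTH)
      return true;

   // An unqualified fragment shader input is treated as smooth.
   if (consumer_stage == STAGE_FRAGMENT &&
       consumer_interp == INTERP_NONE &&
       producer_interp == INTERP_SMOOTH)
      return true;

   return false;
}
```

With these rules, a geometry shader output with no qualifier feeding a
smooth fragment shader input falls through both special cases and is
rejected, because the first case only fires for a vertex shader producer.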

> +
> +   return false;
> +}
> +
>  /**
>   * Validate the types and qualifiers of an output from one stage
> against the
>   * matching input to another stage.
> @@ -329,8 +360,11 @@ cross_validate_types_and_qualifiers(struct
> gl_shader_program *prog,
>  * qualifiers of variables of the same name do not match.
>  *
>  */
> -   if (input->data.interpolation != output->data.interpolation &&
> -   prog->Version < 440) {
> +   if (prog->Version < 440 &&
> +   !interpolation_compatible(producer_stage, consumer_stage,
> + glsl_interp_qualifier(output-
> >data.interpolation),
> + glsl_interp_qualifier(input-
> >data.interpolation),
> + is_gl_identifier(output->name))) {
>    linker_error(prog,
> "%s shader output `%s' specifies %s "
> "interpolation qualifier, "
> @@ -1371,8 +1405,7 @@ varying_matches::record(ir_variable
> *producer_var, ir_variable *consumer_var)
>    (producer_var->type->contains_integer() ||
> producer_var->type->contains_double());
>  
> -   if (needs_flat_qualifier ||
> -   (consumer_stage != -1 && consumer_stage !=
> MESA_SHADER_FRAGMENT)) {
> +   if (needs_flat_qualifier) {
>    /* Since this varying is not being consumed by the fragment
> shader, its
> * interpolation type varying cannot possibly affect
> rendering.
> * Also, this variable is non-flat and is (or contains

Re: [Mesa-dev] [PATCH 04/11] glsl: Pack integer and double varyings as flat even if interpolation mode is none

2016-06-14 Thread Timothy Arceri

On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> Signed-off-by: Ian Romanick 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
> Cc: "12.0" 
> Cc: Gregory Hainaut 
> Cc: Ilia Mirkin 
> ---

I guess we might also want to update
varying_matches::compute_packing_class() to make the most of this.
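
For reference, my reading of the selection rule in the hunks below, as a
minimal standalone sketch. The struct and enum names are hypothetical
stand-ins, not the types used in lower_packed_varyings.cpp.

```cpp
// Hypothetical stand-ins for the interpolation enum and the packed types.
enum interp_qualifier { INTERP_NONE, INTERP_SMOOTH, INTERP_FLAT };
enum packed_kind { PACKED_VEC4, PACKED_IVEC4 };

// The properties of the unpacked varying that the rule looks at.
struct varying_info {
   interp_qualifier interpolation;
   bool contains_integer;
   bool contains_double;
};

struct packed_info {
   packed_kind type;
   interp_qualifier interpolation;
};

// Flat, integer, and double varyings are packed into an ivec4. Anything
// packed into an ivec4 is forced to flat; everything else keeps the
// interpolation mode of the unpacked variable.
static packed_info
choose_packing(const varying_info &v)
{
   packed_info p;

   if (v.interpolation == INTERP_FLAT ||
       v.contains_integer ||
       v.contains_double)
      p.type = PACKED_IVEC4;
   else
      p.type = PACKED_VEC4;

   p.interpolation = (p.type == PACKED_IVEC4) ? INTERP_FLAT
                                              : v.interpolation;
   return p;
}
```

So an integer varying with interpolation mode none ends up in a flat ivec4
slot, which is the case the relaxed assert now allows.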


>  src/compiler/glsl/lower_packed_varyings.cpp | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/src/compiler/glsl/lower_packed_varyings.cpp
> b/src/compiler/glsl/lower_packed_varyings.cpp
> index 130b8f6..ae36c1c 100644
> --- a/src/compiler/glsl/lower_packed_varyings.cpp
> +++ b/src/compiler/glsl/lower_packed_varyings.cpp
> @@ -273,11 +273,11 @@ lower_packed_varyings_visitor::run(struct
> gl_shader *shader)
>   continue;
>  
>    /* This lowering pass is only capable of packing floats and
> ints
> -   * together when their interpolation mode is
> "flat".  Therefore, to be
> -   * safe, caller should ensure that integral varyings always
> use flat
> -   * interpolation, even when this is not required by GLSL.
> +   * together when their interpolation mode is "flat".  Treat
> integers as
> +   * being flat when the interpolation mode is none.
> */
>    assert(var->data.interpolation == INTERP_QUALIFIER_FLAT ||
> + var->data.interpolation == INTERP_QUALIFIER_NONE ||
>   !var->type->contains_integer());
>  
>    /* Clone the variable for program resource list before
> @@ -607,7 +607,9 @@
> lower_packed_varyings_visitor::get_packed_varying_deref(
> if (this->packed_varyings[slot] == NULL) {
>    char *packed_name = ralloc_asprintf(this->mem_ctx,
> "packed:%s", name);
>    const glsl_type *packed_type;
> -  if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT)
> +  if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT
> ||
> +  unpacked_var->type->contains_integer() ||
> +  unpacked_var->type->contains_double())
>   packed_type = glsl_type::ivec4_type;
>    else
>   packed_type = glsl_type::vec4_type;
> @@ -627,7 +629,8 @@
> lower_packed_varyings_visitor::get_packed_varying_deref(
>    packed_var->data.centroid = unpacked_var->data.centroid;
>    packed_var->data.sample = unpacked_var->data.sample;
>    packed_var->data.patch = unpacked_var->data.patch;
> -  packed_var->data.interpolation = unpacked_var-
> >data.interpolation;
> +  packed_var->data.interpolation = packed_type ==
> glsl_type::ivec4_type
> + ? unsigned(INTERP_QUALIFIER_FLAT) : unpacked_var-
> >data.interpolation;
>    packed_var->data.location = location;
>    packed_var->data.precision = unpacked_var->data.precision;
>    packed_var->data.always_active_io = unpacked_var-
> >data.always_active_io;

Re: [Mesa-dev] [PATCH 03/11] mesa: Strip arrayness from interface block names in some IO validation

2016-06-14 Thread Timothy Arceri

On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> Outputs from the vertex shader need to be able to match
> per-vertex-arrayed inputs of later stages.  Accomplish this by
> stripping
> one level of arrayness from the names and types of outputs going to a
> per-vertex-arrayed stage.
> 
> Signed-off-by: Ian Romanick 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
> Cc: "12.0" 
> Cc: Gregory Hainaut 
> Cc: Ilia Mirkin 
> ---
>  src/mesa/main/shader_query.cpp | 98
> ++
>  1 file changed, 90 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/main/shader_query.cpp
> b/src/mesa/main/shader_query.cpp
> index 5956ce4..b2e53fb 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -1385,13 +1385,24 @@ _mesa_get_program_resourceiv(struct
> gl_shader_program *shProg,
>  
>  static bool
>  validate_io(struct gl_shader_program *producer,
> -struct gl_shader_program *consumer)
> +struct gl_shader_program *consumer,
> +gl_shader_stage producer_stage,
> +gl_shader_stage consumer_stage)
>  {
> if (producer == consumer)
>    return true;
>  
> +   const bool nonarray_stage_to_array_stage =
> +  producer_stage == MESA_SHADER_VERTEX &&
> +  (consumer_stage == MESA_SHADER_GEOMETRY ||
> +   consumer_stage == MESA_SHADER_TESS_CTRL ||
> +   consumer_stage == MESA_SHADER_TESS_EVAL);

What about TESS_EVAL->GEOM?
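
A minimal standalone sketch of the predicate as posted, to show the case I
mean. The enum is a hypothetical stand-in for gl_shader_stage and the
function name is made up for illustration.

```cpp
// Hypothetical stand-in for gl_shader_stage.
enum shader_stage {
   STAGE_VERTEX,
   STAGE_TESS_CTRL,
   STAGE_TESS_EVAL,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT
};

// Mirrors nonarray_stage_to_array_stage from the quoted hunk: only a
// vertex shader producer is considered.
static bool
nonarray_to_array_as_posted(shader_stage producer, shader_stage consumer)
{
   return producer == STAGE_VERTEX &&
          (consumer == STAGE_GEOMETRY ||
           consumer == STAGE_TESS_CTRL ||
           consumer == STAGE_TESS_EVAL);
}
```

As written, a tessellation evaluation shader feeding a geometry shader
evaluates to false here, so the array level would not be stripped for that
pairing.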

> +
> bool valid = true;
>  
> +   void *name_buffer = NULL;
> +   size_t name_buffer_size = 0;
> +
> gl_shader_variable const **outputs =
>    (gl_shader_variable const **) calloc(producer-
> >NumProgramResourceList,
> sizeof(gl_shader_variable
> *));
> @@ -1463,11 +1474,52 @@ validate_io(struct gl_shader_program
> *producer,
>  }
>   }
>    } else {
> + char *consumer_name = consumer_var->name;
> +
> + if (nonarray_stage_to_array_stage &&
> + consumer_var->interface_type != NULL &&
> + consumer_var->interface_type->is_array() &&
> + !is_gl_identifier(consumer_var->name)) {
> +const size_t name_len = strlen(consumer_var->name);
> +
> +if (name_len >= name_buffer_size) {
> +   free(name_buffer);
> +
> +   name_buffer_size = name_len + 1;
> +   name_buffer = malloc(name_buffer_size);
> +   if (name_buffer == NULL) {
> +  valid = false;
> +  goto out;
> +   }
> +}
> +
> +consumer_name = (char *) name_buffer;
> +
> +char *s = strchr(consumer_var->name, '[');
> +if (s == NULL) {
> +   valid = false;
> +   goto out;
> +}
> +
> +char *t = strchr(s, ']');
> +if (t == NULL) {
> +   valid = false;
> +   goto out;
> +}
> +
> +assert(t[1] == '.' || t[1] == '[');
> +
> +const ptrdiff_t base_name_len = s - consumer_var->name;
> +
> +memcpy(consumer_name, consumer_var->name,
> base_name_len);
> +strcpy(consumer_name + base_name_len, t + 1);
> + }
> +
>   for (unsigned j = 0; j < num_outputs; j++) {
>  const gl_shader_variable *const var = outputs[j];
>  
>  if (!var->explicit_location &&
> -strcmp(consumer_var->name, var->name) == 0) {
> +strcmp(consumer_name, var->name) == 0) {
> producer_var = var;
> match_index = j;
> break;
> @@ -1529,25 +1581,53 @@ validate_io(struct gl_shader_program
> *producer,
> * Note that location mismatches are detected by the loops
> above that
> * find the producer variable that goes with the consumer
> variable.
> */
> -  if (producer_var->type != consumer_var->type ||
> -  producer_var->interpolation != consumer_var->interpolation 
> ||
> -  producer_var->precision != consumer_var->precision) {
> +  if (nonarray_stage_to_array_stage) {
> + if (!consumer_var->type->is_array() ||
> + consumer_var->type->fields.array != producer_var->type) 
> {
> +valid = false;
> +goto out;
> + }
> +
> + if (consumer_var->interface_type != NULL) {
> +if (!consumer_var->interface_type->is_array() ||
> +consumer_var->interface_type->fields.array !=
> producer_var->interface_type) {
> +   valid = false;
> +   goto out;
> +}
> + } else if (producer_var->interface_type != NULL) {
> +valid = false;
> +goto out;
> + }
> +  } else {
> + if (producer_var->type != consumer_var->type) {
> +valid = false;
> +goto out;
> + }
> +
> + if (produ

Re: [Mesa-dev] [PATCH 02/11] mesa: If validation fails in a debug context just emit a debug message

2016-06-14 Thread Timothy Arceri

On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> There are quite a few pipelines that desktop applications (including
> a
> bunch of piglit tests) can expect to have run but don't meet the GLES
> requirements.  Instead of failing validation, just emit a debug
> message.
> 
> Signed-off-by: Ian Romanick 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
> Cc: "12.0" 
> Cc: Gregory Hainaut 
> Cc: Ilia Mirkin 

Patches 1-2 are:

Reviewed-by: Timothy Arceri 
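
For the record, how I read the resulting behaviour of patch 2, as a
minimal standalone sketch. The enum and function names are hypothetical,
and the IO check is passed in as a plain bool here, whereas the real code
only runs it for GLES or debug contexts.

```cpp
// Hypothetical result type; the real function returns GL_TRUE or GL_FALSE
// and emits the debug message as a side effect.
enum validation_result {
   VALIDATION_FAILED,
   VALIDATION_PASSED,
   VALIDATION_PASSED_WITH_DEBUG_MESSAGE
};

static validation_result
pipeline_io_outcome(bool is_gles, bool is_debug_context, bool io_valid)
{
   // Desktop contexts without the debug bit are not checked at all.
   if (!(is_gles || is_debug_context))
      return VALIDATION_PASSED;

   if (io_valid)
      return VALIDATION_PASSED;

   // GLES keeps the strict behaviour.
   if (is_gles)
      return VALIDATION_FAILED;

   // Desktop debug context: report a portability message, keep going.
   return VALIDATION_PASSED_WITH_DEBUG_MESSAGE;
}
```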

> ---
>  src/mesa/main/pipelineobj.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/pipelineobj.c
> b/src/mesa/main/pipelineobj.c
> index 5a46cfe..9ecbcc9 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -929,8 +929,21 @@ _mesa_validate_program_pipeline(struct
> gl_context* ctx,
>  * application has created a debug context.
>  */
> if ((_mesa_is_gles(ctx) || (ctx->Const.ContextFlags &
> GL_CONTEXT_FLAG_DEBUG_BIT)) &&
> -   !_mesa_validate_pipeline_io(pipe))
> -  return GL_FALSE;
> +   !_mesa_validate_pipeline_io(pipe)) {
> +  if (_mesa_is_gles(ctx))
> + return GL_FALSE;
> +
> +  static GLuint msg_id = 0;
> +
> +  _mesa_gl_debug(ctx, &msg_id,
> + MESA_DEBUG_SOURCE_API,
> + MESA_DEBUG_TYPE_PORTABILITY,
> + MESA_DEBUG_SEVERITY_MEDIUM,
> + "glValidateProgramPipeline: pipeline %u does
> not meet "
> + "strict OpenGL ES 3.1 requirements and may not
> be "
> + "portable across desktop hardware\n",
> + pipe->Name);
> +   }
>  
> pipe->Validated = GL_TRUE;
> return GL_TRUE;

[Mesa-dev] [PATCH 04/11] glsl: Pack integer and double varyings as flat even if interpolation mode is none

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" 
Cc: Gregory Hainaut 
Cc: Ilia Mirkin 
---
 src/compiler/glsl/lower_packed_varyings.cpp | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/lower_packed_varyings.cpp 
b/src/compiler/glsl/lower_packed_varyings.cpp
index 130b8f6..ae36c1c 100644
--- a/src/compiler/glsl/lower_packed_varyings.cpp
+++ b/src/compiler/glsl/lower_packed_varyings.cpp
@@ -273,11 +273,11 @@ lower_packed_varyings_visitor::run(struct gl_shader 
*shader)
  continue;
 
   /* This lowering pass is only capable of packing floats and ints
-   * together when their interpolation mode is "flat".  Therefore, to be
-   * safe, caller should ensure that integral varyings always use flat
-   * interpolation, even when this is not required by GLSL.
+   * together when their interpolation mode is "flat".  Treat integers as
+   * being flat when the interpolation mode is none.
*/
   assert(var->data.interpolation == INTERP_QUALIFIER_FLAT ||
+ var->data.interpolation == INTERP_QUALIFIER_NONE ||
  !var->type->contains_integer());
 
   /* Clone the variable for program resource list before
@@ -607,7 +607,9 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
if (this->packed_varyings[slot] == NULL) {
   char *packed_name = ralloc_asprintf(this->mem_ctx, "packed:%s", name);
   const glsl_type *packed_type;
-  if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT)
+  if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT ||
+  unpacked_var->type->contains_integer() ||
+  unpacked_var->type->contains_double())
  packed_type = glsl_type::ivec4_type;
   else
  packed_type = glsl_type::vec4_type;
@@ -627,7 +629,8 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
   packed_var->data.centroid = unpacked_var->data.centroid;
   packed_var->data.sample = unpacked_var->data.sample;
   packed_var->data.patch = unpacked_var->data.patch;
-  packed_var->data.interpolation = unpacked_var->data.interpolation;
+  packed_var->data.interpolation = packed_type == glsl_type::ivec4_type
+ ? unsigned(INTERP_QUALIFIER_FLAT) : unpacked_var->data.interpolation;
   packed_var->data.location = location;
   packed_var->data.precision = unpacked_var->data.precision;
   packed_var->data.always_active_io = unpacked_var->data.always_active_io;
-- 
2.5.5


[Mesa-dev] [PATCH 03/11] mesa: Strip arrayness from interface block names in some IO validation

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Outputs from the vertex shader need to be able to match
per-vertex-arrayed inputs of later stages.  Accomplish this by stripping
one level of arrayness from the names and types of outputs going to a
per-vertex-arrayed stage.
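
As an illustration of the name handling, a minimal standalone sketch of
stripping the first array subscript from a resource name. It assumes names
of the form "block[N].member", which is what the assert in the patch
suggests, and it uses std::string instead of the reusable malloc buffer in
the patch. The function name is hypothetical.

```cpp
#include <cstring>
#include <string>

// Copies "name" into "out" with the first "[...]" subscript removed.
// Returns false if the name has no complete subscript, in which case
// "out" is left untouched.
static bool
strip_first_array_subscript(const char *name, std::string &out)
{
   const char *open = std::strchr(name, '[');
   if (open == nullptr)
      return false;

   const char *close = std::strchr(open, ']');
   if (close == nullptr)
      return false;

   // Everything before '[' followed by everything after the matching ']'.
   out.assign(name, static_cast<size_t>(open - name));
   out.append(close + 1);
   return true;
}
```

With this, "blk[2].color" becomes "blk.color", which can then be compared
against the unarrayed vertex shader output name.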

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" 
Cc: Gregory Hainaut 
Cc: Ilia Mirkin 
---
 src/mesa/main/shader_query.cpp | 98 ++
 1 file changed, 90 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 5956ce4..b2e53fb 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1385,13 +1385,24 @@ _mesa_get_program_resourceiv(struct gl_shader_program 
*shProg,
 
 static bool
 validate_io(struct gl_shader_program *producer,
-struct gl_shader_program *consumer)
+struct gl_shader_program *consumer,
+gl_shader_stage producer_stage,
+gl_shader_stage consumer_stage)
 {
if (producer == consumer)
   return true;
 
+   const bool nonarray_stage_to_array_stage =
+  producer_stage == MESA_SHADER_VERTEX &&
+  (consumer_stage == MESA_SHADER_GEOMETRY ||
+   consumer_stage == MESA_SHADER_TESS_CTRL ||
+   consumer_stage == MESA_SHADER_TESS_EVAL);
+
bool valid = true;
 
+   void *name_buffer = NULL;
+   size_t name_buffer_size = 0;
+
gl_shader_variable const **outputs =
   (gl_shader_variable const **) calloc(producer->NumProgramResourceList,
sizeof(gl_shader_variable *));
@@ -1463,11 +1474,52 @@ validate_io(struct gl_shader_program *producer,
 }
  }
   } else {
+ char *consumer_name = consumer_var->name;
+
+ if (nonarray_stage_to_array_stage &&
+ consumer_var->interface_type != NULL &&
+ consumer_var->interface_type->is_array() &&
+ !is_gl_identifier(consumer_var->name)) {
+const size_t name_len = strlen(consumer_var->name);
+
+if (name_len >= name_buffer_size) {
+   free(name_buffer);
+
+   name_buffer_size = name_len + 1;
+   name_buffer = malloc(name_buffer_size);
+   if (name_buffer == NULL) {
+  valid = false;
+  goto out;
+   }
+}
+
+consumer_name = (char *) name_buffer;
+
+char *s = strchr(consumer_var->name, '[');
+if (s == NULL) {
+   valid = false;
+   goto out;
+}
+
+char *t = strchr(s, ']');
+if (t == NULL) {
+   valid = false;
+   goto out;
+}
+
+assert(t[1] == '.' || t[1] == '[');
+
+const ptrdiff_t base_name_len = s - consumer_var->name;
+
+memcpy(consumer_name, consumer_var->name, base_name_len);
+strcpy(consumer_name + base_name_len, t + 1);
+ }
+
  for (unsigned j = 0; j < num_outputs; j++) {
 const gl_shader_variable *const var = outputs[j];
 
 if (!var->explicit_location &&
-strcmp(consumer_var->name, var->name) == 0) {
+strcmp(consumer_name, var->name) == 0) {
producer_var = var;
match_index = j;
break;
@@ -1529,25 +1581,53 @@ validate_io(struct gl_shader_program *producer,
* Note that location mismatches are detected by the loops above that
* find the producer variable that goes with the consumer variable.
*/
-  if (producer_var->type != consumer_var->type ||
-  producer_var->interpolation != consumer_var->interpolation ||
-  producer_var->precision != consumer_var->precision) {
+  if (nonarray_stage_to_array_stage) {
+ if (!consumer_var->type->is_array() ||
+ consumer_var->type->fields.array != producer_var->type) {
+valid = false;
+goto out;
+ }
+
+ if (consumer_var->interface_type != NULL) {
+if (!consumer_var->interface_type->is_array() ||
+consumer_var->interface_type->fields.array != 
producer_var->interface_type) {
+   valid = false;
+   goto out;
+}
+ } else if (producer_var->interface_type != NULL) {
+valid = false;
+goto out;
+ }
+  } else {
+ if (producer_var->type != consumer_var->type) {
+valid = false;
+goto out;
+ }
+
+ if (producer_var->interface_type != consumer_var->interface_type) {
+valid = false;
+goto out;
+ }
+  }
+
+  if (producer_var->interpolation != consumer_var->interpolation) {
  valid = false;
  goto out;
   }
 
-  if (producer_var->outermost_struct_type != 
consumer_var->outermost_struct_type) {
+  if (pro

[Mesa-dev] [PATCH 11/11] i965: Delete redundant extension enables

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

A nearly identical block already exists in the gen >= 6 block above.

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 5be4787..b55fed2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -360,15 +360,6 @@ intelInitExtensions(struct gl_context *ctx)
  if (brw->intelScreen->cmd_parser_version >= 2)
 brw->predicate.supported = true;
   }
-
-  /* Only enable this in core profile because other parts of Mesa behave
-   * slightly differently when the extension is enabled.
-   */
-  if (ctx->API == API_OPENGL_CORE) {
- ctx->Extensions.ARB_viewport_array = true;
- ctx->Extensions.AMD_vertex_shader_viewport_index = true;
- ctx->Extensions.ARB_shader_subroutine = true;
-  }
}
 
if (brw->gen >= 8 || brw->is_haswell || brw->is_baytrail) {
-- 
2.5.5


[Mesa-dev] [PATCH 08/11] docs: Update GL3.txt for OpenGL ES on i965-ish hardware

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 docs/GL3.txt | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 0204695..dedea1a 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -222,25 +222,26 @@ GL 4.5, GLSL 4.50:
   GL_EXT_shader_integer_mix DONE (all drivers that 
support GLSL)
 
 These are the extensions cherry-picked to make GLES 3.1
-GLES3.1, GLSL ES 3.1 -- all DONE: nvc0, radeonsi
+GLES3.1, GLSL ES 3.1 -- all DONE: i965/gen8+, nvc0, radeonsi
+
   GL_ARB_arrays_of_arrays   DONE (all drivers that 
support GLSL 1.30)
-  GL_ARB_compute_shader DONE (i965, softpipe)
-  GL_ARB_draw_indirect  DONE (i965, r600, 
llvmpipe, softpipe, swr)
+  GL_ARB_compute_shader DONE (i965/gen7+, 
softpipe)
+  GL_ARB_draw_indirect  DONE (i965/gen7+, 
r600, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location  DONE (all drivers that 
support GLSL)
-  GL_ARB_framebuffer_no_attachments DONE (i965, r600, 
softpipe)
+  GL_ARB_framebuffer_no_attachments DONE (i965/gen7+, 
r600, softpipe)
   GL_ARB_program_interface_queryDONE (all drivers)
-  GL_ARB_shader_atomic_counters DONE (i965, softpipe)
-  GL_ARB_shader_image_load_storeDONE (i965, softpipe)
-  GL_ARB_shader_image_size  DONE (i965, softpipe)
-  GL_ARB_shader_storage_buffer_object   DONE (i965, softpipe)
+  GL_ARB_shader_atomic_counters DONE (i965/gen7+, 
softpipe)
+  GL_ARB_shader_image_load_storeDONE (i965/gen7+, 
softpipe)
+  GL_ARB_shader_image_size  DONE (i965/gen7+, 
softpipe)
+  GL_ARB_shader_storage_buffer_object   DONE (i965/gen7+, 
softpipe)
   GL_ARB_shading_language_packing   DONE (all drivers)
   GL_ARB_separate_shader_objectsDONE (all drivers)
-  GL_ARB_stencil_texturing  DONE (i965/gen8+, 
nv50, r600, llvmpipe, softpipe, swr)
-  GL_ARB_texture_multisample (Multisample textures) DONE (i965, nv50, 
r600, llvmpipe, softpipe)
+  GL_ARB_stencil_texturing  DONE (nv50, r600, 
llvmpipe, softpipe, swr)
+  GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, 
nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisampleDONE (all drivers that 
support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding  DONE (all drivers)
-  GS5 Enhanced textureGatherDONE (i965, r600)
-  GS5 Packing/bitfield/conversion functions DONE (i965, r600)
+  GS5 Enhanced textureGatherDONE (i965/gen7+, r600)
+  GS5 Packing/bitfield/conversion functions DONE (i965/gen6+, r600)
   GL_EXT_shader_integer_mix DONE (all drivers that 
support GLSL)
 
   Additional functionality not covered above:
@@ -249,7 +250,8 @@ GLES3.1, GLSL ES 3.1 -- all DONE: nvc0, radeonsi
   glGetBooleani_v - restrict to GLES enums
   gl_HelperInvocation support   DONE (i965, r600)
 
-GLES3.2, GLSL ES 3.2
+GLES3.2, GLSL ES 3.2:
+
   GL_EXT_color_buffer_float DONE (all drivers)
   GL_KHR_blend_equation_advancednot started
   GL_KHR_debug  DONE (all drivers)
-- 
2.5.5


[Mesa-dev] [PATCH 05/11] glsl: Don't monkey about with the interpolation modes

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Previously we'd munge the interpolation mode so that later checks in the
GLSL linker would pass.  This caused problems for similar checks in SSO
IO validation.  Instead, make the check smarter, use the same check in
both places, and don't modify the interpolation mode.

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" 
Cc: Gregory Hainaut 
Cc: Ilia Mirkin 
---
 src/compiler/glsl/ast_to_hir.cpp| 11 --
 src/compiler/glsl/link_varyings.cpp | 41 +
 src/compiler/glsl/link_varyings.h   |  7 +++
 src/mesa/main/shader_query.cpp  |  6 +-
 4 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 7da734c..d675dfa 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2991,17 +2991,6 @@ interpret_interpolation_qualifier(const struct 
ast_type_qualifier *qual,
   interpolation = INTERP_QUALIFIER_NOPERSPECTIVE;
else if (qual->flags.q.smooth)
   interpolation = INTERP_QUALIFIER_SMOOTH;
-   else if (state->es_shader &&
-((mode == ir_var_shader_in &&
-  state->stage != MESA_SHADER_VERTEX) ||
- (mode == ir_var_shader_out &&
-  state->stage != MESA_SHADER_FRAGMENT)))
-  /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says:
-   *
-   *"When no interpolation qualifier is present, smooth interpolation
-   *is used."
-   */
-  interpolation = INTERP_QUALIFIER_SMOOTH;
else
   interpolation = INTERP_QUALIFIER_NONE;
 
diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 534393a..54491fc 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -201,6 +201,37 @@ anonymous_struct_type_matches(const glsl_type *output_type,
to_match->record_compare(output_type);
 }
 
+bool
+interpolation_compatible(gl_shader_stage producer_stage,
+ gl_shader_stage consumer_stage,
+ enum glsl_interp_qualifier producer_interp,
+ enum glsl_interp_qualifier consumer_interp,
+ bool is_builtin_variable)
+{
+   if (producer_interp == consumer_interp)
+  return true;
+
+   if (is_builtin_variable)
+  return false;
+
+   /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says:
+*
+*When no interpolation qualifier is present, smooth interpolation is
+*used.
+*/
+   if (producer_stage == MESA_SHADER_VERTEX &&
+   producer_interp == INTERP_QUALIFIER_NONE &&
+   consumer_interp == INTERP_QUALIFIER_SMOOTH)
+  return true;
+
+   if (consumer_stage == MESA_SHADER_FRAGMENT &&
+   consumer_interp == INTERP_QUALIFIER_NONE &&
+   producer_interp == INTERP_QUALIFIER_SMOOTH)
+  return true;
+
+   return false;
+}
+
 /**
  * Validate the types and qualifiers of an output from one stage against the
  * matching input to another stage.
@@ -329,8 +360,11 @@ cross_validate_types_and_qualifiers(struct 
gl_shader_program *prog,
 * qualifiers of variables of the same name do not match.
 *
 */
-   if (input->data.interpolation != output->data.interpolation &&
-   prog->Version < 440) {
+   if (prog->Version < 440 &&
+   !interpolation_compatible(producer_stage, consumer_stage,
+ 
glsl_interp_qualifier(output->data.interpolation),
+ 
glsl_interp_qualifier(input->data.interpolation),
+ is_gl_identifier(output->name))) {
   linker_error(prog,
"%s shader output `%s' specifies %s "
"interpolation qualifier, "
@@ -1371,8 +1405,7 @@ varying_matches::record(ir_variable *producer_var, 
ir_variable *consumer_var)
   (producer_var->type->contains_integer() ||
producer_var->type->contains_double());
 
-   if (needs_flat_qualifier ||
-   (consumer_stage != -1 && consumer_stage != MESA_SHADER_FRAGMENT)) {
+   if (needs_flat_qualifier) {
   /* Since this varying is not being consumed by the fragment shader, its
* interpolation type varying cannot possibly affect rendering.
* Also, this variable is non-flat and is (or contains) an integer
diff --git a/src/compiler/glsl/link_varyings.h 
b/src/compiler/glsl/link_varyings.h
index 39e9070..6a98c0f 100644
--- a/src/compiler/glsl/link_varyings.h
+++ b/src/compiler/glsl/link_varyings.h
@@ -338,4 +338,11 @@ check_against_input_limit(struct gl_context *ctx,
   gl_shader *consumer,
   unsigned num_explicit_locations);
 
+bool
+interpolation_compatible(gl_shader_stage producer_stage,
+ gl_shader_stage consumer_stage,
+ enum glsl_interp_qualifier producer_interp,
+

[Mesa-dev] [PATCH 10/11] docs: Add extensions not part of any GL or GL ES version

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Based loosely on patches submitted ages ago by Thomas Helland.

Signed-off-by: Ian Romanick 
---
 docs/GL3.txt | 56 
 1 file changed, 56 insertions(+)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 0deeaa1..b0966a2 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -275,5 +275,61 @@ GLES3.2, GLSL ES 3.2:
   GL_OES_texture_stencil8   DONE (all drivers that 
support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array   DONE (all drivers that 
support GL_ARB_texture_multisample)
 
+Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES 
version:
+
+  GL_ARB_bindless_texture   not started
+  GL_ARB_cl_event   not started
+  GL_ARB_compute_variable_group_sizenot started
+  GL_ARB_ES3_2_compatibilitynot started
+  GL_ARB_fragment_shader_interlock  not started
+  GL_ARB_gpu_shader_int64   started (airlied for 
core and Gallium, idr for i965)
+  GL_ARB_indirect_parametersDONE (core only?)
+  GL_ARB_parallel_shader_compilenot started, but 
Chia-I Wu did some related work in 2014
+  GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
radeonsi, softpipe, swr)
+  GL_ARB_post_depth_coveragenot started
+  GL_ARB_robustness_isolation   not started
+  GL_ARB_sample_locations   not started
+  GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
radeonsi, r600, softpipe, swr)
+  GL_ARB_shader_atomic_counter_ops  DONE (some Gallium 
drivers?)
+  GL_ARB_shader_ballot  not started
+  GL_ARB_shader_clock   DONE (i965/gen7+)
+  GL_ARB_shader_draw_parameters DONE (i965, nvc0)
+  GL_ARB_shader_group_vote  started (Ilia for 
nvc0, Matt for i965)
+  GL_ARB_shader_stencil_export  DONE (i965/gen9+, 
radeonsi, softpipe, llvmpipe, swr)
+  GL_ARB_shader_viewport_layer_arraynot started
+  GL_ARB_sparse_buffer  not started
+  GL_ARB_sparse_texture2not started
+  GL_ARB_sparse_texture_clamp   not started
+  GL_ARB_sparse_texture not started
+  GL_ARB_texture_filter_minmax  not started
+  GL_ARB_transform_feedback_overflow_query  not started
+  GL_KHR_blend_equation_advanced_coherent   not started
+  GL_KHR_no_error   not started
+  GL_KHR_texture_compression_astc_hdr   DONE (core only)
+  GL_KHR_texture_compression_astc_sliced_3d not started
+  GL_OES_depth_texture_cube_map DONE (all drivers that 
support GLSL 1.30+)
+  GL_OES_EGL_image  DONE (all drivers)
+  GL_OES_EGL_image_external_essl3   not started
+  GL_OES_required_internalformatnot started - GLES2 
extension based on OpenGL ES 3.0 feature
+  GL_OES_surfaceless_contextDONE (all drivers)
+  GL_OES_texture_compression_astc   DONE (core only)
+  GL_OES_texture_float  DONE (i965)
+  GL_OES_texture_float_linear   DONE (i965)
+  GL_OES_texture_half_float DONE (i965)
+  GL_OES_texture_half_float_linear  DONE (i965)
+  GL_OES_texture_view   not started - based on 
GL_ARB_texture_view
+  GLX_ARB_context_flush_control not started
+  GLX_ARB_robustness_application_isolation  not started
+  GLX_ARB_robustness_share_group_isolation  not started
+
+The following extensions are not part of any OpenGL or OpenGL ES version, and
+we DO NOT WANT implementations of these extensions for Mesa.
+
+  GL_ARB_geometry_shader4                               Superseded by GL 3.2 geometry shaders
+  GL_ARB_matrix_palette                                 Superseded by GL_ARB_vertex_program
+  GL_ARB_shading_language_include                       Not interesting
+  GL_ARB_shadow_ambient                                 Superseded by GL_ARB_fragment_program
+  GL_ARB_vertex_blend                                   Superseded by GL_ARB_vertex_program
+
 More info about these features and the work involved can be found at
 http://dri.freedesktop.org/wiki/MissingFunctionality
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freede

[Mesa-dev] [PATCH 02/11] mesa: If validation fails in a debug context just emit a debug message

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

There are quite a few pipelines that desktop applications (including a
bunch of piglit tests) can expect to have run but that don't meet the GLES
requirements.  Instead of failing validation, just emit a debug message.

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" 
Cc: Gregory Hainaut 
Cc: Ilia Mirkin 
---
 src/mesa/main/pipelineobj.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 5a46cfe..9ecbcc9 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -929,8 +929,21 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
 * application has created a debug context.
 */
    if ((_mesa_is_gles(ctx) || (ctx->Const.ContextFlags & GL_CONTEXT_FLAG_DEBUG_BIT)) &&
-   !_mesa_validate_pipeline_io(pipe))
-  return GL_FALSE;
+   !_mesa_validate_pipeline_io(pipe)) {
+  if (_mesa_is_gles(ctx))
+ return GL_FALSE;
+
+  static GLuint msg_id = 0;
+
+  _mesa_gl_debug(ctx, &msg_id,
+ MESA_DEBUG_SOURCE_API,
+ MESA_DEBUG_TYPE_PORTABILITY,
+ MESA_DEBUG_SEVERITY_MEDIUM,
+ "glValidateProgramPipeline: pipeline %u does not meet "
+ "strict OpenGL ES 3.1 requirements and may not be "
+ "portable across desktop hardware\n",
+ pipe->Name);
+   }
 
pipe->Validated = GL_TRUE;
return GL_TRUE;
-- 
2.5.5
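
The control flow this patch introduces can be summarised with a small
stand-alone sketch. It is only a model of the decision, not Mesa code: the
enum, the function name and the boolean parameters are hypothetical, and
the real function also emits the debug message and sets pipe->Validated.

```c
#include <stdbool.h>

/* Hypothetical outcome of validating a pipeline; these names are
 * illustrative and do not exist in Mesa. */
enum validate_result {
   VALIDATE_OK,            /* pipeline validates silently */
   VALIDATE_OK_WITH_DEBUG, /* validates, but a portability message is emitted */
   VALIDATE_FAIL           /* validation fails */
};

enum validate_result
validate_pipeline_io_policy(bool is_gles, bool is_debug_context, bool io_valid)
{
   /* The strict I/O check is only consulted on GLES or in a debug context. */
   if (!(is_gles || is_debug_context) || io_valid)
      return VALIDATE_OK;

   /* GLES requires the strict rules, so a mismatch is a hard failure. */
   if (is_gles)
      return VALIDATE_FAIL;

   /* Desktop debug context: warn, but still let the pipeline validate. */
   return VALIDATE_OK_WITH_DEBUG;
}
```

Under this model a desktop debug context never fails validation because of
the I/O check alone; it only receives the portability message.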

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/11] docs: Update GL3.txt for OpenGL 4.0 on i965-ish hardware

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 docs/GL3.txt | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index dedea1a..0deeaa1 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -107,11 +107,11 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (swr)
 
 
-GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
+GL 4.0, GLSL 4.00 --- all DONE: i965/gen8+, nvc0, r600, radeonsi
 
-  GL_ARB_draw_buffers_blend                             DONE (i965, nv50, llvmpipe, softpipe, swr)
-  GL_ARB_draw_indirect                                  DONE (i965, llvmpipe, softpipe, swr)
-  GL_ARB_gpu_shader5                                    DONE (i965)
+  GL_ARB_draw_buffers_blend                             DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
+  GL_ARB_draw_indirect                                  DONE (i965/gen7+, llvmpipe, softpipe, swr)
+  GL_ARB_gpu_shader5                                    DONE (i965/gen7+)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
@@ -124,16 +124,16 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE ()
   - New overload resolution rules                       DONE
-  GL_ARB_gpu_shader_fp64                                DONE (i965/gen8+, llvmpipe, softpipe)
-  GL_ARB_sample_shading                                 DONE (i965, nv50)
-  GL_ARB_shader_subroutine                              DONE (i965, nv50, llvmpipe, softpipe, swr)
-  GL_ARB_tessellation_shader                            DONE (i965)
-  GL_ARB_texture_buffer_object_rgb32                    DONE (i965, llvmpipe, softpipe, swr)
-  GL_ARB_texture_cube_map_array                         DONE (i965, nv50, llvmpipe, softpipe)
-  GL_ARB_texture_gather                                 DONE (i965, nv50, llvmpipe, softpipe, swr)
+  GL_ARB_gpu_shader_fp64                                DONE (llvmpipe, softpipe)
+  GL_ARB_sample_shading                                 DONE (i965/gen6+, nv50)
+  GL_ARB_shader_subroutine                              DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
+  GL_ARB_tessellation_shader                            DONE (i965/gen7+)
+  GL_ARB_texture_buffer_object_rgb32                    DONE (i965/gen6+, llvmpipe, softpipe)
+  GL_ARB_texture_cube_map_array                         DONE (i965/gen6+, nv50, llvmpipe, softpipe)
+  GL_ARB_texture_gather                                 DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (i965, nv50, softpipe)
-  GL_ARB_transform_feedback2                            DONE (i965, nv50, llvmpipe, softpipe, swr)
-  GL_ARB_transform_feedback3                            DONE (i965, nv50, llvmpipe, softpipe, swr)
+  GL_ARB_transform_feedback2                            DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
+  GL_ARB_transform_feedback3                            DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
 
 
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
-- 
2.5.5


[Mesa-dev] [PATCH 07/11] mesa: Fix incorrect "see also" comments

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir.h | 2 +-
 src/mesa/main/mtypes.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
index 3629356..cd17f69 100644
--- a/src/compiler/glsl/ir.h
+++ b/src/compiler/glsl/ir.h
@@ -679,7 +679,7 @@ public:
   /**
* Interpolation mode for shader inputs / outputs
*
-   * \sa ir_variable_interpolation
+   * \sa glsl_interp_qualifier
*/
   unsigned interpolation:2;
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 471d41d..88702cb 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2615,7 +2615,7 @@ struct gl_shader_variable
/**
 * Interpolation mode for shader inputs / outputs
 *
-* \sa ir_variable_interpolation
+* \sa glsl_interp_qualifier
 */
unsigned interpolation:2;
 
-- 
2.5.5


[Mesa-dev] [PATCH 01/11] glsl: Always strip arrayness in precision_qualifier_allowed

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

Previously some callers of precision_qualifier_allowed would strip the
arrayness from the type and some would not.  As a result, some places
would not notice that float[6], for example, needed a precision
qualifier.

Fixes the new piglit test no-default-float-array-precision.frag.

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" 
Cc: Gregory Hainaut 
Cc: Ilia Mirkin 
---
 src/compiler/glsl/ast_to_hir.cpp | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index ea32924..7da734c 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2278,10 +2278,10 @@ precision_qualifier_allowed(const glsl_type *type)
 * From this, we infer that GLSL 1.30 (and later) should allow precision
 * qualifiers on sampler types just like float and integer types.
 */
-   return (type->is_float()
-   || type->is_integer()
-   || type->contains_opaque())
-   && !type->without_array()->is_record();
+   const glsl_type *const t = type->without_array();
+
+   return (t->is_float() || t->is_integer() || t->contains_opaque()) &&
+  !t->is_record();
 }
 
 const glsl_type *
@@ -4994,13 +4994,8 @@ ast_declarator_list::hir(exec_list *instructions,
  state->check_precision_qualifiers_allowed(&loc);
   }
 
-
-  /* If a precision qualifier is allowed on a type, it is allowed on
-   * an array of that type.
-   */
-  if (!(this->type->qualifier.precision == ast_precision_none
-  || precision_qualifier_allowed(var->type->without_array()))) {
-
+  if (this->type->qualifier.precision != ast_precision_none &&
+  !precision_qualifier_allowed(var->type)) {
  _mesa_glsl_error(&loc, state,
   "precision qualifiers apply only to floating point"
   ", integer and opaque types");
-- 
2.5.5
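
The idea of stripping arrayness inside the predicate, rather than at each
call site, can be shown with a toy model. This is a sketch in plain C under
stated assumptions: toy_type, its kinds and both functions are hypothetical
stand-ins and are much simpler than Mesa's glsl_type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal model of a GLSL type; not Mesa's glsl_type. */
enum base_kind { KIND_FLOAT, KIND_INT, KIND_SAMPLER, KIND_STRUCT, KIND_ARRAY };

struct toy_type {
   enum base_kind kind;
   const struct toy_type *element; /* non-NULL only for KIND_ARRAY */
};

/* Peel off every array level, e.g. float[6][2] -> float. */
const struct toy_type *
toy_without_array(const struct toy_type *t)
{
   while (t->kind == KIND_ARRAY)
      t = t->element;
   return t;
}

/* Strip arrayness once, up front, so that callers cannot forget to do it. */
bool
toy_precision_qualifier_allowed(const struct toy_type *type)
{
   const struct toy_type *const t = toy_without_array(type);

   return t->kind == KIND_FLOAT || t->kind == KIND_INT ||
          t->kind == KIND_SAMPLER;
}
```

With the stripping done inside the predicate, a float array and a plain
float give the same answer regardless of which caller asks.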


[Mesa-dev] [PATCH 06/11] mesa: Silence unused parameter warning

2016-06-14 Thread Ian Romanick

From: Ian Romanick 

main/pipelineobj.c: In function ‘delete_pipelineobj_cb’:
main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter]
 delete_pipelineobj_cb(GLuint id, void *data, void *userData)
  ^

Signed-off-by: Ian Romanick 
---
 src/mesa/main/pipelineobj.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 9ecbcc9..8483752 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -107,7 +107,7 @@ _mesa_init_pipeline(struct gl_context *ctx)
  * Callback for deleting a pipeline object.  Called by _mesa_HashDeleteAll().
  */
 static void
-delete_pipelineobj_cb(GLuint id, void *data, void *userData)
+delete_pipelineobj_cb(UNUSED GLuint id, void *data, void *userData)
 {
struct gl_pipeline_object *obj = (struct gl_pipeline_object *) data;
struct gl_context *ctx = (struct gl_context *) userData;
-- 
2.5.5


Re: [Mesa-dev] [PATCH] anv: Fix a harmless overflow warning

2016-06-14 Thread Jason Ekstrand

On Jun 14, 2016 4:23 PM, "Chad Versace"  wrote:
>
> anv_pipeline_binding::index is a uint8_t, but some code assigned to it
> UINT16_MAX.
> ---
>  src/intel/vulkan/anv_pipeline.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c
b/src/intel/vulkan/anv_pipeline.c
> index 60b7c6b..b41e11e 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -664,7 +664,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
>   rt_bindings[0] = (struct anv_pipeline_binding) {
>  .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
>  .binding = 0,
> -.index = UINT16_MAX,
> +.index = UINT8_MAX,

I believe we have a descriptive #define specifically for render targets.
Probably better to use that.

>   };
>   num_rts = 1;
>}
> --
> 2.9.0.rc2
>

Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled

2016-06-14 Thread Roland Scheidegger

Am 15.06.2016 um 01:08 schrieb Axel Davy:
> On 15/06/2016 00:21, Roland Scheidegger wrote:
>> Am 14.06.2016 um 23:33 schrieb Axel Davy:
>>> diff --git a/src/gallium/include/pipe/p_state.h
>>> b/src/gallium/include/pipe/p_state.h
>>> index 396f563..7dce80a 100644
>>> --- a/src/gallium/include/pipe/p_state.h
>>> +++ b/src/gallium/include/pipe/p_state.h
>>> @@ -139,6 +139,13 @@ struct pipe_rasterizer_state
>>>  unsigned clip_halfz:1;
>>>/**
>>> +* When true do not scale offset_units and use same rules for
>>> unorm and
>>> +* float depth buffers (D3D9). When false use GL/D3D1X behaviour.
>>> +* This depends on PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.
>>> +*/
>>> +   unsigned offset_units_unscaled;
>>> +
>>> +   /**
>>>   * Enable bits for clipping half-spaces.
>>>   * This applies to both user clip planes and shader clip distances.
>>>   * Note that if the bound shader exports any clip distances, these
>>>
>> I don't like this. Generally, for unorm formats, you can easily enough
>> translate this from d3d9 to gl (or d3d10) rules (but yes, obviously it's
>> going to be format dependent). (With one big caveat, in general not all
>> gl drivers think the minimum resolvable difference is the same, that
>> might range from 2^-22 to 2^-24 for 24bit unorm depth for instance, and
>> I don't think it's quite consistent with gallium drivers either).
>>
>> You are right though for float depth the formula is different, and you
>> can't translate it. But do you really need float depth buffer support?
>> AFAIK no d3d9 app really depends on it, everything can fall back to d24.
>>
>> Roland
>>
> Hi,
> 
> 
> That's true, float depth buffers do not seem to be widely used in d3d9.
> 
> The two float depth buffers available in d3d9, as far as I know, are
> D32F_LOCKABLE and D24FS8.
> 
> We can see the support for those and other depth buffers here (note that
> these are mainly old cards):
> 
> http://zp.amsnet.pl/cdragan/query.php?dxversion=9&feature=formats&featuregroup=selected&adaptergroup=all&featureselected[]=45&featureselected[]=44&featureselected[]=41&featureselected[]=42&featureselected[]=43&featureselected[]=40&featureselected[]=39&featureselected[]=46&resource=SURFACE&usage=DEPTHSTENCIL&orientation=horizontal
> 
> 
> It is likely not a requirement for any game to support these formats.
> 
> 
> We could ignore these formats, and add to gallium a way to get the
> minimum resolvable difference per depth buffer format from drivers. We
> considered this option.
> 
> 
> That said, the driver is the best location to know about the minimum
> resolvable difference, and we made the choice to let the driver do the
> scaling instead of doing it based on some driver query in the state
> tracker.
> 
> As for floating point depth buffer behaviour, I understand it may be
> harder to implement for some drivers than for others.
> 
> That doesn't seem, however, to be a reason to drop floating depth buffer
> support in Gallium Nine. D32F_LOCKABLE is particularly useful for
> debugging: being lockable, it can be used to show depth buffer content
> after some draw calls for d3d on windows, and compare with nine. And some
> apps may use it for some particular effects.
> 
> I'd be ok if we make the float depth buffer part of
> offset_units_unscaled optional given how rarely the combination float
> depth buffers + depth bias must be used. However, if hw can do it, I see
> no reason why we wouldn't support the capability.

On second look, it doesn't really look too bad (and fwiw we actually
could probably put it to use here if we'd support it in llvmpipe).
Albeit,
unsigned offset_units_unscaled;
needs to be
unsigned offset_units_unscaled:1;
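
To illustrate why the :1 matters, here is a minimal sketch comparing the two
declarations. The structs are hypothetical, cut-down stand-ins (only
clip_halfz and offset_units_unscaled come from the patch, the 8-bit field is
made up for the example), and bitfield layout is implementation-defined, so
the size difference is what common ABIs give rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for pipe_rasterizer_state. */
struct rast_full_word {
   unsigned clip_halfz:1;
   unsigned offset_units_unscaled;   /* occupies a whole unsigned */
   unsigned example_flags:8;         /* made-up neighbouring bitfield */
};

struct rast_bitfield {
   unsigned clip_halfz:1;
   unsigned offset_units_unscaled:1; /* packs with the neighbouring flags */
   unsigned example_flags:8;         /* made-up neighbouring bitfield */
};

size_t
rast_full_word_size(void)
{
   return sizeof(struct rast_full_word);
}

size_t
rast_bitfield_size(void)
{
   return sizeof(struct rast_bitfield);
}
```

A full-width member in the middle of a run of bitfields also splits that run
into separate storage units, which is why the struct tends to grow by more
than one word.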

I'm just very sceptical when it comes to capabilities solely for the
benefit of fringe state trackers (and everything not st/mesa counts
here). It usually means driver authors aren't going to bother. And you
probably can't implement it in all drivers yourselves even if the hw
could do it.

That said, I'm ok with this if there are no objections from others.

Roland


Re: [Mesa-dev] [PATCH] anv: Fix a harmless overflow warning

2016-06-14 Thread Anuj Phogat

On Tue, Jun 14, 2016 at 4:22 PM, Chad Versace  wrote:
> anv_pipeline_binding::index is a uint8_t, but some code assigned to it
> UINT16_MAX.
> ---
>  src/intel/vulkan/anv_pipeline.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index 60b7c6b..b41e11e 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -664,7 +664,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
>   rt_bindings[0] = (struct anv_pipeline_binding) {
>  .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
>  .binding = 0,
> -.index = UINT16_MAX,
> +.index = UINT8_MAX,
>   };
>   num_rts = 1;
>}
> --
> 2.9.0.rc2
>

Reviewed-by: Anuj Phogat 

[Mesa-dev] [PATCH] anv: Fix a harmless overflow warning

2016-06-14 Thread Chad Versace

anv_pipeline_binding::index is a uint8_t, but some code assigned to it
UINT16_MAX.
---
 src/intel/vulkan/anv_pipeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 60b7c6b..b41e11e 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -664,7 +664,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
  rt_bindings[0] = (struct anv_pipeline_binding) {
 .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
 .binding = 0,
-.index = UINT16_MAX,
+.index = UINT8_MAX,
  };
  num_rts = 1;
   }
-- 
2.9.0.rc2
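
As a minimal illustration of why the old assignment was harmless at run
time: converting to an unsigned type in C reduces the value modulo 2^N, so
UINT16_MAX stored in a uint8_t ends up as the same 0xFF that UINT8_MAX
yields. The helper name below is hypothetical and exists only for this
sketch.

```c
#include <stdint.h>

/* Conversion to an unsigned type is defined as reduction modulo 2^N,
 * so only the low 8 bits of the 16-bit value survive. */
uint8_t
truncate_to_u8(uint16_t value)
{
   return (uint8_t)value;
}
```

The stored value was therefore already UINT8_MAX before the patch; the
change removes the compiler warning and makes the intent explicit.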


Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled

2016-06-14 Thread Axel Davy


On 15/06/2016 00:21, Roland Scheidegger wrote:

Am 14.06.2016 um 23:33 schrieb Axel Davy:

diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index 396f563..7dce80a 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -139,6 +139,13 @@ struct pipe_rasterizer_state
 unsigned clip_halfz:1;
  
 /**

+* When true do not scale offset_units and use same rules for unorm and
+* float depth buffers (D3D9). When false use GL/D3D1X behaviour.
+* This depends on PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.
+*/
+   unsigned offset_units_unscaled;
+
+   /**
  * Enable bits for clipping half-spaces.
  * This applies to both user clip planes and shader clip distances.
  * Note that if the bound shader exports any clip distances, these


I don't like this. Generally, for unorm formats, you can easily enough
translate this from d3d9 to gl (or d3d10) rules (but yes, obviously it's
going to be format dependent). (With one big caveat, in general not all
gl drivers think the minimum resolvable difference is the same, that
might range from 2^-22 to 2^-24 for 24bit unorm depth for instance, and
I don't think it's quite consistent with gallium drivers either).

You are right though for float depth the formula is different, and you
can't translate it. But do you really need float depth buffer support?
AFAIK no d3d9 app really depends on it, everything can fall back to d24.

Roland


Hi,


That's true, float depth buffers do not seem to be widely used in d3d9.

The two float depth buffers available in d3d9, as far as I know, are 
D32F_LOCKABLE and D24FS8.


We can see the support for those and other depth buffers here (note that 
these are mainly old cards):


http://zp.amsnet.pl/cdragan/query.php?dxversion=9&feature=formats&featuregroup=selected&adaptergroup=all&featureselected%5B%5D=45&featureselected%5B%5D=44&featureselected%5B%5D=41&featureselected%5B%5D=42&featureselected%5B%5D=43&featureselected%5B%5D=40&featureselected%5B%5D=39&featureselected%5B%5D=46&resource=SURFACE&usage=DEPTHSTENCIL&orientation=horizontal


It is likely not a requirement for any game to support these formats.


We could ignore these formats, and add to gallium a way to get the 
minimum resolvable difference per depth buffer format from drivers. We 
considered this option.



That said, the driver is the best location to know about the minimum 
resolvable difference, and we made the choice to let the driver do the 
scaling instead of doing it based on some driver query in the state tracker.


As for floating point depth buffer behaviour, I understand it may be 
harder to implement for some drivers than for others.


That doesn't seem, however, to be a reason to drop floating depth buffer 
support in Gallium Nine. D32F_LOCKABLE is particularly useful for 
debugging: being lockable, it can be used to show depth buffer content 
after some draw calls for d3d on windows, and compare with nine. And some 
apps may use it for some particular effects.


I'd be ok if we make the float depth buffer part of 
offset_units_unscaled optional given how rarely the combination float 
depth buffers + depth bias must be used. However, if hw can do it, I see 
no reason why we wouldn't support the capability.



Yours,


Axel Davy
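
For reference, the state-tracker side of the scaling discussed in this
thread can be sketched as below. It mirrors the expression used in the
st/nine patch of this series, but the function name and parameters are
hypothetical, and r = 2^-23 is an assumption that only fits unorm depth on
drivers that use that value; float depth buffers follow a different formula
and cannot be handled this way.

```c
#include <stdbool.h>

/* d3d9:      offset = scale * dz + bias
 * ogl/d3d11: offset = scale * dz + r * bias
 * where r is the implementation-dependent minimum resolvable difference. */
float
nine_offset_units(float d3d9_depth_bias, bool driver_has_unscaled_cap)
{
   /* Assumption: r = 2^-23, so the bias is pre-multiplied by 1/r = 2^23
    * when the driver cannot take the unscaled value directly. */
   const float inv_r = (float)(1 << 23);

   return d3d9_depth_bias * (driver_has_unscaled_cap ? 1.0f : inv_r);
}
```

When the driver exposes the cap, the d3d9 bias is passed through untouched
and the driver applies it without any format-dependent scaling.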


Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled

2016-06-14 Thread Roland Scheidegger

Am 14.06.2016 um 23:33 schrieb Axel Davy:
> D3D9 has a different behaviour for depth bias.
> 
> For OGL/D3D1X, the depth bias unit is the
> minimal resolvable value for the depth buffer,
> which depends on the format (and has different
> behaviour for float depth buffers).
> 
> For D3D9, the depth bias unit is 1.0f.
> 
> Signed-off-by: Axel Davy 
> ---
>  src/gallium/docs/source/cso/rasterizer.rst   | 6 ++
>  src/gallium/docs/source/screen.rst   | 2 ++
>  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>  src/gallium/drivers/i915/i915_screen.c   | 1 +
>  src/gallium/drivers/ilo/ilo_screen.c | 1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
>  src/gallium/drivers/r300/r300_screen.c   | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
>  src/gallium/drivers/softpipe/sp_screen.c | 1 +
>  src/gallium/drivers/svga/svga_screen.c   | 1 +
>  src/gallium/drivers/swr/swr_screen.cpp   | 1 +
>  src/gallium/drivers/vc4/vc4_screen.c | 1 +
>  src/gallium/drivers/virgl/virgl_screen.c | 1 +
>  src/gallium/include/pipe/p_defines.h | 1 +
>  src/gallium/include/pipe/p_state.h   | 7 +++
>  19 files changed, 31 insertions(+)
> 
> diff --git a/src/gallium/docs/source/cso/rasterizer.rst 
> b/src/gallium/docs/source/cso/rasterizer.rst
> index 8d473b8..616e451 100644
> --- a/src/gallium/docs/source/cso/rasterizer.rst
> +++ b/src/gallium/docs/source/cso/rasterizer.rst
> @@ -127,6 +127,12 @@ offset_tri
>  
>  offset_units
>  Specifies the polygon offset bias
> +offset_units_unscaled
> +Specifies the unit of the polygon offset bias. If false, use the
> +GL/D3D1X behaviour. If true, offset_units is a floating point offset
> +which isn't scaled (D3D9). Note that GL/D3D1X behaviour has different
> +formula whether the depth buffer is unorm or float, which is not
> +the case for D3D9.
>  offset_scale
>  Specifies the polygon offset scale
>  offset_clamp
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index 920da42..9c26604 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -340,6 +340,8 @@ The integer capabilities:
>extension and thus implements proper support for culling planes.
>  * ``PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES``: Whether primitive restart is
>supported for patch primitives.
> +* ``PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED``: If true, the driver implements 
> support
> +  for ``pipe_rasterizer_state::offset_units_unscaled``.
>  
>  
>  .. _pipe_capf:
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index ad15aab..ed61456 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -262,6 +262,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>   case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
>   case PIPE_CAP_CULL_DISTANCE:
>   case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
>   return 0;
>  
>   case PIPE_CAP_MAX_VIEWPORTS:
> diff --git a/src/gallium/drivers/i915/i915_screen.c 
> b/src/gallium/drivers/i915/i915_screen.c
> index c0e06e5..ea451e6 100644
> --- a/src/gallium/drivers/i915/i915_screen.c
> +++ b/src/gallium/drivers/i915/i915_screen.c
> @@ -273,6 +273,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
> cap)
> case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
> case PIPE_CAP_CULL_DISTANCE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> +   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
>return 0;
>  
> case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
> diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
> b/src/gallium/drivers/ilo/ilo_screen.c
> index c847a90..c9b8d81 100644
> --- a/src/gallium/drivers/ilo/ilo_screen.c
> +++ b/src/gallium/drivers/ilo/ilo_screen.c
> @@ -502,6 +502,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
> param)
> case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
> case PIPE_CAP_CULL_DISTANCE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> +   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
>return 0;
>  
> case PIPE_CAP_VENDOR_ID:
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 5fc4427..f9217d6 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -329,6 +329,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
> pipe_cap param)
> case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
>

[Mesa-dev] [PATCH 7/8] r600, r600g: Implement POLYGON_OFFSET_UNITS_UNSCALED

2016-06-14 Thread Axel Davy

Empirical tests show that the polygon offset
behaviour is entirely determined by the content of
the PA_SU_POLY_OFFSET states, and not by the depth buffer
format bound.

PA_SU_POLY_OFFSET seems to directly set the parameters of
the polygon offset formula, and setting 0 for
PA_SU_POLY_OFFSET_DB_FMT_CNTL (i.e. setting the unorm depth
bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled
behaviour.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/r600/evergreen_state.c   | 39 +++-
 src/gallium/drivers/r600/r600_pipe.c |  2 +-
 src/gallium/drivers/r600/r600_pipe.h |  2 ++
 src/gallium/drivers/r600/r600_state.c| 35 +
 src/gallium/drivers/r600/r600_state_common.c |  4 ++-
 5 files changed, 46 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 9346ae9..e041842 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -493,6 +493,7 @@ static void *evergreen_create_rs_state(struct pipe_context 
*ctx,
rs->offset_units = state->offset_units;
rs->offset_scale = state->offset_scale * 16.0f;
rs->offset_enable = state->offset_point || state->offset_line || 
state->offset_tri;
+   rs->offset_units_unscaled = state->offset_units_unscaled;
 
if (state->point_size_per_vertex) {
psize_min = util_get_min_point_size(state);
@@ -1661,24 +1662,26 @@ static void evergreen_emit_polygon_offset(struct 
r600_context *rctx, struct r600
float offset_scale = state->offset_scale;
uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
-   switch (state->zs_format) {
-   case PIPE_FORMAT_Z24X8_UNORM:
-   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-   case PIPE_FORMAT_X8Z24_UNORM:
-   case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-   offset_units *= 2.0f;
-   pa_su_poly_offset_db_fmt_cntl =
-   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
-   break;
-   case PIPE_FORMAT_Z16_UNORM:
-   offset_units *= 4.0f;
-   pa_su_poly_offset_db_fmt_cntl =
-   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
-   break;
-   default:
-   pa_su_poly_offset_db_fmt_cntl =
-   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
-   S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
+   if (!state->offset_units_unscaled) {
+   switch (state->zs_format) {
+   case PIPE_FORMAT_Z24X8_UNORM:
+   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
+   case PIPE_FORMAT_X8Z24_UNORM:
+   case PIPE_FORMAT_S8_UINT_Z24_UNORM:
+   offset_units *= 2.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
+   break;
+   case PIPE_FORMAT_Z16_UNORM:
+   offset_units *= 4.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
+   break;
+   default:
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) 
|
+   S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
+   }
}
 
radeon_set_context_reg_seq(cs, R_028B80_PA_SU_POLY_OFFSET_FRONT_SCALE, 
4);
diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index db4fd1b..d9fffe9 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -282,6 +282,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_SURFACE_REINTERPRET_BLOCKS:
case PIPE_CAP_QUERY_MEMORY_INFO:
case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
+   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
return 1;
 
case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
@@ -368,7 +369,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
-   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
return 0;
 
case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index 9677bb6..0dd538b 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -273,6 +273,7 @@ struct r600_rasterizer_state {
float   offset_units;
float   offset_scale;
booloffset_enable;
+   bool

[Mesa-dev] [PATCH 5/8] radeon: Remove useless pa_su_poly_offset_db_fmt_cntl

2016-06-14 Thread Axel Davy

pa_su_poly_offset_db_fmt_cntl usages were removed in
previous patches.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 8072833..14edeea 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -315,7 +315,6 @@ struct r600_surface {
unsigned db_htile_surface;
unsigned db_htile_data_base;
unsigned db_preload_control;/* EG and later */
-   unsigned pa_su_poly_offset_db_fmt_cntl;
 };
 
 struct r600_common_screen {
-- 
2.8.3


[Mesa-dev] [PATCH 8/8] st/nine: Use offset_units_unscaled

2016-06-14 Thread Axel Davy

offset_units_unscaled enables proper support
for depth bias for gallium nine. Use it
if available.

Solves issues with some games using depth bias.
For example:
https://github.com/iXit/Mesa-3D/issues/220

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c|  1 +
 src/gallium/state_trackers/nine/device9.h|  1 +
 src/gallium/state_trackers/nine/nine_pipe.c  | 18 +-
 src/gallium/state_trackers/nine/nine_pipe.h  |  2 +-
 src/gallium/state_trackers/nine/nine_state.c |  2 +-
 5 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index f510af7..98636fd 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -427,6 +427,7 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->driver_caps.window_space_position_support = 
GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
 This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
 This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
+This->driver_caps.offset_units_unscaled = 
GET_PCAP(POLYGON_OFFSET_UNITS_UNSCALED);
 
 nine_ff_init(This); /* initialize fixed function code */
 
diff --git a/src/gallium/state_trackers/nine/device9.h 
b/src/gallium/state_trackers/nine/device9.h
index 73a43cf..d584a35 100644
--- a/src/gallium/state_trackers/nine/device9.h
+++ b/src/gallium/state_trackers/nine/device9.h
@@ -121,6 +121,7 @@ struct NineDevice9
 boolean window_space_position_support;
 boolean vs_integer;
 boolean ps_integer;
+boolean offset_units_unscaled;
 } driver_caps;
 
 struct {
diff --git a/src/gallium/state_trackers/nine/nine_pipe.c 
b/src/gallium/state_trackers/nine/nine_pipe.c
index e3f9717..ea7dc16 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.c
+++ b/src/gallium/state_trackers/nine/nine_pipe.c
@@ -70,7 +70,9 @@ nine_convert_dsa_state(struct pipe_depth_stencil_alpha_state 
*dsa_state,
 }
 
 void
-nine_convert_rasterizer_state(struct pipe_rasterizer_state *rast_state, const 
DWORD *rs)
+nine_convert_rasterizer_state(struct NineDevice9 *device,
+  struct pipe_rasterizer_state *rast_state,
+  const DWORD *rs)
 {
 struct pipe_rasterizer_state rast;
 
@@ -120,14 +122,12 @@ nine_convert_rasterizer_state(struct 
pipe_rasterizer_state *rast_state, const DW
 /* offset_units has the ogl/d3d11 meaning.
  * d3d9: offset = scale * dz + bias
  * ogl/d3d11: offset = scale * dz + r * bias
- * with r implementation dependant and is supposed to be
- * the smallest value the depth buffer format can hold.
- * In practice on current and past hw it seems to be 2^-23
- * for all formats except float formats where it varies depending
- * on the content.
- * For now use 1 << 23, but in the future perhaps add a way in gallium
- * to get r for the format or get the gallium behaviour */
-rast.offset_units = asfloat(rs[D3DRS_DEPTHBIAS]) * (float)(1 << 23);
+ * with r implementation dependent (+ different formula for float depth
+ * buffers). r=2^-23 is often the right value for gallium drivers.
+ * If possible, use offset_units_unscaled, which gives the d3d9
+ * behaviour, else scale by 1 << 23 */
+rast.offset_units = asfloat(rs[D3DRS_DEPTHBIAS]) * 
(device->driver_caps.offset_units_unscaled ? 1.0f : (float)(1 << 23));
+rast.offset_units_unscaled = device->driver_caps.offset_units_unscaled;
 rast.offset_scale = asfloat(rs[D3DRS_SLOPESCALEDEPTHBIAS]);
  /* rast.offset_clamp = 0.0f; */
 
diff --git a/src/gallium/state_trackers/nine/nine_pipe.h 
b/src/gallium/state_trackers/nine/nine_pipe.h
index 4d2bc92..fe8e910 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.h
+++ b/src/gallium/state_trackers/nine/nine_pipe.h
@@ -38,7 +38,7 @@ extern const enum pipe_format 
nine_d3d9_to_pipe_format_map[120];
 extern const D3DFORMAT nine_pipe_to_d3d9_format_map[PIPE_FORMAT_COUNT];
 
 void nine_convert_dsa_state(struct pipe_depth_stencil_alpha_state *, const 
DWORD *);
-void nine_convert_rasterizer_state(struct pipe_rasterizer_state *, const DWORD 
*);
+void nine_convert_rasterizer_state(struct NineDevice9 *, struct 
pipe_rasterizer_state *, const DWORD *);
 void nine_convert_blend_state(struct pipe_blend_state *, const DWORD *);
 void nine_convert_sampler_state(struct cso_context *, int idx, const DWORD *);
 
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index f0b3d0d..3aa8906 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -74,7 +74,7 @@ prepare_dsa(struct NineDevice9 *device)
 static inline void
 prepare_rasterizer(struct NineDevice9 *device)
 {
-nine_convert_rasterizer_state(&device-

[Mesa-dev] [PATCH 3/8] r600: Emit poly_offset states together

2016-06-14 Thread Axel Davy

Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states.
This will be useful to implement
PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/r600/r600_state.c | 35 ---
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index cf7f0b3..edb1491 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -254,16 +254,24 @@ static void r600_emit_polygon_offset(struct r600_context 
*rctx, struct r600_atom
struct r600_poly_offset_state *state = (struct 
r600_poly_offset_state*)a;
float offset_units = state->offset_units;
float offset_scale = state->offset_scale;
+   uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
switch (state->zs_format) {
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
offset_units *= 2.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
break;
case PIPE_FORMAT_Z16_UNORM:
offset_units *= 4.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
break;
-   default:;
+   default:
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
+   S_028DF8_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
}
 
radeon_set_context_reg_seq(cs, R_028E00_PA_SU_POLY_OFFSET_FRONT_SCALE, 
4);
@@ -271,6 +279,9 @@ static void r600_emit_polygon_offset(struct r600_context 
*rctx, struct r600_atom
radeon_emit(cs, fui(offset_units));
radeon_emit(cs, fui(offset_scale));
radeon_emit(cs, fui(offset_units));
+
+   radeon_set_context_reg(cs, R_028DF8_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
+  pa_su_poly_offset_db_fmt_cntl);
 }
 
 static uint32_t r600_get_blend_control(const struct pipe_blend_state *state, 
unsigned i)
@@ -1059,25 +1070,6 @@ static void r600_init_depth_surface(struct r600_context 
*rctx,
surf->db_depth_size = S_028000_PITCH_TILE_MAX(pitch) | 
S_028000_SLICE_TILE_MAX(slice);
surf->db_prefetch_limit = (rtex->surface.level[level].nblk_y / 8) - 1;
 
-   switch (surf->base.format) {
-   case PIPE_FORMAT_Z24X8_UNORM:
-   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-   surf->pa_su_poly_offset_db_fmt_cntl =
-   S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
-   break;
-   case PIPE_FORMAT_Z32_FLOAT:
-   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-   surf->pa_su_poly_offset_db_fmt_cntl =
-   S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
-   S_028DF8_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-   break;
-   case PIPE_FORMAT_Z16_UNORM:
-   surf->pa_su_poly_offset_db_fmt_cntl =
-   S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
-   break;
-   default:;
-   }
-
/* use htile only for first level */
if (rtex->htile_buffer && !level) {
surf->db_htile_data_base = 0;
@@ -1457,9 +1449,6 @@ static void r600_emit_framebuffer_state(struct 
r600_context *rctx, struct r600_a
   
RADEON_PRIO_DEPTH_BUFFER_MSAA :
   
RADEON_PRIO_DEPTH_BUFFER);
 
-   radeon_set_context_reg(cs, 
R_028DF8_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
-  surf->pa_su_poly_offset_db_fmt_cntl);
-
radeon_set_context_reg_seq(cs, R_028000_DB_DEPTH_SIZE, 2);
radeon_emit(cs, surf->db_depth_size); /* R_028000_DB_DEPTH_SIZE 
*/
radeon_emit(cs, surf->db_depth_view); /* R_028004_DB_DEPTH_VIEW 
*/
-- 
2.8.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/8] radeonsi: Implement POLYGON_OFFSET_UNITS_UNSCALED

2016-06-14 Thread Axel Davy

Empirical tests show that the polygon offset
behaviour is entirely determined by the content of
the PA_SU_POLY_OFFSET states, and not by the depth buffer
format bound.

PA_SU_POLY_OFFSET seems to directly set the parameters of
the polygon offset formula, and setting 0 for
PA_SU_POLY_OFFSET_DB_FMT_CNTL (i.e. setting the unorm depth
bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled
behaviour.
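
As a rough illustration of the observation above, here is a minimal model,
assuming the hardware evaluates offset = scale * dz + 2^NEG_NUM_DB_BITS * units
on the unorm path. The function names and the formula itself are assumptions
inferred from the empirical tests, not documented hardware behaviour:

```c
#include <assert.h>

/* 2^neg_bits for neg_bits <= 0, computed without libm. */
static float
pow2_neg(int neg_bits)
{
    float r = 1.0f;

    assert(neg_bits <= 0);
    for (int i = 0; i > neg_bits; i--)
        r *= 0.5f;
    return r;
}

/* Hypothetical model of the unorm polygon offset path (an assumption):
 *   offset = offset_scale * dz + 2^neg_num_db_bits * offset_units
 * neg_num_db_bits stands for the value programmed into
 * POLY_OFFSET_NEG_NUM_DB_BITS, e.g. -24 for a 24-bit unorm depth buffer,
 * or 0 when the whole FMT_CNTL register is left at 0. */
static float
model_poly_offset(float offset_units, float offset_scale, float dz,
                  int neg_num_db_bits)
{
    return offset_scale * dz + pow2_neg(neg_num_db_bits) * offset_units;
}
```

With neg_num_db_bits = 0 the factor is 2^0 = 1.0f, so offset_units is added
unscaled, which would match the D3D9 rule.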

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/radeonsi/si_pipe.c  |  2 +-
 src/gallium/drivers/radeonsi/si_state.c | 32 ++--
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 99c0349..430cca2 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -343,6 +343,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
+   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
return 1;
 
case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
@@ -400,7 +401,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_BUFFER_OBJECT:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
-   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
return 0;
 
case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 06c65be..82b643a 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -812,20 +812,24 @@ static void *si_create_rs_state(struct pipe_context *ctx,
float offset_scale = state->offset_scale * 16.0f;
uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
-   switch (i) {
-   case 0: /* 16-bit zbuffer */
-   offset_units *= 4.0f;
-   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
-   break;
-   case 1: /* 24-bit zbuffer */
-   offset_units *= 2.0f;
-   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
-   break;
-   case 2: /* 32-bit zbuffer */
-   offset_units *= 1.0f;
-   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
-   
S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-   break;
+   if (!state->offset_units_unscaled) {
+   switch (i) {
+   case 0: /* 16-bit zbuffer */
+   offset_units *= 4.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
+   break;
+   case 1: /* 24-bit zbuffer */
+   offset_units *= 2.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
+   break;
+   case 2: /* 32-bit zbuffer */
+   offset_units *= 1.0f;
+   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
+   
S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
+   break;
+   }
}
 
si_pm4_set_reg(pm4, R_028B80_PA_SU_POLY_OFFSET_FRONT_SCALE,
-- 
2.8.3


[Mesa-dev] [PATCH 4/8] r600g: Emit poly_offset states together

2016-06-14 Thread Axel Davy

Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states.
This will be useful to implement
PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/r600/evergreen_state.c | 36 ++
 1 file changed, 12 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 1ac8914..9346ae9 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1223,27 +1223,6 @@ static void evergreen_init_depth_surface(struct 
r600_context *rctx,
surf->db_depth_slice = S_02805C_SLICE_TILE_MAX(levelinfo->nblk_x *
   levelinfo->nblk_y / 64 - 
1);
 
-   switch (surf->base.format) {
-   case PIPE_FORMAT_Z24X8_UNORM:
-   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-   case PIPE_FORMAT_X8Z24_UNORM:
-   case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-   surf->pa_su_poly_offset_db_fmt_cntl =
-   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
-   break;
-   case PIPE_FORMAT_Z32_FLOAT:
-   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-   surf->pa_su_poly_offset_db_fmt_cntl =
-   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
-   S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-   break;
-   case PIPE_FORMAT_Z16_UNORM:
-   surf->pa_su_poly_offset_db_fmt_cntl =
-   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
-   break;
-   default:;
-   }
-
if (rtex->surface.flags & RADEON_SURF_SBUFFER) {
uint64_t stencil_offset;
unsigned stile_split = rtex->surface.stencil_tile_split;
@@ -1628,8 +1607,6 @@ static void evergreen_emit_framebuffer_state(struct 
r600_context *rctx, struct r
   
RADEON_PRIO_DEPTH_BUFFER_MSAA :
   
RADEON_PRIO_DEPTH_BUFFER);
 
-   radeon_set_context_reg(cs, 
R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
-  zb->pa_su_poly_offset_db_fmt_cntl);
radeon_set_context_reg(cs, R_028008_DB_DEPTH_VIEW, 
zb->db_depth_view);
 
radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 8);
@@ -1682,6 +1659,7 @@ static void evergreen_emit_polygon_offset(struct 
r600_context *rctx, struct r600
struct r600_poly_offset_state *state = (struct 
r600_poly_offset_state*)a;
float offset_units = state->offset_units;
float offset_scale = state->offset_scale;
+   uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
switch (state->zs_format) {
case PIPE_FORMAT_Z24X8_UNORM:
@@ -1689,11 +1667,18 @@ static void evergreen_emit_polygon_offset(struct 
r600_context *rctx, struct r600
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
offset_units *= 2.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
break;
case PIPE_FORMAT_Z16_UNORM:
offset_units *= 4.0f;
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
break;
-   default:;
+   default:
+   pa_su_poly_offset_db_fmt_cntl =
+   S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
+   S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
}
 
radeon_set_context_reg_seq(cs, R_028B80_PA_SU_POLY_OFFSET_FRONT_SCALE, 
4);
@@ -1701,6 +1686,9 @@ static void evergreen_emit_polygon_offset(struct 
r600_context *rctx, struct r600
radeon_emit(cs, fui(offset_units));
radeon_emit(cs, fui(offset_scale));
radeon_emit(cs, fui(offset_units));
+
+   radeon_set_context_reg(cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
+  pa_su_poly_offset_db_fmt_cntl);
 }
 
 static void evergreen_emit_cb_misc_state(struct r600_context *rctx, struct 
r600_atom *atom)
-- 
2.8.3


[Mesa-dev] [PATCH 2/8] radeonsi: Emit poly_offset states together

2016-06-14 Thread Axel Davy

Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with rasterizer poly_offset states.
This will be useful to implement
PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/radeonsi/si_state.c | 31 ---
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 14520ca..06c65be 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -810,16 +810,21 @@ static void *si_create_rs_state(struct pipe_context *ctx,
struct si_pm4_state *pm4 = &rs->pm4_poly_offset[i];
float offset_units = state->offset_units;
float offset_scale = state->offset_scale * 16.0f;
+   uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
switch (i) {
case 0: /* 16-bit zbuffer */
offset_units *= 4.0f;
+   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
break;
case 1: /* 24-bit zbuffer */
offset_units *= 2.0f;
+   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
break;
case 2: /* 32-bit zbuffer */
offset_units *= 1.0f;
+   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
+   
S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
break;
}
 
@@ -831,6 +836,8 @@ static void *si_create_rs_state(struct pipe_context *ctx,
   fui(offset_scale));
si_pm4_set_reg(pm4, R_028B8C_PA_SU_POLY_OFFSET_BACK_OFFSET,
   fui(offset_units));
+   si_pm4_set_reg(pm4, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
+  pa_su_poly_offset_db_fmt_cntl);
}
 
return rs;
@@ -2097,26 +2104,7 @@ static void si_init_depth_surface(struct si_context 
*sctx,
unsigned format;
uint32_t z_info, s_info, db_depth_info;
uint64_t z_offs, s_offs;
-   uint32_t db_htile_data_base, db_htile_surface, 
pa_su_poly_offset_db_fmt_cntl = 0;
-
-   switch (sctx->framebuffer.state.zsbuf->texture->format) {
-   case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-   case PIPE_FORMAT_X8Z24_UNORM:
-   case PIPE_FORMAT_Z24X8_UNORM:
-   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
-   break;
-   case PIPE_FORMAT_Z32_FLOAT:
-   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
-   
S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-   break;
-   case PIPE_FORMAT_Z16_UNORM:
-   pa_su_poly_offset_db_fmt_cntl = 
S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
-   break;
-   default:
-   assert(0);
-   }
+   uint32_t db_htile_data_base, db_htile_surface;
 
format = si_translate_dbformat(rtex->resource.b.b.format);
 
@@ -2213,7 +2201,6 @@ static void si_init_depth_surface(struct si_context *sctx,
surf->db_depth_slice = S_02805C_SLICE_TILE_MAX((levelinfo->nblk_x *
levelinfo->nblk_y) / 64 
- 1);
surf->db_htile_surface = db_htile_surface;
-   surf->pa_su_poly_offset_db_fmt_cntl = pa_su_poly_offset_db_fmt_cntl;
 
surf->depth_initialized = true;
 }
@@ -2514,8 +2501,6 @@ static void si_emit_framebuffer_state(struct si_context 
*sctx, struct r600_atom
radeon_emit(cs, fui(rtex->depth_clear_value)); /* 
R_02802C_DB_DEPTH_CLEAR */
 
radeon_set_context_reg(cs, R_028ABC_DB_HTILE_SURFACE, 
zb->db_htile_surface);
-   radeon_set_context_reg(cs, 
R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
-  zb->pa_su_poly_offset_db_fmt_cntl);
} else if (sctx->framebuffer.dirty_zsbuf) {
radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 2);
radeon_emit(cs, S_028040_FORMAT(V_028040_Z_INVALID)); /* 
R_028040_DB_Z_INFO */
-- 
2.8.3


[Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled

2016-06-14 Thread Axel Davy

D3D9 has a different behaviour for depth bias.

For OGL/D3D1X, the depth bias unit is the
minimal resolvable value for the depth buffer,
which depends on the format (and has different
behaviour for float depth buffers).

For D3D9, the depth bias unit is 1.0f.
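
To make the difference concrete, here is a small sketch of the two
conventions for a unorm depth buffer. The function names are made up for
illustration, and r (the minimal resolvable value) is passed in explicitly
because it is implementation dependent; the float depth buffer case of
GL/D3D1X is not modelled:

```c
#include <assert.h>

/* D3D9: the bias is added as is (unit is 1.0f). */
static float
d3d9_depth_offset(float bias, float scale, float dz)
{
    return scale * dz + bias;
}

/* GL/D3D1X, unorm depth buffer: the bias is counted in units of r,
 * the minimal resolvable value of the depth buffer format. */
static float
gl_depth_offset(float units, float scale, float dz, float r)
{
    return scale * dz + r * units;
}

/* Translate a D3D9 bias into GL/D3D1X units for a given r.
 * This only works for unorm formats, and only if r is known. */
static float
d3d9_bias_to_gl_units(float d3d9_bias, float r)
{
    assert(r > 0.0f);
    return d3d9_bias / r;
}
```

For example, with r = 2^-23 a D3D9 bias of 2^-20 corresponds to 8 GL units.
The translation breaks down as soon as the driver uses a different r, which
is what the new cap is meant to avoid.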

Signed-off-by: Axel Davy 
---
 src/gallium/docs/source/cso/rasterizer.rst   | 6 ++
 src/gallium/docs/source/screen.rst   | 2 ++
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 src/gallium/include/pipe/p_state.h   | 7 +++
 19 files changed, 31 insertions(+)

diff --git a/src/gallium/docs/source/cso/rasterizer.rst 
b/src/gallium/docs/source/cso/rasterizer.rst
index 8d473b8..616e451 100644
--- a/src/gallium/docs/source/cso/rasterizer.rst
+++ b/src/gallium/docs/source/cso/rasterizer.rst
@@ -127,6 +127,12 @@ offset_tri
 
 offset_units
 Specifies the polygon offset bias
+offset_units_unscaled
+Specifies the unit of the polygon offset bias. If false, use the
+GL/D3D1X behaviour. If true, offset_units is a floating point offset
+which isn't scaled (D3D9). Note that GL/D3D1X behaviour has different
+formula whether the depth buffer is unorm or float, which is not
+the case for D3D9.
 offset_scale
 Specifies the polygon offset scale
 offset_clamp
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 920da42..9c26604 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -340,6 +340,8 @@ The integer capabilities:
   extension and thus implements proper support for culling planes.
 * ``PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES``: Whether primitive restart is
   supported for patch primitives.
+* ``PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED``: If true, the driver implements 
support
+  for ``pipe_rasterizer_state::offset_units_unscaled``.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index ad15aab..ed61456 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -262,6 +262,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
+   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index c0e06e5..ea451e6 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -273,6 +273,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
+   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
   return 0;
 
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index c847a90..c9b8d81 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -502,6 +502,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
+   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 5fc4427..f9217d6 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -329,6 +329,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
+   case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/

[Mesa-dev] [PATCH] vc4: fix vc4_resource_from_handle() stride calculation

2016-06-14 Thread Rob Herring

The expected stride calculation is completely wrong. It should
ultimately be multiplying cpp and width rather than dividing. The width
also needs to be aligned to the tiling width first before converting to
stride bytes.
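
A standalone sketch of the corrected calculation follows. The helper names
are hypothetical, and the utile widths in example_utile_width() are
assumptions standing in for vc4_utile_width(), used here only for
illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a multiple of alignment (must be a power of two). */
static uint32_t
align_pot(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical stand-in for vc4_utile_width(): utile width in pixels
 * for a given bytes-per-pixel value. The values are assumptions. */
static uint32_t
example_utile_width(uint32_t cpp)
{
    switch (cpp) {
    case 1:
    case 2:
        return 8;
    case 4:
        return 4;
    case 8:
        return 2;
    default:
        return 1;
    }
}

/* Align the width in pixels first, then convert to bytes. */
static uint32_t
expected_stride_bytes(uint32_t width, uint32_t cpp)
{
    return align_pot(width, example_utile_width(cpp)) * cpp;
}
```

Under these assumptions a 30 pixel wide buffer with cpp = 4 gives a stride
of 128 bytes, whereas the old expression align(30 / 4, 4) would have
yielded 8.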

The whole stride check here is possibly pointless. Any buffers which
were allocated outside of vc4 may have strides with larger alignment
requirements.

Signed-off-by: Rob Herring 
---
 src/gallium/drivers/vc4/vc4_resource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_resource.c 
b/src/gallium/drivers/vc4/vc4_resource.c
index 20f137a..aabe593 100644
--- a/src/gallium/drivers/vc4/vc4_resource.c
+++ b/src/gallium/drivers/vc4/vc4_resource.c
@@ -534,8 +534,8 @@ vc4_resource_from_handle(struct pipe_screen *pscreen,
 struct vc4_resource *rsc = vc4_resource_setup(pscreen, tmpl);
 struct pipe_resource *prsc = &rsc->base.b;
 struct vc4_resource_slice *slice = &rsc->slices[0];
-uint32_t expected_stride = align(prsc->width0 / rsc->cpp,
- vc4_utile_width(rsc->cpp));
+uint32_t expected_stride =
+align(prsc->width0, vc4_utile_width(rsc->cpp)) * rsc->cpp;
 
 if (!rsc)
 return NULL;
-- 
2.7.4


Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks

2016-06-14 Thread Jan Vesely

On Tue, 2016-06-14 at 22:44 +0200, Jakob Sinclair wrote:
> On 2016-06-14 20:39, Jan Vesely wrote:
> > I really disagree here. The conditions check whether swizzle is
> > between
> > X and W (as in, only X,Y,Z,W are allowed). The fact that X maps to
> > 0 is
> > irrelevant. removing the checks impairs readability of the code
> > because
> > the lower bound is now inferred (by being 0) rather than explicit.
> > 
> > the same comment applies to your v2.
> > 
> > Jan
> 
> Thanks for the input. Now when I think about it again this is
> probably a 
> bad change.
> Didn't think about the lower bound. So this patch should probably not
> be 
> pushed.

It'd still be nice to have them fixed.
Those lines produce -Wtype-limits ("comparison always false due to
limited range of data types") warnings, which are rather useful in other
cases, since an always-false comparison can lead to DCE of useful code
with aggressive optimization.

Not sure what the best way to do it is, though.
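
One possible approach, purely as an illustration and not something proposed
in this thread: assuming the swizzle is stored in an unsigned type, the
range check can be written as a single comparison of the distance from X.
Both bounds stay named in the code, and there is no comparison against
zero for the compiler to flag. The enum and function names below are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical swizzle values; X maps to 0 as in the code under discussion. */
enum example_swizzle {
    EX_SWIZZLE_X = 0,
    EX_SWIZZLE_Y,
    EX_SWIZZLE_Z,
    EX_SWIZZLE_W
};

/* True if swz is one of X, Y, Z, W. Unsigned subtraction wraps around,
 * so a value below X would become very large and fail the test. */
static bool
swizzle_in_xyzw(unsigned swz)
{
    return (swz - EX_SWIZZLE_X) <= (unsigned)(EX_SWIZZLE_W - EX_SWIZZLE_X);
}

/* Example user: map a swizzle to its component letter. */
static char
swizzle_name(unsigned swz)
{
    assert(swizzle_in_xyzw(swz));
    return "xyzw"[swz - EX_SWIZZLE_X];
}
```

Whether this reads better than the explicit two-sided check is a matter of
taste.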

Jan


-- 
Jan Vesely 


[Mesa-dev] [PATCH 4/5] radeon/vce: sort cpb by ref list for VAAPI encode

2016-06-14 Thread Boyuan Zhang

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vce.c | 52 +++--
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index 549d999..0ff07eb 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -139,6 +139,48 @@ static void sort_cpb(struct rvce_encoder *enc)
}
 }
 
+/**
+ * sort l0 and l1 based on reference picture list
+ */
+static void sort_cpb_by_ref_list(struct rvce_encoder *enc)
+{
+   struct rvce_cpb_slot *i, *l0 = NULL, *l1 = NULL;
+   struct list_head *current = &enc->cpb_slots;
+
+   for (int j = 0 ; j < 32 ; j++) {
+   if ((enc->pic.ref_pic_list_0[j] == 0x) &&
+   (enc->pic.ref_pic_list_1[j] == 0x))
+   break;
+   LIST_FOR_EACH_ENTRY(i, &enc->cpb_slots, list) {
+   if (i->frame_num == enc->pic.ref_pic_list_0[j])
+   l0 = i;
+
+   if (i->frame_num == enc->pic.ref_pic_list_1[j])
+   l1 = i;
+
+   if (enc->pic.picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_P &&
+   l0)
+   break;
+
+   if (enc->pic.picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_B &&
+   l0 && l1)
+   break;
+   }
+
+   if (l0) {
+   LIST_DEL(&l0->list);
+   LIST_ADD(&l0->list, current);
+   current = current->next;
+   }
+
+   if (l1) {
+   LIST_DEL(&l1->list);
+   LIST_ADD(&l1->list, current);
+   current = current->next;
+   }
+   }
+}
+
 static void get_rate_control_param(struct rvce_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
 {
enc->pic.rc.rc_method = pic->rate_ctrl.rate_ctrl_method;
@@ -444,9 +486,13 @@ static void rvce_begin_frame(struct pipe_video_codec 
*encoder,
if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
reset_cpb(enc);
else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
-pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
-   sort_cpb(enc);
-   
+pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B) {
+   if (pic->has_ref_pic_list)
+   sort_cpb_by_ref_list(enc);
+   else
+   sort_cpb(enc);
+   }
+
if (!enc->stream_handle) {
struct rvid_buffer fb;
enc->stream_handle = rvid_alloc_stream_handle();
-- 
1.9.1


[Mesa-dev] [PATCH 1/5] radeon/vce: add vce structures

2016-06-14 Thread Boyuan Zhang

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vce.h | 297 
 1 file changed, 297 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_vce.h 
b/src/gallium/drivers/radeon/radeon_vce.h
index e438148..da61285 100644
--- a/src/gallium/drivers/radeon/radeon_vce.h
+++ b/src/gallium/drivers/radeon/radeon_vce.h
@@ -65,6 +65,303 @@ struct rvce_cpb_slot {
unsignedpic_order_cnt;
 };
 
+struct rvce_rate_control {
+   uint32_trc_method;
+   uint32_ttarget_bitrate;
+   uint32_tpeak_bitrate;
+   uint32_tframe_rate_num;
+   uint32_tgop_size;
+   uint32_tquant_i_frames;
+   uint32_tquant_p_frames;
+   uint32_tquant_b_frames;
+   uint32_tvbv_buffer_size;
+   uint32_tframe_rate_den;
+   uint32_tvbv_buf_lv;
+   uint32_tmax_au_size;
+   uint32_tqp_initial_mode;
+   uint32_ttarget_bits_picture;
+   uint32_tpeak_bits_picture_integer;
+   uint32_tpeak_bits_picture_fraction;
+   uint32_tmin_qp;
+   uint32_tmax_qp;
+   uint32_tskip_frame_enable;
+   uint32_tfill_data_enable;
+   uint32_tenforce_hrd;
+   uint32_tb_pics_delta_qp;
+   uint32_tref_b_pics_delta_qp;
+   uint32_trc_reinit_disable;
+   uint32_tenc_lcvbr_init_qp_flag;
+   uint32_tlcvbrsatd_based_nonlinear_bit_budget_flag;
+};
+
+struct rvce_motion_estimation {
+   uint32_tenc_ime_decimation_search;
+   uint32_tmotion_est_half_pixel;
+   uint32_tmotion_est_quarter_pixel;
+   uint32_tdisable_favor_pmv_point;
+   uint32_tforce_zero_point_center;
+   uint32_tlsmvert;
+   uint32_tenc_search_range_x;
+   uint32_tenc_search_range_y;
+   uint32_tenc_search1_range_x;
+   uint32_tenc_search1_range_y;
+   uint32_tdisable_16x16_frame1;
+   uint32_tdisable_satd;
+   uint32_tenable_amd;
+   uint32_tenc_disable_sub_mode;
+   uint32_tenc_ime_skip_x;
+   uint32_tenc_ime_skip_y;
+   uint32_tenc_en_ime_overw_dis_subm;
+   uint32_tenc_ime_overw_dis_subm_no;
+   uint32_tenc_ime2_search_range_x;
+   uint32_tenc_ime2_search_range_y;
+   uint32_tparallel_mode_speedup_enable;
+   uint32_tfme0_enc_disable_sub_mode;
+   uint32_tfme1_enc_disable_sub_mode;
+   uint32_time_sw_speedup_enable;
+};
+
+struct rvce_pic_control {
+   uint32_tenc_use_constrained_intra_pred;
+   uint32_tenc_cabac_enable;
+   uint32_tenc_cabac_idc;
+   uint32_tenc_loop_filter_disable;
+   int32_t enc_lf_beta_offset;
+   int32_t enc_lf_alpha_c0_offset;
+   uint32_tenc_crop_left_offset;
+   uint32_tenc_crop_right_offset;
+   uint32_tenc_crop_top_offset;
+   uint32_tenc_crop_bottom_offset;
+   uint32_tenc_num_mbs_per_slice;
+   uint32_tenc_intra_refresh_num_mbs_per_slot;
+   uint32_tenc_force_intra_refresh;
+   uint32_tenc_force_imb_period;
+   uint32_tenc_pic_order_cnt_type;
+   uint32_tlog2_max_pic_order_cnt_lsb_minus4;
+   uint32_tenc_sps_id;
+   uint32_tenc_pps_id;
+   uint32_tenc_constraint_set_flags;
+   uint32_tenc_b_pic_pattern;
+   uint32_tweight_pred_mode_b_picture;
+   uint32_tenc_number_of_reference_frames;
+   uint32_tenc_max_num_ref_frames;
+   uint32_tenc_num_default_active_ref_l0;
+   uint32_tenc_num_default_active_ref_l1;
+   uint32_tenc_slice_mode;
+   uint32_tenc_max_slice_size;
+};
+
+struct rvce_task_info {
+   uint32_toffset_of_next_task_info;
+   uint32_ttask_operation;
+   uint32_treference_picture_dependency;
+   uint32_tcollocate_flag_dependency;
+   uint32_tfeedback_index;
+   uint32_t

Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks

2016-06-14 Thread Jakob Sinclair


On 2016-06-14 20:39, Jan Vesely wrote:

I really disagree here. The conditions check whether swizzle is between
X and W (as in, only X,Y,Z,W are allowed). The fact that X maps to 0 is
irrelevant. removing the checks impairs readability of the code because
the lower bound is now inferred (by being 0) rather than explicit.

the same comment applies to your v2.

Jan


Thanks for the input. Now that I think about it again, this is probably a
bad change. I didn't think about the lower bound, so this patch should
probably not be pushed.


--
Mvh Jakob Sinclair

[Mesa-dev] [PATCH 3/5] radeon/vce: use vce structures for encoding

2016-06-14 Thread Boyuan Zhang

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vce.c| 180 ++-
 src/gallium/drivers/radeon/radeon_vce.h|   2 +-
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 425 +
 src/gallium/drivers/radeon/radeon_vce_50.c | 183 ++-
 src/gallium/drivers/radeon/radeon_vce_52.c | 171 +-
 5 files changed, 603 insertions(+), 358 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index e16e0cf..549d999 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -139,6 +139,176 @@ static void sort_cpb(struct rvce_encoder *enc)
}
 }
 
+static void get_rate_control_param(struct rvce_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
+{
+   enc->pic.rc.rc_method = pic->rate_ctrl.rate_ctrl_method;
+   enc->pic.rc.target_bitrate = pic->rate_ctrl.target_bitrate;
+   enc->pic.rc.peak_bitrate = pic->rate_ctrl.peak_bitrate;
+   enc->pic.rc.quant_i_frames = pic->quant_i_frames;
+   enc->pic.rc.quant_p_frames = pic->quant_p_frames;
+   enc->pic.rc.quant_b_frames = pic->quant_b_frames;
+   enc->pic.rc.gop_size = pic->gop_size;
+   enc->pic.rc.frame_rate_num = pic->rate_ctrl.frame_rate_num;
+   enc->pic.rc.frame_rate_den = pic->rate_ctrl.frame_rate_den;
+   enc->pic.rc.max_qp = 51;
+
+   if (pic->enable_low_level_control == true) {
+   enc->pic.rc.vbv_buffer_size = 2000;
+   if (pic->rate_ctrl.frame_rate_num == 0)
+   enc->pic.rc.frame_rate_num = 30;
+   if (pic->rate_ctrl.frame_rate_den == 0)
+   enc->pic.rc.frame_rate_den = 1;
+   enc->pic.rc.vbv_buf_lv = 48;
+   enc->pic.rc.fill_data_enable = 1;
+   enc->pic.rc.enforce_hrd = 1;
+   enc->pic.rc.target_bits_picture = enc->pic.rc.target_bitrate / 
enc->pic.rc.frame_rate_num;
+   enc->pic.rc.peak_bits_picture_integer = 
enc->pic.rc.peak_bitrate / enc->pic.rc.frame_rate_num;
+   enc->pic.rc.peak_bits_picture_fraction = 0;
+   } else {
+   enc->pic.rc.vbv_buffer_size = pic->rate_ctrl.vbv_buffer_size;
+   enc->pic.rc.vbv_buf_lv = 0;
+   enc->pic.rc.fill_data_enable = 0;
+   enc->pic.rc.enforce_hrd = 0;
+   enc->pic.rc.target_bits_picture = 
pic->rate_ctrl.target_bits_picture;
+   enc->pic.rc.peak_bits_picture_integer = 
pic->rate_ctrl.peak_bits_picture_integer;
+   enc->pic.rc.peak_bits_picture_fraction = 
pic->rate_ctrl.peak_bits_picture_fraction;
+   }
+}
+
+static void get_motion_estimation_param(struct rvce_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
+{
+   if (pic->enable_low_level_control == true) {
+   enc->pic.me.motion_est_quarter_pixel = 0x0001;
+   enc->pic.me.enc_disable_sub_mode = 0x0078;
+   enc->pic.me.lsmvert = 0x0002;
+   enc->pic.me.enc_en_ime_overw_dis_subm = 0x0001;
+   enc->pic.me.enc_ime_overw_dis_subm_no = 0x0001;
+   enc->pic.me.enc_ime2_search_range_x = 0x0004;
+   enc->pic.me.enc_ime2_search_range_y = 0x0004;
+   enc->pic.me.enc_ime_decimation_search = 0x0001;
+   enc->pic.me.motion_est_half_pixel = 0x0001;
+   enc->pic.me.enc_search_range_x = 0x0010;
+   enc->pic.me.enc_search_range_y = 0x0010;
+   enc->pic.me.enc_search1_range_x = 0x0010;
+   enc->pic.me.enc_search1_range_y = 0x0010;
+   } else {
+   enc->pic.me.motion_est_quarter_pixel = 0x;
+   enc->pic.me.enc_disable_sub_mode = 0x00fe;
+   enc->pic.me.lsmvert = 0x;
+   enc->pic.me.enc_en_ime_overw_dis_subm = 0x;
+   enc->pic.me.enc_ime_overw_dis_subm_no = 0x;
+   enc->pic.me.enc_ime2_search_range_x = 0x0001;
+   enc->pic.me.enc_ime2_search_range_y = 0x0001;
+   enc->pic.me.enc_ime_decimation_search = 0x0001;
+   enc->pic.me.motion_est_half_pixel = 0x0001;
+   enc->pic.me.enc_search_range_x = 0x0010;
+   enc->pic.me.enc_search_range_y = 0x0010;
+   enc->pic.me.enc_search1_range_x = 0x0010;
+   enc->pic.me.enc_search1_range_y = 0x0010;
+   }
+}
+
+static void get_pic_control_param(struct rvce_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
+{
+   unsigned encNumMBsPerSlice;
+   encNumMBsPerSlice = align(enc->base.width, 16) / 16;
+   encNumMBsPerSlice *= align(enc->base.height, 16) / 16;
+   enc->pic.pc.enc_crop_right_offset = (align(enc->base.width, 16) - 
enc->base.width) >> 1;
+   enc->pic.pc.enc_crop_bottom_offset = (align(enc->base.height, 16) - 
enc
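
The macroblock and crop arithmetic at the start of get_pic_control_param can be checked by hand. What follows is a minimal sketch, not the driver code: align_up is a hypothetical stand-in for Mesa's align(), and the two helper functions are names invented for this illustration. The halving of the padding presumably reflects that H.264 expresses crop offsets in units of two luma samples for 4:2:0 content; that reading is an assumption, the patch itself does not say so.

```c
/* Hypothetical stand-in for Mesa's align(): round value up to a
 * multiple of alignment, which must be a power of two. */
static unsigned align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Number of 16x16 macroblocks needed to cover a width x height frame
 * (illustrative helper, mirrors the encNumMBsPerSlice computation). */
static unsigned num_macroblocks(unsigned width, unsigned height)
{
   return (align_up(width, 16) / 16) * (align_up(height, 16) / 16);
}

/* Crop offset as the patch computes it: the padding added by the
 * alignment, shifted right by one (illustrative helper). */
static unsigned crop_offset(unsigned size)
{
   return (align_up(size, 16) - size) >> 1;
}
```

For a 1920x1080 frame the height is padded to 1088, so the frame needs 120 * 68 = 8160 macroblocks, the bottom crop offset is (1088 - 1080) >> 1 = 4, and the right crop offset is 0.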

[Mesa-dev] [PATCH 2/5] vl: add parameters for VAAPI encode

2016-06-14 Thread Boyuan Zhang

Signed-off-by: Boyuan Zhang 
---
 src/gallium/include/pipe/p_video_state.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index d353be6..d519d17 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -131,6 +131,7 @@ enum pipe_h264_enc_rate_control_method
 struct pipe_picture_desc
 {
enum pipe_video_profile profile;
+   enum pipe_video_entrypoint entry_point;
 };
 
 struct pipe_quant_matrix
@@ -369,11 +370,23 @@ struct pipe_h264_enc_picture_desc
 
enum pipe_h264_enc_picture_type picture_type;
unsigned frame_num;
+   unsigned frame_num_cnt;
+   unsigned p_remain;
+   unsigned i_remain;
+   unsigned idr_pic_id;
+   unsigned gop_cnt;
unsigned pic_order_cnt;
unsigned ref_idx_l0;
unsigned ref_idx_l1;
+   unsigned gop_size;
 
bool not_referenced;
+   bool is_idr;
+   bool has_ref_pic_list;
+   bool enable_low_level_control;
+   unsigned int ref_pic_list_0[32];
+   unsigned int ref_pic_list_1[32];
+   unsigned int frame_idx[32];
 };
 
 struct pipe_h265_sps
-- 
1.9.1


[Mesa-dev] [PATCH 5/5] st/va: enable h264 VAAPI encode

2016-06-14 Thread Boyuan Zhang

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/buffer.c |   6 ++
 src/gallium/state_trackers/va/config.c | 104 +++---
 src/gallium/state_trackers/va/context.c|  72 -
 src/gallium/state_trackers/va/image.c  | 126 +++---
 src/gallium/state_trackers/va/picture.c| 165 -
 src/gallium/state_trackers/va/surface.c|  16 ++-
 src/gallium/state_trackers/va/va_private.h |   9 ++
 7 files changed, 441 insertions(+), 57 deletions(-)

diff --git a/src/gallium/state_trackers/va/buffer.c 
b/src/gallium/state_trackers/va/buffer.c
index 7d3167b..dfcebbe 100644
--- a/src/gallium/state_trackers/va/buffer.c
+++ b/src/gallium/state_trackers/va/buffer.c
@@ -133,6 +133,12 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, 
void **pbuff)
   if (!buf->derived_surface.transfer || !*pbuff)
  return VA_STATUS_ERROR_INVALID_BUFFER;
 
+  if (buf->type == VAEncCodedBufferType) {
+ ((VACodedBufferSegment*)buf->data)->buf = *pbuff;
+ ((VACodedBufferSegment*)buf->data)->size = buf->coded_size;
+ ((VACodedBufferSegment*)buf->data)->next = NULL;
+ *pbuff = buf->data;
+  }
} else {
   pipe_mutex_unlock(drv->mutex);
   *pbuff = buf->data;
diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 9ca0aa8..04d214d 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -34,6 +34,8 @@
 
 #include "va_private.h"
 
+#include "util/u_handle_table.h"
+
 DEBUG_GET_ONCE_BOOL_OPTION(mpeg4, "VAAPI_MPEG4_ENABLED", false)
 
 VAStatus
@@ -72,6 +74,7 @@ vlVaQueryConfigEntrypoints(VADriverContextP ctx, VAProfile 
profile,
 {
struct pipe_screen *pscreen;
enum pipe_video_profile p;
+   int va_status = VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
@@ -88,12 +91,18 @@ vlVaQueryConfigEntrypoints(VADriverContextP ctx, VAProfile 
profile,
   return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
 
pscreen = VL_VA_PSCREEN(ctx);
-   if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, 
PIPE_VIDEO_CAP_SUPPORTED))
-  return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
-
-   entrypoint_list[(*num_entrypoints)++] = VAEntrypointVLD;
+   if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, 
PIPE_VIDEO_CAP_SUPPORTED)) {
+  entrypoint_list[(*num_entrypoints)++] = VAEntrypointVLD;
+  va_status = VA_STATUS_SUCCESS;
+   }
+   if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_ENCODE, 
PIPE_VIDEO_CAP_SUPPORTED) &&
+   p == PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE) {
+  entrypoint_list[(*num_entrypoints)++] = VAEntrypointEncSlice;
+  entrypoint_list[(*num_entrypoints)++] = VAEntrypointEncPicture;
+  va_status = VA_STATUS_SUCCESS;
+   }
 
-   return VA_STATUS_SUCCESS;
+   return va_status;
 }
 
 VAStatus
@@ -112,7 +121,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
profile, VAEntrypoint en
  value = VA_RT_FORMAT_YUV420;
  break;
   case VAConfigAttribRateControl:
- value = VA_RC_NONE;
+ value = VA_RC_CQP | VA_RC_CBR;
  break;
   default:
  value = VA_ATTRIB_NOT_SUPPORTED;
@@ -128,14 +137,27 @@ VAStatus
 vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, VAEntrypoint 
entrypoint,
  VAConfigAttrib *attrib_list, int num_attribs, VAConfigID 
*config_id)
 {
+   vlVaDriver *drv;
+   vlVaConfig *config;
struct pipe_screen *pscreen;
enum pipe_video_profile p;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
 
+   drv = VL_VA_DRIVER(ctx);
+
+   if (!drv)
+  return VA_STATUS_ERROR_INVALID_CONTEXT;
+
+   config = CALLOC(1, sizeof(vlVaConfig));
+   if (!config)
+  return VA_STATUS_ERROR_ALLOCATION_FAILED;
+
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
-  *config_id = PIPE_VIDEO_PROFILE_UNKNOWN;
+  config->entrypoint = VAEntrypointVideoProc;
+  config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
+  *config_id = handle_table_add(drv->htab, config);
   return VA_STATUS_SUCCESS;
}
 
@@ -144,13 +166,36 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
   return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
 
pscreen = VL_VA_PSCREEN(ctx);
-   if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, 
PIPE_VIDEO_CAP_SUPPORTED))
-  return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
-
-   if (entrypoint != VAEntrypointVLD)
+   if (entrypoint == VAEntrypointVLD) {
+  if (!pscreen->get_video_param(pscreen, p, 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED))
+ return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
+   }
+   else if (entrypoint == VAEntrypointEncSlice) {
+  if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_ENCODE, 
PIPE_VIDEO_CAP_SUPPORTED))
+ retu
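
The change to vlVaQueryConfigEntrypoints replaces an early return with a status that starts out as "unsupported" and flips to success as soon as any entrypoint is found. A minimal sketch of that pattern, assuming nothing about the real VA-API types: every enum, function and parameter name below is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical status codes and entrypoints, not the VA-API ones. */
enum query_status { QUERY_OK = 0, QUERY_UNSUPPORTED_PROFILE = 1 };
enum entrypoint { EP_DECODE, EP_ENCODE_SLICE, EP_ENCODE_PICTURE };

/* Fill list with every supported entrypoint. The status starts out as
 * "unsupported" and becomes success once at least one entry is added. */
static enum query_status
list_entrypoints(bool can_decode, bool can_encode,
                 enum entrypoint *list, int *count)
{
   enum query_status status = QUERY_UNSUPPORTED_PROFILE;

   *count = 0;
   if (can_decode) {
      list[(*count)++] = EP_DECODE;
      status = QUERY_OK;
   }
   if (can_encode) {
      list[(*count)++] = EP_ENCODE_SLICE;
      list[(*count)++] = EP_ENCODE_PICTURE;
      status = QUERY_OK;
   }
   return status;
}
```

With this shape a profile that supports only encode no longer fails just because decode is unavailable, which the old early return would have caused.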

Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2

2016-06-14 Thread Eric Anholt

Rob Clark  writes:

> I (and I expect Eric too) would appreciate it if you went ahead and
> replaced the current use of non-"z" versions in code that you can't
> test w/ the "z" versions.  That way we can switch over to non-zero'ing
> on our own time, rather than getting a surprise next time we
> pull/rebase
>
> I think it's only a couple spots in freedreno, and pre-emptive r-b for
> that change ;-)

I've checked vc4, and all the calls should be fine already.



[Mesa-dev] [Bug 96517] [llvmpipe] piglit arb_uniform_buffer_object-rendering-dsa regression

2016-06-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=96517

Roland Scheidegger  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Roland Scheidegger  ---
Fixed by f4184d5450c12e107d3e41ae29e5927c75543259.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH] i965/fs: Fix regs_written for SIMD-lowered instructions some more.

2016-06-14 Thread Francisco Jerez

Iago Toral  writes:

> On Fri, 2016-06-10 at 22:39 -0700, Francisco Jerez wrote:
>> ISTR having suggested this during review of the recent FP64 changes to
>> the SIMD lowering pass, but it doesn't look like it was taken into
>> account in the end.  Using the fs_reg::component_size helper instead
>> of this open-coded variant makes sure that the stride is taken into
>> account correctly.  Fixes at least the following piglit tests with
>> spilling forced on (since otherwise regs_written would be calculated
>> incorrectly and the spilling code would be rather confused about how
>> much data needs to be spilled):
>
> Yes, you had suggested it but we forgot about it until a few days ago
> when I was tracking down a similar bug and came up with this same patch.
> I was about to send it for review together with other fixes for BSW this
> week, sorry if that caused you trouble...
>
No worries, it was no trouble at all.

> Iago
>
>>  spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
>>  spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader
>> 
>> Cc: 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 104c20b..0347b0a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -5261,9 +5261,9 @@ fs_visitor::lower_simd_width()
>> split_inst.src[j] = emit_unzip(lbld, block, inst, j);
>>  
>>  split_inst.dst = emit_zip(lbld, block, inst);
>> -split_inst.regs_written =
>> -   DIV_ROUND_UP(type_sz(inst->dst.type) * dst_size * 
>> lower_width,
>> -REG_SIZE);
>> +split_inst.regs_written = DIV_ROUND_UP(
>> +   split_inst.dst.component_size(lower_width) * dst_size,
>> +   REG_SIZE);
>>  
>>  lbld.emit(split_inst);
>>   }
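
To see why the stride matters for regs_written, here is a minimal sketch of the two calculations. It is a simplification, not the i965 code: REG_SIZE is assumed to be 32 bytes, the stride is passed in as a plain integer, and component_size below only imitates what fs_reg::component_size is described as doing in the commit message.

```c
/* Integer division rounding up. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Bytes per register; an assumption made for this sketch. */
#define REG_SIZE 32

/* Open-coded variant from the old code: assumes tightly packed data. */
static unsigned regs_written_packed(unsigned type_size, unsigned dst_size,
                                    unsigned width)
{
   return DIV_ROUND_UP(type_size * dst_size * width, REG_SIZE);
}

/* Bytes spanned by one component when elements are stride apart.
 * A stride of 0 (scalar broadcast) still occupies one element. */
static unsigned component_size(unsigned type_size, unsigned stride,
                               unsigned width)
{
   unsigned elems = width * stride;
   return (elems > 0 ? elems : 1) * type_size;
}

/* Stride-aware variant, in the spirit of the patch. */
static unsigned regs_written_strided(unsigned type_size, unsigned stride,
                                     unsigned dst_size, unsigned width)
{
   return DIV_ROUND_UP(component_size(type_size, stride, width) * dst_size,
                       REG_SIZE);
}
```

For a 4-byte type at width 8 with stride 2, the packed estimate is 1 register while the strided one is 2, which is the kind of undercount that would confuse the spilling code.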



Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks

2016-06-14 Thread Jan Vesely

On Tue, 2016-06-14 at 20:19 +0200, Jakob Sinclair wrote:
> On 2016-06-13 12:02, Nicolai Hähnle wrote:
> > 
> > Meh. This is the kind of thing where Coverity should perhaps just
> > shut 
> > up :/
> 
> I do agree with you that Coverity should perhaps shut up about this 
> kinda thing
> but I couldn't see a reason to have these checks in the code. They 
> really didn't
> contribute to my understanding of the code. 

I really disagree here. The conditions check whether swizzle is between
X and W (as in, only X,Y,Z,W are allowed). The fact that X maps to 0 is
irrelevant. Removing the checks impairs readability of the code because
the lower bound is now inferred (by being 0) rather than explicit.

The same comment applies to your v2.

Jan

> Although I may be missing 
> something
> important here.
> 
> > Anyway...
> > I think for consistency, you should also remove the '-
> > PIPE_SWIZZLE_X'
> > here, similar to the first hunk. With that changed,
> 
> Forgot about that one. I agree with this change.
> 
> > Reviewed-by: Nicolai Hähnle 
> 
> Thanks!
-- 
Jan Vesely 


[Mesa-dev] [PATCH] V2 radeon: remove unnecessary checks

2016-06-14 Thread Jakob Sinclair

PIPE_SWIZZLE_X is always 0 and desc->swizzle is an unsigned char meaning
that desc->swizzle can never be smaller than PIPE_SWIZZLE_X. Removing
these checks doesn't change the code path at all because they would
always give the same result. Issue discovered by Coverity.

V2: Removed "- PIPE_SWIZZLE_X" for more consistency.

CID: 1337954

Signed-off-by: Jakob Sinclair 
---
 src/gallium/drivers/radeon/r600_texture.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 32347f2..3a56f9f 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1754,10 +1754,9 @@ static void vi_get_fast_clear_parameters(enum 
pipe_format surface_format,
return;
 
for (i = 0; i < 4; ++i) {
-   int index = desc->swizzle[i] - PIPE_SWIZZLE_X;
+   int index = desc->swizzle[i];
 
-   if (desc->swizzle[i] < PIPE_SWIZZLE_X ||
-   desc->swizzle[i] > PIPE_SWIZZLE_W)
+   if (desc->swizzle[i] > PIPE_SWIZZLE_W)
continue;
 
if (util_format_is_pure_sint(surface_format)) {
@@ -1782,8 +1781,7 @@ static void vi_get_fast_clear_parameters(enum pipe_format 
surface_format,
 
for (int i = 0; i < 4; ++i)
if (values[i] != main_value &&
-   desc->swizzle[i] - PIPE_SWIZZLE_X != extra_channel &&
-   desc->swizzle[i] >= PIPE_SWIZZLE_X &&
+   desc->swizzle[i] != extra_channel &&
desc->swizzle[i] <= PIPE_SWIZZLE_W)
return;
 
-- 
2.8.3
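
The claim that the removed comparison never changes the result can be checked exhaustively, since the swizzle is an unsigned char. A minimal sketch, assuming the X swizzle is 0 as the commit message states; the SWZ_* enum is a hypothetical mirror of the PIPE_SWIZZLE_* values, not the real header.

```c
/* Hypothetical mirror of the swizzle values; X is 0 as in the patch. */
enum { SWZ_X = 0, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* Range check with the explicit lower bound (the original code).
 * Compilers may warn that the first comparison is always true. */
static int is_channel_explicit(unsigned char swizzle)
{
   return swizzle >= SWZ_X && swizzle <= SWZ_W;
}

/* Range check with the lower bound left implicit (the patched code). */
static int is_channel_short(unsigned char swizzle)
{
   return swizzle <= SWZ_W;
}
```

Both functions agree for all 256 possible inputs, so the disagreement in the thread is about readability, not behaviour.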


Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks

2016-06-14 Thread Jakob Sinclair


On 2016-06-13 12:02, Nicolai Hähnle wrote:
> Meh. This is the kind of thing where Coverity should perhaps just shut
> up :/

I do agree with you that Coverity should perhaps shut up about this
kinda thing but I couldn't see a reason to have these checks in the
code. They really didn't contribute to my understanding of the code.
Although I may be missing something important here.

> Anyway...
> I think for consistency, you should also remove the '- PIPE_SWIZZLE_X'
> here, similar to the first hunk. With that changed,

Forgot about that one. I agree with this change.

> Reviewed-by: Nicolai Hähnle

Thanks!
--
Mvh Jakob Sinclair

Re: [Mesa-dev] [PATCH 3/3] radeonsi: fix undefined left-shift into sign bit

2016-06-14 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Tue, Jun 14, 2016 at 4:37 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/gallium/drivers/radeonsi/cik_sdma.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/cik_sdma.c 
> b/src/gallium/drivers/radeonsi/cik_sdma.c
> index d8ec2a3..a36bbce 100644
> --- a/src/gallium/drivers/radeonsi/cik_sdma.c
> +++ b/src/gallium/drivers/radeonsi/cik_sdma.c
> @@ -370,12 +370,13 @@ static bool cik_sdma_copy_texture(struct si_context 
> *sctx,
> copy_height <= (1 << 14) &&
> copy_depth <= (1 << 11)) {
> struct radeon_winsys_cs *cs = sctx->b.dma.cs;
> +   uint32_t direction = linear == rdst ? 1u << 31 : 0;
>
> r600_need_dma_space(&sctx->b, 14, &rdst->resource, 
> &rsrc->resource);
>
> radeon_emit(cs, CIK_SDMA_PACKET(CIK_SDMA_OPCODE_COPY,
> 
> CIK_SDMA_COPY_SUB_OPCODE_TILED_SUB_WINDOW, 0) |
> -   ((linear == rdst) << 31));
> +   direction);
> radeon_emit(cs, tiled_address);
> radeon_emit(cs, tiled_address >> 32);
> radeon_emit(cs, tiled_x | (tiled_y << 16));
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
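
The undefined behaviour being fixed comes from integer promotion: a comparison result has type int, so shifting it left by 31 moves a 1 into the sign bit of a 32-bit int. A minimal sketch of the safe form, assuming a 32-bit int; direction_bit is a hypothetical name used only here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return bit 31 set when the linear surface is the destination.
 * Writing (linear_is_dst << 31) would shift a signed int into its
 * sign bit, which is undefined; shifting an unsigned 1 is well defined. */
static uint32_t direction_bit(bool linear_is_dst)
{
   return linear_is_dst ? UINT32_C(1) << 31 : 0;
}
```

The result can then be OR'ed into the packet header without any signed arithmetic taking part.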

Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean

2016-06-14 Thread Marek Olšák

On Tue, Jun 14, 2016 at 6:24 PM, Ilia Mirkin  wrote:
> On Tue, Jun 14, 2016 at 12:19 PM, Nicolai Hähnle  wrote:
>> On 14.06.2016 17:57, Rob Clark wrote:
>>>
>>> From: Rob Clark 
>>>
>>> s/bool/boolean/ to make it match the other APIs.
>>
>>
>> Please no. C has finally grown a proper bool type, we should use it where
>> possible. If anything, make the patch go in the other direction.
>
> FWIW I've eradicated boolean from nouveau except for the gallium API
> interfaces. Would definitely be in favor of flipping those to bool.

I'm in favor of bool too and would like to see boolean/TRUE/FALSE
disappear from radeon-specific code at least.

Marek
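
One concrete reason to prefer the C99 bool over a typedef'd boolean is how conversion works. A minimal sketch, assuming the legacy type is a plain unsigned char typedef (legacy_boolean below is a hypothetical name for it):

```c
#include <stdbool.h>

/* Assumed shape of a pre-C99 boolean typedef; hypothetical name. */
typedef unsigned char legacy_boolean;

/* Converting to unsigned char keeps only the low 8 bits, so a flag
 * at bit 8 or above silently becomes 0. */
static legacy_boolean flag_as_legacy(unsigned flags)
{
   return flags & 0x100;
}

/* Converting to bool yields 1 for any non-zero value. */
static bool flag_as_bool(unsigned flags)
{
   return flags & 0x100;
}
```

With flags equal to 0x100 the first function returns 0 and the second returns true, so the two types are not interchangeable without care.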

Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Rob Clark

On Tue, Jun 14, 2016 at 12:30 PM, Nicolai Hähnle  wrote:
> On 14.06.2016 18:02, Ilia Mirkin wrote:
>>
>> Can you explain the motivation behind this change? I'm adding a
>> ->set_window_rectangles thing which also takes multiple parameters.
>> What's the advantage of stuffing things into a struct first?
>
>
> FWIW, I tend to be mildly supportive of changes like this. At least, the
> other extreme where functions grow multiple bool or int parameters over time
> is much worse. But in this particular case, changing this around might be
> too eager.

I'd have to think about how it would work to deal w/ variants that
have params not wrapped in a struct.  It at least sounds annoying, and
I tended to think the benefits of using a struct were enough of a
justification to change this.  (Plus there are not many usages of this
API yet, so seemed like the perfect time to cleanup.)

> Perhaps teaching the script to deal with slightly more complicated cases
> will help elsewhere, too.

*maybe*, but I can't think of anything..  right now it is only the
sampler_view and stream_output_target state that I handle "manually"..
but those are also kind of different from the rest since they are
already refcnt'd.  And I figured it was easier to just deal w/ those
manually than implement a 3rd type of state (CSO vs Param) in
rsq_state.py..

BR,
-R
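
For readers following the argument, this is roughly what the struct-based entry point looks like next to the two-array version. It is only a sketch under assumptions: the struct and context types are hypothetical, and the member names follow the ddebug hunk of the patch, since the p_state.h hunk is not visible in this thread.

```c
#include <string.h>

/* Hypothetical state struct wrapping the two former parameters. */
struct tess_state_sketch {
   float default_outer_level[4];
   float default_inner_level[2];
};

/* Hypothetical driver context storing all six levels contiguously. */
struct ctx_sketch {
   float tess_default_levels[6];
};

/* One pointer replaces the two array parameters. */
static void ctx_set_tess_state(struct ctx_sketch *ctx,
                               const struct tess_state_sketch *state)
{
   memcpy(ctx->tess_default_levels, state->default_outer_level,
          sizeof(state->default_outer_level));
   memcpy(ctx->tess_default_levels + 4, state->default_inner_level,
          sizeof(state->default_inner_level));
}
```

Every setter then has the same (context, const state pointer) shape, which is what makes generating them from a script straightforward.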

> Nicolai
>
>>
>>-ilia
>>
>> On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark  wrote:
>>>
>>> From: Rob Clark 
>>>
>>> The rest of the state APIs take state structs, rather than inline
>>> parameters (with the exception of a couple which just amount to a single
>>> uint).
>>>
>>> This makes the API more regular and simplifies autogeneration of the
>>> gallium state related APIs.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>>   src/gallium/drivers/ddebug/dd_context.c   |  9 -
>>>   src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  7 +++
>>>   src/gallium/drivers/r600/evergreen_state.c|  7 +++
>>>   src/gallium/drivers/radeonsi/si_state.c   |  7 +++
>>>   src/gallium/drivers/trace/tr_context.c|  9 -
>>>   src/gallium/include/pipe/p_context.h  |  4 ++--
>>>   src/gallium/include/pipe/p_state.h|  8 
>>>   src/mesa/state_tracker/st_atom_tess.c | 13 ++---
>>>   8 files changed, 37 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/ddebug/dd_context.c
>>> b/src/gallium/drivers/ddebug/dd_context.c
>>> index 0f8ef18..06b7c91 100644
>>> --- a/src/gallium/drivers/ddebug/dd_context.c
>>> +++ b/src/gallium/drivers/ddebug/dd_context.c
>>> @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context
>>> *_pipe,
>>>   }
>>>
>>>   static void dd_context_set_tess_state(struct pipe_context *_pipe,
>>> -  const float
>>> default_outer_level[4],
>>> -  const float
>>> default_inner_level[2])
>>> +  const struct pipe_tess_state
>>> *state)
>>>   {
>>>  struct dd_context *dctx = dd_context(_pipe);
>>>  struct pipe_context *pipe = dctx->pipe;
>>>
>>> -   memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float)
>>> * 4);
>>> -   memcpy(dctx->tess_default_levels+4, default_inner_level,
>>> sizeof(float) * 2);
>>> -   pipe->set_tess_state(pipe, default_outer_level, default_inner_level);
>>> +   memcpy(dctx->tess_default_levels, state->default_outer_level,
>>> sizeof(float) * 4);
>>> +   memcpy(dctx->tess_default_levels+4, state->default_inner_level,
>>> sizeof(float) * 2);
>>> +   pipe->set_tess_state(pipe, state);
>>>   }
>>>
>>>
>>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>>> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>>> index 92161ec..a9c1830 100644
>>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>>> @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context
>>> *pipe,
>>>
>>>   static void
>>>   nvc0_set_tess_state(struct pipe_context *pipe,
>>> -const float default_tess_outer[4],
>>> -const float default_tess_inner[2])
>>> +const struct pipe_tess_state *state)
>>>   {
>>>  struct nvc0_context *nvc0 = nvc0_context(pipe);
>>>
>>> -   memcpy(nvc0->default_tess_outer, default_tess_outer, 4 *
>>> sizeof(float));
>>> -   memcpy(nvc0->default_tess_inner, default_tess_inner, 2 *
>>> sizeof(float));
>>> +   memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 *
>>> sizeof(float));
>>> +   memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 *
>>> sizeof(float));
>>>  nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR;
>>>   }
>>>
>>> diff --git a/src/gallium/drivers/r600/evergreen_state.c
>>> b/src/gallium/drivers/r600/evergreen_state.c
>>> index 1ac8914..2a424f5 100644
>>> --- a/src/gallium/drivers/r600/evergreen_state.c
>>> +++ b/src/gallium/drivers/r600/evergreen_state.c
>>> @@ -3569,13 +3569,12

Re: [Mesa-dev] [PATCH v2] swr: automake: don't ship LLVM version specific generated sources

2016-06-14 Thread Emil Velikov

On 14 June 2016 at 18:06, Rowley, Timothy O  wrote:
>
>> On Jun 13, 2016, at 8:03 PM, Rowley, Timothy O  
>> wrote:
>>
>> A clean tree build works with this version, but distcheck fails:
>>
>> ...
>> rm -f config.status config.cache config.log configure.lineno 
>> config.status.lineno
>> rm -f Makefile
>> ERROR: files left in build directory after distclean:
>> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.cpp
>> ./src/gallium/drivers/swr/rasterizer/jitter/builder_x86.cpp
>> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.h
>> make[1]: *** [distcleancheck] Error 1
>> make[1]: Leaving directory 
>> `/home/torowley/work/mesa-opt/mesa-12.1.0-devel/_build'
>> make: *** [distcheck] Error 1
>>
>> Not sure how builder_x86.cpp managed to change its status.
>
> To answer my own question: the reason for builder_x86.cpp being regenerated 
> is because of its dependency on builder_gen.h (through builder.h).
>
I thought that one was mentioned in the big comment in the patch.
Perhaps my wording could be improved - any suggestions?

And yes, due to the missing dependency the file will be (re)generated
at a later stage thus we'll need to add yet another workaround for
that. Just listing the whole lot in CLEANFILES should be enough.

Feel free to give it a try, or I'll do at some point later on today.

Thanks,
Emil

Re: [Mesa-dev] [PATCH (backport)] radeonsi: mark buffer texture range valid for shader images

2016-06-14 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Tue, Jun 14, 2016 at 6:00 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> When a shader image view into a buffer texture can be written to, the buffer's
> valid range must be updated, or subsequent transfers may incorrectly skip
> synchronization.
>
> This fixes a bug that was exposed in Xephyr by PBO acceleration for 
> glReadPixels,
> reported by Michel Dänzer.
>
> Cc: Michel Dänzer 
> Cc: 12.0 
> Reviewed-by: Marek Olšák 
>
> Back-ported from commit a64c7cd2bac33a3a2bf908b5ef538dff03b93b73:
> - include util/u_format.h
> - code was extracted to si_set_shader_image in master, move it back
>
> Signed-off-by: Nicolai Hähnle 
> --
>  src/gallium/drivers/radeonsi/si_descriptors.c | 24 
>  1 file changed, 24 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 855b79e..e8ce87b 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -60,6 +60,7 @@
>  #include "si_shader.h"
>  #include "sid.h"
>
> +#include "util/u_format.h"
>  #include "util/u_math.h"
>  #include "util/u_memory.h"
>  #include "util/u_suballoc.h"
> @@ -471,6 +472,23 @@ si_disable_shader_image(struct si_images_info *images, 
> unsigned slot)
>  }
>
>  static void
> +si_mark_image_range_valid(struct pipe_image_view *view)
> +{
> +   struct r600_resource *res = (struct r600_resource *)view->resource;
> +   const struct util_format_description *desc;
> +   unsigned stride;
> +
> +   assert(res && res->b.b.target == PIPE_BUFFER);
> +
> +   desc = util_format_description(view->format);
> +   stride = desc->block.bits / 8;
> +
> +   util_range_add(&res->valid_buffer_range,
> +  stride * (view->u.buf.first_element),
> +  stride * (view->u.buf.last_element + 1));
> +}
> +
> +static void
>  si_set_shader_images(struct pipe_context *pipe, unsigned shader,
>  unsigned start_slot, unsigned count,
>  struct pipe_image_view *views)
> @@ -502,6 +520,9 @@ si_set_shader_images(struct pipe_context *pipe, unsigned 
> shader,
>RADEON_USAGE_READWRITE);
>
> if (res->b.b.target == PIPE_BUFFER) {
> +   if (views[i].access & PIPE_IMAGE_ACCESS_WRITE)
> +   si_mark_image_range_valid(&views[i]);
> +
> si_make_buffer_descriptor(screen, res,
>   views[i].format,
>   
> views[i].u.buf.first_element,
> @@ -1297,6 +1318,9 @@ static void si_invalidate_buffer(struct pipe_context 
> *ctx, struct pipe_resource
> unsigned i = u_bit_scan(&mask);
>
> if (images->views[i].resource == buf) {
> +   if (images->views[i].access & 
> PIPE_IMAGE_ACCESS_WRITE)
> +   
> si_mark_image_range_valid(&images->views[i]);
> +
> si_desc_reset_buffer_offset(
> ctx, images->desc.list + i * 8 + 4,
> old_va, buf);
> --
> 2.7.4
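
The byte range that si_mark_image_range_valid adds follows directly from the element indices and the format size. A minimal sketch of that arithmetic with a simplified range type; byte_range, range_add and mark_elements_valid are hypothetical names, and range_add only imitates what util_range_add is used for in the patch.

```c
/* Half-open byte range [start, end); empty when start > end. */
struct byte_range {
   unsigned start;
   unsigned end;
};

/* Grow the range so that it also covers [start, end). */
static void range_add(struct byte_range *range, unsigned start, unsigned end)
{
   if (start < range->start)
      range->start = start;
   if (end > range->end)
      range->end = end;
}

/* Mark elements first..last (inclusive) of a buffer view as valid. */
static void mark_elements_valid(struct byte_range *valid,
                                unsigned bits_per_element,
                                unsigned first, unsigned last)
{
   unsigned stride = bits_per_element / 8;

   range_add(valid, stride * first, stride * (last + 1));
}
```

For a 128-bit format, elements 2 through 5 cover bytes 32 up to (but not including) 96; the + 1 on the last element is what turns an inclusive index into an exclusive end offset.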
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2

2016-06-14 Thread Rob Clark

I (and I expect Eric too) would appreciate it if you went ahead and
replaced the current use of non-"z" versions in code that you can't
test w/ the "z" versions.  That way we can switch over to non-zero'ing
on our own time, rather than getting a surprise next time we
pull/rebase

I think it's only a couple spots in freedreno, and pre-emptive r-b for
that change ;-)

BR,
-R

On Tue, Jun 14, 2016 at 11:07 AM, Ilia Mirkin  wrote:
> I assume you've only tested this with i965? ralloc is also used by
> st/mesa, freedreno, and vc4. Should probably try to coordinate with
> the responsible developers before making the big switch.
>
>   -ilia
>
> On Tue, Jun 14, 2016 at 10:58 AM, Juha-Pekka Heikkila
>  wrote:
>> Here is a fixed version of this ralloc set. I got to run this on many
>> different machines thanks to Mark Janes. No regressions showed up on
>> the different gen hw. On my IVB I've also been running many different
>> traces with Apitrace with Valgrind running in the background, and
>> Valgrind seemed to be happy with my changes.
>>
>> As a performance test I ran shader-db compiles 10 times and compared
>> the timing results against Mesa master on my IVB. To my surprise this
>> brings a reasonable gain which also seems to be repeatable: on my IVB
>> shader compile time is around 5% faster with these changes.
>>
>> /Juha-Pekka
>>
>> Juha-Pekka Heikkila (7):
>>   glsl: Fix reading of uninitialized memory
>>   util: use rzalloc instead on ralloc in _mesa_hash_table_create()
>>   util: use rzalloc instead on ralloc in _mesa_set_create(()
>>   nir: zero allocated memory where needed
>>   i965/vec4: zero allocated memory where needed
>>   i965/fs: fill allocated memory with zeros where needed
>>   util: Fix ralloc to use malloc instead of calloc
>>
>>  src/compiler/glsl/ast_to_hir.cpp   |  2 +-
>>  src/compiler/glsl/glcpp/glcpp-parse.y  |  4 +-
>>  src/compiler/glsl/link_uniform_blocks.cpp  |  2 +-
>>  src/compiler/glsl_types.cpp|  2 +-
>>  src/compiler/nir/nir.c |  6 +--
>>  src/compiler/nir/nir_opt_dce.c |  2 +-
>>  src/compiler/nir/nir_phi_builder.c |  2 +-
>>  src/compiler/nir/nir_search.c  |  2 +-
>>  src/compiler/nir/nir_to_ssa.c  |  2 +-
>>  src/compiler/nir/nir_worklist.c|  2 +-
>>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  2 +-
>>  .../dri/i965/brw_fs_dead_code_eliminate.cpp|  4 +-
>>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  4 +-
>>  src/util/hash_table.c  |  2 +-
>>  src/util/ralloc.c  | 49 
>> +++---
>>  src/util/ralloc.h  |  2 +-
>>  src/util/set.c |  2 +-
>>  17 files changed, 54 insertions(+), 37 deletions(-)
>>
>> --
>> 1.9.1
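
The risk discussed in this thread is that callers may rely on allocations coming back zeroed. A minimal sketch of the two safe options in plain C, using malloc and calloc as stand-ins for the non-zeroing and zeroing allocator variants; struct node and both function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical structure whose users expect zeroed fields. */
struct node {
   struct node *next;
   unsigned count;
};

/* Option 1: keep using a zeroing allocator (the "z" variant). */
static struct node *node_alloc_zeroed(void)
{
   return calloc(1, sizeof(struct node));
}

/* Option 2: use a non-zeroing allocator and initialise every field
 * explicitly, so nothing is read before it is written. */
static struct node *node_alloc_explicit(void)
{
   struct node *n = malloc(sizeof(*n));

   if (n) {
      n->next = NULL;
      n->count = 0;
   }
   return n;
}
```

Code that cannot be tested against the non-zeroing behaviour can stay on option 1 and move to option 2 later, which is what the request at the top of this message amounts to.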
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] swr: automake: don't ship LLVM version specific generated sources

2016-06-14 Thread Rowley, Timothy O


> On Jun 13, 2016, at 8:03 PM, Rowley, Timothy O  
> wrote:
> 
> A clean tree build works with this version, but distcheck fails:
> 
> ...
> rm -f config.status config.cache config.log configure.lineno 
> config.status.lineno
> rm -f Makefile
> ERROR: files left in build directory after distclean:
> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.cpp
> ./src/gallium/drivers/swr/rasterizer/jitter/builder_x86.cpp
> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.h
> make[1]: *** [distcleancheck] Error 1
> make[1]: Leaving directory 
> `/home/torowley/work/mesa-opt/mesa-12.1.0-devel/_build'
> make: *** [distcheck] Error 1
> 
> Not sure how builder_x86.cpp managed to change its status.

To answer my own question: the reason for builder_x86.cpp being regenerated is 
because of its dependency on builder_gen.h (through builder.h).

>> On Jun 13, 2016, at 6:46 PM, Emil Velikov  wrote:
>> 
>> From: Emil Velikov 
>> 
>> Otherwise things will fail to build, if the builder is using another
>> version of LLVM.
>> 
>> v2: annotate all the dependencies of builder_gen.h
>> 
>> Cc: "12.0" 
>> Cc: Tim Rowley 
>> Cc: Chuck Atkins 
>> Reported-by: Chuck Atkins 
>> Signed-off-by: Emil Velikov 
>> ---
>> Unlike v1, this one seems to work. Please give it a try and let me know
>> how it fares on your end.
>> 
>> Thanks
>> Emil
>> ---
>> src/gallium/drivers/swr/Makefile.am | 37 +++--
>> 1 file changed, 35 insertions(+), 2 deletions(-)
>> 
>> diff --git a/src/gallium/drivers/swr/Makefile.am 
>> b/src/gallium/drivers/swr/Makefile.am
>> index 8151e4a..63dadbf 100644
>> --- a/src/gallium/drivers/swr/Makefile.am
>> +++ b/src/gallium/drivers/swr/Makefile.am
>> @@ -52,8 +52,6 @@ BUILT_SOURCES = \
>>  rasterizer/scripts/gen_knobs.cpp \
>>  rasterizer/scripts/gen_knobs.h \
>>  rasterizer/jitter/state_llvm.h \
>> -rasterizer/jitter/builder_gen.h \
>> -rasterizer/jitter/builder_gen.cpp \
>>  rasterizer/jitter/builder_x86.h \
>>  rasterizer/jitter/builder_x86.cpp
>> 
>> @@ -122,6 +120,23 @@ COMMON_LDFLAGS = \
>>  $(NO_UNDEFINED) \
>>  $(LLVM_LDFLAGS)
>> 
>> +
>> +# XXX: As we cannot use BUILT_SOURCES (the files will end up in the dist
>> +# tarball) just annotate the dependency directly.
>> +# As the single direct user of builder_gen.h is a header (builder.h) trace all
>> +# the transitive users (ones that use the latter header).
>> +#
>> +# Note: one should really clean the includes a bit, according to Tim there's
>> +# only 4 users of the builder_gen methods/API.
>> +rasterizer/jitter/blend_jit.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder_gen.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder_x86.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder_misc.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/fetch_jit.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/streamout_jit.cpp: rasterizer/jitter/builder_gen.h
>> +swr_shader.cpp: rasterizer/jitter/builder_gen.h
>> +
>> lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
>> 
>> libswrAVX_la_CXXFLAGS = \
>> @@ -132,6 +147,15 @@ libswrAVX_la_CXXFLAGS = \
>> libswrAVX_la_SOURCES = \
>>  $(COMMON_SOURCES)
>> 
>> +# XXX: Don't ship these generated sources for now, since they are specific
>> +# to the LLVM version they are generated from. Thus a release tarball
>> +# containing the said files, generated against eg. LLVM 3.8 will fail to build
>> +# on systems with other versions of LLVM eg. 3.7 or 3.6.
>> +# Move these back to BUILT_SOURCES once that is resolved.
>> +nodist_libswrAVX_la_SOURCES = \
>> +rasterizer/jitter/builder_gen.h \
>> +rasterizer/jitter/builder_gen.cpp
>> +
>> libswrAVX_la_LIBADD = \
>>  $(COMMON_LIBADD)
>> 
>> @@ -146,6 +170,15 @@ libswrAVX2_la_CXXFLAGS = \
>> libswrAVX2_la_SOURCES = \
>>  $(COMMON_SOURCES)
>> 
>> +# XXX: Don't ship these generated sources for now, since they are specific
>> +# to the LLVM version they are generated from. Thus a release tarball
>> +# containing the said files, generated against eg. LLVM 3.8 will fail to build
>> +# on systems with other versions of LLVM eg. 3.7 or 3.6.
>> +# Move these back to BUILT_SOURCES once that is resolved.
>> +nodist_libswrAVX2_la_SOURCES = \
>> +rasterizer/jitter/builder_gen.h \
>> +rasterizer/jitter/builder_gen.cpp
>> +
>> libswrAVX2_la_LIBADD = \
>>  $(COMMON_LIBADD)
>> 
>> -- 
>> 2.8.2
>> 
> 


[Mesa-dev] [PATCH 2/2] winsys/radeon: use the common job queue for multithreaded command submission v2

2016-06-14 Thread Marek Olšák

From: Marek Olšák 

v2: fixup after renaming to util_queue_fence
---
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 22 
 src/gallium/winsys/radeon/drm/radeon_drm_cs.h |  4 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 63 ++-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 12 ++---
 4 files changed, 19 insertions(+), 82 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
index e9ab53d..9552bd5 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
@@ -177,7 +177,7 @@ radeon_drm_cs_create(struct radeon_winsys_ctx *ctx,
 if (!cs) {
 return NULL;
 }
-pipe_semaphore_init(&cs->flush_completed, 1);
+util_queue_fence_init(&cs->flush_completed);
 
 cs->ws = ws;
 cs->flush_cs = flush;
@@ -427,8 +427,9 @@ static unsigned radeon_drm_cs_get_buffer_list(struct radeon_winsys_cs *rcs,
 return cs->csc->crelocs;
 }
 
-void radeon_drm_cs_emit_ioctl_oneshot(struct radeon_drm_cs *cs, struct radeon_cs_context *csc)
+void radeon_drm_cs_emit_ioctl_oneshot(void *job)
 {
+struct radeon_cs_context *csc = ((struct radeon_drm_cs*)job)->cst;
 unsigned i;
 int r;
 
@@ -463,11 +464,9 @@ void radeon_drm_cs_sync_flush(struct radeon_winsys_cs *rcs)
 {
 struct radeon_drm_cs *cs = radeon_drm_cs(rcs);
 
-/* Wait for any pending ioctl to complete. */
-if (cs->ws->thread) {
-pipe_semaphore_wait(&cs->flush_completed);
-pipe_semaphore_signal(&cs->flush_completed);
-}
+/* Wait for any pending ioctl of this CS to complete. */
+if (util_queue_is_initialized(&cs->ws->cs_queue))
+util_queue_job_wait(&cs->flush_completed);
 }
 
 DEBUG_GET_ONCE_BOOL_OPTION(noop, "RADEON_NOOP", FALSE)
@@ -586,13 +585,12 @@ static void radeon_drm_cs_flush(struct radeon_winsys_cs *rcs,
 break;
 }
 
-if (cs->ws->thread) {
-pipe_semaphore_wait(&cs->flush_completed);
-radeon_drm_ws_queue_cs(cs->ws, cs);
+if (util_queue_is_initialized(&cs->ws->cs_queue)) {
+util_queue_add_job(&cs->ws->cs_queue, cs, &cs->flush_completed);
 if (!(flags & RADEON_FLUSH_ASYNC))
 radeon_drm_cs_sync_flush(rcs);
 } else {
-radeon_drm_cs_emit_ioctl_oneshot(cs, cs->cst);
+radeon_drm_cs_emit_ioctl_oneshot(cs);
 }
 } else {
 radeon_cs_context_cleanup(cs->cst);
@@ -610,7 +608,7 @@ static void radeon_drm_cs_destroy(struct radeon_winsys_cs *rcs)
 struct radeon_drm_cs *cs = radeon_drm_cs(rcs);
 
 radeon_drm_cs_sync_flush(rcs);
-pipe_semaphore_destroy(&cs->flush_completed);
+util_queue_fence_destroy(&cs->flush_completed);
 radeon_cs_context_cleanup(&cs->csc1);
 radeon_cs_context_cleanup(&cs->csc2);
 p_atomic_dec(&cs->ws->num_cs);
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.h 
b/src/gallium/winsys/radeon/drm/radeon_drm_cs.h
index 8056e72..a5f243d 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.h
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.h
@@ -78,7 +78,7 @@ struct radeon_drm_cs {
 void (*flush_cs)(void *ctx, unsigned flags, struct pipe_fence_handle **fence);
 void *flush_data;
 
-pipe_semaphore flush_completed;
+struct util_queue_fence flush_completed;
 };
 
 int radeon_lookup_buffer(struct radeon_cs_context *csc, struct radeon_bo *bo);
@@ -122,6 +122,6 @@ radeon_bo_is_referenced_by_any_cs(struct radeon_bo *bo)
 
 void radeon_drm_cs_sync_flush(struct radeon_winsys_cs *rcs);
 void radeon_drm_cs_init_functions(struct radeon_drm_winsys *ws);
-void radeon_drm_cs_emit_ioctl_oneshot(struct radeon_drm_cs *cs, struct radeon_cs_context *csc);
+void radeon_drm_cs_emit_ioctl_oneshot(void *job);
 
 #endif
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 5c85c8f..1f296f4 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -534,16 +534,11 @@ static void radeon_winsys_destroy(struct radeon_winsys *rws)
 {
 struct radeon_drm_winsys *ws = (struct radeon_drm_winsys*)rws;
 
-if (ws->thread) {
-ws->kill_thread = 1;
-pipe_semaphore_signal(&ws->cs_queued);
-pipe_thread_wait(ws->thread);
-}
-pipe_semaphore_destroy(&ws->cs_queued);
+if (util_queue_is_initialized(&ws->cs_queue))
+util_queue_destroy(&ws->cs_queue);
 
 pipe_mutex_destroy(ws->hyperz_owner_mutex);
 pipe_mutex_destroy(ws->cmask_owner_mutex);
-pipe_mutex_destroy(ws->cs_stack_lock);
 
 pb_cache_deinit(&ws->bo_cache);
 
@@ -686,55 +681,7 @@ static int compare_fd(void *key1, void *key2)
stat1.st_rdev != stat2.st_rdev;
 }
 
-void radeon_drm_ws_queue_cs(struct radeon_drm_winsys *ws, struct radeon_drm_cs *cs)
-{
-retry:
-pipe_mutex_lock(ws->cs_stack_lock);
-

[Mesa-dev] [PATCH 1/2] gallium/util: import the multithreaded job queue from amdgpu winsys (v2)

2016-06-14 Thread Marek Olšák

From: Marek Olšák 

v2: rename the event to util_queue_fence
---
 src/gallium/auxiliary/Makefile.sources|   2 +
 src/gallium/auxiliary/util/u_queue.c  | 129 ++
 src/gallium/auxiliary/util/u_queue.h  |  80 
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c |  23 ++---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.h |   4 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  63 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h |  11 +--
 7 files changed, 229 insertions(+), 83 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_queue.c
 create mode 100644 src/gallium/auxiliary/util/u_queue.h

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index 7b3853e..ab58358 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -274,6 +274,8 @@ C_SOURCES := \
util/u_pstipple.c \
util/u_pstipple.h \
util/u_pwr8.h \
+   util/u_queue.c \
+   util/u_queue.h \
util/u_range.h \
util/u_rect.h \
util/u_resource.c \
diff --git a/src/gallium/auxiliary/util/u_queue.c 
b/src/gallium/auxiliary/util/u_queue.c
new file mode 100644
index 000..8e58414
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_queue.c
@@ -0,0 +1,129 @@
+/*
+ * Copyright © 2016 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS, AUTHORS
+ * AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ */
+
+#include "u_queue.h"
+
+static PIPE_THREAD_ROUTINE(util_queue_thread_func, param)
+{
+   struct util_queue *queue = (struct util_queue*)param;
+   unsigned i;
+
+   while (1) {
+  struct util_queue_job job;
+
+  pipe_semaphore_wait(&queue->queued);
+  if (queue->kill_thread)
+ break;
+
+  pipe_mutex_lock(queue->lock);
+  job = queue->jobs[0];
+  for (i = 1; i < queue->num_jobs; i++)
+ queue->jobs[i - 1] = queue->jobs[i];
+  queue->jobs[--queue->num_jobs].job = NULL;
+  pipe_mutex_unlock(queue->lock);
+
+  pipe_semaphore_signal(&queue->has_space);
+
+  if (job.job) {
+ queue->execute_job(job.job);
+ pipe_semaphore_signal(&job.fence->done);
+  }
+   }
+
+   /* signal remaining jobs before terminating */
+   pipe_mutex_lock(queue->lock);
+   for (i = 0; i < queue->num_jobs; i++) {
+  pipe_semaphore_signal(&queue->jobs[i].fence->done);
+  queue->jobs[i].job = NULL;
+   }
+   queue->num_jobs = 0;
+   pipe_mutex_unlock(queue->lock);
+   return 0;
+}
+
+void
+util_queue_init(struct util_queue *queue,
+void (*execute_job)(void *))
+{
+   memset(queue, 0, sizeof(*queue));
+   queue->execute_job = execute_job;
+   pipe_mutex_init(queue->lock);
+   pipe_semaphore_init(&queue->has_space, ARRAY_SIZE(queue->jobs));
+   pipe_semaphore_init(&queue->queued, 0);
+   queue->thread = pipe_thread_create(util_queue_thread_func, queue);
+}
+
+void
+util_queue_destroy(struct util_queue *queue)
+{
+   queue->kill_thread = 1;
+   pipe_semaphore_signal(&queue->queued);
+   pipe_thread_wait(queue->thread);
+   pipe_semaphore_destroy(&queue->has_space);
+   pipe_semaphore_destroy(&queue->queued);
+   pipe_mutex_destroy(queue->lock);
+}
+
+void
+util_queue_fence_init(struct util_queue_fence *fence)
+{
+   pipe_semaphore_init(&fence->done, 1);
+}
+
+void
+util_queue_fence_destroy(struct util_queue_fence *fence)
+{
+   pipe_semaphore_destroy(&fence->done);
+}
+
+void
+util_queue_add_job(struct util_queue *queue,
+   void *job,
+   struct util_queue_fence *fence)
+{
+   /* Set the semaphore to "busy". */
+   pipe_semaphore_wait(&fence->done);
+
+   /* if the queue is full, wait until there is space */
+   pipe_semaphore_wait(&queue->has_space);
+
+   pipe_mutex_lock(queue->lock);
+   assert(queue->num_jobs < ARRAY_SIZE(queue->jobs));
+   queue->jobs[queue->num_jobs].job = j

Re: [Mesa-dev] [PATCH 06/10] gallium/util: add util_copy_index_buffer() helper

2016-06-14 Thread Rob Clark

On Tue, Jun 14, 2016 at 12:32 PM, Nicolai Hähnle  wrote:
> On 14.06.2016 17:57, Rob Clark wrote:
>>
>> From: Rob Clark 
>>
>> Note there was previously a util_set_index_buffer() which was only used
>> by svga.  Replace this.
>>
>> (The util_copy_* naming is more consistent with other u_inlines/
>> u_framebuffer helpers)
>
>
> Looks like you're changing semantics in a few places: memcpy is replaced by
> util_copy_index_buffer, which does reference counting.

I'll double check, but I think the replaced memcpy's are only in
drivers which do not support the non-user_buffer case.  (Pretty sure
the memcpy approach would have been completely broken otherwise.)
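
To make the semantic difference concrete, here is a minimal sketch. The
toy_* types and functions are hypothetical stand-ins for pipe_resource,
pipe_index_buffer and pipe_resource_reference(), not the real gallium
code; a real implementation would also destroy the resource when the
count reaches zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted pipe_resource. */
struct toy_resource {
   int refcount;
};

/* Hypothetical stand-in for pipe_index_buffer. */
struct toy_index_buffer {
   unsigned index_size;
   unsigned offset;
   struct toy_resource *buffer;   /* reference-counted, may be NULL */
   const void *user_buffer;       /* plain pointer, not counted */
};

/* Point *dst at src and keep both reference counts in sync. */
static void
toy_resource_reference(struct toy_resource **dst, struct toy_resource *src)
{
   assert(dst != NULL);
   if (src)
      src->refcount++;
   if (*dst)
      (*dst)->refcount--;   /* real code would free the resource at zero */
   *dst = src;
}

/* memcpy-style copy: the pointer is duplicated but no reference is taken. */
static void
toy_copy_plain(struct toy_index_buffer *dst,
               const struct toy_index_buffer *src)
{
   memcpy(dst, src, sizeof(*dst));
}

/* Field-by-field copy that takes (or drops) a reference on the buffer. */
static void
toy_copy_counted(struct toy_index_buffer *dst,
                 const struct toy_index_buffer *src)
{
   if (src) {
      dst->index_size = src->index_size;
      dst->offset = src->offset;
      toy_resource_reference(&dst->buffer, src->buffer);
      dst->user_buffer = src->user_buffer;
   } else {
      dst->index_size = 0;
      dst->offset = 0;
      toy_resource_reference(&dst->buffer, NULL);
      dst->user_buffer = NULL;
   }
}
```

With the plain copy the driver holds a pointer it does not own, so it is
only safe while buffer is always NULL (the user_buffer-only case); the
counted copy is safe in both cases.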

BR,
-R

> Nicolai
>
>
>>
>> Signed-off-by: Rob Clark 
>> ---
>>   src/gallium/auxiliary/util/u_helpers.c  | 15 ---
>>   src/gallium/auxiliary/util/u_helpers.h  |  3 ---
>>   src/gallium/auxiliary/util/u_inlines.h  | 17 +
>>   src/gallium/drivers/freedreno/freedreno_state.c | 11 +--
>>   src/gallium/drivers/i915/i915_state.c   |  6 +-
>>   src/gallium/drivers/ilo/ilo_state.c | 10 +-
>>   src/gallium/drivers/llvmpipe/lp_state_vertex.c  |  6 +-
>>   src/gallium/drivers/nouveau/nv30/nv30_state.c   | 11 +--
>>   src/gallium/drivers/r300/r300_state.c   |  8 +---
>>   src/gallium/drivers/r600/r600_state_common.c|  5 +
>>   src/gallium/drivers/radeonsi/si_state.c |  6 +-
>>   src/gallium/drivers/softpipe/sp_state_vertex.c  |  6 +-
>>   src/gallium/drivers/svga/svga_pipe_vertex.c |  2 +-
>>   src/gallium/drivers/swr/swr_state.cpp   |  7 +--
>>   src/gallium/drivers/vc4/vc4_state.c | 11 +--
>>   src/gallium/drivers/virgl/virgl_context.c   |  8 +---
>>   16 files changed, 30 insertions(+), 102 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_helpers.c
>> b/src/gallium/auxiliary/util/u_helpers.c
>> index 09020b0..117a51b 100644
>> --- a/src/gallium/auxiliary/util/u_helpers.c
>> +++ b/src/gallium/auxiliary/util/u_helpers.c
>> @@ -94,18 +94,3 @@ void util_set_vertex_buffers_count(struct
>> pipe_vertex_buffer *dst,
>>
>>  *dst_count = util_last_bit(enabled_buffers);
>>   }
>> -
>> -
>> -void
>> -util_set_index_buffer(struct pipe_index_buffer *dst,
>> -  const struct pipe_index_buffer *src)
>> -{
>> -   if (src) {
>> -  pipe_resource_reference(&dst->buffer, src->buffer);
>> -  memcpy(dst, src, sizeof(*dst));
>> -   }
>> -   else {
>> -  pipe_resource_reference(&dst->buffer, NULL);
>> -  memset(dst, 0, sizeof(*dst));
>> -   }
>> -}
>> diff --git a/src/gallium/auxiliary/util/u_helpers.h
>> b/src/gallium/auxiliary/util/u_helpers.h
>> index a9a53e4..9804163 100644
>> --- a/src/gallium/auxiliary/util/u_helpers.h
>> +++ b/src/gallium/auxiliary/util/u_helpers.h
>> @@ -44,9 +44,6 @@ void util_set_vertex_buffers_count(struct
>> pipe_vertex_buffer *dst,
>>  const struct pipe_vertex_buffer *src,
>>  unsigned start_slot, unsigned count);
>>
>> -void util_set_index_buffer(struct pipe_index_buffer *dst,
>> -   const struct pipe_index_buffer *src);
>> -
>>   #ifdef __cplusplus
>>   }
>>   #endif
>> diff --git a/src/gallium/auxiliary/util/u_inlines.h
>> b/src/gallium/auxiliary/util/u_inlines.h
>> index 207e2aa..78125c8 100644
>> --- a/src/gallium/auxiliary/util/u_inlines.h
>> +++ b/src/gallium/auxiliary/util/u_inlines.h
>> @@ -623,6 +623,23 @@ util_copy_constant_buffer(struct pipe_constant_buffer
>> *dst,
>>   }
>>
>>   static inline void
>> +util_copy_index_buffer(struct pipe_index_buffer *dst,
>> +   const struct pipe_index_buffer *src)
>> +{
>> +   if (src) {
>> +  dst->index_size = src->index_size;
>> +  dst->offset = src->offset;
>> +  pipe_resource_reference(&dst->buffer, src->buffer);
>> +  dst->user_buffer = src->user_buffer;
>> +   } else {
>> +  dst->index_size = 0;
>> +  dst->offset = 0;
>> +  pipe_resource_reference(&dst->buffer, NULL);
>> +  dst->user_buffer = NULL;
>> +   }
>> +}
>> +
>> +static inline void
>>   util_copy_image_view(struct pipe_image_view *dst,
>>const struct pipe_image_view *src)
>>   {
>> diff --git a/src/gallium/drivers/freedreno/freedreno_state.c
>> b/src/gallium/drivers/freedreno/freedreno_state.c
>> index 53ea39b..688975f 100644
>> --- a/src/gallium/drivers/freedreno/freedreno_state.c
>> +++ b/src/gallium/drivers/freedreno/freedreno_state.c
>> @@ -207,16 +207,7 @@ fd_set_index_buffer(struct pipe_context *pctx,
>> const struct pipe_index_buffer *ib)
>>   {
>> struct fd_context *ctx = fd_context(pctx);
>> -
>> -   if (ib) {
>> -   pipe_resource_reference(&ctx->indexbuf.buffer,
>> ib->buffer);
>> -   ctx->indexbuf.index_size = ib->index_size;
>> -   ctx->indexbuf.offset = ib->offset;
>> -

Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean

2016-06-14 Thread Rob Clark

On Tue, Jun 14, 2016 at 12:24 PM, Ilia Mirkin  wrote:
> On Tue, Jun 14, 2016 at 12:19 PM, Nicolai Hähnle  wrote:
>> On 14.06.2016 17:57, Rob Clark wrote:
>>>
>>> From: Rob Clark 
>>>
>>> s/bool/boolean/ to make it match the other APIs.
>>
>>
>> Please no. C has finally grown a proper bool type, we should use it where
>> possible. If anything, make the patch go in the other direction.
>
> FWIW I've eradicated boolean from nouveau except for the gallium API
> interfaces. Would definitely be in favor of flipping those to bool.

ok, I don't mind going the other direction.. I just picked 'boolean'
since that is what most of the query APIs already used.

(and in fact this patch isn't strictly required for the rsq_state.py
stuff.. the inconsistency just bothered me ;-))

BR,
-R

> Cheers,
>
>   -ilia

Re: [Mesa-dev] [PATCH 06/10] gallium/util: add util_copy_index_buffer() helper

2016-06-14 Thread Nicolai Hähnle


On 14.06.2016 17:57, Rob Clark wrote:

From: Rob Clark 

Note there was previously a util_set_index_buffer() which was only used
by svga.  Replace this.

(The util_copy_* naming is more consistent with other u_inlines/
u_framebuffer helpers)


Looks like you're changing semantics in a few places: memcpy is replaced 
by util_copy_index_buffer, which does reference counting.


Nicolai



Signed-off-by: Rob Clark 
---
  src/gallium/auxiliary/util/u_helpers.c  | 15 ---
  src/gallium/auxiliary/util/u_helpers.h  |  3 ---
  src/gallium/auxiliary/util/u_inlines.h  | 17 +
  src/gallium/drivers/freedreno/freedreno_state.c | 11 +--
  src/gallium/drivers/i915/i915_state.c   |  6 +-
  src/gallium/drivers/ilo/ilo_state.c | 10 +-
  src/gallium/drivers/llvmpipe/lp_state_vertex.c  |  6 +-
  src/gallium/drivers/nouveau/nv30/nv30_state.c   | 11 +--
  src/gallium/drivers/r300/r300_state.c   |  8 +---
  src/gallium/drivers/r600/r600_state_common.c|  5 +
  src/gallium/drivers/radeonsi/si_state.c |  6 +-
  src/gallium/drivers/softpipe/sp_state_vertex.c  |  6 +-
  src/gallium/drivers/svga/svga_pipe_vertex.c |  2 +-
  src/gallium/drivers/swr/swr_state.cpp   |  7 +--
  src/gallium/drivers/vc4/vc4_state.c | 11 +--
  src/gallium/drivers/virgl/virgl_context.c   |  8 +---
  16 files changed, 30 insertions(+), 102 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_helpers.c 
b/src/gallium/auxiliary/util/u_helpers.c
index 09020b0..117a51b 100644
--- a/src/gallium/auxiliary/util/u_helpers.c
+++ b/src/gallium/auxiliary/util/u_helpers.c
@@ -94,18 +94,3 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer 
*dst,

 *dst_count = util_last_bit(enabled_buffers);
  }
-
-
-void
-util_set_index_buffer(struct pipe_index_buffer *dst,
-  const struct pipe_index_buffer *src)
-{
-   if (src) {
-  pipe_resource_reference(&dst->buffer, src->buffer);
-  memcpy(dst, src, sizeof(*dst));
-   }
-   else {
-  pipe_resource_reference(&dst->buffer, NULL);
-  memset(dst, 0, sizeof(*dst));
-   }
-}
diff --git a/src/gallium/auxiliary/util/u_helpers.h 
b/src/gallium/auxiliary/util/u_helpers.h
index a9a53e4..9804163 100644
--- a/src/gallium/auxiliary/util/u_helpers.h
+++ b/src/gallium/auxiliary/util/u_helpers.h
@@ -44,9 +44,6 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer 
*dst,
 const struct pipe_vertex_buffer *src,
 unsigned start_slot, unsigned count);

-void util_set_index_buffer(struct pipe_index_buffer *dst,
-   const struct pipe_index_buffer *src);
-
  #ifdef __cplusplus
  }
  #endif
diff --git a/src/gallium/auxiliary/util/u_inlines.h 
b/src/gallium/auxiliary/util/u_inlines.h
index 207e2aa..78125c8 100644
--- a/src/gallium/auxiliary/util/u_inlines.h
+++ b/src/gallium/auxiliary/util/u_inlines.h
@@ -623,6 +623,23 @@ util_copy_constant_buffer(struct pipe_constant_buffer *dst,
  }

  static inline void
+util_copy_index_buffer(struct pipe_index_buffer *dst,
+   const struct pipe_index_buffer *src)
+{
+   if (src) {
+  dst->index_size = src->index_size;
+  dst->offset = src->offset;
+  pipe_resource_reference(&dst->buffer, src->buffer);
+  dst->user_buffer = src->user_buffer;
+   } else {
+  dst->index_size = 0;
+  dst->offset = 0;
+  pipe_resource_reference(&dst->buffer, NULL);
+  dst->user_buffer = NULL;
+   }
+}
+
+static inline void
  util_copy_image_view(struct pipe_image_view *dst,
   const struct pipe_image_view *src)
  {
diff --git a/src/gallium/drivers/freedreno/freedreno_state.c 
b/src/gallium/drivers/freedreno/freedreno_state.c
index 53ea39b..688975f 100644
--- a/src/gallium/drivers/freedreno/freedreno_state.c
+++ b/src/gallium/drivers/freedreno/freedreno_state.c
@@ -207,16 +207,7 @@ fd_set_index_buffer(struct pipe_context *pctx,
const struct pipe_index_buffer *ib)
  {
struct fd_context *ctx = fd_context(pctx);
-
-   if (ib) {
-   pipe_resource_reference(&ctx->indexbuf.buffer, ib->buffer);
-   ctx->indexbuf.index_size = ib->index_size;
-   ctx->indexbuf.offset = ib->offset;
-   ctx->indexbuf.user_buffer = ib->user_buffer;
-   } else {
-   pipe_resource_reference(&ctx->indexbuf.buffer, NULL);
-   }
-
+   util_copy_index_buffer(&ctx->indexbuf, ib);
ctx->dirty |= FD_DIRTY_INDEXBUF;
  }

diff --git a/src/gallium/drivers/i915/i915_state.c 
b/src/gallium/drivers/i915/i915_state.c
index 2efa14e..dbd711f 100644
--- a/src/gallium/drivers/i915/i915_state.c
+++ b/src/gallium/drivers/i915/i915_state.c
@@ -1063,11 +1063,7 @@ static void i915_set_index_buffer(struct pipe_context 
*pipe,
const struct pipe_index_

Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Nicolai Hähnle


On 14.06.2016 18:02, Ilia Mirkin wrote:

Can you explain the motivation behind this change? I'm adding a
->set_window_rectangles thing which also takes multiple parameters.
What's the advantage of stuffing things into a struct first?


FWIW, I tend to be mildly supportive of changes like this. At least, the 
other extreme where functions grow multiple bool or int parameters over 
time is much worse. But in this particular case, changing this around 
might be too eager.


Perhaps teaching the script to deal with slightly more complicated cases 
will help elsewhere, too.


Nicolai



   -ilia

On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark  wrote:

From: Rob Clark 

The rest of the state APIs take state structs, rather than inline
parameters (with the exception of a couple which just amount to a single
uint).

This makes the API more regular and simplifies autogeneration of the
gallium state related APIs.

Signed-off-by: Rob Clark 
---
  src/gallium/drivers/ddebug/dd_context.c   |  9 -
  src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  7 +++
  src/gallium/drivers/r600/evergreen_state.c|  7 +++
  src/gallium/drivers/radeonsi/si_state.c   |  7 +++
  src/gallium/drivers/trace/tr_context.c|  9 -
  src/gallium/include/pipe/p_context.h  |  4 ++--
  src/gallium/include/pipe/p_state.h|  8 
  src/mesa/state_tracker/st_atom_tess.c | 13 ++---
  8 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_context.c 
b/src/gallium/drivers/ddebug/dd_context.c
index 0f8ef18..06b7c91 100644
--- a/src/gallium/drivers/ddebug/dd_context.c
+++ b/src/gallium/drivers/ddebug/dd_context.c
@@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context *_pipe,
  }

  static void dd_context_set_tess_state(struct pipe_context *_pipe,
-  const float default_outer_level[4],
-  const float default_inner_level[2])
+  const struct pipe_tess_state *state)
  {
 struct dd_context *dctx = dd_context(_pipe);
 struct pipe_context *pipe = dctx->pipe;

-   memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4);
-   memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 2);
-   pipe->set_tess_state(pipe, default_outer_level, default_inner_level);
+   memcpy(dctx->tess_default_levels, state->default_outer_level, sizeof(float) * 4);
+   memcpy(dctx->tess_default_levels+4, state->default_inner_level, sizeof(float) * 2);
+   pipe->set_tess_state(pipe, state);
  }


diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 92161ec..a9c1830 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe,

  static void
  nvc0_set_tess_state(struct pipe_context *pipe,
-const float default_tess_outer[4],
-const float default_tess_inner[2])
+const struct pipe_tess_state *state)
  {
 struct nvc0_context *nvc0 = nvc0_context(pipe);

-   memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float));
-   memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float));
+   memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * sizeof(float));
+   memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * sizeof(float));
 nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR;
  }

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 1ac8914..2a424f5 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3569,13 +3569,12 @@ fallback:
  }

  static void evergreen_set_tess_state(struct pipe_context *ctx,
-const float default_outer_level[4],
-const float default_inner_level[2])
+const struct pipe_tess_state *state)
  {
 struct r600_context *rctx = (struct r600_context *)ctx;

-   memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4);
-   memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2);
+   memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * 4);
+   memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) * 2);
 rctx->tess_state_dirty = true;
  }

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 0c52eee..6ef3fe5 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3238,15 +3238,14 @@ static void si_set_index_buffer(struct pipe_context 
*ctx,
   */

  static void si_set_tess_state(struct pipe_context *ctx,
-

Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Rob Clark

On Tue, Jun 14, 2016 at 12:13 PM, Ilia Mirkin  wrote:
> [trimming cc's because mesa-dev hates them]
>
> On Tue, Jun 14, 2016 at 12:09 PM, Rob Clark  wrote:
>> On Tue, Jun 14, 2016 at 12:02 PM, Ilia Mirkin  wrote:
>>> Can you explain the motivation behind this change? I'm adding a
>>> ->set_window_rectangles thing which also takes multiple parameters.
>>> What's the advantage of stuffing things into a struct first?
>>
>> consistency with the other pipe->set_xyz APIs, and adding support for
>> it in rsq would then be a one line addition on rsq_state.py rather
>> than writing a bunch of code by hand ;-)
>
> Sounds like it should be easy to extend your new script to handle
> different argument types easily, no? I don't know that the "this new
> script I wrote doesn't handle this situation well, so let's change a
> bunch of code" logic is the right one...

maybe.. in the best case it makes the script more complicated.  And,
IMHO, it is nice when the API is more consistent.  Plus having a
struct gives other drivers a convenient way to store the information.
So I think it makes sense on its own, even ignoring the script.
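
As a rough sketch of the "convenient way to store the information"
point: the field names below follow the dd_context hunk of the patch,
but the struct layout itself is an assumption (the p_state.h hunk is not
shown here), and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout of the proposed state struct (hypothetical toy copy). */
struct toy_tess_state {
   float default_outer_level[4];
   float default_inner_level[2];
};

#define TOY_DIRTY_TESS 0x1u

/* Hypothetical driver context that keeps the last-set state around. */
struct toy_context {
   struct toy_tess_state tess;
   unsigned dirty;
};

/* With a struct argument the driver can store the state in one assignment
 * instead of two memcpy() calls with hand-written sizes. */
static void
toy_set_tess_state(struct toy_context *ctx,
                   const struct toy_tess_state *state)
{
   assert(ctx != NULL && state != NULL);
   ctx->tess = *state;
   ctx->dirty |= TOY_DIRTY_TESS;
}
```

The trade-off raised above still applies: callers have to fill a struct
first, which is extra work when the hook only ever takes two small arrays.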

BR,
-R

>   -ilia

Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean

2016-06-14 Thread Ilia Mirkin

On Tue, Jun 14, 2016 at 12:19 PM, Nicolai Hähnle  wrote:
> On 14.06.2016 17:57, Rob Clark wrote:
>>
>> From: Rob Clark 
>>
>> s/bool/boolean/ to make it match the other APIs.
>
>
> Please no. C has finally grown a proper bool type, we should use it where
> possible. If anything, make the patch go in the other direction.

FWIW I've eradicated boolean from nouveau except for the gallium API
interfaces. Would definitely be in favor of flipping those to bool.
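
One practical difference behind preferring bool, sketched under the
assumption that boolean is an 8-bit unsigned typedef (toy_boolean below
is a hypothetical stand-in, not the gallium definition): converting a
masked flag to an 8-bit type truncates it, while converting to C99 bool
normalizes any non-zero value to true.

```c
#include <stdbool.h>

/* Assumption: stands in for an 8-bit unsigned "boolean" typedef. */
typedef unsigned char toy_boolean;

/* Truncating conversion: only the low 8 bits of the masked value survive,
 * so a flag at bit 8 or above silently becomes 0. */
static toy_boolean
toy_flag_as_boolean(unsigned flags, unsigned mask)
{
   return (toy_boolean)(flags & mask);
}

/* C99 bool conversion: any non-zero value becomes true. */
static bool
toy_flag_as_bool(unsigned flags, unsigned mask)
{
   return flags & mask;
}
```

So a return type change from boolean to bool is not purely cosmetic for
code that returns masked bitfields directly.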

Cheers,

  -ilia

Re: [Mesa-dev] [PATCH 04/10] gallium: make image_view const

2016-06-14 Thread Nicolai Hähnle


Patches 2-4:

Reviewed-by: Nicolai Hähnle 

On 14.06.2016 17:57, Rob Clark wrote:

From: Rob Clark 

Signed-off-by: Rob Clark 
---
  src/gallium/drivers/ddebug/dd_context.c   | 2 +-
  src/gallium/drivers/ilo/ilo_state.c   | 2 +-
  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 4 ++--
  src/gallium/drivers/radeonsi/si_descriptors.c | 6 +++---
  src/gallium/drivers/softpipe/sp_state_image.c | 2 +-
  src/gallium/drivers/trace/tr_context.c| 2 +-
  src/gallium/include/pipe/p_context.h  | 2 +-
  7 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_context.c 
b/src/gallium/drivers/ddebug/dd_context.c
index 64b16f6..f72fd2f 100644
--- a/src/gallium/drivers/ddebug/dd_context.c
+++ b/src/gallium/drivers/ddebug/dd_context.c
@@ -490,7 +490,7 @@ dd_context_set_sampler_views(struct pipe_context *_pipe, 
unsigned shader,
  static void
  dd_context_set_shader_images(struct pipe_context *_pipe, unsigned shader,
   unsigned start, unsigned num,
- struct pipe_image_view *views)
+ const struct pipe_image_view *views)
  {
 struct dd_context *dctx = dd_context(_pipe);
 struct pipe_context *pipe = dctx->pipe;
diff --git a/src/gallium/drivers/ilo/ilo_state.c 
b/src/gallium/drivers/ilo/ilo_state.c
index 53a5aca..4f1002e 100644
--- a/src/gallium/drivers/ilo/ilo_state.c
+++ b/src/gallium/drivers/ilo/ilo_state.c
@@ -1851,7 +1851,7 @@ ilo_set_sampler_views(struct pipe_context *pipe, unsigned 
shader,
  static void
  ilo_set_shader_images(struct pipe_context *pipe, unsigned shader,
unsigned start, unsigned count,
-  struct pipe_image_view *views)
+  const struct pipe_image_view *views)
  {
  #if 0
 struct ilo_state_vector *vec = &ilo_context(pipe)->state_vector;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index a0e01bd..0bd756f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1233,7 +1233,7 @@ nvc0_set_compute_resources(struct pipe_context *pipe,
  static bool
  nvc0_bind_images_range(struct nvc0_context *nvc0, const unsigned s,
 unsigned start, unsigned nr,
-   struct pipe_image_view *pimages)
+   const struct pipe_image_view *pimages)
  {
 const unsigned end = start + nr;
 unsigned mask = 0;
@@ -1301,7 +1301,7 @@ nvc0_bind_images_range(struct nvc0_context *nvc0, const 
unsigned s,
  static void
  nvc0_set_shader_images(struct pipe_context *pipe, unsigned shader,
 unsigned start, unsigned nr,
-   struct pipe_image_view *images)
+   const struct pipe_image_view *images)
  {
 const unsigned s = nvc0_shader_stage(shader);
 if (!nvc0_bind_images_range(nvc0_context(pipe), s, start, nr, images))
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 55686e8..e95556b 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -560,7 +560,7 @@ si_disable_shader_image(struct si_context *ctx, unsigned 
shader, unsigned slot)
  }

  static void
-si_mark_image_range_valid(struct pipe_image_view *view)
+si_mark_image_range_valid(const struct pipe_image_view *view)
  {
struct r600_resource *res = (struct r600_resource *)view->resource;
const struct util_format_description *desc;
@@ -578,7 +578,7 @@ si_mark_image_range_valid(struct pipe_image_view *view)

  static void si_set_shader_image(struct si_context *ctx,
unsigned shader,
-   unsigned slot, struct pipe_image_view *view)
+   unsigned slot, const struct pipe_image_view 
*view)
  {
struct si_screen *screen = ctx->screen;
struct si_images_info *images = &ctx->images[shader];
@@ -674,7 +674,7 @@ static void si_set_shader_image(struct si_context *ctx,
  static void
  si_set_shader_images(struct pipe_context *pipe, unsigned shader,
 unsigned start_slot, unsigned count,
-struct pipe_image_view *views)
+const struct pipe_image_view *views)
  {
struct si_context *ctx = (struct si_context *)pipe;
unsigned i, slot;
diff --git a/src/gallium/drivers/softpipe/sp_state_image.c 
b/src/gallium/drivers/softpipe/sp_state_image.c
index 81bb7ca..553a76a 100644
--- a/src/gallium/drivers/softpipe/sp_state_image.c
+++ b/src/gallium/drivers/softpipe/sp_state_image.c
@@ -30,7 +30,7 @@ static void softpipe_set_shader_images(struct pipe_context 
*pipe,
 unsigned shader,
 unsigned start,
 unsigned num,
-

Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2

2016-06-14 Thread Nicolai Hähnle


On 14.06.2016 17:07, Ilia Mirkin wrote:

I assume you've only tested this with i965? ralloc is also used by
st/mesa, freedreno, and vc4. Should probably try to coordinate with
the responsible developers before making the big switch.


In st_glsl_to_tgsi.c, there is one ralloc(mem_ctx, function_entry) that 
doesn't initialize all members. That should probably get the rzalloc 
treatment. The other uses in that file look fine to me.
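
To make the concern concrete, here is a minimal sketch of the pattern, using plain malloc()/calloc() as stand-ins for a non-zeroing and a zeroing allocator. The struct and all sketch_* names are hypothetical and only illustrate the idea; they are not the actual function_entry layout or the ralloc API.

```c
#include <stdlib.h>

/* Hypothetical struct with several members; the real function_entry
 * layout is not shown in this thread. */
struct sketch_entry {
   const char *name;
   int sid;
   int num_calls;
   void *next;
};

/* Non-zeroing allocation: contents are indeterminate, like malloc().
 * Every member must be written before it is read. */
struct sketch_entry *
sketch_entry_alloc(void)
{
   return malloc(sizeof(struct sketch_entry));
}

/* Zeroing allocation: every byte starts at 0, like calloc(). */
struct sketch_entry *
sketch_entry_zalloc(void)
{
   return calloc(1, sizeof(struct sketch_entry));
}

/* A caller that sets only some members. This is only safe with the
 * zeroing variant; with sketch_entry_alloc() the remaining members
 * would hold garbage. */
struct sketch_entry *
sketch_entry_create(const char *name)
{
   struct sketch_entry *e = sketch_entry_zalloc();
   if (!e)
      return NULL;
   e->name = name;   /* sid, num_calls and next stay 0 / NULL */
   return e;
}
```

A caller written like sketch_entry_create() keeps working when the plain allocator stops zeroing, as long as it is switched to the zeroing variant first.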


The use in st_nir_lower_builtin.c looks problematic as well.

Nicolai



   -ilia

On Tue, Jun 14, 2016 at 10:58 AM, Juha-Pekka Heikkila
 wrote:

Here is a fixed version of this ralloc set. This time I got to run it on many
different machines thanks to Mark Janes, and no regressions showed up on any
of the hardware generations. On my IVB I have also been running many
different traces with Apitrace under Valgrind, and Valgrind seemed to be
happy with my changes.

As a performance test I did shader-db compile runs 10 times and compared the
timing results against Mesa master on my IVB. To my surprise this
brings a reasonable gain which also seems to be repeatable: on my IVB
shader compile time is around 5% faster with these changes.

/Juha-Pekka

Juha-Pekka Heikkila (7):
   glsl: Fix reading of uninitialized memory
   util: use rzalloc instead on ralloc in _mesa_hash_table_create()
   util: use rzalloc instead on ralloc in _mesa_set_create(()
   nir: zero allocated memory where needed
   i965/vec4: zero allocated memory where needed
   i965/fs: fill allocated memory with zeros where needed
   util: Fix ralloc to use malloc instead of calloc

  src/compiler/glsl/ast_to_hir.cpp   |  2 +-
  src/compiler/glsl/glcpp/glcpp-parse.y  |  4 +-
  src/compiler/glsl/link_uniform_blocks.cpp  |  2 +-
  src/compiler/glsl_types.cpp|  2 +-
  src/compiler/nir/nir.c |  6 +--
  src/compiler/nir/nir_opt_dce.c |  2 +-
  src/compiler/nir/nir_phi_builder.c |  2 +-
  src/compiler/nir/nir_search.c  |  2 +-
  src/compiler/nir/nir_to_ssa.c  |  2 +-
  src/compiler/nir/nir_worklist.c|  2 +-
  .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  2 +-
  .../dri/i965/brw_fs_dead_code_eliminate.cpp|  4 +-
  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  4 +-
  src/util/hash_table.c  |  2 +-
  src/util/ralloc.c  | 49 +++---
  src/util/ralloc.h  |  2 +-
  src/util/set.c |  2 +-
  17 files changed, 54 insertions(+), 37 deletions(-)

--
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Rob Clark

On Tue, Jun 14, 2016 at 12:02 PM, Ilia Mirkin  wrote:
> Can you explain the motivation behind this change? I'm adding a
> ->set_window_rectangles thing which also takes multiple parameters.
> What's the advantage of stuffing things into a struct first?

consistency with the other pipe->set_xyz APIs, and adding support for
it in rsq would then be a one line addition on rsq_state.py rather
than writing a bunch of code by hand ;-)
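
For reference, a minimal sketch of what the struct-based form looks like from a driver's side. The layout of pipe_tess_state is an assumption inferred from the state->default_outer_level and state->default_inner_level accesses in the patch, and sketch_context / sketch_set_tess_state are hypothetical names, not code from any driver.

```c
#include <string.h>

/* Assumed layout, inferred from the field accesses in the patch. */
struct pipe_tess_state {
   float default_outer_level[4];
   float default_inner_level[2];
};

/* Hypothetical driver context storing the six levels contiguously. */
struct sketch_context {
   float tess_default_levels[6];
   int tess_dirty;
};

/* One const struct pointer replaces the two inline array parameters,
 * matching the shape of the other set_xyz(ctx, const struct *) hooks. */
void
sketch_set_tess_state(struct sketch_context *ctx,
                      const struct pipe_tess_state *state)
{
   memcpy(ctx->tess_default_levels, state->default_outer_level,
          sizeof(state->default_outer_level));
   memcpy(ctx->tess_default_levels + 4, state->default_inner_level,
          sizeof(state->default_inner_level));
   ctx->tess_dirty = 1;
}
```

With a single struct argument, a generator only has to know one pattern (copy the struct, replay it later) rather than a per-function parameter list.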

BR,
-R

>   -ilia
>
> On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> The rest of the state APIs take state structs, rather than inline
>> parameters (with the exception of a couple which just amount to a single
>> uint).
>>
>> This makes the API more regular and simplifies autogeneration of the
>> gallium state related APIs.
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/gallium/drivers/ddebug/dd_context.c   |  9 -
>>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  7 +++
>>  src/gallium/drivers/r600/evergreen_state.c|  7 +++
>>  src/gallium/drivers/radeonsi/si_state.c   |  7 +++
>>  src/gallium/drivers/trace/tr_context.c|  9 -
>>  src/gallium/include/pipe/p_context.h  |  4 ++--
>>  src/gallium/include/pipe/p_state.h|  8 
>>  src/mesa/state_tracker/st_atom_tess.c | 13 ++---
>>  8 files changed, 37 insertions(+), 27 deletions(-)
>>
>> diff --git a/src/gallium/drivers/ddebug/dd_context.c 
>> b/src/gallium/drivers/ddebug/dd_context.c
>> index 0f8ef18..06b7c91 100644
>> --- a/src/gallium/drivers/ddebug/dd_context.c
>> +++ b/src/gallium/drivers/ddebug/dd_context.c
>> @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context 
>> *_pipe,
>>  }
>>
>>  static void dd_context_set_tess_state(struct pipe_context *_pipe,
>> -  const float default_outer_level[4],
>> -  const float default_inner_level[2])
>> +  const struct pipe_tess_state *state)
>>  {
>> struct dd_context *dctx = dd_context(_pipe);
>> struct pipe_context *pipe = dctx->pipe;
>>
>> -   memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 
>> 4);
>> -   memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 
>> 2);
>> -   pipe->set_tess_state(pipe, default_outer_level, default_inner_level);
>> +   memcpy(dctx->tess_default_levels, state->default_outer_level, 
>> sizeof(float) * 4);
>> +   memcpy(dctx->tess_default_levels+4, state->default_inner_level, 
>> sizeof(float) * 2);
>> +   pipe->set_tess_state(pipe, state);
>>  }
>>
>>
>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
>> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>> index 92161ec..a9c1830 100644
>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
>> @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe,
>>
>>  static void
>>  nvc0_set_tess_state(struct pipe_context *pipe,
>> -const float default_tess_outer[4],
>> -const float default_tess_inner[2])
>> +const struct pipe_tess_state *state)
>>  {
>> struct nvc0_context *nvc0 = nvc0_context(pipe);
>>
>> -   memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float));
>> -   memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float));
>> +   memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * 
>> sizeof(float));
>> +   memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * 
>> sizeof(float));
>> nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR;
>>  }
>>
>> diff --git a/src/gallium/drivers/r600/evergreen_state.c 
>> b/src/gallium/drivers/r600/evergreen_state.c
>> index 1ac8914..2a424f5 100644
>> --- a/src/gallium/drivers/r600/evergreen_state.c
>> +++ b/src/gallium/drivers/r600/evergreen_state.c
>> @@ -3569,13 +3569,12 @@ fallback:
>>  }
>>
>>  static void evergreen_set_tess_state(struct pipe_context *ctx,
>> -const float default_outer_level[4],
>> -const float default_inner_level[2])
>> +const struct pipe_tess_state *state)
>>  {
>> struct r600_context *rctx = (struct r600_context *)ctx;
>>
>> -   memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4);
>> -   memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2);
>> +   memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * 
>> 4);
>> +   memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) 
>> * 2);
>> rctx->tess_state_dirty = true;
>>  }
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
>> b/src/gallium/drivers/radeonsi/si_state.c
>> index 0c52eee..6ef3fe5 100644
>> --- a/src/gallium/drivers/radeonsi/si_state.c
>> +++ b/src/gallium/drivers/radeonsi/si_state.c
>> @@ -3238,15 +3238,14 @@ stati

Re: [Mesa-dev] [PATCH 00/10] gallium: resequencer layer

2016-06-14 Thread Brian Paul


On 06/14/2016 10:07 AM, Rob Clark wrote:

bleh, seems like max-cc's is still too low on mesa-dev, and some of
the patches didn't get through.  You can also find them here:

   
https://github.com/freedreno/mesa/commits/wip-rsq


I don't see a way to raise the max-cc's in the mailman interface.

I approved your pending messages.

-Brian



Re: [Mesa-dev] [PATCH v2] glsl: reuse main extension table to appropriate restrict extensions

2016-06-14 Thread Eric Engestrom

On Mon, Jun 13, 2016 at 11:43:57PM -0400, Ilia Mirkin wrote:
> Previously we were only restricting based on ES/non-ES-ness and whether
> the overall enable bit had been flipped on. However we have been adding
> more fine-grained restrictions, such as based on compat profiles, as
> well as specific ES versions. Most of the time this doesn't matter, but
> it can create awkward situations and duplication of logic.
> 
> Here we separate the main extension table into a separate object file,
> linked to the glsl compiler, which makes use of it with a custom
> function which takes the ES-ness of the shader into account (thus
> allowing desktop shaders to properly use ES extensions that would
> otherwise have been disallowed.)
> 
> The effect of this change should be nil in most cases.
> 
> Signed-off-by: Ilia Mirkin 
> ---
> 
> v1 -> v2:
>  - use a final enum to obtain number of extensions
>  - move calculation of the gl version to be once per shader, for better reuse
>  - bake GL version into the "supported_versions" struct
>  - while we're at it, fix supported_versions size, it was off by 1 since ES 
> 3.20
>"support" was added.
> 

Looks all good to me :)
Reviewed-by: Eric Engestrom 

Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean

2016-06-14 Thread Nicolai Hähnle


On 14.06.2016 17:57, Rob Clark wrote:

From: Rob Clark 

s/bool/boolean/ to make it match the other APIs.


Please no. C has finally grown a proper bool type, we should use it 
where possible. If anything, make the patch go in the other direction.
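
A minimal sketch of why the distinction matters, assuming for illustration that the older boolean is a plain integer typedef (sketch_boolean below is a hypothetical stand-in, and the example assumes 8-bit chars):

```c
#include <stdbool.h>

/* Assumption: an integer typedef standing in for a pre-C99 boolean. */
typedef unsigned char sketch_boolean;

/* Conversion to bool yields 1 for any non-zero value. */
bool
has_flag_bool(unsigned mask, unsigned flag)
{
   return mask & flag;
}

/* Conversion to an 8-bit integer type keeps only the low byte, so a
 * flag in a higher bit is silently truncated to 0. */
sketch_boolean
has_flag_boolean(unsigned mask, unsigned flag)
{
   return (sketch_boolean)(mask & flag);
}
```

With flag 0x100 the bool version reports the flag as set, while the integer-typedef version returns 0.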


Nicolai



Signed-off-by: Rob Clark 
---
  src/gallium/drivers/freedreno/freedreno_query.c | 2 +-
  src/gallium/drivers/i915/i915_query.c   | 2 +-
  src/gallium/drivers/ilo/ilo_query.c | 2 +-
  src/gallium/drivers/llvmpipe/lp_query.c | 2 +-
  src/gallium/drivers/noop/noop_pipe.c| 2 +-
  src/gallium/drivers/nouveau/nv30/nv30_query.c   | 2 +-
  src/gallium/drivers/nouveau/nv50/nv50_query.c   | 2 +-
  src/gallium/drivers/r300/r300_query.c   | 4 ++--
  src/gallium/drivers/radeon/r600_query.c | 2 +-
  src/gallium/drivers/rbug/rbug_context.c | 4 ++--
  src/gallium/drivers/softpipe/sp_query.c | 2 +-
  src/gallium/drivers/svga/svga_pipe_query.c  | 2 +-
  src/gallium/drivers/swr/swr_query.cpp   | 2 +-
  src/gallium/drivers/vc4/vc4_query.c | 2 +-
  src/gallium/drivers/virgl/virgl_query.c | 4 ++--
  src/gallium/include/pipe/p_context.h| 2 +-
  16 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_query.c 
b/src/gallium/drivers/freedreno/freedreno_query.c
index 18e0c79..fc50076 100644
--- a/src/gallium/drivers/freedreno/freedreno_query.c
+++ b/src/gallium/drivers/freedreno/freedreno_query.c
@@ -66,7 +66,7 @@ fd_begin_query(struct pipe_context *pctx, struct pipe_query 
*pq)
return q->funcs->begin_query(fd_context(pctx), q);
  }

-static bool
+static boolean
  fd_end_query(struct pipe_context *pctx, struct pipe_query *pq)
  {
struct fd_query *q = fd_query(pq);
diff --git a/src/gallium/drivers/i915/i915_query.c 
b/src/gallium/drivers/i915/i915_query.c
index d6015a6..9d5569a 100644
--- a/src/gallium/drivers/i915/i915_query.c
+++ b/src/gallium/drivers/i915/i915_query.c
@@ -60,7 +60,7 @@ static boolean i915_begin_query(struct pipe_context *ctx,
 return true;
  }

-static bool i915_end_query(struct pipe_context *ctx, struct pipe_query *query)
+static boolean i915_end_query(struct pipe_context *ctx, struct pipe_query 
*query)
  {
 return true;
  }
diff --git a/src/gallium/drivers/ilo/ilo_query.c 
b/src/gallium/drivers/ilo/ilo_query.c
index 3088c96..98c3f6d 100644
--- a/src/gallium/drivers/ilo/ilo_query.c
+++ b/src/gallium/drivers/ilo/ilo_query.c
@@ -128,7 +128,7 @@ ilo_begin_query(struct pipe_context *pipe, struct 
pipe_query *query)
 return true;
  }

-static bool
+static boolean
  ilo_end_query(struct pipe_context *pipe, struct pipe_query *query)
  {
 struct ilo_query *q = ilo_query(query);
diff --git a/src/gallium/drivers/llvmpipe/lp_query.c 
b/src/gallium/drivers/llvmpipe/lp_query.c
index d5ed656..ffc4f56c 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -239,7 +239,7 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct 
pipe_query *q)
  }


-static bool
+static boolean
  llvmpipe_end_query(struct pipe_context *pipe, struct pipe_query *q)
  {
 struct llvmpipe_context *llvmpipe = llvmpipe_context( pipe );
diff --git a/src/gallium/drivers/noop/noop_pipe.c 
b/src/gallium/drivers/noop/noop_pipe.c
index 99e5f1a..e58507b 100644
--- a/src/gallium/drivers/noop/noop_pipe.c
+++ b/src/gallium/drivers/noop/noop_pipe.c
@@ -63,7 +63,7 @@ static boolean noop_begin_query(struct pipe_context *ctx, 
struct pipe_query *que
 return true;
  }

-static bool noop_end_query(struct pipe_context *ctx, struct pipe_query *query)
+static boolean noop_end_query(struct pipe_context *ctx, struct pipe_query 
*query)
  {
 return true;
  }
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_query.c 
b/src/gallium/drivers/nouveau/nv30/nv30_query.c
index aa9a12f..0f4d9b4 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_query.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_query.c
@@ -175,7 +175,7 @@ nv30_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
 return true;
  }

-static bool
+static boolean
  nv30_query_end(struct pipe_context *pipe, struct pipe_query *pq)
  {
 struct nv30_context *nv30 = nv30_context(pipe);
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 9a1397a..9124946 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -54,7 +54,7 @@ nv50_begin_query(struct pipe_context *pipe, struct pipe_query 
*pq)
 return q->funcs->begin_query(nv50_context(pipe), q);
  }

-static bool
+static boolean
  nv50_end_query(struct pipe_context *pipe, struct pipe_query *pq)
  {
 struct nv50_query *q = nv50_query(pq);
diff --git a/src/gallium/drivers/r300/r300_query.c 
b/src/gallium/drivers/r300/r300_query.c
index 79e2198..1428d03 100644
--- a/src/gallium/drivers/r300/r300_query.c
+++ b/src/gallium/driv

Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Ilia Mirkin

[trimming cc's because mesa-dev hates them]

On Tue, Jun 14, 2016 at 12:09 PM, Rob Clark  wrote:
> On Tue, Jun 14, 2016 at 12:02 PM, Ilia Mirkin  wrote:
>> Can you explain the motivation behind this change? I'm adding a
>> ->set_window_rectangles thing which also takes multiple parameters.
>> What's the advantage of stuffing things into a struct first?
>
> consistency with the other pipe->set_xyz APIs, and adding support for
> it in rsq would then be a one line addition on rsq_state.py rather
> than writing a bunch of code by hand ;-)

Sounds like it should be easy to extend your new script to handle
different argument types easily, no? I don't know that the "this new
script I wrote doesn't handle this situation well, so let's change a
bunch of code" logic is the right one...

  -ilia

[Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Rob Clark

From: Rob Clark 

The rest of the state APIs take state structs, rather than inline
parameters (with the exception of a couple which just amount to a single
uint).

This makes the API more regular and simplifies autogeneration of the
gallium state related APIs.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/ddebug/dd_context.c   |  9 -
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  7 +++
 src/gallium/drivers/r600/evergreen_state.c|  7 +++
 src/gallium/drivers/radeonsi/si_state.c   |  7 +++
 src/gallium/drivers/trace/tr_context.c|  9 -
 src/gallium/include/pipe/p_context.h  |  4 ++--
 src/gallium/include/pipe/p_state.h|  8 
 src/mesa/state_tracker/st_atom_tess.c | 13 ++---
 8 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_context.c 
b/src/gallium/drivers/ddebug/dd_context.c
index 0f8ef18..06b7c91 100644
--- a/src/gallium/drivers/ddebug/dd_context.c
+++ b/src/gallium/drivers/ddebug/dd_context.c
@@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context *_pipe,
 }
 
 static void dd_context_set_tess_state(struct pipe_context *_pipe,
-  const float default_outer_level[4],
-  const float default_inner_level[2])
+  const struct pipe_tess_state *state)
 {
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
 
-   memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4);
-   memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 2);
-   pipe->set_tess_state(pipe, default_outer_level, default_inner_level);
+   memcpy(dctx->tess_default_levels, state->default_outer_level, sizeof(float) 
* 4);
+   memcpy(dctx->tess_default_levels+4, state->default_inner_level, 
sizeof(float) * 2);
+   pipe->set_tess_state(pipe, state);
 }
 
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 92161ec..a9c1830 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe,
 
 static void
 nvc0_set_tess_state(struct pipe_context *pipe,
-const float default_tess_outer[4],
-const float default_tess_inner[2])
+const struct pipe_tess_state *state)
 {
struct nvc0_context *nvc0 = nvc0_context(pipe);
 
-   memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float));
-   memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float));
+   memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * 
sizeof(float));
+   memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * 
sizeof(float));
nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR;
 }
 
diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 1ac8914..2a424f5 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3569,13 +3569,12 @@ fallback:
 }
 
 static void evergreen_set_tess_state(struct pipe_context *ctx,
-const float default_outer_level[4],
-const float default_inner_level[2])
+const struct pipe_tess_state *state)
 {
struct r600_context *rctx = (struct r600_context *)ctx;
 
-   memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4);
-   memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2);
+   memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * 4);
+   memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) * 
2);
rctx->tess_state_dirty = true;
 }
 
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 0c52eee..6ef3fe5 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3238,15 +3238,14 @@ static void si_set_index_buffer(struct pipe_context 
*ctx,
  */
 
 static void si_set_tess_state(struct pipe_context *ctx,
- const float default_outer_level[4],
- const float default_inner_level[2])
+ const struct pipe_tess_state *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct pipe_constant_buffer cb;
float array[8];
 
-   memcpy(array, default_outer_level, sizeof(float) * 4);
-   memcpy(array+4, default_inner_level, sizeof(float) * 2);
+   memcpy(array, state->default_outer_level, sizeof(float) * 4);
+   memcpy(array+4, state->default_inner_level, sizeof(float) * 2);
 
cb.buffer = NULL;
cb.user_buffer = NULL;
diff --git a/src/gallium/dr

Re: [Mesa-dev] [PATCH 00/10] gallium: resequencer layer

2016-06-14 Thread Rob Clark

bleh, seems like max-cc's is still too low on mesa-dev, and some of
the patches didn't get through.  You can also find them here:

  https://github.com/freedreno/mesa/commits/wip-rsq

BR,
-R

On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark  wrote:
> From: Rob Clark 
>
> So, I know there were a couple concerns voiced over the idea of
> re-ordering rendering in a gallium shim pipe driver layer.  For
> me, the main concern was whether the overhead of an extra layer,
> queueing and replaying state updates, draws, etc, would be
> prohibitive.  So I implemented it enough that I could do some
> benchmarking ;-)
>
> The first 9 patches are just some general API cleanups, which I
> found to be convenient (since the resequencer layer is generating
> most of the state handling with python + mako, so the cleanups to
> improve consistency help minimize the state which required special
> handling).  But regardless of the outcome of the resequencer
> layer, I think these patches make sense on their own.
>
> (Note: auto-generating some of the other wrapper layers might be
> an interesting future cleanup..  at least it should be trivial
> for noop ;-))
>
> As far as overhead, I've been benchmarking (mostly glmark2 + stk +
> gfxbench), and in the current state (without actually having the
> dependency tracking implemented) it doesn't seem to cause more
> than a couple percent overhead.  From here on out, the remaining
> overhead added to implement the dependency tracking and re-
> ordering would be the same as the additional overhead required
> to implement it in the driver backend.
>
> And a couple percent overhead is small compared to the expected
> gains for games which benefit.. ie. 8MiB for 1080p rgb frame,
> avoiding copying that from tile to memory and back once or twice
> quickly dwarfs an extra copy of some 10's of kb of state.. and
> even more so for (for ex.) f32f32f32f32 intermediate buffers.
>
> Queries are still missing, but I expect what would be required
> to implement it is the same as the logic that would be needed in
> the driver backend otherwise.
>
> Basically, the only concern I have, compared to the approach of
> implementing the dependency tracking in each driver backend is
> pipe_constant_buffer::user_buffer.  Currently both freedreno and
> vc4 want non-UBO constant buffers to be emitted in cmdstream.
> In the adreno case, it looks like a3xx/a4xx should also support
> the non-user_buffer case, although in fact this appears to be
> broken (at least on a4xx) and I've never seen blob driver use
> this.  At the moment I'm doing a hack in freedreno to map the
> backing fd_bo and then memcpy it into cmdstream.  Which is a
> bit silly (since it is a write-combine buffer I'm copying from).
> But in glmark I had trouble even measuring the overhead of this
> extra copy.  Although possibly I need to find something to
> measure which emits more non-UBO constant state.
>
> btw, if someone has some requests for benchmarks to try (provided
> they are available for arm/linux) I'd be happy to try some other
> things.
>
> The plus side of doing this in a separate layer is that we only
> implement the dependency tracking and resource shadowing once,
> instead of both in vc4 and freedreno (and who knows, maybe
> someday someone gets around to writing a lima gallium driver).
> Plus, I envision this to be something that mesa/st wraps the
> pipe_screen with if driconf tells it to, and pscreen->rsq_funcs
> is populated (we at least need a callback to know if resource
> is still busy).  This way we can turn it on for games/apps that
> are known to benefit, and leave it off with zero additional
> overhead for better written things (or rather, things written
> with tilers in mind).
>
>
> Rob Clark (10):
>   gallium: cleanup set_tess_state
>   gallium: make shader_buffers const
>   gallium: make constant_buffer const
>   gallium: make image_view const
>   gallium: change end_query() to return boolean
>   gallium/util: add util_copy_index_buffer() helper
>   gallium/util: add util_copy_shader_buffer() helper
>   gallium/util: add util_copy_vertex_buffer helper
>   gallium/util: make util_copy_framebuffer_state(src=NULL) work
>   RFC: gallium: add resequencer driver (INCOMPLETE)
>
>  configure.ac   |   1 +
>  src/gallium/auxiliary/util/u_framebuffer.c |  37 +-
>  src/gallium/auxiliary/util/u_helpers.c |  15 -
>  src/gallium/auxiliary/util/u_helpers.h |   3 -
>  src/gallium/auxiliary/util/u_inlines.h |  49 ++
>  src/gallium/drivers/ddebug/dd_context.c|  15 +-
>  src/gallium/drivers/freedreno/freedreno_query.c|   2 +-
>  src/gallium/drivers/freedreno/freedreno_state.c|  13 +-
>  src/gallium/drivers/i915/i915_query.c  |   2 +-
>  src/gallium/drivers/i915/i915_state.c  |   8 +-
>  src/gallium/drivers/ilo/ilo_query.c|   2 +-
>  src/gallium/drivers/ilo/ilo_state.c|  14 +-
>  src/gallium/

[Mesa-dev] [PATCH 10/10] RFC: gallium: add resequencer driver (INCOMPLETE)

2016-06-14 Thread Rob Clark

From: Rob Clark 

NOTE: the mako templates turned out to be a bit more hairy than
expected.. maybe they would be better split out, or maybe there
is something that could be done more simply.  It more or less is
my first time doing much with mako.  But, I have changed how the
state tracking / emit / replay works a few times as I went, and
making a couple line change in a template and regenerating is
much nicer than manually refactoring the code ;-)

Note that since it might be easier to see how things work from
the resulting code:

  rsq_state.h -> http://hastebin.com/erupecivab.c
  rsq_state.c -> http://hastebin.com/cokenawanu.c

---
 configure.ac   |   1 +
 src/gallium/drivers/resequencer/.gitignore |   2 +
 src/gallium/drivers/resequencer/Makefile.am|  44 ++
 src/gallium/drivers/resequencer/Makefile.sources   |  23 +
 src/gallium/drivers/resequencer/rsq_batch.c| 144 +
 src/gallium/drivers/resequencer/rsq_batch.h|  71 +++
 src/gallium/drivers/resequencer/rsq_context.c  | 457 
 src/gallium/drivers/resequencer/rsq_context.h  |  84 +++
 src/gallium/drivers/resequencer/rsq_draw.c | 230 
 src/gallium/drivers/resequencer/rsq_draw.h |  40 ++
 src/gallium/drivers/resequencer/rsq_fence.c|  48 ++
 src/gallium/drivers/resequencer/rsq_fence.h|  43 ++
 src/gallium/drivers/resequencer/rsq_public.h   |  68 +++
 src/gallium/drivers/resequencer/rsq_query.c| 148 +
 src/gallium/drivers/resequencer/rsq_query.h|  32 ++
 src/gallium/drivers/resequencer/rsq_resource.c | 222 
 src/gallium/drivers/resequencer/rsq_resource.h |  60 ++
 src/gallium/drivers/resequencer/rsq_screen.c   | 186 +++
 src/gallium/drivers/resequencer/rsq_screen.h   |  50 ++
 src/gallium/drivers/resequencer/rsq_state.py   | 607 +
 .../drivers/resequencer/rsq_state_helpers.h| 219 
 src/gallium/drivers/resequencer/rsq_surface.c  | 107 
 src/gallium/drivers/resequencer/rsq_surface.h  |  72 +++
 23 files changed, 2958 insertions(+)
 create mode 100644 src/gallium/drivers/resequencer/.gitignore
 create mode 100644 src/gallium/drivers/resequencer/Makefile.am
 create mode 100644 src/gallium/drivers/resequencer/Makefile.sources
 create mode 100644 src/gallium/drivers/resequencer/rsq_batch.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_batch.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_context.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_context.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_draw.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_draw.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_fence.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_fence.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_public.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_query.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_query.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_resource.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_resource.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_screen.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_screen.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_state.py
 create mode 100644 src/gallium/drivers/resequencer/rsq_state_helpers.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_surface.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_surface.h

diff --git a/configure.ac b/configure.ac
index c492e15..0dbfc32 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2644,6 +2644,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/noop/Makefile
src/gallium/drivers/nouveau/Makefile
+   src/gallium/drivers/resequencer/Makefile
src/gallium/drivers/r300/Makefile
src/gallium/drivers/r600/Makefile
src/gallium/drivers/radeon/Makefile
diff --git a/src/gallium/drivers/resequencer/.gitignore 
b/src/gallium/drivers/resequencer/.gitignore
new file mode 100644
index 000..c827305
--- /dev/null
+++ b/src/gallium/drivers/resequencer/.gitignore
@@ -0,0 +1,2 @@
+rsq_state.c
+rsq_state.h
diff --git a/src/gallium/drivers/resequencer/Makefile.am 
b/src/gallium/drivers/resequencer/Makefile.am
new file mode 100644
index 000..503aa98
--- /dev/null
+++ b/src/gallium/drivers/resequencer/Makefile.am
@@ -0,0 +1,44 @@
+# Copyright © 2016 Red Hat.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the

[Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean

2016-06-14 Thread Rob Clark

From: Rob Clark 

s/bool/boolean/ to make it match the other APIs.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/freedreno_query.c | 2 +-
 src/gallium/drivers/i915/i915_query.c   | 2 +-
 src/gallium/drivers/ilo/ilo_query.c | 2 +-
 src/gallium/drivers/llvmpipe/lp_query.c | 2 +-
 src/gallium/drivers/noop/noop_pipe.c| 2 +-
 src/gallium/drivers/nouveau/nv30/nv30_query.c   | 2 +-
 src/gallium/drivers/nouveau/nv50/nv50_query.c   | 2 +-
 src/gallium/drivers/r300/r300_query.c   | 4 ++--
 src/gallium/drivers/radeon/r600_query.c | 2 +-
 src/gallium/drivers/rbug/rbug_context.c | 4 ++--
 src/gallium/drivers/softpipe/sp_query.c | 2 +-
 src/gallium/drivers/svga/svga_pipe_query.c  | 2 +-
 src/gallium/drivers/swr/swr_query.cpp   | 2 +-
 src/gallium/drivers/vc4/vc4_query.c | 2 +-
 src/gallium/drivers/virgl/virgl_query.c | 4 ++--
 src/gallium/include/pipe/p_context.h| 2 +-
 16 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_query.c 
b/src/gallium/drivers/freedreno/freedreno_query.c
index 18e0c79..fc50076 100644
--- a/src/gallium/drivers/freedreno/freedreno_query.c
+++ b/src/gallium/drivers/freedreno/freedreno_query.c
@@ -66,7 +66,7 @@ fd_begin_query(struct pipe_context *pctx, struct pipe_query 
*pq)
return q->funcs->begin_query(fd_context(pctx), q);
 }
 
-static bool
+static boolean
 fd_end_query(struct pipe_context *pctx, struct pipe_query *pq)
 {
struct fd_query *q = fd_query(pq);
diff --git a/src/gallium/drivers/i915/i915_query.c 
b/src/gallium/drivers/i915/i915_query.c
index d6015a6..9d5569a 100644
--- a/src/gallium/drivers/i915/i915_query.c
+++ b/src/gallium/drivers/i915/i915_query.c
@@ -60,7 +60,7 @@ static boolean i915_begin_query(struct pipe_context *ctx,
return true;
 }
 
-static bool i915_end_query(struct pipe_context *ctx, struct pipe_query *query)
+static boolean i915_end_query(struct pipe_context *ctx, struct pipe_query *query)
 {
return true;
 }
diff --git a/src/gallium/drivers/ilo/ilo_query.c 
b/src/gallium/drivers/ilo/ilo_query.c
index 3088c96..98c3f6d 100644
--- a/src/gallium/drivers/ilo/ilo_query.c
+++ b/src/gallium/drivers/ilo/ilo_query.c
@@ -128,7 +128,7 @@ ilo_begin_query(struct pipe_context *pipe, struct 
pipe_query *query)
return true;
 }
 
-static bool
+static boolean
 ilo_end_query(struct pipe_context *pipe, struct pipe_query *query)
 {
struct ilo_query *q = ilo_query(query);
diff --git a/src/gallium/drivers/llvmpipe/lp_query.c 
b/src/gallium/drivers/llvmpipe/lp_query.c
index d5ed656..ffc4f56c 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -239,7 +239,7 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct 
pipe_query *q)
 }
 
 
-static bool
+static boolean
 llvmpipe_end_query(struct pipe_context *pipe, struct pipe_query *q)
 {
struct llvmpipe_context *llvmpipe = llvmpipe_context( pipe );
diff --git a/src/gallium/drivers/noop/noop_pipe.c 
b/src/gallium/drivers/noop/noop_pipe.c
index 99e5f1a..e58507b 100644
--- a/src/gallium/drivers/noop/noop_pipe.c
+++ b/src/gallium/drivers/noop/noop_pipe.c
@@ -63,7 +63,7 @@ static boolean noop_begin_query(struct pipe_context *ctx, 
struct pipe_query *que
return true;
 }
 
-static bool noop_end_query(struct pipe_context *ctx, struct pipe_query *query)
+static boolean noop_end_query(struct pipe_context *ctx, struct pipe_query *query)
 {
return true;
 }
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_query.c 
b/src/gallium/drivers/nouveau/nv30/nv30_query.c
index aa9a12f..0f4d9b4 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_query.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_query.c
@@ -175,7 +175,7 @@ nv30_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
return true;
 }
 
-static bool
+static boolean
 nv30_query_end(struct pipe_context *pipe, struct pipe_query *pq)
 {
struct nv30_context *nv30 = nv30_context(pipe);
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 9a1397a..9124946 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -54,7 +54,7 @@ nv50_begin_query(struct pipe_context *pipe, struct pipe_query 
*pq)
return q->funcs->begin_query(nv50_context(pipe), q);
 }
 
-static bool
+static boolean
 nv50_end_query(struct pipe_context *pipe, struct pipe_query *pq)
 {
struct nv50_query *q = nv50_query(pq);
diff --git a/src/gallium/drivers/r300/r300_query.c 
b/src/gallium/drivers/r300/r300_query.c
index 79e2198..1428d03 100644
--- a/src/gallium/drivers/r300/r300_query.c
+++ b/src/gallium/drivers/r300/r300_query.c
@@ -112,8 +112,8 @@ void r300_stop_query(struct r300_context *r300)
 r300->query_current = NULL;
 }
 
-static bool r300_end_query(struct pipe_context* pipe,
-  struct pipe_query* query)

[Mesa-dev] [PATCH 02/10] gallium: make shader_buffers const

2016-06-14 Thread Rob Clark

From: Rob Clark 

Be consistent with the rest of the "set_xyz" state interfaces.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/ddebug/dd_context.c   | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 6 +++---
 src/gallium/drivers/radeonsi/si_descriptors.c | 4 ++--
 src/gallium/drivers/softpipe/sp_state_image.c | 2 +-
 src/gallium/drivers/trace/tr_context.c| 2 +-
 src/gallium/include/pipe/p_context.h  | 2 +-
 6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_context.c 
b/src/gallium/drivers/ddebug/dd_context.c
index 06b7c91..07c46dd 100644
--- a/src/gallium/drivers/ddebug/dd_context.c
+++ b/src/gallium/drivers/ddebug/dd_context.c
@@ -503,7 +503,7 @@ dd_context_set_shader_images(struct pipe_context *_pipe, 
unsigned shader,
 static void
 dd_context_set_shader_buffers(struct pipe_context *_pipe, unsigned shader,
   unsigned start, unsigned num_buffers,
-  struct pipe_shader_buffer *buffers)
+  const struct pipe_shader_buffer *buffers)
 {
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index a9c1830..d10a88d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1315,8 +1315,8 @@ nvc0_set_shader_images(struct pipe_context *pipe, 
unsigned shader,
 
 static bool
 nvc0_bind_buffers_range(struct nvc0_context *nvc0, const unsigned t,
- unsigned start, unsigned nr,
- struct pipe_shader_buffer *pbuffers)
+unsigned start, unsigned nr,
+const struct pipe_shader_buffer *pbuffers)
 {
const unsigned end = start + nr;
unsigned mask = 0;
@@ -1366,7 +1366,7 @@ static void
 nvc0_set_shader_buffers(struct pipe_context *pipe,
 unsigned shader,
 unsigned start, unsigned nr,
-struct pipe_shader_buffer *buffers)
+const struct pipe_shader_buffer *buffers)
 {
const unsigned s = nvc0_shader_stage(shader);
if (!nvc0_bind_buffers_range(nvc0_context(pipe), s, start, nr, buffers))
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 2d780e6..5ad251f 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -1040,7 +1040,7 @@ si_shader_buffer_descriptors(struct si_context *sctx, 
unsigned shader)
 
 static void si_set_shader_buffers(struct pipe_context *ctx, unsigned shader,
  unsigned start_slot, unsigned count,
- struct pipe_shader_buffer *sbuffers)
+ const struct pipe_shader_buffer *sbuffers)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct si_buffer_resources *buffers = &sctx->shader_buffers[shader];
@@ -1050,7 +1050,7 @@ static void si_set_shader_buffers(struct pipe_context 
*ctx, unsigned shader,
assert(start_slot + count <= SI_NUM_SHADER_BUFFERS);
 
for (i = 0; i < count; ++i) {
-   struct pipe_shader_buffer *sbuffer = sbuffers ? &sbuffers[i] : NULL;
+   const struct pipe_shader_buffer *sbuffer = sbuffers ? &sbuffers[i] : NULL;
struct r600_resource *buf;
unsigned slot = start_slot + i;
uint32_t *desc = descs->list + slot * 4;
diff --git a/src/gallium/drivers/softpipe/sp_state_image.c 
b/src/gallium/drivers/softpipe/sp_state_image.c
index b1810d3..81bb7ca 100644
--- a/src/gallium/drivers/softpipe/sp_state_image.c
+++ b/src/gallium/drivers/softpipe/sp_state_image.c
@@ -56,7 +56,7 @@ static void softpipe_set_shader_buffers(struct pipe_context 
*pipe,
 unsigned shader,
 unsigned start,
 unsigned num,
-struct pipe_shader_buffer *buffers)
+const struct pipe_shader_buffer *buffers)
 {
struct softpipe_context *softpipe = softpipe_context(pipe);
unsigned i;
diff --git a/src/gallium/drivers/trace/tr_context.c 
b/src/gallium/drivers/trace/tr_context.c
index 0dd07e9..041a47c 100644
--- a/src/gallium/drivers/trace/tr_context.c
+++ b/src/gallium/drivers/trace/tr_context.c
@@ -1670,7 +1670,7 @@ trace_context_set_tess_state(struct pipe_context 
*_context,
 static void trace_context_set_shader_buffers(struct pipe_context *_context,
  unsigned shader,
  unsigned start, unsigned nr,
- struct pipe_shader_buffer *buffers)

[Mesa-dev] [PATCH 03/10] gallium: make constant_buffer const

2016-06-14 Thread Rob Clark

From: Rob Clark 

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/ddebug/dd_context.c | 2 +-
 src/gallium/drivers/freedreno/freedreno_state.c | 2 +-
 src/gallium/drivers/i915/i915_state.c   | 2 +-
 src/gallium/drivers/ilo/ilo_state.c | 2 +-
 src/gallium/drivers/llvmpipe/lp_state_fs.c  | 2 +-
 src/gallium/drivers/noop/noop_state.c   | 2 +-
 src/gallium/drivers/nouveau/nv30/nv30_state.c   | 2 +-
 src/gallium/drivers/nouveau/nv50/nv50_state.c   | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 2 +-
 src/gallium/drivers/r300/r300_state.c   | 2 +-
 src/gallium/drivers/r600/r600_state_common.c| 2 +-
 src/gallium/drivers/radeonsi/si_descriptors.c   | 6 +++---
 src/gallium/drivers/radeonsi/si_state.h | 3 +--
 src/gallium/drivers/rbug/rbug_context.c | 2 +-
 src/gallium/drivers/softpipe/sp_state_shader.c  | 2 +-
 src/gallium/drivers/svga/svga_pipe_constants.c  | 2 +-
 src/gallium/drivers/swr/swr_state.cpp   | 2 +-
 src/gallium/drivers/trace/tr_context.c  | 2 +-
 src/gallium/drivers/vc4/vc4_state.c | 2 +-
 src/gallium/drivers/virgl/virgl_context.c   | 2 +-
 src/gallium/include/pipe/p_context.h| 2 +-
 21 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_context.c 
b/src/gallium/drivers/ddebug/dd_context.c
index 07c46dd..64b16f6 100644
--- a/src/gallium/drivers/ddebug/dd_context.c
+++ b/src/gallium/drivers/ddebug/dd_context.c
@@ -343,7 +343,7 @@ DD_IMM_STATE(polygon_stipple, const struct 
pipe_poly_stipple, *state, state)
 static void
 dd_context_set_constant_buffer(struct pipe_context *_pipe,
uint shader, uint index,
-   struct pipe_constant_buffer *constant_buffer)
+   const struct pipe_constant_buffer *constant_buffer)
 {
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
diff --git a/src/gallium/drivers/freedreno/freedreno_state.c 
b/src/gallium/drivers/freedreno/freedreno_state.c
index e4df909..53ea39b 100644
--- a/src/gallium/drivers/freedreno/freedreno_state.c
+++ b/src/gallium/drivers/freedreno/freedreno_state.c
@@ -89,7 +89,7 @@ fd_set_sample_mask(struct pipe_context *pctx, unsigned 
sample_mask)
  */
 static void
 fd_set_constant_buffer(struct pipe_context *pctx, uint shader, uint index,
-   struct pipe_constant_buffer *cb)
+   const struct pipe_constant_buffer *cb)
 {
struct fd_context *ctx = fd_context(pctx);
struct fd_constbuf_stateobj *so = &ctx->constbuf[shader];
diff --git a/src/gallium/drivers/i915/i915_state.c 
b/src/gallium/drivers/i915/i915_state.c
index 8fa2f42..2efa14e 100644
--- a/src/gallium/drivers/i915/i915_state.c
+++ b/src/gallium/drivers/i915/i915_state.c
@@ -675,7 +675,7 @@ static void i915_delete_vs_state(struct pipe_context *pipe, 
void *shader)
 
 static void i915_set_constant_buffer(struct pipe_context *pipe,
  uint shader, uint index,
- struct pipe_constant_buffer *cb)
+ const struct pipe_constant_buffer *cb)
 {
struct i915_context *i915 = i915_context(pipe);
struct pipe_resource *buf = cb ? cb->buffer : NULL;
diff --git a/src/gallium/drivers/ilo/ilo_state.c 
b/src/gallium/drivers/ilo/ilo_state.c
index 37234ec..53a5aca 100644
--- a/src/gallium/drivers/ilo/ilo_state.c
+++ b/src/gallium/drivers/ilo/ilo_state.c
@@ -1536,7 +1536,7 @@ ilo_set_clip_state(struct pipe_context *pipe,
 static void
 ilo_set_constant_buffer(struct pipe_context *pipe,
 uint shader, uint index,
-struct pipe_constant_buffer *buf)
+const struct pipe_constant_buffer *buf)
 {
const struct ilo_dev *dev = ilo_context(pipe)->dev;
struct ilo_state_vector *vec = &ilo_context(pipe)->state_vector;
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c 
b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 7dceff7..429b082 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -2836,7 +2836,7 @@ llvmpipe_delete_fs_state(struct pipe_context *pipe, void 
*fs)
 static void
 llvmpipe_set_constant_buffer(struct pipe_context *pipe,
  uint shader, uint index,
- struct pipe_constant_buffer *cb)
+ const struct pipe_constant_buffer *cb)
 {
struct llvmpipe_context *llvmpipe = llvmpipe_context(pipe);
struct pipe_resource *constants = cb ? cb->buffer : NULL;
diff --git a/src/gallium/drivers/noop/noop_state.c 
b/src/gallium/drivers/noop/noop_state.c
index fe5b5e4..0ddffa2 100644
--- a/src/gallium/drivers/noop/noop_state.c
+++ b/src/gallium/drivers/noop/noop_state.c
@@ -176,7 +176,7 @@ static void noop_set_framebuffer_state(struct pipe_context 
*ctx,
 
 static void noop_set_cons

[Mesa-dev] [PATCH 04/10] gallium: make image_view const

2016-06-14 Thread Rob Clark

From: Rob Clark 

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/ddebug/dd_context.c   | 2 +-
 src/gallium/drivers/ilo/ilo_state.c   | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 4 ++--
 src/gallium/drivers/radeonsi/si_descriptors.c | 6 +++---
 src/gallium/drivers/softpipe/sp_state_image.c | 2 +-
 src/gallium/drivers/trace/tr_context.c| 2 +-
 src/gallium/include/pipe/p_context.h  | 2 +-
 7 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_context.c 
b/src/gallium/drivers/ddebug/dd_context.c
index 64b16f6..f72fd2f 100644
--- a/src/gallium/drivers/ddebug/dd_context.c
+++ b/src/gallium/drivers/ddebug/dd_context.c
@@ -490,7 +490,7 @@ dd_context_set_sampler_views(struct pipe_context *_pipe, 
unsigned shader,
 static void
 dd_context_set_shader_images(struct pipe_context *_pipe, unsigned shader,
  unsigned start, unsigned num,
- struct pipe_image_view *views)
+ const struct pipe_image_view *views)
 {
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
diff --git a/src/gallium/drivers/ilo/ilo_state.c 
b/src/gallium/drivers/ilo/ilo_state.c
index 53a5aca..4f1002e 100644
--- a/src/gallium/drivers/ilo/ilo_state.c
+++ b/src/gallium/drivers/ilo/ilo_state.c
@@ -1851,7 +1851,7 @@ ilo_set_sampler_views(struct pipe_context *pipe, unsigned 
shader,
 static void
 ilo_set_shader_images(struct pipe_context *pipe, unsigned shader,
   unsigned start, unsigned count,
-  struct pipe_image_view *views)
+  const struct pipe_image_view *views)
 {
 #if 0
struct ilo_state_vector *vec = &ilo_context(pipe)->state_vector;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index a0e01bd..0bd756f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1233,7 +1233,7 @@ nvc0_set_compute_resources(struct pipe_context *pipe,
 static bool
 nvc0_bind_images_range(struct nvc0_context *nvc0, const unsigned s,
unsigned start, unsigned nr,
-   struct pipe_image_view *pimages)
+   const struct pipe_image_view *pimages)
 {
const unsigned end = start + nr;
unsigned mask = 0;
@@ -1301,7 +1301,7 @@ nvc0_bind_images_range(struct nvc0_context *nvc0, const 
unsigned s,
 static void
 nvc0_set_shader_images(struct pipe_context *pipe, unsigned shader,
unsigned start, unsigned nr,
-   struct pipe_image_view *images)
+   const struct pipe_image_view *images)
 {
const unsigned s = nvc0_shader_stage(shader);
if (!nvc0_bind_images_range(nvc0_context(pipe), s, start, nr, images))
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 55686e8..e95556b 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -560,7 +560,7 @@ si_disable_shader_image(struct si_context *ctx, unsigned 
shader, unsigned slot)
 }
 
 static void
-si_mark_image_range_valid(struct pipe_image_view *view)
+si_mark_image_range_valid(const struct pipe_image_view *view)
 {
struct r600_resource *res = (struct r600_resource *)view->resource;
const struct util_format_description *desc;
@@ -578,7 +578,7 @@ si_mark_image_range_valid(struct pipe_image_view *view)
 
 static void si_set_shader_image(struct si_context *ctx,
unsigned shader,
-   unsigned slot, struct pipe_image_view *view)
+   unsigned slot, const struct pipe_image_view *view)
 {
struct si_screen *screen = ctx->screen;
struct si_images_info *images = &ctx->images[shader];
@@ -674,7 +674,7 @@ static void si_set_shader_image(struct si_context *ctx,
 static void
 si_set_shader_images(struct pipe_context *pipe, unsigned shader,
 unsigned start_slot, unsigned count,
-struct pipe_image_view *views)
+const struct pipe_image_view *views)
 {
struct si_context *ctx = (struct si_context *)pipe;
unsigned i, slot;
diff --git a/src/gallium/drivers/softpipe/sp_state_image.c 
b/src/gallium/drivers/softpipe/sp_state_image.c
index 81bb7ca..553a76a 100644
--- a/src/gallium/drivers/softpipe/sp_state_image.c
+++ b/src/gallium/drivers/softpipe/sp_state_image.c
@@ -30,7 +30,7 @@ static void softpipe_set_shader_images(struct pipe_context 
*pipe,
unsigned shader,
unsigned start,
unsigned num,
-   struct pipe_image_view *images)
+   const struct pi

Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state

2016-06-14 Thread Ilia Mirkin

Can you explain the motivation behind this change? I'm adding a
->set_window_rectangles thing which also takes multiple parameters.
What's the advantage of stuffing things into a struct first?

  -ilia
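
For readers following the thread, the two calling conventions being compared can be sketched in isolation. This is a minimal, self-contained illustration and not code from Mesa: `struct tess_state`, `struct ctx` and both setter names are hypothetical stand-ins, and only the two field names are taken from the quoted dd_context hunk.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical state struct; field names follow the quoted dd_context hunk. */
struct tess_state {
   float default_outer_level[4];
   float default_inner_level[2];
};

/* Hypothetical driver context that stores the six levels back to back. */
struct ctx {
   float tess_default_levels[6];
};

/* Old form: one array parameter per piece of state. */
static void
set_tess_state_inline(struct ctx *c,
                      const float default_outer_level[4],
                      const float default_inner_level[2])
{
   memcpy(c->tess_default_levels, default_outer_level, sizeof(float) * 4);
   memcpy(c->tess_default_levels + 4, default_inner_level, sizeof(float) * 2);
}

/* New form: a single pointer to a state struct carrying the same data. */
static void
set_tess_state_struct(struct ctx *c, const struct tess_state *state)
{
   memcpy(c->tess_default_levels, state->default_outer_level,
          sizeof(float) * 4);
   memcpy(c->tess_default_levels + 4, state->default_inner_level,
          sizeof(float) * 2);
}
```

Both setters store identical data; the difference is only in the signature. One possible argument for the struct form is that a wrapper can forward the call by passing a single pointer through, though whether that outweighs the extra struct is exactly the question raised above.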

On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark  wrote:
> From: Rob Clark 
>
> The rest of the state APIs take state structs, rather than inline
> parameters (with the exception of a couple which just amount to a single
> uint).
>
> This makes the API more regular and simplifies autogeneration of the
> gallium state related APIs.
>
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/drivers/ddebug/dd_context.c   |  9 -
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  7 +++
>  src/gallium/drivers/r600/evergreen_state.c|  7 +++
>  src/gallium/drivers/radeonsi/si_state.c   |  7 +++
>  src/gallium/drivers/trace/tr_context.c|  9 -
>  src/gallium/include/pipe/p_context.h  |  4 ++--
>  src/gallium/include/pipe/p_state.h|  8 
>  src/mesa/state_tracker/st_atom_tess.c | 13 ++---
>  8 files changed, 37 insertions(+), 27 deletions(-)
>
> diff --git a/src/gallium/drivers/ddebug/dd_context.c 
> b/src/gallium/drivers/ddebug/dd_context.c
> index 0f8ef18..06b7c91 100644
> --- a/src/gallium/drivers/ddebug/dd_context.c
> +++ b/src/gallium/drivers/ddebug/dd_context.c
> @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context 
> *_pipe,
>  }
>
>  static void dd_context_set_tess_state(struct pipe_context *_pipe,
> -  const float default_outer_level[4],
> -  const float default_inner_level[2])
> +  const struct pipe_tess_state *state)
>  {
> struct dd_context *dctx = dd_context(_pipe);
> struct pipe_context *pipe = dctx->pipe;
>
> -   memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4);
> -   memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 2);
> -   pipe->set_tess_state(pipe, default_outer_level, default_inner_level);
> +   memcpy(dctx->tess_default_levels, state->default_outer_level, sizeof(float) * 4);
> +   memcpy(dctx->tess_default_levels+4, state->default_inner_level, sizeof(float) * 2);
> +   pipe->set_tess_state(pipe, state);
>  }
>
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 92161ec..a9c1830 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe,
>
>  static void
>  nvc0_set_tess_state(struct pipe_context *pipe,
> -const float default_tess_outer[4],
> -const float default_tess_inner[2])
> +const struct pipe_tess_state *state)
>  {
> struct nvc0_context *nvc0 = nvc0_context(pipe);
>
> -   memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float));
> -   memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float));
> +   memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * sizeof(float));
> +   memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * sizeof(float));
> nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR;
>  }
>
> diff --git a/src/gallium/drivers/r600/evergreen_state.c 
> b/src/gallium/drivers/r600/evergreen_state.c
> index 1ac8914..2a424f5 100644
> --- a/src/gallium/drivers/r600/evergreen_state.c
> +++ b/src/gallium/drivers/r600/evergreen_state.c
> @@ -3569,13 +3569,12 @@ fallback:
>  }
>
>  static void evergreen_set_tess_state(struct pipe_context *ctx,
> -const float default_outer_level[4],
> -const float default_inner_level[2])
> +const struct pipe_tess_state *state)
>  {
> struct r600_context *rctx = (struct r600_context *)ctx;
>
> -   memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4);
> -   memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2);
> +   memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * 4);
> +   memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) * 2);
> rctx->tess_state_dirty = true;
>  }
>
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index 0c52eee..6ef3fe5 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -3238,15 +3238,14 @@ static void si_set_index_buffer(struct pipe_context 
> *ctx,
>   */
>
>  static void si_set_tess_state(struct pipe_context *ctx,
> - const float default_outer_level[4],
> - const float default_inner_level[2])
> + const struct pipe_tess_state *state)
>  {
> struct si_contex

[Mesa-dev] [PATCH 06/10] gallium/util: add util_copy_index_buffer() helper

2016-06-14 Thread Rob Clark

From: Rob Clark 

Note there was previously a util_set_index_buffer() which was only used
by svga.  Replace this.

(The util_copy_* naming is more consistent with other u_inlines/
u_framebuffer helpers)

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/util/u_helpers.c  | 15 ---
 src/gallium/auxiliary/util/u_helpers.h  |  3 ---
 src/gallium/auxiliary/util/u_inlines.h  | 17 +
 src/gallium/drivers/freedreno/freedreno_state.c | 11 +--
 src/gallium/drivers/i915/i915_state.c   |  6 +-
 src/gallium/drivers/ilo/ilo_state.c | 10 +-
 src/gallium/drivers/llvmpipe/lp_state_vertex.c  |  6 +-
 src/gallium/drivers/nouveau/nv30/nv30_state.c   | 11 +--
 src/gallium/drivers/r300/r300_state.c   |  8 +---
 src/gallium/drivers/r600/r600_state_common.c|  5 +
 src/gallium/drivers/radeonsi/si_state.c |  6 +-
 src/gallium/drivers/softpipe/sp_state_vertex.c  |  6 +-
 src/gallium/drivers/svga/svga_pipe_vertex.c |  2 +-
 src/gallium/drivers/swr/swr_state.cpp   |  7 +--
 src/gallium/drivers/vc4/vc4_state.c | 11 +--
 src/gallium/drivers/virgl/virgl_context.c   |  8 +---
 16 files changed, 30 insertions(+), 102 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_helpers.c 
b/src/gallium/auxiliary/util/u_helpers.c
index 09020b0..117a51b 100644
--- a/src/gallium/auxiliary/util/u_helpers.c
+++ b/src/gallium/auxiliary/util/u_helpers.c
@@ -94,18 +94,3 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer 
*dst,
 
*dst_count = util_last_bit(enabled_buffers);
 }
-
-
-void
-util_set_index_buffer(struct pipe_index_buffer *dst,
-  const struct pipe_index_buffer *src)
-{
-   if (src) {
-  pipe_resource_reference(&dst->buffer, src->buffer);
-  memcpy(dst, src, sizeof(*dst));
-   }
-   else {
-  pipe_resource_reference(&dst->buffer, NULL);
-  memset(dst, 0, sizeof(*dst));
-   }
-}
diff --git a/src/gallium/auxiliary/util/u_helpers.h 
b/src/gallium/auxiliary/util/u_helpers.h
index a9a53e4..9804163 100644
--- a/src/gallium/auxiliary/util/u_helpers.h
+++ b/src/gallium/auxiliary/util/u_helpers.h
@@ -44,9 +44,6 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer 
*dst,
const struct pipe_vertex_buffer *src,
unsigned start_slot, unsigned count);
 
-void util_set_index_buffer(struct pipe_index_buffer *dst,
-   const struct pipe_index_buffer *src);
-
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/gallium/auxiliary/util/u_inlines.h 
b/src/gallium/auxiliary/util/u_inlines.h
index 207e2aa..78125c8 100644
--- a/src/gallium/auxiliary/util/u_inlines.h
+++ b/src/gallium/auxiliary/util/u_inlines.h
@@ -623,6 +623,23 @@ util_copy_constant_buffer(struct pipe_constant_buffer *dst,
 }
 
 static inline void
+util_copy_index_buffer(struct pipe_index_buffer *dst,
+   const struct pipe_index_buffer *src)
+{
+   if (src) {
+  dst->index_size = src->index_size;
+  dst->offset = src->offset;
+  pipe_resource_reference(&dst->buffer, src->buffer);
+  dst->user_buffer = src->user_buffer;
+   } else {
+  dst->index_size = 0;
+  dst->offset = 0;
+  pipe_resource_reference(&dst->buffer, NULL);
+  dst->user_buffer = NULL;
+   }
+}
+
+static inline void
 util_copy_image_view(struct pipe_image_view *dst,
  const struct pipe_image_view *src)
 {
diff --git a/src/gallium/drivers/freedreno/freedreno_state.c 
b/src/gallium/drivers/freedreno/freedreno_state.c
index 53ea39b..688975f 100644
--- a/src/gallium/drivers/freedreno/freedreno_state.c
+++ b/src/gallium/drivers/freedreno/freedreno_state.c
@@ -207,16 +207,7 @@ fd_set_index_buffer(struct pipe_context *pctx,
const struct pipe_index_buffer *ib)
 {
struct fd_context *ctx = fd_context(pctx);
-
-   if (ib) {
-   pipe_resource_reference(&ctx->indexbuf.buffer, ib->buffer);
-   ctx->indexbuf.index_size = ib->index_size;
-   ctx->indexbuf.offset = ib->offset;
-   ctx->indexbuf.user_buffer = ib->user_buffer;
-   } else {
-   pipe_resource_reference(&ctx->indexbuf.buffer, NULL);
-   }
-
+   util_copy_index_buffer(&ctx->indexbuf, ib);
ctx->dirty |= FD_DIRTY_INDEXBUF;
 }
 
diff --git a/src/gallium/drivers/i915/i915_state.c 
b/src/gallium/drivers/i915/i915_state.c
index 2efa14e..dbd711f 100644
--- a/src/gallium/drivers/i915/i915_state.c
+++ b/src/gallium/drivers/i915/i915_state.c
@@ -1063,11 +1063,7 @@ static void i915_set_index_buffer(struct pipe_context 
*pipe,
   const struct pipe_index_buffer *ib)
 {
struct i915_context *i915 = i915_context(pipe);
-
-   if (ib)
-  memcpy(&i915->index_buffer, ib, sizeof(i915->index_buffer));
-   else
-  memset(&i915->index_buffer, 0, sizeof(i915->index

[Mesa-dev] [PATCH (backport)] radeonsi: mark buffer texture range valid for shader images

2016-06-14 Thread Nicolai Hähnle

From: Nicolai Hähnle 

When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.

This fixes a bug that was exposed in Xephyr by PBO acceleration for
glReadPixels, reported by Michel Dänzer.

Cc: Michel Dänzer 
Cc: 12.0 
Reviewed-by: Marek Olšák 

Back-ported from commit a64c7cd2bac33a3a2bf908b5ef538dff03b93b73:
- include util/u_format.h
- code was extracted to si_set_shader_image in master, move it back

Signed-off-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 855b79e..e8ce87b 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -60,6 +60,7 @@
 #include "si_shader.h"
 #include "sid.h"
 
+#include "util/u_format.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
 #include "util/u_suballoc.h"
@@ -471,6 +472,23 @@ si_disable_shader_image(struct si_images_info *images, 
unsigned slot)
 }
 
 static void
+si_mark_image_range_valid(struct pipe_image_view *view)
+{
+   struct r600_resource *res = (struct r600_resource *)view->resource;
+   const struct util_format_description *desc;
+   unsigned stride;
+
+   assert(res && res->b.b.target == PIPE_BUFFER);
+
+   desc = util_format_description(view->format);
+   stride = desc->block.bits / 8;
+
+   util_range_add(&res->valid_buffer_range,
+  stride * (view->u.buf.first_element),
+  stride * (view->u.buf.last_element + 1));
+}
+
+static void
 si_set_shader_images(struct pipe_context *pipe, unsigned shader,
 unsigned start_slot, unsigned count,
 struct pipe_image_view *views)
@@ -502,6 +520,9 @@ si_set_shader_images(struct pipe_context *pipe, unsigned 
shader,
   RADEON_USAGE_READWRITE);
 
if (res->b.b.target == PIPE_BUFFER) {
+   if (views[i].access & PIPE_IMAGE_ACCESS_WRITE)
+   si_mark_image_range_valid(&views[i]);
+
si_make_buffer_descriptor(screen, res,
  views[i].format,
  views[i].u.buf.first_element,
@@ -1297,6 +1318,9 @@ static void si_invalidate_buffer(struct pipe_context 
*ctx, struct pipe_resource
unsigned i = u_bit_scan(&mask);
 
if (images->views[i].resource == buf) {
+   if (images->views[i].access & PIPE_IMAGE_ACCESS_WRITE)
+   si_mark_image_range_valid(&images->views[i]);
+
si_desc_reset_buffer_offset(
ctx, images->desc.list + i * 8 + 4,
old_va, buf);
-- 
2.7.4
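
The range arithmetic in si_mark_image_range_valid() can be tried out in isolation. The sketch below assumes a simplified half-open byte range: `struct byte_range`, `range_init()`, `range_add()` and `mark_elements_valid()` are hypothetical stand-ins for Mesa's util_range helpers and do no locking.

```c
#include <assert.h>

/* Hypothetical stand-in for struct util_range: a half-open byte range
 * [start, end). start > end means the range is empty. */
struct byte_range {
   unsigned start;
   unsigned end;
};

/* Reset the range to "empty". */
static void
range_init(struct byte_range *r)
{
   r->start = ~0u;
   r->end = 0;
}

/* Grow the range so that it also covers [start, end). */
static void
range_add(struct byte_range *r, unsigned start, unsigned end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Mark the bytes of elements first..last (inclusive) as valid, using the
 * same stride * first / stride * (last + 1) arithmetic as the patch. */
static void
mark_elements_valid(struct byte_range *r, unsigned bytes_per_element,
                    unsigned first_element, unsigned last_element)
{
   assert(bytes_per_element > 0 && first_element <= last_element);
   range_add(r, bytes_per_element * first_element,
             bytes_per_element * (last_element + 1));
}
```

For a 16-byte format, elements 2 through 5 cover bytes 32 up to, but not including, 96. The `+ 1` is needed because last_element is inclusive while the end of the range is exclusive.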

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/10] gallium/util: add util_copy_vertex_buffer helper

2016-06-14 Thread Rob Clark

From: Rob Clark 

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/util/u_inlines.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_inlines.h 
b/src/gallium/auxiliary/util/u_inlines.h
index ebaf368..93171d9 100644
--- a/src/gallium/auxiliary/util/u_inlines.h
+++ b/src/gallium/auxiliary/util/u_inlines.h
@@ -671,6 +671,23 @@ util_copy_image_view(struct pipe_image_view *dst,
}
 }
 
+static inline void
+util_copy_vertex_buffer(struct pipe_vertex_buffer *dst,
+const struct pipe_vertex_buffer *src)
+{
+   if (src) {
+  dst->stride = src->stride;
+  dst->buffer_offset = src->buffer_offset;
+  pipe_resource_reference(&dst->buffer, src->buffer);
+  dst->user_buffer = src->user_buffer;
+   } else {
+  dst->stride = 0;
+  dst->buffer_offset = 0;
+  pipe_resource_reference(&dst->buffer, NULL);
+  dst->user_buffer = NULL;
+   }
+}
+
 static inline unsigned
 util_max_layer(const struct pipe_resource *r, unsigned level)
 {
-- 
2.5.5
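
The reason the helper copies field by field instead of using memcpy() is the reference-counted buffer pointer. A minimal sketch of that pattern follows, assuming simplified stand-in types: `struct res`, `res_reference()`, `struct vbuf` and `copy_vbuf()` are hypothetical and only mimic the shape of pipe_resource_reference() and the helper added here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pipe_resource: only a reference count. */
struct res {
   int refcount;
};

/* Hypothetical stand-in for pipe_resource_reference(): make *dst point at
 * src, taking a reference on src and dropping the one held via *dst. */
static void
res_reference(struct res **dst, struct res *src)
{
   assert(dst);
   if (*dst == src)
      return;
   if (src)
      src->refcount++;
   if (*dst && --(*dst)->refcount == 0)
      free(*dst);
   *dst = src;
}

/* Simplified binding with the same four fields the patch copies. */
struct vbuf {
   unsigned stride;
   unsigned buffer_offset;
   struct res *buffer;
   const void *user_buffer;
};

/* Copy src into dst, or clear dst when src is NULL. The buffer pointer
 * always goes through res_reference() so the counts stay balanced. */
static void
copy_vbuf(struct vbuf *dst, const struct vbuf *src)
{
   if (src) {
      dst->stride = src->stride;
      dst->buffer_offset = src->buffer_offset;
      res_reference(&dst->buffer, src->buffer);
      dst->user_buffer = src->user_buffer;
   } else {
      dst->stride = 0;
      dst->buffer_offset = 0;
      res_reference(&dst->buffer, NULL);
      dst->user_buffer = NULL;
   }
}
```

A plain memcpy() over the whole struct would overwrite dst->buffer without releasing the old reference or taking a new one, which is why the pointer is handled separately from the scalar fields.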
