from:"Samuel Iglesias Gonsálvez"

Re: [Mesa-dev] [PATCH] intel/fs: Use a pure vertical stride for large register strides

2017-11-06 Thread Samuel Iglesias Gonsálvez


On Thu, 2017-11-02 at 15:54 -0700, Jason Ekstrand wrote:
> Register strides higher than 4 are uncommon but they can happen.  For
> instance, if you have a 64-bit extract_u8 operation, we turn that
> into
> UB -> UQ MOV with a source stride of 8.  Our previous calculation
> would
> try to generate a stride of <32;8,8>:ub which is invalid because the
> maximum horizontal stride is 4.  To solve this problem, we instead
> use a
> stride of <8;1,0>.  As noted in the comment, this does not work as a
> destination but that's ok as very few things actually generate that
> stride.
> 

Great!

> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/compiler/brw_fs_generator.cpp | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_generator.cpp
> b/src/intel/compiler/brw_fs_generator.cpp
> index 46f9a33..a557f80 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -90,9 +90,18 @@ brw_reg_from_fs_reg(const struct gen_device_info
> *devinfo, fs_inst *inst,
>*   different execution size when the number of
> components
>*   written to each destination GRF is not the same.
>*/
> - const unsigned width = MIN2(reg_width, phys_width);
> - brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg-
> >nr, 0);
> - brw_reg = stride(brw_reg, width * reg->stride, width, reg-
> >stride);
> + if (reg->stride > 4) {
> +/* For registers with an exceptionally large stride, we
> use a
> + * width of 1 and only use the vertical stride.  This
> only works
> + * for sources since destinations require hstride == 1.
> + */
> +brw_reg = brw_vec1_reg(brw_file_from_reg(reg), reg->nr,
> 0);
> +brw_reg = stride(brw_reg, reg->stride, 1, 0);

I think it is a good idea to add an assert like:

   assert(reg != &inst->dst)

in order to avoid applying this to dst.

With or without that change,

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

> + } else {
> +const unsigned width = MIN2(reg_width, phys_width);
> +brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg),
> reg->nr, 0);
> +brw_reg = stride(brw_reg, width * reg->stride, width,
> reg->stride);
> + }
>  
>   if (devinfo->gen == 7 && !devinfo->is_haswell) {
>  /* From the IvyBridge PRM (EU Changes by Processor
> Generation, page 13):
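
For reference, the region choice in this patch can be sketched in isolation. This is a minimal, self-contained sketch and not the driver code: struct region, region_for_stride() and MAX_HSTRIDE are hypothetical names, and it ignores any encoding limits on the vertical stride. The is_dst assert plays the role of the check suggested in the review comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a <vstride;width,hstride> register region. */
struct region {
   unsigned vstride;
   unsigned width;
   unsigned hstride;
};

/* Largest horizontal stride the patch description says is valid. */
#define MAX_HSTRIDE 4u

/* Pick a region for a register whose elements are `stride` elements apart.
 * `width` is the number of channels that would normally share one row.
 */
static struct region
region_for_stride(unsigned stride, unsigned width, bool is_dst)
{
   struct region r;

   if (stride > MAX_HSTRIDE) {
      /* Width of 1, so only the vertical stride advances between channels.
       * Per the patch comment this only works for sources.
       */
      assert(!is_dst);
      r.vstride = stride;
      r.width = 1;
      r.hstride = 0;
   } else {
      r.vstride = width * stride;
      r.width = width;
      r.hstride = stride;
   }

   return r;
}
```

With a stride of 8 this yields <8;1,0> instead of the invalid <32;8,8>, while a stride of 2 and a width of 8 still yields <16;8,2>.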

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/4] intel/fs/nir: Return Q types from brw_reg_type_for_bit_size

2017-11-06 Thread Samuel Iglesias Gonsálvez

Patch series is,

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

On Thu, 2017-11-02 at 21:53 -0700, Jason Ekstrand wrote:
> Now that we're returning a sane type, we can drop the retyping to Q
> in
> nir_emit_load_const.
> 
> Cc: Jose Maria Casanova Crespo <jmcasan...@igalia.com>
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index cbd51a9..0b17e4f 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -264,7 +264,7 @@ brw_reg_type_from_bit_size(const unsigned
> bit_size,
>case 32:
>   return BRW_REGISTER_TYPE_D;
>case 64:
> - return BRW_REGISTER_TYPE_DF;
> + return BRW_REGISTER_TYPE_Q;
>default:
>   unreachable("Invalid bit size");
>}
> @@ -277,7 +277,7 @@ brw_reg_type_from_bit_size(const unsigned
> bit_size,
>case 32:
>   return BRW_REGISTER_TYPE_UD;
>case 64:
> - return BRW_REGISTER_TYPE_DF;
> + return BRW_REGISTER_TYPE_UQ;
>default:
>   unreachable("Invalid bit size");
>}
> @@ -1420,8 +1420,7 @@ fs_visitor::nir_emit_load_const(const
> fs_builder &bld,
>   }
>} else {
>   for (unsigned i = 0; i < instr->def.num_components; i++)
> -bld.MOV(retype(offset(reg, bld, i),
> BRW_REGISTER_TYPE_Q),
> -brw_imm_q(instr->value.i64[i]));
> +bld.MOV(offset(reg, bld, i), brw_imm_q(instr-
> >value.i64[i]));
>}
>break;
>  


Re: [Mesa-dev] [PATCH 1/2] i965: Be more clever about setting up our viewport clip

2017-11-06 Thread Samuel Iglesias Gonsálvez

Both patches are,

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

On Fri, 2017-11-03 at 15:31 -0700, Jason Ekstrand wrote:
> Before, we were trusting in the hardware to take the intersection
> of the viewport clip with the drawing rectangle.  Unfortunately,
> 3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
> does a full pipeline stall.  If we're a bit more careful with our
> viewport clipping, we can just re-emit it once at context creation
> time.
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 20 -
> ---
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index b7a6cd7..9fe90a2 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2451,24 +2451,28 @@ genX(upload_sf_clip_viewport)(struct
> brw_context *brw)
>  #elif GEN_GEN >= 8
>/* _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport
> * The hardware will take the intersection of the drawing
> rectangle,
> -   * scissor rectangle, and the viewport extents. We don't need
> to be
> -   * smart, and can therefore just program the viewport extents.
> +   * scissor rectangle, and the viewport extents.  However,
> emitting
> +   * 3DSTATE_DRAWING_RECTANGLE is expensive since it requires a
> full
> +   * pipeline stall so we're better off just being a little more
> clever
> +   * with our viewport so we can emit it once at context
> creation time.
> */
> +  const float viewport_Xmin = MAX2(ctx->ViewportArray[i].X, 0);
> +  const float viewport_Ymin = MAX2(ctx->ViewportArray[i].Y, 0);
>const float viewport_Xmax =
> - ctx->ViewportArray[i].X + ctx->ViewportArray[i].Width;
> + MIN2(ctx->ViewportArray[i].X + ctx->ViewportArray[i].Width, 
> fb_width);
>const float viewport_Ymax =
> - ctx->ViewportArray[i].Y + ctx->ViewportArray[i].Height;
> + MIN2(ctx->ViewportArray[i].Y + ctx-
> >ViewportArray[i].Height, fb_height);
>  
>if (render_to_fbo) {
> - sfv.XMinViewPort = ctx->ViewportArray[i].X;
> + sfv.XMinViewPort = viewport_Xmin;
>   sfv.XMaxViewPort = viewport_Xmax - 1;
> - sfv.YMinViewPort = ctx->ViewportArray[i].Y;
> + sfv.YMinViewPort = viewport_Ymin;
>   sfv.YMaxViewPort = viewport_Ymax - 1;
>} else {
> - sfv.XMinViewPort = ctx->ViewportArray[i].X;
> + sfv.XMinViewPort = viewport_Xmin;
>   sfv.XMaxViewPort = viewport_Xmax - 1;
>   sfv.YMinViewPort = fb_height - viewport_Ymax;
> - sfv.YMaxViewPort = fb_height - ctx->ViewportArray[i].Y - 1;
> + sfv.YMaxViewPort = fb_height - viewport_Ymin - 1;
>}
>  #endif
>  
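
The clamping described in this patch can be sketched on its own as follows. This is only an illustration under assumptions: struct viewport, struct extents and clamp_viewport() are hypothetical names, and the sketch leaves out the "- 1" adjustment and the Y flip that the patch applies when filling the sfv fields.

```c
#include <assert.h>

/* Hypothetical viewport description: origin plus size, in pixels. */
struct viewport {
   float x, y, width, height;
};

/* Hypothetical result: viewport extents clamped to the framebuffer. */
struct extents {
   float xmin, ymin, xmax, ymax;
};

static float minf(float a, float b) { return a < b ? a : b; }
static float maxf(float a, float b) { return a > b ? a : b; }

/* Intersect the viewport with [0, fb_width] x [0, fb_height] in software,
 * so the result no longer depends on a per-draw drawing rectangle.
 */
static struct extents
clamp_viewport(struct viewport vp, float fb_width, float fb_height)
{
   struct extents e;

   assert(fb_width >= 0.0f && fb_height >= 0.0f);

   e.xmin = maxf(vp.x, 0.0f);
   e.ymin = maxf(vp.y, 0.0f);
   e.xmax = minf(vp.x + vp.width, fb_width);
   e.ymax = minf(vp.y + vp.height, fb_height);

   return e;
}
```

For example, a viewport at (-10, -5) of size 100x50 on a 64x32 framebuffer clamps to the extents (0, 0) to (64, 32).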


[Mesa-dev] [PATCH] anv: fix bug when using component qualifier in FS outputs

2017-11-02 Thread Samuel Iglesias Gonsálvez

We can write to the same output but in different components, like
in this example:

layout(location = 0, component = 0) out ivec2 dEQP_FragColor_0;
layout(location = 0, component = 2) out ivec2 dEQP_FragColor_1;

Therefore, they are not two different outputs but only one.

Fixes:

dEQP-VK.glsl.440.linkage.varying.component.frag_out.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/shader_enums.h |  1 +
 src/intel/vulkan/anv_pipeline.c | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index 9d229d4199e..90729dbfd96 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -603,6 +603,7 @@ typedef enum
FRAG_RESULT_DATA5,
FRAG_RESULT_DATA6,
FRAG_RESULT_DATA7,
+   FRAG_RESULT_MAX, /**< Number of fragment program results */
 } gl_frag_result;
 
 const char *gl_frag_result_name(gl_frag_result result);
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 907b24a758d..be007f24e3f 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -872,6 +872,8 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
   unsigned num_rts = 0;
   struct anv_pipeline_binding rt_bindings[8];
   nir_function_impl *impl = nir_shader_get_entrypoint(nir);
+  int map_old_to_new_loc[FRAG_RESULT_MAX];
+  memset(map_old_to_new_loc, -1, sizeof(int) * FRAG_RESULT_MAX);
   nir_foreach_variable_safe(var, &nir->outputs) {
  if (var->data.location < FRAG_RESULT_DATA0)
 continue;
@@ -886,7 +888,13 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
  }
 
  /* Give it a new, compacted, location */
- var->data.location = FRAG_RESULT_DATA0 + num_rts;
+ if (var->data.location != -1) {
+if (map_old_to_new_loc[var->data.location] == -1)
+   map_old_to_new_loc[var->data.location] = FRAG_RESULT_DATA0 + num_rts;
+var->data.location = map_old_to_new_loc[var->data.location];
+ } else {
+var->data.location = FRAG_RESULT_DATA0 + num_rts;
+ }
 
  unsigned array_len =
 glsl_type_is_array(var->type) ? glsl_get_length(var->type) : 1;
-- 
2.14.2
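
The old-to-new location map used in this patch can be sketched outside the driver as follows. This is a simplified illustration, not the anv code: compact_locations() and MAX_LOCATIONS are hypothetical, and the sketch hands out one new slot per distinct location, whereas the patch derives the new location from num_rts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bound on locations; stands in for FRAG_RESULT_MAX. */
#define MAX_LOCATIONS 16

/* Rewrite locs[] in place so that outputs sharing an original location
 * (for example two variables that differ only in their component) end up
 * sharing one compacted location.  Returns the number of distinct slots.
 */
static int
compact_locations(int *locs, int count)
{
   int map_old_to_new[MAX_LOCATIONS];
   int next = 0;

   /* memset() writes 0xFF into every byte, which reads back as -1 for
    * each int on two's complement targets.
    */
   memset(map_old_to_new, -1, sizeof(map_old_to_new));

   for (int i = 0; i < count; i++) {
      assert(locs[i] >= 0 && locs[i] < MAX_LOCATIONS);

      /* First time we see this location: give it the next free slot. */
      if (map_old_to_new[locs[i]] == -1)
         map_old_to_new[locs[i]] = next++;

      locs[i] = map_old_to_new[locs[i]];
   }

   return next;
}
```

Given the locations {4, 4, 6}, the first two entries map to slot 0 and the third to slot 1, so only two slots are used.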


Re: [Mesa-dev] [PATCH v3 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-19 Thread Samuel Iglesias Gonsálvez

On Wednesday, October 18, 2017 8:11:01 AM CEST Jason Ekstrand wrote:
> On October 18, 2017 12:54:48 AM Samuel Iglesias Gonsálvez
> 
> <sigles...@igalia.com> wrote:
> > v2:
> > - Use helper to add a new source to the texture instruction.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> > 
> >  src/compiler/nir/nir_lower_tex.c | 23 +++
> >  1 file changed, 23 insertions(+)
> > 
> > diff --git a/src/compiler/nir/nir_lower_tex.c
> > b/src/compiler/nir/nir_lower_tex.c
> > index 65681decb1c..676c0c21e7a 100644
> > --- a/src/compiler/nir/nir_lower_tex.c
> > +++ b/src/compiler/nir/nir_lower_tex.c
> > @@ -813,6 +813,29 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
> > 
> >   progress = true;
> >   continue;
> >
> >}
> > 
> > +
> > +  /* TXF, TXS and TXL require a LOD but not everything we implement
> > using those
> > +   * three opcodes provides one.  Provide a default LOD of 0.
> > +   */
> > +  if (tex->op == nir_texop_txf || tex->op == nir_texop_txs ||
> > +  tex->op == nir_texop_txl || tex->op == nir_texop_query_levels
> > ||
> > +  (tex->op == nir_texop_tex && b->shader->stage !=
> > MESA_SHADER_FRAGMENT)) {
> > + int i;
> > + bool has_lod = false;
> > + for (i = 0; i < tex->num_srcs; i++) {
> > +if (tex->src[i].src_type == nir_tex_src_lod) {
> > +   has_lod = true;
> > +   break;
> > +}
> > + }
> 
> Sorry to ask you to delete even more of your patch but this is just
> nir_tex_instr_src_index(tex, nir_tex_src_lod).
> 

Thanks for the advice! That simplifies the patch a lot, which is always great :-)

Now it is like:

+      /* TXF, TXS and TXL require a LOD but not everything we implement using those
+       * three opcodes provides one.  Provide a default LOD of 0.
+       */
+      if ((nir_tex_instr_src_index(tex, nir_tex_src_lod) == -1) &&
+          (tex->op == nir_texop_txf || tex->op == nir_texop_txs ||
+           tex->op == nir_texop_txl || tex->op == nir_texop_query_levels ||
+           (tex->op == nir_texop_tex && b->shader->stage != MESA_SHADER_FRAGMENT))) {
+         b->cursor = nir_before_instr(&tex->instr);
+         nir_tex_instr_add_src(tex, nir_tex_src_lod, nir_src_for_ssa(nir_imm_int(b, 0)));
+         progress = true;
+         continue;
+      }


I have done this change locally. Does it get your R+1?

Sam

P.S: Patch 4 is still unreviewed.
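
To illustrate the "look up the source index, add a default only when it is missing" pattern outside of NIR, here is a small self-contained sketch. Everything in it is a mock made up for illustration: struct tex_instr, tex_src_index() and ensure_default_lod() only imitate the shape of the NIR helpers and do not use the real NIR API.

```c
#include <assert.h>

/* Mock source kinds; only the LOD one matters for this sketch. */
enum tex_src_type { TEX_SRC_COORD, TEX_SRC_LOD };

#define MAX_SRCS 4

/* Mock texture instruction with a fixed-capacity source list. */
struct tex_instr {
   enum tex_src_type src_type[MAX_SRCS];
   int src_value[MAX_SRCS];
   int num_srcs;
};

/* Return the index of the first source of the given type, or -1. */
static int
tex_src_index(const struct tex_instr *tex, enum tex_src_type type)
{
   for (int i = 0; i < tex->num_srcs; i++) {
      if (tex->src_type[i] == type)
         return i;
   }
   return -1;
}

/* Append a LOD source of 0 if the instruction has none.
 * Returns 1 when a source was added (progress), 0 otherwise.
 */
static int
ensure_default_lod(struct tex_instr *tex)
{
   if (tex_src_index(tex, TEX_SRC_LOD) != -1)
      return 0;

   assert(tex->num_srcs < MAX_SRCS);
   tex->src_type[tex->num_srcs] = TEX_SRC_LOD;
   tex->src_value[tex->num_srcs] = 0;
   tex->num_srcs++;
   return 1;
}
```

Calling ensure_default_lod() twice on the same instruction adds the source only once, which is the property the lowering pass relies on when it reports progress.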

> > +
> > + if (!has_lod) {
> > +b->cursor = nir_before_instr(&tex->instr);
> > +nir_tex_instr_add_src(tex, nir_tex_src_lod,
> > nir_src_for_ssa(nir_imm_int(b, 0)));
> > +progress = true;
> > +continue;
> > + }
> > +  }
> > 
> > }
> > 
> > return progress;
> > 
> > --
> > 2.14.2
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev




[Mesa-dev] [PATCH v3 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-18 Thread Samuel Iglesias Gonsálvez

v2:
- Use helper to add a new source to the texture instruction.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/nir/nir_lower_tex.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c
index 65681decb1c..676c0c21e7a 100644
--- a/src/compiler/nir/nir_lower_tex.c
+++ b/src/compiler/nir/nir_lower_tex.c
@@ -813,6 +813,29 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
  progress = true;
  continue;
   }
+
+  /* TXF, TXS and TXL require a LOD but not everything we implement using those
+   * three opcodes provides one.  Provide a default LOD of 0.
+   */
+  if (tex->op == nir_texop_txf || tex->op == nir_texop_txs ||
+  tex->op == nir_texop_txl || tex->op == nir_texop_query_levels ||
+  (tex->op == nir_texop_tex && b->shader->stage != MESA_SHADER_FRAGMENT)) {
+ int i;
+ bool has_lod = false;
+ for (i = 0; i < tex->num_srcs; i++) {
+if (tex->src[i].src_type == nir_tex_src_lod) {
+   has_lod = true;
+   break;
+}
+ }
+
+ if (!has_lod) {
+b->cursor = nir_before_instr(&tex->instr);
+nir_tex_instr_add_src(tex, nir_tex_src_lod, nir_src_for_ssa(nir_imm_int(b, 0)));
+progress = true;
+continue;
+ }
+  }
}
 
return progress;
-- 
2.14.2


[Mesa-dev] [PATCH v3 3/4] i965/vec4: remove setting default LOD in the backend

2017-10-18 Thread Samuel Iglesias Gonsálvez

It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
---
 src/intel/compiler/brw_vec4_nir.cpp |  9 -
 src/intel/compiler/brw_vec4_visitor.cpp | 12 
 2 files changed, 21 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_nir.cpp 
b/src/intel/compiler/brw_vec4_nir.cpp
index 9200ffa0ed7..0a1caa9fad8 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@ -2228,15 +2228,6 @@ vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
   }
}
 
-   /* TXS and TXL require a LOD but not everything we implement using those
-* two opcodes provides one.  Provide a default LOD of 0.
-*/
-   if ((instr->op == nir_texop_txs ||
-instr->op == nir_texop_txl) &&
-   lod.file == BAD_FILE) {
-  lod = brw_imm_ud(0u);
-   }
-
if (instr->op == nir_texop_txf_ms ||
instr->op == nir_texop_samples_identical) {
   assert(coord_type != NULL);
diff --git a/src/intel/compiler/brw_vec4_visitor.cpp 
b/src/intel/compiler/brw_vec4_visitor.cpp
index ae516196b15..6e7ee720310 100644
--- a/src/intel/compiler/brw_vec4_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_visitor.cpp
@@ -915,18 +915,6 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
src_reg surface_reg,
src_reg sampler_reg)
 {
-   /* The sampler can only meaningfully compute LOD for fragment shader
-* messages. For all other stages, we change the opcode to TXL and hardcode
-* the LOD to 0.
-*
-* textureQueryLevels() is implemented in terms of TXS so we need to pass a
-* valid LOD argument.
-*/
-   if (op == ir_tex || op == ir_query_levels) {
-  assert(lod.file == BAD_FILE);
-  lod = brw_imm_f(0.0f);
-   }
-
enum opcode opcode;
switch (op) {
case ir_tex: opcode = SHADER_OPCODE_TXL; break;
-- 
2.14.2


[Mesa-dev] [PATCH v3 2/4] i965/fs: remove setting default LOD in the backend

2017-10-18 Thread Samuel Iglesias Gonsálvez

It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
---
 src/intel/compiler/brw_fs_nir.cpp | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 7ed44f534c0..cc098849bed 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4536,15 +4536,6 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
   unreachable("unknown texture opcode");
}
 
-   /* TXS and TXL require a LOD but not everything we implement using those
-* two opcodes provides one.  Provide a default LOD of 0.
-*/
-   if ((opcode == SHADER_OPCODE_TXS_LOGICAL ||
-opcode == SHADER_OPCODE_TXL_LOGICAL) &&
-   srcs[TEX_LOGICAL_SRC_LOD].file == BAD_FILE) {
-  srcs[TEX_LOGICAL_SRC_LOD] = brw_imm_ud(0u);
-   }
-
if (instr->op == nir_texop_tg4) {
   if (instr->component == 1 &&
   key_tex->gather_channel_quirk_mask & (1 << texture)) {
-- 
2.14.2


[Mesa-dev] [PATCH v3 4/4] spirv: add support for images and samplers as function arguments

2017-10-18 Thread Samuel Iglesias Gonsálvez

Fixes:

dEQP-VK.spirv_assembly.instruction.*.image_sampler.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/spirv/vtn_cfg.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 25ff254bcec..8b139068fb0 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -111,6 +111,25 @@ vtn_cfg_handle_prepass_instruction(struct vtn_builder *b, 
SpvOp opcode,
  param->name = ralloc_strdup(param, val->name);
 
  val->pointer = vtn_pointer_for_variable(b, vtn_var, type);
+  } else if (type->base_type == vtn_base_type_image || type->base_type == 
vtn_base_type_sampler) {
+ struct vtn_variable *vtn_var = rzalloc(b, struct vtn_variable);
+ struct vtn_type *ptr_type = rzalloc(b, struct vtn_type);
+ ptr_type->deref = type;
+ ptr_type->base_type = vtn_base_type_pointer;
+
+ vtn_var->type = type;
+ vtn_var->var = param;
+ vtn_var->mode = (type->base_type == vtn_base_type_image) ?
+vtn_variable_mode_image : vtn_variable_mode_sampler;
+ param->interface_type = type->type;
+
+ struct vtn_value *val =
+vtn_push_value(b, w[2], vtn_value_type_pointer);
+
+ /* Name the parameter so it shows up nicely in NIR */
+ param->name = ralloc_strdup(param, val->name);
+
+ val->pointer = vtn_pointer_for_variable(b, vtn_var, ptr_type);
   } else {
  /* We're a regular SSA value. */
  struct vtn_ssa_value *param_ssa =
-- 
2.14.2


Re: [Mesa-dev] [PATCH v2 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-16 Thread Samuel Iglesias Gonsálvez

On Wed, 2017-10-11 at 09:12 +0100, Lionel Landwerlin wrote:
> On 11/10/17 09:00, Samuel Iglesias Gonsálvez wrote:
> > On Tuesday, October 10, 2017 4:40:47 PM CEST Lionel Landwerlin
> > wrote:
> > > On 10/10/17 14:35, Samuel Iglesias Gonsálvez wrote:
> > > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > > ---
> > > > 
> > > >   src/compiler/nir/nir_lower_tex.c | 68
> > > >   1 file changed, 68 insertions(+)
> > > > 
> > > > diff --git a/src/compiler/nir/nir_lower_tex.c
> > > > b/src/compiler/nir/nir_lower_tex.c index
> > > > 65681decb1c..d3380710405 100644
> > > > --- a/src/compiler/nir/nir_lower_tex.c
> > > > +++ b/src/compiler/nir/nir_lower_tex.c
> > > > @@ -717,6 +717,52 @@ linearize_srgb_result(nir_builder *b,
> > > > nir_tex_instr
> > > > *tex)> 
> > > > result->parent_instr);
> > > >   
> > > >   }
> > > > 
> > > > +static void
> > > > +set_default_lod(nir_builder *b, nir_tex_instr *tex)
> > > > +{
> > > > +   b->cursor = nir_before_instr(&tex->instr);
> > > > +
> > > > +   /* We are going to emit the same texture but adding a
> > > > default LOD.
> > > > +*/
> > > > +   int num_srcs = tex->num_srcs + 1;
> > > > +   nir_tex_instr *new = nir_tex_instr_create(b->shader,
> > > > num_srcs);
> > > > +
> > > > +   new->op = tex->op;
> > > > +   new->sampler_dim = tex->sampler_dim;
> > > > +   new->texture_index = tex->texture_index;
> > > > +   new->dest_type = tex->dest_type;
> > > > +   new->is_array = tex->is_array;
> > > > +   new->is_shadow = tex->is_shadow;
> > > > +   new->is_new_style_shadow = tex->is_new_style_shadow;
> > > > +   new->sampler_index = tex->sampler_index;
> > > > +   new->texture = nir_deref_var_clone(tex->texture, new);
> > > > +   new->sampler = nir_deref_var_clone(tex->sampler, new);
> > > > +   new->coord_components = tex->coord_components;
> > > 
> > > There are a couple of fields you're not copying : component &
> > > texture_array_size
> > > Not 100% sure whether they need to be.
> > > 
> > 
> > I added them locally.
> > 
> > > > +
> > > > +   nir_ssa_dest_init(&new->instr, &new->dest, 4, 32, NULL);
> > > > +
> > > > +   int src_num = 0;
> > > > +   for (int i = 0; i < tex->num_srcs; i++) {
> > > > +  nir_src_copy(&new->src[src_num].src, &tex->src[i].src,
> > > > new);
> > > > +  new->src[src_num].src_type = tex->src[i].src_type;
> > > > +  src_num++;
> > > > +   }
> > > > +
> > > > +   new->src[src_num].src = nir_src_for_ssa(nir_imm_int(b, 0));
> > > > +   new->src[src_num].src_type = nir_tex_src_lod;
> > > 
> > > I think you could get rid of the src_num variable and just use
> > > (new->num_srcs - 1) to set the default lod src.
> > > 
> > 
> > Done.
> > 
> > Does it get your R-b?
> > 
> > Thanks,
> > 
> > Sam
>  
> Thanks!
> Although I think Eric has a point about memcpy(), since I used
> roughly the same code in the ycbcr anv pass, I'll try to come up with
> a helper.
> Patches 1-3 are :
> 
> Reviewed-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
> 

I forgot to reply before. I will wait for your helper then.
Please Cc me so I am aware when you submit it for review :)

Thanks,

Sam

> > > > +   src_num++;
> > > > +
> > > > +   assert(src_num == num_srcs);
> > > > +
> > > > +   nir_ssa_dest_init(>instr, >dest,
> > > > + tex->dest.ssa.num_components, 32, NULL);
> > > > +   nir_builder_instr_insert(b, >instr);
> > > > +
> > > > +   nir_ssa_def_rewrite_uses(>dest.ssa,
> > > > nir_src_for_ssa(>dest.ssa)); +
> > > > +   nir_instr_remove(>instr);
> > > > +}
> > > > +
> > > > 
> > > >   static bool
> > > >   nir_lower_tex_block(nir_block *block, nir_builder *b,
> > > >   
> > > >

Re: [Mesa-dev] [PATCH v2 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-11 Thread Samuel Iglesias Gonsálvez

On Wednesday, October 11, 2017 10:12:16 AM CEST Samuel Iglesias Gonsálvez 
wrote:
> On Tuesday, October 10, 2017 11:53:27 AM CEST Eric Anholt wrote:
> > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > ---
> > > 
> > >  src/compiler/nir/nir_lower_tex.c | 68
> > >  1 file changed, 68 insertions(+)
> > > 
> > > diff --git a/src/compiler/nir/nir_lower_tex.c
> > > b/src/compiler/nir/nir_lower_tex.c index 65681decb1c..d3380710405 100644
> > > --- a/src/compiler/nir/nir_lower_tex.c
> > > +++ b/src/compiler/nir/nir_lower_tex.c
> > > @@ -717,6 +717,52 @@ linearize_srgb_result(nir_builder *b, nir_tex_instr
> > > *tex)>
> > > 
> > >result->parent_instr);
> > >  
> > >  }
> > > 
> > > +static void
> > > +set_default_lod(nir_builder *b, nir_tex_instr *tex)
> > > +{
> > > +   b->cursor = nir_before_instr(&tex->instr);
> > > +
> > > +   /* We are going to emit the same texture but adding a default LOD.
> > > +*/
> > > +   int num_srcs = tex->num_srcs + 1;
> > > +   nir_tex_instr *new = nir_tex_instr_create(b->shader, num_srcs);
> > > +
> > > +   new->op = tex->op;
> > > +   new->sampler_dim = tex->sampler_dim;
> > > +   new->texture_index = tex->texture_index;
> > > +   new->dest_type = tex->dest_type;
> > > +   new->is_array = tex->is_array;
> > > +   new->is_shadow = tex->is_shadow;
> > > +   new->is_new_style_shadow = tex->is_new_style_shadow;
> > > +   new->sampler_index = tex->sampler_index;
> > > +   new->texture = nir_deref_var_clone(tex->texture, new);
> > > +   new->sampler = nir_deref_var_clone(tex->sampler, new);
> > > +   new->coord_components = tex->coord_components;
> > 
> > Couldn't we just make a new srcs array of num_srcs+1, memcpy the old
> > srcs/types over, add our new use of the immediate 0 lod to it by
> > manipulating the immediate's uses list
> 
> This is an interesting approach.  I have done the following:
> 
>b->cursor = nir_before_instr(&tex->instr);
> 
>nir_tex_src *srcs = ralloc_array(tex, nir_tex_src, tex->num_srcs + 1);
>memcpy(srcs, tex->src, sizeof(nir_tex_src) * tex->num_srcs);
> 
>srcs[tex->num_srcs + 1].src = nir_src_for_ssa(nir_imm_int(b, 0));
>srcs[tex->num_srcs + 1].src_type = nir_tex_src_lod;

Without the "+ 1" here, of course.  I think I need a cup of coffee now :-/

Sam

>tex->num_srcs++;
>ralloc_free(tex->src);
>tex->src = srcs;
> 
> However, it crashes when validating ssa def. I also tried calling
> nir_ssa_def_rewrite_uses() but I am not sure how to do it for this case,
> probably I am missing something.
> 
> How can I manipulate the immediate's uses list for this case? Can you
> provide an example?
> 
> Sam
> 
> > , and free the old list?  It seems
> > less fragile to me than needing to update this path if we add a new
> > texture flag.
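
As a side note on the indexing slip corrected above, here is a minimal sketch of growing an array by one element with memcpy(). It is generic C under stated assumptions: struct src_list and append_item() are hypothetical, it uses malloc()/free() rather than ralloc, and it does not touch the SSA use-list bookkeeping that the validation crash points at.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of integer "sources". */
struct src_list {
   int *items;
   size_t count;
};

/* Append one value by allocating count + 1 slots, copying the old
 * contents and freeing the old array.  Returns 0 on success, -1 on
 * allocation failure (the list is left unchanged in that case).
 */
static int
append_item(struct src_list *list, int value)
{
   assert(list != NULL);

   int *grown = malloc(sizeof(*grown) * (list->count + 1));
   if (grown == NULL)
      return -1;

   if (list->count > 0)
      memcpy(grown, list->items, sizeof(*grown) * list->count);

   /* The new array has count + 1 slots, so its last valid index is the
    * old count, not count + 1.
    */
   grown[list->count] = value;

   free(list->items);
   list->items = grown;
   list->count++;
   return 0;
}
```

The key point is the index of the new element: writing to `count + 1` would land one slot past the end of the freshly allocated array.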




Re: [Mesa-dev] [PATCH v2 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-11 Thread Samuel Iglesias Gonsálvez

On Tuesday, October 10, 2017 11:53:27 AM CEST Eric Anholt wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> > 
> >  src/compiler/nir/nir_lower_tex.c | 68
> >  1 file changed, 68 insertions(+)
> > 
> > diff --git a/src/compiler/nir/nir_lower_tex.c
> > b/src/compiler/nir/nir_lower_tex.c index 65681decb1c..d3380710405 100644
> > --- a/src/compiler/nir/nir_lower_tex.c
> > +++ b/src/compiler/nir/nir_lower_tex.c
> > @@ -717,6 +717,52 @@ linearize_srgb_result(nir_builder *b, nir_tex_instr
> > *tex)> 
> >result->parent_instr);
> >  
> >  }
> > 
> > +static void
> > +set_default_lod(nir_builder *b, nir_tex_instr *tex)
> > +{
> > +   b->cursor = nir_before_instr(&tex->instr);
> > +
> > +   /* We are going to emit the same texture but adding a default LOD.
> > +*/
> > +   int num_srcs = tex->num_srcs + 1;
> > +   nir_tex_instr *new = nir_tex_instr_create(b->shader, num_srcs);
> > +
> > +   new->op = tex->op;
> > +   new->sampler_dim = tex->sampler_dim;
> > +   new->texture_index = tex->texture_index;
> > +   new->dest_type = tex->dest_type;
> > +   new->is_array = tex->is_array;
> > +   new->is_shadow = tex->is_shadow;
> > +   new->is_new_style_shadow = tex->is_new_style_shadow;
> > +   new->sampler_index = tex->sampler_index;
> > +   new->texture = nir_deref_var_clone(tex->texture, new);
> > +   new->sampler = nir_deref_var_clone(tex->sampler, new);
> > +   new->coord_components = tex->coord_components;
> 
> Couldn't we just make a new srcs array of num_srcs+1, memcpy the old
> srcs/types over, add our new use of the immediate 0 lod to it by
> manipulating the immediate's uses list

This is an interesting approach.  I have done the following:

   b->cursor = nir_before_instr(&tex->instr);

   nir_tex_src *srcs = ralloc_array(tex, nir_tex_src, tex->num_srcs + 1);
   memcpy(srcs, tex->src, sizeof(nir_tex_src) * tex->num_srcs);

   srcs[tex->num_srcs + 1].src = nir_src_for_ssa(nir_imm_int(b, 0));
   srcs[tex->num_srcs + 1].src_type = nir_tex_src_lod;
   tex->num_srcs++;
   ralloc_free(tex->src);
   tex->src = srcs;

However, it crashes during SSA def validation. I also tried calling
nir_ssa_def_rewrite_uses(), but I am not sure how to use it in this case;
I am probably missing something.

How can I manipulate the immediate's uses list in this case? Can you provide
an example?

Sam

> , and free the old list?  It seems
> less fragile to me than needing to update this path if we add a new
> texture flag.




Re: [Mesa-dev] [PATCH v2 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-11 Thread Samuel Iglesias Gonsálvez

On Tuesday, October 10, 2017 4:40:47 PM CEST Lionel Landwerlin wrote:
> On 10/10/17 14:35, Samuel Iglesias Gonsálvez wrote:
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> > 
> >   src/compiler/nir/nir_lower_tex.c | 68
> >   1 file changed, 68 insertions(+)
> > 
> > diff --git a/src/compiler/nir/nir_lower_tex.c
> > b/src/compiler/nir/nir_lower_tex.c index 65681decb1c..d3380710405 100644
> > --- a/src/compiler/nir/nir_lower_tex.c
> > +++ b/src/compiler/nir/nir_lower_tex.c
> > @@ -717,6 +717,52 @@ linearize_srgb_result(nir_builder *b, nir_tex_instr
> > *tex)> 
> > result->parent_instr);
> >   
> >   }
> > 
> > +static void
> > +set_default_lod(nir_builder *b, nir_tex_instr *tex)
> > +{
> > +   b->cursor = nir_before_instr(&tex->instr);
> > +
> > +   /* We are going to emit the same texture but adding a default LOD.
> > +*/
> > +   int num_srcs = tex->num_srcs + 1;
> > +   nir_tex_instr *new = nir_tex_instr_create(b->shader, num_srcs);
> > +
> > +   new->op = tex->op;
> > +   new->sampler_dim = tex->sampler_dim;
> > +   new->texture_index = tex->texture_index;
> > +   new->dest_type = tex->dest_type;
> > +   new->is_array = tex->is_array;
> > +   new->is_shadow = tex->is_shadow;
> > +   new->is_new_style_shadow = tex->is_new_style_shadow;
> > +   new->sampler_index = tex->sampler_index;
> > +   new->texture = nir_deref_var_clone(tex->texture, new);
> > +   new->sampler = nir_deref_var_clone(tex->sampler, new);
> > +   new->coord_components = tex->coord_components;
> 
> There are a couple of fields you're not copying : component &
> texture_array_size
> Not 100% sure whether they need to be.
> 

I added them locally.

> > +
> > +   nir_ssa_dest_init(&new->instr, &new->dest, 4, 32, NULL);
> > +
> > +   int src_num = 0;
> > +   for (int i = 0; i < tex->num_srcs; i++) {
> > +  nir_src_copy(&new->src[src_num].src, &tex->src[i].src, new);
> > +  new->src[src_num].src_type = tex->src[i].src_type;
> > +  src_num++;
> > +   }
> > +
> > +   new->src[src_num].src = nir_src_for_ssa(nir_imm_int(b, 0));
> > +   new->src[src_num].src_type = nir_tex_src_lod;
> 
> I think you could get rid of the src_num variable and just use
> (new->num_srcs - 1) to set the default lod src.
> 

Done.

Does it get your R-b?

Thanks,

Sam

> > +   src_num++;
> > +
> > +   assert(src_num == num_srcs);
> > +
> > +   nir_ssa_dest_init(&new->instr, &new->dest,
> > + tex->dest.ssa.num_components, 32, NULL);
> > +   nir_builder_instr_insert(b, &new->instr);
> > +
> > +   nir_ssa_def_rewrite_uses(&tex->dest.ssa,
> > nir_src_for_ssa(&new->dest.ssa));
> > +
> > +   nir_instr_remove(&tex->instr);
> > +}
> > +
> > 
> >   static bool
> >   nir_lower_tex_block(nir_block *block, nir_builder *b,
> >   
> >   const nir_lower_tex_options *options)
> > 
> > @@ -813,6 +859,28 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
> > 
> >progress = true;
> >continue;
> > 
> > }
> > 
> > +
> > +  /* TXF, TXS and TXL require a LOD but not everything we implement
> > using those
> > +   * three opcodes provides one.  Provide a default LOD of 0.
> > +   */
> > +  if (tex->op == nir_texop_txf || tex->op == nir_texop_txs ||
> > +  tex->op == nir_texop_txl || tex->op == nir_texop_query_levels
> > ||
> > +  (tex->op == nir_texop_tex && b->shader->stage !=
> > MESA_SHADER_FRAGMENT)) {
> > + int i;
> > + bool has_lod = false;
> > + for (i = 0; i < tex->num_srcs; i++) {
> > +if (tex->src[i].src_type == nir_tex_src_lod) {
> > +   has_lod = true;
> > +   break;
> > +}
> > + }
> > +
> > + if (!has_lod) {
> > +set_default_lod(b, tex);
> > +progress = true;
> > +continue;
> > + }
> > +  }
> > 
> >  }
> >  
> >  return progress;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
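
For illustration, the review suggestion above (drop the src_num counter and
write the default LOD at index num_srcs - 1) can be sketched outside of Mesa.
Everything named toy_* below is a hypothetical stand-in, not the NIR API; the
sketch only assumes that the new instruction is created with one more source
slot than the old one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for nir_tex_src_type / nir_tex_instr. */
enum toy_src_type { TOY_SRC_COORD, TOY_SRC_LOD, TOY_SRC_BIAS };

struct toy_tex_src {
   enum toy_src_type src_type;
   int value;
};

#define TOY_MAX_SRCS 8

struct toy_tex {
   unsigned num_srcs;
   struct toy_tex_src src[TOY_MAX_SRCS];
};

/* Copy every source of old_tex into new_tex and append a LOD source of 0. */
void
toy_clone_with_default_lod(const struct toy_tex *old_tex,
                           struct toy_tex *new_tex)
{
   assert(old_tex != NULL && new_tex != NULL);
   assert(old_tex->num_srcs + 1 <= TOY_MAX_SRCS);

   new_tex->num_srcs = old_tex->num_srcs + 1;

   for (unsigned i = 0; i < old_tex->num_srcs; i++)
      new_tex->src[i] = old_tex->src[i];

   /* The default LOD always lands in the last slot, so no running
    * counter (src_num in the patch) is needed.
    */
   new_tex->src[new_tex->num_srcs - 1].src_type = TOY_SRC_LOD;
   new_tex->src[new_tex->num_srcs - 1].value = 0;
}
```

Since the loop index and the destination index are always equal, the separate
counter and the trailing assert in the original patch become redundant.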

[Mesa-dev] [PATCH v2 1/4] nir: set default lod to texture opcodes that needed it but don't provide it

2017-10-10 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/nir/nir_lower_tex.c | 68 
 1 file changed, 68 insertions(+)

diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c
index 65681decb1c..d3380710405 100644
--- a/src/compiler/nir/nir_lower_tex.c
+++ b/src/compiler/nir/nir_lower_tex.c
@@ -717,6 +717,52 @@ linearize_srgb_result(nir_builder *b, nir_tex_instr *tex)
   result->parent_instr);
 }
 
+static void
+set_default_lod(nir_builder *b, nir_tex_instr *tex)
+{
+   b->cursor = nir_before_instr(&tex->instr);
+
+   /* We are going to emit the same texture but adding a default LOD.
+*/
+   int num_srcs = tex->num_srcs + 1;
+   nir_tex_instr *new = nir_tex_instr_create(b->shader, num_srcs);
+
+   new->op = tex->op;
+   new->sampler_dim = tex->sampler_dim;
+   new->texture_index = tex->texture_index;
+   new->dest_type = tex->dest_type;
+   new->is_array = tex->is_array;
+   new->is_shadow = tex->is_shadow;
+   new->is_new_style_shadow = tex->is_new_style_shadow;
+   new->sampler_index = tex->sampler_index;
+   new->texture = nir_deref_var_clone(tex->texture, new);
+   new->sampler = nir_deref_var_clone(tex->sampler, new);
+   new->coord_components = tex->coord_components;
+
+   nir_ssa_dest_init(&new->instr, &new->dest, 4, 32, NULL);
+
+   int src_num = 0;
+   for (int i = 0; i < tex->num_srcs; i++) {
+  nir_src_copy(&new->src[src_num].src, &tex->src[i].src, new);
+  new->src[src_num].src_type = tex->src[i].src_type;
+  src_num++;
+   }
+
+   new->src[src_num].src = nir_src_for_ssa(nir_imm_int(b, 0));
+   new->src[src_num].src_type = nir_tex_src_lod;
+   src_num++;
+
+   assert(src_num == num_srcs);
+
+   nir_ssa_dest_init(&new->instr, &new->dest,
+ tex->dest.ssa.num_components, 32, NULL);
+   nir_builder_instr_insert(b, &new->instr);
+
+   nir_ssa_def_rewrite_uses(&tex->dest.ssa, nir_src_for_ssa(&new->dest.ssa));
+
+   nir_instr_remove(&tex->instr);
+}
+
 static bool
 nir_lower_tex_block(nir_block *block, nir_builder *b,
 const nir_lower_tex_options *options)
@@ -813,6 +859,28 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
  progress = true;
  continue;
   }
+
+  /* TXF, TXS and TXL require a LOD but not everything we implement using those
+   * three opcodes provides one.  Provide a default LOD of 0.
+   */
+  if (tex->op == nir_texop_txf || tex->op == nir_texop_txs ||
+  tex->op == nir_texop_txl || tex->op == nir_texop_query_levels ||
+  (tex->op == nir_texop_tex && b->shader->stage != MESA_SHADER_FRAGMENT)) {
+ int i;
+ bool has_lod = false;
+ for (i = 0; i < tex->num_srcs; i++) {
+if (tex->src[i].src_type == nir_tex_src_lod) {
+   has_lod = true;
+   break;
+}
+ }
+
+ if (!has_lod) {
+set_default_lod(b, tex);
+progress = true;
+continue;
+ }
+  }
}
 
return progress;
-- 
2.14.2
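
As a reading aid, the condition added to nir_lower_tex_block() in this patch
can be modelled as a small predicate: an opcode gets a default LOD only if it
requires one and none of its sources is already a LOD. The lod_* enums and
functions below are hypothetical simplifications for illustration, not the NIR
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of nir_texop, the shader stage enum and
 * nir_tex_src_type.
 */
enum lod_texop {
   LOD_OP_TEX, LOD_OP_TXB, LOD_OP_TXL,
   LOD_OP_TXF, LOD_OP_TXS, LOD_OP_QUERY_LEVELS
};
enum lod_stage { LOD_STAGE_VERTEX, LOD_STAGE_FRAGMENT };
enum lod_src_type { LOD_SRC_COORD, LOD_SRC_LOD };

/* Mirrors the opcode/stage test from the patch. */
bool
lod_op_requires_lod(enum lod_texop op, enum lod_stage stage)
{
   switch (op) {
   case LOD_OP_TXF:
   case LOD_OP_TXS:
   case LOD_OP_TXL:
   case LOD_OP_QUERY_LEVELS:
      return true;
   case LOD_OP_TEX:
      /* Only fragment shaders can compute an implicit LOD. */
      return stage != LOD_STAGE_FRAGMENT;
   default:
      return false;
   }
}

/* Mirrors the has_lod loop over the instruction sources. */
bool
lod_has_lod_src(const enum lod_src_type *srcs, unsigned num_srcs)
{
   assert(srcs != NULL || num_srcs == 0);

   for (unsigned i = 0; i < num_srcs; i++) {
      if (srcs[i] == LOD_SRC_LOD)
         return true;
   }
   return false;
}

bool
lod_needs_default(enum lod_texop op, enum lod_stage stage,
                  const enum lod_src_type *srcs, unsigned num_srcs)
{
   return lod_op_requires_lod(op, stage) && !lod_has_lod_src(srcs, num_srcs);
}
```

Keeping the opcode test separate from the source scan makes it easy to see
that a plain tex in a fragment shader is left untouched.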


[Mesa-dev] [PATCH v2 2/4] i965/fs: remove setting default LOD in the backend

2017-10-10 Thread Samuel Iglesias Gonsálvez

It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_fs_nir.cpp | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 5b8ccd50bff..5c2f04ea268 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4518,15 +4518,6 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
   unreachable("unknown texture opcode");
}
 
-   /* TXS and TXL require a LOD but not everything we implement using those
-* two opcodes provides one.  Provide a default LOD of 0.
-*/
-   if ((opcode == SHADER_OPCODE_TXS_LOGICAL ||
-opcode == SHADER_OPCODE_TXL_LOGICAL) &&
-   srcs[TEX_LOGICAL_SRC_LOD].file == BAD_FILE) {
-  srcs[TEX_LOGICAL_SRC_LOD] = brw_imm_ud(0u);
-   }
-
if (instr->op == nir_texop_tg4) {
   if (instr->component == 1 &&
   key_tex->gather_channel_quirk_mask & (1 << texture)) {
-- 
2.14.2


[Mesa-dev] [PATCH v2 3/4] i965/vec4: remove setting default LOD in the backend

2017-10-10 Thread Samuel Iglesias Gonsálvez

It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4_nir.cpp |  9 -
 src/intel/compiler/brw_vec4_visitor.cpp | 12 
 2 files changed, 21 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_nir.cpp 
b/src/intel/compiler/brw_vec4_nir.cpp
index 9200ffa0ed7..0a1caa9fad8 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@ -2228,15 +2228,6 @@ vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
   }
}
 
-   /* TXS and TXL require a LOD but not everything we implement using those
-* two opcodes provides one.  Provide a default LOD of 0.
-*/
-   if ((instr->op == nir_texop_txs ||
-instr->op == nir_texop_txl) &&
-   lod.file == BAD_FILE) {
-  lod = brw_imm_ud(0u);
-   }
-
if (instr->op == nir_texop_txf_ms ||
instr->op == nir_texop_samples_identical) {
   assert(coord_type != NULL);
diff --git a/src/intel/compiler/brw_vec4_visitor.cpp 
b/src/intel/compiler/brw_vec4_visitor.cpp
index 88e80aaa3af..52d1dfc8fdd 100644
--- a/src/intel/compiler/brw_vec4_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_visitor.cpp
@@ -915,18 +915,6 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
src_reg surface_reg,
src_reg sampler_reg)
 {
-   /* The sampler can only meaningfully compute LOD for fragment shader
-* messages. For all other stages, we change the opcode to TXL and hardcode
-* the LOD to 0.
-*
-* textureQueryLevels() is implemented in terms of TXS so we need to pass a
-* valid LOD argument.
-*/
-   if (op == ir_tex || op == ir_query_levels) {
-  assert(lod.file == BAD_FILE);
-  lod = brw_imm_f(0.0f);
-   }
-
enum opcode opcode;
switch (op) {
case ir_tex: opcode = SHADER_OPCODE_TXL; break;
-- 
2.14.2


[Mesa-dev] [PATCH v2 4/4] spirv: add support for images and samplers as function arguments

2017-10-10 Thread Samuel Iglesias Gonsálvez

Fixes:

dEQP-VK.spirv_assembly.instruction.*.image_sampler.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/spirv/vtn_cfg.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 25ff254bcec..8b139068fb0 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -111,6 +111,25 @@ vtn_cfg_handle_prepass_instruction(struct vtn_builder *b, 
SpvOp opcode,
  param->name = ralloc_strdup(param, val->name);
 
  val->pointer = vtn_pointer_for_variable(b, vtn_var, type);
+  } else if (type->base_type == vtn_base_type_image || type->base_type == 
vtn_base_type_sampler) {
+ struct vtn_variable *vtn_var = rzalloc(b, struct vtn_variable);
+ struct vtn_type *ptr_type = rzalloc(b, struct vtn_type);
+ ptr_type->deref = type;
+ ptr_type->base_type = vtn_base_type_pointer;
+
+ vtn_var->type = type;
+ vtn_var->var = param;
+ vtn_var->mode = (type->base_type == vtn_base_type_image) ?
+vtn_variable_mode_image : vtn_variable_mode_sampler;
+ param->interface_type = type->type;
+
+ struct vtn_value *val =
+vtn_push_value(b, w[2], vtn_value_type_pointer);
+
+ /* Name the parameter so it shows up nicely in NIR */
+ param->name = ralloc_strdup(param, val->name);
+
+ val->pointer = vtn_pointer_for_variable(b, vtn_var, ptr_type);
   } else {
  /* We're a regular SSA value. */
  struct vtn_ssa_value *param_ssa =
-- 
2.14.2
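
The core of this patch is that an image or sampler parameter is wrapped in a
freshly built pointer type (ptr_type->deref = type) so that later code can
treat it like any other pointer value. A standalone sketch of that wrapping
step follows; the ptr_* types and function are hypothetical stand-ins and do
not reproduce the vtn_* structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced versions of vtn_base_type / vtn_variable_mode. */
enum ptr_base_type { PTR_BASE_IMAGE, PTR_BASE_SAMPLER, PTR_BASE_POINTER };
enum ptr_mode { PTR_MODE_IMAGE, PTR_MODE_SAMPLER };

struct ptr_type {
   enum ptr_base_type base_type;
   const struct ptr_type *deref;   /* pointee type, NULL for non-pointers */
};

struct ptr_param {
   struct ptr_type ptr_type;       /* pointer type wrapping the parameter type */
   enum ptr_mode mode;
};

/* Wrap an image or sampler type in a pointer type and pick the matching mode. */
struct ptr_param
ptr_wrap_param(const struct ptr_type *type)
{
   assert(type != NULL);
   assert(type->base_type == PTR_BASE_IMAGE ||
          type->base_type == PTR_BASE_SAMPLER);

   struct ptr_param param;
   param.ptr_type.base_type = PTR_BASE_POINTER;
   param.ptr_type.deref = type;
   param.mode = (type->base_type == PTR_BASE_IMAGE) ?
      PTR_MODE_IMAGE : PTR_MODE_SAMPLER;
   return param;
}
```

With the parameter represented as a pointer, consumers need no separate
"variable" code path, which is why v2 no longer touches the texture and image
opcode handlers.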


Re: [Mesa-dev] [PATCH 1/4] i965/fs: some TXF don't provide LOD

2017-10-06 Thread Samuel Iglesias Gonsálvez

On Fri, 2017-10-06 at 08:28 -0700, Jason Ekstrand wrote:
> On Fri, Oct 6, 2017 at 6:36 AM, Samuel Iglesias Gonsálvez <siglesias@
> igalia.com> wrote:
> > 
> >   
> >   On Fri, 2017-10-06 at 14:23 +0100, Lionel Landwerlin wrote:
> > > I fixed a similar bug in the vec4
> > >   backend a couple of days ago.
> > > 
> > >   Can we maybe put this logic somewhere that could reused
> > > across
> > >   backends?
> > > 
> > >   Or maybe a nir pass to add the missing parameters?
> > > 
> > >   
> > > 
> > >   Thanks,
> > > 
> > >   
> > > 
> > >   -
> > > 
> > >   Lionel
> > > 
> > >   
> > 
> > Right. I think it should be reused across the backends, since it is the
> > sampling instruction that needs it; but I'm OK with either option.
> 
> I don't care much where it goes.  I guess we could always add it to
> nir_lower_tex with a default_lod_0 option.  Or we can just fix it up
> in both back-ends.  Or we could fix it up here.  The nir_lower_tex
> option may be the most reliable.
>  

OK, as I need to send a v2 for this patch series, I will write a patch
with the nir_lower_tex solution.

Sam
> > Sam
> > 
> > >   On 06/10/17 14:07, Jason Ekstrand wrote:
> > > 
> > > 
> > > >   
> > > >   
> > > > Is there a test case for this?
> > > > 
> > > >   
> > > > 
> > > > 
> > > > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
> > > > 
> > > >   
> > > >   
> > > > 
> > > > On Fri, Oct 6, 2017 at 2:36 AM, Samuel
> > > >   Iglesias Gonsálvez <sigles...@igalia.com>
> > > >   wrote:
> > > > 
> > > >   
> > > > > SpvOpImageFetch
> > > > > doesn't provide it, so set it to zero.
> > > > > 
> > > > > 
> > > > > 
> > > > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > > > 
> > > > > ---
> > > > > 
> > > > >  src/intel/compiler/brw_fs_nir.cpp | 9 +
> > > > > 
> > > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > > > 
> > > > > 
> > > > > 
> > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > > > b/src/intel/compiler/brw_fs_nir.cpp
> > > > > 
> > > > > index 5b8ccd50bff..25488303c29 100644
> > > > > 
> > > > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > > > 
> > > > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > > > 
> > > > > @@ -4518,11 +4518,12 @@
> > > > > fs_visitor::nir_emit_texture(const
> > > > > fs_builder &bld, nir_tex_instr *instr)
> > > > > 
> > > > >unreachable("unknown texture opcode");
> > > > > 
> > > > > }
> > > > > 
> > > > > 
> > > > > 
> > > > > -   /* TXS and TXL require a LOD but not
> > > > > everything we
> > > > > implement using those
> > > > > 
> > > > > -* two opcodes provides one.  Provide a
> > > > > default LOD of
> > > > > 0.
> > > > > 
> > > > > +   /* TXF, TXS and TXL require a LOD but not
> > > > > everything we
> > > > > implement using those
> > > > > 
> > > > > +* three opcodes provides one.  Provide a
> > > > > default LOD of
> > > > > 0.
> > > > > 
> > > > >  */
> > > > > 
> > > > > -   if ((opcode == SHADER_OPCODE_TXS_LOGICAL ||
> > > > > 
> > > > > -opcode == SHADER_OPCODE_TXL_LOGICAL) &&
> > > > > 
> > > > > +   if ((opcode == SHADER_OPCODE_

Re: [Mesa-dev] [PATCH 1/4] i965/fs: some TXF don't provide LOD

2017-10-06 Thread Samuel Iglesias Gonsálvez

On Fri, 2017-10-06 at 14:23 +0100, Lionel Landwerlin wrote:
> I fixed a similar bug in the vec4
>   backend a couple of days ago.
> 
>   Can we maybe put this logic somewhere that could reused across
>   backends?
> 
>   Or maybe a nir pass to add the missing parameters?
> 
>   
> 
>   Thanks,
> 
>   
> 
>   -
> 
>   Lionel
> 
>   

Right. I think it should be reused across the backends, since it is the
sampling instruction that needs it; but I'm OK with either option.

Sam
>   On 06/10/17 14:07, Jason Ekstrand wrote:
> 
> 
> >   
> >   
> > Is there a test case for this?
> > 
> >   
> > 
> > 
> > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
> > 
> >   
> >   
> > 
> > On Fri, Oct 6, 2017 at 2:36 AM, Samuel
> >   Iglesias Gonsálvez <sigles...@igalia.com>
> >   wrote:
> > 
> >   
> > > SpvOpImageFetch
> > > doesn't provide it, so set it to zero.
> > > 
> > > 
> > > 
> > > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@i
> > > galia.com>
> > > 
> > > ---
> > > 
> > >  src/intel/compiler/brw_fs_nir.cpp | 9 +
> > > 
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > 
> > > 
> > > 
> > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > b/src/intel/compiler/brw_fs_nir.cpp
> > > 
> > > index 5b8ccd50bff..25488303c29 100644
> > > 
> > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > 
> > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > 
> > > @@ -4518,11 +4518,12 @@
> > > fs_visitor::nir_emit_texture(const
> > > fs_builder &bld, nir_tex_instr *instr)
> > > 
> > >unreachable("unknown texture opcode");
> > > 
> > > }
> > > 
> > > 
> > > 
> > > -   /* TXS and TXL require a LOD but not everything
> > > we
> > > implement using those
> > > 
> > > -* two opcodes provides one.  Provide a default
> > > LOD of
> > > 0.
> > > 
> > > +   /* TXF, TXS and TXL require a LOD but not
> > > everything we
> > > implement using those
> > > 
> > > +* three opcodes provides one.  Provide a default
> > > LOD of
> > > 0.
> > > 
> > >  */
> > > 
> > > -   if ((opcode == SHADER_OPCODE_TXS_LOGICAL ||
> > > 
> > > -opcode == SHADER_OPCODE_TXL_LOGICAL) &&
> > > 
> > > +   if ((opcode == SHADER_OPCODE_TXF_LOGICAL ||
> > > 
> > > + opcode == SHADER_OPCODE_TXS_LOGICAL ||
> > > 
> > > + opcode == SHADER_OPCODE_TXL_LOGICAL) &&
> > > 
> > > srcs[TEX_LOGICAL_SRC_LOD].file == BAD_FILE) {
> > > 
> > >srcs[TEX_LOGICAL_SRC_LOD] = brw_imm_ud(0u);
> > > 
> > > }
> > > 
> > > --
> > > 
> > > 2.13.6
> > > 
> > > 
> > > 

Re: [Mesa-dev] [PATCH 2/4] spirv: Add support for function arguments of type image and sampler

2017-10-06 Thread Samuel Iglesias Gonsálvez

On Fri, 2017-10-06 at 06:22 -0700, Jason Ekstrand wrote:
> On Fri, Oct 6, 2017 at 2:36 AM, Samuel Iglesias Gonsálvez <siglesias@
> igalia.com> wrote:
> > These arguments are actually variables, not pointers. This is
> > allowed
> > by SPIR-V spec but the support was missing.
> 
> In SPIR-V, even OpVariable returns a pointer.  I think you could
> probably save yourself a lot of trouble if you used
> vtn_pointer_for_variable.  So far as I can tell, that should make
> most of patches 3 and 4 unneeded.
> 

Yeah, I thought the same. I ran into some issues, but I can do another
spin of it; I probably missed something when I worked on it.

> That said, I don't really get what's going on here.  Could you
> provide a test case or even some example SPIR-V?
> 

Tests provided by CL#1627.

Thanks,

Sam


> --Jason
>  
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> >  src/compiler/spirv/vtn_cfg.c | 13 +
> >  src/compiler/spirv/vtn_private.h |  5 +
> >  2 files changed, 18 insertions(+)
> > 
> > diff --git a/src/compiler/spirv/vtn_cfg.c
> > b/src/compiler/spirv/vtn_cfg.c
> > index 25ff254bcec..15d8bb426a1 100644
> > --- a/src/compiler/spirv/vtn_cfg.c
> > +++ b/src/compiler/spirv/vtn_cfg.c
> > @@ -111,6 +111,19 @@ vtn_cfg_handle_prepass_instruction(struct
> > vtn_builder *b, SpvOp opcode,
> >   param->name = ralloc_strdup(param, val->name);
> > 
> >   val->pointer = vtn_pointer_for_variable(b, vtn_var,
> > type);
> > +  } else if (type->base_type == vtn_base_type_image || type-
> > >base_type == vtn_base_type_sampler) {
> > + struct vtn_variable *vtn_var = rzalloc(b, struct
> > vtn_variable);
> > + vtn_var->type = type;
> > + vtn_var->var = param;
> > + vtn_var->mode = (type->base_type == vtn_base_type_image)
> > ?
> > +vtn_variable_mode_image : vtn_variable_mode_sampler;
> > + struct vtn_value *val =
> > + vtn_push_value(b, w[2],
> > +(type->base_type == vtn_base_type_image) ?
> > +vtn_value_type_image_variable :
> > vtn_value_type_sampler_variable);
> > + val->var = vtn_var;
> > + /* Name the parameter so it shows up nicely in NIR */
> > + param->name = ralloc_strdup(param, val->name);
> >} else {
> >   /* We're a regular SSA value. */
> >   struct vtn_ssa_value *param_ssa =
> > diff --git a/src/compiler/spirv/vtn_private.h
> > b/src/compiler/spirv/vtn_private.h
> > index 84584620fc1..f194a7ed32a 100644
> > --- a/src/compiler/spirv/vtn_private.h
> > +++ b/src/compiler/spirv/vtn_private.h
> > @@ -51,6 +51,8 @@ enum vtn_value_type {
> > vtn_value_type_extension,
> > vtn_value_type_image_pointer,
> > vtn_value_type_sampled_image,
> > +   vtn_value_type_image_variable,
> > +   vtn_value_type_sampler_variable,
> >  };
> > 
> >  enum vtn_branch_type {
> > @@ -413,6 +415,8 @@ struct vtn_image_pointer {
> >  struct vtn_sampled_image {
> > struct vtn_pointer *image; /* Image or array of images */
> > struct vtn_pointer *sampler; /* Sampler */
> > +   struct vtn_variable *var_image;  /* Image or array of images
> > variable */
> > +   struct vtn_variable *var_sampler; /* Sampler variable */
> >  };
> > 
> >  struct vtn_value {
> > @@ -427,6 +431,7 @@ struct vtn_value {
> >   nir_constant *constant;
> >   const struct glsl_type *const_type;
> >};
> > +  struct vtn_variable *var;
> >struct vtn_pointer *pointer;
> >struct vtn_image_pointer *image;
> >struct vtn_sampled_image *sampled_image;
> > --
> > 2.13.6
> > 

Re: [Mesa-dev] [PATCH 3/4] spirv: add sampler and image variable support when handling texture opcodes

2017-10-06 Thread Samuel Iglesias Gonsálvez

On Fri, 2017-10-06 at 11:36 +0200, Samuel Iglesias Gonsálvez wrote:
> From: Samuel Iglesias Gonsalvez <cor...@samuelig.es>
> 
> Signed-off-by: Samuel Iglesias Gonsalvez <cor...@samuelig.es>

This patch and the following should be signed off by my Igalia email.
Fixed locally.

Sam

> ---
>  src/compiler/spirv/spirv_to_nir.c | 58
> +++
>  1 file changed, 47 insertions(+), 11 deletions(-)
> 
> diff --git a/src/compiler/spirv/spirv_to_nir.c
> b/src/compiler/spirv/spirv_to_nir.c
> index 6ce9d1ada34..cf7617454de 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -1489,17 +1489,37 @@ vtn_handle_texture(struct vtn_builder *b,
> SpvOp opcode,
> if (opcode == SpvOpSampledImage) {
>struct vtn_value *val =
>   vtn_push_value(b, w[2], vtn_value_type_sampled_image);
> -  val->sampled_image = ralloc(b, struct vtn_sampled_image);
> -  val->sampled_image->image =
> - vtn_value(b, w[3], vtn_value_type_pointer)->pointer;
> -  val->sampled_image->sampler =
> - vtn_value(b, w[4], vtn_value_type_pointer)->pointer;
> +  val->sampled_image = rzalloc(b, struct vtn_sampled_image);
> +
> +  struct vtn_value *img_val = vtn_untyped_value(b, w[3]);
> +  struct vtn_value *sampler_val = vtn_untyped_value(b, w[4]);
> +
> +  if (img_val->value_type == vtn_value_type_pointer) {
> + val->sampled_image->image = img_val->pointer;
> +  } else {
> + assert(img_val->value_type ==
> vtn_value_type_image_variable);
> + val->sampled_image->var_image = img_val->var;
> +  }
> +
> +  if (sampler_val->value_type == vtn_value_type_pointer) {
> + val->sampled_image->sampler = sampler_val->pointer;
> +  } else {
> + assert(sampler_val->value_type ==
> vtn_value_type_sampler_variable);
> + val->sampled_image->var_sampler = sampler_val->var;
> +  }
>return;
> } else if (opcode == SpvOpImage) {
>struct vtn_value *val = vtn_push_value(b, w[2],
> vtn_value_type_pointer);
>struct vtn_value *src_val = vtn_untyped_value(b, w[3]);
>if (src_val->value_type == vtn_value_type_sampled_image) {
>   val->pointer = src_val->sampled_image->image;
> + if (val->pointer == NULL && src_val->sampled_image-
> >var_image != NULL) {
> +val->value_type = vtn_value_type_image_variable;
> +val->var = src_val->sampled_image->var_image;
> + } else if (val->pointer == NULL && src_val->sampled_image-
> >var_sampler != NULL) {
> +val->value_type = vtn_value_type_sampler_variable;
> +val->var = src_val->sampled_image->var_sampler;
> + }
>} else {
>   assert(src_val->value_type == vtn_value_type_pointer);
>   val->pointer = src_val->pointer;
> @@ -1510,21 +1530,32 @@ vtn_handle_texture(struct vtn_builder *b,
> SpvOp opcode,
> struct vtn_type *ret_type = vtn_value(b, w[1],
> vtn_value_type_type)->type;
> struct vtn_value *val = vtn_push_value(b, w[2],
> vtn_value_type_ssa);
>  
> -   struct vtn_sampled_image sampled;
> +   struct vtn_sampled_image sampled = {NULL};
> +   nir_deref_var *sampler = NULL;
> struct vtn_value *sampled_val = vtn_untyped_value(b, w[3]);
> if (sampled_val->value_type == vtn_value_type_sampled_image) {
>sampled = *sampled_val->sampled_image;
> -   } else {
> +  if (sampled.var_sampler)
> + sampler = nir_deref_var_create(b, sampled.var_sampler-
> >var);
> +   } else if (sampled_val->value_type == vtn_value_type_pointer) {
>assert(sampled_val->value_type == vtn_value_type_pointer);
>sampled.image = NULL;
>sampled.sampler = sampled_val->pointer;
> +   } else {
> +  assert(sampled_val->value_type ==
> vtn_value_type_image_variable ||
> + sampled_val->value_type ==
> vtn_value_type_sampler_variable);
> +  sampler = nir_deref_var_create(b, sampled_val->var->var);
> }
>  
> const struct glsl_type *image_type;
> if (sampled.image) {
>image_type = sampled.image->var->var->interface_type;
> -   } else {
> +   } else if (sampled.var_image) {
> +  image_type = sampled.var_image->type->type;
> +   } else if (sampled.sampler) {
>image_type = sampled.sampler->var->var->interface_type;
> +   } else {
> +  image_type = sampled_val->var->var->type;
> }
> const enum glsl_sampler_dim sampl

[Mesa-dev] [PATCH 2/4] spirv: Add support for function arguments of type image and sampler

2017-10-06 Thread Samuel Iglesias Gonsálvez

These arguments are actually variables, not pointers. This is allowed
by SPIR-V spec but the support was missing.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/spirv/vtn_cfg.c | 13 +
 src/compiler/spirv/vtn_private.h |  5 +
 2 files changed, 18 insertions(+)

diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 25ff254bcec..15d8bb426a1 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -111,6 +111,19 @@ vtn_cfg_handle_prepass_instruction(struct vtn_builder *b, 
SpvOp opcode,
  param->name = ralloc_strdup(param, val->name);
 
  val->pointer = vtn_pointer_for_variable(b, vtn_var, type);
+  } else if (type->base_type == vtn_base_type_image || type->base_type == 
vtn_base_type_sampler) {
+ struct vtn_variable *vtn_var = rzalloc(b, struct vtn_variable);
+ vtn_var->type = type;
+ vtn_var->var = param;
+ vtn_var->mode = (type->base_type == vtn_base_type_image) ?
+vtn_variable_mode_image : vtn_variable_mode_sampler;
+ struct vtn_value *val =
+ vtn_push_value(b, w[2],
+(type->base_type == vtn_base_type_image) ?
+vtn_value_type_image_variable : 
vtn_value_type_sampler_variable);
+ val->var = vtn_var;
+ /* Name the parameter so it shows up nicely in NIR */
+ param->name = ralloc_strdup(param, val->name);
   } else {
  /* We're a regular SSA value. */
  struct vtn_ssa_value *param_ssa =
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 84584620fc1..f194a7ed32a 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -51,6 +51,8 @@ enum vtn_value_type {
vtn_value_type_extension,
vtn_value_type_image_pointer,
vtn_value_type_sampled_image,
+   vtn_value_type_image_variable,
+   vtn_value_type_sampler_variable,
 };
 
 enum vtn_branch_type {
@@ -413,6 +415,8 @@ struct vtn_image_pointer {
 struct vtn_sampled_image {
struct vtn_pointer *image; /* Image or array of images */
struct vtn_pointer *sampler; /* Sampler */
+   struct vtn_variable *var_image;  /* Image or array of images variable */
+   struct vtn_variable *var_sampler; /* Sampler variable */
 };
 
 struct vtn_value {
@@ -427,6 +431,7 @@ struct vtn_value {
  nir_constant *constant;
  const struct glsl_type *const_type;
   };
+  struct vtn_variable *var;
   struct vtn_pointer *pointer;
   struct vtn_image_pointer *image;
   struct vtn_sampled_image *sampled_image;
-- 
2.13.6


[Mesa-dev] [PATCH 1/4] i965/fs: some TXF don't provide LOD

2017-10-06 Thread Samuel Iglesias Gonsálvez

SpvOpImageFetch doesn't provide it, so set it to zero.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_fs_nir.cpp | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 5b8ccd50bff..25488303c29 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4518,11 +4518,12 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
   unreachable("unknown texture opcode");
}
 
-   /* TXS and TXL require a LOD but not everything we implement using those
-* two opcodes provides one.  Provide a default LOD of 0.
+   /* TXF, TXS and TXL require a LOD but not everything we implement using 
those
+* three opcodes provides one.  Provide a default LOD of 0.
 */
-   if ((opcode == SHADER_OPCODE_TXS_LOGICAL ||
-opcode == SHADER_OPCODE_TXL_LOGICAL) &&
+   if ((opcode == SHADER_OPCODE_TXF_LOGICAL ||
+ opcode == SHADER_OPCODE_TXS_LOGICAL ||
+ opcode == SHADER_OPCODE_TXL_LOGICAL) &&
srcs[TEX_LOGICAL_SRC_LOD].file == BAD_FILE) {
   srcs[TEX_LOGICAL_SRC_LOD] = brw_imm_ud(0u);
}
-- 
2.13.6


[Mesa-dev] [PATCH 3/4] spirv: add sampler and image variable support when handling texture opcodes

2017-10-06 Thread Samuel Iglesias Gonsálvez

From: Samuel Iglesias Gonsalvez <cor...@samuelig.es>

Signed-off-by: Samuel Iglesias Gonsalvez <cor...@samuelig.es>
---
 src/compiler/spirv/spirv_to_nir.c | 58 +++
 1 file changed, 47 insertions(+), 11 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 6ce9d1ada34..cf7617454de 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1489,17 +1489,37 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
if (opcode == SpvOpSampledImage) {
   struct vtn_value *val =
  vtn_push_value(b, w[2], vtn_value_type_sampled_image);
-  val->sampled_image = ralloc(b, struct vtn_sampled_image);
-  val->sampled_image->image =
- vtn_value(b, w[3], vtn_value_type_pointer)->pointer;
-  val->sampled_image->sampler =
- vtn_value(b, w[4], vtn_value_type_pointer)->pointer;
+  val->sampled_image = rzalloc(b, struct vtn_sampled_image);
+
+  struct vtn_value *img_val = vtn_untyped_value(b, w[3]);
+  struct vtn_value *sampler_val = vtn_untyped_value(b, w[4]);
+
+  if (img_val->value_type == vtn_value_type_pointer) {
+ val->sampled_image->image = img_val->pointer;
+  } else {
+ assert(img_val->value_type == vtn_value_type_image_variable);
+ val->sampled_image->var_image = img_val->var;
+  }
+
+  if (sampler_val->value_type == vtn_value_type_pointer) {
+ val->sampled_image->sampler = sampler_val->pointer;
+  } else {
+ assert(sampler_val->value_type == vtn_value_type_sampler_variable);
+ val->sampled_image->var_sampler = sampler_val->var;
+  }
   return;
} else if (opcode == SpvOpImage) {
   struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_pointer);
   struct vtn_value *src_val = vtn_untyped_value(b, w[3]);
   if (src_val->value_type == vtn_value_type_sampled_image) {
  val->pointer = src_val->sampled_image->image;
+ if (val->pointer == NULL && src_val->sampled_image->var_image != 
NULL) {
+val->value_type = vtn_value_type_image_variable;
+val->var = src_val->sampled_image->var_image;
+ } else if (val->pointer == NULL && 
src_val->sampled_image->var_sampler != NULL) {
+val->value_type = vtn_value_type_sampler_variable;
+val->var = src_val->sampled_image->var_sampler;
+ }
   } else {
  assert(src_val->value_type == vtn_value_type_pointer);
  val->pointer = src_val->pointer;
@@ -1510,21 +1530,32 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
struct vtn_type *ret_type = vtn_value(b, w[1], vtn_value_type_type)->type;
struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
 
-   struct vtn_sampled_image sampled;
+   struct vtn_sampled_image sampled = {NULL};
+   nir_deref_var *sampler = NULL;
struct vtn_value *sampled_val = vtn_untyped_value(b, w[3]);
if (sampled_val->value_type == vtn_value_type_sampled_image) {
   sampled = *sampled_val->sampled_image;
-   } else {
+  if (sampled.var_sampler)
+ sampler = nir_deref_var_create(b, sampled.var_sampler->var);
+   } else if (sampled_val->value_type == vtn_value_type_pointer) {
   assert(sampled_val->value_type == vtn_value_type_pointer);
   sampled.image = NULL;
   sampled.sampler = sampled_val->pointer;
+   } else {
+  assert(sampled_val->value_type == vtn_value_type_image_variable ||
+ sampled_val->value_type == vtn_value_type_sampler_variable);
+  sampler = nir_deref_var_create(b, sampled_val->var->var);
}
 
const struct glsl_type *image_type;
if (sampled.image) {
   image_type = sampled.image->var->var->interface_type;
-   } else {
+   } else if (sampled.var_image) {
+  image_type = sampled.var_image->type->type;
+   } else if (sampled.sampler) {
   image_type = sampled.sampler->var->var->interface_type;
+   } else {
+  image_type = sampled_val->var->var->type;
}
const enum glsl_sampler_dim sampler_dim = glsl_get_sampler_dim(image_type);
const bool is_array = glsl_sampler_type_is_array(image_type);
@@ -1741,10 +1772,15 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
   unreachable("Invalid base type for sampler result");
}
 
-   nir_deref_var *sampler = vtn_pointer_to_deref(b, sampled.sampler);
+   if (!sampler)
+  sampler = vtn_pointer_to_deref(b, sampled.sampler);
nir_deref_var *texture;
-   if (sampled.image) {
-  nir_deref_var *image = vtn_pointer_to_deref(b, sampled.image);
+   if (sampled.image || sampled.var_image) {
+  nir_deref_var *image;
+  if (!sampled.image && sampled.var_image)
+ image = nir_deref_var_create(b, sampled.var_image->var);
+  else
+ image = vtn_pointer_to_deref(b, sampled.image);
   texture = image;
} else {
   texture = sampler;
-- 
2.13.6

[Mesa-dev] [PATCH 4/4] spirv: add support for image variables for image opcodes

2017-10-06 Thread Samuel Iglesias Gonsálvez

From: Samuel Iglesias Gonsalvez <cor...@samuelig.es>

Signed-off-by: Samuel Iglesias Gonsalvez <cor...@samuelig.es>
---
 src/compiler/spirv/spirv_to_nir.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index cf7617454de..bc3fb861397 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1948,7 +1948,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
   return;
}
 
-   struct vtn_image_pointer image;
+   struct vtn_image_pointer image = {NULL};
+   nir_deref_var *image_deref = NULL;
 
switch (opcode) {
case SpvOpAtomicExchange:
@@ -1974,13 +1975,23 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpImageQuerySize:
-  image.image = vtn_value(b, w[3], vtn_value_type_pointer)->pointer;
+  if (vtn_untyped_value(b, w[3])->value_type == vtn_value_type_pointer) {
+ image.image = vtn_value(b, w[3], vtn_value_type_pointer)->pointer;
+  } else {
+ assert(vtn_untyped_value(b, w[3])->value_type == vtn_value_type_image_variable);
+ image_deref = nir_deref_var_create(b, vtn_value(b, w[3], vtn_value_type_image_variable)->var->var);
+  }
   image.coord = NULL;
   image.sample = NULL;
   break;
 
case SpvOpImageRead:
-  image.image = vtn_value(b, w[3], vtn_value_type_pointer)->pointer;
+  if (vtn_untyped_value(b, w[3])->value_type == vtn_value_type_pointer) {
+ image.image = vtn_value(b, w[3], vtn_value_type_pointer)->pointer;
+  } else {
+ assert(vtn_untyped_value(b, w[3])->value_type == vtn_value_type_image_variable);
+ image_deref = nir_deref_var_create(b, vtn_value(b, w[3], vtn_value_type_image_variable)->var->var);
+  }
   image.coord = get_image_coord(b, w[4]);
 
   if (count > 5 && (w[5] & SpvImageOperandsSampleMask)) {
@@ -1992,7 +2003,12 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
   break;
 
case SpvOpImageWrite:
-  image.image = vtn_value(b, w[1], vtn_value_type_pointer)->pointer;
+  if (vtn_untyped_value(b, w[1])->value_type == vtn_value_type_pointer) {
+ image.image = vtn_value(b, w[1], vtn_value_type_pointer)->pointer;
+  } else {
+ assert(vtn_untyped_value(b, w[1])->value_type == vtn_value_type_image_variable);
+ image_deref = nir_deref_var_create(b, vtn_value(b, w[1], vtn_value_type_image_variable)->var->var);
+  }
   image.coord = get_image_coord(b, w[2]);
 
   /* texel = w[3] */
@@ -2037,7 +2053,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
 
nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader, op);
 
-   nir_deref_var *image_deref = vtn_pointer_to_deref(b, image.image);
+   if (!image_deref)
+  image_deref = vtn_pointer_to_deref(b, image.image);
intrin->variables[0] = nir_deref_var_clone(image_deref, intrin);
 
/* ImageQuerySize doesn't take any extra parameters */
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH] vulkan/wsi/wayland: Stop printing out the DRM device

2017-09-24 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Friday, September 22, 2017 12:45:27 PM CEST Jason Ekstrand wrote:
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/vulkan/wsi/wsi_common_wayland.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/src/vulkan/wsi/wsi_common_wayland.c
> b/src/vulkan/wsi/wsi_common_wayland.c index dd283a1..b726d98 100644
> --- a/src/vulkan/wsi/wsi_common_wayland.c
> +++ b/src/vulkan/wsi/wsi_common_wayland.c
> @@ -98,7 +98,6 @@ wsi_wl_display_add_vk_format(struct wsi_wl_display
> *display, VkFormat format) static void
>  drm_handle_device(void *data, struct wl_drm *drm, const char *name)
>  {
> -   fprintf(stderr, "wl_drm.device(%s)\n", name);
>  }
> 
>  static uint32_t




Re: [Mesa-dev] [PATCH] anv: fix viewport transformation for z component

2017-09-22 Thread Samuel Iglesias Gonsálvez

Kind reminder that this patch is still unreviewed.

Sam

On Friday, September 15, 2017 11:50:46 AM CEST you wrote:
> In Vulkan, for the 'z' (depth) component, the scale and translate values
> for the viewport transformation are:
> 
> pz = maxDepth - minDepth
> oz = minDepth
> 
> zf = pz × zd + oz
> 
> where zd is the third component of the vertex's normalized device
> coordinates.
> 
> Fixes: dEQP-VK.draw.inverted_depth_ranges.*
> 
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> ---
>  src/intel/vulkan/gen8_cmd_buffer.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/vulkan/gen8_cmd_buffer.c
> b/src/intel/vulkan/gen8_cmd_buffer.c index 064b8e930e..7bea231ea7 100644
> --- a/src/intel/vulkan/gen8_cmd_buffer.c
> +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> @@ -49,10 +49,10 @@ gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer
> *cmd_buffer) struct GENX(SF_CLIP_VIEWPORT) sf_clip_viewport = {
>   .ViewportMatrixElementm00 = vp->width / 2,
>   .ViewportMatrixElementm11 = vp->height / 2,
> - .ViewportMatrixElementm22 = 1.0,
> + .ViewportMatrixElementm22 = vp->maxDepth - vp->minDepth,
>   .ViewportMatrixElementm30 = vp->x + vp->width / 2,
>   .ViewportMatrixElementm31 = vp->y + vp->height / 2,
> - .ViewportMatrixElementm32 = 0.0,
> + .ViewportMatrixElementm32 = vp->minDepth,
>   .XMinClipGuardband = -1.0f,
>   .XMaxClipGuardband = 1.0f,
>   .YMinClipGuardband = -1.0f,




[Mesa-dev] [PATCH] anv: fix viewport transformation for z component

2017-09-15 Thread Samuel Iglesias Gonsálvez

In Vulkan, for the 'z' (depth) component, the scale and translate values
for the viewport transformation are:

pz = maxDepth - minDepth
oz = minDepth

zf = pz × zd + oz

where zd is the third component of the vertex's normalized device
coordinates.

Fixes: dEQP-VK.draw.inverted_depth_ranges.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/vulkan/gen8_cmd_buffer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
b/src/intel/vulkan/gen8_cmd_buffer.c
index 064b8e930e..7bea231ea7 100644
--- a/src/intel/vulkan/gen8_cmd_buffer.c
+++ b/src/intel/vulkan/gen8_cmd_buffer.c
@@ -49,10 +49,10 @@ gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer 
*cmd_buffer)
   struct GENX(SF_CLIP_VIEWPORT) sf_clip_viewport = {
  .ViewportMatrixElementm00 = vp->width / 2,
  .ViewportMatrixElementm11 = vp->height / 2,
- .ViewportMatrixElementm22 = 1.0,
+ .ViewportMatrixElementm22 = vp->maxDepth - vp->minDepth,
  .ViewportMatrixElementm30 = vp->x + vp->width / 2,
  .ViewportMatrixElementm31 = vp->y + vp->height / 2,
- .ViewportMatrixElementm32 = 0.0,
+ .ViewportMatrixElementm32 = vp->minDepth,
  .XMinClipGuardband = -1.0f,
  .XMaxClipGuardband = 1.0f,
  .YMinClipGuardband = -1.0f,
-- 
2.14.1
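
As a worked illustration of the formula in the commit message, here is a
minimal sketch in plain C. The helper name viewport_depth() is hypothetical
and is not part of anv; it only evaluates zf = pz × zd + oz so that the
effect of an inverted depth range is easy to check by hand.

```c
#include <assert.h>

/* Hypothetical helper, not a Mesa function: maps the normalized device
 * depth zd to the framebuffer depth zf using zf = pz * zd + oz.
 */
static float
viewport_depth(float min_depth, float max_depth, float zd)
{
   const float pz = max_depth - min_depth; /* scale, ViewportMatrixElementm22 */
   const float oz = min_depth;             /* translate, ViewportMatrixElementm32 */

   return pz * zd + oz;
}
```

With the previous constants (m22 = 1.0, m32 = 0.0) the result would be
zf = zd whatever minDepth and maxDepth are, which is presumably why the
inverted depth range tests failed: for minDepth = 1.0 and maxDepth = 0.0,
zd = 0.25 should map to 0.75, not 0.25.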


[Mesa-dev] [PATCH v2] nir/spirv: handle if's with same label in both branches

2017-09-11 Thread Samuel Iglesias Gonsálvez

From: "Juan A. Suarez Romero" 

When a conditional branch has the same labels in the "if" part and in the
"else" part, then we have the same cfg block, and it must be handled
once.

v2: handle it the same way as OpBranch (Jason).

Fixes:
dEQP-VK.spirv_assembly.instruction.compute.conditional_branch.same_labels*
dEQP-VK.spirv_assembly.instruction.graphics.conditional_branch.same_labels*
---
 src/compiler/spirv/vtn_cfg.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 03c452cb31..3ad20b9ad8 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -356,8 +356,16 @@ vtn_cfg_walk_blocks(struct vtn_builder *b, struct 
list_head *cf_list,
   switch_case, switch_break,
   loop_break, loop_cont);
 
- if (if_stmt->then_type == vtn_branch_type_none &&
- if_stmt->else_type == vtn_branch_type_none) {
+ if (then_block == else_block) {
+block->branch_type = if_stmt->then_type;
+if (block->branch_type == vtn_branch_type_none) {
+   block = then_block;
+   continue;
+} else {
+   return;
+}
+ } else if (if_stmt->then_type == vtn_branch_type_none &&
+if_stmt->else_type == vtn_branch_type_none) {
 /* Neither side of the if is something we can short-circuit. */
 assert((*block->merge & SpvOpCodeMask) == SpvOpSelectionMerge);
 struct vtn_block *merge_block =
-- 
2.14.1
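
To make the control flow of the new branch easier to follow, here is a
small standalone sketch. All names (demo_block, demo_branch_type,
demo_same_target) are made up for illustration and only model the decision
taken when both labels of the conditional branch are the same; they are not
the vtn_cfg.c data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of a branch classification. */
enum demo_branch_type {
   DEMO_BRANCH_NONE,       /* plain jump into the next block */
   DEMO_BRANCH_LOOP_BREAK, /* jump that leaves the current construct */
};

struct demo_block {
   enum demo_branch_type branch_type;
};

/* Handles a conditional branch whose "then" and "else" targets are the
 * same block, treating it like an unconditional branch. Returns the block
 * the walker should continue with, or NULL when the walk of this list ends.
 */
static struct demo_block *
demo_same_target(struct demo_block *block, struct demo_block *then_block,
                 struct demo_block *else_block,
                 enum demo_branch_type then_type)
{
   assert(then_block == else_block);

   block->branch_type = then_type;
   if (block->branch_type == DEMO_BRANCH_NONE)
      return then_block; /* keep walking into the single successor */

   return NULL;          /* the branch type was recorded; stop here */
}
```

The point of the v2 change is the second case: just continuing into
then_block, as v1 did, would lose the recorded branch type when the shared
target is something like a loop break.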


Re: [Mesa-dev] [PATCH] nir/spirv: handle if's with same label in both branches

2017-09-11 Thread Samuel Iglesias Gonsálvez



On 09/07/2017 07:03 PM, Jason Ekstrand wrote:
> On Thu, Aug 24, 2017 at 8:16 AM, Juan A. Suarez Romero
> > wrote:
>
> When a conditional branch has the same labels in the "if" part and
> in the
> "else" part, then we have the same cfg block, and it must be handled
> once.
>
> Fixes:
> dEQP-VK.spirv_assembly.instruction.compute.conditional_branch.same_labels*
> 
> dEQP-VK.spirv_assembly.instruction.graphics.conditional_branch.same_labels*
> ---
>  src/compiler/spirv/vtn_cfg.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_cfg.c
> b/src/compiler/spirv/vtn_cfg.c
> index 03c452cb31..bfca7043cc 100644
> --- a/src/compiler/spirv/vtn_cfg.c
> +++ b/src/compiler/spirv/vtn_cfg.c
> @@ -356,8 +356,11 @@ vtn_cfg_walk_blocks(struct vtn_builder *b,
> struct list_head *cf_list,
>                                                    switch_case,
> switch_break,
>                                                    loop_break,
> loop_cont);
>
> -         if (if_stmt->then_type == vtn_branch_type_none &&
> -             if_stmt->else_type == vtn_branch_type_none) {
> +         if (then_block == else_block) {
> +            block = then_block;
> +            continue;
>
>
> This isn't quite sufficient.  This needs to be handled the same way as
> OpBranch.  In particular,
>
> block->branch_type = if_stmt->then_type;
> if (block->branch_type == vtn_branch_type_none) {
>    block = then_block;
>    continue;
> } else {
>    return;
> }

OK, thanks. I am going to send a v2 soon.

Sam

>  
>
> +         } else if (if_stmt->then_type == vtn_branch_type_none &&
> +                    if_stmt->else_type == vtn_branch_type_none) {
>              /* Neither side of the if is something we can
> short-circuit. */
>              assert((*block->merge & SpvOpCodeMask) ==
> SpvOpSelectionMerge);
>              struct vtn_block *merge_block =
> --
> 2.13.5
>




Re: [Mesa-dev] [PATCH] nir/spirv: fix crashes when dereferencing a pointer for an OpVariable

2017-09-11 Thread Samuel Iglesias Gonsálvez



On 09/07/2017 07:15 PM, Jason Ekstrand wrote:
> On Tue, Aug 29, 2017 at 3:04 AM, Samuel Iglesias Gonsálvez
> <sigles...@igalia.com <mailto:sigles...@igalia.com>> wrote:
>
> When creating a vtn_pointer for an OpVariable, the block_index and
> offset fields are null because there is no SSA value to take the data
> from.
>
> However, we can dereference that pointer when processing
> SpvOp*AccessChain opcodes through vtn_ssa_offset_pointer_dereference()
> when the OpVariable's StorageClass is Uniform or StorageBuffer.
>
> Inside vtn_ssa_offset_pointer_dereference() we have the code to
> initialize block_index and offset if they are null, but it runs
> after the assertion that they are non-null.
>
>
> This seems fishy.  The code you're moving only triggers for
> OpPtrAccessChain.  In order to run into an issue, they would have to
> be doing an OpPtrAccessChain on an array of resources.  This shouldn't
> be allowed by the spec.  I'll have to look at the actual SPIR-V to be
> sure, but I think this is probably a CTS bug.
>  

For example, one of the tests failing is:
dEQP-VK.spirv_assembly.instruction.compute.indexing.opptraccesschain_u32

A snippet of its SPIR-V is:

[...]
OpDecorate %_ptr_buffer_Input ArrayStride 65536
[...]
%Input = OpTypeStruct %_arr__arr_mat4v4float_uint_32_uint_32
%_ptr_buffer_Input = OpTypePointer StorageBuffer %Input
%dataInput = OpVariable %_ptr_buffer_Input StorageBuffer
[...]
%54 = OpPtrAccessChain %_ptr_buffer_float %dataInput %idx_1 %idx_0 %i0
%i1 %i2 %i3

%dataInput points to a struct of an AoA of matrices. However, according
to SPV_KHR_variable_pointers [0]:

  * Each *OpPtrAccessChain* must have a /Base/ whose type is decorated
    with *ArrayStride*.

So I think this is fine, or am I understanding it wrongly?

Sam

[0]
https://www.khronos.org/registry/spir-v/extensions/KHR/SPV_KHR_variable_pointers.html

> Reordering that code fixes crashes in:
>
>    dEQP-VK.spirv_assembly.instruction.*.indexing.*
>
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com
> <mailto:sigles...@igalia.com>>
> ---
>  src/compiler/spirv/vtn_variables.c | 29 +++--
>  1 file changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c
> b/src/compiler/spirv/vtn_variables.c
> index 4f6acd2e07..baf1edde4c 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -146,20 +146,6 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder *b,
>     struct vtn_type *type = base->type;
>
>     unsigned idx = 0;
> -   if (deref_chain->ptr_as_array) {
> -      /* We need ptr_type for the stride */
> -      assert(base->ptr_type);
> -      /* This must be a pointer to an actual element somewhere */
> -      assert(block_index && offset);
> -      /* We need at least one element in the chain */
> -      assert(deref_chain->length >= 1);
> -
> -      nir_ssa_def *elem_offset =
> -         vtn_access_link_as_ssa(b, deref_chain->link[idx],
> -                                base->ptr_type->stride);
> -      offset = nir_iadd(&b->nb, offset, elem_offset);
> -      idx++;
> -   }
>
>     if (!block_index) {
>        assert(base->var);
> @@ -182,6 +168,21 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder *b,
>     }
>     assert(offset);
>
> +   if (deref_chain->ptr_as_array) {
> +      /* We need ptr_type for the stride */
> +      assert(base->ptr_type);
> +      /* This must be a pointer to an actual element somewhere */
> +      assert(block_index && offset);
> +      /* We need at least one element in the chain */
> +      assert(deref_chain->length >= 1);
> +
> +      nir_ssa_def *elem_offset =
> +         vtn_access_link_as_ssa(b, deref_chain->link[idx],
> +                                base->ptr_type->stride);
> +      offset = nir_iadd(&b->nb, offset, elem_offset);
> +      idx++;
> +   }
> +
>     for (; idx < deref_chain->length; idx++) {
>        switch (glsl_get_base_type(type->type)) {
>        case GLSL_TYPE_UINT:
> --
> 2.14.1
>
>
>




Re: [Mesa-dev] [PATCH v2] nir/spirv: fix chain access with different index bit sizes

2017-09-07 Thread Samuel Iglesias Gonsálvez

This patch is unreviewed.

On Tue, 2017-08-29 at 08:42 +0200, Samuel Iglesias Gonsálvez wrote:
> Currently we support 32-bit indexes/offsets all over the driver, so
> we
> convert them to that bit size.
> 
> Fixes dEQP-VK.spirv_assembly.instruction.*.indexing.*
> 
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> ---
>  src/compiler/spirv/vtn_variables.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/src/compiler/spirv/vtn_variables.c
> b/src/compiler/spirv/vtn_variables.c
> index 4432e72e54..4f6acd2e07 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -102,10 +102,15 @@ vtn_access_link_as_ssa(struct vtn_builder *b,
> struct vtn_access_link link,
> if (link.mode == vtn_access_mode_literal) {
>    return nir_imm_int(&b->nb, link.id * stride);
> } else if (stride == 1) {
> -  return vtn_ssa_value(b, link.id)->def;
> +   nir_ssa_def *ssa = vtn_ssa_value(b, link.id)->def;
> +   if (ssa->bit_size != 32)
> +  ssa = nir_i2i32(&b->nb, ssa);
> +  return ssa;
> } else {
> -  return nir_imul(&b->nb, vtn_ssa_value(b, link.id)->def,
> -  nir_imm_int(&b->nb, stride));
> +  nir_ssa_def *src0 = vtn_ssa_value(b, link.id)->def;
> +  if (src0->bit_size != 32)
> + src0 = nir_i2i32(&b->nb, src0);
> +  return nir_imul(&b->nb, src0, nir_imm_int(&b->nb, stride));
> }
>  }
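
For readers less familiar with the problem, this is roughly what the
conversion amounts to, written as a standalone C sketch. The struct
demo_index and the function demo_index_to_offset() are hypothetical and
only model "an index of some bit size that is narrowed to 32 bits before it
is multiplied by the stride"; they are not NIR code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an SSA index value of some bit size. */
struct demo_index {
   unsigned bit_size; /* 8, 16, 32 or 64 */
   int64_t value;     /* the index, sign-extended to 64 bits */
};

/* Bring the index to 32 bits, then scale it by the stride, so that both
 * operands of the multiplication have the same width.
 */
static uint32_t
demo_index_to_offset(struct demo_index idx, uint32_t stride)
{
   assert(idx.bit_size == 8 || idx.bit_size == 16 ||
          idx.bit_size == 32 || idx.bit_size == 64);

   /* Converting to uint32_t keeps the low 32 bits, which is the same bit
    * pattern as a two's-complement truncation or sign-extension to 32 bits.
    */
   const uint32_t idx32 = (uint32_t)idx.value;

   return idx32 * stride;
}
```

Without the conversion, a 16-bit or 64-bit index would be multiplied by a
32-bit immediate, which is the mismatch the patch avoids.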
>  


Re: [Mesa-dev] [PATCH] nir/spirv: fix crashes when dereferencing a pointer for an OpVariable

2017-09-07 Thread Samuel Iglesias Gonsálvez

This patch is unreviewed.

On Tue, 2017-08-29 at 12:04 +0200, Samuel Iglesias Gonsálvez wrote:
> When creating a vtn_pointer for an OpVariable, the block_index and
> offset fields are null because there is no SSA value to take the data
> from.
> 
> However, we can dereference that pointer when processing
> SpvOp*AccessChain opcodes through
> vtn_ssa_offset_pointer_dereference()
> when the OpVariable's StorageClass is Uniform or
> StorageBuffer.
> 
> Inside vtn_ssa_offset_pointer_dereference() we have the code to
> initialize block_index and offset if they are null, but it runs
> after the assertion that they are non-null.
> 
> Reordering that code fixes crashes in:
> 
>    dEQP-VK.spirv_assembly.instruction.*.indexing.*
> 
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> ---
>  src/compiler/spirv/vtn_variables.c | 29 +++-
> -
>  1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/src/compiler/spirv/vtn_variables.c
> b/src/compiler/spirv/vtn_variables.c
> index 4f6acd2e07..baf1edde4c 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -146,20 +146,6 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder *b,
> struct vtn_type *type = base->type;
>  
> unsigned idx = 0;
> -   if (deref_chain->ptr_as_array) {
> -  /* We need ptr_type for the stride */
> -  assert(base->ptr_type);
> -  /* This must be a pointer to an actual element somewhere */
> -  assert(block_index && offset);
> -  /* We need at least one element in the chain */
> -  assert(deref_chain->length >= 1);
> -
> -  nir_ssa_def *elem_offset =
> - vtn_access_link_as_ssa(b, deref_chain->link[idx],
> -base->ptr_type->stride);
> -  offset = nir_iadd(&b->nb, offset, elem_offset);
> -  idx++;
> -   }
>  
> if (!block_index) {
>    assert(base->var);
> @@ -182,6 +168,21 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder *b,
> }
> assert(offset);
>  
> +   if (deref_chain->ptr_as_array) {
> +  /* We need ptr_type for the stride */
> +  assert(base->ptr_type);
> +  /* This must be a pointer to an actual element somewhere */
> +  assert(block_index && offset);
> +  /* We need at least one element in the chain */
> +  assert(deref_chain->length >= 1);
> +
> +  nir_ssa_def *elem_offset =
> + vtn_access_link_as_ssa(b, deref_chain->link[idx],
> +base->ptr_type->stride);
> +  offset = nir_iadd(&b->nb, offset, elem_offset);
> +  idx++;
> +   }
> +
> for (; idx < deref_chain->length; idx++) {
>    switch (glsl_get_base_type(type->type)) {
>    case GLSL_TYPE_UINT:


Re: [Mesa-dev] [PATCH] mesa/main: Fix GetTransformFeedbacki64 for glTransformFeedbackBufferBase

2017-09-06 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Tue, 2017-09-05 at 14:41 +0200, Iago Toral Quiroga wrote:
> The spec has special rules for querying buffer offsets and sizes
> when BindBufferBase is used, described  in the OpenGL 4.6 spec,
> section 6.8 Buffer Object State:
> 
>    "To query the starting offset or size of the range of a buffer
> object binding in an indexed array, call GetInteger64i_v with
> target set to respectively the starting offset or binding size
> name from table 6.5 for that array. Index must be in the range
> zero to the number of bind points supported minus one. If the
> starting offset or size was not specified when the buffer object
> was bound (e.g. if it was bound with BindBufferBase), or if no
> buffer object is bound to the target array at index, zero is
> returned."
> 
> Transform feedback buffer queries should follow the same rules, since
> it is the same case for them. There is a CTS test for this.
> 
> Fixes:
> KHR-GL45.direct_state_access.xfb_buffers
> ---
>  src/mesa/main/transformfeedback.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/src/mesa/main/transformfeedback.c
> b/src/mesa/main/transformfeedback.c
> index befc7c..a5ea2a5eb7 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -1402,12 +1402,34 @@ _mesa_GetTransformFeedbacki64_v(GLuint xfb,
> GLenum pname, GLuint index,
>    return;
> }
>  
> +   /**
> +* This follows the same general rules used for BindBufferBase:
> +*
> +*   "To query the starting offset or size of the range of a
> buffer
> +*object binding in an indexed array, call GetInteger64i_v
> with
> +*target set to respectively the starting offset or binding
> size
> +*name from table 6.5 for that array. Index must be in the
> range
> +*zero to the number of bind points supported minus one. If
> the
> +*starting offset or size was not specified when the buffer
> object
> +*was bound (e.g. if it was bound with BindBufferBase), or if
> no
> +*buffer object is bound to the target array at index, zero
> is
> +*returned."
> +*/
> +   if (obj->RequestedSize[index] == 0 &&
> +   (pname == GL_TRANSFORM_FEEDBACK_BUFFER_START ||
> +pname == GL_TRANSFORM_FEEDBACK_BUFFER_SIZE)) {
> +  *param = 0;
> +  return;
> +   }
> +
> compute_transform_feedback_buffer_sizes(obj);
> switch(pname) {
> case GL_TRANSFORM_FEEDBACK_BUFFER_START:
> +  assert(obj->RequestedSize[index] > 0);
>    *param = obj->Offset[index];
>    break;
> case GL_TRANSFORM_FEEDBACK_BUFFER_SIZE:
> +  assert(obj->RequestedSize[index] > 0);
>    *param = obj->Size[index];
>    break;
> default:
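
A minimal sketch of the rule quoted from the spec, in standalone C. The
types and names (demo_binding, demo_pname, demo_get_binding_param) are
invented for the example and do not mirror the Mesa structures; the only
assumption carried over from the patch is that a requested size of zero
means the buffer was bound without an explicit range.

```c
#include <assert.h>
#include <stdint.h>

enum demo_pname { DEMO_BUFFER_START, DEMO_BUFFER_SIZE };

/* Hypothetical model of one indexed buffer binding. */
struct demo_binding {
   int64_t offset;         /* start of the bound range */
   int64_t size;           /* effective size, derived from the buffer */
   int64_t requested_size; /* 0 when no range was given at bind time */
};

/* Returns the value a start/size query should report for this binding. */
static int64_t
demo_get_binding_param(const struct demo_binding *bind, enum demo_pname pname)
{
   /* No explicit range (for example a BindBufferBase-style bind):
    * both the start and the size query report zero.
    */
   if (bind->requested_size == 0)
      return 0;

   switch (pname) {
   case DEMO_BUFFER_START:
      return bind->offset;
   case DEMO_BUFFER_SIZE:
      return bind->size;
   }
   return 0;
}
```

The early return matters because the effective size is still non-zero for
a whole-buffer binding, so reporting it directly would contradict the spec
text.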


Re: [Mesa-dev] [Mesa-stable] [PATCH] anv/formats: Fix an off-by-one in the format array range check

2017-09-03 Thread Samuel Iglesias Gonsálvez

I have just seen Eric's patch. Forget this R-b.

Sam

On Mon, 2017-09-04 at 06:59 +0200, Samuel Iglesias Gonsálvez wrote:
> Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> 
> On Sun, 2017-09-03 at 17:10 -0700, Jason Ekstrand wrote:
> > Found with static code analysis
> > 
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/intel/vulkan/anv_formats.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/intel/vulkan/anv_formats.c
> > b/src/intel/vulkan/anv_formats.c
> > index c23b143..eead1aa 100644
> > --- a/src/intel/vulkan/anv_formats.c
> > +++ b/src/intel/vulkan/anv_formats.c
> > @@ -253,7 +253,7 @@ static const struct anv_format anv_formats[] =
> > {
> >  static bool
> >  format_supported(VkFormat vk_format)
> >  {
> > -   if (vk_format > ARRAY_SIZE(anv_formats))
> > +   if (vk_format >= ARRAY_SIZE(anv_formats))
> >    return false;
> >  
> > return anv_formats[vk_format].isl_format !=
> > ISL_FORMAT_UNSUPPORTED;
> 


Re: [Mesa-dev] [PATCH mesa] anv: fix off by one in array check

2017-09-03 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Sun, 2017-09-03 at 21:54 -0700, Jason Ekstrand wrote:
> I sent the same patch a few hours later.  I don't care which one we
> land.  You have a more descriptive commit message.
> 
> Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
> 
> On Sun, Sep 3, 2017 at 11:33 AM, Eric Engestrom <e...@engestrom.ch>
> wrote:
> > `anv_formats[ARRAY_SIZE(anv_formats)]` is already one too far.
> > 
> > Spotted by Coverity.
> > 
> > 
> > 
> > CovID: 1417259
> > 
> > Fixes: 242211933a0682696170 "anv/formats: Nicely handle unknown
> > VkFormat enums"
> > 
> > Cc: Jason Ekstrand <jason.ekstr...@intel.com>
> > 
> > Signed-off-by: Eric Engestrom <e...@engestrom.ch>
> > 
> > ---
> > 
> >  src/intel/vulkan/anv_formats.c | 2 +-
> > 
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > 
> > 
> > diff --git a/src/intel/vulkan/anv_formats.c
> > b/src/intel/vulkan/anv_formats.c
> > 
> > index c23b143cac..eead1aa790 100644
> > 
> > --- a/src/intel/vulkan/anv_formats.c
> > 
> > +++ b/src/intel/vulkan/anv_formats.c
> > 
> > @@ -253,7 +253,7 @@ static const struct anv_format anv_formats[] =
> > {
> > 
> >  static bool
> > 
> >  format_supported(VkFormat vk_format)
> > 
> >  {
> > 
> > -   if (vk_format > ARRAY_SIZE(anv_formats))
> > 
> > +   if (vk_format >= ARRAY_SIZE(anv_formats))
> > 
> >        return false;
> > 
> > 
> > 
> >     return anv_formats[vk_format].isl_format !=
> > ISL_FORMAT_UNSUPPORTED;
> > 
> > --
> > 
> > Cheers,
> > 
> >   Eric
> > 
> > 
> > 
> 


Re: [Mesa-dev] [Mesa-stable] [PATCH] anv/formats: Fix an off-by-one in the format array range check

2017-09-03 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Sun, 2017-09-03 at 17:10 -0700, Jason Ekstrand wrote:
> Found with static code analysis
> 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/vulkan/anv_formats.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_formats.c
> b/src/intel/vulkan/anv_formats.c
> index c23b143..eead1aa 100644
> --- a/src/intel/vulkan/anv_formats.c
> +++ b/src/intel/vulkan/anv_formats.c
> @@ -253,7 +253,7 @@ static const struct anv_format anv_formats[] = {
>  static bool
>  format_supported(VkFormat vk_format)
>  {
> -   if (vk_format > ARRAY_SIZE(anv_formats))
> +   if (vk_format >= ARRAY_SIZE(anv_formats))
>    return false;
>  
> return anv_formats[vk_format].isl_format !=
> ISL_FORMAT_UNSUPPORTED;
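
For reference, a minimal standalone sketch of the bounds check, assuming a
made-up table demo_formats[] in place of anv_formats[] (the table contents
and the demo_ names are hypothetical). Valid indices run from 0 to
ARRAY_SIZE - 1, so an index equal to ARRAY_SIZE has to be rejected too:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table standing in for anv_formats[]; 0 means unsupported. */
static const int demo_formats[] = { 0, 7, 0, 9 };

static bool
demo_format_supported(unsigned format)
{
   /* ">=" and not ">": demo_formats[ARRAY_SIZE(demo_formats)] would
    * already be one element past the end of the array.
    */
   if (format >= ARRAY_SIZE(demo_formats))
      return false;

   return demo_formats[format] != 0;
}
```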


Re: [Mesa-dev] [PATCH] anv: fix build errors on android

2017-08-31 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Thu, 2017-08-31 at 08:52 +0300, Tapani Pälli wrote:
> error: incompatible pointer to integer conversion initializing
> 'VkFence'
>    (aka 'unsigned long long') with an expression of type 'void *' [-
> Werror,-Wint-conversion]
> 
> Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
> ---
>  src/intel/vulkan/anv_queue.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_queue.c
> b/src/intel/vulkan/anv_queue.c
> index 429bac9739..d675e8667e 100644
> --- a/src/intel/vulkan/anv_queue.c
> +++ b/src/intel/vulkan/anv_queue.c
> @@ -169,7 +169,7 @@ VkResult anv_QueueSubmit(
>  
> for (uint32_t i = 0; i < submitCount; i++) {
>    /* Fence for this submit.  NULL for all but the last one */
> -  VkFence submit_fence = (i == submitCount - 1) ? fence : NULL;
> +  VkFence submit_fence = (i == submitCount - 1) ? fence : 0;
>  
>    if (pSubmits[i].commandBufferCount == 0) {
>   /* If we don't have any command buffers, we need to submit
> a dummy
> @@ -197,7 +197,7 @@ VkResult anv_QueueSubmit(
>  
>   /* Fence for this execbuf.  NULL for all but the last one
> */
>   VkFence execbuf_fence =
> -(j == pSubmits[i].commandBufferCount - 1) ? submit_fence
> : NULL;
> +(j == pSubmits[i].commandBufferCount - 1) ? submit_fence
> : 0;
>  
>   const VkSemaphore *in_semaphores = NULL, *out_semaphores =
> NULL;
>   uint32_t num_in_semaphores = 0, num_out_semaphores = 0;
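
Some background on why this only shows up on some builds, as a small C
sketch. On targets that are not 64-bit, the Vulkan headers define
non-dispatchable handles such as VkFence as 64-bit integers rather than
pointers, so NULL (a void pointer) is not a valid value for them. The
demo_ names below are hypothetical stand-ins so the example compiles
without the Vulkan headers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-dispatchable handle on a 32-bit target. */
typedef uint64_t demo_fence;

/* Integer null handle, playing the role of VK_NULL_HANDLE. */
#define DEMO_NULL_HANDLE ((demo_fence)0)

/* Only the last submit of a batch gets the caller's fence. */
static demo_fence
demo_submit_fence(uint32_t i, uint32_t submit_count, demo_fence fence)
{
   return (i == submit_count - 1) ? fence : DEMO_NULL_HANDLE;
}
```

Using an integer zero (or VK_NULL_HANDLE in real code) keeps both arms of
the conditional expression the same type whether the handle is a pointer or
an integer.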


Re: [Mesa-dev] [PATCH 00/16] anv/i965: Cleanup copies of devinfo fields in brw_context

2017-08-30 Thread Samuel Iglesias Gonsálvez

On Wed, 2017-08-30 at 11:10 +0100, Lionel Landwerlin wrote:
> You can find this series on github : 
> https://github.com/djdeath/mesa/tree/wip/djdeath/drop-is-has-brw
> 
> (One patch got caught by the mailing list's size limit)
> 

Assuming there are no regressions on Intel CI, the series is:

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

> On 30/08/17 11:07, Lionel Landwerlin wrote:
> > Hi all,
> > 
> > Following a quick discussion on IRC, Matt reminded me we still had
> > some duplicated fields on brw_context which just hold the same
> > values
> > as gen_device_info. Let's just use gen_device_info instead.
> > 
> > Cheers,
> > 
> > Lionel Landwerlin (16):
> >    anv: use device->info instead of brw->is_*
> >    i965: drop brw->gen in favor of devinfo->gen
> >    i965: drop brw->gt in favor of devinfo->gt
> >    i965: drop brw->is_g4x in favor of devinfo->is_g4x
> >    i965: drop brw->is_baytrail in favor of devinfo->is_baytrail
> >    i965: drop brw->is_haswell in favor of devinfo->is_haswell
> >    i965: drop brw->is_cherryview in favor of devinfo->is_cherryview
> >    i965: drop brw->is_broxton
> >    i965: drop brw->has_llc in favor of devinfo->has_llc
> >    i965: drop unused brw->has_compr4
> >    i965: drop unused brw->has_negative_rhw_bug
> >    i965: drop brw->must_use_separate_stencil in favor of devinfo's
> >    i965: drop unused brw->has_pln
> >    i965: drop unused brw->no_simd8
> >    i965: drop brw->has_surface_tile_offset in favor of devinfo's
> >    i965: drop unused brw->needs_unlit_centroid_workaround
> > 
> >   src/intel/vulkan/genX_pipeline.c  |  2 +-
> >   src/mesa/drivers/dri/i965/brw_binding_tables.c|  6 +-
> >   src/mesa/drivers/dri/i965/brw_blorp.c | 30 +---
> >   src/mesa/drivers/dri/i965/brw_clear.c |  8 +-
> >   src/mesa/drivers/dri/i965/brw_clip.c  |  3 +-
> >   src/mesa/drivers/dri/i965/brw_compute.c   | 14 ++--
> >   src/mesa/drivers/dri/i965/brw_context.c   | 86
> > ++
> >   src/mesa/drivers/dri/i965/brw_context.h   | 25 ---
> >   src/mesa/drivers/dri/i965/brw_cs.c|  2 +-
> >   src/mesa/drivers/dri/i965/brw_curbe.c |  4 +-
> >   src/mesa/drivers/dri/i965/brw_draw.c  | 22 +++---
> >   src/mesa/drivers/dri/i965/brw_draw_upload.c   | 22 +++---
> >   src/mesa/drivers/dri/i965/brw_ff_gs.c |  8 +-
> >   src/mesa/drivers/dri/i965/brw_formatquery.c   |  5 +-
> >   src/mesa/drivers/dri/i965/brw_gs.c|  3 +-
> >   src/mesa/drivers/dri/i965/brw_link.cpp|  9 ++-
> >   src/mesa/drivers/dri/i965/brw_meta_util.c |  7 +-
> >   src/mesa/drivers/dri/i965/brw_misc_state.c| 66
> > ++---
> >   src/mesa/drivers/dri/i965/brw_pipe_control.c  | 46 
> > 
> >   src/mesa/drivers/dri/i965/brw_primitive_restart.c |  3 +-
> >   src/mesa/drivers/dri/i965/brw_program.c   | 11 ++-
> >   src/mesa/drivers/dri/i965/brw_queryobj.c  | 29 +---
> >   src/mesa/drivers/dri/i965/brw_state_upload.c  | 39 +-
> >   src/mesa/drivers/dri/i965/brw_surface_formats.c   |  9 ++-
> >   src/mesa/drivers/dri/i965/brw_tcs.c   | 10 ++-
> >   src/mesa/drivers/dri/i965/brw_urb.c   |  6 +-
> >   src/mesa/drivers/dri/i965/brw_vs.c|  8 +-
> >   src/mesa/drivers/dri/i965/brw_wm.c| 37 +-
> >   src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 54 +-
> > 
> >   src/mesa/drivers/dri/i965/gen6_constant_state.c   |  3 +-
> >   src/mesa/drivers/dri/i965/gen6_queryobj.c | 18 +++--
> >   src/mesa/drivers/dri/i965/gen6_sol.c  |  6 +-
> >   src/mesa/drivers/dri/i965/gen7_l3_state.c | 11 +--
> >   src/mesa/drivers/dri/i965/gen7_misc_state.c   |  3 +-
> >   src/mesa/drivers/dri/i965/gen7_sol_state.c|  9 ++-
> >   src/mesa/drivers/dri/i965/gen7_urb.c  | 13 ++--
> >   src/mesa/drivers/dri/i965/gen8_depth_state.c  |  9 ++-
> >   src/mesa/drivers/dri/i965/genX_state_upload.c | 12 ++-
> >   src/mesa/drivers/dri/i965/hsw_queryobj.c  | 10 ++-
> >   src/mesa/drivers/dri/i965/hsw_sol.c   |  9 ++-
> >   src/mesa/drivers/dri/i965/intel_batchbuffer.c | 69
> > --
> >   src/mesa/drivers/dri/i965/intel_blit.c| 31 +---
> >   src/mesa/drivers/dri/i965/in

[Mesa-dev] [PATCH] nir/spirv: fix crashes when dereferencing a pointer for an OpVariable

2017-08-29 Thread Samuel Iglesias Gonsálvez

When creating a vtn_pointer for an OpVariable, the block_index and
offset fields are null because there is no SSA value to take the data
from.

However, we can dereference that pointer when processing
SpvOp*AccessChain opcodes through vtn_ssa_offset_pointer_dereference()
when the StorageClass of the OpVariable is Uniform or StorageBuffer.

Inside vtn_ssa_offset_pointer_dereference() we have the code to
initialize block_index and offset if they are null, but it runs
after the assert that checks that they are non-null.

Reordering that code fixes crashes in:

   dEQP-VK.spirv_assembly.instruction.*.indexing.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/spirv/vtn_variables.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 4f6acd2e07..baf1edde4c 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -146,20 +146,6 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
struct vtn_type *type = base->type;
 
unsigned idx = 0;
-   if (deref_chain->ptr_as_array) {
-  /* We need ptr_type for the stride */
-  assert(base->ptr_type);
-  /* This must be a pointer to an actual element somewhere */
-  assert(block_index && offset);
-  /* We need at least one element in the chain */
-  assert(deref_chain->length >= 1);
-
-  nir_ssa_def *elem_offset =
- vtn_access_link_as_ssa(b, deref_chain->link[idx],
-base->ptr_type->stride);
-  offset = nir_iadd(&b->nb, offset, elem_offset);
-  idx++;
-   }
 
if (!block_index) {
   assert(base->var);
@@ -182,6 +168,21 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
}
assert(offset);
 
+   if (deref_chain->ptr_as_array) {
+  /* We need ptr_type for the stride */
+  assert(base->ptr_type);
+  /* This must be a pointer to an actual element somewhere */
+  assert(block_index && offset);
+  /* We need at least one element in the chain */
+  assert(deref_chain->length >= 1);
+
+  nir_ssa_def *elem_offset =
+ vtn_access_link_as_ssa(b, deref_chain->link[idx],
+base->ptr_type->stride);
+  offset = nir_iadd(&b->nb, offset, elem_offset);
+  idx++;
+   }
+
for (; idx < deref_chain->length; idx++) {
   switch (glsl_get_base_type(type->type)) {
   case GLSL_TYPE_UINT:
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
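
The ordering problem described in the patch above can be sketched outside of Mesa. This is only a minimal illustration, not the vtn code: struct toy_pointer and toy_dereference() are hypothetical names, and plain integers stand in for the SSA values. It shows that the lazy initialisation of a missing offset has to run before the step that asserts the offset is present.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pointer whose offset may not be set yet. */
struct toy_pointer {
   const int *offset;    /* NULL until the pointer has been resolved */
   int default_offset;   /* value used when resolving from the variable */
   int stride;           /* element stride for the ptr_as_array case */
};

/* Resolve a missing offset first, then run the step that requires it. */
static int
toy_dereference(struct toy_pointer *p, int index, int ptr_as_array)
{
   const int *offset = p->offset;

   /* Step 1: lazily initialise the offset if nobody provided one. */
   if (!offset)
      offset = &p->default_offset;

   int result = *offset;

   /* Step 2: only now is it safe to require a non-null offset. */
   if (ptr_as_array) {
      assert(offset);
      result += index * p->stride;
   }

   return result;
}
```

With the two steps swapped, a pointer created from a variable (offset still NULL) would hit the assert before it had a chance to be resolved.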

[Mesa-dev] [PATCH v2] nir/spirv: fix chain access with different index bit sizes

2017-08-29 Thread Samuel Iglesias Gonsálvez

Currently we only support 32-bit indices/offsets throughout the driver,
so we convert chain indices of any other bit size to 32 bits.

Fixes dEQP-VK.spirv_assembly.instruction.*.indexing.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/spirv/vtn_variables.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 4432e72e54..4f6acd2e07 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -102,10 +102,15 @@ vtn_access_link_as_ssa(struct vtn_builder *b, struct 
vtn_access_link link,
if (link.mode == vtn_access_mode_literal) {
   return nir_imm_int(&b->nb, link.id * stride);
} else if (stride == 1) {
-  return vtn_ssa_value(b, link.id)->def;
+   nir_ssa_def *ssa = vtn_ssa_value(b, link.id)->def;
+   if (ssa->bit_size != 32)
+  ssa = nir_i2i32(&b->nb, ssa);
+  return ssa;
} else {
-  return nir_imul(&b->nb, vtn_ssa_value(b, link.id)->def,
-  nir_imm_int(&b->nb, stride));
+  nir_ssa_def *src0 = vtn_ssa_value(b, link.id)->def;
+  if (src0->bit_size != 32)
+ src0 = nir_i2i32(&b->nb, src0);
+  return nir_imul(&b->nb, src0, nir_imm_int(&b->nb, stride));
}
 }
 
-- 
2.14.1
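
As a rough sketch of the idea in this patch, the snippet below narrows (or widens) an index to 32 bits before multiplying it by the stride. It does not use NIR: struct toy_index, toy_index_to_i32() and toy_access_link_offset() are hypothetical names, and the sketch assumes the index value fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical index tagged with its bit size, loosely mirroring an SSA def. */
struct toy_index {
   unsigned bit_size;   /* 16, 32 or 64 */
   int64_t value;
};

/* Convert any supported index to a signed 32-bit value. */
static int32_t
toy_index_to_i32(struct toy_index idx)
{
   assert(idx.bit_size == 16 || idx.bit_size == 32 || idx.bit_size == 64);

   if (idx.bit_size == 16)
      return (int32_t)(int16_t)idx.value;   /* sign-extend 16 -> 32 */

   return (int32_t)idx.value;               /* 32 stays, 64 is narrowed */
}

/* Offset contributed by one access-chain link: index * stride, in 32 bits. */
static int32_t
toy_access_link_offset(struct toy_index idx, int32_t stride)
{
   return toy_index_to_i32(idx) * stride;
}
```

Doing the conversion once, at the point where the index enters the offset computation, means everything after it only ever sees 32-bit values.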


Re: [Mesa-dev] [PATCH] nir/spirv: add support for chain access with different index bit sizes

2017-08-28 Thread Samuel Iglesias Gonsálvez

On Mon, 2017-08-28 at 07:08 -0700, Jason Ekstrand wrote:
> On August 28, 2017 1:18:33 AM Samuel Iglesias Gonsálvez 
> <sigles...@igalia.com> wrote:
> 
> > Fixes dEQP-VK.spirv_assembly.instruction.*.indexing.*
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> >  src/compiler/spirv/vtn_variables.c | 31
> > +--
> >  1 file changed, 29 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/compiler/spirv/vtn_variables.c 
> > b/src/compiler/spirv/vtn_variables.c
> > index 4432e72e54..c86f7d5c3a 100644
> > --- a/src/compiler/spirv/vtn_variables.c
> > +++ b/src/compiler/spirv/vtn_variables.c
> > @@ -104,8 +104,20 @@ vtn_access_link_as_ssa(struct vtn_builder *b,
> > struct 
> > vtn_access_link link,
> > } else if (stride == 1) {
> >    return vtn_ssa_value(b, link.id)->def;
> > } else {
> > -  return nir_imul(&b->nb, vtn_ssa_value(b, link.id)->def,
> > -  nir_imm_int(&b->nb, stride));
> > +  nir_ssa_def *src0 = vtn_ssa_value(b, link.id)->def;
> > +  nir_ssa_def *src1;
> > +  switch (src0->bit_size) {
> > +  case 64:
> > + src1 = nir_imm_int64(&b->nb, stride);
> 
> There are lots of places in NIR that assume UBO/SSBO offsets and
> array 
> indices are 32-bit.  It's probably safer to just force it to 32-bit
> instead 
> of trying to support 64 for now.
> 

OK, I am going to send a patch soon.

Sam

> > + break;
> > +  case 32:
> > + src1 = nir_imm_int(&b->nb, stride);
> > + break;
> > +  default:
> > + unreachable("Type not supported");
> > +  }
> > +
> > +  return nir_imul(&b->nb, src0, src1);
> > }
> >  }
> > 
> > @@ -189,6 +201,21 @@ vtn_ssa_offset_pointer_dereference(struct
> > vtn_builder *b,
> >    case GLSL_TYPE_ARRAY: {
> >   nir_ssa_def *elem_offset =
> >  vtn_access_link_as_ssa(b, deref_chain->link[idx],
> > type->stride);
> > + if (elem_offset->bit_size != offset->bit_size) {
> > +switch (elem_offset->bit_size) {
> > +case 64:
> > +   offset = nir_i2i64(&b->nb, offset);
> > +   break;
> > +case 32:
> > +   offset = nir_i2i32(&b->nb, offset);
> > +   break;
> > +case 16:
> > +   offset = nir_i2i16(&b->nb, offset);
> > +   break;
> > +default:
> > +   unreachable("Type not supported");
> > +}
> > + }
> >   offset = nir_iadd(&b->nb, offset, elem_offset);
> >   type = type->array_element;
> >   break;
> > --
> > 2.14.1
> > 


[Mesa-dev] [PATCH] nir/spirv: add support for chain access with different index bit sizes

2017-08-28 Thread Samuel Iglesias Gonsálvez

Fixes dEQP-VK.spirv_assembly.instruction.*.indexing.*

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/compiler/spirv/vtn_variables.c | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 4432e72e54..c86f7d5c3a 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -104,8 +104,20 @@ vtn_access_link_as_ssa(struct vtn_builder *b, struct 
vtn_access_link link,
} else if (stride == 1) {
   return vtn_ssa_value(b, link.id)->def;
} else {
-  return nir_imul(&b->nb, vtn_ssa_value(b, link.id)->def,
-  nir_imm_int(&b->nb, stride));
+  nir_ssa_def *src0 = vtn_ssa_value(b, link.id)->def;
+  nir_ssa_def *src1;
+  switch (src0->bit_size) {
+  case 64:
+ src1 = nir_imm_int64(&b->nb, stride);
+ break;
+  case 32:
+ src1 = nir_imm_int(&b->nb, stride);
+ break;
+  default:
+ unreachable("Type not supported");
+  }
+
+  return nir_imul(&b->nb, src0, src1);
}
 }
 
@@ -189,6 +201,21 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
   case GLSL_TYPE_ARRAY: {
  nir_ssa_def *elem_offset =
 vtn_access_link_as_ssa(b, deref_chain->link[idx], type->stride);
+ if (elem_offset->bit_size != offset->bit_size) {
+switch (elem_offset->bit_size) {
+case 64:
+   offset = nir_i2i64(&b->nb, offset);
+   break;
+case 32:
+   offset = nir_i2i32(&b->nb, offset);
+   break;
+case 16:
+   offset = nir_i2i16(&b->nb, offset);
+   break;
+default:
+   unreachable("Type not supported");
+}
+ }
  offset = nir_iadd(&b->nb, offset, elem_offset);
  type = type->array_element;
  break;
-- 
2.14.1


Re: [Mesa-dev] [PATCH] spirv: Add support for the HelperInvocation builtin

2017-08-21 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Mon, 2017-08-21 at 22:11 -0700, Jason Ekstrand wrote:
> I have no idea how this got missed but it's been missing since
> forever.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/spirv/vtn_variables.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/spirv/vtn_variables.c
> b/src/compiler/spirv/vtn_variables.c
> index 6a8776b..87cb935 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder
> *b,
>    *location = FRAG_RESULT_DEPTH;
>    assert(*mode == nir_var_shader_out);
>    break;
> +   case SpvBuiltInHelperInvocation:
> +  *location = SYSTEM_VALUE_HELPER_INVOCATION;
> +  set_mode_system_value(mode);
> +  break;
> case SpvBuiltInNumWorkgroups:
>    *location = SYSTEM_VALUE_NUM_WORK_GROUPS;
>    set_mode_system_value(mode);
> @@ -1177,7 +1181,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
>    *location = SYSTEM_VALUE_VIEW_INDEX;
>    set_mode_system_value(mode);
>    break;
> -   case SpvBuiltInHelperInvocation:
> default:
>    unreachable("unsupported builtin");
> }


[Mesa-dev] [PATCH v2 3/5] i965/vec4: add support for doing DF register spilling on IVB+

2017-07-19 Thread Samuel Iglesias Gonsálvez

Both the spill and unspill processes assume that SIMD width lowering
and DF scalarization have already been done.

* The spilling process does the following:

  1) Reads the existing content from the scratch memory that
     corresponds to the vertex (inst->group tells us whether we
     are going to write data to the first or the second vertex).
     As the instruction is already scalarized, we don't want to modify
     the existing data of other components. We only read one GRF, as
     we are not going to modify the other (exec_size = 4).
  2) Overwrites the component the spilled instruction writes to.
  3) Does a scratch write to save the updated content of the respective
     vertex to scratch memory.

* Unspilling is implemented as several scratch reads, emitted when we
  find the first instruction whose sources were spilled.
  These scratch reads get the DF data for both vertices
  because we want to have DF data in two consecutive GRFs, even when
  this first instruction only reads one (exec_size = 4). No further
  unspills are needed until we write new content to the
  scratch memory, so we just need to update the register number in
  the affected sources of the following instructions.
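
The three spilling steps above amount to a read-modify-write of one vertex's slot. The following is a minimal sketch of that pattern only; it models scratch memory as a plain array, and struct toy_scratch and toy_spill_component() are hypothetical names that do not exist in the driver.

```c
#include <assert.h>
#include <string.h>

#define TOY_COMPONENTS 4

/* Hypothetical scratch slot: DF components for the two SIMD4x2 vertices. */
struct toy_scratch {
   double vertex[2][TOY_COMPONENTS];
};

/* Spill one scalarized component without disturbing the others. */
static void
toy_spill_component(struct toy_scratch *scratch, unsigned vertex,
                    unsigned component, double value)
{
   double tmp[TOY_COMPONENTS];

   assert(vertex < 2 && component < TOY_COMPONENTS);

   /* 1) Read the existing content of this vertex only. */
   memcpy(tmp, scratch->vertex[vertex], sizeof(tmp));

   /* 2) Overwrite the component the spilled instruction writes to. */
   tmp[component] = value;

   /* 3) Write the updated content of this vertex back to scratch. */
   memcpy(scratch->vertex[vertex], tmp, sizeof(tmp));
}
```

The other vertex is never read or written, which matches the "we only read one GRF" remark in step 1.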

v2:

- Change mlen only in emit_scratch_{read, write}
- Allow partial DF writes/reads spilling on IVB+.
- Modify emit_scratch_write() to mark the partial DF read case.
- Fix size_written, it is in byte units (Curro)
- Don't do shuffling on emit_scratch_read().
- Don't do shuffling on emit_scratch_write().
- Simplify emit_scratch_read() changes.
- Simplify emit_scratch_write() changes.
- Merge reladdr changes.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4.cpp  |   2 +
 src/intel/compiler/brw_vec4.h|   9 ++-
 src/intel/compiler/brw_vec4_reg_allocate.cpp |   4 +-
 src/intel/compiler/brw_vec4_visitor.cpp  | 112 ---
 4 files changed, 109 insertions(+), 18 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 410922c62b..459e37f2f5 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -338,6 +338,8 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst)
case SHADER_OPCODE_GEN4_SCRATCH_READ:
   return 2;
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
+  if (devinfo->gen >= 7 && type_sz(inst->dst.type) == 8)
+ return 2;
   return 3;
case GS_OPCODE_URB_WRITE:
case GS_OPCODE_URB_WRITE_ALLOCATE:
diff --git a/src/intel/compiler/brw_vec4.h b/src/intel/compiler/brw_vec4.h
index d828da02ea..fafad1291b 100644
--- a/src/intel/compiler/brw_vec4.h
+++ b/src/intel/compiler/brw_vec4.h
@@ -291,11 +291,12 @@ public:
src_reg get_scratch_offset(bblock_t *block, vec4_instruction *inst,
  src_reg *reladdr, int reg_offset);
void emit_scratch_read(bblock_t *block, vec4_instruction *inst,
- dst_reg dst,
- src_reg orig_src,
- int base_offset);
+  dst_reg dst,
+  src_reg orig_src,
+  int base_offset,
+  bool resolve_reladdr);
void emit_scratch_write(bblock_t *block, vec4_instruction *inst,
-  int base_offset);
+  int base_offset, bool resolve_reladdr);
void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
dst_reg dst,
src_reg orig_src,
diff --git a/src/intel/compiler/brw_vec4_reg_allocate.cpp 
b/src/intel/compiler/brw_vec4_reg_allocate.cpp
index a0ba77b867..bed3471159 100644
--- a/src/intel/compiler/brw_vec4_reg_allocate.cpp
+++ b/src/intel/compiler/brw_vec4_reg_allocate.cpp
@@ -526,7 +526,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
temp.offset = 0;
temp.swizzle = BRW_SWIZZLE_XYZW;
emit_scratch_read(block, inst,
- dst_reg(temp), inst->src[i], spill_offset);
+ dst_reg(temp), inst->src[i], spill_offset, 
false);
temp.offset = inst->src[i].offset;
 }
 assert(scratch_reg != -1);
@@ -535,7 +535,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
   }
 
   if (inst->dst.file == VGRF && inst->dst.nr == spill_reg_nr) {
- emit_scratch_write(block, inst, spill_offset);
+ emit_scratch_write(block, inst, spill_offset, false);
  scratch_reg = inst->dst.nr;
   }
}
diff --git a/src/intel/compiler/brw_vec4_visitor.cpp 
b/src/intel/compiler/brw_vec4_visitor.cpp
index 22ee4dd1c4..d798db8f17 100644
--- a/src/intel/compiler/brw_vec4_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_visitor.cpp
@@ -1478,8 +1478,8 @@ vec4_visitor::get_scratch_offset(bblock_t *block, 
vec4_instruction *inst,
  */
 v

[Mesa-dev] [PATCH v2 2/5] i965/vec4/generator: use 1-Oword Block Write messages for DF scratch write

2017-07-19 Thread Samuel Iglesias Gonsálvez

v2:
- Enable partial DF on HSW+ in emit_1grf_df_ivb_scratch_read()
- Copy the data read by first 1-Oword Block read as UD instead
  of DF, because on HSW+ we can break regioning rules.

v3:
- Update the calls to brw_oword_block_*_scratch().
- Remove changes in generate_scratch_read().
- Fix offset when emitting 1-Oword Block Write messages, so we
  don't need to shuffle data.
- Remove DF_IVB_SCRATCH_READ() and emit_1grf_df_ivb_scratch_read()
- Remove VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_{LOW,HIGH} opcodes.
- Add support for Haswell.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4_generator.cpp | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/src/intel/compiler/brw_vec4_generator.cpp 
b/src/intel/compiler/brw_vec4_generator.cpp
index 334933d15a..c0ceacd9aa 100644
--- a/src/intel/compiler/brw_vec4_generator.cpp
+++ b/src/intel/compiler/brw_vec4_generator.cpp
@@ -1192,6 +1192,65 @@ generate_scratch_write(struct brw_codegen *p,
struct brw_reg header = brw_vec8_grf(0, 0);
bool write_commit;
 
+   if (devinfo->gen >= 7 && type_sz(src.type) == 8) {
+  bool partial_df = inst->exec_size < 8;
+  brw_set_default_access_mode(p, BRW_ALIGN_1);
+
+  if (!partial_df || inst->group == 0) {
+ for (int i = 0; i < 2; i++) {
+brw_set_default_exec_size(p, BRW_EXECUTE_4);
+brw_set_default_mask_control(p, true);
+struct brw_reg temp =
+   retype(suboffset(src, i * 16 / type_sz(src.type)), 
BRW_REGISTER_TYPE_UD);
+temp = stride(temp, 4, 4, 1);
+
+brw_MOV(p, brw_uvec_mrf(4, inst->base_mrf + 1, 0),
+temp);
+brw_set_default_mask_control(p, inst->force_writemask_all);
+brw_set_default_exec_size(p, BRW_EXECUTE_8);
+
+/* Offset in OWORDs */
+brw_oword_block_write_scratch(p, brw_message_reg(inst->base_mrf),
+  1, 32*inst->offset + 16*i);
+ }
+  }
+
+  if (!partial_df) {
+ /* HSW can do full DF scratch writes, however we split the writes in
+  * four 1-OWord messages: two for the first GRF, two for the second.
+  *
+  * In order to emit properly the 1-OWord messages for the second GRF,
+  * we need to set the default group (which sets the nibble control)
+  * for them. We also need to fix the source register to pick the data.
+  */
+ src = suboffset(src, 32 / type_sz(src.type));
+ brw_set_default_group(p, 4);
+  }
+
+  if (!partial_df || inst->group != 0) {
+ for (int i = 0; i < 2; i++) {
+brw_set_default_exec_size(p, BRW_EXECUTE_4);
+brw_set_default_mask_control(p, true);
+struct brw_reg temp =
+   retype(suboffset(src, i * 16 / type_sz(src.type)), 
BRW_REGISTER_TYPE_UD);
+temp = stride(temp, 4, 4, 1);
+
+brw_MOV(p, brw_uvec_mrf(4, inst->base_mrf + 1, 4),
+temp);
+
+brw_set_default_mask_control(p, inst->force_writemask_all);
+brw_set_default_exec_size(p, BRW_EXECUTE_8);
+
+/* Offset in OWORDs */
+brw_oword_block_write_scratch(p, brw_message_reg(inst->base_mrf),
+  1, 32*inst->offset + 16*i + 32);
+ }
+  }
+  brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
+  brw_set_default_access_mode(p, BRW_ALIGN_16);
+  return;
+   }
+
/* If the instruction is predicated, we'll predicate the send, not
 * the header setup.
 */
-- 
2.11.0
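
To make the offset arithmetic of the four 1-OWord writes easier to follow, here is a small sketch that reproduces the two expressions used in the patch, 32*inst->offset + 16*i for the first GRF and the same plus 32 for the second. toy_oword_write_offset() is a hypothetical helper, and the sizes (16 bytes per OWord, 32 bytes per GRF) and the reading of the result as a byte offset are assumptions of the sketch.

```c
#include <assert.h>

/* Assumed sizes: an OWord is 16 bytes, a GRF is 32 bytes (two OWords). */
enum { TOY_OWORD_BYTES = 16, TOY_GRF_BYTES = 32 };

/*
 * Offset of one 1-OWord write.
 *   inst_offset: register offset of the spilled value (inst->offset)
 *   grf:         0 for the first GRF, 1 for the second
 *   i:           0 for the low OWord of that GRF, 1 for the high one
 */
static unsigned
toy_oword_write_offset(unsigned inst_offset, unsigned grf, unsigned i)
{
   assert(grf < 2 && i < 2);
   return TOY_GRF_BYTES * inst_offset +
          TOY_GRF_BYTES * grf +
          TOY_OWORD_BYTES * i;
}
```

For inst_offset 0 this yields 0, 16, 32 and 48, so the four messages cover two consecutive GRFs without overlapping.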


[Mesa-dev] [PATCH v2 1/5] i965/eu: add support for 1-OWord Block Read/Write messages

2017-07-19 Thread Samuel Iglesias Gonsálvez

v2:
- Use nibctrl and the number of written/read owords to detect
each case of a 1-OWord Block Read/Write (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_eu.h | 14 +-
 src/intel/compiler/brw_eu_emit.c| 46 +
 src/intel/compiler/brw_fs_generator.cpp |  4 +--
 3 files changed, 44 insertions(+), 20 deletions(-)

diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
index a3a9c63239..de8470b4b5 100644
--- a/src/intel/compiler/brw_eu.h
+++ b/src/intel/compiler/brw_eu.h
@@ -342,15 +342,15 @@ void brw_oword_block_read(struct brw_codegen *p,
 unsigned brw_scratch_surface_idx(const struct brw_codegen *p);
 
 void brw_oword_block_read_scratch(struct brw_codegen *p,
- struct brw_reg dest,
- struct brw_reg mrf,
- int num_regs,
- unsigned offset);
+  struct brw_reg dest,
+  struct brw_reg mrf,
+  int num_owords,
+  unsigned offset);
 
 void brw_oword_block_write_scratch(struct brw_codegen *p,
-  struct brw_reg mrf,
-  int num_regs,
-  unsigned offset);
+   struct brw_reg mrf,
+   int num_owords,
+   unsigned offset);
 
 void gen7_block_read_scratch(struct brw_codegen *p,
  struct brw_reg dest,
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 0b0d67a5c5..956ef263a2 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -2133,9 +2133,9 @@ brw_scratch_surface_idx(const struct brw_codegen *p)
  * register spilling.
  */
 void brw_oword_block_write_scratch(struct brw_codegen *p,
-  struct brw_reg mrf,
-  int num_regs,
-  unsigned offset)
+   struct brw_reg mrf,
+   int num_owords,
+   unsigned offset)
 {
const struct gen_device_info *devinfo = p->devinfo;
const unsigned target_cache =
@@ -2149,7 +2149,7 @@ void brw_oword_block_write_scratch(struct brw_codegen *p,
 
mrf = retype(mrf, BRW_REGISTER_TYPE_UD);
 
-   const unsigned mlen = 1 + num_regs;
+   const unsigned mlen = 1 + MAX2(1, num_owords / 2);
 
/* Set up the message header.  This is g0, with g0.2 filled with
 * the offset.  We don't want to leave our offset around in g0 or
@@ -2180,6 +2180,18 @@ void brw_oword_block_write_scratch(struct brw_codegen *p,
   int send_commit_msg;
   struct brw_reg src_header = retype(brw_vec8_grf(0, 0),
 BRW_REGISTER_TYPE_UW);
+  int msg_control = BRW_DATAPORT_OWORD_BLOCK_DWORDS(num_owords * 4);
+
+  /* By default for 1-oword, msg_control = 
BRW_DATAPORT_OWORD_BLOCK_1_OWORDLOW,
+   * fix it when we are writing the high part.
+   */
+  if (num_owords == 1 && brw_inst_nib_control(devinfo, insn) != 0) {
+ msg_control = BRW_DATAPORT_OWORD_BLOCK_1_OWORDHIGH;
+ /* The messages only work with group == 0. We use the group to know which
+  * message to emit (1-OWORD LOW or 1-OWORD HIGH), so reset it to zero.
+  */
+ brw_inst_set_group(devinfo, insn, 0);
+  }
 
   brw_inst_set_compression(devinfo, insn, false);
 
@@ -2223,7 +2235,7 @@ void brw_oword_block_write_scratch(struct brw_codegen *p,
   brw_set_dp_write_message(p,
   insn,
brw_scratch_surface_idx(p),
-  BRW_DATAPORT_OWORD_BLOCK_DWORDS(num_regs * 8),
+  msg_control,
   msg_type,
target_cache,
   mlen,
@@ -2245,10 +2257,10 @@ void brw_oword_block_write_scratch(struct brw_codegen 
*p,
  */
 void
 brw_oword_block_read_scratch(struct brw_codegen *p,
-struct brw_reg dest,
-struct brw_reg mrf,
-int num_regs,
-unsigned offset)
+ struct brw_reg dest,
+ struct brw_reg mrf,
+ int num_owords,
+ unsigned offset)
 {
const struct gen_device_info *devinfo = p->devinfo;
 
@@ -2269,7 +2281,7 @@ brw_oword_block_read_scratch(struct brw_codegen *p,
}
dest = retype(dest, BRW_REGISTER_TYPE_UW);
 
-   const unsigned rlen = num_regs;
+   const unsigned rlen = MAX2(1, num_ow

[Mesa-dev] [PATCH v2 5/5] i965/vec4: allow partial DF register spilling

2017-07-19 Thread Samuel Iglesias Gonsálvez

v2:
- Enable spilling for partial DF reads/writes on HSW+

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4_reg_allocate.cpp | 54 
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_reg_allocate.cpp 
b/src/intel/compiler/brw_vec4_reg_allocate.cpp
index a6f1070ebd..3ad18b12bb 100644
--- a/src/intel/compiler/brw_vec4_reg_allocate.cpp
+++ b/src/intel/compiler/brw_vec4_reg_allocate.cpp
@@ -411,17 +411,21 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, 
bool *no_spill)
spill_costs[inst->src[i].nr] +=
   loop_scale * spill_cost_for_type(inst->src[i].type);
if (inst->src[i].reladdr ||
-   inst->src[i].offset >= REG_SIZE)
+   (inst->src[i].offset >= REG_SIZE &&
+(type_sz(inst->src[i].type) != 8 ||
+ !(inst->src[i].offset == 32 && inst->group == 4))))
   no_spill[inst->src[i].nr] = true;
 
-   /* We don't support unspills of partial DF reads.
+   /* For execsize == 8, our 64-bit unspills are implemented with
+* two 32-bit scratch messages, each one reading that for both
+* SIMD4x2 threads that we need to shuffle into correct 64-bit
+* data. Ensure that we are reading data for both threads.
 *
-* Our 64-bit unspills are implemented with two 32-bit scratch
-* messages, each one reading that for both SIMD4x2 threads that
-* we need to shuffle into correct 64-bit data. Ensure that we
-* are reading data for both threads.
+* For execsize == 4, it is similar but using 1-Oword block
+* read messages and we don't need to shuffle data.
 */
-   if (type_sz(inst->src[i].type) == 8 && inst->exec_size != 8)
+   if (type_sz(inst->src[i].type) == 8 &&
+   inst->exec_size != 8 && inst->exec_size != 4)
   no_spill[inst->src[i].nr] = true;
 }
 
@@ -439,16 +443,21 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, 
bool *no_spill)
   if (inst->dst.file == VGRF && !no_spill[inst->dst.nr]) {
  spill_costs[inst->dst.nr] +=
 loop_scale * spill_cost_for_type(inst->dst.type);
- if (inst->dst.reladdr || inst->dst.offset >= REG_SIZE)
+ if (inst->dst.reladdr ||
+ (inst->dst.offset >= REG_SIZE &&
+  (type_sz(inst->dst.type) != 8 ||
+   !(inst->dst.offset == 32 && inst->group == 4))))
 no_spill[inst->dst.nr] = true;
 
- /* We don't support spills of partial DF writes.
+ /* For execsize == 8, our 64-bit spills are implemented with two
+  * 32-bit scratch messages, each one writing that for both SIMD4x2
+  * threads. Ensure that we are writing data for both threads.
   *
-  * Our 64-bit spills are implemented with two 32-bit scratch messages,
-  * each one writing that for both SIMD4x2 threads. Ensure that we
-  * are writing data for both threads.
+  * For execsize == 4, it is similar but using 1-Oword block
+  * write messages.
   */
- if (type_sz(inst->dst.type) == 8 && inst->exec_size != 8)
+ if (type_sz(inst->dst.type) == 8 &&
+ inst->exec_size != 8 && inst->exec_size != 4)
 no_spill[inst->dst.nr] = true;
 
  /* We can't spill registers that mix 32-bit and 64-bit access (that
@@ -514,11 +523,25 @@ vec4_visitor::spill_reg(int spill_reg_nr)
 
/* Generate spill/unspill instructions for the objects being spilled. */
int scratch_reg = -1;
+   bool do_partial_df_scratch_read = false;
foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
   for (unsigned int i = 0; i < 3; i++) {
  if (inst->src[i].file == VGRF && inst->src[i].nr == spill_reg_nr) {
+/* DF scratch reads are not actual partial reads because we are
+ * going to read both GRFs in the first read instruction.
+ * Because of that, we will skip the scratch read of the other split
+ * instruction (if any), as it can reuse the read value. We check
+ * the value of done_scratch_read to know if we need to do scratch
+ * read or not.
+ */
+bool do_df_scratch_read = devinfo->gen >= 7 &&
+   type_sz(inst->src[i].type) == 8 &&
+   (inst->exec_size != 4 || do_partial_df_scratch_read);
+
 if (scratch_reg == -1 ||
-!can_use_scratc
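
The offset condition added to evaluate_spill_costs() in this patch is easier to read as a standalone predicate. This is a sketch under assumptions: toy_offset_blocks_spill() is a hypothetical name, TOY_REG_SIZE assumes a 32-byte GRF, and the reladdr part of the original condition is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_REG_SIZE 32   /* assumed GRF size in bytes */

/*
 * Returns true when an access at 'offset' prevents spilling the register.
 * An offset past the first GRF blocks spilling unless it is the second
 * half of a split DF instruction (8-byte type, offset 32, group 4).
 */
static bool
toy_offset_blocks_spill(unsigned offset, unsigned type_size, unsigned group)
{
   assert(type_size == 4 || type_size == 8);

   const bool second_half_df =
      type_size == 8 && offset == 32 && group == 4;

   return offset >= TOY_REG_SIZE && !second_half_df;
}
```

This is the De Morgan form of the expression in the diff: "type size is not 8, or not (offset 32 and group 4)" is the same as "not a second-half DF access".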

[Mesa-dev] [PATCH v2 4/5] i965/vec4: fix can_use_scratch_for_source() to support partial DFs

2017-07-19 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4_reg_allocate.cpp | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_reg_allocate.cpp 
b/src/intel/compiler/brw_vec4_reg_allocate.cpp
index bed3471159..a6f1070ebd 100644
--- a/src/intel/compiler/brw_vec4_reg_allocate.cpp
+++ b/src/intel/compiler/brw_vec4_reg_allocate.cpp
@@ -301,7 +301,7 @@ vec4_visitor::reg_allocate()
  */
 static bool
 can_use_scratch_for_source(const vec4_instruction *inst, unsigned i,
-   unsigned scratch_reg)
+   unsigned scratch_reg, bool partial_df_read)
 {
assert(inst->src[i].file == VGRF);
bool prev_inst_read_scratch_reg = false;
@@ -319,12 +319,14 @@ can_use_scratch_for_source(const vec4_instruction *inst, 
unsigned i,
 
   /* If the previous instruction writes to scratch_reg then we can reuse
* it if the write is not conditional and the channels we write are
-   * compatible with our read mask
+   * compatible with our read mask.
+   *
+   * Ignore partial DF read case as we will read the data for both 
vertices.
*/
   if (prev_inst->dst.file == VGRF && prev_inst->dst.nr == scratch_reg) {
  return (!prev_inst->predicate || prev_inst->opcode == BRW_OPCODE_SEL) 
&&
-(brw_mask_for_swizzle(inst->src[i].swizzle) &
- ~prev_inst->dst.writemask) == 0;
+((brw_mask_for_swizzle(inst->src[i].swizzle) &
+  ~prev_inst->dst.writemask) == 0) && !partial_df_read;
   }
 
   /* Skip scratch read/writes so that instructions generated by spilling
@@ -403,7 +405,9 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool 
*no_spill)
  * previous instruction, in which case we'll just reuse the scratch
  * reg for this instruction.
  */
-if (!can_use_scratch_for_source(inst, i, inst->src[i].nr)) {
+bool partial_df_read = inst->exec_size == 4 &&
+   type_sz(inst->src[i].type) == 8;
+if (!can_use_scratch_for_source(inst, i, inst->src[i].nr, 
partial_df_read)) {
spill_costs[inst->src[i].nr] +=
   loop_scale * spill_cost_for_type(inst->src[i].type);
if (inst->src[i].reladdr ||
@@ -514,7 +518,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
   for (unsigned int i = 0; i < 3; i++) {
  if (inst->src[i].file == VGRF && inst->src[i].nr == spill_reg_nr) {
 if (scratch_reg == -1 ||
-!can_use_scratch_for_source(inst, i, scratch_reg)) {
+!can_use_scratch_for_source(inst, i, scratch_reg, false)) {
/* We need to unspill anyway so make sure we read the full vec4
 * in any case. This way, the cached register can be reused
 * for consecutive instructions that read different channels of
-- 
2.11.0
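
The reuse test changed in this patch boils down to a mask check: every channel we read must have been written by the previous instruction, and the read must not be a partial DF read. A minimal sketch follows; toy_can_reuse() is a hypothetical name, the predicate/SEL part of the original condition is omitted, and the masks are assumed to use one bit per channel (X = 1, Y = 2, Z = 4, W = 8).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * read_mask:       channels the current source reads
 * prev_writemask:  channels the previous instruction wrote
 * partial_df_read: true for a 64-bit source read with exec_size == 4
 */
static bool
toy_can_reuse(unsigned read_mask, unsigned prev_writemask,
              bool partial_df_read)
{
   assert(read_mask <= 0xf && prev_writemask <= 0xf);

   /* Any channel read but not written forces a new scratch read. */
   return (read_mask & ~prev_writemask) == 0 && !partial_df_read;
}
```

For example, reading XY after a write to XYZ can reuse the register, while reading XW after the same write cannot, because W was never written.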


Re: [Mesa-dev] [PATCH v2 2/2] anv: ensure device name contains terminating character

2017-07-16 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Mon, 2017-07-17 at 00:29 +0100, Lionel Landwerlin wrote:
> v2: Use sizeof() (Chris)
> 
> CID: 1415113
> Reported-by: Grazvydas Ignotas <nota...@gmail.com>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
> ---
>  src/intel/vulkan/anv_device.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> index 34d4a675481..7e3eae43081 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -891,8 +891,8 @@ void anv_GetPhysicalDeviceProperties(
>    .sparseProperties = {0}, /* Broadwell doesn't do sparse. */
> };
> 
> -   strncpy(pProperties->deviceName, pdevice->name,
> -   VK_MAX_PHYSICAL_DEVICE_NAME_SIZE);
> +   snprintf(pProperties->deviceName, sizeof(pProperties-
> >deviceName),
> +"%s", pdevice->name);
> memcpy(pProperties->pipelineCacheUUID,
>    pdevice->pipeline_cache_uuid, VK_UUID_SIZE);
>  }
> --
> 2.13.2
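
For context on why the quoted change matters: strncpy() does not write a terminating NUL when the source is at least as long as the destination, whereas snprintf() always terminates as long as the size is non-zero. The sketch below shows the snprintf() form on a deliberately small buffer; TOY_NAME_SIZE, struct toy_properties and toy_set_device_name() are hypothetical and only stand in for the Vulkan structure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_NAME_SIZE 8   /* hypothetical, much smaller than the real limit */

struct toy_properties {
   char deviceName[TOY_NAME_SIZE];
};

/* Copy the name, truncating if needed, and always NUL-terminate. */
static void
toy_set_device_name(struct toy_properties *props, const char *name)
{
   assert(props && name);
   snprintf(props->deviceName, sizeof(props->deviceName), "%s", name);
}
```

Using sizeof() on the destination array, as suggested in the v2 note, keeps the size in sync with the declaration instead of repeating a constant.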


Re: [Mesa-dev] [PATCH 1/2] i965: miptree: silence coverity warning

2017-07-16 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Sun, 2017-07-16 at 15:31 +0100, Lionel Landwerlin wrote:
> This probably can't happen, but we're better off with initialized
> variables.
> 
> CID: 1415114
> Signed-off-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index e7ebc29b59d..83d3d8204aa 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -990,7 +990,7 @@ miptree_create_for_planar_image(struct
> brw_context *brw,
>  __DRIimage *image, GLenum target)
>  {
> struct intel_image_format *f = image->planar_format;
> -   struct intel_mipmap_tree *planar_mt;
> +   struct intel_mipmap_tree *planar_mt = NULL;
>  
> for (int i = 0; i < f->nplanes; i++) {
>    const int index = f->planes[i].buffer_index;


Re: [Mesa-dev] [PATCH 0/8] vulkan: Update to 1.0.54

2017-07-14 Thread Samuel Iglesias Gonsálvez

On Fri, 2017-07-14 at 09:42 +0200, Samuel Iglesias Gonsálvez wrote:
> On Thu, 2017-07-13 at 12:32 -0700, Jason Ekstrand wrote:
> > This little series updates us to the 1.0.54 headers and XML.  The
> > major
> > change here is that 1.0.54 dropped the VK_KHX_external* extensions
> > and
> > replaced them with VK_KHR variants.  The first three patches drop
> > support
> > for the KHX versions from anv and radv and the last 3 implement the
> > KHR
> > version of external memory and the related dependent extensions.
> 
> Patches 1, 2 and 4-8 are:
> 
> Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> 
> BTW, you removed VK_KHX_external_semaphore_* but I don't see a patch
> adding it again as KHR.
> 

Forget it, I found the patch series :)

Sam

> Sam
> 
> > 
> > Cc: Dave Airlie <airl...@redhat.com>
> > 
> > Jason Ekstrand (8):
> >   anv: Drop support for VK_KHX_external_memory_*
> >   anv: Drop support for VK_KHX_external_semaphore_*
> >   radv: Drop support for VK_KHX_external_memory_*
> >   vulkan: Update to the new 1.0.54 spec XML and headers
> >   anv: Advertise version 1.0.54
> >   anv: Implement VK_KHR_get_memory_requirements2
> >   anv: Implement VK_KHR_dedicated_allocation
> >   anv: Implement VK_KHR_external_memory_*
> > 
> >  include/vulkan/vulkan.h | 1284
> > +--
> >  src/amd/vulkan/radv_device.c|   68 +-
> >  src/amd/vulkan/radv_entrypoints_gen.py  |3 -
> >  src/amd/vulkan/radv_formats.c   |  112 ---
> >  src/amd/vulkan/radv_image.c |7 +-
> >  src/intel/vulkan/anv_allocator.c|6 +-
> >  src/intel/vulkan/anv_device.c   |  129 +++-
> >  src/intel/vulkan/anv_entrypoints_gen.py |   11 +-
> >  src/intel/vulkan/anv_formats.c  |   42 +-
> >  src/intel/vulkan/anv_queue.c|  115 +--
> >  src/intel/vulkan/dev_icd.json.in|2 +-
> >  src/intel/vulkan/intel_icd.json.in  |2 +-
> >  src/vulkan/registry/vk.xml  | 1208 +++
> > --
> >  13 files changed, 1837 insertions(+), 1152 deletions(-)


Re: [Mesa-dev] [PATCH 0/8] vulkan: Update to 1.0.54

2017-07-14 Thread Samuel Iglesias Gonsálvez

On Thu, 2017-07-13 at 12:32 -0700, Jason Ekstrand wrote:
> This little series updates us to the 1.0.54 headers and XML.  The
> major
> change here is that 1.0.54 dropped the VK_KHX_external* extensions
> and
> replaced them with VK_KHR variants.  The first three patches drop
> support
> for the KHX versions from anv and radv and the last 3 implement the
> KHR
> version of external memory and the related dependent extensions.

Patches 1, 2 and 4-8 are:

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

BTW, you removed VK_KHX_external_semaphore_* but I don't see a patch
adding it again as KHR.

Sam

> 
> Cc: Dave Airlie <airl...@redhat.com>
> 
> Jason Ekstrand (8):
>   anv: Drop support for VK_KHX_external_memory_*
>   anv: Drop support for VK_KHX_external_semaphore_*
>   radv: Drop support for VK_KHX_external_memory_*
>   vulkan: Update to the new 1.0.54 spec XML and headers
>   anv: Advertise version 1.0.54
>   anv: Implement VK_KHR_get_memory_requirements2
>   anv: Implement VK_KHR_dedicated_allocation
>   anv: Implement VK_KHR_external_memory_*
> 
>  include/vulkan/vulkan.h | 1284
> +--
>  src/amd/vulkan/radv_device.c|   68 +-
>  src/amd/vulkan/radv_entrypoints_gen.py  |3 -
>  src/amd/vulkan/radv_formats.c   |  112 ---
>  src/amd/vulkan/radv_image.c |7 +-
>  src/intel/vulkan/anv_allocator.c|6 +-
>  src/intel/vulkan/anv_device.c   |  129 +++-
>  src/intel/vulkan/anv_entrypoints_gen.py |   11 +-
>  src/intel/vulkan/anv_formats.c  |   42 +-
>  src/intel/vulkan/anv_queue.c|  115 +--
>  src/intel/vulkan/dev_icd.json.in|2 +-
>  src/intel/vulkan/intel_icd.json.in  |2 +-
>  src/vulkan/registry/vk.xml  | 1208 +++
> --
>  13 files changed, 1837 insertions(+), 1152 deletions(-)
> 


[Mesa-dev] [PATCH] anv: check support for enabled features in vkCreateDevice()

2017-06-30 Thread Samuel Iglesias Gonsálvez

From Vulkan spec, 4.2.1. "Device Creation":

  "vkCreateDevice verifies that extensions and features requested in
   the ppEnabledExtensionNames and pEnabledFeatures members of
   pCreateInfo, respectively, are supported by the implementation."

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---

I wrote this patch for the driver but similar code could probably go
to the Vulkan Loader as well.

 src/intel/vulkan/anv_device.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 63f37308c1..df977f394e 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1130,6 +1130,19 @@ VkResult anv_CreateDevice(
  return vk_error(VK_ERROR_EXTENSION_NOT_PRESENT);
}
 
+   /* Check enabled features */
+   if (pCreateInfo->pEnabledFeatures) {
+  VkPhysicalDeviceFeatures supported_features;
+  anv_GetPhysicalDeviceFeatures(physicalDevice, &supported_features);
+  VkBool32 *supported_feature = (VkBool32 *)&supported_features;
+  VkBool32 *enabled_feature = (VkBool32 *)pCreateInfo->pEnabledFeatures;
+  unsigned num_features = sizeof(VkPhysicalDeviceFeatures) / sizeof(VkBool32);
+  for (uint32_t i = 0; i < num_features; i++) {
+ if (enabled_feature[i] && !supported_feature[i])
+return vk_error(VK_ERROR_FEATURE_NOT_PRESENT);
+  }
+   }
+
device = vk_alloc2(&physical_device->instance->alloc, pAllocator,
sizeof(*device), 8,
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE);
-- 
2.11.0
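
Editorial sketch of the check in the patch above: a structure made only of 32-bit booleans is walked as a flat array, and creation fails as soon as a requested feature is not supported. The struct, its members, and the function name are hypothetical miniatures; the trick assumes the structure contains nothing but bool32 members, so it has no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t bool32;   /* stand-in for VkBool32 */

/* Hypothetical miniature of VkPhysicalDeviceFeatures: bool32 members only. */
struct features {
   bool32 robust_buffer_access;
   bool32 geometry_shader;
   bool32 tessellation_shader;
   bool32 shader_float64;
};

/* Returns 1 if every feature requested in `enabled` is set in `supported`. */
static int features_are_supported(const struct features *enabled,
                                  const struct features *supported)
{
   const bool32 *e = (const bool32 *)enabled;
   const bool32 *s = (const bool32 *)supported;
   const size_t n = sizeof(struct features) / sizeof(bool32);

   for (size_t i = 0; i < n; i++) {
      if (e[i] && !s[i])
         return 0;   /* the driver would return VK_ERROR_FEATURE_NOT_PRESENT */
   }
   return 1;
}
```

Walking the structure as an array keeps the check independent of the number of features, so new members do not require touching the loop.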


[Mesa-dev] [PATCH] anv: merge tessellation's primitive mode in merge_tess_info()

2017-06-28 Thread Samuel Iglesias Gonsálvez

SPIR-V tessellation shaders that were created from HLSL will have
the primitive generation domain set in the tessellation control shader
(hull shader in HLSL) instead of the tessellation evaluation shader.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/vulkan/anv_pipeline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index c43915e..76b45b9 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -587,6 +587,7 @@ merge_tess_info(struct shader_info *tes_info,
   tcs_info->tess.spacing == tes_info->tess.spacing);
tes_info->tess.spacing |= tcs_info->tess.spacing;
 
+   tes_info->tess.primitive_mode |= tcs_info->tess.primitive_mode;
tes_info->tess.ccw |= tcs_info->tess.ccw;
tes_info->tess.point_mode |= tcs_info->tess.point_mode;
 }
-- 
2.9.4
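
Editorial sketch of the merge logic touched by the patch above: each tessellation property may be declared in either stage, an unset property is 0, and OR-ing the two copies picks up whichever stage declared it. The struct and function below are hypothetical, trimmed-down stand-ins for Mesa's shader_info and merge_tess_info(), and the numeric values used with them are arbitrary.

```c
#include <assert.h>

/* Hypothetical stand-in for the tessellation fields of shader_info.
 * A value of 0 means "not specified by this stage". */
struct tess_info {
   unsigned primitive_mode;
   unsigned spacing;
   int ccw;
   int point_mode;
};

/* Folds the values declared by the control shader into the evaluation
 * shader's info, so later passes only need to look at `tes`. */
static void merge_tess(struct tess_info *tes, const struct tess_info *tcs)
{
   /* The two stages must agree whenever both specify a value. */
   assert(tcs->spacing == 0 || tes->spacing == 0 ||
          tcs->spacing == tes->spacing);
   tes->spacing |= tcs->spacing;

   tes->primitive_mode |= tcs->primitive_mode;
   tes->ccw |= tcs->ccw;
   tes->point_mode |= tcs->point_mode;
}
```

Without the primitive_mode line, a shader pair whose domain is declared only in the control stage (the HLSL case described above) would leave the evaluation stage with mode 0.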


Re: [Mesa-dev] [PATCH 2/6] i965/vec4/generator: use 1-Oword Block Read/Write messages for DF scratch writes/reads

2017-06-27 Thread Samuel Iglesias Gonsálvez

On Mon, 2017-06-26 at 10:38 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > On Fri, 2017-06-23 at 11:06 -0700, Francisco Jerez wrote:
> > > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > 
> > > > On Thu, 2017-06-22 at 16:25 -0700, Francisco Jerez wrote:
> > > > > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > > > 
> > > > > > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.
> > > > > > com>
> > > > > > ---
> > > > > >  src/intel/compiler/brw_eu_defines.h  |   2 +
> > > > > >  src/intel/compiler/brw_shader.cpp|   5 +
> > > > > >  src/intel/compiler/brw_vec4.cpp  |   7 ++
> > > > > >  src/intel/compiler/brw_vec4.h|   8 ++
> > > > > >  src/intel/compiler/brw_vec4_generator.cpp| 136
> > > > > > +++
> > > > > >  src/intel/compiler/brw_vec4_reg_allocate.cpp |   6 +-
> > > > > >  src/intel/compiler/brw_vec4_visitor.cpp  |  49
> > > > > > ++
> > > > > >  7 files changed, 212 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/src/intel/compiler/brw_eu_defines.h
> > > > > > b/src/intel/compiler/brw_eu_defines.h
> > > > > > index 1af835d47e..3c148de0fa 100644
> > > > > > --- a/src/intel/compiler/brw_eu_defines.h
> > > > > > +++ b/src/intel/compiler/brw_eu_defines.h
> > > > > > @@ -436,6 +436,8 @@ enum opcode {
> > > > > > VEC4_OPCODE_PICK_HIGH_32BIT,
> > > > > > VEC4_OPCODE_SET_LOW_32BIT,
> > > > > > VEC4_OPCODE_SET_HIGH_32BIT,
> > > > > > +   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW,
> > > > > > +   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH,
> > > > > >  
> > > > > 
> > > > > What's the point of introducing two different opcodes with
> > > > > essentially
> > > > > the same semantics (read 32B worth of data) as the current
> > > > > SHADER_OPCODE_GEN4_SCRATCH_READ?
> > > > 
> > > > Originally I had only SHADER_OPCODE_GEN4_SCRATCH_READ but I
> > > > changed
> > > > it
> > > > to don't allocate more registers than needed when doing scratch
> > > > write
> > > > of a partial DF write. Let me explain it:
> > > > 
> > > > When doing spilling, as DF instructions are both split and
> > > > scalarized,
> > > > we read the existing contents in scratch memory, overwrite them
> > > > with
> > > > the destination of the instruction, then emit scratch write.
> > > > Together
> > > > with the fact that I am not shuffling DF data, we only need to
> > > > allocate
> > > > 1 GRF to do so, instead of 2 (if I had emitted
> > > > SHADER_OPCODE_GEN4_SCRATCH_READ), when doing spilling on
> > > > partial DF
> > > > writes.
> > > > 
> > > 
> > > Why would you need to allocate more GRFs for
> > > SHADER_OPCODE_GEN4_SCRATCH_READ?  It also only reads one
> > > register,
> > > which
> > > should be sufficient for a single scalarized instruction as long
> > > as
> > > you
> > > don't shuffle data around -- Have a look at how the FS back-end
> > > addresses this problem.
> > > 
> > 
> > OK
> > 
> > > > >   Is there any downside from using the
> > > > > current opcode with force_writemask_all?  If anything it
> > > > > would
> > > > > give
> > > > > you
> > > > > better performance because you'd only have to set up one
> > > > > header
> > > > > (which
> > > > > stalls the EU pipeline twice), send down one message to the
> > > > > dataport,
> > > > > and avoid stalling to shuffle the data around in the return
> > > > > payload
> > > > > (which prevents your two 1OWORD messages from being pipelined
> > > > > at
> > > > > all).
> > > > > 
> > > > 
> > > > Sorry, I am confused here. Do you mean using
> > > > SHADER_OPCODE_GEN4_SCRATCH_READ as-is, which emits a "OWord

Re: [Mesa-dev] [PATCH 2/6] i965/vec4/generator: use 1-Oword Block Read/Write messages for DF scratch writes/reads

2017-06-26 Thread Samuel Iglesias Gonsálvez

On Fri, 2017-06-23 at 11:06 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > On Thu, 2017-06-22 at 16:25 -0700, Francisco Jerez wrote:
> > > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > 
> > > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > > ---
> > > >  src/intel/compiler/brw_eu_defines.h  |   2 +
> > > >  src/intel/compiler/brw_shader.cpp|   5 +
> > > >  src/intel/compiler/brw_vec4.cpp  |   7 ++
> > > >  src/intel/compiler/brw_vec4.h|   8 ++
> > > >  src/intel/compiler/brw_vec4_generator.cpp| 136
> > > > +++
> > > >  src/intel/compiler/brw_vec4_reg_allocate.cpp |   6 +-
> > > >  src/intel/compiler/brw_vec4_visitor.cpp  |  49 ++
> > > >  7 files changed, 212 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/src/intel/compiler/brw_eu_defines.h
> > > > b/src/intel/compiler/brw_eu_defines.h
> > > > index 1af835d47e..3c148de0fa 100644
> > > > --- a/src/intel/compiler/brw_eu_defines.h
> > > > +++ b/src/intel/compiler/brw_eu_defines.h
> > > > @@ -436,6 +436,8 @@ enum opcode {
> > > > VEC4_OPCODE_PICK_HIGH_32BIT,
> > > > VEC4_OPCODE_SET_LOW_32BIT,
> > > > VEC4_OPCODE_SET_HIGH_32BIT,
> > > > +   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW,
> > > > +   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH,
> > > >  
> > > 
> > > What's the point of introducing two different opcodes with
> > > essentially
> > > the same semantics (read 32B worth of data) as the current
> > > SHADER_OPCODE_GEN4_SCRATCH_READ?
> > 
> > Originally I had only SHADER_OPCODE_GEN4_SCRATCH_READ but I changed
> > it
> > to don't allocate more registers than needed when doing scratch
> > write
> > of a partial DF write. Let me explain it:
> > 
> > When doing spilling, as DF instructions are both split and
> > scalarized,
> > we read the existing contents in scratch memory, overwrite them
> > with
> > the destination of the instruction, then emit scratch write.
> > Together
> > with the fact that I am not shuffling DF data, we only need to
> > allocate
> > 1 GRF to do so, instead of 2 (if I had emitted
> > SHADER_OPCODE_GEN4_SCRATCH_READ), when doing spilling on partial DF
> > writes.
> > 
> 
> Why would you need to allocate more GRFs for
> SHADER_OPCODE_GEN4_SCRATCH_READ?  It also only reads one register,
> which
> should be sufficient for a single scalarized instruction as long as
> you
> don't shuffle data around -- Have a look at how the FS back-end
> addresses this problem.
> 

OK

> > >   Is there any downside from using the
> > > current opcode with force_writemask_all?  If anything it would
> > > give
> > > you
> > > better performance because you'd only have to set up one header
> > > (which
> > > stalls the EU pipeline twice), send down one message to the
> > > dataport,
> > > and avoid stalling to shuffle the data around in the return
> > > payload
> > > (which prevents your two 1OWORD messages from being pipelined at
> > > all).
> > > 
> > 
> > Sorry, I am confused here. Do you mean using
> > SHADER_OPCODE_GEN4_SCRATCH_READ as-is, which emits a "OWord Dual
> > Block
> > Read" message (so only one message)?
> > 
> > If that's the case, then I should shuffle the destination data of
> > the
> > partial DF write, change the 1-Oword block write offsets and so
> > on...
> 
> Why would you need to shuffle any spilled data?  I don't think
> there's
> much of a benefit from shuffling since scratch overwrites need read
> the
> original data for the most part anyway because of writemasking.  In
> fact
> shuffling DF data is probably the reason things blow up right now
> whenever you have mixed DF and single-precision reads or writes to
> the
> same spilled variable, which I guess is the reason you need to look
> for
> those cases and mark them as no_spill...
> 

Right, I don't need to shuffle data for the scratch write.

> > in order to save it inside scratch memory in the proper place to
> > make
> > OWord Dual Block Read work. That would require to some extra
> > instructions, but I don't know if this would give better
> > performance
> > against current implementation or not.
> >

Re: [Mesa-dev] [PATCH 2/6] i965/vec4/generator: use 1-Oword Block Read/Write messages for DF scratch writes/reads

2017-06-23 Thread Samuel Iglesias Gonsálvez

On Thu, 2017-06-22 at 16:25 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> >  src/intel/compiler/brw_eu_defines.h  |   2 +
> >  src/intel/compiler/brw_shader.cpp|   5 +
> >  src/intel/compiler/brw_vec4.cpp  |   7 ++
> >  src/intel/compiler/brw_vec4.h|   8 ++
> >  src/intel/compiler/brw_vec4_generator.cpp| 136
> > +++
> >  src/intel/compiler/brw_vec4_reg_allocate.cpp |   6 +-
> >  src/intel/compiler/brw_vec4_visitor.cpp  |  49 ++
> >  7 files changed, 212 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/intel/compiler/brw_eu_defines.h
> > b/src/intel/compiler/brw_eu_defines.h
> > index 1af835d47e..3c148de0fa 100644
> > --- a/src/intel/compiler/brw_eu_defines.h
> > +++ b/src/intel/compiler/brw_eu_defines.h
> > @@ -436,6 +436,8 @@ enum opcode {
> > VEC4_OPCODE_PICK_HIGH_32BIT,
> > VEC4_OPCODE_SET_LOW_32BIT,
> > VEC4_OPCODE_SET_HIGH_32BIT,
> > +   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW,
> > +   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH,
> >  
> 
> What's the point of introducing two different opcodes with
> essentially
> the same semantics (read 32B worth of data) as the current
> SHADER_OPCODE_GEN4_SCRATCH_READ?

Originally I had only SHADER_OPCODE_GEN4_SCRATCH_READ, but I changed it
to avoid allocating more registers than needed when doing a scratch write
of a partial DF write. Let me explain:

When doing spilling, as DF instructions are both split and scalarized,
we read the existing contents in scratch memory, overwrite them with
the destination of the instruction, then emit scratch write. Together
with the fact that I am not shuffling DF data, we only need to allocate
1 GRF to do so, instead of 2 (if I had emitted
SHADER_OPCODE_GEN4_SCRATCH_READ), when doing spilling on partial DF
writes.

>  Is there any downside from using the
> current opcode with force_writemask_all?  If anything it would give
> you
> better performance because you'd only have to set up one header
> (which
> stalls the EU pipeline twice), send down one message to the dataport,
> and avoid stalling to shuffle the data around in the return payload
> (which prevents your two 1OWORD messages from being pipelined at
> all).
> 

Sorry, I am confused here. Do you mean using
SHADER_OPCODE_GEN4_SCRATCH_READ as-is, which emits a "OWord Dual Block
Read" message (so only one message)?

If that's the case, then I should shuffle the destination data of the
partial DF write, change the 1-Oword block write offsets and so on...
in order to save it inside scratch memory in the proper place to make
OWord Dual Block Read work. That would require some extra
instructions, but I don't know whether this would give better
performance than the current implementation.

Then, why do I need force_writemask_all=true when emitting
SHADER_OPCODE_GEN4_SCRATCH_READ?

I can try this alternative solution if this is what you meant. It has
the advantage of simplifying the changes a lot, which is always great.

Sam

> > FS_OPCODE_DDX_COARSE,
> > FS_OPCODE_DDX_FINE,
> > diff --git a/src/intel/compiler/brw_shader.cpp
> > b/src/intel/compiler/brw_shader.cpp
> > index 53d0742d2e..248feacbd2 100644
> > --- a/src/intel/compiler/brw_shader.cpp
> > +++ b/src/intel/compiler/brw_shader.cpp
> > @@ -296,6 +296,11 @@ brw_instruction_name(const struct
> > gen_device_info *devinfo, enum opcode op)
> > case FS_OPCODE_PACK:
> >    return "pack";
> >  
> > +
> > +   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW:
> > +  return "gen4_scratch_read_1word_low";
> > +   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH:
> > +  return "gen4_scratch_read_1word_high";
> > case SHADER_OPCODE_GEN4_SCRATCH_READ:
> >    return "gen4_scratch_read";
> > case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
> > diff --git a/src/intel/compiler/brw_vec4.cpp
> > b/src/intel/compiler/brw_vec4.cpp
> > index b443effca9..b6d409eea2 100644
> > --- a/src/intel/compiler/brw_vec4.cpp
> > +++ b/src/intel/compiler/brw_vec4.cpp
> > @@ -259,6 +259,8 @@ bool
> >  vec4_instruction::can_do_writemask(const struct gen_device_info
> > *devinfo)
> >  {
> > switch (opcode) {
> > +   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW:
> > +   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH:
> > case SHADER_OPCODE_GEN4_SCRATCH_READ:
> > case VEC4_OPCODE_DOUBLE_TO_F32:
> > case VEC4_OPCODE_DOUBLE_TO_D32:
> >
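
Editorial sketch of the read-modify-write sequence discussed in this message: to spill a partial (writemasked) write, the existing scratch contents are read into a temporary, the enabled components are overwritten, and the whole temporary is written back. The four-component layout and all names are hypothetical simplifications; the real vec4 backend works on GRFs and dataport messages, not C arrays.

```c
#include <assert.h>
#include <string.h>

#define COMPONENTS 4   /* hypothetical: one value per vec4 channel */

/* Spills a writemasked result into `scratch` without clobbering the
 * components the instruction did not write. */
static void spill_partial_write(double scratch[COMPONENTS],
                                const double value[COMPONENTS],
                                unsigned writemask)
{
   double temp[COMPONENTS];

   memcpy(temp, scratch, sizeof(temp));      /* scratch read */
   for (int i = 0; i < COMPONENTS; i++) {
      if (writemask & (1u << i))
         temp[i] = value[i];                 /* partial overwrite */
   }
   memcpy(scratch, temp, sizeof(temp));      /* scratch write */
}
```

The temporary plays the role of the single GRF mentioned in the discussion: the same register holds the data that was read back and the data that is about to be written.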

Re: [Mesa-dev] [PATCH 5/6] i965/vec4: fix resolve reladdr case on DF scratch read/write on IVB

2017-06-23 Thread Samuel Iglesias Gonsálvez

Please ignore this patch, I have a better solution that will be
included in the v2 of the patch series.

Sam

On Thu, 2017-06-15 at 13:15 +0200, Samuel Iglesias Gonsálvez wrote:
> We emit scratch read/write to resolve reladdr and when moving
> varyings to scratch memory, however these instructions are emitted
> before lower simd splitting and before scalarizing DF instructions.
> As the code added for doing DF scratch read/writes assumes both
> were previously done, added a flag to fallback to old behavior.
> 
> Fixes {vs,gs}-array-copy tests on piglit.
> 
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> ---
>  src/intel/compiler/brw_vec4.h|  9 +
>  src/intel/compiler/brw_vec4_reg_allocate.cpp |  4 ++--
>  src/intel/compiler/brw_vec4_visitor.cpp  | 13 +++--
>  3 files changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_vec4.h
> b/src/intel/compiler/brw_vec4.h
> index a5b45aca21..3cb1506d85 100644
> --- a/src/intel/compiler/brw_vec4.h
> +++ b/src/intel/compiler/brw_vec4.h
> @@ -294,16 +294,17 @@ public:
> src_reg get_scratch_offset(bblock_t *block, vec4_instruction
> *inst,
>     src_reg *reladdr, int reg_offset);
> void emit_scratch_read(bblock_t *block, vec4_instruction *inst,
> -   dst_reg dst,
> -   src_reg orig_src,
> -   int base_offset);
> +  dst_reg dst,
> +  src_reg orig_src,
> +  int base_offset,
> +  bool resolve_reladdr);
> void emit_1grf_df_ivb_scratch_read(bblock_t *block,
>    vec4_instruction *inst,
>    dst_reg temp, src_reg
> orig_src,
>    int base_offset, bool
> first_grf);
>  
> void emit_scratch_write(bblock_t *block, vec4_instruction *inst,
> -    int base_offset);
> +   int base_offset, bool resolve_addr);
> void emit_pull_constant_load(bblock_t *block, vec4_instruction
> *inst,
>   dst_reg dst,
>   src_reg orig_src,
> diff --git a/src/intel/compiler/brw_vec4_reg_allocate.cpp
> b/src/intel/compiler/brw_vec4_reg_allocate.cpp
> index ec5ba10e86..58cd862841 100644
> --- a/src/intel/compiler/brw_vec4_reg_allocate.cpp
> +++ b/src/intel/compiler/brw_vec4_reg_allocate.cpp
> @@ -530,7 +530,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
> temp.offset = 0;
> temp.swizzle = BRW_SWIZZLE_XYZW;
> emit_scratch_read(block, inst,
> - dst_reg(temp), inst->src[i],
> spill_offset);
> + dst_reg(temp), inst->src[i],
> spill_offset, false);
> temp.offset = inst->src[i].offset;
>  }
>  assert(scratch_reg != -1);
> @@ -539,7 +539,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
>    }
>  
>    if (inst->dst.file == VGRF && inst->dst.nr == spill_reg_nr) {
> - emit_scratch_write(block, inst, spill_offset);
> + emit_scratch_write(block, inst, spill_offset, false);
>   scratch_reg = inst->dst.nr;
>    }
> }
> diff --git a/src/intel/compiler/brw_vec4_visitor.cpp
> b/src/intel/compiler/brw_vec4_visitor.cpp
> index 0d5ad4d8f8..158feca6c9 100644
> --- a/src/intel/compiler/brw_vec4_visitor.cpp
> +++ b/src/intel/compiler/brw_vec4_visitor.cpp
> @@ -1532,7 +1532,7 @@
> vec4_visitor::emit_1grf_df_ivb_scratch_read(bblock_t *block,
>  void
>  vec4_visitor::emit_scratch_read(bblock_t *block, vec4_instruction
> *inst,
>  dst_reg temp, src_reg orig_src,
> -int base_offset)
> +int base_offset, bool
> resolve_reladdr)
>  {
> assert(orig_src.offset % REG_SIZE == 0);
> int reg_offset = base_offset + orig_src.offset / REG_SIZE;
> @@ -1541,7 +1541,7 @@ vec4_visitor::emit_scratch_read(bblock_t
> *block, vec4_instruction *inst,
>  
> if (type_sz(orig_src.type) < 8) {
>    emit_before(block, inst, SCRATCH_READ(temp, index));
> -   } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +   } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> !resolve_reladdr &&
>    type_sz(temp.type) == 8) {
>    /* Set the offset to the base offset because we address the
> base GRF of
> * the DF. We will take into account the second GRF in the
> scratch

Re: [Mesa-dev] [PATCH 0/6] i965/vec4: Implement partial DF register spilling

2017-06-23 Thread Samuel Iglesias Gonsálvez

On Thu, 2017-06-22 at 17:02 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > Hello,
> > 
> > As mentioned in the patch series that implemented Ivybridge support
> > ARB_gpu_shader_fp64 [0], the only missing feature in that series
> > was
> > register spilling of 64-bit data and, because of that, about ~39
> > fp64
> > piglit tests failed to spill registers.
> > 
> > This new patch series implement register spilling of 64-bit data
> > for
> > IVB and splitted DF instructions on vec4 backend in general.
> > Unfortunately, this doesn't make the previous failed tests to pass
> > :-(
> > 
> 
> I think the reason why the tests keep failing to register allocate on
> you is because you aren't setting up the
> vec4_instruction::size_written
> fields correctly throughout this series.  It's in *byte* units but
> you
> seem to be providing the number in GRF units.  For that reason the
> liveness analysis pass will think most of the unspilled data you read
> is
> not fully initialized by the scratch reads, and incorrectly extend
> their
> live ranges all the way up to the program entry point, so spilling of
> DF
> variables will hugely *increase* register pressure instead of
> reducing
> it...
> 

Oh, you are absolutely right! I did not see this mistake :-(

I am going to do this and the other suggestions you mention in the
other emails.

Thanks for the review!

Sam

> > Nevertheless, I think this is still useful to have it in place. The
> > implementation uses 1-OWord block write/read messages by reusing
> > the
> > existing implementation. Thanks to that, we can write/read valid
> > dvecN
> > data to/from scratch memory even under non-uniform control flow.
> > 
> > If you want to test the branch:
> > 
> > $ git clone -b fp64-ivb-vec4-spilling \
> > https://github.com/Igalia/mesa.git
> > 
> > Thanks,
> > 
> > Sam
> > 
> > [0] https://lists.freedesktop.org/archives/mesa-dev/2017-March/1486
> > 46.html
> > 
> > Samuel Iglesias Gonsálvez (6):
> >   i965/eu: add support for 1-OWord Block Read/Write messages
> >   i965/vec4/generator: use 1-Oword Block Read/Write messages for DF
> > scratch writes/reads
> >   i965/generator: use MRF when sending 1-OWord read messages for DF
> > scratch reads on IVB
> >   i965/vec4: add support for doing DF register spilling on IVB
> >   i965/vec4: fix resolve reladdr case on DF scratch read/write on
> > IVB
> >   i965/vec4: allow partial DF register spilling
> > 
> >  src/intel/compiler/brw_eu.h  |  18 ++--
> >  src/intel/compiler/brw_eu_defines.h  |   2 +
> >  src/intel/compiler/brw_eu_emit.c |  42 +++--
> >  src/intel/compiler/brw_fs_generator.cpp  |   5 +-
> >  src/intel/compiler/brw_shader.cpp|   5 +
> >  src/intel/compiler/brw_vec4.cpp  |  10 ++
> >  src/intel/compiler/brw_vec4.h|  17 +++-
> >  src/intel/compiler/brw_vec4_generator.cpp| 136
> > +++
> >  src/intel/compiler/brw_vec4_reg_allocate.cpp |  42 +++--
> >  src/intel/compiler/brw_vec4_visitor.cpp  | 125
> > ++--
> >  10 files changed, 361 insertions(+), 41 deletions(-)
> > 
> > -- 
> > 2.11.0
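
Editorial sketch of the unit mix-up pointed out in the review above: size_written is expressed in bytes, so storing a register count there makes the write look far too small to define the registers it actually fills. This is a deliberately simplified model, not Mesa's liveness analysis; the 32-byte register size and the function name are assumptions made for the illustration.

```c
#include <assert.h>

#define REG_SIZE 32u   /* assumed bytes per GRF */

/* Simplified model: a write fully defines `num_regs` registers only if
 * it covers all of their bytes. Anything smaller counts as a partial
 * write, which keeps the previous value live. */
static int write_fully_defines(unsigned size_written_bytes, unsigned num_regs)
{
   return size_written_bytes >= num_regs * REG_SIZE;
}
```

A two-register scratch read recorded as 2 * REG_SIZE is a full definition, while the same read recorded as 2 is seen as partial, which matches the live-range extension described in the review.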

Re: [Mesa-dev] [PATCH 0/6] i965/vec4: Implement partial DF register spilling

2017-06-16 Thread Samuel Iglesias Gonsálvez

On Thu, 2017-06-15 at 13:15 +0200, Samuel Iglesias Gonsálvez wrote:
> Hello,
> 
> As mentioned in the patch series that implemented Ivybridge support
> ARB_gpu_shader_fp64 [0], the only missing feature in that series was
> register spilling of 64-bit data and, because of that, about ~39 fp64
> piglit tests failed to spill registers.
> 
> This new patch series implement register spilling of 64-bit data for
> IVB and splitted DF instructions on vec4 backend in general.

I have realized that the latter was not being done due to an error on
my side: I limited it to IVB in some places. I have changed the
corresponding conditions on my local branch, ready to send as a v2.
However, I would prefer to get some feedback on the branch before doing so.

Sam

> Unfortunately, this doesn't make the previous failed tests to pass :-
> (
> 
> Nevertheless, I think this is still useful to have it in place. The
> implementation uses 1-OWord block write/read messages by reusing the
> existing implementation. Thanks to that, we can write/read valid
> dvecN
> data to/from scratch memory even under non-uniform control flow.
> 
> If you want to test the branch:
> 
> $ git clone -b fp64-ivb-vec4-spilling \
> https://github.com/Igalia/mesa.git
> 
> Thanks,
> 
> Sam
> 
> [0] https://lists.freedesktop.org/archives/mesa-dev/2017-March/148646
> .html
> 
> Samuel Iglesias Gonsálvez (6):
>   i965/eu: add support for 1-OWord Block Read/Write messages
>   i965/vec4/generator: use 1-Oword Block Read/Write messages for DF
> scratch writes/reads
>   i965/generator: use MRF when sending 1-OWord read messages for DF
> scratch reads on IVB
>   i965/vec4: add support for doing DF register spilling on IVB
>   i965/vec4: fix resolve reladdr case on DF scratch read/write on IVB
>   i965/vec4: allow partial DF register spilling
> 
>  src/intel/compiler/brw_eu.h  |  18 ++--
>  src/intel/compiler/brw_eu_defines.h  |   2 +
>  src/intel/compiler/brw_eu_emit.c |  42 +++--
>  src/intel/compiler/brw_fs_generator.cpp  |   5 +-
>  src/intel/compiler/brw_shader.cpp|   5 +
>  src/intel/compiler/brw_vec4.cpp  |  10 ++
>  src/intel/compiler/brw_vec4.h|  17 +++-
>  src/intel/compiler/brw_vec4_generator.cpp| 136
> +++
>  src/intel/compiler/brw_vec4_reg_allocate.cpp |  42 +++--
>  src/intel/compiler/brw_vec4_visitor.cpp  | 125
> ++--
>  10 files changed, 361 insertions(+), 41 deletions(-)
> 


[Mesa-dev] [PATCH 5/6] i965/vec4: fix resolve reladdr case on DF scratch read/write on IVB

2017-06-15 Thread Samuel Iglesias Gonsálvez

We emit scratch reads/writes to resolve reladdr and when moving
varyings to scratch memory. However, these instructions are emitted
before the SIMD splitting lowering and before scalarizing DF instructions.
As the code added for DF scratch reads/writes assumes both have
already been done, add a flag to fall back to the old behavior.

Fixes {vs,gs}-array-copy tests on piglit.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4.h|  9 +
 src/intel/compiler/brw_vec4_reg_allocate.cpp |  4 ++--
 src/intel/compiler/brw_vec4_visitor.cpp  | 13 +++--
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.h b/src/intel/compiler/brw_vec4.h
index a5b45aca21..3cb1506d85 100644
--- a/src/intel/compiler/brw_vec4.h
+++ b/src/intel/compiler/brw_vec4.h
@@ -294,16 +294,17 @@ public:
src_reg get_scratch_offset(bblock_t *block, vec4_instruction *inst,
  src_reg *reladdr, int reg_offset);
void emit_scratch_read(bblock_t *block, vec4_instruction *inst,
- dst_reg dst,
- src_reg orig_src,
- int base_offset);
+  dst_reg dst,
+  src_reg orig_src,
+  int base_offset,
+  bool resolve_reladdr);
void emit_1grf_df_ivb_scratch_read(bblock_t *block,
   vec4_instruction *inst,
   dst_reg temp, src_reg orig_src,
   int base_offset, bool first_grf);
 
void emit_scratch_write(bblock_t *block, vec4_instruction *inst,
-  int base_offset);
+   int base_offset, bool resolve_addr);
void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
dst_reg dst,
src_reg orig_src,
diff --git a/src/intel/compiler/brw_vec4_reg_allocate.cpp 
b/src/intel/compiler/brw_vec4_reg_allocate.cpp
index ec5ba10e86..58cd862841 100644
--- a/src/intel/compiler/brw_vec4_reg_allocate.cpp
+++ b/src/intel/compiler/brw_vec4_reg_allocate.cpp
@@ -530,7 +530,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
temp.offset = 0;
temp.swizzle = BRW_SWIZZLE_XYZW;
emit_scratch_read(block, inst,
- dst_reg(temp), inst->src[i], spill_offset);
+ dst_reg(temp), inst->src[i], spill_offset, false);
temp.offset = inst->src[i].offset;
 }
 assert(scratch_reg != -1);
@@ -539,7 +539,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
   }
 
   if (inst->dst.file == VGRF && inst->dst.nr == spill_reg_nr) {
- emit_scratch_write(block, inst, spill_offset);
+ emit_scratch_write(block, inst, spill_offset, false);
  scratch_reg = inst->dst.nr;
   }
}
diff --git a/src/intel/compiler/brw_vec4_visitor.cpp 
b/src/intel/compiler/brw_vec4_visitor.cpp
index 0d5ad4d8f8..158feca6c9 100644
--- a/src/intel/compiler/brw_vec4_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_visitor.cpp
@@ -1532,7 +1532,7 @@ vec4_visitor::emit_1grf_df_ivb_scratch_read(bblock_t 
*block,
 void
 vec4_visitor::emit_scratch_read(bblock_t *block, vec4_instruction *inst,
 dst_reg temp, src_reg orig_src,
-int base_offset)
+int base_offset, bool resolve_reladdr)
 {
assert(orig_src.offset % REG_SIZE == 0);
int reg_offset = base_offset + orig_src.offset / REG_SIZE;
@@ -1541,7 +1541,7 @@ vec4_visitor::emit_scratch_read(bblock_t *block, 
vec4_instruction *inst,
 
if (type_sz(orig_src.type) < 8) {
   emit_before(block, inst, SCRATCH_READ(temp, index));
-   } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
+   } else if (devinfo->gen == 7 && !devinfo->is_haswell && !resolve_reladdr &&
   type_sz(temp.type) == 8) {
   /* Set the offset to the base offset because we address the base GRF of
* the DF. We will take into account the second GRF in the scratch write emission.
@@ -1574,7 +1574,7 @@ vec4_visitor::emit_scratch_read(bblock_t *block, 
vec4_instruction *inst,
  */
 void
 vec4_visitor::emit_scratch_write(bblock_t *block, vec4_instruction *inst,
- int base_offset)
+ int base_offset, bool resolve_reladdr)
 {
assert(inst->dst.offset % REG_SIZE == 0);
int reg_offset = base_offset + inst->dst.offset / REG_SIZE;
@@ -1606,7 +1606,8 @@ vec4_visitor::emit_scratch_write(bblock_t *block, 
vec4_instruction *inst,
   write->ir = inst->ir;
   write->annotation = inst->annotation;
   inst->insert_after(bl

[Mesa-dev] [PATCH 6/6] i965/vec4: allow partial DF register spilling

2017-06-15 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4_reg_allocate.cpp | 32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_reg_allocate.cpp 
b/src/intel/compiler/brw_vec4_reg_allocate.cpp
index 58cd862841..0fc9e3ca36 100644
--- a/src/intel/compiler/brw_vec4_reg_allocate.cpp
+++ b/src/intel/compiler/brw_vec4_reg_allocate.cpp
@@ -409,7 +409,9 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool 
*no_spill)
spill_costs[inst->src[i].nr] +=
   loop_scale * spill_cost_for_type(inst->src[i].type);
if (inst->src[i].reladdr ||
-   inst->src[i].offset >= REG_SIZE)
+   (inst->src[i].offset >= REG_SIZE &&
+(type_sz(inst->src[i].type) != 8 ||
+ !(inst->src[i].offset == 32 && inst->group == 4))))
   no_spill[inst->src[i].nr] = true;
 
/* We don't support unspills of partial DF reads.
@@ -419,7 +421,8 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool 
*no_spill)
 * we need to shuffle into correct 64-bit data. Ensure that we
 * are reading data for both threads.
 */
-   if (type_sz(inst->src[i].type) == 8 && inst->exec_size != 8)
+   if (type_sz(inst->src[i].type) == 8 && inst->exec_size != 8 &&
+   (devinfo->gen != 7 || devinfo->is_haswell || inst->exec_size != 4))
   no_spill[inst->src[i].nr] = true;
 }
 
@@ -437,7 +440,10 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, 
bool *no_spill)
   if (inst->dst.file == VGRF && !no_spill[inst->dst.nr]) {
  spill_costs[inst->dst.nr] +=
 loop_scale * spill_cost_for_type(inst->dst.type);
- if (inst->dst.reladdr || inst->dst.offset >= REG_SIZE)
+ if (inst->dst.reladdr ||
+ (inst->dst.offset >= REG_SIZE &&
+  (type_sz(inst->dst.type) != 8 ||
+   !(inst->dst.offset == 32 && inst->group == 4))))
 no_spill[inst->dst.nr] = true;
 
  /* We don't support spills of partial DF writes.
@@ -446,7 +452,8 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool 
*no_spill)
   * each one writing that for both SIMD4x2 threads. Ensure that we
   * are writing data for both threads.
   */
- if (type_sz(inst->dst.type) == 8 && inst->exec_size != 8)
+ if (type_sz(inst->dst.type) == 8 && inst->exec_size != 8 &&
+ (devinfo->gen != 7 || devinfo->is_haswell || inst->exec_size != 4))
 no_spill[inst->dst.nr] = true;
 
  /* We can't spill registers that mix 32-bit and 64-bit access (that
@@ -514,11 +521,24 @@ vec4_visitor::spill_reg(int spill_reg_nr)
 
/* Generate spill/unspill instructions for the objects being spilled. */
int scratch_reg = -1;
+   bool done_scratch_read = false;
foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
   for (unsigned int i = 0; i < 3; i++) {
  if (inst->src[i].file == VGRF && inst->src[i].nr == spill_reg_nr) {
+/* On IVB, DF scratch reads are not actual partial reads because we are
+ * going to read both GRFs on the first found instruction.
+ * Because of that, we will skip the scratch read of the other split
+ * instruction, as it can reuse the read value. We check the value of
+ * done_scratch_read to know if we need to do a scratch read or not.
+ *
+ * For the rest of generations, just return true.
+ */
+bool do_df_scratch_read = type_sz(inst->src[i].type) == 8 &&
+   (devinfo->gen != 7 || devinfo->is_haswell || !done_scratch_read);
+
 if (scratch_reg == -1 ||
-!can_use_scratch_for_source(inst, i, scratch_reg)) {
+(!can_use_scratch_for_source(inst, i, scratch_reg) &&
+ (do_df_scratch_read || type_sz(inst->src[i].type) != 8))) {
/* We need to unspill anyway so make sure we read the full vec4
 * in any case. This way, the cached register can be reused
 * for consecutive instructions that read different channels of
@@ -532,6 +552,7 @@ vec4_visitor::spill_reg(int spill_reg_nr)
emit_scratch_read(block, inst,
  dst_reg(temp), inst->src[i], spill_offset, false);
temp.offset = inst->src[i].offset;
+   done_scratch_read = true;
 }
 assert(scratch_reg != -1);

[Mesa-dev] [PATCH 4/6] i965/vec4: add support for doing DF register spilling on IVB

2017-06-15 Thread Samuel Iglesias Gonsálvez

Both the spill and unspill processes assume that SIMD width lowering
and DF scalarization have already been done.

* The spilling process does the following:

  1) Read the existing content from the scratch memory that
     corresponds to the vertex (use inst->group to know if we
     are going to write data to the first or the second vertex).
     As it is already scalarized, we don't want to modify existing
     data of other components. We read only one GRF because we are
     not going to modify the other (exec_size = 4).
  2) Overwrite the component the spilled instruction writes to.
  3) Do a scratch write to save the updated content of the respective
     vertex to scratch memory.

* Unspilling is implemented as several scratch reads when we find
  the first instruction whose sources were spilled.
  These scratch reads get the content of the DF data for both vertices
  because we want to have DF data in two consecutive GRFs, even when
  this first instruction only reads one (exec_size = 4). After that,
  no more unspills are needed until we write new content to the
  scratch memory, so we just need to update the register number in
  the affected sources of the following instructions.
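
The spill and unspill steps above can be pictured with a small standalone
model. It is only a sketch under simplifying assumptions, not the Mesa
implementation: each GRF is taken to hold the four doubles of one vertex,
and Scratch, UnspillCache, spill_component and unspill are hypothetical
names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model (assumption): one GRF holds the four doubles of one vertex.
using Grf = std::array<double, 4>;

// Hypothetical scratch slot holding a spilled dvec4 for both vertices.
struct Scratch {
    std::array<Grf, 2> slot{};
    int reads = 0;   // GRF-sized scratch reads issued
    int writes = 0;  // GRF-sized scratch writes issued
};

// Hypothetical cache of the two consecutive GRFs filled by an unspill.
struct UnspillCache {
    std::array<Grf, 2> regs{};
    bool valid = false;
};

// Spill one scalarized write; group is 0 (first vertex) or 4 (second).
void spill_component(Scratch &scratch, UnspillCache &cache,
                     unsigned group, std::size_t component, double value)
{
    assert(group == 0 || group == 4);
    assert(component < 4);
    const std::size_t vertex = group / 4;

    Grf tmp = scratch.slot[vertex];  // 1) read this vertex's GRF only
    scratch.reads++;
    tmp[component] = value;          // 2) overwrite the written component
    scratch.slot[vertex] = tmp;      // 3) write the updated GRF back
    scratch.writes++;
    cache.valid = false;             // new content: next use must unspill
}

// Unspill: read both GRFs once, then reuse them until the next spill.
const std::array<Grf, 2> &unspill(Scratch &scratch, UnspillCache &cache)
{
    if (!cache.valid) {
        cache.regs = scratch.slot;
        scratch.reads += 2;
        cache.valid = true;
    }
    return cache.regs;
}
```

Calling unspill() twice in a row issues scratch reads only the first
time, which mirrors the point that later instructions just need their
source register number updated.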

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_vec4.cpp |  3 ++
 src/intel/compiler/brw_vec4_visitor.cpp | 69 +
 2 files changed, 65 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index b6d409eea2..e25316d0b4 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -343,6 +343,9 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst)
case SHADER_OPCODE_GEN4_SCRATCH_READ:
   return 2;
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
+  if (devinfo->gen == 7 && !devinfo->is_haswell &&
+  type_sz(inst->dst.type) == 8)
+ return 2;
   return 3;
case GS_OPCODE_URB_WRITE:
case GS_OPCODE_URB_WRITE_ALLOCATE:
diff --git a/src/intel/compiler/brw_vec4_visitor.cpp 
b/src/intel/compiler/brw_vec4_visitor.cpp
index 37ae31c0d5..0d5ad4d8f8 100644
--- a/src/intel/compiler/brw_vec4_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_visitor.cpp
@@ -254,11 +254,13 @@ vec4_instruction *
 vec4_visitor::SCRATCH_READ(const dst_reg &dst, const src_reg &index)
 {
vec4_instruction *inst;
+   bool is_df_ivb = devinfo->gen == 7 && !devinfo->is_haswell &&
+  type_sz(dst.type) == 8;
 
inst = new(mem_ctx) vec4_instruction(SHADER_OPCODE_GEN4_SCRATCH_READ,
dst, index);
inst->base_mrf = FIRST_SPILL_MRF(devinfo->gen) + 1;
-   inst->mlen = 2;
+   inst->mlen = is_df_ivb ? 1 : 2;
 
return inst;
 }
@@ -286,11 +288,13 @@ vec4_visitor::SCRATCH_WRITE(const dst_reg &dst, const src_reg &src,
 const src_reg &index)
 {
vec4_instruction *inst;
+   bool is_df_ivb = devinfo->gen == 7 && !devinfo->is_haswell &&
+  type_sz(src.type) == 8;
 
inst = new(mem_ctx) vec4_instruction(SHADER_OPCODE_GEN4_SCRATCH_WRITE,
dst, src, index);
inst->base_mrf = FIRST_SPILL_MRF(devinfo->gen);
-   inst->mlen = 3;
+   inst->mlen = is_df_ivb ? 2 : 3;
 
return inst;
 }
@@ -1527,8 +1531,8 @@ vec4_visitor::emit_1grf_df_ivb_scratch_read(bblock_t 
*block,
  */
 void
 vec4_visitor::emit_scratch_read(bblock_t *block, vec4_instruction *inst,
-   dst_reg temp, src_reg orig_src,
-   int base_offset)
+dst_reg temp, src_reg orig_src,
+int base_offset)
 {
assert(orig_src.offset % REG_SIZE == 0);
int reg_offset = base_offset + orig_src.offset / REG_SIZE;
@@ -1537,6 +1541,19 @@ vec4_visitor::emit_scratch_read(bblock_t *block, 
vec4_instruction *inst,
 
if (type_sz(orig_src.type) < 8) {
   emit_before(block, inst, SCRATCH_READ(temp, index));
+   } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
+  type_sz(temp.type) == 8) {
+  /* Set the offset to the base offset because we address the base GRF of
+   * the DF. We will take into account the second GRF in the scratch write
+   * emission.
+   */
+  if (inst->group == 4)
+ reg_offset = base_offset;
+  temp.offset = 0;
+  vec4_instruction *read = SCRATCH_READ(temp, index);
+  read->exec_size = 4;
+  read->offset = reg_offset;
+  read->size_written = 2;
+  emit_before(block, inst, read);
} else {
   dst_reg shuffled = dst_reg(this, glsl_type::dvec4_type);
   dst_reg shuffled_float = retype(shuffled, BRW_REGISTER_TYPE_F);
@@ -1574,9 +1591,11 @@ vec4_visitor::emit_scratch_write(bblock_t *block, 
vec4_instruction *inst,
bool is_64bit = type_sz(inst->dst.type) == 8;
const glsl_type

[Mesa-dev] [PATCH 0/6] i965/vec4: Implement partial DF register spilling

2017-06-15 Thread Samuel Iglesias Gonsálvez

Hello,

As mentioned in the patch series that implemented ARB_gpu_shader_fp64
support for Ivybridge [0], the only missing feature in that series was
register spilling of 64-bit data and, because of that, about 39 fp64
piglit tests failed to spill registers.

This new patch series implements register spilling of 64-bit data for
IVB and, in general, for split DF instructions on the vec4 backend.
Unfortunately, this doesn't make the previously failing tests pass :-(

Nevertheless, I think it is still useful to have this in place. The
implementation uses 1-OWord block write/read messages by reusing the
existing implementation. Thanks to that, we can write/read valid dvecN
data to/from scratch memory even under non-uniform control flow.
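
For readers unfamiliar with the message sizes, the block-size choice that
patch 1/6 adds to the scratch read/write helpers can be sketched as
follows. This is a simplified standalone model: OwordBlock and
pick_oword_block are hypothetical names, and the enum only labels the
sizes rather than reproducing the hardware encodings.

```cpp
#include <cassert>

// Hypothetical labels for the sizes an OWord block message can request.
enum class OwordBlock { OneLow, OneHigh, Two, Four, Eight };

OwordBlock pick_oword_block(int num_regs, bool oword1_low, bool oword1_high)
{
    // The 1-OWord variants only apply to single-register messages.
    if (num_regs == 1 && (oword1_low || oword1_high)) {
        // Only one of them can be true.
        assert(oword1_low != oword1_high);
        return oword1_high ? OwordBlock::OneHigh : OwordBlock::OneLow;
    }

    // One GRF is 32 bytes, i.e. two 16-byte OWords.
    switch (num_regs) {
    case 1:
        return OwordBlock::Two;
    case 2:
        return OwordBlock::Four;
    default:
        assert(num_regs == 4);
        return OwordBlock::Eight;
    }
}
```

As in the patch, the low/high flags are ignored unless exactly one
register is transferred.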

If you want to test the branch:

$ git clone -b fp64-ivb-vec4-spilling \
https://github.com/Igalia/mesa.git

Thanks,

Sam

[0] https://lists.freedesktop.org/archives/mesa-dev/2017-March/148646.html

Samuel Iglesias Gonsálvez (6):
  i965/eu: add support for 1-OWord Block Read/Write messages
  i965/vec4/generator: use 1-Oword Block Read/Write messages for DF
scratch writes/reads
  i965/generator: use MRF when sending 1-OWord read messages for DF
scratch reads on IVB
  i965/vec4: add support for doing DF register spilling on IVB
  i965/vec4: fix resolve reladdr case on DF scratch read/write on IVB
  i965/vec4: allow partial DF register spilling

 src/intel/compiler/brw_eu.h  |  18 ++--
 src/intel/compiler/brw_eu_defines.h  |   2 +
 src/intel/compiler/brw_eu_emit.c |  42 +++--
 src/intel/compiler/brw_fs_generator.cpp  |   5 +-
 src/intel/compiler/brw_shader.cpp|   5 +
 src/intel/compiler/brw_vec4.cpp  |  10 ++
 src/intel/compiler/brw_vec4.h|  17 +++-
 src/intel/compiler/brw_vec4_generator.cpp| 136 +++
 src/intel/compiler/brw_vec4_reg_allocate.cpp |  42 +++--
 src/intel/compiler/brw_vec4_visitor.cpp  | 125 ++--
 10 files changed, 361 insertions(+), 41 deletions(-)

-- 
2.11.0


[Mesa-dev] [PATCH 3/6] i965/generator: use MRF when sending 1-OWord read messages for DF scratch reads on IVB

2017-06-15 Thread Samuel Iglesias Gonsálvez

Use MRF for 1-OWord read messages again to avoid problems when
sending scratch read messages. We cannot reuse the destination because
DF scratch reads on IVB are split into several instructions, and that
could result in invalid data.
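
The condition this patch introduces can be restated as a small predicate.
The sketch below is standalone and illustrative only: DeviceInfo and
can_send_from_destination are hypothetical names, and type_size stands in
for type_sz(dest.type).

```cpp
#include <cassert>

// Hypothetical stand-in for the gen_device_info fields used here.
struct DeviceInfo {
    int gen;
    bool is_haswell;
};

// True when the scratch read may use the destination register as the
// message payload instead of an MRF.
bool can_send_from_destination(const DeviceInfo &devinfo, unsigned type_size)
{
    assert(type_size > 0 && type_size <= 8);
    const bool is_ivb = devinfo.gen == 7 && !devinfo.is_haswell;

    // Gen7+ can send from any register, but 64-bit reads on IVB are
    // split into several instructions, so they keep using the MRF.
    return devinfo.gen >= 7 && !(is_ivb && type_size == 8);
}
```

This is logically equivalent to the expression in the hunk below: for
gen >= 7, "gen > 7 || is_haswell" is the same as "not IVB".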

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_eu_emit.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index bd6f46c776..fa6dc0d5ff 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -2267,7 +2267,9 @@ brw_oword_block_read_scratch(struct brw_codegen *p,
if (devinfo->gen >= 6)
   offset /= 16;
 
-   if (p->devinfo->gen >= 7) {
+   if (p->devinfo->gen >= 7 &&
+   (p->devinfo->gen > 7 || p->devinfo->is_haswell ||
+type_sz(dest.type) != 8)) {
   /* On gen 7 and above, we no longer have message registers and we can
* send from any register we want.  By using the destination register
* for the message, we guarantee that the implied message write won't
-- 
2.11.0


[Mesa-dev] [PATCH 2/6] i965/vec4/generator: use 1-Oword Block Read/Write messages for DF scratch writes/reads

2017-06-15 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_eu_defines.h  |   2 +
 src/intel/compiler/brw_shader.cpp|   5 +
 src/intel/compiler/brw_vec4.cpp  |   7 ++
 src/intel/compiler/brw_vec4.h|   8 ++
 src/intel/compiler/brw_vec4_generator.cpp| 136 +++
 src/intel/compiler/brw_vec4_reg_allocate.cpp |   6 +-
 src/intel/compiler/brw_vec4_visitor.cpp  |  49 ++
 7 files changed, 212 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_eu_defines.h 
b/src/intel/compiler/brw_eu_defines.h
index 1af835d47e..3c148de0fa 100644
--- a/src/intel/compiler/brw_eu_defines.h
+++ b/src/intel/compiler/brw_eu_defines.h
@@ -436,6 +436,8 @@ enum opcode {
VEC4_OPCODE_PICK_HIGH_32BIT,
VEC4_OPCODE_SET_LOW_32BIT,
VEC4_OPCODE_SET_HIGH_32BIT,
+   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW,
+   VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH,
 
FS_OPCODE_DDX_COARSE,
FS_OPCODE_DDX_FINE,
diff --git a/src/intel/compiler/brw_shader.cpp 
b/src/intel/compiler/brw_shader.cpp
index 53d0742d2e..248feacbd2 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -296,6 +296,11 @@ brw_instruction_name(const struct gen_device_info 
*devinfo, enum opcode op)
case FS_OPCODE_PACK:
   return "pack";
 
+
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW:
+  return "gen4_scratch_read_1oword_low";
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH:
+  return "gen4_scratch_read_1oword_high";
case SHADER_OPCODE_GEN4_SCRATCH_READ:
   return "gen4_scratch_read";
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index b443effca9..b6d409eea2 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -259,6 +259,8 @@ bool
 vec4_instruction::can_do_writemask(const struct gen_device_info *devinfo)
 {
switch (opcode) {
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW:
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH:
case SHADER_OPCODE_GEN4_SCRATCH_READ:
case VEC4_OPCODE_DOUBLE_TO_F32:
case VEC4_OPCODE_DOUBLE_TO_D32:
@@ -335,6 +337,9 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst)
   return 1;
case VS_OPCODE_PULL_CONSTANT_LOAD:
   return 2;
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW:
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH:
+  return 1;
case SHADER_OPCODE_GEN4_SCRATCH_READ:
   return 2;
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
@@ -2091,6 +2096,8 @@ get_lowered_simd_width(const struct gen_device_info 
*devinfo,
 {
/* Do not split some instructions that require special handling */
switch (inst->opcode) {
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_LOW:
+   case VEC4_OPCODE_GEN4_SCRATCH_READ_1OWORD_HIGH:
case SHADER_OPCODE_GEN4_SCRATCH_READ:
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
   return inst->exec_size;
diff --git a/src/intel/compiler/brw_vec4.h b/src/intel/compiler/brw_vec4.h
index d828da02ea..a5b45aca21 100644
--- a/src/intel/compiler/brw_vec4.h
+++ b/src/intel/compiler/brw_vec4.h
@@ -214,6 +214,9 @@ public:
 enum brw_conditional_mod condition);
vec4_instruction *IF(enum brw_predicate predicate);
EMIT1(SCRATCH_READ)
+   vec4_instruction *DF_IVB_SCRATCH_READ(const dst_reg , const src_reg 
,
+ bool low);
+
EMIT2(SCRATCH_WRITE)
EMIT3(LRP)
EMIT1(BFREV)
@@ -294,6 +297,11 @@ public:
  dst_reg dst,
  src_reg orig_src,
  int base_offset);
+   void emit_1grf_df_ivb_scratch_read(bblock_t *block,
+  vec4_instruction *inst,
+  dst_reg temp, src_reg orig_src,
+  int base_offset, bool first_grf);
+
void emit_scratch_write(bblock_t *block, vec4_instruction *inst,
   int base_offset);
void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
diff --git a/src/intel/compiler/brw_vec4_generator.cpp 
b/src/intel/compiler/brw_vec4_generator.cpp
index 334933d15a..3bb931385a 100644
--- a/src/intel/compiler/brw_vec4_generator.cpp
+++ b/src/intel/compiler/brw_vec4_generator.cpp
@@ -1133,6 +1133,73 @@ generate_unpack_flags(struct brw_codegen *p,
 }
 
 static void
+generate_scratch_read_1oword(struct brw_codegen *p,
+ vec4_instruction *inst,
+ struct brw_reg dst,
+ struct brw_reg index,
+ bool low)
+{
+   const struct gen_device_info *devinfo = p->devinfo;
+
+   assert(devinfo->gen >= 7 && inst->exec_size == 4 &&
+  type_sz(dst.type) == 8);
+   brw_set_default_access_mode(p, BRW_ALIGN_1);
+

[Mesa-dev] [PATCH 1/6] i965/eu: add support for 1-OWord Block Read/Write messages

2017-06-15 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_eu.h | 18 ++--
 src/intel/compiler/brw_eu_emit.c| 38 +
 src/intel/compiler/brw_fs_generator.cpp |  5 +++--
 3 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
index a3a9c63239..723fe2e1b2 100644
--- a/src/intel/compiler/brw_eu.h
+++ b/src/intel/compiler/brw_eu.h
@@ -342,15 +342,19 @@ void brw_oword_block_read(struct brw_codegen *p,
 unsigned brw_scratch_surface_idx(const struct brw_codegen *p);
 
 void brw_oword_block_read_scratch(struct brw_codegen *p,
- struct brw_reg dest,
- struct brw_reg mrf,
- int num_regs,
- unsigned offset);
+  struct brw_reg dest,
+  struct brw_reg mrf,
+  int num_regs,
+  unsigned offset,
+  bool oword1_low,
+  bool oword1_high);
 
 void brw_oword_block_write_scratch(struct brw_codegen *p,
-  struct brw_reg mrf,
-  int num_regs,
-  unsigned offset);
+   struct brw_reg mrf,
+   int num_regs,
+   unsigned offset,
+   bool oword1_low,
+   bool oword1_high);
 
 void gen7_block_read_scratch(struct brw_codegen *p,
  struct brw_reg dest,
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 231d6fdaec..bd6f46c776 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -2133,9 +2133,11 @@ brw_scratch_surface_idx(const struct brw_codegen *p)
  * register spilling.
  */
 void brw_oword_block_write_scratch(struct brw_codegen *p,
-  struct brw_reg mrf,
-  int num_regs,
-  unsigned offset)
+   struct brw_reg mrf,
+   int num_regs,
+   unsigned offset,
+   bool oword1_low,
+   bool oword1_high)
 {
const struct gen_device_info *devinfo = p->devinfo;
const unsigned target_cache =
@@ -2180,6 +2182,14 @@ void brw_oword_block_write_scratch(struct brw_codegen *p,
   int send_commit_msg;
   struct brw_reg src_header = retype(brw_vec8_grf(0, 0),
 BRW_REGISTER_TYPE_UW);
+  int msg_control = BRW_DATAPORT_OWORD_BLOCK_DWORDS(num_regs * 8);
+
+  if (num_regs == 1 && (oword1_low || oword1_high)) {
+ /* Only one of them can be true */
+ assert(oword1_low ^ oword1_high);
+ msg_control = oword1_high ?
+BRW_DATAPORT_OWORD_BLOCK_1_OWORDHIGH : BRW_DATAPORT_OWORD_BLOCK_1_OWORDLOW;
+  }
 
   brw_inst_set_compression(devinfo, insn, false);
 
@@ -2223,7 +2233,7 @@ void brw_oword_block_write_scratch(struct brw_codegen *p,
   brw_set_dp_write_message(p,
   insn,
brw_scratch_surface_idx(p),
-  BRW_DATAPORT_OWORD_BLOCK_DWORDS(num_regs * 8),
+  msg_control,
   msg_type,
target_cache,
   mlen,
@@ -2245,10 +2255,12 @@ void brw_oword_block_write_scratch(struct brw_codegen 
*p,
  */
 void
 brw_oword_block_read_scratch(struct brw_codegen *p,
-struct brw_reg dest,
-struct brw_reg mrf,
-int num_regs,
-unsigned offset)
+ struct brw_reg dest,
+ struct brw_reg mrf,
+ int num_regs,
+ unsigned offset,
+ bool oword1_low,
+ bool oword1_high)
 {
const struct gen_device_info *devinfo = p->devinfo;
 
@@ -2291,6 +2303,14 @@ brw_oword_block_read_scratch(struct brw_codegen *p,
 
{
   brw_inst *insn = next_insn(p, BRW_OPCODE_SEND);
+  int msg_control = BRW_DATAPORT_OWORD_BLOCK_DWORDS(num_regs * 8);
+
+  if (num_regs == 1 && (oword1_low || oword1_high)) {
+ /* Only one of them can be true */
+ assert(oword1_low ^ oword1_high);
+ msg_control = oword1_high ?
+BRW_DATAPORT_OWORD_BLOCK_1_OWORDHIGH : 
BRW_DATAPORT_OWORD_BLOCK_1_OWORDL

Re: [Mesa-dev] [PATCH] intel/blorp: Work around Sandy Bridge occlusion query issue

2017-06-13 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Thu, 2017-06-08 at 10:45 -0700, Jason Ekstrand wrote:
> ---
>  src/intel/blorp/blorp_clear.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/intel/blorp/blorp_clear.c
> b/src/intel/blorp/blorp_clear.c
> index 3d5c41c..efacadf 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -479,6 +479,16 @@ blorp_clear_depth_stencil(struct blorp_batch
> *batch,
> params.x1 = x1;
> params.y1 = y1;
>  
> +   if (ISL_DEV_GEN(batch->blorp->isl_dev) == 6) {
> +  /* For some reason, Sandy Bridge gets occlusion queries wrong if we
> +   * don't have a shader.  In particular, it records samples even though
> +   * we disable statistics in 3DSTATE_WM.  Give it the usual clear shader
> +   * to work around the issue.
> +   */
> +  if (!blorp_params_get_clear_kernel(batch->blorp, &params, false))
> + return;
> +   }
> +
> while (num_layers > 0) {
>    params.num_layers = num_layers;
>  


Re: [Mesa-dev] [PATCH 00/11] i965: Use BLORP for depth/stencil clears

2017-06-13 Thread Samuel Iglesias Gonsálvez

On Mon, 2017-06-12 at 13:52 +0200, Samuel Iglesias Gonsálvez wrote:
> On Tue, 2017-06-06 at 21:59 -0700, Jason Ekstrand wrote:
> > This little series switches the GL driver to use BLORP for depth and
> > stencil clears.  BLORP has had depth/stencil clear support ever since we
> > started using it in the Vulkan driver but we didn't hook it up in GL
> > because of a few very hard-to-debug CTS fails.  Patch 10 takes care of
> > those and we now pass except for some weird behavior around occlusion
> > queries on Sandy Bridge.  I'll look into those later.  For now, I think
> > the series is worth reviewing.
> > 
> > Jason Ekstrand (11):
> >   i965/blorp: Set aux_usage to NONE for miplevels without HiZ
> >   mesa: Add a BUFFER_BITS mask for depth+stencil
> >   i965/miptree: Choose the stencil layout in miptree_create_layout
> >   intel/isl: Properly set SeparateStencilBufferEnable on gen5-6
> >   i965: Remove some of the remnants of meta
> >   i965: Remove some unneeded fields from brw_context
> >   i965/blorp: Set no_depth_or_stencil correctly
> >   i965/blorp: Do a depth flush/stall prior to HiZ operations
> >   i965: Disable the interleaved vertex optimization when instancing
> >   i965: Set step_rate == 0 for interleaved vertex buffers
> >   i965: Use blorp for depth/stencil clears on gen6+
> > 
> 
> Patches 2, 3, 6-10 are:
> 
> Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Patches 1 and 4 are:

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

I am not sure about patch 11 but, as I don't see anything wrong and
assuming Jenkins is happy, then:

Acked-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam


> 
> Sam
> 
> >  src/intel/isl/isl_emit_depth_stencil.c|  13 ++-
> >  src/mesa/drivers/dri/i965/brw_blorp.c | 129
> > ++
> >  src/mesa/drivers/dri/i965/brw_blorp.h |   4 +
> >  src/mesa/drivers/dri/i965/brw_clear.c |   6 ++
> >  src/mesa/drivers/dri/i965/brw_context.h   |  13 ---
> >  src/mesa/drivers/dri/i965/brw_draw_upload.c   |  12 ++-
> >  src/mesa/drivers/dri/i965/brw_wm.c|   2 +-
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c   |   3 +-
> >  src/mesa/drivers/dri/i965/genX_state_upload.c |   2 +-
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |   6 +-
> >  src/mesa/main/mtypes.h|   3 +
> >  11 files changed, 167 insertions(+), 26 deletions(-)
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/11] i965: Remove some of the remnants of meta

2017-06-12 Thread Samuel Iglesias Gonsálvez

On Monday, 12 June 2017 07:57:23 Jason Ekstrand wrote:
> On Mon, Jun 12, 2017 at 4:54 AM, Samuel Iglesias Gonsálvez <
> 
> sigles...@igalia.com> wrote:
> > On Tue, 2017-06-06 at 22:00 -0700, Jason Ekstrand wrote:
> > > ---
> > > 
> > >  src/mesa/drivers/dri/i965/brw_context.h   | 1 -
> > >  src/mesa/drivers/dri/i965/brw_wm.c| 2 +-
> > >  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
> > >  3 files changed, 2 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> > > b/src/mesa/drivers/dri/i965/brw_context.h
> > > index 4c5bc3b..3f4b86a 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > > @@ -750,7 +750,6 @@ struct brw_context
> > > 
> > > bool has_negative_rhw_bug;
> > > bool has_pln;
> > > bool no_simd8;
> > > 
> > > -   bool use_rep_send;
> > > 
> > > /**
> > > 
> > >  * Some versions of Gen hardware don't do centroid interpolation
> > > 
> > > correctly
> > > diff --git a/src/mesa/drivers/dri/i965/brw_wm.c
> > > b/src/mesa/drivers/dri/i965/brw_wm.c
> > > index 6fac3c4..7f688e2 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_wm.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> > > @@ -188,7 +188,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
> > > 
> > > program = brw_compile_fs(brw->screen->compiler, brw, mem_ctx,
> > > 
> > >  key, _data, fp->program.nir,
> > >  >program, st_index8, st_index16,
> > > 
> > > -true, brw->use_rep_send, vue_map,
> > > +true, false, vue_map,
> > > 
> > >  _size, _str);
> > > 
> > > if (program == NULL) {
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > index 23358c4..f6b2f17 100644
> > > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > @@ -1316,7 +1316,7 @@ genX(upload_clip_state)(struct brw_context
> > > *brw)
> > > 
> > >   clip.ClipMode = CLIPMODE_NORMAL;
> > >
> > >    }
> > > 
> > > -  clip.ClipEnable = brw->primitive != _3DPRIM_RECTLIST;
> > > +  clip.ClipEnable = true;
> > 
> > Is this patch fine? It looks like both changes are completely unrelated :-/
> 
> They're related in the sense that the old meta clear code was the only
> thing using either RECTLIST primitives or use_rep_send.  We still use
> RECTLIST primitives but it all happens in BLORP now so the regular state
> upload code will never see them.
> 

Thanks for the explanation! Then this patch is:

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

> --Jason
> 
> > Sam
> > 
> > >/* _NEW_POLYGON,
> > >
> > > * BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_TES_PROG_DATA |
> > > 
> > > BRW_NEW_PRIMITIVE


Re: [Mesa-dev] [PATCH 05/11] i965: Remove some of the remnants of meta

2017-06-12 Thread Samuel Iglesias Gonsálvez

On Tue, 2017-06-06 at 22:00 -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   | 1 -
>  src/mesa/drivers/dri/i965/brw_wm.c| 2 +-
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
>  3 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 4c5bc3b..3f4b86a 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -750,7 +750,6 @@ struct brw_context
> bool has_negative_rhw_bug;
> bool has_pln;
> bool no_simd8;
> -   bool use_rep_send;
>  
> /**
>  * Some versions of Gen hardware don't do centroid interpolation
> correctly
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index 6fac3c4..7f688e2 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -188,7 +188,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
> program = brw_compile_fs(brw->screen->compiler, brw, mem_ctx,
>  key, _data, fp->program.nir,
>  >program, st_index8, st_index16,
> -true, brw->use_rep_send, vue_map,
> +true, false, vue_map,
>  _size, _str);
>  
> if (program == NULL) {
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 23358c4..f6b2f17 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -1316,7 +1316,7 @@ genX(upload_clip_state)(struct brw_context
> *brw)
>   clip.ClipMode = CLIPMODE_NORMAL;
>    }
>  
> -  clip.ClipEnable = brw->primitive != _3DPRIM_RECTLIST;
> +  clip.ClipEnable = true;
>  

Is this patch fine? It looks like both changes are completely unrelated :-/

Sam

>    /* _NEW_POLYGON,
> * BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_TES_PROG_DATA |
> BRW_NEW_PRIMITIVE


Re: [Mesa-dev] [PATCH 00/11] i965: Use BLORP for depth/stencil clears

2017-06-12 Thread Samuel Iglesias Gonsálvez

On Tue, 2017-06-06 at 21:59 -0700, Jason Ekstrand wrote:
> This little series switches the GL driver to use BLORP for depth and
> stencil clears.  BLORP has had depth/stencil clear support ever since
> we
> started using it in the Vulkan driver but we didn't hook it up in GL
> because of a few very hard-to-debug CTS fails.  Patch 10 takes care
> of
> those and we now pass except for some weird behavior around occlusion
> queries on Sandy Bridge.  I'll look into those later.  For now, I
> think the
> series is worth reviewing.
> 
> Jason Ekstrand (11):
>   i965/blorp: Set aux_usage to NONE for miplevels without HiZ
>   mesa: Add a BUFFER_BITS mask for depth+stencil
>   i965/miptree: Choose the stencil layout in miptree_create_layout
>   intel/isl: Properly set SeparateStencilBufferEnable on gen5-6
>   i965: Remove some of the remnants of meta
>   i965: Remove some unneeded fields from brw_context
>   i965/blorp: Set no_depth_or_stencil correctly
>   i965/blorp: Do a depth flush/stall prior to HiZ operations
>   i965: Disable the interleaved vertex optimization when instancing
>   i965: Set step_rate == 0 for interleaved vertex buffers
>   i965: Use blorp for depth/stencil clears on gen6+
> 

Patches 2, 3, 6-10 are:

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

>  src/intel/isl/isl_emit_depth_stencil.c|  13 ++-
>  src/mesa/drivers/dri/i965/brw_blorp.c | 129
> ++
>  src/mesa/drivers/dri/i965/brw_blorp.h |   4 +
>  src/mesa/drivers/dri/i965/brw_clear.c |   6 ++
>  src/mesa/drivers/dri/i965/brw_context.h   |  13 ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c   |  12 ++-
>  src/mesa/drivers/dri/i965/brw_wm.c|   2 +-
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c   |   3 +-
>  src/mesa/drivers/dri/i965/genX_state_upload.c |   2 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |   6 +-
>  src/mesa/main/mtypes.h|   3 +
>  11 files changed, 167 insertions(+), 26 deletions(-)
> 


Re: [Mesa-dev] [PATCH 1/3] i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution

2017-05-17 Thread Samuel Iglesias Gonsálvez

Friendly reminder that patches 1 and 3 are still unreviewed.

Sam

On Fri, 2017-05-05 at 12:38 +0200, Samuel Iglesias Gonsálvez wrote:
> We are going to add a packing feature to reduce the usage of the push
> constant buffer. One consequence is that 'nr_params' can be modified
> by vec4_visitor's run() call, so we need to restore it if a visitor
> run fails before executing the fallback ones. The same applies to the
> uniform values, which would be reordered by the packing.
> 
> Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
> the dvec4 alignment and packing patch is applied.
> 
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> ---
>  src/intel/compiler/brw_vec4_gs_visitor.cpp | 26
> ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_vec4_gs_visitor.cpp
> b/src/intel/compiler/brw_vec4_gs_visitor.cpp
> index 4a8b5be30e..5fcd02831a 100644
> --- a/src/intel/compiler/brw_vec4_gs_visitor.cpp
> +++ b/src/intel/compiler/brw_vec4_gs_visitor.cpp
> @@ -868,10 +868,36 @@ brw_compile_gs(const struct brw_compiler
> *compiler, void *log_data,
>  
>   vec4_gs_visitor v(compiler, log_data, &c, prog_data,
> shader,
> mem_ctx, true /* no_spills */,
> shader_time_index);
> +
> + /* Backup 'nr_params' and 'param' as they can be modified
> by the
> +  * DUAL_OBJECT visitor. If it fails, we will run the
> fallback
> +  * (DUAL_INSTANCED or SINGLE mode) and we need to restore
> original
> +  * values.
> +  */
> + const unsigned param_count = prog_data-
> >base.base.nr_params;
> + gl_constant_value **param = ralloc_array(NULL,
> gl_constant_value*,
> +  param_count);
> + memcpy(param, prog_data->base.base.param,
> +sizeof(gl_constant_value*) * param_count);
> +
>   if (v.run()) {
> +/* Success! Backup is not needed */
> +ralloc_free(param);
>  return brw_vec4_generate_assembly(compiler, log_data,
> mem_ctx,
>    shader, &prog_data-
> >base, v.cfg,
>    final_assembly_size);
> + } else {
> +/* These variables could be modified by the execution of
> the GS
> + * visitor if it packed the uniforms in the push
> constant buffer.
> + * As it failed, we need to restore them so we can start
> again with
> + * DUAL_INSTANCED or SINGLE mode.
> + *
> + * FIXME: Could more variables be modified by this
> execution?
> + */
> +memcpy(prog_data->base.base.param, param,
> +   sizeof(gl_constant_value*) * param_count);
> +prog_data->base.base.nr_params = param_count;
> +ralloc_free(param);
>   }
>    }
> }


Re: [Mesa-dev] [PATCH 3/6] vulkan/wsi: Ad get_capabilities2 and get_formats2d interface pointers

2017-05-16 Thread Samuel Iglesias Gonsálvez

s/Ad/Add

Series is,

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

Sam

On Mon, 2017-05-15 at 08:07 -0700, Jason Ekstrand wrote:
> ---
>  src/vulkan/wsi/wsi_common.h | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/vulkan/wsi/wsi_common.h
> b/src/vulkan/wsi/wsi_common.h
> index 5e77518..8aee9c7 100644
> --- a/src/vulkan/wsi/wsi_common.h
> +++ b/src/vulkan/wsi/wsi_common.h
> @@ -87,10 +87,18 @@ struct wsi_interface {
> VkBool32* pSupported);
> VkResult (*get_capabilities)(VkIcdSurfaceBase *surface,
>  VkSurfaceCapabilitiesKHR*
> pSurfaceCapabilities);
> +   VkResult (*get_capabilities2)(VkIcdSurfaceBase *surface,
> + const void *info_next,
> + VkSurfaceCapabilities2KHR*
> pSurfaceCapabilities);
> VkResult (*get_formats)(VkIcdSurfaceBase *surface,
> struct wsi_device *wsi_device,
> uint32_t* pSurfaceFormatCount,
> VkSurfaceFormatKHR* pSurfaceFormats);
> +   VkResult (*get_formats2)(VkIcdSurfaceBase *surface,
> +struct wsi_device *wsi_device,
> +const void *info_next,
> +uint32_t* pSurfaceFormatCount,
> +VkSurfaceFormat2KHR* pSurfaceFormats);
> VkResult (*get_present_modes)(VkIcdSurfaceBase *surface,
>   uint32_t* pPresentModeCount,
>   VkPresentModeKHR* pPresentModes);


Re: [Mesa-dev] [PATCH 2/2] i965: Drop INTEL_DEBUG=stats.

2017-05-09 Thread Samuel Iglesias Gonsálvez

Series is:

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Tue, 2017-05-09 at 01:48 -0700, Kenneth Graunke wrote:
> For whatever reason, we had an INTEL_DEBUG=stats option that enabled
> various statistics counters on Gen4-5 systems.  It's been around
> forever, though I can't think of a single time that it's been useful.
> 
> On Gen6+, we enable statistics all the time because they're necessary
> to support various query object targets.  Turning them off would
> break
> those queries.
> 
> Gen4-5 don't support those queries, so the statistics counters
> generally
> aren't useful; we disabled them by default.  This patch disables them
> altogether.
> ---
>  docs/envvars.html  | 1 -
>  src/intel/common/gen_debug.c   | 1 -
>  src/intel/common/gen_debug.h   | 2 +-
>  src/mesa/drivers/dri/i965/brw_cc.c | 2 +-
>  src/mesa/drivers/dri/i965/brw_clip_state.c | 3 ---
>  src/mesa/drivers/dri/i965/brw_gs_state.c   | 3 ---
>  src/mesa/drivers/dri/i965/brw_sf_state.c   | 3 ---
>  src/mesa/drivers/dri/i965/brw_vs_state.c   | 4 
>  src/mesa/drivers/dri/i965/brw_wm_state.c   | 2 +-
>  9 files changed, 3 insertions(+), 18 deletions(-)
> 
> diff --git a/docs/envvars.html b/docs/envvars.html
> index 05afd2d5529..e075c20536a 100644
> --- a/docs/envvars.html
> +++ b/docs/envvars.html
> @@ -195,7 +195,6 @@ See the Xlib software
> driver page for details.
> spill_fs - force spilling of all registers in the scalar
> backend (useful to debug spilling code)
> spill_vec4 - force spilling of all registers in the vec4
> backend (useful to debug spilling code)
> state - emit messages about state flag tracking
> -   stats - enable statistics counters. you probably actually
> want perfmon or intel_gpu_top instead.
> sync - after sending each batch, emit a message and wait for
> that batch to finish rendering
> tcs - dump shader assembly for tessellation control
> shaders
> tes - dump shader assembly for tessellation evaluation
> shaders
> diff --git a/src/intel/common/gen_debug.c
> b/src/intel/common/gen_debug.c
> index be6fcdb3bdc..f5702f009bc 100644
> --- a/src/intel/common/gen_debug.c
> +++ b/src/intel/common/gen_debug.c
> @@ -57,7 +57,6 @@ static const struct debug_control debug_control[] =
> {
> { "vert",DEBUG_VERTS },
> { "dri", DEBUG_DRI },
> { "sf",  DEBUG_SF },
> -   { "stats",   DEBUG_STATS },
> { "wm",  DEBUG_WM },
> { "urb", DEBUG_URB },
> { "vs",  DEBUG_VS },
> diff --git a/src/intel/common/gen_debug.h
> b/src/intel/common/gen_debug.h
> index c0b74ea2afe..f7f59c9b5d8 100644
> --- a/src/intel/common/gen_debug.h
> +++ b/src/intel/common/gen_debug.h
> @@ -57,7 +57,7 @@ extern uint64_t INTEL_DEBUG;
>  #define DEBUG_VERTS   (1ull << 13)
>  #define DEBUG_DRI (1ull << 14)
>  #define DEBUG_SF  (1ull << 15)
> -#define DEBUG_STATS   (1ull << 16)
> +/* Hole - feel free to reuse  (1ull << 16) */
>  #define DEBUG_WM  (1ull << 17)
>  #define DEBUG_URB (1ull << 18)
>  #define DEBUG_VS  (1ull << 19)
> diff --git a/src/mesa/drivers/dri/i965/brw_cc.c
> b/src/mesa/drivers/dri/i965/brw_cc.c
> index 21b01f3bb18..62e81253cc9 100644
> --- a/src/mesa/drivers/dri/i965/brw_cc.c
> +++ b/src/mesa/drivers/dri/i965/brw_cc.c
> @@ -226,7 +226,7 @@ static void upload_cc_unit(struct brw_context
> *brw)
>    cc->cc2.depth_write_enable = brw_depth_writes_enabled(brw);
> }
>  
> -   if (brw->stats_wm || unlikely(INTEL_DEBUG & DEBUG_STATS))
> +   if (brw->stats_wm)
>    cc->cc5.statistics_enable = 1;
>  
> /* BRW_NEW_CC_VP */
> diff --git a/src/mesa/drivers/dri/i965/brw_clip_state.c
> b/src/mesa/drivers/dri/i965/brw_clip_state.c
> index 5e084a9961d..d5fe2b547fa 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip_state.c
> @@ -114,9 +114,6 @@ brw_upload_clip_unit(struct brw_context *brw)
>    clip->thread4.max_threads = 1 - 1;
> }
>  
> -   if (unlikely(INTEL_DEBUG & DEBUG_STATS))
> -  clip->thread4.stats_enable = 1;
> -
> /* _NEW_TRANSFORM */
> if (brw->gen == 5 || brw->is_g4x)
>    clip->clip5.userclip_enable_flags = ctx-
> >Transform.ClipPlanesEnabled;
> diff --git a/src/mesa/drivers/dri/i965/brw_gs_state.c
> b/src/mesa/drivers/dri/i965/brw_gs_state.c
> index ed9ae44bcdb..72ad044f6c7 100644
> --- a/src/mesa/drivers

[Mesa-dev] [PATCH 3/3] i965/vec4: load dvec3/4 uniforms first in the push constant buffer

2017-05-05 Thread Samuel Iglesias Gonsálvez

Reorder the uniforms so that the dvec4-aligned variables are loaded first
in the push constant buffer, followed by the vec4-aligned ones. The
packing takes into account that relocated uniforms must be aligned to
their channel size.

This fixes a bug where a dvec3/4 might be loaded partly in one GRF and
the rest in the next GRF, so the region parameters needed to read it
could break the HW rules.
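
As a rough illustration of the alignment rule (a standalone sketch, not
the code in this patch; align_up() and place_uniform() are made-up names,
and slots are modelled as plain counters of used 32-bit channels), the
slot search could look like this:

```c
#include <assert.h>

/* Round v up to a multiple of a; a must be a power of two (here 1 or 2). */
static int align_up(int v, int a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Find the lowest vec4 slot that still has room for `size` 32-bit channels,
 * starting at a channel aligned to `channel_size`. On success, record the
 * usage, store the first channel in *first_chan and return the slot index.
 * Return -1 if no slot has enough room.
 */
static int place_uniform(int *chans_used, int nr_slots,
                         int size, int channel_size, int *first_chan)
{
   for (int dst = 0; dst < nr_slots; dst++) {
      int start = align_up(chans_used[dst], channel_size);
      if (start + size <= 4) {
         *first_chan = start;
         chans_used[dst] = start + size;
         return dst;
      }
   }
   return -1;
}
```

With one float already in a slot, a double (size 2, channel size 2) is
placed at channel 2 of that slot instead of channel 1, so it never
straddles an unaligned position.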

v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
  of the push constant buffer slots, as this patch does not pack the
  uniforms.

v3:
- Implemented the push constant buffer usage optimization.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4.cpp | 107 ++--
 1 file changed, 80 insertions(+), 27 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 70487d3c15..ff9058021e 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -583,16 +583,46 @@ vec4_visitor::split_uniform_registers()
}
 }
 
+/* This function returns the register number where we placed the uniform */
+static int
+set_push_constant_loc(const int nr_uniforms, int *new_uniform_count,
+  const int src, const int size, const int channel_size,
+  int *new_loc, int *new_chan,
+  int *new_chans_used)
+{
+   int dst;
+   /* Find the lowest place we can slot this uniform in. */
+   for (dst = 0; dst < nr_uniforms; dst++) {
+  if (ALIGN(new_chans_used[dst], channel_size) + size <= 4)
+ break;
+   }
+
+   assert(dst < nr_uniforms);
+
+   new_loc[src] = dst;
+   new_chan[src] = ALIGN(new_chans_used[dst], channel_size);
+   new_chans_used[dst] = ALIGN(new_chans_used[dst], channel_size) + size;
+
+   *new_uniform_count = MAX2(*new_uniform_count, dst + 1);
+   return dst;
+}
+
 void
 vec4_visitor::pack_uniform_registers()
 {
uint8_t chans_used[this->uniforms];
int new_loc[this->uniforms];
int new_chan[this->uniforms];
+   bool is_aligned_to_dvec4[this->uniforms];
+   int new_chans_used[this->uniforms];
+   int channel_sizes[this->uniforms];
 
memset(chans_used, 0, sizeof(chans_used));
memset(new_loc, 0, sizeof(new_loc));
memset(new_chan, 0, sizeof(new_chan));
+   memset(new_chans_used, 0, sizeof(new_chans_used));
+   memset(is_aligned_to_dvec4, 0, sizeof(is_aligned_to_dvec4));
+   memset(channel_sizes, 0, sizeof(channel_sizes));
 
/* Find which uniform vectors are actually used by the program.  We
 * expect unused vector elements when we've moved array access out
@@ -622,7 +652,7 @@ vec4_visitor::pack_uniform_registers()
 continue;
 
  assert(type_sz(inst->src[i].type) % 4 == 0);
- unsigned channel_size = type_sz(inst->src[i].type) / 4;
+ int channel_size = type_sz(inst->src[i].type) / 4;
 
  int reg = inst->src[i].nr;
  for (int c = 0; c < 4; c++) {
@@ -631,10 +661,15 @@ vec4_visitor::pack_uniform_registers()
 
 unsigned channel = BRW_GET_SWZ(inst->src[i].swizzle, c) + 1;
 unsigned used = MAX2(chans_used[reg], channel * channel_size);
-if (used <= 4)
+if (used <= 4) {
chans_used[reg] = used;
-else
+   channel_sizes[reg] = MAX2(channel_sizes[reg], channel_size);
+} else {
+   is_aligned_to_dvec4[reg] = true;
+   is_aligned_to_dvec4[reg + 1] = true;
chans_used[reg + 1] = used - 4;
+   channel_sizes[reg + 1] = MAX2(channel_sizes[reg + 1], channel_size);
+}
  }
   }
 
@@ -659,42 +694,60 @@ vec4_visitor::pack_uniform_registers()
 
int new_uniform_count = 0;
 
+   /* As the uniforms are going to be reordered, take the data from a temporary
+* copy of the original param[].
+*/
+   gl_constant_value **param = ralloc_array(NULL, gl_constant_value*,
+stage_prog_data->nr_params);
+   memcpy(param, stage_prog_data->param,
+  sizeof(gl_constant_value*) * stage_prog_data->nr_params);
+
/* Now, figure out a packing of the live uniform vectors into our
-* push constants.
+* push constants. Start with dvec{3,4} because they are aligned to
+* dvec4 size (2 vec4).
 */
for (int src = 0; src < uniforms; src++) {
   int size = chans_used[src];
 
-  if (size == 0)
+  if (size == 0 || !is_aligned_to_dvec4[src])
  continue;
 
-  int dst;
-  /* Find the lowest place we can slot this uniform in. */
-  for (dst = 0; dst < src; dst++) {
- if (chans_used[dst] + size <= 4)
-break;
+  /* dvec3 are aligned to dvec4 size, apply the alignment of the size
+   * to 4 to avoid moving last component of

[Mesa-dev] [PATCH 2/3] i965/vec4: fix swizzle and writemask when loading an uniform with constant offset

2017-05-05 Thread Samuel Iglesias Gonsálvez

The code was setting an XYZW swizzle and writemask on all uniforms,
regardless of whether they were vectors or scalars, which can lead to
problems when loading them into the push constant buffer.

Moreover, the 'shift' calculation was designed to compute the offset in
DWORDs and did not take DFs into account, so the calculated swizzle
for the latter was wrong.

The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.
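
To make the 'shift' problem concrete, here is a small standalone sketch
(uniform_shift() is a made-up helper, not a function in the driver) that
computes the component index inside a 16-byte slot from a byte offset
and an element size:

```c
#include <assert.h>

/* Component index of a uniform inside its 16-byte vec4 slot, given the
 * byte offset `base` and the element size in bytes (4 for a float,
 * 8 for a double).
 */
static unsigned uniform_shift(unsigned base, unsigned type_size)
{
   return (base % 16) / type_size;
}
```

For a double at byte offset 8, dividing by 4 yields 2, while dividing by
the type size yields 1, which is the index of the second double in the
slot.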

v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
---
 src/intel/compiler/brw_vec4_nir.cpp | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp
index a82d52088a..80115aca0f 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@ -852,7 +852,8 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
* The swizzle also works in the indirect case as the generator adds
* the swizzle to the offset for us.
*/
-  unsigned shift = (nir_intrinsic_base(instr) % 16) / 4;
+  const int type_size = type_sz(src.type);
+  unsigned shift = (nir_intrinsic_base(instr) % 16) / type_size;
   assert(shift + instr->num_components <= 4);
 
   nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
@@ -860,14 +861,20 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
  /* Offsets are in bytes but they should always be multiples of 4 */
  assert(const_offset->u32[0] % 4 == 0);
 
- unsigned offset = const_offset->u32[0] + shift * 4;
+ src.swizzle = brw_swizzle_for_size(instr->num_components);
+ dest.writemask = brw_writemask_for_size(instr->num_components);
+ unsigned offset = const_offset->u32[0] + shift * type_size;
  src.offset = ROUND_DOWN_TO(offset, 16);
- shift = (offset % 16) / 4;
+ shift = (offset % 16) / type_size;
+ assert(shift + instr->num_components <= 4);
  src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
 
  emit(MOV(dest, src));
   } else {
- src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
+ /* Uniform arrays are vec4 aligned, because of std140 alignment
+  * rules.
+  */
+ assert(shift == 0);
 
  src_reg indirect = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_UD, 1);
 
-- 
2.11.0


[Mesa-dev] [PATCH 1/3] i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution

2017-05-05 Thread Samuel Iglesias Gonsálvez

We are going to add a packing feature to reduce the usage of the push
constant buffer. One consequence is that 'nr_params' can be modified by
vec4_visitor's run() call, so we need to restore it if a visitor run
fails before executing the fallback ones. The same applies to the
uniform values, which would be reordered by the packing.

Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
the dvec4 alignment and packing patch is applied.
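
The backup/restore idea can be sketched outside of Mesa as follows. This
is only an illustration under assumptions: struct params,
run_with_restore() and failing_attempt() are hypothetical names, and
malloc()/free() stand in for the ralloc calls used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parts of prog_data touched by packing. */
struct params {
   unsigned nr_params;
   const float **param;
};

/* Run `attempt`; if it reports failure, put `p` back to its state on entry.
 * Returns what `attempt` returned, or false without running it if the
 * backup could not be allocated.
 */
static bool run_with_restore(struct params *p,
                             bool (*attempt)(struct params *))
{
   const unsigned count = p->nr_params;
   const size_t bytes = sizeof(*p->param) * count;
   const float **backup = malloc(bytes ? bytes : 1);
   if (backup == NULL)
      return false;
   memcpy(backup, p->param, bytes);

   bool ok = attempt(p);
   if (!ok) {
      /* The attempt may have reordered or dropped entries: undo that. */
      memcpy(p->param, backup, bytes);
      p->nr_params = count;
   }
   free(backup);
   return ok;
}

/* Hypothetical failing attempt that reorders and shrinks the params. */
static bool failing_attempt(struct params *p)
{
   const float *tmp = p->param[0];
   p->param[0] = p->param[1];
   p->param[1] = tmp;
   p->nr_params = 1;
   return false;
}
```

After a failed attempt the caller sees the same count and the same
pointer order as before, so a fallback run starts from a clean state.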

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4_gs_visitor.cpp | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/src/intel/compiler/brw_vec4_gs_visitor.cpp b/src/intel/compiler/brw_vec4_gs_visitor.cpp
index 4a8b5be30e..5fcd02831a 100644
--- a/src/intel/compiler/brw_vec4_gs_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_gs_visitor.cpp
@@ -868,10 +868,36 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data,
 
  vec4_gs_visitor v(compiler, log_data, &c, prog_data, shader,
mem_ctx, true /* no_spills */, shader_time_index);
+
+ /* Backup 'nr_params' and 'param' as they can be modified by the
+  * DUAL_OBJECT visitor. If it fails, we will run the fallback
+  * (DUAL_INSTANCED or SINGLE mode) and we need to restore original
+  * values.
+  */
+ const unsigned param_count = prog_data->base.base.nr_params;
+ gl_constant_value **param = ralloc_array(NULL, gl_constant_value*,
+  param_count);
+ memcpy(param, prog_data->base.base.param,
+sizeof(gl_constant_value*) * param_count);
+
  if (v.run()) {
+/* Success! Backup is not needed */
+ralloc_free(param);
 return brw_vec4_generate_assembly(compiler, log_data, mem_ctx,
   shader, &prog_data->base, v.cfg,
   final_assembly_size);
+ } else {
+/* These variables could be modified by the execution of the GS
+ * visitor if it packed the uniforms in the push constant buffer.
+ * As it failed, we need to restore them so we can start again with
+ * DUAL_INSTANCED or SINGLE mode.
+ *
+ * FIXME: Could more variables be modified by this execution?
+ */
+memcpy(prog_data->base.base.param, param,
+   sizeof(gl_constant_value*) * param_count);
+prog_data->base.base.nr_params = param_count;
+ralloc_free(param);
  }
   }
}
-- 
2.11.0


Re: [Mesa-dev] [PATCH 2/2] anv: document that anv_gem_mmap returns MAP_FAILED on error

2017-05-05 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Thu, 2017-05-04 at 15:55 +0100, Emil Velikov wrote:
> From: Emil Velikov <emil.veli...@collabora.com>
> 
> Signed-off-by: Emil Velikov <emil.veli...@collabora.com>
> ---
>  src/intel/vulkan/anv_gem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
> index 185086fefcc..68c5e611342 100644
> --- a/src/intel/vulkan/anv_gem.c
> +++ b/src/intel/vulkan/anv_gem.c
> @@ -74,7 +74,7 @@ anv_gem_close(struct anv_device *device, uint32_t
> gem_handle)
>  }
>  
>  /**
> - * Wrapper around DRM_IOCTL_I915_GEM_MMAP.
> + * Wrapper around DRM_IOCTL_I915_GEM_MMAP. Returns MAP_FAILED on
> error.
>   */
>  void*
>  anv_gem_mmap(struct anv_device *device, uint32_t gem_handle,


Re: [Mesa-dev] [PATCH 1/2] anv: fix anv_gem_mmap comment to not mention NULL

2017-05-04 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Thu, 2017-05-04 at 15:55 +0100, Emil Velikov wrote:
> From: Emil Velikov <emil.veli...@collabora.com>
> 
> The function cannot return NULL, update the comment accordingly.
> 
> Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping
> error")
> Cc: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> Signed-off-by: Emil Velikov <emil.veli...@collabora.com>
> ---
>  src/intel/vulkan/anv_image.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_image.c
> b/src/intel/vulkan/anv_image.c
> index 55402b25571..d21e055f020 100644
> --- a/src/intel/vulkan/anv_image.c
> +++ b/src/intel/vulkan/anv_image.c
> @@ -348,7 +348,7 @@ VkResult anv_BindImageMemory(
> if (image->aux_surface.isl.size > 0) {
>  
>    /* The offset and size must be a multiple of 4K or else the
> -   * anv_gem_mmap call below will return NULL.
> +   * anv_gem_mmap call below will fail.
> */
>    assert((image->offset + image->aux_surface.offset) % 4096 ==
> 0);
>    assert(image->aux_surface.isl.size % 4096 == 0);


Re: [Mesa-dev] [PATCH] anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST, DEVICE}_MEMORY on error

2017-05-04 Thread Samuel Iglesias Gonsálvez

On Thu, 2017-05-04 at 14:03 +0100, Emil Velikov wrote:
> On 4 May 2017 at 11:01, Samuel Iglesias Gonsálvez <siglesias@igalia.c
> om> wrote:
> > Fixes returned value changed by b546c9d.
> > 
> 
> According to the spec we get VK_ERROR_OUT_OF_HOST_MEMORY or
> VK_ERROR_OUT_OF_DEVICE_MEMORY on vkBindImageMemory failure.
> I should have explicitly checked it closer :-\
> 

Yeah, I realised it after pushing it :-/

> > Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping
> > error")
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > Cc: "17.0 17.1" <mesa-sta...@lists.freedesktop.org>
> 
> Reviewed-by: Emil Velikov <emil.veli...@collabora.com>
> 

Pushed.

Thanks!

Sam


[Mesa-dev] [PATCH] anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST, DEVICE}_MEMORY on error

2017-05-04 Thread Samuel Iglesias Gonsálvez

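Fixes the return value changed by b546c9d: a failed mapping is now
reported as VK_ERROR_OUT_OF_HOST_MEMORY instead of
VK_ERROR_MEMORY_MAP_FAILED.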

Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.0 17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/vulkan/anv_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 36f5d47e1a..55402b2557 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -365,7 +365,7 @@ VkResult anv_BindImageMemory(
device->info.has_llc ? 0 : I915_MMAP_WC);
 
   if (map == MAP_FAILED)
- return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
   memset(map, 0, image->aux_surface.isl.size);
 
-- 
2.11.0


Re: [Mesa-dev] [PATCH v3 2/2] i965/vec4: load dvec3/4 uniforms first in the push constant buffer

2017-05-04 Thread Samuel Iglesias Gonsálvez

On Wed, 2017-05-03 at 16:47 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > Reorder the uniforms to load first the dvec4-aligned variables
> > in the push constant buffer and then push the vec4-aligned ones.
> > 
> > This fixes a bug where a dvec3/4 might be loaded partly in one GRF
> > and the rest in the next GRF, so the region parameters needed to
> > read it could break the HW rules.
> > 
> > v2:
> > - Fix broken logic.
> > - Add a comment to explain what should be needed to optimise the
> > usage
> > of the push constant buffer slots, as this patch does not pack the
> > uniforms.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> > ---
> >  src/intel/compiler/brw_vec4.cpp | 97
> > +++--
> >  1 file changed, 74 insertions(+), 23 deletions(-)
> > 
> > diff --git a/src/intel/compiler/brw_vec4.cpp
> > b/src/intel/compiler/brw_vec4.cpp
> > index 0909ddb586..18bfd48fa1 100644
> > --- a/src/intel/compiler/brw_vec4.cpp
> > +++ b/src/intel/compiler/brw_vec4.cpp
> > @@ -583,16 +583,44 @@ vec4_visitor::split_uniform_registers()
> > }
> >  }
> >  
> > +/* This function returns the register number where we placed the
> > uniform */
> > +static int
> > +set_push_constant_loc(const int nr_uniforms, int
> > *new_uniform_count,
> > +  const int src, const int size,
> > +  int *new_loc, int *new_chan,
> > +  int *new_chans_used)
> > +{
> > +   int dst;
> > +   /* Find the lowest place we can slot this uniform in. */
> > +   for (dst = 0; dst < nr_uniforms; dst++) {
> > +  if (new_chans_used[dst] + size <= 4)
> > + break;
> > +   }
> > +
> > +   assert(dst < nr_uniforms);
> > +
> > +   new_loc[src] = dst;
> > +   new_chan[src] = new_chans_used[dst];
> > +   new_chans_used[dst] += size;
> > +
> > +   *new_uniform_count = MAX2(*new_uniform_count, dst + 1);
> > +   return dst;
> > +}
> > +
> >  void
> >  vec4_visitor::pack_uniform_registers()
> >  {
> > uint8_t chans_used[this->uniforms];
> > int new_loc[this->uniforms];
> > int new_chan[this->uniforms];
> > +   bool is_aligned_to_dvec4[this->uniforms];
> > +   int new_chans_used[this->uniforms];
> >  
> > memset(chans_used, 0, sizeof(chans_used));
> > memset(new_loc, 0, sizeof(new_loc));
> > memset(new_chan, 0, sizeof(new_chan));
> > +   memset(new_chans_used, 0, sizeof(new_chans_used));
> > +   memset(is_aligned_to_dvec4, 0, sizeof(is_aligned_to_dvec4));
> >  
> > /* Find which uniform vectors are actually used by the
> > program.  We
> >  * expect unused vector elements when we've moved array access
> > out
> > @@ -631,10 +659,19 @@ vec4_visitor::pack_uniform_registers()
> >  
> >  unsigned channel = BRW_GET_SWZ(inst->src[i].swizzle,
> > c) + 1;
> >  unsigned used = MAX2(chans_used[reg], channel *
> > channel_size);
> > -if (used <= 4)
> > -   chans_used[reg] = used;
> > -else
> > -   chans_used[reg + 1] = used - 4;
> > +/* FIXME: Marked all channels as used, so each uniform
> > will
> > + * fully use one or two vec4s. If we want to pack
> > them, we need
> > + * to, among other changes, set chans_used[reg] =
> > used;
> > + * chans_used[reg+1] = used - 4; and fix the swizzle
> > at the
> > + * end in order to set the proper location.
> > + */
> > +if (used <= 4) {
> > +   chans_used[reg] = 4;
> 
> Uhm...  So this change prevents the uniform packing pass from
> actually
> packing anything?  Might affect more applications negatively than
> broken
> FP64 would.  Are you planning to send a v3 that fixes the issue
> without
> disabling the optimization?

Yes, I am planning to send a v3 of this patch with the optimization in
place.

>   May be worth holding this off until then.
> Even if that means it will miss the v17.1 release it will probably
> make
> it for the next bug-fix release.
> 

OK, thanks!

Sam


> > +} else {
> > +   is_aligned_to_dvec4[reg] = true;
> > +   is_aligned_to_dvec4[reg + 1]

Re: [Mesa-dev] [PATCH] anv: anv_gem_mmap() returns MAP_FAILED as mapping error

2017-05-04 Thread Samuel Iglesias Gonsálvez

On Wed, 2017-05-03 at 14:37 +0100, Emil Velikov wrote:
> On 3 May 2017 at 14:26, Samuel Iglesias Gonsálvez <siglesias@igalia.c
> om> wrote:
> > On Wed, 2017-05-03 at 14:15 +0100, Emil Velikov wrote:
> > > On 3 May 2017 at 12:33, Samuel Iglesias Gonsálvez <siglesias@igal
> > > ia.c
> > > om> wrote:
> > > > On Wed, 2017-05-03 at 11:50 +0100, Emil Velikov wrote:
> > > > > Hi Samuel,
> > > > > 
> > > > > On 3 May 2017 at 08:57, Samuel Iglesias Gonsálvez <siglesias@
> > > > > igal
> > > > > ia.c
> > > > > om> wrote:
> > > > > > Take it into account when checking if the mapping failed.
> > > > > > 
> > > > > > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.
> > > > > > com>
> > > > > > ---
> > > > > >  src/intel/vulkan/anv_allocator.c | 2 +-
> > > > > >  src/intel/vulkan/anv_image.c | 4 
> > > > > >  2 files changed, 5 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/src/intel/vulkan/anv_allocator.c
> > > > > > b/src/intel/vulkan/anv_allocator.c
> > > > > > index 554ca4ac5f..6ab2da5d64 100644
> > > > > > --- a/src/intel/vulkan/anv_allocator.c
> > > > > > +++ b/src/intel/vulkan/anv_allocator.c
> > > > > > @@ -889,7 +889,7 @@ anv_bo_pool_alloc(struct anv_bo_pool
> > > > > > *pool,
> > > > > > struct anv_bo *bo, uint32_t size)
> > > > > > assert(new_bo.size == pow2_size);
> > > > > > 
> > > > > > new_bo.map = anv_gem_mmap(pool->device,
> > > > > > new_bo.gem_handle,
> > > > > > 0,
> > > > > > pow2_size, 0);
> > > > > > -   if (new_bo.map == NULL) {
> > > > > > +   if (new_bo.map == MAP_FAILED) {
> > > > > >    anv_gem_close(pool->device, new_bo.gem_handle);
> > > > > >    return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
> > > > > > }
> > > > > > diff --git a/src/intel/vulkan/anv_image.c
> > > > > > b/src/intel/vulkan/anv_image.c
> > > > > > index 4874f2f3d3..d7d53f96a4 100644
> > > > > > --- a/src/intel/vulkan/anv_image.c
> > > > > > +++ b/src/intel/vulkan/anv_image.c
> > > > > > @@ -26,6 +26,7 @@
> > > > > >  #include 
> > > > > >  #include 
> > > > > >  #include 
> > > > > > +#include 
> > > > > > 
> > > > > >  #include "anv_private.h"
> > > > > >  #include "util/debug.h"
> > > > > > @@ -369,6 +370,9 @@ VkResult anv_BindImageMemory(
> > > > > >    if (map == NULL)
> > > > > >   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> > > > > > 
> > > > > 
> > > > > Shouldn't we drop this check alongside its comment? There's a
> > > > > another
> > > > > comment a few lines further up which should be updated as
> > > > > well.
> > > > 
> > > > I thought about doing the same when writing the patch. However,
> > > > that
> > > > comment gave me some doubts which is the reason I added Jason
> > > > in
> > > > Cc.
> > > > Let's see if he agrees that it is safe to remove this check
> > > > (and
> > > > its
> > > > comment), as he was the author of those lines.
> > > > 
> > > 
> > > Most likely it's a typo since if the kernel returns NULL w/o
> > > reporting
> > > an error we're in the deep.

Right.

> > > Regardless, please add the following tags.
> > > 
> > > Fixes: 6f3e3c715a7 ("vk/allocator: Add a BO pool")
> > > Fixes: 9919a2d34de ("anv/image: Memset hiz surfaces to 0 when
> > > binding
> > > memory")
> > > 
> > 
> > OK, I will add them.
> > 
> > Should I add the mesa-stable tag to this or you will pick it
> > without
> > it? If I need to add it... then it would be "17.0 17.1", right?
> > 
> 
> The Fixes tag somewhat supersedes mesa-stable, as the former
> nominates
> the patch for the correct branch(es).
> Adding mesa-stable won't hurt though :-)
> 

I have finally pushed this patch to master, so it will be part of the next
Mesa release.

Thanks,

Sam

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] anv: anv_gem_mmap() returns MAP_FAILED as mapping error

2017-05-03 Thread Samuel Iglesias Gonsálvez

On Wed, 2017-05-03 at 14:15 +0100, Emil Velikov wrote:
> On 3 May 2017 at 12:33, Samuel Iglesias Gonsálvez <siglesias@igalia.c
> om> wrote:
> > On Wed, 2017-05-03 at 11:50 +0100, Emil Velikov wrote:
> > > Hi Samuel,
> > > 
> > > On 3 May 2017 at 08:57, Samuel Iglesias Gonsálvez <siglesias@igal
> > > ia.c
> > > om> wrote:
> > > > Take it into account when checking if the mapping failed.
> > > > 
> > > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > > ---
> > > >  src/intel/vulkan/anv_allocator.c | 2 +-
> > > >  src/intel/vulkan/anv_image.c | 4 
> > > >  2 files changed, 5 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/src/intel/vulkan/anv_allocator.c
> > > > b/src/intel/vulkan/anv_allocator.c
> > > > index 554ca4ac5f..6ab2da5d64 100644
> > > > --- a/src/intel/vulkan/anv_allocator.c
> > > > +++ b/src/intel/vulkan/anv_allocator.c
> > > > @@ -889,7 +889,7 @@ anv_bo_pool_alloc(struct anv_bo_pool *pool,
> > > > struct anv_bo *bo, uint32_t size)
> > > > assert(new_bo.size == pow2_size);
> > > > 
> > > > new_bo.map = anv_gem_mmap(pool->device, new_bo.gem_handle,
> > > > 0,
> > > > pow2_size, 0);
> > > > -   if (new_bo.map == NULL) {
> > > > +   if (new_bo.map == MAP_FAILED) {
> > > >    anv_gem_close(pool->device, new_bo.gem_handle);
> > > >    return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
> > > > }
> > > > diff --git a/src/intel/vulkan/anv_image.c
> > > > b/src/intel/vulkan/anv_image.c
> > > > index 4874f2f3d3..d7d53f96a4 100644
> > > > --- a/src/intel/vulkan/anv_image.c
> > > > +++ b/src/intel/vulkan/anv_image.c
> > > > @@ -26,6 +26,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > > 
> > > >  #include "anv_private.h"
> > > >  #include "util/debug.h"
> > > > @@ -369,6 +370,9 @@ VkResult anv_BindImageMemory(
> > > >    if (map == NULL)
> > > >   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> > > > 
> > > 
> > > > Shouldn't we drop this check alongside its comment? There's
> > > > another
> > > comment a few lines further up which should be updated as well.
> > 
> > I thought about doing the same when writing the patch. However,
> > that
> > comment gave me some doubts which is the reason I added Jason in
> > Cc.
> > Let's see if he agrees that it is safe to remove this check (and
> > its
> > comment), as he was the author of those lines.
> > 
> 
> Most likely it's a typo since if the kernel returns NULL w/o
> reporting
> an error we're in the deep.
> Regardless, please add the following tags.
> 
> Fixes: 6f3e3c715a7 ("vk/allocator: Add a BO pool")
> Fixes: 9919a2d34de ("anv/image: Memset hiz surfaces to 0 when binding
> memory")
> 

OK, I will add them.

Should I add the mesa-stable tag to this, or will you pick it up without
it? If I need to add it, then it would be "17.0 17.1", right?

Sam


[Mesa-dev] [PATCH v3 2/2] i965/vec4: load dvec3/4 uniforms first in the push constant buffer

2017-05-03 Thread Samuel Iglesias Gonsálvez

Reorder the uniforms so that the dvec4-aligned variables are loaded first
in the push constant buffer, followed by the vec4-aligned ones.

This fixes a bug where a dvec3/4 might be loaded partly in one GRF and
partly in the next one, so the region parameters needed to read it could
break the HW rules.

v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
of the push constant buffer slots, as this patch does not pack the
uniforms.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4.cpp | 97 +++--
 1 file changed, 74 insertions(+), 23 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 0909ddb586..18bfd48fa1 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -583,16 +583,44 @@ vec4_visitor::split_uniform_registers()
}
 }
 
+/* This function returns the register number where we placed the uniform */
+static int
+set_push_constant_loc(const int nr_uniforms, int *new_uniform_count,
+  const int src, const int size,
+  int *new_loc, int *new_chan,
+  int *new_chans_used)
+{
+   int dst;
+   /* Find the lowest place we can slot this uniform in. */
+   for (dst = 0; dst < nr_uniforms; dst++) {
+  if (new_chans_used[dst] + size <= 4)
+ break;
+   }
+
+   assert(dst < nr_uniforms);
+
+   new_loc[src] = dst;
+   new_chan[src] = new_chans_used[dst];
+   new_chans_used[dst] += size;
+
+   *new_uniform_count = MAX2(*new_uniform_count, dst + 1);
+   return dst;
+}
+
 void
 vec4_visitor::pack_uniform_registers()
 {
uint8_t chans_used[this->uniforms];
int new_loc[this->uniforms];
int new_chan[this->uniforms];
+   bool is_aligned_to_dvec4[this->uniforms];
+   int new_chans_used[this->uniforms];
 
memset(chans_used, 0, sizeof(chans_used));
memset(new_loc, 0, sizeof(new_loc));
memset(new_chan, 0, sizeof(new_chan));
+   memset(new_chans_used, 0, sizeof(new_chans_used));
+   memset(is_aligned_to_dvec4, 0, sizeof(is_aligned_to_dvec4));
 
/* Find which uniform vectors are actually used by the program.  We
 * expect unused vector elements when we've moved array access out
@@ -631,10 +659,19 @@ vec4_visitor::pack_uniform_registers()
 
 unsigned channel = BRW_GET_SWZ(inst->src[i].swizzle, c) + 1;
 unsigned used = MAX2(chans_used[reg], channel * channel_size);
-if (used <= 4)
-   chans_used[reg] = used;
-else
-   chans_used[reg + 1] = used - 4;
+/* FIXME: Marked all channels as used, so each uniform will
+ * fully use one or two vec4s. If we want to pack them, we need
+ * to, among other changes, set chans_used[reg] = used;
+ * chans_used[reg+1] = used - 4; and fix the swizzle at the
+ * end in order to set the proper location.
+ */
+if (used <= 4) {
+   chans_used[reg] = 4;
+} else {
+   is_aligned_to_dvec4[reg] = true;
+   is_aligned_to_dvec4[reg + 1] = true;
+   chans_used[reg + 1] = 4;
+}
  }
   }
 
@@ -659,42 +696,56 @@ vec4_visitor::pack_uniform_registers()
 
int new_uniform_count = 0;
 
+   /* As the uniforms are going to be reordered, take the data from a temporary
+* copy of the original param[].
+*/
+   gl_constant_value **param = ralloc_array(NULL, gl_constant_value*,
+stage_prog_data->nr_params);
+   memcpy(param, stage_prog_data->param,
+  sizeof(gl_constant_value*) * stage_prog_data->nr_params);
+
/* Now, figure out a packing of the live uniform vectors into our
-* push constants.
+* push constants. Start with dvec{3,4} because they are aligned to
+* dvec4 size (2 vec4).
 */
for (int src = 0; src < uniforms; src++) {
   int size = chans_used[src];
 
-  if (size == 0)
+  if (size == 0 || !is_aligned_to_dvec4[src])
  continue;
 
-  int dst;
-  /* Find the lowest place we can slot this uniform in. */
-  for (dst = 0; dst < src; dst++) {
- if (chans_used[dst] + size <= 4)
-break;
+  int dst = set_push_constant_loc(uniforms, &new_uniform_count,
+  src, size, new_loc, new_chan,
+  new_chans_used);
+  if (dst != src) {
+ /* Move the references to the data */
+ for (int j = 0; j < size; j++) {
+stage_prog_data->param[dst * 4 + new_chan[src] + j] =
+   param[src * 4 + j];
+ }
   }
+   }
 
-  if (src == dst) {
- new_loc[src] = dst;
- new_chan[src] = 0;
-  } else {
- new_loc[src]

Re: [Mesa-dev] [PATCH v2 1/2] i965/vec4: fix swizzle and writemask when loading an uniform with constant offset

2017-05-03 Thread Samuel Iglesias Gonsálvez

On Tue, 2017-05-02 at 12:23 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > On Friday, 28 April 2017 16:08:35 Francisco Jerez wrote:
> > > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > > It was setting XYWZ swizzle and writemask to all uniforms, no
> > > > matter if
> > > > they were a vector or scalar, so this can lead to problems when
> > > > loading
> > > > them to the push constant buffer.
> > > > 
> > > > Moreover, 'shift' calculation was designed to calculate the
> > > > offset in
> > > > DWORDS, but it doesn't take into account DFs, so the calculated
> > > > swizzle
> > > > for the later ones was wrong.
> > > > 
> > > > The indirect case is not changed because MOV INDIRECT will
> > > > write
> > > > to all components. Added an assert to verify that these
> > > > uniforms
> > > > are aligned.
> > > > 
> > > > v2:
> > > > - Fix 'shift' calculation (Curro)
> > > > - Set both swizzle and writemask.
> > > > - Add assert(shift == 0) for the indirect case.
> > > > 
> > > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > > Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> > > 
> > > Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> > > 
> > 
> > Thanks!
> > 
> > What about the second patch? Is it OK for you?
> > 
> 
> Not 100% certain it's doing the right thing, but feel free to add my:
> 
> Acked-by: Francisco Jerez <curroje...@riseup.net>
> 
> to PATCH 2 of this series.
> 

Actually, it was not doing the thing I thought it would do: it was
working because I was assigning consecutive locations but all the
intended logic was broken :-(

I will send a new version of the patch today fixing this for review, so
it can enter into the mesa release. This new patch does not optimise
the use of the slots (vec4-size) in the push constant buffer, meaning
that each uniform will "use" the whole slot (or two consecutive slots
in case of dvec3/dvec4) as it was done before the first patch of this
series; i.e. the channels used are always 4 per slot, but I use the
swizzle and writemask set by this first patch of the series only to
identify dvec3 and dvec4 uniforms, in order to align them properly.

I will add a comment saying what should be done to add that
optimisation, so this can be done in a follow-up patch after the
release.
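
To illustrate the ordering described above, here is a minimal sketch. It is
not the Mesa code: assign_slots() and its arguments are hypothetical, and it
assumes that each live uniform needs either one vec4 slot or two consecutive
slots (dvec3/dvec4), and that two slots fit in one GRF.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Mesa.
 *
 * slots_needed[i] is 0 (unused), 1 (vec4-aligned) or 2 (dvec4-aligned).
 * first_slot[i] receives the first push constant slot of uniform i, or -1
 * if the uniform is unused. Returns the total number of slots consumed.
 */
static int
assign_slots(const int *slots_needed, size_t count, int *first_slot)
{
   int next = 0;

   for (size_t i = 0; i < count; i++)
      first_slot[i] = -1;

   /* Pass 1: two-slot uniforms go first, so each one starts on an even
    * slot and never straddles two GRFs.
    */
   for (size_t i = 0; i < count; i++) {
      if (slots_needed[i] == 2) {
         first_slot[i] = next;
         next += 2;
      }
   }

   /* Pass 2: single-slot uniforms follow in their original order. */
   for (size_t i = 0; i < count; i++) {
      if (slots_needed[i] == 1) {
         first_slot[i] = next;
         next += 1;
      }
   }

   return next;
}
```

For the input {1, 2, 1, 2}, the two-slot uniforms land in slots 0 and 2 and
the single-slot ones in slots 4 and 5. As in the new patch, nothing is packed
inside a slot; only the order changes.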

Sam




Re: [Mesa-dev] [PATCH] anv: anv_gem_mmap() returns MAP_FAILED as mapping error

2017-05-03 Thread Samuel Iglesias Gonsálvez

On Wed, 2017-05-03 at 11:50 +0100, Emil Velikov wrote:
> Hi Samuel,
> 
> On 3 May 2017 at 08:57, Samuel Iglesias Gonsálvez <siglesias@igalia.c
> om> wrote:
> > Take it into account when checking if the mapping failed.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > ---
> >  src/intel/vulkan/anv_allocator.c | 2 +-
> >  src/intel/vulkan/anv_image.c | 4 
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/intel/vulkan/anv_allocator.c
> > b/src/intel/vulkan/anv_allocator.c
> > index 554ca4ac5f..6ab2da5d64 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -889,7 +889,7 @@ anv_bo_pool_alloc(struct anv_bo_pool *pool,
> > struct anv_bo *bo, uint32_t size)
> > assert(new_bo.size == pow2_size);
> > 
> > new_bo.map = anv_gem_mmap(pool->device, new_bo.gem_handle, 0,
> > pow2_size, 0);
> > -   if (new_bo.map == NULL) {
> > +   if (new_bo.map == MAP_FAILED) {
> >    anv_gem_close(pool->device, new_bo.gem_handle);
> >    return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
> > }
> > diff --git a/src/intel/vulkan/anv_image.c
> > b/src/intel/vulkan/anv_image.c
> > index 4874f2f3d3..d7d53f96a4 100644
> > --- a/src/intel/vulkan/anv_image.c
> > +++ b/src/intel/vulkan/anv_image.c
> > @@ -26,6 +26,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > 
> >  #include "anv_private.h"
> >  #include "util/debug.h"
> > @@ -369,6 +370,9 @@ VkResult anv_BindImageMemory(
> >    if (map == NULL)
> >   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> > 
> 
> Shouldn't we drop this check alongside its comment? There's another
> comment a few lines further up which should be updated as well.

I thought about doing the same when writing the patch. However, that
comment gave me some doubts, which is why I added Jason in Cc.
Let's see if he agrees that it is safe to remove this check (and its
comment), as he was the author of those lines.
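
For context, here is a minimal sketch of why a NULL check alone cannot catch
a failed mapping. It assumes that anv_gem_mmap() hands back the return value
of mmap() unchanged. The helper names are hypothetical, and the sketch uses
plain POSIX mmap() on Linux rather than any driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper (not part of anv): mmap() reports failure with
 * MAP_FAILED, which is (void *) -1 on Linux, so a comparison against NULL
 * does not detect it.
 */
static bool
mapping_succeeded(const void *map)
{
   return map != MAP_FAILED;
}

/* Provoke a failure on purpose: a file-backed mapping request with an
 * invalid file descriptor makes mmap() fail.
 */
static void *
try_bad_mapping(void)
{
   return mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, -1, 0);
}
```

Here mapping_succeeded(try_bad_mapping()) is false even though the returned
pointer is not NULL, which is the case the old check missed.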

> With that
> Reviewed-by: Emil Velikov <emil.veli...@collabora.com>
> 
> Some suggestions for follow-up patches:
>  - check if anv_gem_mmap in genX_query.c fails
>  - document the anv_gem_mmap return value
> 
> -Emil

Thanks,

Sam


[Mesa-dev] [PATCH] anv: anv_gem_mmap() returns MAP_FAILED as mapping error

2017-05-03 Thread Samuel Iglesias Gonsálvez

Take it into account when checking if the mapping failed.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/vulkan/anv_allocator.c | 2 +-
 src/intel/vulkan/anv_image.c | 4 
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 554ca4ac5f..6ab2da5d64 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -889,7 +889,7 @@ anv_bo_pool_alloc(struct anv_bo_pool *pool, struct anv_bo 
*bo, uint32_t size)
assert(new_bo.size == pow2_size);
 
new_bo.map = anv_gem_mmap(pool->device, new_bo.gem_handle, 0, pow2_size, 0);
-   if (new_bo.map == NULL) {
+   if (new_bo.map == MAP_FAILED) {
   anv_gem_close(pool->device, new_bo.gem_handle);
   return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
}
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 4874f2f3d3..d7d53f96a4 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "anv_private.h"
 #include "util/debug.h"
@@ -369,6 +370,9 @@ VkResult anv_BindImageMemory(
   if (map == NULL)
  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
+  if (map == MAP_FAILED)
+ return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
+
   memset(map, 0, image->aux_surface.isl.size);
 
   anv_gem_munmap(map, image->aux_surface.isl.size);
-- 
2.11.0


Re: [Mesa-dev] [Mesa-stable] [PATCH 2/3] i965/vec4: fix register width for DF VGRF and UNIFORM

2017-05-02 Thread Samuel Iglesias Gonsálvez

On Mon, 2017-05-01 at 14:55 +0200, Samuel Iglesias Gonsálvez wrote:
> On Friday, 28 April 2017 16:27:56 Francisco Jerez wrote:
> > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > On gen7, the swizzles used in DF align16 instructions works for
> > > element
> > > size of 32 bits, so we can address only 2 consecutive DFs. As we
> > > assumed
> > > that in the rest of the code and prepare the instructions for
> > > this
> > > (scalarize_df()), we need to set it to two again.
> > > 
> > > However, for DF align1 instructions, a width of 2 is wrong as we
> > > are not
> > > reading the data we want. For example, an uniform would have a
> > > region of
> > > <0, 2, 1> so it would repeat the first 2 DFs, when we wanted to
> > > access
> > > to the first 4.
> > > 
> > > This patch sets the default one to 4 and then modifies the width
> > > of
> > > align16 instruction's DF sources when we translate the logical
> > > swizzle
> > > to the physical one.
> > > 
> > > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > > Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> > > ---
> > > 
> > >  src/intel/compiler/brw_vec4.cpp | 13 -
> > >  1 file changed, 8 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/src/intel/compiler/brw_vec4.cpp
> > > b/src/intel/compiler/brw_vec4.cpp index 95f96ea69c0..8b755e1b75e
> > > 100644
> > > --- a/src/intel/compiler/brw_vec4.cpp
> > > +++ b/src/intel/compiler/brw_vec4.cpp
> > > @@ -2003,9 +2003,7 @@ vec4_visitor::convert_to_hw_regs()
> > > 
> > >   struct brw_reg reg;
> > >   switch (src.file) {
> > >   case VGRF: {
> > > 
> > > -const unsigned type_size = type_sz(src.type);
> > > -const unsigned width = REG_SIZE / 2 / MAX2(4,
> > > type_size);
> > > -reg = byte_offset(brw_vecn_grf(width, src.nr, 0),
> > > src.offset);
> > > +reg = byte_offset(brw_vecn_grf(4, src.nr, 0),
> > > src.offset);
> > > 
> > >  reg.type = src.type;
> > >  reg.abs = src.abs;
> > >  reg.negate = src.negate;
> > > 
> > > @@ -2013,12 +2011,11 @@ vec4_visitor::convert_to_hw_regs()
> > > 
> > >   }
> > >   
> > >   case UNIFORM: {
> > > 
> > > -const unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));
> > >  reg = stride(byte_offset(brw_vec4_grf(
> > >  
> > >  prog_data->base.dispatch_grf_start_reg +
> > >  src.nr / 2, src.nr % 2 * 4),
> > >   
> > >   src.offset),
> > > 
> > > - 0, width, 1);
> > > + 0, 4, 1);
> > > 
> > >  reg.type = src.type;
> > >  reg.abs = src.abs;
> > >  reg.negate = src.negate;
> > > 
> > > @@ -2576,6 +2573,12 @@ vec4_visitor::apply_logical_swizzle(struct brw_reg *hw_reg,
> > > assert(brw_is_single_value_swizzle(reg.swizzle) ||
> > > 
> > >    is_supported_64bit_region(inst, arg));
> > > 
> > > +   /* Apply the region <2, 2, 1> for GRF or <0, 2, 1> for uniforms, as align16
> > > +* HW can only do 32-bit swizzle channels.
> > > +*/
> > > +   if (reg.file == UNIFORM || reg.file == VGRF)
> > > +  hw_reg->width = BRW_WIDTH_2;
> > 
> > Any reason this is conditional on the register file?  Originally we
> > were
> > only setting the width to 2 for the UNIFORM and VGRF files, but
> > that was
> > probably an oversight...
> > 
> 
> No reason, this was an oversight. I have removed locally the
> conditional.
> 
> Do you get your R-b to it then?
> 

s/get/give

> Sam
> 
> > > +
> > > 
> > > if (is_supported_64bit_region(inst, arg) &&
> > > 
> > > !is_gen7_supported_64bit_swizzle(inst, arg)) {
> > >    
> > >    /* Supported 64-bit swizzles are those such that their
> > > first two
> > > 
> > > ___
> > > mesa-stable mailing list
> > > mesa-sta...@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-stable


Re: [Mesa-dev] [PATCH] anv/query: handle more cases of 'out of host memory'

2017-05-02 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>

On Wed, 2017-04-26 at 09:03 +0200, Iago Toral Quiroga wrote:
> ---
>  src/intel/vulkan/genX_query.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/intel/vulkan/genX_query.c
> b/src/intel/vulkan/genX_query.c
> index 126431b..22de3c3 100644
> --- a/src/intel/vulkan/genX_query.c
> +++ b/src/intel/vulkan/genX_query.c
> @@ -566,6 +566,11 @@ keep_gpr0_lower_n_bits(struct anv_batch *batch,
> uint32_t n)
> emit_load_alu_reg_imm64(batch, CS_GPR(1), (1ull << n) - 1);
>  
> uint32_t *dw = anv_batch_emitn(batch, 5, GENX(MI_MATH));
> +   if (!dw) {
> +  anv_batch_set_error(batch, VK_ERROR_OUT_OF_HOST_MEMORY);
> +  return;
> +   }
> +
> dw[1] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCA, MI_ALU_REG0);
> dw[2] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCB, MI_ALU_REG1);
> dw[3] = mi_alu(MI_ALU_AND, 0, 0);
> @@ -592,6 +597,11 @@ shl_gpr0_by_30_bits(struct anv_batch *batch)
> for (int o = 0; o < outer_count; o++) {
>    /* Submit one MI_MATH to shift left by 6 bits */
>    uint32_t *dw = anv_batch_emitn(batch, cmd_len, GENX(MI_MATH));
> +  if (!dw) {
> + anv_batch_set_error(batch, VK_ERROR_OUT_OF_HOST_MEMORY);
> + return;
> +  }
> +
>    dw++;
>    for (int i = 0; i < inner_count; i++, dw += 4) {
>   dw[0] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCA, MI_ALU_REG0);


Re: [Mesa-dev] [PATCH v2 1/2] i965/vec4: fix swizzle and writemask when loading an uniform with constant offset

2017-05-01 Thread Samuel Iglesias Gonsálvez

On Friday, 28 April 2017 16:08:35 Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > It was setting XYWZ swizzle and writemask to all uniforms, no matter if
> > they were a vector or scalar, so this can lead to problems when loading
> > them to the push constant buffer.
> > 
> > Moreover, 'shift' calculation was designed to calculate the offset in
> > DWORDS, but it doesn't take into account DFs, so the calculated swizzle
> > for the later ones was wrong.
> > 
> > The indirect case is not changed because MOV INDIRECT will write
> > to all components. Added an assert to verify that these uniforms
> > are aligned.
> > 
> > v2:
> > - Fix 'shift' calculation (Curro)
> > - Set both swizzle and writemask.
> > - Add assert(shift == 0) for the indirect case.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> 
> Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> 

Thanks!

What about the second patch? Is it OK for you?
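
For reference, the 'shift' fix in the quoted commit message boils down to the
arithmetic below. This is only a minimal sketch: uniform_shift() is a
hypothetical helper, and it assumes the base offset is given in bytes and
that uniforms live in 16-byte vec4 slots.

```c
/* Hypothetical helper, not part of Mesa: index of the first component
 * inside a 16-byte slot, counted in elements of the given type size.
 *
 * Dividing by a hard-coded 4 only works for 32-bit types; for DFs
 * (8 bytes) the divisor has to be the type size.
 */
static unsigned
uniform_shift(unsigned base_bytes, unsigned type_size)
{
   return (base_bytes % 16) / type_size;
}
```

A float at byte offset 8 is component 2 of its slot, while a double at the
same offset is component 1. Dividing by 4 in the double case would give 2,
which is the wrong swizzle.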

Sam

> > ---
> > 
> >  src/intel/compiler/brw_vec4_nir.cpp | 15 +++
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/intel/compiler/brw_vec4_nir.cpp
> > b/src/intel/compiler/brw_vec4_nir.cpp index a82d52088a8..80115aca0f9
> > 100644
> > --- a/src/intel/compiler/brw_vec4_nir.cpp
> > +++ b/src/intel/compiler/brw_vec4_nir.cpp
> > @@ -852,7 +852,8 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
> > * The swizzle also works in the indirect case as the generator
> > adds
> > * the swizzle to the offset for us.
> > */
> > 
> > -  unsigned shift = (nir_intrinsic_base(instr) % 16) / 4;
> > +  const int type_size = type_sz(src.type);
> > +  unsigned shift = (nir_intrinsic_base(instr) % 16) / type_size;
> > 
> >assert(shift + instr->num_components <= 4);
> >
> >nir_const_value *const_offset =
> >nir_src_as_const_value(instr->src[0]);
> > 
> > @@ -860,14 +861,20 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
> >   /* Offsets are in bytes but they should always be multiples of 4
> >   */
> >   assert(const_offset->u32[0] % 4 == 0);
> > 
> > - unsigned offset = const_offset->u32[0] + shift * 4;
> > + src.swizzle = brw_swizzle_for_size(instr->num_components);
> > + dest.writemask = brw_writemask_for_size(instr->num_components);
> > + unsigned offset = const_offset->u32[0] + shift * type_size;
> > 
> >   src.offset = ROUND_DOWN_TO(offset, 16);
> > 
> > - shift = (offset % 16) / 4;
> > + shift = (offset % 16) / type_size;
> > + assert(shift + instr->num_components <= 4);
> > 
> >   src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
> >   
> >   emit(MOV(dest, src));
> >
> >} else {
> > 
> > - src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
> > + /* Uniform arrays are vec4 aligned, because of std140 alignment
> > +  * rules.
> > +  */
> > + assert(shift == 0);
> > 
> >   src_reg indirect = get_nir_src(instr->src[0],
> >   BRW_REGISTER_TYPE_UD, 1);



Re: [Mesa-dev] [Mesa-stable] [PATCH 2/3] i965/vec4: fix register width for DF VGRF and UNIFORM

2017-05-01 Thread Samuel Iglesias Gonsálvez

On Friday, 28 April 2017 16:27:56 Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > On gen7, the swizzles used in DF align16 instructions works for element
> > size of 32 bits, so we can address only 2 consecutive DFs. As we assumed
> > that in the rest of the code and prepare the instructions for this
> > (scalarize_df()), we need to set it to two again.
> > 
> > However, for DF align1 instructions, a width of 2 is wrong as we are not
> > reading the data we want. For example, an uniform would have a region of
> > <0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access
> > to the first 4.
> > 
> > This patch sets the default one to 4 and then modifies the width of
> > align16 instruction's DF sources when we translate the logical swizzle
> > to the physical one.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> > Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> > ---
> > 
> >  src/intel/compiler/brw_vec4.cpp | 13 -
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> > 
> > diff --git a/src/intel/compiler/brw_vec4.cpp
> > b/src/intel/compiler/brw_vec4.cpp index 95f96ea69c0..8b755e1b75e 100644
> > --- a/src/intel/compiler/brw_vec4.cpp
> > +++ b/src/intel/compiler/brw_vec4.cpp
> > @@ -2003,9 +2003,7 @@ vec4_visitor::convert_to_hw_regs()
> > 
> >   struct brw_reg reg;
> >   switch (src.file) {
> >   case VGRF: {
> > 
> > -const unsigned type_size = type_sz(src.type);
> > -const unsigned width = REG_SIZE / 2 / MAX2(4, type_size);
> > -reg = byte_offset(brw_vecn_grf(width, src.nr, 0),
> > src.offset);
> > +reg = byte_offset(brw_vecn_grf(4, src.nr, 0), src.offset);
> > 
> >  reg.type = src.type;
> >  reg.abs = src.abs;
> >  reg.negate = src.negate;
> > 
> > @@ -2013,12 +2011,11 @@ vec4_visitor::convert_to_hw_regs()
> > 
> >   }
> >   
> >   case UNIFORM: {
> > 
> > -const unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));
> >  reg = stride(byte_offset(brw_vec4_grf(
> >  
> >  prog_data->base.dispatch_grf_start_reg +
> >  src.nr / 2, src.nr % 2 * 4),
> >   
> >   src.offset),
> > 
> > - 0, width, 1);
> > + 0, 4, 1);
> > 
> >  reg.type = src.type;
> >  reg.abs = src.abs;
> >  reg.negate = src.negate;
> > 
> > @@ -2576,6 +2573,12 @@ vec4_visitor::apply_logical_swizzle(struct brw_reg *hw_reg,
> > assert(brw_is_single_value_swizzle(reg.swizzle) ||
> > 
> >is_supported_64bit_region(inst, arg));
> > 
> > +   /* Apply the region <2, 2, 1> for GRF or <0, 2, 1> for uniforms, as align16
> > +* HW can only do 32-bit swizzle channels.
> > +*/
> > +   if (reg.file == UNIFORM || reg.file == VGRF)
> > +  hw_reg->width = BRW_WIDTH_2;
> 
> Any reason this is conditional on the register file?  Originally we were
> only setting the width to 2 for the UNIFORM and VGRF files, but that was
> probably an oversight...
> 

No reason, this was an oversight. I have removed the conditional locally.

Do you get your R-b to it then?

Sam

> > +
> > 
> > if (is_supported_64bit_region(inst, arg) &&
> > 
> > !is_gen7_supported_64bit_swizzle(inst, arg)) {
> >
> >/* Supported 64-bit swizzles are those such that their first two
> > 
> > ___
> > mesa-stable mailing list
> > mesa-sta...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-stable



[Mesa-dev] [PATCH 3/3] i965/vec4: don't modify regioning parameters to the sources of DF align1 instructions

2017-04-26 Thread Samuel Iglesias Gonsálvez

The regioning parameters are now properly set by convert_to_hw_regs()
and we don't need to fix them in the generator.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4_generator.cpp | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_generator.cpp 
b/src/intel/compiler/brw_vec4_generator.cpp
index e786ac6a0ca..753b00c4ed1 100644
--- a/src/intel/compiler/brw_vec4_generator.cpp
+++ b/src/intel/compiler/brw_vec4_generator.cpp
@@ -1980,8 +1980,6 @@ generate_code(struct brw_codegen *p,
  else
 spread_dst = stride(dst, 8, 4, 2);
 
- src[0].vstride = BRW_VERTICAL_STRIDE_4;
- src[0].width = BRW_WIDTH_4;
  brw_MOV(p, spread_dst, src[0]);
 
  brw_set_default_access_mode(p, BRW_ALIGN_16);
@@ -2016,9 +2014,7 @@ generate_code(struct brw_codegen *p,
  src[0] = retype(src[0], BRW_REGISTER_TYPE_UD);
  if (inst->opcode == VEC4_OPCODE_PICK_HIGH_32BIT)
 src[0] = suboffset(src[0], 1);
- src[0].vstride = BRW_VERTICAL_STRIDE_8;
- src[0].width = BRW_WIDTH_4;
- src[0].hstride = BRW_HORIZONTAL_STRIDE_2;
+ src[0] = spread(src[0], 2);
  brw_MOV(p, dst, src[0]);
 
  brw_set_default_access_mode(p, BRW_ALIGN_16);
@@ -2041,9 +2037,6 @@ generate_code(struct brw_codegen *p,
  dst.hstride = BRW_HORIZONTAL_STRIDE_2;
 
  src[0] = retype(src[0], BRW_REGISTER_TYPE_UD);
- src[0].vstride = BRW_VERTICAL_STRIDE_4;
- src[0].width = BRW_WIDTH_4;
- src[0].hstride = BRW_HORIZONTAL_STRIDE_1;
  brw_MOV(p, dst, src[0]);
 
  brw_set_default_access_mode(p, BRW_ALIGN_16);
-- 
2.11.0


[Mesa-dev] [PATCH 2/3] i965/vec4: fix register width for DF VGRF and UNIFORM

2017-04-26 Thread Samuel Iglesias Gonsálvez

On gen7, the swizzles used in DF align16 instructions work on 32-bit
elements, so we can address only 2 consecutive DFs. As the rest of the code
assumes that and prepares the instructions for it (scalarize_df()),
we need to set the width to two again.

However, for DF align1 instructions, a width of 2 is wrong because we would
not read the data we want. For example, a uniform would have a region of
<0, 2, 1>, so it would repeat the first 2 DFs when we wanted to access
the first 4.
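
A minimal sketch of how a region selects elements may help here. The helper
below is hypothetical (it is not a Mesa function) and works on plain element
counts rather than the encoded hardware fields. It follows the usual
row/column reading of a <vstride, width, hstride> region.

```c
/* Hypothetical helper, not part of Mesa: element index (relative to the
 * region origin) read by a given execution channel. Each row holds 'width'
 * elements spaced 'hstride' apart, and consecutive rows start 'vstride'
 * elements apart.
 */
static unsigned
region_element(unsigned vstride, unsigned width, unsigned hstride,
               unsigned channel)
{
   const unsigned row = channel / width;
   const unsigned col = channel % width;

   return row * vstride + col * hstride;
}
```

With an execution size of 4, region <0, 2, 1> reads elements 0, 1, 0, 1,
whereas <0, 4, 1> reads elements 0, 1, 2, 3.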

This patch sets the default width to 4 and then modifies the width of
the DF sources of align16 instructions when we translate the logical
swizzle to the physical one.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4.cpp | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 95f96ea69c0..8b755e1b75e 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2003,9 +2003,7 @@ vec4_visitor::convert_to_hw_regs()
  struct brw_reg reg;
  switch (src.file) {
  case VGRF: {
-const unsigned type_size = type_sz(src.type);
-const unsigned width = REG_SIZE / 2 / MAX2(4, type_size);
-reg = byte_offset(brw_vecn_grf(width, src.nr, 0), src.offset);
+reg = byte_offset(brw_vecn_grf(4, src.nr, 0), src.offset);
 reg.type = src.type;
 reg.abs = src.abs;
 reg.negate = src.negate;
@@ -2013,12 +2011,11 @@ vec4_visitor::convert_to_hw_regs()
  }
 
  case UNIFORM: {
-const unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));
 reg = stride(byte_offset(brw_vec4_grf(
 prog_data->base.dispatch_grf_start_reg 
+
 src.nr / 2, src.nr % 2 * 4),
  src.offset),
- 0, width, 1);
+ 0, 4, 1);
 reg.type = src.type;
 reg.abs = src.abs;
 reg.negate = src.negate;
@@ -2576,6 +2573,12 @@ vec4_visitor::apply_logical_swizzle(struct brw_reg 
*hw_reg,
assert(brw_is_single_value_swizzle(reg.swizzle) ||
   is_supported_64bit_region(inst, arg));
 
+   /* Apply the region <2, 2, 1> for GRF or <0, 2, 1> for uniforms, as align16
+* HW can only do 32-bit swizzle channels.
+*/
+   if (reg.file == UNIFORM || reg.file == VGRF)
+  hw_reg->width = BRW_WIDTH_2;
+
if (is_supported_64bit_region(inst, arg) &&
!is_gen7_supported_64bit_swizzle(inst, arg)) {
   /* Supported 64-bit swizzles are those such that their first two
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] i965/vec4: fix vertical stride to avoid breaking region parameter rule

2017-04-26 Thread Samuel Iglesias Gonsálvez

From IVB PRM, vol4, part3, "General Restrictions on Regioning
Parameters":

  "If ExecSize = Width and HorzStride ≠ 0, VertStride must
   be set to Width * HorzStride."

In the next patch, we are going to modify the region parameters for
uniforms and VGRFs. Uniforms that are the source of
DF align1 instructions will have <0, 4, 1> regioning and
the execsize for those instructions will be 4, so they will break
the regioning rule. The same applies to VGRF sources where
we use the vstride == 0 exploit.

As we know we are not going to cross the GRF boundary with that
execsize and parameters (not even with the exploit), we just fix
the vstride here.
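
A minimal sketch of that fix-up, working on the encoded register fields.
It assumes that strides are encoded as in cvt()/inv_cvt() and that the
width field stores cvt(width) - 1; the Region struct and the function names
are hypothetical, and unlike the patch the sketch also checks
HorzStride != 0 as the PRM rule states.

```cpp
#include <cassert>

// Encoded stride -> number of elements (mirrors inv_cvt() in the patch).
static unsigned decode_stride(unsigned enc)
{
   switch (enc) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 3: return 4;
   case 4: return 8;
   case 5: return 16;
   case 6: return 32;
   }
   return 0;
}

// Number of elements -> encoded stride (assumed to mirror cvt()).
static unsigned encode_stride(unsigned val)
{
   switch (val) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 4: return 3;
   case 8: return 4;
   case 16: return 5;
   case 32: return 6;
   }
   return 0;
}

// Hypothetical container for the encoded region fields of a source.
struct Region {
   unsigned vstride;
   unsigned width;    // assumed to be stored as cvt(width) - 1
   unsigned hstride;
};

// If ExecSize == Width and HorzStride != 0, force VertStride to
// Width * HorzStride; otherwise leave the region untouched.
Region fix_vstride(Region r, unsigned exec_size)
{
   const unsigned width = decode_stride(r.width + 1);
   const unsigned hstride = decode_stride(r.hstride);
   if (exec_size == width && hstride != 0)
      r.vstride = encode_stride(width * hstride);
   return r;
}
```

For a <0, 4, 1> source on an instruction with execsize 4 this turns the
region into <4, 4, 1>; with any other execsize the region is left alone.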

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_reg.h| 15 +++
 src/intel/compiler/brw_vec4.cpp | 19 +++
 2 files changed, 34 insertions(+)

diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 17a51fbd655..24e09a84fce 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -914,6 +914,21 @@ static inline unsigned cvt(unsigned val)
return 0;
 }
 
+static inline unsigned inv_cvt(unsigned val)
+{
+   switch (val) {
+   case 0: return 0;
+   case 1: return 1;
+   case 2: return 2;
+   case 3: return 4;
+   case 4: return 8;
+   case 5: return 16;
+   case 6: return 32;
+   }
+   return 0;
+}
+
+
 static inline struct brw_reg
 stride(struct brw_reg reg, unsigned vstride, unsigned width, unsigned hstride)
 {
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index f9b805ea5a9..95f96ea69c0 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -38,6 +38,8 @@ using namespace brw;
 
 namespace brw {
 
+static bool is_align1_df(vec4_instruction *inst);
+
 void
 src_reg::init()
 {
@@ -2049,6 +2051,23 @@ vec4_visitor::convert_to_hw_regs()
 
  apply_logical_swizzle(&reg, inst, i);
  src = reg;
+
+ /* From IVB PRM, vol4, part3, "General Restrictions on Regioning
+  * Parameters":
+  *
+  *   "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set
+  *to Width * HorzStride."
+  *
+  * We can break this rule with DF sources on DF align1
+  * instructions, because the exec_size would be 4 and width is 4.
+  * As we know we are not accessing the next GRF, it is safe to
+  * set vstride to the formula given by the rule itself.
+  */
+ if (is_align1_df(inst) && inst->exec_size == inv_cvt(src.width + 1)) {
+const unsigned width = inv_cvt(src.width + 1);
+const unsigned hstride = inv_cvt(src.hstride);
+src.vstride = cvt(width * hstride);
+ }
   }
 
   if (inst->is_3src(devinfo)) {
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 2/2] i965/vec4: load dvec3/4 uniforms first in the push constant buffer

2017-04-26 Thread Samuel Iglesias Gonsálvez

Reorder the uniforms to load the dvec4-aligned variables first
in the push constant buffer and then push the vec4-aligned ones.

This fixes a bug where a dvec3/4 might be loaded partly in one GRF and
partly in the next GRF, so the region parameters needed to read it could
break the HW rules.
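
The two-pass ordering can be sketched as follows. This is a simplified,
self-contained illustration and not the code in the patch: Slot, Placement,
place() and pack_uniforms() are hypothetical names, sizes are counted in
32-bit channels, and the second pass is assumed to be allowed to fill the
free channels of any earlier slot.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one vec4 uniform slot.
struct Slot {
   int size;            // 32-bit channels used (0 = unused, at most 4)
   bool dvec4_aligned;  // slot is one half of a dvec3/dvec4
};

struct Placement {
   int loc = -1;  // destination vec4 slot, -1 if the source is unused
   int chan = 0;  // first channel inside the destination slot
};

// First-fit search starting at slot `first`.
static void place(int src, int size, int first, std::vector<int> &used,
                  std::vector<Placement> &out, int &count)
{
   int dst = first;
   while (dst < (int)used.size() && used[dst] + size > 4)
      dst++;
   assert(dst < (int)used.size());

   out[src].loc = dst;
   out[src].chan = used[dst];
   used[dst] += size;
   count = std::max(count, dst + 1);
}

std::vector<Placement> pack_uniforms(const std::vector<Slot> &slots)
{
   std::vector<int> used(slots.size(), 0);
   std::vector<Placement> out(slots.size());
   int count = 0;

   // Pass 1: dvec4-aligned slots are appended in order, so both halves of
   // a dvec3/4 stay adjacent and each pair starts on an even slot.
   for (int src = 0; src < (int)slots.size(); src++)
      if (slots[src].size > 0 && slots[src].dvec4_aligned)
         place(src, slots[src].size, count, used, out, count);

   // Pass 2: everything else goes into the lowest slot with enough room.
   for (int src = 0; src < (int)slots.size(); src++)
      if (slots[src].size > 0 && !slots[src].dvec4_aligned)
         place(src, slots[src].size, 0, used, out, count);

   return out;
}
```

Because the dvec4-aligned halves are placed first and consecutively, a
dvec3/4 always starts at the beginning of a GRF-sized pair of vec4 slots.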

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4.cpp | 86 +++--
 1 file changed, 65 insertions(+), 21 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 0909ddb5861..f9b805ea5a9 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -583,16 +583,43 @@ vec4_visitor::split_uniform_registers()
}
 }
 
+/* This function returns the register number where we placed the uniform */
+static int
+set_push_constant_loc(const int nr_uniforms, int *new_uniform_count,
+  const int src, const int size,
+  int *new_loc, int *new_chan,
+  int *new_chans_used)
+{
+   int dst;
+   /* Find the lowest place we can slot this uniform in. */
+   for (dst = *new_uniform_count; dst < nr_uniforms; dst++) {
+  if (new_chans_used[dst] + size <= 4)
+ break;
+   }
+
+   assert(dst < nr_uniforms);
+
+   new_loc[src] = dst;
+   new_chan[src] = new_chans_used[dst];
+
+   *new_uniform_count = MAX2(*new_uniform_count, dst + 1);
+   return dst;
+}
+
 void
 vec4_visitor::pack_uniform_registers()
 {
uint8_t chans_used[this->uniforms];
int new_loc[this->uniforms];
int new_chan[this->uniforms];
+   bool is_aligned_to_dvec4[this->uniforms];
+   int new_chans_used[this->uniforms];
 
memset(chans_used, 0, sizeof(chans_used));
memset(new_loc, 0, sizeof(new_loc));
memset(new_chan, 0, sizeof(new_chan));
+   memset(new_chans_used, 0, sizeof(new_chans_used));
+   memset(is_aligned_to_dvec4, 0, sizeof(is_aligned_to_dvec4));
 
/* Find which uniform vectors are actually used by the program.  We
 * expect unused vector elements when we've moved array access out
@@ -631,10 +658,13 @@ vec4_visitor::pack_uniform_registers()
 
 unsigned channel = BRW_GET_SWZ(inst->src[i].swizzle, c) + 1;
 unsigned used = MAX2(chans_used[reg], channel * channel_size);
-if (used <= 4)
+if (used <= 4) {
chans_used[reg] = used;
-else
+} else {
+   is_aligned_to_dvec4[reg] = true;
+   is_aligned_to_dvec4[reg + 1] = true;
chans_used[reg + 1] = used - 4;
+}
  }
   }
 
@@ -659,42 +689,56 @@ vec4_visitor::pack_uniform_registers()
 
int new_uniform_count = 0;
 
+   /* As the uniforms are going to be reordered, take the data from a temporary
+* copy of the original param[].
+*/
+   gl_constant_value **param = ralloc_array(NULL, gl_constant_value*,
+stage_prog_data->nr_params);
+   memcpy(param, stage_prog_data->param,
+  sizeof(gl_constant_value*) * stage_prog_data->nr_params);
+
/* Now, figure out a packing of the live uniform vectors into our
-* push constants.
+* push constants. Start with dvec{3,4} because they are aligned to
+* dvec4 size (2 vec4).
 */
for (int src = 0; src < uniforms; src++) {
   int size = chans_used[src];
 
-  if (size == 0)
+  if (size == 0 || !is_aligned_to_dvec4[src])
  continue;
 
-  int dst;
-  /* Find the lowest place we can slot this uniform in. */
-  for (dst = 0; dst < src; dst++) {
- if (chans_used[dst] + size <= 4)
-break;
+  int dst = set_push_constant_loc(uniforms, &new_uniform_count,
+  src, size, new_loc, new_chan,
+  new_chans_used);
+  if (dst != src) {
+ /* Move the references to the data */
+ for (int j = 0; j < size; j++) {
+stage_prog_data->param[dst * 4 + new_chan[src] + j] =
+   param[src * 4 + j];
+ }
   }
+   }
 
-  if (src == dst) {
- new_loc[src] = dst;
- new_chan[src] = 0;
-  } else {
- new_loc[src] = dst;
- new_chan[src] = chans_used[dst];
+   /* Continue with the rest of data, which is aligned to vec4. */
+   for (int src = 0; src < uniforms; src++) {
+  int size = chans_used[src];
+
+  if (size == 0 || is_aligned_to_dvec4[src])
+ continue;
 
+  int dst = set_push_constant_loc(uniforms, &new_uniform_count,
+  src, size, new_loc, new_chan,
+  new_chans_used);
+  if (dst != src) {
  /* Move the references to the data */
  for (int j = 0; j < size; j++) {
 stage_prog_data->

[Mesa-dev] [PATCH v2 1/2] i965/vec4: fix swizzle and writemask when loading an uniform with constant offset

2017-04-26 Thread Samuel Iglesias Gonsálvez

It was setting an XYZW swizzle and writemask on all uniforms, no matter if
they were a vector or scalar, which can lead to problems when loading them
to the push constant buffer.

Moreover, the 'shift' calculation was designed to calculate the offset in
DWORDS, but it doesn't take DFs into account, so the calculated swizzle
for the latter was wrong.

The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.
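
The corrected arithmetic for the constant-offset case can be sketched in
isolation. UniformAccess and locate_component() are hypothetical names used
only for illustration; the sketch assumes 16-byte vec4 registers and a type
size of 4 (float) or 8 (DF) bytes.

```cpp
#include <cassert>

// Hypothetical result: 16-byte aligned register offset plus the component
// index (in units of the type size) inside that register.
struct UniformAccess {
   unsigned reg_offset;
   unsigned shift;
};

UniformAccess locate_component(unsigned base, unsigned const_offset,
                               unsigned type_size)
{
   assert(type_size == 4 || type_size == 8);

   // Component index of the base inside its vec4, counted in elements of
   // the uniform's type rather than in DWORDs.
   unsigned shift = (base % 16) / type_size;

   const unsigned offset = const_offset + shift * type_size;

   UniformAccess access;
   access.reg_offset = offset / 16 * 16;          // round down to 16 bytes
   access.shift = (offset % 16) / type_size;
   return access;
}
```

For a DF at byte 8 of a register, dividing by 4 would give component 2,
whereas dividing by the type size gives component 1, the second DF.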

v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.

Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
---
 src/intel/compiler/brw_vec4_nir.cpp | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp
index a82d52088a8..80115aca0f9 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@ -852,7 +852,8 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
* The swizzle also works in the indirect case as the generator adds
* the swizzle to the offset for us.
*/
-  unsigned shift = (nir_intrinsic_base(instr) % 16) / 4;
+  const int type_size = type_sz(src.type);
+  unsigned shift = (nir_intrinsic_base(instr) % 16) / type_size;
   assert(shift + instr->num_components <= 4);
 
   nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
@@ -860,14 +861,20 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
  /* Offsets are in bytes but they should always be multiples of 4 */
  assert(const_offset->u32[0] % 4 == 0);
 
- unsigned offset = const_offset->u32[0] + shift * 4;
+ src.swizzle = brw_swizzle_for_size(instr->num_components);
+ dest.writemask = brw_writemask_for_size(instr->num_components);
+ unsigned offset = const_offset->u32[0] + shift * type_size;
  src.offset = ROUND_DOWN_TO(offset, 16);
- shift = (offset % 16) / 4;
+ shift = (offset % 16) / type_size;
+ assert(shift + instr->num_components <= 4);
  src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
 
  emit(MOV(dest, src));
   } else {
- src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
+ /* Uniform arrays are vec4 aligned, because of std140 alignment
+  * rules.
+  */
+ assert(shift == 0);
 
  src_reg indirect = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_UD, 1);
 
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965/vec4: set swizzle when loading an uniform

2017-04-25 Thread Samuel Iglesias Gonsálvez

On Mon, 2017-04-24 at 11:22 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > On Fri, 2017-04-21 at 10:23 -0700, Francisco Jerez wrote:
> > > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > 
> > > > On Thu, 2017-04-20 at 10:26 -0700, Francisco Jerez wrote:
> > > > > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> > > > > 
> > > > > > It was setting XYWZ swizzle to all uniforms, no matter if
> > > > > > they
> > > > > > were
> > > > > > a vector or not.
> > > > > > 
> > > > > > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.
> > > > > > com>
> > > > > > Cc: curroje...@riseup.net
> > > > > 
> > > > > Don't you need to CC mesa-stable here and in the next patch?
> > > > > 
> > > > 
> > > > I considered it but I had doubts about which tag to use,
> > > > "17.1.0-rc1" or just "17.1.0" or whatever. So my plan is to
> > > > notify Emil once
> > > > they
> > > > are
> > > > merged (and add Cc to stable in the commit log before pushing
> > > > it to
> > > > master).
> > > > 
> > > > If you are more comfortable with Cc mesa-stable, I will do it
> > > > next
> > > > time
> > > > (or if I need to send v2 of this series).
> > > > 
> > > 
> > > I believe Emil will notice them even if you don't put a version
> > > tag.  I
> > > don't really care how you nominate it for stable as long as it
> > > hits
> > > the
> > > 17.1 branch before the end of the release cycle.  ;)
> > > 
> > > > > > ---
> > > > > >  src/intel/compiler/brw_vec4_nir.cpp | 1 +
> > > > > >  1 file changed, 1 insertion(+)
> > > > > > 
> > > > > > diff --git a/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > b/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > index a82d52088a8..5f4488c7e86 100644
> > > > > > --- a/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > +++ b/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > @@ -863,6 +863,7 @@
> > > > > > vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr
> > > > > > *instr)
> > > > > >   unsigned offset = const_offset->u32[0] + shift *
> > > > > > 4;
> > > > > >   src.offset = ROUND_DOWN_TO(offset, 16);
> > > > > >   shift = (offset % 16) / 4;
> > > > > > + src.swizzle = brw_swizzle_for_size(instr-
> > > > > > > num_components);
> > > > > 
> > > > > What about the indirect case a few lines below?  Isn't the
> > > > > swizzle
> > > > > passed
> > > > > to the mov indirect instruction still bogus?
> > > > > 
> > > > 
> > > > This is different. It is expecting to have a swizzle of XYZW
> > > > because
> > > > MOV_INDIRECT will copy all the contents. See assert in
> > > > move_uniform_array_access_to_pull_constants()
> > > 
> > > I believe the ultimate problem here is that the MOV_INDIRECT gets
> > > a
> > > writemask of XYZW even if you're reading a scalar, so setting the
> > > minimal swizzle will lead to a situation in which the resulting
> > > swizzle
> > > is not the identity so you will run into trouble to turn it into
> > > scratch
> > > access.  That brings me to the following question which I don't
> > > think
> > > I
> > > can answer by looking at this patch alone: If having an XYZW
> > > swizzle
> > > is
> > > a problem for direct moves (I assume this patch is fixing
> > > something?),
> > 
> > The bug is to properly identify DFs and dvecs, instead of
> > considering
> > all of them as dvec4 when aligning them for the push constant
> > buffer,
> > which is done in next patch.
> > 
> 
> Sorry, I'm not following what you mean with the last paragraph.  What
> does this have to do with identifying DFs?
> 

By the register type, I know it is a DF, but I need then to know how
many components are used in order to know if they are dvec3 or dvec4,
so I push them fi
