[Mesa-dev] [PATCH 3/7] i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout

2018-02-23 Thread Jose Maria Casanova Crespo
Restrict the use of untyped_surface_write with 16-bit pairs in
ssbo to the cases where we can guarantee that offset is multiple
of 4.

Taking into account that VK_KHR_relaxed_block_layout is available
in ANV we can only guarantee that when we have a constant offset
that is multiple of 4. For non constant offsets we will always use
byte_scattered_write.
---
 src/intel/compiler/brw_fs_nir.cpp | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 45b8e8b637..abf9098252 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4135,6 +4135,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
  unsigned num_components = ffs(~(writemask >> first_component)) - 1;
  fs_reg write_src = offset(val_reg, bld, first_component);
 
+ nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]);
+
  if (type_size > 4) {
 /* We can't write more than 2 64-bit components at once. Limit
  * the num_components of the write to what we can do and let the 
next
@@ -4150,14 +4152,19 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
  * 32-bit-aligned we need to use byte-scattered writes because
  * untyped writes works with 32-bit components with 32-bit
  * alignment. byte_scattered_write messages only support one
- * 16-bit component at a time.
+ * 16-bit component at a time. As VK_KHR_relaxed_block_layout
+ * could be enabled we can not guarantee that not constant offsets
+ * to be 32-bit aligned for 16-bit types. For example an array, of
+ * 16-bit vec3 with array element stride of 6.
  *
- * For example, if there is a 3-components vector we submit one
- * untyped-write message of 32-bit (first two components), and one
- * byte-scattered write message (the last component).
+ * In the case of 32-bit aligned constant offsets if there is
+ * a 3-components vector we submit one untyped-write message
+ * of 32-bit (first two components), and one byte-scattered
+ * write message (the last component).
  */
 
-if (first_component % 2) {
+if ( !const_offset || ((const_offset->u32[0] +
+   type_size * first_component) % 4)) {
/* If we use a .yz writemask we also need to emit 2
 * byte-scattered write messages because of y-component not
 * being aligned to 32-bit.
@@ -4183,7 +4190,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
  }
 
  fs_reg offset_reg;
- nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]);
+
  if (const_offset) {
 offset_reg = brw_imm_ud(const_offset->u32[0] +
 type_size * first_component);
@@ -4222,7 +4229,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
  } else {
 assert(num_components * type_size <= 16);
 assert((num_components * type_size) % 4 == 0);
-assert((first_component * type_size) % 4 == 0);
+assert(!const_offset ||
+   (const_offset->u32[0] + type_size * first_component) % 4 == 
0);
 unsigned num_slots = (num_components * type_size) / 4;
 
 emit_untyped_write(bld, surf_index, offset_reg,
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/7] i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout

2018-02-23 Thread Jason Ekstrand
On Fri, Feb 23, 2018 at 1:26 AM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> Restrict the use of untyped_surface_write with 16-bit pairs in
> ssbo to the cases where we can guarantee that offset is multiple
> of 4.
>
> Taking into account that VK_KHR_relaxed_block_layout is available
> in ANV we can only guarantee that when we have a constant offset
> that is multiple of 4. For non constant offsets we will always use
> byte_scattered_write.
>

I double-checked the rules and we can't even guarantee that a f16vec2 is
dword-aligned.


> ---
>  src/intel/compiler/brw_fs_nir.cpp | 22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 45b8e8b637..abf9098252 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4135,6 +4135,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> &bld, nir_intrinsic_instr *instr
>   unsigned num_components = ffs(~(writemask >> first_component)) -
> 1;
>   fs_reg write_src = offset(val_reg, bld, first_component);
>
> + nir_const_value *const_offset = nir_src_as_const_value(instr->
> src[2]);
> +
>   if (type_size > 4) {
>  /* We can't write more than 2 64-bit components at once. Limit
>   * the num_components of the write to what we can do and let
> the next
> @@ -4150,14 +4152,19 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> &bld, nir_intrinsic_instr *instr
>   * 32-bit-aligned we need to use byte-scattered writes because
>   * untyped writes works with 32-bit components with 32-bit
>   * alignment. byte_scattered_write messages only support one
> - * 16-bit component at a time.
> + * 16-bit component at a time. As VK_KHR_relaxed_block_layout
> + * could be enabled we can not guarantee that not constant
> offsets
> + * to be 32-bit aligned for 16-bit types. For example an
> array, of
> + * 16-bit vec3 with array element stride of 6.
>   *
> - * For example, if there is a 3-components vector we submit
> one
> - * untyped-write message of 32-bit (first two components),
> and one
> - * byte-scattered write message (the last component).
> + * In the case of 32-bit aligned constant offsets if there is
> + * a 3-components vector we submit one untyped-write message
> + * of 32-bit (first two components), and one byte-scattered
> + * write message (the last component).
>   */
>
> -if (first_component % 2) {
> +if ( !const_offset || ((const_offset->u32[0] +
> +   type_size * first_component) % 4)) {
> /* If we use a .yz writemask we also need to emit 2
>  * byte-scattered write messages because of y-component not
>  * being aligned to 32-bit.
> @@ -4183,7 +4190,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> &bld, nir_intrinsic_instr *instr
>   }
>
>   fs_reg offset_reg;
> - nir_const_value *const_offset = nir_src_as_const_value(instr->
> src[2]);
> +
>   if (const_offset) {
>  offset_reg = brw_imm_ud(const_offset->u32[0] +
>  type_size * first_component);
> @@ -4222,7 +4229,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> &bld, nir_intrinsic_instr *instr
>   } else {
>  assert(num_components * type_size <= 16);
>  assert((num_components * type_size) % 4 == 0);
> -assert((first_component * type_size) % 4 == 0);
> +assert(!const_offset ||
> +   (const_offset->u32[0] + type_size * first_component) %
> 4 == 0);
>

How about

assert(offset_reg.file != BRW_IMMEDIATE_VALUE || offset_reg.ud % 4 == 0);

We've already done the above calculation and stored it in offset_reg.

Reviewed-by: Jason Ekstrand 


>  unsigned num_slots = (num_components * type_size) / 4;
>
>  emit_untyped_write(bld, surf_index, offset_reg,
> --
> 2.14.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/7] i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout

2018-02-26 Thread Chema Casanova
On 23/02/18 21:23, Jason Ekstrand wrote:
> On Fri, Feb 23, 2018 at 1:26 AM, Jose Maria Casanova Crespo
> mailto:jmcasan...@igalia.com>> wrote:
> 
> Restrict the use of untyped_surface_write with 16-bit pairs in
> ssbo to the cases where we can guarantee that offset is multiple
> of 4.
> 
> Taking into account that VK_KHR_relaxed_block_layout is available
> in ANV we can only guarantee that when we have a constant offset
> that is multiple of 4. For non constant offsets we will always use
> byte_scattered_write.
> 
> 
> I double-checked the rules and we can't even guarantee that a f16vec2 is
> dword-aligned.
>  
> 
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 45b8e8b637..abf9098252 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4135,6 +4135,8 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
>           unsigned num_components = ffs(~(writemask >>
> first_component)) - 1;
>           fs_reg write_src = offset(val_reg, bld, first_component);
> 
> +         nir_const_value *const_offset =
> nir_src_as_const_value(instr->src[2]);
> +
>           if (type_size > 4) {
>              /* We can't write more than 2 64-bit components at
> once. Limit
>               * the num_components of the write to what we can do
> and let the next
> @@ -4150,14 +4152,19 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
>               * 32-bit-aligned we need to use byte-scattered writes
> because
>               * untyped writes works with 32-bit components with 32-bit
>               * alignment. byte_scattered_write messages only
> support one
> -             * 16-bit component at a time.
> +             * 16-bit component at a time. As
> VK_KHR_relaxed_block_layout
> +             * could be enabled we can not guarantee that not
> constant offsets
> +             * to be 32-bit aligned for 16-bit types. For example
> an array, of
> +             * 16-bit vec3 with array element stride of 6.
>               *
> -             * For example, if there is a 3-components vector we
> submit one
> -             * untyped-write message of 32-bit (first two
> components), and one
> -             * byte-scattered write message (the last component).
> +             * In the case of 32-bit aligned constant offsets if
> there is
> +             * a 3-components vector we submit one untyped-write
> message
> +             * of 32-bit (first two components), and one byte-scattered
> +             * write message (the last component).
>               */
> 
> -            if (first_component % 2) {
> +            if ( !const_offset || ((const_offset->u32[0] +
> +                                   type_size * first_component) % 4)) {
>                 /* If we use a .yz writemask we also need to emit 2
>                  * byte-scattered write messages because of
> y-component not
>                  * being aligned to 32-bit.
> @@ -4183,7 +4190,7 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
>           }
> 
>           fs_reg offset_reg;
> -         nir_const_value *const_offset =
> nir_src_as_const_value(instr->src[2]);
> +
>           if (const_offset) {
>              offset_reg = brw_imm_ud(const_offset->u32[0] +
>                                      type_size * first_component);
> @@ -4222,7 +4229,8 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
>           } else {
>              assert(num_components * type_size <= 16);
>              assert((num_components * type_size) % 4 == 0);
> -            assert((first_component * type_size) % 4 == 0);
> +            assert(!const_offset ||
> +                   (const_offset->u32[0] + type_size *
> first_component) % 4 == 0);
> 
> 
> How about
> 
> assert(offset_reg.file != BRW_IMMEDIATE_VALUE || offset_reg.ud % 4 == 0);
> 
> We've already done the above calculation and stored it in offset_reg.

Makes sense.
> 
> Reviewed-by: Jason Ekstrand  >

Thanks for the review.

Chema

>              unsigned num_slots = (num_components * type_size) / 4;
> 
>              emit_untyped_write(bld, surf_index, offset_reg,
> --
> 2.14.3
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org 
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>