Re: [Mesa-dev] [PATCH v3 29/43] i965/fs: Unpack 16-bit from 32-bit components in VS load_input

2017-11-01 Thread Jason Ekstrand
On Thu, Oct 12, 2017 at 11:38 AM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> The VS load input for 16-bit values receives pairs of 16-bit values
> packed in 32-bit values. Because of the adjusted format used at:
>
>  anv/pipeline: Use 32-bit surface formats for 16-bit formats
>
> v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand)
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 21 +++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index d2f2e17b70..83ff0607a7 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -2322,8 +2322,25 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder
> ,
>assert(const_offset && "Indirect input loads not allowed");
>src = offset(src, bld, const_offset->u32[0]);
>
> -  for (unsigned j = 0; j < num_components; j++) {
> - bld.MOV(offset(dest, bld, j), offset(src, bld, j +
> first_component));
> +  /* The VS load input for 16-bit values receives pairs of 16-bit
> values
> +   * packed in 32-bit values. This is an example on SIMD8:
> +   *
> +   * xy xy xy xy xy xy xy xy
> +   * zw zw zw zw yw zw zw xw
> +   *
> +   * We need to format it to something like:
> +   *
> +   * xx xx xx xx yy yy yy yy
> +   * zz zz zz zz ww ww ww ww
> +   */
>

Please pull this comment inside the if.


> +  if (type_sz(type) == 2) {
> + for (unsigned j = 0; j < num_components; j++)
> +bld.MOV(offset(dest, bld, j),
> +subscript(retype(offset(src,bld, (j / 2) * 2 +
> first_component),
>
+ BRW_REGISTER_TYPE_F), type, j % 2));


Ugh... This is rather hard to parse.  All of the retyping and subscripting
is making my head spin.  How about:

src.type = BRW_REGISTER_TYPE_F;
for (unsigned j = 0; j < num_components; j++) {
   unsigned src_comp = j / 2 + first_component;
   bld.MOV(offset(dest, bld, j),
 subscript(offset(src, bld, src_comp), dst.type, j % 2));
}


> +  } else {
> + for (unsigned j = 0; j < num_components; j++)
> +bld.MOV(offset(dest, bld, j), offset(src, bld, j +
> first_component));
>}
>
>if (type == BRW_REGISTER_TYPE_DF) {
> --
> 2.13.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 29/43] i965/fs: Unpack 16-bit from 32-bit components in VS load_input

2017-10-24 Thread Chema Casanova
El 15/10/17 a las 12:59, Pohjolainen, Topi escribió:
> On Thu, Oct 12, 2017 at 08:38:18PM +0200, Jose Maria Casanova Crespo wrote:
>> The VS load input for 16-bit values receives pairs of 16-bit values
>> packed in 32-bit values. Because of the adjusted format used at:
>>
>>  anv/pipeline: Use 32-bit surface formats for 16-bit formats
>>
>> v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand)
>> ---
>>  src/intel/compiler/brw_fs_nir.cpp | 21 +++--
>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs_nir.cpp 
>> b/src/intel/compiler/brw_fs_nir.cpp
>> index d2f2e17b70..83ff0607a7 100644
>> --- a/src/intel/compiler/brw_fs_nir.cpp
>> +++ b/src/intel/compiler/brw_fs_nir.cpp
>> @@ -2322,8 +2322,25 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder 
>> ,
>>assert(const_offset && "Indirect input loads not allowed");
>>src = offset(src, bld, const_offset->u32[0]);
>>  
>> -  for (unsigned j = 0; j < num_components; j++) {
>> - bld.MOV(offset(dest, bld, j), offset(src, bld, j + 
>> first_component));
>> +  /* The VS load input for 16-bit values receives pairs of 16-bit values
>> +   * packed in 32-bit values. This is an example on SIMD8:
>> +   *
>> +   * xy xy xy xy xy xy xy xy
>> +   * zw zw zw zw yw zw zw xw
> zw
Fixed locally.
>
>> +   *
>> +   * We need to format it to something like:
>> +   *
>> +   * xx xx xx xx yy yy yy yy
>> +   * zz zz zz zz ww ww ww ww
>> +   */
>> +  if (type_sz(type) == 2) {
>> + for (unsigned j = 0; j < num_components; j++)
>> +bld.MOV(offset(dest, bld, j),
>> +subscript(retype(offset(src,bld, (j / 2) * 2 + 
>> first_component),
> Space missing before 'bld'. I went thru the math and it looks correct to me.
Fixed locally. About the math, it is based in the same principles that
the one at shuffle_32bit_load_result_to_16bit_data function available at
the following patch of the series, but it is simpler as in this case
source and destination are different that we can not assume for the
general case.
>
>> + BRW_REGISTER_TYPE_F), type, j % 2));
>> +  } else {
>> + for (unsigned j = 0; j < num_components; j++)
>> +bld.MOV(offset(dest, bld, j), offset(src, bld, j + 
>> first_component));
>>}
>>  
>>if (type == BRW_REGISTER_TYPE_DF) {
>> -- 
>> 2.13.6
>>
>> ___
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 29/43] i965/fs: Unpack 16-bit from 32-bit components in VS load_input

2017-10-15 Thread Pohjolainen, Topi
On Thu, Oct 12, 2017 at 08:38:18PM +0200, Jose Maria Casanova Crespo wrote:
> The VS load input for 16-bit values receives pairs of 16-bit values
> packed in 32-bit values. Because of the adjusted format used at:
> 
>  anv/pipeline: Use 32-bit surface formats for 16-bit formats
> 
> v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand)
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 21 +++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp 
> b/src/intel/compiler/brw_fs_nir.cpp
> index d2f2e17b70..83ff0607a7 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -2322,8 +2322,25 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder 
> ,
>assert(const_offset && "Indirect input loads not allowed");
>src = offset(src, bld, const_offset->u32[0]);
>  
> -  for (unsigned j = 0; j < num_components; j++) {
> - bld.MOV(offset(dest, bld, j), offset(src, bld, j + 
> first_component));
> +  /* The VS load input for 16-bit values receives pairs of 16-bit values
> +   * packed in 32-bit values. This is an example on SIMD8:
> +   *
> +   * xy xy xy xy xy xy xy xy
> +   * zw zw zw zw yw zw zw xw

zw

> +   *
> +   * We need to format it to something like:
> +   *
> +   * xx xx xx xx yy yy yy yy
> +   * zz zz zz zz ww ww ww ww
> +   */
> +  if (type_sz(type) == 2) {
> + for (unsigned j = 0; j < num_components; j++)
> +bld.MOV(offset(dest, bld, j),
> +subscript(retype(offset(src,bld, (j / 2) * 2 + 
> first_component),

Space missing before 'bld'. I went thru the math and it looks correct to me.

> + BRW_REGISTER_TYPE_F), type, j % 2));
> +  } else {
> + for (unsigned j = 0; j < num_components; j++)
> +bld.MOV(offset(dest, bld, j), offset(src, bld, j + 
> first_component));
>}
>  
>if (type == BRW_REGISTER_TYPE_DF) {
> -- 
> 2.13.6
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 29/43] i965/fs: Unpack 16-bit from 32-bit components in VS load_input

2017-10-12 Thread Jose Maria Casanova Crespo
The VS load input for 16-bit values receives pairs of 16-bit values
packed in 32-bit values. Because of the adjusted format used at:

 anv/pipeline: Use 32-bit surface formats for 16-bit formats

v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand)
---
 src/intel/compiler/brw_fs_nir.cpp | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index d2f2e17b70..83ff0607a7 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -2322,8 +2322,25 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder ,
   assert(const_offset && "Indirect input loads not allowed");
   src = offset(src, bld, const_offset->u32[0]);
 
-  for (unsigned j = 0; j < num_components; j++) {
- bld.MOV(offset(dest, bld, j), offset(src, bld, j + first_component));
+  /* The VS load input for 16-bit values receives pairs of 16-bit values
+   * packed in 32-bit values. This is an example on SIMD8:
+   *
+   * xy xy xy xy xy xy xy xy
+   * zw zw zw zw yw zw zw xw
+   *
+   * We need to format it to something like:
+   *
+   * xx xx xx xx yy yy yy yy
+   * zz zz zz zz ww ww ww ww
+   */
+  if (type_sz(type) == 2) {
+ for (unsigned j = 0; j < num_components; j++)
+bld.MOV(offset(dest, bld, j),
+subscript(retype(offset(src,bld, (j / 2) * 2 + 
first_component),
+ BRW_REGISTER_TYPE_F), type, j % 2));
+  } else {
+ for (unsigned j = 0; j < num_components; j++)
+bld.MOV(offset(dest, bld, j), offset(src, bld, j + 
first_component));
   }
 
   if (type == BRW_REGISTER_TYPE_DF) {
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev