[Mesa-dev] [PATCH 5/6] i965: Implement nir_intrinsic_shader_clock

2015-10-07 Thread Emil Velikov
Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  9 +
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 ++
 2 files changed, 19 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 03fe680..bcb8f38 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1317,6 +1317,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   break;
}
 
+   case nir_intrinsic_shader_clock: {
+  /* We cannot do anything if there is an event, so ignore it for now */
+  fs_reg shader_clock = get_timestamp(bld);
+
+  bld.MOV(retype(dest, brw_type_for_base_type(glsl_type::uvec2_type)),
+  shader_clock);
+  break;
+   }
+
case nir_intrinsic_image_size: {
   /* Get the referenced image variable and type. */
   const nir_variable *var = instr->variables[0]->var;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 41bd80d..f1de8d4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -806,6 +806,16 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
   break;
}
 
+   case nir_intrinsic_shader_clock: {
+  /* We cannot do anything if there is an event, so ignore it for now */
+  src_reg shader_clock = get_timestamp();
+  enum brw_reg_type type = brw_type_for_base_type(glsl_type::uvec2_type);
+
+  dest = get_nir_dest(instr->dest, type);
+  emit(MOV(dest, retype(shader_clock, type)));
+  break;
+   }
+
default:
   unreachable("Unknown intrinsic");
}
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] i965: Implement nir_intrinsic_shader_clock

2015-10-07 Thread Connor Abbott
On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov  wrote:
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  9 +
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 ++
>  2 files changed, 19 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 03fe680..bcb8f38 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1317,6 +1317,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>break;
> }
>
> +   case nir_intrinsic_shader_clock: {
> +  /* We cannot do anything if there is an event, so ignore it for now */
> +  fs_reg shader_clock = get_timestamp(bld);

get_timestamp() isn't doing quite what you want. If you look at the
definition of fs_visitor::get_timestamp(), you see this comment:

   /* The caller wants the low 32 bits of the timestamp.  Since it's running
* at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
* which is plenty of time for our purposes.  It is identical across the
* EUs, but since it's tracking GPU core speed it will increment at a
* varying rate as render P-states change.
*
* The caller could also check if render P-states have changed (or anything
* else that might disrupt timing) by setting smear to 2 and checking if
* that field is != 0.
*/
When you read the ARF register, you get three components set in your
SIMD8 register, plus some junk because we have to read in four
components at a time. That is, this code in get_timestamp:

   fs_reg ts = fs_reg(retype(brw_vec4_reg(BRW_ARCHITECTURE_REGISTER_FILE,
  BRW_ARF_TIMESTAMP,
  0),
 BRW_REGISTER_TYPE_UD));

   bld.group(4, 0).exec_all().MOV(dst, ts);

will create a register called dst that looks like this:

 --
| tm0.0 | tm0.1 | tm0.2 | junk |
 --

Then doing dst.set_smear(0) will create a source that looks like this:

 ---
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
 ---

Or, in SIMD16,

 ---
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
 ---
 ---
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
 ---

Whereas what you actually want is:

 ---
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
 ---
 ---
| tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 |
 ---

Or, in SIMD16,

 ---
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
 ---
 ---
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
 ---
 ---
| tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 |
 ---
 ---
| tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 |
 ---


You can use get_timestamp(), except you need to save the result and
emit two more MOV's to appropriately-sized registers allocated using
bld.vgrf(BRW_REGISTER_TYPE_UD) using the same smear() method that the
get_timestamp() code uses (note that although it did smear(0), you can
still call .smear(1) on the result to pick out tm0.1) to pick out the
right components, and then a LOAD_PAYLOAD to combine them into the
destination.

> +
> +  bld.MOV(retype(dest, brw_type_for_base_type(glsl_type::uvec2_type)),
> +  shader_clock);
> +  break;
> +   }
> +
> case nir_intrinsic_image_size: {
>/* Get the referenced image variable and type. */
>const nir_variable *var = instr->variables[0]->var;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 41bd80d..f1de8d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -806,6 +806,16 @@ ve

Re: [Mesa-dev] [PATCH 5/6] i965: Implement nir_intrinsic_shader_clock

2015-10-08 Thread Emil Velikov
On 8 October 2015 at 00:36, Connor Abbott  wrote:
> On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov  wrote:
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  9 +
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 ++
>>  2 files changed, 19 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 03fe680..bcb8f38 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1317,6 +1317,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
>> nir_intrinsic_instr *instr
>>break;
>> }
>>
>> +   case nir_intrinsic_shader_clock: {
>> +  /* We cannot do anything if there is an event, so ignore it for now */
>> +  fs_reg shader_clock = get_timestamp(bld);
>
> get_timestamp() isn't doing quite what you want. If you look at the
> definition of fs_visitor::get_timestamp(), you see this comment:
>
>/* The caller wants the low 32 bits of the timestamp.  Since it's running
> * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
> * which is plenty of time for our purposes.  It is identical across the
> * EUs, but since it's tracking GPU core speed it will increment at a
> * varying rate as render P-states change.
> *
> * The caller could also check if render P-states have changed (or anything
> * else that might disrupt timing) by setting smear to 2 and checking if
> * that field is != 0.
> */
> When you read the ARF register, you get three components set in your
> SIMD8 register, plus some junk because we have to read in four
> components at a time. That is, this code in get_timestamp:
>
>fs_reg ts = fs_reg(retype(brw_vec4_reg(BRW_ARCHITECTURE_REGISTER_FILE,
>   BRW_ARF_TIMESTAMP,
>   0),
>  BRW_REGISTER_TYPE_UD));
>
>bld.group(4, 0).exec_all().MOV(dst, ts);
>
> will create a register called dst that looks like this:
>
>  --
> | tm0.0 | tm0.1 | tm0.2 | junk |
>  --
>
> Then doing dst.set_smear(0) will create a source that looks like this:
>
Thanks for this. Not sure how I missed this hunk.

I'm leaning towards pushing the comment + smear to
emit_shader_time_{begin.end} as it's applicable only there. This will
make the function analogous to the vec4 one, not to mention that it'll
save us the back and forth gymnastic, as you kindly described.

Cheers,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev