Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.

2018-11-08 Thread Karol Herbst
any update on that?

I could take care of extracting the bits myself, I was just expecting
that you would take care of that.
On Thu, Jun 28, 2018 at 6:53 PM Manolova, Plamena
 wrote:
>
> Hi Karol,
> Thank you for reviewing! I'll go ahead and push the changes you need from
> nir_lower_system_values.c to master.
>
> Thank you,
> Pam
>
> On Thu, Jun 28, 2018 at 5:50 AM, Karol Herbst  wrote:
>>
>> Hi,
>>
>> if the changes inside "src/compiler/nir/nir_lower_system_values.c" are
>> extracted into a seperate patch, this patch with the equal changes
>> would be
>>
>> Reviewed-by: Karol Herbst 
>>
>> I would need that for a nir to codegen pass for Nouveau and maybe it
>> will help other drivers implementing this extension as well. I don't
>> think it would hurt to extract those, right?
>>
>> Thanks!
>>
>> On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova
>>  wrote:
>> > This patch adds the implementation of ARB_compute_variable_group_size
>> > for i965. We do this by storing the group size in a buffer surface,
>> > similarly to the work group number.
>> >
>> > v2: Fix some indentation inconsistencies (Jordan, Ilia)
>> > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan)
>> > Use alphabetical order in features.txt (Matt)
>> > Set the extension constants properly in brw_context.c
>> >
>> > Signed-off-by: Plamena Manolova 
>> > ---
>> >  docs/features.txt|  2 +-
>> >  docs/relnotes/18.2.0.html|  1 +
>> >  src/compiler/nir/nir_lower_system_values.c   | 13 
>> >  src/intel/compiler/brw_compiler.h|  2 +
>> >  src/intel/compiler/brw_fs.cpp| 45 
>> >  src/intel/compiler/brw_fs_nir.cpp| 20 ++
>> >  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 
>> > +---
>> >  src/mesa/drivers/dri/i965/brw_compute.c  | 25 ++-
>> >  src/mesa/drivers/dri/i965/brw_context.c  |  6 ++
>> >  src/mesa/drivers/dri/i965/brw_context.h  |  1 +
>> >  src/mesa/drivers/dri/i965/brw_cs.c   |  4 ++
>> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++-
>> >  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>> >  13 files changed, 193 insertions(+), 42 deletions(-)
>> >
>> > diff --git a/docs/features.txt b/docs/features.txt
>> > index ed4050cf98..81b6663288 100644
>> > --- a/docs/features.txt
>> > +++ b/docs/features.txt
>> > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part of 
>> > any OpenGL or OpenGL ES ve
>> >
>> >GL_ARB_bindless_texture   DONE (nvc0, 
>> > radeonsi)
>> >GL_ARB_cl_event   not started
>> > -  GL_ARB_compute_variable_group_sizeDONE (nvc0, 
>> > radeonsi)
>> > +  GL_ARB_compute_variable_group_sizeDONE (i965, nvc0, 
>> > radeonsi)
>> >GL_ARB_ES3_2_compatibilityDONE (i965/gen8+)
>> >GL_ARB_fragment_shader_interlock  DONE (i965)
>> >GL_ARB_gpu_shader_int64   DONE (i965/gen8+, 
>> > nvc0, radeonsi, softpipe, llvmpipe)
>> > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html
>> > index 0db37b620d..7475a56633 100644
>> > --- a/docs/relnotes/18.2.0.html
>> > +++ b/docs/relnotes/18.2.0.html
>> > @@ -52,6 +52,7 @@ Note: some of the new features are only available with 
>> > certain drivers.
>> >
>> >  
>> >  GL_ARB_fragment_shader_interlock on i965
>> > +GL_ARB_compute_variable_group_size on i965
>> >  
>> >
>> >  Bug fixes
>> > diff --git a/src/compiler/nir/nir_lower_system_values.c 
>> > b/src/compiler/nir/nir_lower_system_values.c
>> > index 487da04262..7ab005b000 100644
>> > --- a/src/compiler/nir/nir_lower_system_values.c
>> > +++ b/src/compiler/nir/nir_lower_system_values.c
>> > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b)
>> >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
>> >*/
>> >
>> > + /*
>> > +  * If the local work group size is variable we can't lower the 
>> > global
>> > +  * invocation id here.
>> > +  */
>> > + if (b->shader->info.cs.local_size_variable) {
>> > +break;
>> > + }
>> > +
>> >   nir_const_value local_size;
>> >   memset(_size, 0, sizeof(local_size));
>> >   local_size.u32[0] = b->shader->info.cs.local_size[0];
>> > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b)
>> >}
>> >
>> >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: {
>> > + /* If the local work group size is variable we can't lower it 
>> > here */
>> > + if (b->shader->info.cs.local_size_variable) {
>> > +break;
>> > + }
>> > +
>> >   nir_const_value local_size;
>> >   memset(_size, 0, sizeof(local_size));
>> >   local_size.u32[0] = 

Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.

2018-06-28 Thread Manolova, Plamena
Hi Karol,
Thank you for reviewing! I'll go ahead and push the changes you need from
nir_lower_system_values.c to master.

Thank you,
Pam

On Thu, Jun 28, 2018 at 5:50 AM, Karol Herbst  wrote:

> Hi,
>
> if the changes inside "src/compiler/nir/nir_lower_system_values.c" are
> extracted into a seperate patch, this patch with the equal changes
> would be
>
> Reviewed-by: Karol Herbst 
>
> I would need that for a nir to codegen pass for Nouveau and maybe it
> will help other drivers implementing this extension as well. I don't
> think it would hurt to extract those, right?
>
> Thanks!
>
> On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova
>  wrote:
> > This patch adds the implementation of ARB_compute_variable_group_size
> > for i965. We do this by storing the group size in a buffer surface,
> > similarly to the work group number.
> >
> > v2: Fix some indentation inconsistencies (Jordan, Ilia)
> > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan)
> > Use alphabetical order in features.txt (Matt)
> > Set the extension constants properly in brw_context.c
> >
> > Signed-off-by: Plamena Manolova 
> > ---
> >  docs/features.txt|  2 +-
> >  docs/relnotes/18.2.0.html|  1 +
> >  src/compiler/nir/nir_lower_system_values.c   | 13 
> >  src/intel/compiler/brw_compiler.h|  2 +
> >  src/intel/compiler/brw_fs.cpp| 45 
> >  src/intel/compiler/brw_fs_nir.cpp| 20 ++
> >  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88
> +---
> >  src/mesa/drivers/dri/i965/brw_compute.c  | 25 ++-
> >  src/mesa/drivers/dri/i965/brw_context.c  |  6 ++
> >  src/mesa/drivers/dri/i965/brw_context.h  |  1 +
> >  src/mesa/drivers/dri/i965/brw_cs.c   |  4 ++
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++-
> >  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
> >  13 files changed, 193 insertions(+), 42 deletions(-)
> >
> > diff --git a/docs/features.txt b/docs/features.txt
> > index ed4050cf98..81b6663288 100644
> > --- a/docs/features.txt
> > +++ b/docs/features.txt
> > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part
> of any OpenGL or OpenGL ES ve
> >
> >GL_ARB_bindless_texture   DONE (nvc0,
> radeonsi)
> >GL_ARB_cl_event   not started
> > -  GL_ARB_compute_variable_group_sizeDONE (nvc0,
> radeonsi)
> > +  GL_ARB_compute_variable_group_sizeDONE (i965,
> nvc0, radeonsi)
> >GL_ARB_ES3_2_compatibilityDONE
> (i965/gen8+)
> >GL_ARB_fragment_shader_interlock  DONE (i965)
> >GL_ARB_gpu_shader_int64   DONE
> (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
> > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html
> > index 0db37b620d..7475a56633 100644
> > --- a/docs/relnotes/18.2.0.html
> > +++ b/docs/relnotes/18.2.0.html
> > @@ -52,6 +52,7 @@ Note: some of the new features are only available with
> certain drivers.
> >
> >  
> >  GL_ARB_fragment_shader_interlock on i965
> > +GL_ARB_compute_variable_group_size on i965
> >  
> >
> >  Bug fixes
> > diff --git a/src/compiler/nir/nir_lower_system_values.c
> b/src/compiler/nir/nir_lower_system_values.c
> > index 487da04262..7ab005b000 100644
> > --- a/src/compiler/nir/nir_lower_system_values.c
> > +++ b/src/compiler/nir/nir_lower_system_values.c
> > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b)
> >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
> >*/
> >
> > + /*
> > +  * If the local work group size is variable we can't lower the
> global
> > +  * invocation id here.
> > +  */
> > + if (b->shader->info.cs.local_size_variable) {
> > +break;
> > + }
> > +
> >   nir_const_value local_size;
> >   memset(_size, 0, sizeof(local_size));
> >   local_size.u32[0] = b->shader->info.cs.local_size[0];
> > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b)
> >}
> >
> >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: {
> > + /* If the local work group size is variable we can't lower it
> here */
> > + if (b->shader->info.cs.local_size_variable) {
> > +break;
> > + }
> > +
> >   nir_const_value local_size;
> >   memset(_size, 0, sizeof(local_size));
> >   local_size.u32[0] = b->shader->info.cs.local_size[0];
> > diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_
> compiler.h
> > index 8b4e6fe2e2..f54952c28f 100644
> > --- a/src/intel/compiler/brw_compiler.h
> > +++ b/src/intel/compiler/brw_compiler.h
> > @@ -759,6 +759,7 @@ struct brw_cs_prog_data {
> > unsigned threads;
> > bool uses_barrier;
> 

Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.

2018-06-27 Thread Karol Herbst
Hi,

if the changes inside "src/compiler/nir/nir_lower_system_values.c" are
extracted into a seperate patch, this patch with the equal changes
would be

Reviewed-by: Karol Herbst 

I would need that for a nir to codegen pass for Nouveau and maybe it
will help other drivers implementing this extension as well. I don't
think it would hurt to extract those, right?

Thanks!

On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova
 wrote:
> This patch adds the implementation of ARB_compute_variable_group_size
> for i965. We do this by storing the group size in a buffer surface,
> similarly to the work group number.
>
> v2: Fix some indentation inconsistencies (Jordan, Ilia)
> Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan)
> Use alphabetical order in features.txt (Matt)
> Set the extension constants properly in brw_context.c
>
> Signed-off-by: Plamena Manolova 
> ---
>  docs/features.txt|  2 +-
>  docs/relnotes/18.2.0.html|  1 +
>  src/compiler/nir/nir_lower_system_values.c   | 13 
>  src/intel/compiler/brw_compiler.h|  2 +
>  src/intel/compiler/brw_fs.cpp| 45 
>  src/intel/compiler/brw_fs_nir.cpp| 20 ++
>  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 
> +---
>  src/mesa/drivers/dri/i965/brw_compute.c  | 25 ++-
>  src/mesa/drivers/dri/i965/brw_context.c  |  6 ++
>  src/mesa/drivers/dri/i965/brw_context.h  |  1 +
>  src/mesa/drivers/dri/i965/brw_cs.c   |  4 ++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++-
>  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>  13 files changed, 193 insertions(+), 42 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index ed4050cf98..81b6663288 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part of any 
> OpenGL or OpenGL ES ve
>
>GL_ARB_bindless_texture   DONE (nvc0, radeonsi)
>GL_ARB_cl_event   not started
> -  GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi)
> +  GL_ARB_compute_variable_group_sizeDONE (i965, nvc0, 
> radeonsi)
>GL_ARB_ES3_2_compatibilityDONE (i965/gen8+)
>GL_ARB_fragment_shader_interlock  DONE (i965)
>GL_ARB_gpu_shader_int64   DONE (i965/gen8+, 
> nvc0, radeonsi, softpipe, llvmpipe)
> diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html
> index 0db37b620d..7475a56633 100644
> --- a/docs/relnotes/18.2.0.html
> +++ b/docs/relnotes/18.2.0.html
> @@ -52,6 +52,7 @@ Note: some of the new features are only available with 
> certain drivers.
>
>  
>  GL_ARB_fragment_shader_interlock on i965
> +GL_ARB_compute_variable_group_size on i965
>  
>
>  Bug fixes
> diff --git a/src/compiler/nir/nir_lower_system_values.c 
> b/src/compiler/nir/nir_lower_system_values.c
> index 487da04262..7ab005b000 100644
> --- a/src/compiler/nir/nir_lower_system_values.c
> +++ b/src/compiler/nir/nir_lower_system_values.c
> @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b)
>*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
>*/
>
> + /*
> +  * If the local work group size is variable we can't lower the 
> global
> +  * invocation id here.
> +  */
> + if (b->shader->info.cs.local_size_variable) {
> +break;
> + }
> +
>   nir_const_value local_size;
>   memset(_size, 0, sizeof(local_size));
>   local_size.u32[0] = b->shader->info.cs.local_size[0];
> @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b)
>}
>
>case SYSTEM_VALUE_LOCAL_GROUP_SIZE: {
> + /* If the local work group size is variable we can't lower it here 
> */
> + if (b->shader->info.cs.local_size_variable) {
> +break;
> + }
> +
>   nir_const_value local_size;
>   memset(_size, 0, sizeof(local_size));
>   local_size.u32[0] = b->shader->info.cs.local_size[0];
> diff --git a/src/intel/compiler/brw_compiler.h 
> b/src/intel/compiler/brw_compiler.h
> index 8b4e6fe2e2..f54952c28f 100644
> --- a/src/intel/compiler/brw_compiler.h
> +++ b/src/intel/compiler/brw_compiler.h
> @@ -759,6 +759,7 @@ struct brw_cs_prog_data {
> unsigned threads;
> bool uses_barrier;
> bool uses_num_work_groups;
> +   bool uses_variable_group_size;
>
> struct {
>struct brw_push_const_block cross_thread;
> @@ -771,6 +772,7 @@ struct brw_cs_prog_data {
> * surface indices the CS-specific surfaces
> */
>uint32_t work_groups_start;
> +  uint32_t work_group_size_start;
>/** @} */
> } binding_table;
>  };
> diff --git 

Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.

2018-06-08 Thread Jordan Justen
On 2018-06-07 08:34:26, Plamena Manolova wrote:
> This patch adds the implementation of ARB_compute_variable_group_size
> for i965. We do this by storing the group size in a buffer surface,
> similarly to the work group number.
> 
> v2: Fix some indentation inconsistencies (Jordan, Ilia)
> Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan)
> Use alphabetical order in features.txt (Matt)
> Set the extension constants properly in brw_context.c
> 
> Signed-off-by: Plamena Manolova 
> ---
>  docs/features.txt|  2 +-
>  docs/relnotes/18.2.0.html|  1 +
>  src/compiler/nir/nir_lower_system_values.c   | 13 
>  src/intel/compiler/brw_compiler.h|  2 +
>  src/intel/compiler/brw_fs.cpp| 45 
>  src/intel/compiler/brw_fs_nir.cpp| 20 ++
>  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 
> +---
>  src/mesa/drivers/dri/i965/brw_compute.c  | 25 ++-
>  src/mesa/drivers/dri/i965/brw_context.c  |  6 ++
>  src/mesa/drivers/dri/i965/brw_context.h  |  1 +
>  src/mesa/drivers/dri/i965/brw_cs.c   |  4 ++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++-
>  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>  13 files changed, 193 insertions(+), 42 deletions(-)
> 
> diff --git a/docs/features.txt b/docs/features.txt
> index ed4050cf98..81b6663288 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part of any 
> OpenGL or OpenGL ES ve
> 
>GL_ARB_bindless_texture   DONE (nvc0, radeonsi)
>GL_ARB_cl_event   not started
> -  GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi)
> +  GL_ARB_compute_variable_group_sizeDONE (i965, nvc0, 
> radeonsi)
>GL_ARB_ES3_2_compatibilityDONE (i965/gen8+)
>GL_ARB_fragment_shader_interlock  DONE (i965)
>GL_ARB_gpu_shader_int64   DONE (i965/gen8+, 
> nvc0, radeonsi, softpipe, llvmpipe)
> diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html
> index 0db37b620d..7475a56633 100644
> --- a/docs/relnotes/18.2.0.html
> +++ b/docs/relnotes/18.2.0.html
> @@ -52,6 +52,7 @@ Note: some of the new features are only available with 
> certain drivers.
> 
>  
>  GL_ARB_fragment_shader_interlock on i965
> +GL_ARB_compute_variable_group_size on i965
>  
> 
>  Bug fixes
> diff --git a/src/compiler/nir/nir_lower_system_values.c 
> b/src/compiler/nir/nir_lower_system_values.c
> index 487da04262..7ab005b000 100644
> --- a/src/compiler/nir/nir_lower_system_values.c
> +++ b/src/compiler/nir/nir_lower_system_values.c
> @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b)
>*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
>*/
> 
> + /*
> +  * If the local work group size is variable we can't lower the 
> global
> +  * invocation id here.
> +  */
> + if (b->shader->info.cs.local_size_variable) {
> +break;
> + }
> +
>   nir_const_value local_size;
>   memset(_size, 0, sizeof(local_size));
>   local_size.u32[0] = b->shader->info.cs.local_size[0];
> @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b)
>}
> 
>case SYSTEM_VALUE_LOCAL_GROUP_SIZE: {
> + /* If the local work group size is variable we can't lower it here 
> */
> + if (b->shader->info.cs.local_size_variable) {
> +break;
> + }
> +
>   nir_const_value local_size;
>   memset(_size, 0, sizeof(local_size));
>   local_size.u32[0] = b->shader->info.cs.local_size[0];
> diff --git a/src/intel/compiler/brw_compiler.h 
> b/src/intel/compiler/brw_compiler.h
> index 8b4e6fe2e2..f54952c28f 100644
> --- a/src/intel/compiler/brw_compiler.h
> +++ b/src/intel/compiler/brw_compiler.h
> @@ -759,6 +759,7 @@ struct brw_cs_prog_data {
> unsigned threads;
> bool uses_barrier;
> bool uses_num_work_groups;
> +   bool uses_variable_group_size;
> 
> struct {
>struct brw_push_const_block cross_thread;
> @@ -771,6 +772,7 @@ struct brw_cs_prog_data {
> * surface indices the CS-specific surfaces
> */
>uint32_t work_groups_start;
> +  uint32_t work_group_size_start;
>/** @} */
> } binding_table;
>  };
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index d67c0a4192..28730af47b 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -7228,18 +7228,32 @@ brw_compile_cs(const struct brw_compiler *compiler, 
> void *log_data,
> int shader_time_index,
> char **error_str)
>  {
> -