Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.
any update on that? I could take care of extracting the bits myself, I was just expecting that you would take care of that. On Thu, Jun 28, 2018 at 6:53 PM Manolova, Plamena wrote: > > Hi Karol, > Thank you for reviewing! I'll go ahead and push the changes you need from > nir_lower_system_values.c to master. > > Thank you, > Pam > > On Thu, Jun 28, 2018 at 5:50 AM, Karol Herbst wrote: >> >> Hi, >> >> if the changes inside "src/compiler/nir/nir_lower_system_values.c" are >> extracted into a seperate patch, this patch with the equal changes >> would be >> >> Reviewed-by: Karol Herbst >> >> I would need that for a nir to codegen pass for Nouveau and maybe it >> will help other drivers implementing this extension as well. I don't >> think it would hurt to extract those, right? >> >> Thanks! >> >> On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova >> wrote: >> > This patch adds the implementation of ARB_compute_variable_group_size >> > for i965. We do this by storing the group size in a buffer surface, >> > similarly to the work group number. >> > >> > v2: Fix some indentation inconsistencies (Jordan, Ilia) >> > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan) >> > Use alphabetical order in features.txt (Matt) >> > Set the extension constants properly in brw_context.c >> > >> > Signed-off-by: Plamena Manolova >> > --- >> > docs/features.txt| 2 +- >> > docs/relnotes/18.2.0.html| 1 + >> > src/compiler/nir/nir_lower_system_values.c | 13 >> > src/intel/compiler/brw_compiler.h| 2 + >> > src/intel/compiler/brw_fs.cpp| 45 >> > src/intel/compiler/brw_fs_nir.cpp| 20 ++ >> > src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 >> > +--- >> > src/mesa/drivers/dri/i965/brw_compute.c | 25 ++- >> > src/mesa/drivers/dri/i965/brw_context.c | 6 ++ >> > src/mesa/drivers/dri/i965/brw_context.h | 1 + >> > src/mesa/drivers/dri/i965/brw_cs.c | 4 ++ >> > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++- >> > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + >> > 13 files changed, 193 insertions(+), 42 deletions(-) >> > >> > diff --git a/docs/features.txt b/docs/features.txt >> > index ed4050cf98..81b6663288 100644 >> > --- a/docs/features.txt >> > +++ b/docs/features.txt >> > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part of >> > any OpenGL or OpenGL ES ve >> > >> >GL_ARB_bindless_texture DONE (nvc0, >> > radeonsi) >> >GL_ARB_cl_event not started >> > - GL_ARB_compute_variable_group_sizeDONE (nvc0, >> > radeonsi) >> > + GL_ARB_compute_variable_group_sizeDONE (i965, nvc0, >> > radeonsi) >> >GL_ARB_ES3_2_compatibilityDONE (i965/gen8+) >> >GL_ARB_fragment_shader_interlock DONE (i965) >> >GL_ARB_gpu_shader_int64 DONE (i965/gen8+, >> > nvc0, radeonsi, softpipe, llvmpipe) >> > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html >> > index 0db37b620d..7475a56633 100644 >> > --- a/docs/relnotes/18.2.0.html >> > +++ b/docs/relnotes/18.2.0.html >> > @@ -52,6 +52,7 @@ Note: some of the new features are only available with >> > certain drivers. >> > >> > >> > GL_ARB_fragment_shader_interlock on i965 >> > +GL_ARB_compute_variable_group_size on i965 >> > >> > >> > Bug fixes >> > diff --git a/src/compiler/nir/nir_lower_system_values.c >> > b/src/compiler/nir/nir_lower_system_values.c >> > index 487da04262..7ab005b000 100644 >> > --- a/src/compiler/nir/nir_lower_system_values.c >> > +++ b/src/compiler/nir/nir_lower_system_values.c >> > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b) >> >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" >> >*/ >> > >> > + /* >> > + * If the local work group size is variable we can't lower the >> > global >> > + * invocation id here. >> > + */ >> > + if (b->shader->info.cs.local_size_variable) { >> > +break; >> > + } >> > + >> > nir_const_value local_size; >> > memset(_size, 0, sizeof(local_size)); >> > local_size.u32[0] = b->shader->info.cs.local_size[0]; >> > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b) >> >} >> > >> >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { >> > + /* If the local work group size is variable we can't lower it >> > here */ >> > + if (b->shader->info.cs.local_size_variable) { >> > +break; >> > + } >> > + >> > nir_const_value local_size; >> > memset(_size, 0, sizeof(local_size)); >> > local_size.u32[0] =
Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.
Hi Karol, Thank you for reviewing! I'll go ahead and push the changes you need from nir_lower_system_values.c to master. Thank you, Pam On Thu, Jun 28, 2018 at 5:50 AM, Karol Herbst wrote: > Hi, > > if the changes inside "src/compiler/nir/nir_lower_system_values.c" are > extracted into a seperate patch, this patch with the equal changes > would be > > Reviewed-by: Karol Herbst > > I would need that for a nir to codegen pass for Nouveau and maybe it > will help other drivers implementing this extension as well. I don't > think it would hurt to extract those, right? > > Thanks! > > On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova > wrote: > > This patch adds the implementation of ARB_compute_variable_group_size > > for i965. We do this by storing the group size in a buffer surface, > > similarly to the work group number. > > > > v2: Fix some indentation inconsistencies (Jordan, Ilia) > > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan) > > Use alphabetical order in features.txt (Matt) > > Set the extension constants properly in brw_context.c > > > > Signed-off-by: Plamena Manolova > > --- > > docs/features.txt| 2 +- > > docs/relnotes/18.2.0.html| 1 + > > src/compiler/nir/nir_lower_system_values.c | 13 > > src/intel/compiler/brw_compiler.h| 2 + > > src/intel/compiler/brw_fs.cpp| 45 > > src/intel/compiler/brw_fs_nir.cpp| 20 ++ > > src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 > +--- > > src/mesa/drivers/dri/i965/brw_compute.c | 25 ++- > > src/mesa/drivers/dri/i965/brw_context.c | 6 ++ > > src/mesa/drivers/dri/i965/brw_context.h | 1 + > > src/mesa/drivers/dri/i965/brw_cs.c | 4 ++ > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++- > > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + > > 13 files changed, 193 insertions(+), 42 deletions(-) > > > > diff --git a/docs/features.txt b/docs/features.txt > > index ed4050cf98..81b6663288 100644 > > --- a/docs/features.txt > > +++ b/docs/features.txt > > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part > of any OpenGL or OpenGL ES ve > > > >GL_ARB_bindless_texture DONE (nvc0, > radeonsi) > >GL_ARB_cl_event not started > > - GL_ARB_compute_variable_group_sizeDONE (nvc0, > radeonsi) > > + GL_ARB_compute_variable_group_sizeDONE (i965, > nvc0, radeonsi) > >GL_ARB_ES3_2_compatibilityDONE > (i965/gen8+) > >GL_ARB_fragment_shader_interlock DONE (i965) > >GL_ARB_gpu_shader_int64 DONE > (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe) > > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html > > index 0db37b620d..7475a56633 100644 > > --- a/docs/relnotes/18.2.0.html > > +++ b/docs/relnotes/18.2.0.html > > @@ -52,6 +52,7 @@ Note: some of the new features are only available with > certain drivers. > > > > > > GL_ARB_fragment_shader_interlock on i965 > > +GL_ARB_compute_variable_group_size on i965 > > > > > > Bug fixes > > diff --git a/src/compiler/nir/nir_lower_system_values.c > b/src/compiler/nir/nir_lower_system_values.c > > index 487da04262..7ab005b000 100644 > > --- a/src/compiler/nir/nir_lower_system_values.c > > +++ b/src/compiler/nir/nir_lower_system_values.c > > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b) > >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" > >*/ > > > > + /* > > + * If the local work group size is variable we can't lower the > global > > + * invocation id here. > > + */ > > + if (b->shader->info.cs.local_size_variable) { > > +break; > > + } > > + > > nir_const_value local_size; > > memset(_size, 0, sizeof(local_size)); > > local_size.u32[0] = b->shader->info.cs.local_size[0]; > > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b) > >} > > > >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { > > + /* If the local work group size is variable we can't lower it > here */ > > + if (b->shader->info.cs.local_size_variable) { > > +break; > > + } > > + > > nir_const_value local_size; > > memset(_size, 0, sizeof(local_size)); > > local_size.u32[0] = b->shader->info.cs.local_size[0]; > > diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_ > compiler.h > > index 8b4e6fe2e2..f54952c28f 100644 > > --- a/src/intel/compiler/brw_compiler.h > > +++ b/src/intel/compiler/brw_compiler.h > > @@ -759,6 +759,7 @@ struct brw_cs_prog_data { > > unsigned threads; > > bool uses_barrier; >
Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.
Hi, if the changes inside "src/compiler/nir/nir_lower_system_values.c" are extracted into a seperate patch, this patch with the equal changes would be Reviewed-by: Karol Herbst I would need that for a nir to codegen pass for Nouveau and maybe it will help other drivers implementing this extension as well. I don't think it would hurt to extract those, right? Thanks! On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova wrote: > This patch adds the implementation of ARB_compute_variable_group_size > for i965. We do this by storing the group size in a buffer surface, > similarly to the work group number. > > v2: Fix some indentation inconsistencies (Jordan, Ilia) > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan) > Use alphabetical order in features.txt (Matt) > Set the extension constants properly in brw_context.c > > Signed-off-by: Plamena Manolova > --- > docs/features.txt| 2 +- > docs/relnotes/18.2.0.html| 1 + > src/compiler/nir/nir_lower_system_values.c | 13 > src/intel/compiler/brw_compiler.h| 2 + > src/intel/compiler/brw_fs.cpp| 45 > src/intel/compiler/brw_fs_nir.cpp| 20 ++ > src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 > +--- > src/mesa/drivers/dri/i965/brw_compute.c | 25 ++- > src/mesa/drivers/dri/i965/brw_context.c | 6 ++ > src/mesa/drivers/dri/i965/brw_context.h | 1 + > src/mesa/drivers/dri/i965/brw_cs.c | 4 ++ > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++- > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + > 13 files changed, 193 insertions(+), 42 deletions(-) > > diff --git a/docs/features.txt b/docs/features.txt > index ed4050cf98..81b6663288 100644 > --- a/docs/features.txt > +++ b/docs/features.txt > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part of any > OpenGL or OpenGL ES ve > >GL_ARB_bindless_texture DONE (nvc0, radeonsi) >GL_ARB_cl_event not started > - GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi) > + GL_ARB_compute_variable_group_sizeDONE (i965, nvc0, > radeonsi) >GL_ARB_ES3_2_compatibilityDONE (i965/gen8+) >GL_ARB_fragment_shader_interlock DONE (i965) >GL_ARB_gpu_shader_int64 DONE (i965/gen8+, > nvc0, radeonsi, softpipe, llvmpipe) > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html > index 0db37b620d..7475a56633 100644 > --- a/docs/relnotes/18.2.0.html > +++ b/docs/relnotes/18.2.0.html > @@ -52,6 +52,7 @@ Note: some of the new features are only available with > certain drivers. > > > GL_ARB_fragment_shader_interlock on i965 > +GL_ARB_compute_variable_group_size on i965 > > > Bug fixes > diff --git a/src/compiler/nir/nir_lower_system_values.c > b/src/compiler/nir/nir_lower_system_values.c > index 487da04262..7ab005b000 100644 > --- a/src/compiler/nir/nir_lower_system_values.c > +++ b/src/compiler/nir/nir_lower_system_values.c > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b) >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" >*/ > > + /* > + * If the local work group size is variable we can't lower the > global > + * invocation id here. > + */ > + if (b->shader->info.cs.local_size_variable) { > +break; > + } > + > nir_const_value local_size; > memset(_size, 0, sizeof(local_size)); > local_size.u32[0] = b->shader->info.cs.local_size[0]; > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b) >} > >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { > + /* If the local work group size is variable we can't lower it here > */ > + if (b->shader->info.cs.local_size_variable) { > +break; > + } > + > nir_const_value local_size; > memset(_size, 0, sizeof(local_size)); > local_size.u32[0] = b->shader->info.cs.local_size[0]; > diff --git a/src/intel/compiler/brw_compiler.h > b/src/intel/compiler/brw_compiler.h > index 8b4e6fe2e2..f54952c28f 100644 > --- a/src/intel/compiler/brw_compiler.h > +++ b/src/intel/compiler/brw_compiler.h > @@ -759,6 +759,7 @@ struct brw_cs_prog_data { > unsigned threads; > bool uses_barrier; > bool uses_num_work_groups; > + bool uses_variable_group_size; > > struct { >struct brw_push_const_block cross_thread; > @@ -771,6 +772,7 @@ struct brw_cs_prog_data { > * surface indices the CS-specific surfaces > */ >uint32_t work_groups_start; > + uint32_t work_group_size_start; >/** @} */ > } binding_table; > }; > diff --git
Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.
On 2018-06-07 08:34:26, Plamena Manolova wrote: > This patch adds the implementation of ARB_compute_variable_group_size > for i965. We do this by storing the group size in a buffer surface, > similarly to the work group number. > > v2: Fix some indentation inconsistencies (Jordan, Ilia) > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan) > Use alphabetical order in features.txt (Matt) > Set the extension constants properly in brw_context.c > > Signed-off-by: Plamena Manolova > --- > docs/features.txt| 2 +- > docs/relnotes/18.2.0.html| 1 + > src/compiler/nir/nir_lower_system_values.c | 13 > src/intel/compiler/brw_compiler.h| 2 + > src/intel/compiler/brw_fs.cpp| 45 > src/intel/compiler/brw_fs_nir.cpp| 20 ++ > src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88 > +--- > src/mesa/drivers/dri/i965/brw_compute.c | 25 ++- > src/mesa/drivers/dri/i965/brw_context.c | 6 ++ > src/mesa/drivers/dri/i965/brw_context.h | 1 + > src/mesa/drivers/dri/i965/brw_cs.c | 4 ++ > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++- > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + > 13 files changed, 193 insertions(+), 42 deletions(-) > > diff --git a/docs/features.txt b/docs/features.txt > index ed4050cf98..81b6663288 100644 > --- a/docs/features.txt > +++ b/docs/features.txt > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part of any > OpenGL or OpenGL ES ve > >GL_ARB_bindless_texture DONE (nvc0, radeonsi) >GL_ARB_cl_event not started > - GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi) > + GL_ARB_compute_variable_group_sizeDONE (i965, nvc0, > radeonsi) >GL_ARB_ES3_2_compatibilityDONE (i965/gen8+) >GL_ARB_fragment_shader_interlock DONE (i965) >GL_ARB_gpu_shader_int64 DONE (i965/gen8+, > nvc0, radeonsi, softpipe, llvmpipe) > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html > index 0db37b620d..7475a56633 100644 > --- a/docs/relnotes/18.2.0.html > +++ b/docs/relnotes/18.2.0.html > @@ -52,6 +52,7 @@ Note: some of the new features are only available with > certain drivers. > > > GL_ARB_fragment_shader_interlock on i965 > +GL_ARB_compute_variable_group_size on i965 > > > Bug fixes > diff --git a/src/compiler/nir/nir_lower_system_values.c > b/src/compiler/nir/nir_lower_system_values.c > index 487da04262..7ab005b000 100644 > --- a/src/compiler/nir/nir_lower_system_values.c > +++ b/src/compiler/nir/nir_lower_system_values.c > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b) >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" >*/ > > + /* > + * If the local work group size is variable we can't lower the > global > + * invocation id here. > + */ > + if (b->shader->info.cs.local_size_variable) { > +break; > + } > + > nir_const_value local_size; > memset(_size, 0, sizeof(local_size)); > local_size.u32[0] = b->shader->info.cs.local_size[0]; > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b) >} > >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { > + /* If the local work group size is variable we can't lower it here > */ > + if (b->shader->info.cs.local_size_variable) { > +break; > + } > + > nir_const_value local_size; > memset(_size, 0, sizeof(local_size)); > local_size.u32[0] = b->shader->info.cs.local_size[0]; > diff --git a/src/intel/compiler/brw_compiler.h > b/src/intel/compiler/brw_compiler.h > index 8b4e6fe2e2..f54952c28f 100644 > --- a/src/intel/compiler/brw_compiler.h > +++ b/src/intel/compiler/brw_compiler.h > @@ -759,6 +759,7 @@ struct brw_cs_prog_data { > unsigned threads; > bool uses_barrier; > bool uses_num_work_groups; > + bool uses_variable_group_size; > > struct { >struct brw_push_const_block cross_thread; > @@ -771,6 +772,7 @@ struct brw_cs_prog_data { > * surface indices the CS-specific surfaces > */ >uint32_t work_groups_start; > + uint32_t work_group_size_start; >/** @} */ > } binding_table; > }; > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp > index d67c0a4192..28730af47b 100644 > --- a/src/intel/compiler/brw_fs.cpp > +++ b/src/intel/compiler/brw_fs.cpp > @@ -7228,18 +7228,32 @@ brw_compile_cs(const struct brw_compiler *compiler, > void *log_data, > int shader_time_index, > char **error_str) > { > -