Re: [Mesa-dev] [PATCH v3 3/3] i965/nir/fs: Implement new barrier functions for compute shaders
On 2015-11-06 04:48:00, Francisco Jerez wrote: > Jordan Justen writes: > > > For these nir intrinsics, we emit the same code as > > nir_intrinsic_memory_barrier: > > > > * nir_intrinsic_memory_barrier_atomic_counter > > * nir_intrinsic_memory_barrier_buffer > > * nir_intrinsic_memory_barrier_image > > > > We treat these nir intrinsics as no-ops: > > * nir_intrinsic_group_memory_barrier > > * nir_intrinsic_memory_barrier_shared > > > > v3: > > * Add comment for no-op cases (curro) > > > > Signed-off-by: Jordan Justen > > Cc: Francisco Jerez > > --- > > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++ > > 1 file changed, 11 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > index b6f4c52..3b3bc67 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > @@ -1697,6 +1697,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > > nir_intrinsic_instr *instr > >break; > > } > > > > + case nir_intrinsic_memory_barrier_atomic_counter: > > + case nir_intrinsic_memory_barrier_buffer: > > + case nir_intrinsic_memory_barrier_image: > > case nir_intrinsic_memory_barrier: { > >const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 16 / > > dispatch_width); > >bld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp) > > @@ -1704,6 +1707,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder > > &bld, nir_intrinsic_instr *instr > >break; > > } > > > > + case nir_intrinsic_group_memory_barrier: > > + case nir_intrinsic_memory_barrier_shared: > > + /* We treat these single workgroup level barriers as no-ops. This is > > + * safe today because we don't allow memory functions to be > > re-ordered > > + * and we only execute programs on a single sub-slice. > > + */ > > You forgot to mention the fault-and-stream thing, which is going to be a > problem already on Gen9.5. How about the following: > > | /* We treat these workgroup-level barriers as no-ops. This should be > | * safe at present and as long as: > | * > | * - Memory access instructions are not subsequently reordered by the > | *compiler back-end. > | * > | * - All threads from a given compute shader workgroup fit within a > | *single subslice and therefore talk to the same HDC shared unit > | *what supposedly guarantees ordering and coherency between threads > | *from the same workgroup. This may change in the future when we > | *start splitting workgroups across multiple subslices. > | * > | * - The context is not in fault-and-stream mode, which could cause > | *memory transactions (including to SLM) prior to the barrier to be > | *replayed after the barrier if a pagefault occurs. This shouldn't > | *be a problem up to and including SKL because fault-and-stream is > | *not usable due to hardware issues, but that's likely to change in > | *the future. > | */ > > If you use that comment instead this patch is: > > Reviewed-by: Francisco Jerez Ok, but I think I'll move the comment to a separate patch authored by you. Thanks, -Jordan > > > > + break; > > + > > case nir_intrinsic_shader_clock: { > >/* We cannot do anything if there is an event, so ignore it for now > > */ > >fs_reg shader_clock = get_timestamp(bld); > > -- > > 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 3/3] i965/nir/fs: Implement new barrier functions for compute shaders
Jordan Justen writes: > For these nir intrinsics, we emit the same code as > nir_intrinsic_memory_barrier: > > * nir_intrinsic_memory_barrier_atomic_counter > * nir_intrinsic_memory_barrier_buffer > * nir_intrinsic_memory_barrier_image > > We treat these nir intrinsics as no-ops: > * nir_intrinsic_group_memory_barrier > * nir_intrinsic_memory_barrier_shared > > v3: > * Add comment for no-op cases (curro) > > Signed-off-by: Jordan Justen > Cc: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index b6f4c52..3b3bc67 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -1697,6 +1697,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr >break; > } > > + case nir_intrinsic_memory_barrier_atomic_counter: > + case nir_intrinsic_memory_barrier_buffer: > + case nir_intrinsic_memory_barrier_image: > case nir_intrinsic_memory_barrier: { >const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 16 / dispatch_width); >bld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp) > @@ -1704,6 +1707,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr >break; > } > > + case nir_intrinsic_group_memory_barrier: > + case nir_intrinsic_memory_barrier_shared: > + /* We treat these single workgroup level barriers as no-ops. This is > + * safe today because we don't allow memory functions to be re-ordered > + * and we only execute programs on a single sub-slice. > + */ You forgot to mention the fault-and-stream thing, which is going to be a problem already on Gen9.5. How about the following: | /* We treat these workgroup-level barriers as no-ops. This should be | * safe at present and as long as: | * | * - Memory access instructions are not subsequently reordered by the | *compiler back-end. | * | * - All threads from a given compute shader workgroup fit within a | *single subslice and therefore talk to the same HDC shared unit | *what supposedly guarantees ordering and coherency between threads | *from the same workgroup. This may change in the future when we | *start splitting workgroups across multiple subslices. | * | * - The context is not in fault-and-stream mode, which could cause | *memory transactions (including to SLM) prior to the barrier to be | *replayed after the barrier if a pagefault occurs. This shouldn't | *be a problem up to and including SKL because fault-and-stream is | *not usable due to hardware issues, but that's likely to change in | *the future. | */ If you use that comment instead this patch is: Reviewed-by: Francisco Jerez > + break; > + > case nir_intrinsic_shader_clock: { >/* We cannot do anything if there is an event, so ignore it for now */ >fs_reg shader_clock = get_timestamp(bld); > -- > 2.6.2 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 3/3] i965/nir/fs: Implement new barrier functions for compute shaders
For these nir intrinsics, we emit the same code as nir_intrinsic_memory_barrier: * nir_intrinsic_memory_barrier_atomic_counter * nir_intrinsic_memory_barrier_buffer * nir_intrinsic_memory_barrier_image We treat these nir intrinsics as no-ops: * nir_intrinsic_group_memory_barrier * nir_intrinsic_memory_barrier_shared v3: * Add comment for no-op cases (curro) Signed-off-by: Jordan Justen Cc: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index b6f4c52..3b3bc67 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1697,6 +1697,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr break; } + case nir_intrinsic_memory_barrier_atomic_counter: + case nir_intrinsic_memory_barrier_buffer: + case nir_intrinsic_memory_barrier_image: case nir_intrinsic_memory_barrier: { const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 16 / dispatch_width); bld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp) @@ -1704,6 +1707,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr break; } + case nir_intrinsic_group_memory_barrier: + case nir_intrinsic_memory_barrier_shared: + /* We treat these single workgroup level barriers as no-ops. This is + * safe today because we don't allow memory functions to be re-ordered + * and we only execute programs on a single sub-slice. + */ + break; + case nir_intrinsic_shader_clock: { /* We cannot do anything if there is an event, so ignore it for now */ fs_reg shader_clock = get_timestamp(bld); -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev