Jordan Justen <jordan.l.jus...@intel.com> writes: > For these nir intrinsics, we emit the same code as > nir_intrinsic_memory_barrier: > > * nir_intrinsic_memory_barrier_atomic_counter > * nir_intrinsic_memory_barrier_buffer > * nir_intrinsic_memory_barrier_image > > We treat these nir intrinsics as no-ops: > * nir_intrinsic_group_memory_barrier > * nir_intrinsic_memory_barrier_shared > > v3: > * Add comment for no-op cases (curro) > > Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com> > Cc: Francisco Jerez <curroje...@riseup.net> > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index b6f4c52..3b3bc67 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -1697,6 +1697,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr > break; > } > > + case nir_intrinsic_memory_barrier_atomic_counter: > + case nir_intrinsic_memory_barrier_buffer: > + case nir_intrinsic_memory_barrier_image: > case nir_intrinsic_memory_barrier: { > const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 16 / dispatch_width); > bld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp) > @@ -1704,6 +1707,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr > break; > } > > + case nir_intrinsic_group_memory_barrier: > + case nir_intrinsic_memory_barrier_shared: > + /* We treat these single workgroup level barriers as no-ops. This is > + * safe today because we don't allow memory functions to be re-ordered > + * and we only execute programs on a single sub-slice. > + */
You forgot to mention the fault-and-stream thing, which is going to be a problem already on Gen9.5. How about the following: | /* We treat these workgroup-level barriers as no-ops. This should be | * safe at present and as long as: | * | * - Memory access instructions are not subsequently reordered by the | * compiler back-end. | * | * - All threads from a given compute shader workgroup fit within a | * single subslice and therefore talk to the same HDC shared unit | * what supposedly guarantees ordering and coherency between threads | * from the same workgroup. This may change in the future when we | * start splitting workgroups across multiple subslices. | * | * - The context is not in fault-and-stream mode, which could cause | * memory transactions (including to SLM) prior to the barrier to be | * replayed after the barrier if a pagefault occurs. This shouldn't | * be a problem up to and including SKL because fault-and-stream is | * not usable due to hardware issues, but that's likely to change in | * the future. | */ If you use that comment instead this patch is: Reviewed-by: Francisco Jerez <curroje...@riseup.net> > + break; > + > case nir_intrinsic_shader_clock: { > /* We cannot do anything if there is an event, so ignore it for now */ > fs_reg shader_clock = get_timestamp(bld); > -- > 2.6.2
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev