[Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs
Since this is all happening as a post-optimization fixup, and offsets are generally immediates, we can just do the calculation directly. Signed-off-by: Ilia Mirkin --- Only very mildly tested. Noticed it when looking closely at our shaders, thinking why it tries to shift 0 by a constant. This is why. src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c index c97df4f1d63..ab326c24aa7 100644 --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr) ssbo = create_immed(b, const_offset->u32[0]); offset = get_src(ctx, &intr->src[1])[0]; + const_offset = nir_src_as_const_value(intr->src[1]); /* src0 is data (or uvec2(data, compare)) * src1 is offset @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr) * Note that nir already multiplies the offset by four */ src0 = get_src(ctx, &intr->src[2])[0]; - src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0); + if (const_offset) + src1 = create_immed(b, const_offset->u32[0] >> 2); + else + src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0); src2 = create_collect(b, (struct ir3_instruction*[]){ offset, create_immed(b, 0), -- 2.13.6 ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs
On Sun, Nov 26, 2017 at 12:08 PM, Ilia Mirkin wrote: > Since this is all happening as a post-optimization fixup, and offsets > are generally immediates, we can just do the calculation directly. > > Signed-off-by: Ilia Mirkin > --- > > Only very mildly tested. Noticed it when looking closely at our shaders, > thinking > why it tries to shift 0 by a constant. This is why. not strictly against this, but a few thoughts: 1) I'm not sure how common in real life it is to access ssbo at hard-coded offsets.. I've noticed the funny shaders like shifting an immed zero by constant too, but figured it wasn't too likely to happen in real life. Although undoing nir's shl w/ our shr might be useful. 2) if it is common, maybe support in ir3_cp to recognize the handful of instructions that are added when lowering nir instructions to ir3 would be more beneficial (ie. ssbo load/store isn't the only one to add shl/shr/etc.. although the instructions added are a small subset of possible instructions so might be sane to make cp a bit more clever.. 3) or, perhaps an even better idea is nir->nir pass that lowers things into ir3 specific nir instructions and then run nir's opt passes again.. that has been kinda on my todo list for a while BR, -R > src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > index c97df4f1d63..ab326c24aa7 100644 > --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, > nir_intrinsic_instr *intr) > ssbo = create_immed(b, const_offset->u32[0]); > > offset = get_src(ctx, &intr->src[1])[0]; > + const_offset = nir_src_as_const_value(intr->src[1]); > > /* src0 is data (or uvec2(data, compare)) > * src1 is offset > @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, > nir_intrinsic_instr *intr) > * Note that nir already multiplies the offset by four > */ > src0 = get_src(ctx, &intr->src[2])[0]; > - src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0); > + if (const_offset) > + src1 = create_immed(b, const_offset->u32[0] >> 2); > + else > + src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0); > src2 = create_collect(b, (struct ir3_instruction*[]){ > offset, > create_immed(b, 0), > -- > 2.13.6 > > ___ > Freedreno mailing list > Freedreno@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/freedreno ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs
On Sun, Nov 26, 2017 at 1:29 PM, Rob Clark wrote: > On Sun, Nov 26, 2017 at 12:08 PM, Ilia Mirkin wrote: >> Since this is all happening as a post-optimization fixup, and offsets >> are generally immediates, we can just do the calculation directly. >> >> Signed-off-by: Ilia Mirkin >> --- >> >> Only very mildly tested. Noticed it when looking closely at our shaders, >> thinking >> why it tries to shift 0 by a constant. This is why. > > not strictly against this, but a few thoughts: > > 1) I'm not sure how common in real life it is to access ssbo at > hard-coded offsets.. I've noticed the funny shaders like shifting an > immed zero by constant too, but figured it wasn't too likely to happen > in real life. Although undoing nir's shl w/ our shr might be useful. I suspect it's moderately common. Any time you don't have a variably-indexed array, that will happen. > > 2) if it is common, maybe support in ir3_cp to recognize the handful > of instructions that are added when lowering nir instructions to ir3 > would be more beneficial (ie. ssbo load/store isn't the only one to > add shl/shr/etc.. although the instructions added are a small subset > of possible instructions so might be sane to make cp a bit more > clever.. > > 3) or, perhaps an even better idea is nir->nir pass that lowers things > into ir3 specific nir instructions and then run nir's opt passes > again.. that has been kinda on my todo list for a while Yeah, that's clearly the right way to go. Having new instructions added after opt is ... not a good idea. (This is why I've never warmed up to the "frontend" vs "backend" concept -- the backend needs the opts just as much.) Happy to drop this until that happens. I just hated seeing shr.b r0.x, 0, c0.x (Where c0.x == 2, of course.) -ilia > > BR, > -R > >> src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +- >> 1 file changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c >> b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c >> index c97df4f1d63..ab326c24aa7 100644 >> --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c >> +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c >> @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, >> nir_intrinsic_instr *intr) >> ssbo = create_immed(b, const_offset->u32[0]); >> >> offset = get_src(ctx, &intr->src[1])[0]; >> + const_offset = nir_src_as_const_value(intr->src[1]); >> >> /* src0 is data (or uvec2(data, compare)) >> * src1 is offset >> @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, >> nir_intrinsic_instr *intr) >> * Note that nir already multiplies the offset by four >> */ >> src0 = get_src(ctx, &intr->src[2])[0]; >> - src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0); >> + if (const_offset) >> + src1 = create_immed(b, const_offset->u32[0] >> 2); >> + else >> + src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0); >> src2 = create_collect(b, (struct ir3_instruction*[]){ >> offset, >> create_immed(b, 0), >> -- >> 2.13.6 >> >> ___ >> Freedreno mailing list >> Freedreno@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/freedreno ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno