Re: [Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs
On Sun, Nov 26, 2017 at 1:29 PM, Rob Clark wrote:
> On Sun, Nov 26, 2017 at 12:08 PM, Ilia Mirkin wrote:
>> Since this is all happening as a post-optimization fixup, and offsets
>> are generally immediates, we can just do the calculation directly.
>>
>> Signed-off-by: Ilia Mirkin
>> ---
>>
>> Only very mildly tested. Noticed it when looking closely at our shaders,
>> thinking why it tries to shift 0 by a constant. This is why.
>
> not strictly against this, but a few thoughts:
>
> 1) I'm not sure how common in real life it is to access ssbo at
> hard-coded offsets.. I've noticed the funny shaders like shifting an
> immed zero by constant too, but figured it wasn't too likely to happen
> in real life. Although undoing nir's shl w/ our shr might be useful.

I suspect it's moderately common. Any time you don't have a
variably-indexed array, that will happen.

>
> 2) if it is common, maybe support in ir3_cp to recognize the handful
> of instructions that are added when lowering nir instructions to ir3
> would be more beneficial (ie. ssbo load/store isn't the only one to
> add shl/shr/etc.. although the instructions added are a small subset
> of possible instructions so might be sane to make cp a bit more
> clever..
>
> 3) or, perhaps an even better idea is nir->nir pass that lowers things
> into ir3 specific nir instructions and then run nir's opt passes
> again.. that has been kinda on my todo list for a while

Yeah, that's clearly the right way to go. Having new instructions
added after opt is ... not a good idea. (This is why I've never warmed
up to the "frontend" vs "backend" concept -- the backend needs the
opts just as much.)

Happy to drop this until that happens. I just hated seeing

shr.b r0.x, 0, c0.x

(Where c0.x == 2, of course.)

  -ilia

>
> BR,
> -R
>
>>  src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
>> index c97df4f1d63..ab326c24aa7 100644
>> --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
>> +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
>> @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>>  	ssbo = create_immed(b, const_offset->u32[0]);
>>
>>  	offset = get_src(ctx, &intr->src[1])[0];
>> +	const_offset = nir_src_as_const_value(intr->src[1]);
>>
>>  	/* src0 is data (or uvec2(data, compare))
>>  	 * src1 is offset
>> @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>>  	 * Note that nir already multiplies the offset by four
>>  	 */
>>  	src0 = get_src(ctx, &intr->src[2])[0];
>> -	src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
>> +	if (const_offset)
>> +		src1 = create_immed(b, const_offset->u32[0] >> 2);
>> +	else
>> +		src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
>>  	src2 = create_collect(b, (struct ir3_instruction*[]){
>>  		offset,
>>  		create_immed(b, 0),
>> --
>> 2.13.6

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs
On Sun, Nov 26, 2017 at 12:08 PM, Ilia Mirkin wrote:
> Since this is all happening as a post-optimization fixup, and offsets
> are generally immediates, we can just do the calculation directly.
>
> Signed-off-by: Ilia Mirkin
> ---
>
> Only very mildly tested. Noticed it when looking closely at our shaders,
> thinking why it tries to shift 0 by a constant. This is why.

not strictly against this, but a few thoughts:

1) I'm not sure how common in real life it is to access ssbo at
hard-coded offsets.. I've noticed the funny shaders like shifting an
immed zero by constant too, but figured it wasn't too likely to happen
in real life. Although undoing nir's shl w/ our shr might be useful.

2) if it is common, maybe support in ir3_cp to recognize the handful
of instructions that are added when lowering nir instructions to ir3
would be more beneficial (ie. ssbo load/store isn't the only one to
add shl/shr/etc.. although the instructions added are a small subset
of possible instructions so might be sane to make cp a bit more
clever..

3) or, perhaps an even better idea is nir->nir pass that lowers things
into ir3 specific nir instructions and then run nir's opt passes
again.. that has been kinda on my todo list for a while

BR,
-R

>  src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> index c97df4f1d63..ab326c24aa7 100644
> --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>  	ssbo = create_immed(b, const_offset->u32[0]);
>
>  	offset = get_src(ctx, &intr->src[1])[0];
> +	const_offset = nir_src_as_const_value(intr->src[1]);
>
>  	/* src0 is data (or uvec2(data, compare))
>  	 * src1 is offset
> @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>  	 * Note that nir already multiplies the offset by four
>  	 */
>  	src0 = get_src(ctx, &intr->src[2])[0];
> -	src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
> +	if (const_offset)
> +		src1 = create_immed(b, const_offset->u32[0] >> 2);
> +	else
> +		src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
>  	src2 = create_collect(b, (struct ir3_instruction*[]){
>  		offset,
>  		create_immed(b, 0),
> --
> 2.13.6
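For anyone following point 2, here is a rough sketch of what a slightly cleverer copy-propagation step could look like. It is only an illustration on a toy IR: `struct toy_instr`, `toy_cp_shr()` and the `TOY_*` opcodes are invented for this example and are not the real ir3 or ir3_cp data structures. It folds a right shift of an immediate into a new immediate, and cancels a right shift that directly undoes a left shift by the same amount. The cancellation assumes the left shift did not push any set bits out of the 32-bit value, which would need to be guaranteed by whoever generates the offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy IR for illustration only; these are not the real ir3 structures. */
enum toy_op { TOY_IMMED, TOY_INPUT, TOY_SHL, TOY_SHR };

struct toy_instr {
	enum toy_op op;
	uint32_t immed;         /* TOY_IMMED: value; TOY_SHL/TOY_SHR: shift amount */
	struct toy_instr *src;  /* TOY_SHL/TOY_SHR: the value being shifted */
};

/*
 * Try to simplify a TOY_SHR and return the instruction that should be
 * used in its place (possibly the same instruction, rewritten in place).
 *
 * ASSUMPTION: replacing (x << n) >> n with x is only valid when the top
 * n bits of x are zero. The sketch takes that for granted, as it would be
 * for a small buffer offset; a real pass would have to prove it.
 */
static struct toy_instr *
toy_cp_shr(struct toy_instr *instr)
{
	if (instr->op != TOY_SHR || instr->src == NULL)
		return instr;

	/* Shifting a 32-bit value by 32 or more is undefined in C. */
	if (instr->immed >= 32)
		return instr;

	struct toy_instr *src = instr->src;

	/* shr of an immediate: fold into a new immediate. */
	if (src->op == TOY_IMMED) {
		instr->op = TOY_IMMED;
		instr->immed = src->immed >> instr->immed;
		instr->src = NULL;
		return instr;
	}

	/* shr undoing a shl by the same amount: use the unshifted value. */
	if (src->op == TOY_SHL && src->immed == instr->immed && src->src != NULL)
		return src->src;

	return instr;
}
```

With something along these lines, a shift of an immediate zero by 2 would collapse to the immediate 0, and `(x << 2) >> 2` would be replaced by `x`, regardless of which lowering step introduced the shifts.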
[Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs
Since this is all happening as a post-optimization fixup, and offsets
are generally immediates, we can just do the calculation directly.

Signed-off-by: Ilia Mirkin
---

Only very mildly tested. Noticed it when looking closely at our shaders,
thinking why it tries to shift 0 by a constant. This is why.

 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index c97df4f1d63..ab326c24aa7 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 	ssbo = create_immed(b, const_offset->u32[0]);

 	offset = get_src(ctx, &intr->src[1])[0];
+	const_offset = nir_src_as_const_value(intr->src[1]);

 	/* src0 is data (or uvec2(data, compare))
 	 * src1 is offset
@@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 	 * Note that nir already multiplies the offset by four
 	 */
 	src0 = get_src(ctx, &intr->src[2])[0];
-	src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
+	if (const_offset)
+		src1 = create_immed(b, const_offset->u32[0] >> 2);
+	else
+		src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
 	src2 = create_collect(b, (struct ir3_instruction*[]){
 		offset,
 		create_immed(b, 0),
--
2.13.6
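To make the intent of the change easier to see outside the driver, here is a minimal standalone sketch of the same decision. It is not the ir3 code: `struct operand`, the `OPERAND_*` kinds and `lower_dword_offset()` are hypothetical names made up for this illustration, standing in for `struct ir3_instruction *`, `create_immed()` and `ir3_SHR_B()`. The idea is only that a byte offset known at compile time can be shifted right by 2 on the spot, while a runtime offset still needs the shift instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an ir3 source operand; not the real ir3 types. */
enum operand_kind {
	OPERAND_IMMED,  /* value known at compile time */
	OPERAND_SHR_B,  /* a runtime "shr.b src, shift" still has to be emitted */
};

struct operand {
	enum operand_kind kind;
	uint32_t immed;  /* valid for OPERAND_IMMED */
	uint32_t shift;  /* valid for OPERAND_SHR_B */
};

/*
 * nir has already multiplied the offset by four; the lowering undoes that
 * with a right shift by 2. If the offset is a compile-time constant, fold
 * the shift here instead of emitting an instruction that shifts an
 * immediate.
 */
static struct operand
lower_dword_offset(bool is_const, uint32_t const_byte_offset)
{
	struct operand op = {0};

	if (is_const) {
		op.kind = OPERAND_IMMED;
		op.immed = const_byte_offset >> 2;
	} else {
		op.kind = OPERAND_SHR_B;
		op.shift = 2;
	}
	return op;
}
```

In this sketch a constant byte offset of 16 becomes the immediate 4, and a constant 0 stays 0, so no `shr.b` of an immediate is produced in either case.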
Re: [Freedreno] [PATCH 09/15] drm/msm/mdp5: Use drm_mode_get_hv_timing() to populate plane clip rectangle
On 11/24/2017 12:34 AM, Ville Syrjala wrote:
> From: Ville Syrjälä
>
> Use drm_mode_get_hv_timing() to fill out the plane clip rectangle.
>
> Note that this replaces crtc_state->adjusted_mode usage with
> crtc_state->mode. The latter is the correct choice since that's the
> mode the user provided and it matches the plane crtc coordinates
> the user also provided.
>
> Once everyone agrees on this we can move the clip handling into
> drm_atomic_helper_check_plane_state().

For this and the msm change in patch # 15/15:

Reviewed-by: Archit Taneja

Thanks,
Archit

> Cc: Laurent Pinchart
> Cc: Rob Clark
> Cc: Archit Taneja
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedreno@lists.freedesktop.org
> Signed-off-by: Ville Syrjälä
> ---
>  drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
> index ee41423baeb7..09f758e7bb1b 100644
> --- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
> +++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
> @@ -286,7 +286,7 @@ static int mdp5_plane_atomic_check_with_state(struct drm_crtc_state *crtc_state,
>  	uint32_t max_width, max_height;
>  	bool out_of_bounds = false;
>  	uint32_t caps = 0;
> -	struct drm_rect clip;
> +	struct drm_rect clip = {};
>  	int min_scale, max_scale;
>  	int ret;
>
> @@ -320,13 +320,13 @@ static int mdp5_plane_atomic_check_with_state(struct drm_crtc_state *crtc_state,
>  			return -ERANGE;
>  		}
>
> -		clip.x1 = 0;
> -		clip.y1 = 0;
> -		clip.x2 = crtc_state->adjusted_mode.hdisplay;
> -		clip.y2 = crtc_state->adjusted_mode.vdisplay;
>  		min_scale = FRAC_16_16(1, 8);
>  		max_scale = FRAC_16_16(8, 1);
>
> +		if (crtc_state->enable)
> +			drm_mode_get_hv_timing(&crtc_state->mode,
> +					       &clip.x2, &clip.y2);
> +
>  		ret = drm_atomic_helper_check_plane_state(state, crtc_state, &clip,
>  							  min_scale, max_scale,
>  							  true, true);
> @@ -471,7 +471,7 @@ static int mdp5_plane_atomic_async_check(struct drm_plane *plane,
>  {
>  	struct mdp5_plane_state *mdp5_state = to_mdp5_plane_state(state);
>  	struct drm_crtc_state *crtc_state;
> -	struct drm_rect clip;
> +	struct drm_rect clip = {};
>  	int min_scale, max_scale;
>  	int ret;
>
> @@ -499,13 +499,13 @@ static int mdp5_plane_atomic_async_check(struct drm_plane *plane,
>  	    plane->state->fb != state->fb)
>  		return -EINVAL;
>
> -	clip.x1 = 0;
> -	clip.y1 = 0;
> -	clip.x2 = crtc_state->adjusted_mode.hdisplay;
> -	clip.y2 = crtc_state->adjusted_mode.vdisplay;
>  	min_scale = FRAC_16_16(1, 8);
>  	max_scale = FRAC_16_16(8, 1);
>
> +	if (crtc_state->enable)
> +		drm_mode_get_hv_timing(&crtc_state->mode,
> +				       &clip.x2, &clip.y2);
> +
>  	ret = drm_atomic_helper_check_plane_state(state, crtc_state, &clip,
>  						  min_scale, max_scale,
>  						  true, true);

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
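As a reading aid for the clip handling in this patch, here is a small userspace sketch of the pattern it introduces. Everything in it is a simplified stand-in rather than kernel code: the `toy_*` structs and `toy_plane_clip()` are invented for the example, and `toy_get_hv_timing()` is a hypothetical substitute for `drm_mode_get_hv_timing()` that just copies the two display sizes (the real helper may account for more than that). What it tries to show is that the rectangle starts out zeroed, so x1/y1 no longer need explicit assignment, and that x2/y2 come from the user-provided mode only when the CRTC is enabled.

```c
#include <stdbool.h>

/* Simplified stand-ins for the DRM types; not the kernel definitions. */
struct toy_rect { int x1, y1, x2, y2; };
struct toy_mode { int hdisplay, vdisplay; };

struct toy_crtc_state {
	bool enable;
	struct toy_mode mode;           /* mode as provided by the user */
	struct toy_mode adjusted_mode;  /* mode after driver adjustments */
};

/*
 * Hypothetical substitute for drm_mode_get_hv_timing(): this version only
 * copies the visible width and height out of the mode.
 */
static void
toy_get_hv_timing(const struct toy_mode *mode, int *hdisplay, int *vdisplay)
{
	*hdisplay = mode->hdisplay;
	*vdisplay = mode->vdisplay;
}

/* Build the plane clip rectangle the way the patched check functions do. */
static struct toy_rect
toy_plane_clip(const struct toy_crtc_state *crtc_state)
{
	struct toy_rect clip = {0};  /* x1 = y1 = 0 without explicit stores */

	/* Use the user-provided mode, not adjusted_mode. */
	if (crtc_state->enable)
		toy_get_hv_timing(&crtc_state->mode, &clip.x2, &clip.y2);

	return clip;
}
```

In this sketch a disabled CRTC leaves the clip rectangle empty (all zeros), and an enabled one yields a rectangle from (0, 0) to the size of `mode`, even when `adjusted_mode` differs.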