Re: [Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs

2017-11-26 Thread Ilia Mirkin
On Sun, Nov 26, 2017 at 1:29 PM, Rob Clark wrote:
> On Sun, Nov 26, 2017 at 12:08 PM, Ilia Mirkin wrote:
>> Since this is all happening as a post-optimization fixup, and offsets
>> are generally immediates, we can just do the calculation directly.
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>
>> Only very mildly tested. Noticed it when looking closely at our shaders,
>> thinking why it tries to shift 0 by a constant. This is why.
>
> not strictly against this, but a few thoughts:
>
> 1) I'm not sure how common in real life it is to access ssbo at
> hard-coded offsets.. I've noticed the funny shaders like shifting an
> immed zero by constant too, but figured it wasn't too likely to happen
> in real life.  Although undoing nir's shl w/ our shr might be useful.

I suspect it's moderately common. Any time you don't have a
variably-indexed array, that will happen.

>
> 2) if it is common, maybe support in ir3_cp to recognize the handful
> of instructions that are added when lowering nir instructions to ir3
> would be more beneficial (i.e. ssbo load/store isn't the only one to
> add shl/shr/etc.), although the instructions added are a small subset
> of possible instructions, so it might be sane to make cp a bit more
> clever..
>
> 3) or, perhaps an even better idea is nir->nir pass that lowers things
> into ir3 specific nir instructions and then run nir's opt passes
> again.. that has been kinda on my todo list for a while

Yeah, that's clearly the right way to go. Having new instructions
added after opt is ... not a good idea. (This is why I've never warmed
up to the "frontend" vs "backend" concept -- the backend needs the
opts just as much.)

Happy to drop this until that happens. I just hated seeing

shr.b r0.x, 0, c0.x

(Where c0.x == 2, of course.)

  -ilia

>
> BR,
> -R
>
>>  src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
>> index c97df4f1d63..ab326c24aa7 100644
>> --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
>> +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
>> @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>> ssbo = create_immed(b, const_offset->u32[0]);
>>
>> offset = get_src(ctx, &intr->src[1])[0];
>> +   const_offset = nir_src_as_const_value(intr->src[1]);
>>
>> /* src0 is data (or uvec2(data, compare))
>>  * src1 is offset
>> @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>>  * Note that nir already multiplies the offset by four
>>  */
>> src0 = get_src(ctx, &intr->src[2])[0];
>> -   src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
>> +   if (const_offset)
>> +   src1 = create_immed(b, const_offset->u32[0] >> 2);
>> +   else
>> +   src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
>> src2 = create_collect(b, (struct ir3_instruction*[]){
>> offset,
>> create_immed(b, 0),
>> --
>> 2.13.6
>>
>> ___
>> Freedreno mailing list
>> Freedreno@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/freedreno
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno
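
The fold discussed in this message can be sketched outside the driver. This is a minimal illustration in C, not the ir3 code: `struct operand`, `dword_offset` and `shifts_emitted` are hypothetical names, and the only thing taken from the patch is the idea that a constant byte offset can be shifted right by 2 at compile time instead of emitting a shr.b. The alignment assert is an assumption based on the patch's comment that nir already multiplies the offset by four.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an IR source operand: either a known
 * compile-time constant or a value living in a register. */
struct operand {
    bool is_const;
    uint32_t value;   /* meaningful when is_const is true */
    int reg;          /* meaningful when is_const is false */
};

/* Counts how many runtime shift instructions would have been emitted. */
static int shifts_emitted;

/* Convert a byte offset into a dword offset. A constant is folded
 * immediately; anything else needs a runtime shift right by 2. */
static struct operand dword_offset(struct operand byte_off)
{
    if (byte_off.is_const) {
        /* Assumption: constant offsets are dword aligned. */
        assert(byte_off.value % 4 == 0);
        return (struct operand){ .is_const = true,
                                 .value = byte_off.value >> 2 };
    }

    /* Non-constant case: pretend to emit "shr.b dst, src, 2" and
     * hand back a fresh (hypothetical) destination register. */
    shifts_emitted++;
    return (struct operand){ .is_const = false, .reg = byte_off.reg + 1 };
}
```

Calling `dword_offset` with a constant operand yields another constant and leaves `shifts_emitted` untouched; only a register operand bumps the counter, which is the case where the shr.b is still needed.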


Re: [Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs

2017-11-26 Thread Rob Clark
On Sun, Nov 26, 2017 at 12:08 PM, Ilia Mirkin wrote:
> Since this is all happening as a post-optimization fixup, and offsets
> are generally immediates, we can just do the calculation directly.
>
> Signed-off-by: Ilia Mirkin 
> ---
>
> Only very mildly tested. Noticed it when looking closely at our shaders,
> thinking why it tries to shift 0 by a constant. This is why.

not strictly against this, but a few thoughts:

1) I'm not sure how common in real life it is to access ssbo at
hard-coded offsets.. I've noticed the funny shaders like shifting an
immed zero by constant too, but figured it wasn't too likely to happen
in real life.  Although undoing nir's shl w/ our shr might be useful.

2) if it is common, maybe support in ir3_cp to recognize the handful
of instructions that are added when lowering nir instructions to ir3
would be more beneficial (i.e. ssbo load/store isn't the only one to
add shl/shr/etc.), although the instructions added are a small subset
of possible instructions, so it might be sane to make cp a bit more
clever..

3) or, perhaps an even better idea is nir->nir pass that lowers things
into ir3 specific nir instructions and then run nir's opt passes
again.. that has been kinda on my todo list for a while

BR,
-R
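
For illustration, option 2 above (teaching a later pass to clean up the shifts added during lowering) might look roughly like the sketch below. It is written against a made-up three-opcode instruction list, not against ir3_cp: `enum opcode`, `struct src`, `struct instr` and `fold_const_shifts` are all hypothetical. It only handles the simplest case, a shift whose operands are both immediates. Cancelling a shl against a following shr is deliberately left out, since that is only safe when the bits shifted out are known to be zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature IR, just enough to show the rewrite. */
enum opcode { OP_MOV, OP_SHL, OP_SHR };

struct src {
    bool is_imm;
    uint32_t imm;   /* meaningful when is_imm is true */
    int reg;        /* meaningful when is_imm is false */
};

struct instr {
    enum opcode op;
    int dst;
    struct src a;   /* value being shifted (or the mov source) */
    struct src b;   /* shift amount; unused for OP_MOV */
};

/* Rewrite every shift whose two operands are immediates into a mov
 * of the folded value. Returns the number of rewritten instructions. */
static size_t fold_const_shifts(struct instr *prog, size_t n)
{
    size_t folded = 0;

    for (size_t i = 0; i < n; i++) {
        struct instr *in = &prog[i];

        if (in->op != OP_SHL && in->op != OP_SHR)
            continue;
        if (!in->a.is_imm || !in->b.is_imm)
            continue;

        /* Shifting a 32-bit value by 32 or more is undefined in C. */
        assert(in->b.imm < 32);

        uint32_t v = (in->op == OP_SHL) ? in->a.imm << in->b.imm
                                        : in->a.imm >> in->b.imm;

        in->op = OP_MOV;
        in->a = (struct src){ .is_imm = true, .imm = v };
        in->b = (struct src){ 0 };
        folded++;
    }

    return folded;
}
```

With this, something like `shr dst, 0, 2` becomes `mov dst, 0`, while a shift of a register value is left alone.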

>  src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> index c97df4f1d63..ab326c24aa7 100644
> --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> @@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
> ssbo = create_immed(b, const_offset->u32[0]);
>
> offset = get_src(ctx, &intr->src[1])[0];
> +   const_offset = nir_src_as_const_value(intr->src[1]);
>
> /* src0 is data (or uvec2(data, compare))
>  * src1 is offset
> @@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
>  * Note that nir already multiplies the offset by four
>  */
> src0 = get_src(ctx, &intr->src[2])[0];
> -   src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
> +   if (const_offset)
> +   src1 = create_immed(b, const_offset->u32[0] >> 2);
> +   else
> +   src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
> src2 = create_collect(b, (struct ir3_instruction*[]){
> offset,
> create_immed(b, 0),
> --
> 2.13.6
>


[Freedreno] [PATCH] freedreno/ir3: avoid using shr.b for immediate offset inputs

2017-11-26 Thread Ilia Mirkin
Since this is all happening as a post-optimization fixup, and offsets
are generally immediates, we can just do the calculation directly.

Signed-off-by: Ilia Mirkin 
---

Only very mildly tested. Noticed it when looking closely at our shaders,
thinking why it tries to shift 0 by a constant. This is why.

 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index c97df4f1d63..ab326c24aa7 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1351,6 +1351,7 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
ssbo = create_immed(b, const_offset->u32[0]);
 
offset = get_src(ctx, &intr->src[1])[0];
+   const_offset = nir_src_as_const_value(intr->src[1]);
 
/* src0 is data (or uvec2(data, compare))
 * src1 is offset
@@ -1359,7 +1360,10 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 * Note that nir already multiplies the offset by four
 */
src0 = get_src(ctx, &intr->src[2])[0];
-   src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
+   if (const_offset)
+   src1 = create_immed(b, const_offset->u32[0] >> 2);
+   else
+   src1 = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
src2 = create_collect(b, (struct ir3_instruction*[]){
offset,
create_immed(b, 0),
-- 
2.13.6



Re: [Freedreno] [PATCH 09/15] drm/msm/mdp5: Use drm_mode_get_hv_timing() to populate plane clip rectangle

2017-11-26 Thread Archit Taneja



On 11/24/2017 12:34 AM, Ville Syrjala wrote:

From: Ville Syrjälä 

Use drm_mode_get_hv_timing() to fill out the plane clip rectangle.

Note that this replaces crtc_state->adjusted_mode usage with
crtc_state->mode. The latter is the correct choice since that's the
mode the user provided and it matches the plane crtc coordinates
the user also provided.

Once everyone agrees on this we can move the clip handling into
drm_atomic_helper_check_plane_state().


For this and the msm change in patch # 15/15:

Reviewed-by: Archit Taneja 

Thanks,
Archit



Cc: Laurent Pinchart 
Cc: Rob Clark 
Cc: Archit Taneja 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Signed-off-by: Ville Syrjälä 
---
  drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c | 20 ++++++++++----------
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
index ee41423baeb7..09f758e7bb1b 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
@@ -286,7 +286,7 @@ static int mdp5_plane_atomic_check_with_state(struct drm_crtc_state *crtc_state,
uint32_t max_width, max_height;
bool out_of_bounds = false;
uint32_t caps = 0;
-   struct drm_rect clip;
+   struct drm_rect clip = {};
int min_scale, max_scale;
int ret;
  
@@ -320,13 +320,13 @@ static int mdp5_plane_atomic_check_with_state(struct drm_crtc_state *crtc_state,
return -ERANGE;
}
  
-   clip.x1 = 0;
-   clip.y1 = 0;
-   clip.x2 = crtc_state->adjusted_mode.hdisplay;
-   clip.y2 = crtc_state->adjusted_mode.vdisplay;
min_scale = FRAC_16_16(1, 8);
max_scale = FRAC_16_16(8, 1);
  
+   if (crtc_state->enable)
+   drm_mode_get_hv_timing(&crtc_state->mode,
+  &clip.x2, &clip.y2);
+
ret = drm_atomic_helper_check_plane_state(state, crtc_state, &clip,
  min_scale, max_scale,
  true, true);
@@ -471,7 +471,7 @@ static int mdp5_plane_atomic_async_check(struct drm_plane *plane,
  {
struct mdp5_plane_state *mdp5_state = to_mdp5_plane_state(state);
struct drm_crtc_state *crtc_state;
-   struct drm_rect clip;
+   struct drm_rect clip = {};
int min_scale, max_scale;
int ret;
  
@@ -499,13 +499,13 @@ static int mdp5_plane_atomic_async_check(struct drm_plane *plane,
plane->state->fb != state->fb)
return -EINVAL;
  
-   clip.x1 = 0;
-   clip.y1 = 0;
-   clip.x2 = crtc_state->adjusted_mode.hdisplay;
-   clip.y2 = crtc_state->adjusted_mode.vdisplay;
min_scale = FRAC_16_16(1, 8);
max_scale = FRAC_16_16(8, 1);
  
+   if (crtc_state->enable)
+   drm_mode_get_hv_timing(&crtc_state->mode,
+  &clip.x2, &clip.y2);
+
ret = drm_atomic_helper_check_plane_state(state, crtc_state, &clip,
  min_scale, max_scale,
  true, true);



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
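
The clip handling in the patch can be sketched in isolation as follows. This is a standalone C illustration with hypothetical, trimmed-down types (`struct rect`, `struct mode`, `struct crtc_state`) and a stand-in `get_hv_timing()` that simply copies the two display sizes; the real drm_mode_get_hv_timing() may derive them differently. What it shows is the shape of the change: the rectangle starts zeroed, and only the bottom-right corner is filled in, and only when the CRTC is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct rect { int x1, y1, x2, y2; };
struct mode { int hdisplay, vdisplay; };
struct crtc_state { bool enable; struct mode mode; };

/* Stand-in for drm_mode_get_hv_timing(): report the visible size. */
static void get_hv_timing(const struct mode *m, int *hdisplay, int *vdisplay)
{
    *hdisplay = m->hdisplay;
    *vdisplay = m->vdisplay;
}

/* Build the plane clip rectangle from the user-provided mode.
 * A disabled CRTC leaves the rectangle empty (all zeroes). */
static struct rect plane_clip(const struct crtc_state *cs)
{
    struct rect clip = { 0 };

    assert(cs != NULL);
    if (cs->enable)
        get_hv_timing(&cs->mode, &clip.x2, &clip.y2);

    return clip;
}
```

Zero-initializing the rectangle replaces the explicit `x1 = 0; y1 = 0;` assignments, and it also gives the disabled case a well-defined (empty) clip instead of reading mode fields that may not be meaningful.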