subject:"\[Intel\-gfx\] \[PATCH 2\/2\] drm\/i915\: Simplify some icl pll calculations"

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-10 Thread Ville Syrjälä

On Wed, Apr 10, 2019 at 07:13:48PM +0100, Chris Wilson wrote:
> Quoting Ville Syrjälä (2019-04-10 18:59:52)
> > On Mon, Apr 08, 2019 at 06:05:06PM +0100, Chris Wilson wrote:
> > > Quoting Ville Syrjälä (2019-04-08 17:06:01)
> > > > On Mon, Apr 08, 2019 at 04:49:13PM +0100, Chris Wilson wrote:
> > > > > Quoting Ville Syrjala (2019-04-08 16:27:02)
> > > > > > -   /*
> > > > > > -* Adjust the original formula to delay the division by 
> > > > > > 2^22 in order to
> > > > > > -* minimize possible rounding errors.
> > > > > > -*/
> > > > > > -   tmp = (u64)m1 * m2_int * ref_clock +
> > > > > > - (((u64)m1 * m2_frac * ref_clock) >> 22);
> > > > > > -   tmp = div_u64(tmp, 5 * div1 * div2);
> > > > > > -
> > > > > > -   return tmp;
> > > > > > +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> > > > > > +  (5 * div1 * div2) << 22);
> > > > > 
> > > > > You say the denominator here is a u64, so do you not need to cast
> > > > > (u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?
> > > > 
> > > > It should fit into u32. The maximum value should be
> > > > <= (5*0xf*0x7)<<22 based on the number of bits available
> > > 3b * 4b * 3b = 10b. So just fits.
> > > 
> > > Is it worth asserting those limits? Feels like it is running close, and
> > > will be subject to cargo-culting.
> > 
> > I suppose checking for it might be a good idea.
> > 
> > Just 'WARN_ON(5 * div1 * div2 >= 1 << 10)' maybe, or were you thinking
> > of something fancier?
> 
> How about something like
> struct {
>   unsigned int div1 : 3;
>   unsigned int div2 : 3;
> } d;
> 
> then with a bit of luck smatch will spot an overflow, and people might
> think twice when copying?
> 
> Even weirder,
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-29 (-29)
> Function old new   delta
> intel_ddi_get_config23772348 -29
> 
> I dread to look into the function to see how that changed gcc's mind.

I get 48 bytes with a 32bit build, but only if I let it inline that
function. With noinline there is no change apart from a few registers
getting shuffled around. gcc works in mysterious ways.

-- 
Ville Syrjälä
Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-10 Thread Chris Wilson

Quoting Ville Syrjälä (2019-04-10 18:59:52)
> On Mon, Apr 08, 2019 at 06:05:06PM +0100, Chris Wilson wrote:
> > Quoting Ville Syrjälä (2019-04-08 17:06:01)
> > > On Mon, Apr 08, 2019 at 04:49:13PM +0100, Chris Wilson wrote:
> > > > Quoting Ville Syrjala (2019-04-08 16:27:02)
> > > > > -   /*
> > > > > -* Adjust the original formula to delay the division by 2^22 
> > > > > in order to
> > > > > -* minimize possible rounding errors.
> > > > > -*/
> > > > > -   tmp = (u64)m1 * m2_int * ref_clock +
> > > > > - (((u64)m1 * m2_frac * ref_clock) >> 22);
> > > > > -   tmp = div_u64(tmp, 5 * div1 * div2);
> > > > > -
> > > > > -   return tmp;
> > > > > +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> > > > > +  (5 * div1 * div2) << 22);
> > > > 
> > > > You say the denominator here is a u64, so do you not need to cast
> > > > (u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?
> > > 
> > > It should fit into u32. The maximum value should be
> > > <= (5*0xf*0x7)<<22 based on the number of bits available
> > 3b * 4b * 3b = 10b. So just fits.
> > 
> > Is it worth asserting those limits? Feels like it is running close, and
> > will be subject to cargo-culting.
> 
> I suppose checking for it might be a good idea.
> 
> Just 'WARN_ON(5 * div1 * div2 >= 1 << 10)' maybe, or were you thinking
> of something fancier?

How about something like
struct {
unsigned int div1 : 3;
unsigned int div2 : 3;
} d;

then with a bit of luck smatch will spot an overflow, and people might
think twice when copying?

Even weirder,
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-29 (-29)
Function old new   delta
intel_ddi_get_config23772348 -29

I dread to look into the function to see how that changed gcc's mind.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-10 Thread Ville Syrjälä

On Mon, Apr 08, 2019 at 06:05:06PM +0100, Chris Wilson wrote:
> Quoting Ville Syrjälä (2019-04-08 17:06:01)
> > On Mon, Apr 08, 2019 at 04:49:13PM +0100, Chris Wilson wrote:
> > > Quoting Ville Syrjala (2019-04-08 16:27:02)
> > > > -   /*
> > > > -* Adjust the original formula to delay the division by 2^22 in 
> > > > order to
> > > > -* minimize possible rounding errors.
> > > > -*/
> > > > -   tmp = (u64)m1 * m2_int * ref_clock +
> > > > - (((u64)m1 * m2_frac * ref_clock) >> 22);
> > > > -   tmp = div_u64(tmp, 5 * div1 * div2);
> > > > -
> > > > -   return tmp;
> > > > +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> > > > +  (5 * div1 * div2) << 22);
> > > 
> > > You say the denominator here is a u64, so do you not need to cast
> > > (u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?
> > 
> > It should fit into u32. The maximum value should be
> > <= (5*0xf*0x7)<<22 based on the number of bits available
> 3b * 4b * 3b = 10b. So just fits.
> 
> Is it worth asserting those limits? Feels like it is running close, and
> will be subject to cargo-culting.

I suppose checking for it might be a good idea.

Just 'WARN_ON(5 * div1 * div2 >= 1 << 10)' maybe, or were you thinking
of something fancier?

-- 
Ville Syrjälä
Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-08 Thread Chris Wilson

Quoting Ville Syrjälä (2019-04-08 17:06:01)
> On Mon, Apr 08, 2019 at 04:49:13PM +0100, Chris Wilson wrote:
> > Quoting Ville Syrjala (2019-04-08 16:27:02)
> > > -   /*
> > > -* Adjust the original formula to delay the division by 2^22 in 
> > > order to
> > > -* minimize possible rounding errors.
> > > -*/
> > > -   tmp = (u64)m1 * m2_int * ref_clock +
> > > - (((u64)m1 * m2_frac * ref_clock) >> 22);
> > > -   tmp = div_u64(tmp, 5 * div1 * div2);
> > > -
> > > -   return tmp;
> > > +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> > > +  (5 * div1 * div2) << 22);
> > 
> > You say the denominator here is a u64, so do you not need to cast
> > (u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?
> 
> It should fit into u32. The maximum value should be
> <= (5*0xf*0x7)<<22 based on the number of bits available
3b * 4b * 3b = 10b. So just fits.

Is it worth asserting those limits? Feels like it is running close, and
will be subject to cargo-culting.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-08 Thread Chris Wilson

Quoting Ville Syrjälä (2019-04-08 17:06:01)
> On Mon, Apr 08, 2019 at 04:49:13PM +0100, Chris Wilson wrote:
> > Quoting Ville Syrjala (2019-04-08 16:27:02)
> > > -   /*
> > > -* Adjust the original formula to delay the division by 2^22 in 
> > > order to
> > > -* minimize possible rounding errors.
> > > -*/
> > > -   tmp = (u64)m1 * m2_int * ref_clock +
> > > - (((u64)m1 * m2_frac * ref_clock) >> 22);
> > > -   tmp = div_u64(tmp, 5 * div1 * div2);
> > > -
> > > -   return tmp;
> > > +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> > > +  (5 * div1 * div2) << 22);
> > 
> > You say the denominator here is a u64, so do you not need to cast
> > (u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?
> 
> It should fit into u32. The maximum value should be
> <= (5*0xf*0x7)<<22 based on the number of bits available
> for div1/2.

No worries, I was thinking div_u64 == div_u64_u64 anyway.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-08 Thread Ville Syrjälä

On Mon, Apr 08, 2019 at 04:49:13PM +0100, Chris Wilson wrote:
> Quoting Ville Syrjala (2019-04-08 16:27:02)
> > -   /*
> > -* Adjust the original formula to delay the division by 2^22 in 
> > order to
> > -* minimize possible rounding errors.
> > -*/
> > -   tmp = (u64)m1 * m2_int * ref_clock +
> > - (((u64)m1 * m2_frac * ref_clock) >> 22);
> > -   tmp = div_u64(tmp, 5 * div1 * div2);
> > -
> > -   return tmp;
> > +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> > +  (5 * div1 * div2) << 22);
> 
> You say the denominator here is a u64, so do you not need to cast
> (u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?

It should fit into u32. The maximum value should be
<= (5*0xf*0x7)<<22 based on the number of bits available
for div1/2.

-- 
Ville Syrjälä
Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-08 Thread Chris Wilson

Quoting Ville Syrjala (2019-04-08 16:27:02)
> -   /*
> -* Adjust the original formula to delay the division by 2^22 in order 
> to
> -* minimize possible rounding errors.
> -*/
> -   tmp = (u64)m1 * m2_int * ref_clock +
> - (((u64)m1 * m2_frac * ref_clock) >> 22);
> -   tmp = div_u64(tmp, 5 * div1 * div2);
> -
> -   return tmp;
> +   return div_u64(mul_u32_u32(ref_clock * m1, m2),
> +  (5 * div1 * div2) << 22);

You say the denominator here is a u64, so do you not need to cast
(u64)(5 * d1 * d2) to ensure it doesn't overflow the shift?
Aiui d1, d2 are u32 here.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

2019-04-08 Thread Ville Syrjala

From: Ville Syrjälä 

Write some icl pll calculations in a more straightforward way.
We have just enough bits for the full divisor.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/intel_ddi.c  | 15 ---
 drivers/gpu/drm/i915/intel_dpll_mgr.c |  2 +-
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 720de7a3f5e6..0a24608ce49b 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1386,8 +1386,7 @@ static int icl_calc_tbt_pll_link(struct drm_i915_private 
*dev_priv,
 static int icl_calc_mg_pll_link(struct drm_i915_private *dev_priv,
const struct intel_dpll_hw_state *pll_state)
 {
-   u32 m1, m2_int, m2_frac, div1, div2, ref_clock;
-   u64 tmp;
+   u32 m1, m2, m2_int, m2_frac, div1, div2, ref_clock;
 
ref_clock = dev_priv->cdclk.hw.ref;
 
@@ -1396,6 +1395,7 @@ static int icl_calc_mg_pll_link(struct drm_i915_private 
*dev_priv,
m2_frac = (pll_state->mg_pll_div0 & MG_PLL_DIV0_FRACNEN_H) ?
(pll_state->mg_pll_div0 & MG_PLL_DIV0_FBDIV_FRAC_MASK) >>
MG_PLL_DIV0_FBDIV_FRAC_SHIFT : 0;
+   m2 = (m2_int << 22) | m2_frac;
 
switch (pll_state->mg_clktop2_hsclkctl &
MG_CLKTOP2_HSCLKCTL_HSDIV_RATIO_MASK) {
@@ -1424,15 +1424,8 @@ static int icl_calc_mg_pll_link(struct drm_i915_private 
*dev_priv,
if (div2 == 0)
div2 = 1;
 
-   /*
-* Adjust the original formula to delay the division by 2^22 in order to
-* minimize possible rounding errors.
-*/
-   tmp = (u64)m1 * m2_int * ref_clock +
- (((u64)m1 * m2_frac * ref_clock) >> 22);
-   tmp = div_u64(tmp, 5 * div1 * div2);
-
-   return tmp;
+   return div_u64(mul_u32_u32(ref_clock * m1, m2),
+  (5 * div1 * div2) << 22);
 }
 
 static void ddi_dotclock_get(struct intel_crtc_state *pipe_config)
diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c 
b/drivers/gpu/drm/i915/intel_dpll_mgr.c
index 29edc369920b..3098c783a615 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
@@ -2682,7 +2682,7 @@ static bool icl_calc_mg_pll_state(struct intel_crtc_state 
*crtc_state)
}
m2div_rem = dco_khz % (refclk_khz * m1div);
 
-   tmp = (u64)m2div_rem * (1 << 22);
+   tmp = (u64)m2div_rem << 22;
do_div(tmp, refclk_khz * m1div);
m2div_frac = tmp;
 
-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

[Intel-gfx] [PATCH 2/2] drm/i915: Simplify some icl pll calculations

8 matches

Site Navigation

Mail list logo

Footer information