Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

2021-08-23 Thread Steven Price
On 23 August 2021 22:09:08 BST, Alyssa Rosenzweig  wrote:
>> > In lock_region, simplify the calculation of the region_width parameter.
>> > This field is the size, but encoded as log2(ceil(size)) - 1.
>> > log2(ceil(size)) may be computed directly as fls(size - 1). However, we
>> > want to use the 64-bit versions as the amount to lock can exceed
>> > 32-bits.
>> > 
>> > This avoids undefined behaviour when locking all memory (size ~0),
>> > caught by UBSAN.
>> 
>> It might have been useful to mention what it is that UBSAN specifically
>> picked up (it took me a while to spot) - but anyway I think there's a
>> bigger issue with it being completely wrong when size == ~0 (see below).
>
>Indeed. I've updated the commit message in v2 to explain what goes
>wrong (your analysis was spot on, but a mailing list message is more
>ephermal than a commit message). I'll send out v2 tomorrow assuming
>nobody objects to v1 in the mean time.
>
>Thanks for the review.
>
>> There is potentially a third bug which kbase only recently attempted to
>> fix. The lock address is effectively rounded down by the hardware (the
>> bottom bits are ignored). So if you have mask=(1<> (iova & mask) != ((iova + size) & mask) then you are potentially failing
>> to lock the end of the intended region. kbase has added some code to
>> handle this:
>> 
>> >/* Round up if some memory pages spill into the next region. */
>> >region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
>> >region_frame_number_end =
>> >(pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
>> > 
>> >if (region_frame_number_start < region_frame_number_end)
>> >lockaddr_size_log2 += 1;
>> 
>> I guess we should too?
>
>Oh, I missed this one. Guess we have 4 bugs with this code instead of
>just 3, yikes. How could such a short function be so deeply and horribly
>broken? 
>
>Should I add a fourth patch to the series to fix this?

Yes please!

Thanks,
Steve


Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

2021-08-23 Thread Alyssa Rosenzweig
> > In lock_region, simplify the calculation of the region_width parameter.
> > This field is the size, but encoded as log2(ceil(size)) - 1.
> > log2(ceil(size)) may be computed directly as fls(size - 1). However, we
> > want to use the 64-bit versions as the amount to lock can exceed
> > 32-bits.
> > 
> > This avoids undefined behaviour when locking all memory (size ~0),
> > caught by UBSAN.
> 
> It might have been useful to mention what it is that UBSAN specifically
> picked up (it took me a while to spot) - but anyway I think there's a
> bigger issue with it being completely wrong when size == ~0 (see below).

Indeed. I've updated the commit message in v2 to explain what goes
wrong (your analysis was spot on, but a mailing list message is more
ephermal than a commit message). I'll send out v2 tomorrow assuming
nobody objects to v1 in the mean time.

Thanks for the review.

> There is potentially a third bug which kbase only recently attempted to
> fix. The lock address is effectively rounded down by the hardware (the
> bottom bits are ignored). So if you have mask=(1< (iova & mask) != ((iova + size) & mask) then you are potentially failing
> to lock the end of the intended region. kbase has added some code to
> handle this:
> 
> > /* Round up if some memory pages spill into the next region. */
> > region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
> > region_frame_number_end =
> > (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
> > 
> > if (region_frame_number_start < region_frame_number_end)
> > lockaddr_size_log2 += 1;
> 
> I guess we should too?

Oh, I missed this one. Guess we have 4 bugs with this code instead of
just 3, yikes. How could such a short function be so deeply and horribly
broken? 

Should I add a fourth patch to the series to fix this?


Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

2021-08-23 Thread Steven Price
On 20/08/2021 22:31, Alyssa Rosenzweig wrote:
> In lock_region, simplify the calculation of the region_width parameter.
> This field is the size, but encoded as log2(ceil(size)) - 1.
> log2(ceil(size)) may be computed directly as fls(size - 1). However, we
> want to use the 64-bit versions as the amount to lock can exceed
> 32-bits.
> 
> This avoids undefined behaviour when locking all memory (size ~0),
> caught by UBSAN.

It might have been useful to mention what it is that UBSAN specifically
picked up (it took me a while to spot) - but anyway I think there's a
bigger issue with it being completely wrong when size == ~0 (see below).

> Signed-off-by: Alyssa Rosenzweig 
> Reported-and-tested-by: Chris Morgan 
> Cc: 

However, I've confirmed this returns the same values and is certainly
more simple, so:

Reviewed-by: Steven Price 

> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +--
>  1 file changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
> b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 0da5b3100ab1..f6e02d0392f4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, 
> u32 as_nr,
>  {
>   u8 region_width;
>   u64 region = iova & PAGE_MASK;
> - /*
> -  * fls returns:
> -  * 1 .. 32
> -  *
> -  * 10 + fls(num_pages)
> -  * results in the range (11 .. 42)
> -  */
> -
> - size = round_up(size, PAGE_SIZE);

This seems to be the first issue - ~0 will be 'rounded up' to 0.

>  
> - region_width = 10 + fls(size >> PAGE_SHIFT);

fls(0) == 0, so region_width == 10.

> - if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {

Presumably here's where UBSAN objects - we're shifting by a negative
value, which even it it happens to works means the lock region is tiny
and certainly not what was intended! It might well be worth a:

Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")

Note for anyone following along at (working-from-) home: although this
code was cargo culted from kbase - kbase is fine because it takes a pfn
and doesn't do the round_up() stage.

Which also exposes the second bug (fixed in patch 2): a size_t isn't big
enough on 32 bit platforms (all Midgard/Bifrost GPUs have a VA size
bigger than 32 bits). Again kbase gets away with a u32 because it's a pfn.

There is potentially a third bug which kbase only recently attempted to
fix. The lock address is effectively rounded down by the hardware (the
bottom bits are ignored). So if you have mask=(1<   /* Round up if some memory pages spill into the next region. */
>   region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
>   region_frame_number_end =
>   (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
> 
>   if (region_frame_number_start < region_frame_number_end)
>   lockaddr_size_log2 += 1;

I guess we should too?

Steve

> - /* not pow2, so must go up to the next pow2 */
> - region_width += 1;
> - }
> + /* The size is encoded as ceil(log2) minus(1), which may be calculated
> +  * with fls. The size must be clamped to hardware bounds.
> +  */
> + size = max_t(u64, size, PAGE_SIZE);
> + region_width = fls64(size - 1) - 1;
>   region |= region_width;
>  
>   /* Lock the region that needs to be updated */
> 



[PATCH 1/3] drm/panfrost: Simplify lock_region calculation

2021-08-20 Thread Alyssa Rosenzweig
In lock_region, simplify the calculation of the region_width parameter.
This field is the size, but encoded as log2(ceil(size)) - 1.
log2(ceil(size)) may be computed directly as fls(size - 1). However, we
want to use the 64-bit versions as the amount to lock can exceed
32-bits.

This avoids undefined behaviour when locking all memory (size ~0),
caught by UBSAN.

Signed-off-by: Alyssa Rosenzweig 
Reported-and-tested-by: Chris Morgan 
Cc: 
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 0da5b3100ab1..f6e02d0392f4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 
as_nr,
 {
u8 region_width;
u64 region = iova & PAGE_MASK;
-   /*
-* fls returns:
-* 1 .. 32
-*
-* 10 + fls(num_pages)
-* results in the range (11 .. 42)
-*/
-
-   size = round_up(size, PAGE_SIZE);
 
-   region_width = 10 + fls(size >> PAGE_SHIFT);
-   if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {
-   /* not pow2, so must go up to the next pow2 */
-   region_width += 1;
-   }
+   /* The size is encoded as ceil(log2) minus(1), which may be calculated
+* with fls. The size must be clamped to hardware bounds.
+*/
+   size = max_t(u64, size, PAGE_SIZE);
+   region_width = fls64(size - 1) - 1;
region |= region_width;
 
/* Lock the region that needs to be updated */
-- 
2.30.2