Re: [Intel-gfx] [PATCH v2 2/4] drm/i915: Introduce guard pages to i915_vma

2022-11-24 Thread Tvrtko Ursulin



On 23/11/2022 18:54, Andi Shyti wrote:

Hi Tvrtko,

[...]


@@ -768,6 +768,9 @@ i915_vma_insert(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
GEM_BUG_ON(!IS_ALIGNED(alignment, I915_GTT_MIN_ALIGNMENT));
GEM_BUG_ON(!is_power_of_2(alignment));
+   guard = vma->guard; /* retain guard across rebinds */
+   guard = ALIGN(guard, alignment);


Why does guard area needs the same alignment as the requested mapping? What 
about the fact on 32-bit builds guard is 32-bit and alignment u64?


I guess this just to round up/down guard to something, not
necessarily to that alignment.

Shall I remove it?


Don't know, initially I thought it maybe needs a comment on what's it 
doing and why. If it is about aligning to "something" then should it be 
I915_GTT_MIN_ALIGNMENT?



@@ -777,6 +780,7 @@ i915_vma_insert(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
if (flags & PIN_ZONE_4G)
end = min_t(u64, end, (1ULL << 32) - I915_GTT_PAGE_SIZE);
GEM_BUG_ON(!IS_ALIGNED(end, I915_GTT_PAGE_SIZE));
+   GEM_BUG_ON(2 * guard > end);


End is the size of relevant VA area at this point so what and why is this 
checking?


I think because we want to make sure the padding is at least not
bigger that the size. What is actually wrong with this.


Same as above - if there is subtle special meaning please add a comment. 
Otherwise, for the whole object and not just the guards, it is covered by:


+   if (size > end - 2 * guard) {

I don't follow what is the point on only checking the guards.



[...]


@@ -855,6 +869,7 @@ i915_vma_insert(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, color));
list_move_tail(&vma->vm_link, &vma->vm->bound_list);
+   vma->guard = guard;


unsigned long into u32 - what guarantees no truncation?


we are missing here this part above:

guard = vma->guard; /* retain guard across rebinds */
if (flags & PIN_OFFSET_GUARD) {
GEM_BUG_ON(overflows_type(flags & PIN_OFFSET_MASK, u32));
guard = max_t(u32, guard, flags & PIN_OFFSET_MASK);
}

that should make sure that we fit into 32 bits.


Hm okay. I guess the u64 alignment and that "guard = ALIGN(guard, 
alignment);" is what is bothering me to begin with. In other words with 
that there is a chance to overflow vma->guard with a small guard and 
large alignment.




[...]


@@ -197,14 +197,15 @@ struct i915_vma {
struct i915_fence_reg *fence;
u64 size;
-   u64 display_alignment;
struct i915_page_sizes page_sizes;
/* mmap-offset associated with fencing for this vma */
struct i915_mmap_offset *mmo;
+   u32 guard; /* padding allocated around vma->pages within the node */
u32 fence_size;
u32 fence_alignment;
+   u32 display_alignment;


u64 -> u32 for display_alignment looks unrelated change.

./display/intel_fb_pin.c:   vma->display_alignment = max_t(u64, 
vma->display_alignment, alignment);
./gem/i915_gem_domain.c:vma->display_alignment = max_t(u64, 
vma->display_alignment, alignment);

These two sites need to be changed not to use u64.

Do this part in a separate patch?


Right! will remove it.


Okay, to be clear, refactoring of vma->display_alignemnt to be u32 as a 
separate patch in the series. Thanks!


Regards,

Tvrtko




/**
 * Count of the number of times this vma has been opened by different


Regards,


Thanks,
Andi


Tvrtko


Re: [Intel-gfx] [PATCH v2 2/4] drm/i915: Introduce guard pages to i915_vma

2022-11-24 Thread Andi Shyti
> > > @@ -768,6 +768,9 @@ i915_vma_insert(struct i915_vma *vma, struct 
> > > i915_gem_ww_ctx *ww,
> > >   GEM_BUG_ON(!IS_ALIGNED(alignment, I915_GTT_MIN_ALIGNMENT));
> > >   GEM_BUG_ON(!is_power_of_2(alignment));
> > > + guard = vma->guard; /* retain guard across rebinds */
> > > + guard = ALIGN(guard, alignment);
> > 
> > Why does guard area needs the same alignment as the requested mapping? What 
> > about the fact on 32-bit builds guard is 32-bit and alignment u64?
> 
> I guess this just to round up/down guard to something, not
> necessarily to that alignment.
> 
> Shall I remove it?

or we could just add a comment to explain that this is just to do
some rounding in order to avoid weird values of guard.

Andi


Re: [Intel-gfx] [PATCH v2 2/4] drm/i915: Introduce guard pages to i915_vma

2022-11-23 Thread Andi Shyti
Hi Tvrtko,

[...]

> > @@ -768,6 +768,9 @@ i915_vma_insert(struct i915_vma *vma, struct 
> > i915_gem_ww_ctx *ww,
> > GEM_BUG_ON(!IS_ALIGNED(alignment, I915_GTT_MIN_ALIGNMENT));
> > GEM_BUG_ON(!is_power_of_2(alignment));
> > +   guard = vma->guard; /* retain guard across rebinds */
> > +   guard = ALIGN(guard, alignment);
> 
> Why does guard area needs the same alignment as the requested mapping? What 
> about the fact on 32-bit builds guard is 32-bit and alignment u64?

I guess this just to round up/down guard to something, not
necessarily to that alignment.

Shall I remove it?

[...]

> > @@ -777,6 +780,7 @@ i915_vma_insert(struct i915_vma *vma, struct 
> > i915_gem_ww_ctx *ww,
> > if (flags & PIN_ZONE_4G)
> > end = min_t(u64, end, (1ULL << 32) - I915_GTT_PAGE_SIZE);
> > GEM_BUG_ON(!IS_ALIGNED(end, I915_GTT_PAGE_SIZE));
> > +   GEM_BUG_ON(2 * guard > end);
> 
> End is the size of relevant VA area at this point so what and why is this 
> checking?

I think because we want to make sure the padding is at least not
bigger that the size. What is actually wrong with this.

[...]

> > @@ -855,6 +869,7 @@ i915_vma_insert(struct i915_vma *vma, struct 
> > i915_gem_ww_ctx *ww,
> > GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, color));
> > list_move_tail(&vma->vm_link, &vma->vm->bound_list);
> > +   vma->guard = guard;
> 
> unsigned long into u32 - what guarantees no truncation?

we are missing here this part above:

guard = vma->guard; /* retain guard across rebinds */
if (flags & PIN_OFFSET_GUARD) {
GEM_BUG_ON(overflows_type(flags & PIN_OFFSET_MASK, u32));
guard = max_t(u32, guard, flags & PIN_OFFSET_MASK);
}

that should make sure that we fit into 32 bits.

[...]

> > @@ -197,14 +197,15 @@ struct i915_vma {
> > struct i915_fence_reg *fence;
> > u64 size;
> > -   u64 display_alignment;
> > struct i915_page_sizes page_sizes;
> > /* mmap-offset associated with fencing for this vma */
> > struct i915_mmap_offset *mmo;
> > +   u32 guard; /* padding allocated around vma->pages within the node */
> > u32 fence_size;
> > u32 fence_alignment;
> > +   u32 display_alignment;
> 
> u64 -> u32 for display_alignment looks unrelated change.
> 
> ./display/intel_fb_pin.c:   vma->display_alignment = max_t(u64, 
> vma->display_alignment, alignment);
> ./gem/i915_gem_domain.c:vma->display_alignment = max_t(u64, 
> vma->display_alignment, alignment);
> 
> These two sites need to be changed not to use u64.
> 
> Do this part in a separate patch?

Right! will remove it.

> > /**
> >  * Count of the number of times this vma has been opened by different
> 
> Regards,

Thanks,
Andi

> Tvrtko


Re: [Intel-gfx] [PATCH v2 2/4] drm/i915: Introduce guard pages to i915_vma

2022-11-23 Thread Tvrtko Ursulin



On 22/11/2022 18:57, Andi Shyti wrote:

From: Chris Wilson 

Introduce the concept of padding the i915_vma with guard pages before
and after. The major consequence is that all ordinary uses of i915_vma
must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size
directly, as the drm_mm_node will include the guard pages that surround
our object.

The biggest connundrum is how exactly to mix requesting a fixed address
with guard pages, particularly through the existing uABI. The user does
not know about guard pages, so such must be transparent to the user, and
so the execobj.offset must be that of the object itself excluding the
guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages.
The caveat is that some placements will be impossible with guard pages,
as wrap arounds need to be avoided, and the vma itself will require a
larger node. We must not report EINVAL but ENOSPC as these are unavailable
locations within the GTT rather than conflicting user requirements.

In the next patch, we start using guard pages for scanout objects. While
these are limited to GGTT vma, on a few platforms these vma (or at least
an alias of the vma) is shared with userspace, so we may leak the
existence of such guards if we are not careful to ensure that the
execobj.offset is transparent and excludes the guards. (On such platforms
like ivb, without full-ppgtt, userspace has to use relocations so the
presence of more untouchable regions within its GTT such be of no further
issue.)

Signed-off-by: Chris Wilson 
Signed-off-by: Tejas Upadhyay 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Andi Shyti 
---
  drivers/gpu/drm/i915/gt/intel_ggtt.c | 14 
  drivers/gpu/drm/i915/i915_gem_gtt.h  |  3 ++-
  drivers/gpu/drm/i915/i915_vma.c  | 27 ++--
  drivers/gpu/drm/i915/i915_vma.h  |  5 +++--
  drivers/gpu/drm/i915/i915_vma_resource.c |  4 ++--
  drivers/gpu/drm/i915/i915_vma_resource.h |  7 +-
  drivers/gpu/drm/i915/i915_vma_types.h|  3 ++-
  7 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 8145851ad23d5..133710258eae6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -287,8 +287,11 @@ static void gen8_ggtt_insert_entries(struct 
i915_address_space *vm,
 */
  
  	gte = (gen8_pte_t __iomem *)ggtt->gsm;

-   gte += vma_res->start / I915_GTT_PAGE_SIZE;
-   end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
+   gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE;
+   end = gte + vma_res->guard / I915_GTT_PAGE_SIZE;
+   while (gte < end)
+   gen8_set_pte(gte++, vm->scratch[0]->encode);
+   end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
  
  	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)

gen8_set_pte(gte++, pte_encode | addr);
@@ -338,9 +341,12 @@ static void gen6_ggtt_insert_entries(struct 
i915_address_space *vm,
dma_addr_t addr;
  
  	gte = (gen6_pte_t __iomem *)ggtt->gsm;

-   gte += vma_res->start / I915_GTT_PAGE_SIZE;
-   end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
+   gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE;
  
+	end = gte + vma_res->guard / I915_GTT_PAGE_SIZE;

+   while (gte < end)
+   iowrite32(vm->scratch[0]->encode, gte++);
+   end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
iowrite32(vm->pte_encode(addr, level, flags), gte++);
GEM_BUG_ON(gte > end);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5ddaa..2434197830523 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -44,7 +44,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
  #define PIN_HIGH  BIT_ULL(5)
  #define PIN_OFFSET_BIAS   BIT_ULL(6)
  #define PIN_OFFSET_FIXED  BIT_ULL(7)
-#define PIN_VALIDATE   BIT_ULL(8) /* validate placement only, no need 
to call unpin() */
+#define PIN_OFFSET_GUARD   BIT_ULL(8)
+#define PIN_VALIDATE   BIT_ULL(9) /* validate placement only, no need 
to call unpin() */
  
  #define PIN_GLOBAL		BIT_ULL(10) /* I915_VMA_GLOBAL_BIND */

  #define PIN_USER  BIT_ULL(11) /* I915_VMA_LOCAL_BIND */
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 2232118babeb3..457e35e03895f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -419,7 +419,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource 
*vma_res,
   obj->mm.rsgt, i915_gem_object_is_readonly(obj),
   i915_gem_object_is_lmem(obj), obj->mm.region,
   vma->ops, vma->private, __i915_vma_offs

[Intel-gfx] [PATCH v2 2/4] drm/i915: Introduce guard pages to i915_vma

2022-11-22 Thread Andi Shyti
From: Chris Wilson 

Introduce the concept of padding the i915_vma with guard pages before
and after. The major consequence is that all ordinary uses of i915_vma
must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size
directly, as the drm_mm_node will include the guard pages that surround
our object.

The biggest connundrum is how exactly to mix requesting a fixed address
with guard pages, particularly through the existing uABI. The user does
not know about guard pages, so such must be transparent to the user, and
so the execobj.offset must be that of the object itself excluding the
guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages.
The caveat is that some placements will be impossible with guard pages,
as wrap arounds need to be avoided, and the vma itself will require a
larger node. We must not report EINVAL but ENOSPC as these are unavailable
locations within the GTT rather than conflicting user requirements.

In the next patch, we start using guard pages for scanout objects. While
these are limited to GGTT vma, on a few platforms these vma (or at least
an alias of the vma) is shared with userspace, so we may leak the
existence of such guards if we are not careful to ensure that the
execobj.offset is transparent and excludes the guards. (On such platforms
like ivb, without full-ppgtt, userspace has to use relocations so the
presence of more untouchable regions within its GTT such be of no further
issue.)

Signed-off-by: Chris Wilson 
Signed-off-by: Tejas Upadhyay 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 14 
 drivers/gpu/drm/i915/i915_gem_gtt.h  |  3 ++-
 drivers/gpu/drm/i915/i915_vma.c  | 27 ++--
 drivers/gpu/drm/i915/i915_vma.h  |  5 +++--
 drivers/gpu/drm/i915/i915_vma_resource.c |  4 ++--
 drivers/gpu/drm/i915/i915_vma_resource.h |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h|  3 ++-
 7 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 8145851ad23d5..133710258eae6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -287,8 +287,11 @@ static void gen8_ggtt_insert_entries(struct 
i915_address_space *vm,
 */
 
gte = (gen8_pte_t __iomem *)ggtt->gsm;
-   gte += vma_res->start / I915_GTT_PAGE_SIZE;
-   end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
+   gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE;
+   end = gte + vma_res->guard / I915_GTT_PAGE_SIZE;
+   while (gte < end)
+   gen8_set_pte(gte++, vm->scratch[0]->encode);
+   end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
 
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
gen8_set_pte(gte++, pte_encode | addr);
@@ -338,9 +341,12 @@ static void gen6_ggtt_insert_entries(struct 
i915_address_space *vm,
dma_addr_t addr;
 
gte = (gen6_pte_t __iomem *)ggtt->gsm;
-   gte += vma_res->start / I915_GTT_PAGE_SIZE;
-   end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
+   gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE;
 
+   end = gte + vma_res->guard / I915_GTT_PAGE_SIZE;
+   while (gte < end)
+   iowrite32(vm->scratch[0]->encode, gte++);
+   end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
iowrite32(vm->pte_encode(addr, level, flags), gte++);
GEM_BUG_ON(gte > end);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5ddaa..2434197830523 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -44,7 +44,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 #define PIN_HIGH   BIT_ULL(5)
 #define PIN_OFFSET_BIASBIT_ULL(6)
 #define PIN_OFFSET_FIXED   BIT_ULL(7)
-#define PIN_VALIDATE   BIT_ULL(8) /* validate placement only, no need 
to call unpin() */
+#define PIN_OFFSET_GUARD   BIT_ULL(8)
+#define PIN_VALIDATE   BIT_ULL(9) /* validate placement only, no need 
to call unpin() */
 
 #define PIN_GLOBAL BIT_ULL(10) /* I915_VMA_GLOBAL_BIND */
 #define PIN_USER   BIT_ULL(11) /* I915_VMA_LOCAL_BIND */
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 2232118babeb3..457e35e03895f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -419,7 +419,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource 
*vma_res,
   obj->mm.rsgt, i915_gem_object_is_readonly(obj),
   i915_gem_object_is_lmem(obj), obj->mm.region,
   vma->ops, vma->private, __i915_vma_offset(vma),
-