Re: [Intel-gfx] [PATCH v6 3/5] drm/i915: Introduce guard pages to i915_vma
On 02/12/2022 11:11, Andi Shyti wrote: Hi Tvrtko, On Fri, Dec 02, 2022 at 10:20:11AM +, Tvrtko Ursulin wrote: On 01/12/2022 20:39, Andi Shyti wrote: From: Chris Wilson Introduce the concept of padding the i915_vma with guard pages before and after. The major consequence is that all ordinary uses of i915_vma must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size directly, as the drm_mm_node will include the guard pages that surround our object. The biggest connundrum is how exactly to mix requesting a fixed address with guard pages, particularly through the existing uABI. The user does not know about guard pages, so such must be transparent to the user, and so the execobj.offset must be that of the object itself excluding the guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages. The caveat is that some placements will be impossible with guard pages, as wrap arounds need to be avoided, and the vma itself will require a larger node. We must not report EINVAL but ENOSPC as these are unavailable locations within the GTT rather than conflicting user requirements. In the next patch, we start using guard pages for scanout objects. While these are limited to GGTT vma, on a few platforms these vma (or at least an alias of the vma) is shared with userspace, so we may leak the existence of such guards if we are not careful to ensure that the execobj.offset is transparent and excludes the guards. (On such platforms like ivb, without full-ppgtt, userspace has to use relocations so the presence of more untouchable regions within its GTT such be of no further issue.) Signed-off-by: Chris Wilson Signed-off-by: Tejas Upadhyay Signed-off-by: Tvrtko Ursulin Signed-off-by: Andi Shyti --- Hi Tvrtko, I removed your r-b in this version because I restored the original value of the guard being aligned with the vma size alignment. Turns out that CI failed with the latest version because the guard was becoming too big (we would have hit the GEM_BUG_ON)[*]. The reason why now the guard is aligned with the vma alignment is that the area is already aligned and if we use as a starting address start + guard, guard needs to be aligned, otherwise we screw up all the memory alignment. Let me know if it makes sense to you. Reviewed-by: Tvrtko Ursulin Conditional to promise of a prioritised follow up improvement, if it turns out GGTT wastage due a bit over zealous guard size comes to bite. Sure! I'll be alert! There are some unrelated failures from CI, just to be sure I sent last night a trybot run. Trybot looked okay, and I just pressed re-test for the intel-gfx series so lets see that too. Regards, Tvrtko
Re: [Intel-gfx] [PATCH v6 3/5] drm/i915: Introduce guard pages to i915_vma
Hi Tvrtko, On Fri, Dec 02, 2022 at 10:20:11AM +, Tvrtko Ursulin wrote: > > On 01/12/2022 20:39, Andi Shyti wrote: > > From: Chris Wilson > > > > Introduce the concept of padding the i915_vma with guard pages before > > and after. The major consequence is that all ordinary uses of i915_vma > > must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size > > directly, as the drm_mm_node will include the guard pages that surround > > our object. > > > > The biggest connundrum is how exactly to mix requesting a fixed address > > with guard pages, particularly through the existing uABI. The user does > > not know about guard pages, so such must be transparent to the user, and > > so the execobj.offset must be that of the object itself excluding the > > guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages. > > The caveat is that some placements will be impossible with guard pages, > > as wrap arounds need to be avoided, and the vma itself will require a > > larger node. We must not report EINVAL but ENOSPC as these are unavailable > > locations within the GTT rather than conflicting user requirements. > > > > In the next patch, we start using guard pages for scanout objects. While > > these are limited to GGTT vma, on a few platforms these vma (or at least > > an alias of the vma) is shared with userspace, so we may leak the > > existence of such guards if we are not careful to ensure that the > > execobj.offset is transparent and excludes the guards. (On such platforms > > like ivb, without full-ppgtt, userspace has to use relocations so the > > presence of more untouchable regions within its GTT such be of no further > > issue.) > > > > Signed-off-by: Chris Wilson > > Signed-off-by: Tejas Upadhyay > > Signed-off-by: Tvrtko Ursulin > > Signed-off-by: Andi Shyti > > --- > > Hi Tvrtko, > > > > I removed your r-b in this version because I restored the original value > > of the guard being aligned with the vma size alignment. Turns out that > > CI failed with the latest version because the guard was becoming too big > > (we would have hit the GEM_BUG_ON)[*]. > > > > The reason why now the guard is aligned with the vma alignment is that > > the area is already aligned and if we use as a starting address start + > > guard, guard needs to be aligned, otherwise we screw up all the memory > > alignment. > > > > Let me know if it makes sense to you. > > Reviewed-by: Tvrtko Ursulin > > Conditional to promise of a prioritised follow up improvement, if it turns > out GGTT wastage due a bit over zealous guard size comes to bite. Sure! I'll be alert! There are some unrelated failures from CI, just to be sure I sent last night a trybot run. Thanks, Tvrtko! Andi
Re: [Intel-gfx] [PATCH v6 3/5] drm/i915: Introduce guard pages to i915_vma
On 01/12/2022 20:39, Andi Shyti wrote: From: Chris Wilson Introduce the concept of padding the i915_vma with guard pages before and after. The major consequence is that all ordinary uses of i915_vma must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size directly, as the drm_mm_node will include the guard pages that surround our object. The biggest connundrum is how exactly to mix requesting a fixed address with guard pages, particularly through the existing uABI. The user does not know about guard pages, so such must be transparent to the user, and so the execobj.offset must be that of the object itself excluding the guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages. The caveat is that some placements will be impossible with guard pages, as wrap arounds need to be avoided, and the vma itself will require a larger node. We must not report EINVAL but ENOSPC as these are unavailable locations within the GTT rather than conflicting user requirements. In the next patch, we start using guard pages for scanout objects. While these are limited to GGTT vma, on a few platforms these vma (or at least an alias of the vma) is shared with userspace, so we may leak the existence of such guards if we are not careful to ensure that the execobj.offset is transparent and excludes the guards. (On such platforms like ivb, without full-ppgtt, userspace has to use relocations so the presence of more untouchable regions within its GTT such be of no further issue.) Signed-off-by: Chris Wilson Signed-off-by: Tejas Upadhyay Signed-off-by: Tvrtko Ursulin Signed-off-by: Andi Shyti --- Hi Tvrtko, I removed your r-b in this version because I restored the original value of the guard being aligned with the vma size alignment. Turns out that CI failed with the latest version because the guard was becoming too big (we would have hit the GEM_BUG_ON)[*]. The reason why now the guard is aligned with the vma alignment is that the area is already aligned and if we use as a starting address start + guard, guard needs to be aligned, otherwise we screw up all the memory alignment. Let me know if it makes sense to you. Reviewed-by: Tvrtko Ursulin Conditional to promise of a prioritised follow up improvement, if it turns out GGTT wastage due a bit over zealous guard size comes to bite. Regards, Tvrtko Thanks, Andi Changelog = v5 -> v6: - restore the original alignment of guard so that it's aligned coherently with the vma area's alignment. v4 -> v5: - remove again the GEM_BUG_ON() - fix an oversight where the rounding was called without assigning the value to the guard. v1 -> v4: - refer to the cover letter. [*] https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110720v5/fi-rkl-11600/igt@i915_module_l...@load.html#dmesg-warnings300 drivers/gpu/drm/i915/gt/intel_ggtt.c | 14 +--- drivers/gpu/drm/i915/i915_gem_gtt.h | 3 +- drivers/gpu/drm/i915/i915_vma.c | 43 drivers/gpu/drm/i915/i915_vma.h | 5 +-- drivers/gpu/drm/i915/i915_vma_resource.c | 4 +-- drivers/gpu/drm/i915/i915_vma_resource.h | 7 +++- drivers/gpu/drm/i915/i915_vma_types.h| 1 + 7 files changed, 60 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 7644738b9cdbe..784d4a8c43ba9 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -296,8 +296,11 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm, */ gte = (gen8_pte_t __iomem *)ggtt->gsm; - gte += vma_res->start / I915_GTT_PAGE_SIZE; - end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE; + gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE; + end = gte + vma_res->guard / I915_GTT_PAGE_SIZE; + while (gte < end) + gen8_set_pte(gte++, vm->scratch[0]->encode); + end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE; for_each_sgt_daddr(addr, iter, vma_res->bi.pages) gen8_set_pte(gte++, pte_encode | addr); @@ -347,9 +350,12 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm, dma_addr_t addr; gte = (gen6_pte_t __iomem *)ggtt->gsm; - gte += vma_res->start / I915_GTT_PAGE_SIZE; - end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE; + gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE; + end = gte + vma_res->guard / I915_GTT_PAGE_SIZE; + while (gte < end) + iowrite32(vm->scratch[0]->encode, gte++); + end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE; for_each_sgt_daddr(addr, iter, vma_res->bi.pages) iowrite32(vm->pte_encode(addr, level, flags), gte++); GEM_BUG_ON(gte > end); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 8c2f57eb5ddaa..2434197830523
[PATCH v6 3/5] drm/i915: Introduce guard pages to i915_vma
From: Chris Wilson Introduce the concept of padding the i915_vma with guard pages before and after. The major consequence is that all ordinary uses of i915_vma must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size directly, as the drm_mm_node will include the guard pages that surround our object. The biggest connundrum is how exactly to mix requesting a fixed address with guard pages, particularly through the existing uABI. The user does not know about guard pages, so such must be transparent to the user, and so the execobj.offset must be that of the object itself excluding the guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages. The caveat is that some placements will be impossible with guard pages, as wrap arounds need to be avoided, and the vma itself will require a larger node. We must not report EINVAL but ENOSPC as these are unavailable locations within the GTT rather than conflicting user requirements. In the next patch, we start using guard pages for scanout objects. While these are limited to GGTT vma, on a few platforms these vma (or at least an alias of the vma) is shared with userspace, so we may leak the existence of such guards if we are not careful to ensure that the execobj.offset is transparent and excludes the guards. (On such platforms like ivb, without full-ppgtt, userspace has to use relocations so the presence of more untouchable regions within its GTT such be of no further issue.) Signed-off-by: Chris Wilson Signed-off-by: Tejas Upadhyay Signed-off-by: Tvrtko Ursulin Signed-off-by: Andi Shyti --- Hi Tvrtko, I removed your r-b in this version because I restored the original value of the guard being aligned with the vma size alignment. Turns out that CI failed with the latest version because the guard was becoming too big (we would have hit the GEM_BUG_ON)[*]. The reason why now the guard is aligned with the vma alignment is that the area is already aligned and if we use as a starting address start + guard, guard needs to be aligned, otherwise we screw up all the memory alignment. Let me know if it makes sense to you. Thanks, Andi Changelog = v5 -> v6: - restore the original alignment of guard so that it's aligned coherently with the vma area's alignment. v4 -> v5: - remove again the GEM_BUG_ON() - fix an oversight where the rounding was called without assigning the value to the guard. v1 -> v4: - refer to the cover letter. [*] https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110720v5/fi-rkl-11600/igt@i915_module_l...@load.html#dmesg-warnings300 drivers/gpu/drm/i915/gt/intel_ggtt.c | 14 +--- drivers/gpu/drm/i915/i915_gem_gtt.h | 3 +- drivers/gpu/drm/i915/i915_vma.c | 43 drivers/gpu/drm/i915/i915_vma.h | 5 +-- drivers/gpu/drm/i915/i915_vma_resource.c | 4 +-- drivers/gpu/drm/i915/i915_vma_resource.h | 7 +++- drivers/gpu/drm/i915/i915_vma_types.h| 1 + 7 files changed, 60 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 7644738b9cdbe..784d4a8c43ba9 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -296,8 +296,11 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm, */ gte = (gen8_pte_t __iomem *)ggtt->gsm; - gte += vma_res->start / I915_GTT_PAGE_SIZE; - end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE; + gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE; + end = gte + vma_res->guard / I915_GTT_PAGE_SIZE; + while (gte < end) + gen8_set_pte(gte++, vm->scratch[0]->encode); + end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE; for_each_sgt_daddr(addr, iter, vma_res->bi.pages) gen8_set_pte(gte++, pte_encode | addr); @@ -347,9 +350,12 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm, dma_addr_t addr; gte = (gen6_pte_t __iomem *)ggtt->gsm; - gte += vma_res->start / I915_GTT_PAGE_SIZE; - end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE; + gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE; + end = gte + vma_res->guard / I915_GTT_PAGE_SIZE; + while (gte < end) + iowrite32(vm->scratch[0]->encode, gte++); + end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE; for_each_sgt_daddr(addr, iter, vma_res->bi.pages) iowrite32(vm->pte_encode(addr, level, flags), gte++); GEM_BUG_ON(gte > end); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 8c2f57eb5ddaa..2434197830523 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -44,7 +44,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, #define PIN_HIGH BIT_ULL(5) #define PIN_OFFSET_BIAS