RE: [PATCH v3] drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8

2024-07-11 Thread Cavitt, Jonathan
-Original Message-
From: Intel-gfx  On Behalf Of Nitin 
Gote
Sent: Thursday, July 11, 2024 9:32 AM
To: Wilson, Chris P ; tursu...@ursulin.net; 
intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Shyti, Andi ; Das, 
Nirmoy ; janusz.krzyszto...@linux.intel.com; Gote, Nitin 
R ; Chris Wilson ; 
sta...@vger.kernel.org
Subject: [PATCH v3] drm/i915/gt: Do not consider preemption during 
execlists_dequeue for gen8
> 
> We're seeing a GPU HANG issue on a CHV platform, which was caused by
> bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for 
> gen8").
> 
> Gen8 platform has only timeslice and doesn't support a preemption mechanism
> as engines do not have a preemption timer and doesn't send an irq if the
> preemption timeout expires.

That seems to mean the original can_preempt function was inaccurately built,
so fixing it here makes the most sense to me, especially if it's causing 
problems.

Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> So, add a fix to not consider preemption
> during dequeuing for gen8 platforms.
> 
> v2: Simplify can_preempt() function (Tvrtko Ursulin)
> 
> v3:
>  - Inside need_preempt(), condition of can_preempt() is not required
>as simplified can_preempt() is enough. (Chris Wilson)
> 
> Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries 
> for gen8")
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
> Suggested-by: Andi Shyti 
> Signed-off-by: Nitin Gote 
> Cc: Chris Wilson 
> CC:  # v5.2+
> ---
>  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 21829439e686..72090f52fb85 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -3315,11 +3315,7 @@ static void remove_from_engine(struct i915_request *rq)
>  
>  static bool can_preempt(struct intel_engine_cs *engine)
>  {
> - if (GRAPHICS_VER(engine->i915) > 8)
> - return true;
> -
> - /* GPGPU on bdw requires extra w/a; not implemented */
> - return engine->class != RENDER_CLASS;
> + return GRAPHICS_VER(engine->i915) > 8;
>  }
>  
>  static void kick_execlists(const struct i915_request *rq, int prio)
> -- 
> 2.25.1
> 
> 


RE: [PATCH 2/2] drm/i915: disable fbc due to Wa_16023588340

2024-06-20 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Wednesday, June 19, 2024 7:31 AM
To: intel...@lists.freedesktop.org
Cc: Cavitt, Jonathan ; Roper, Matthew D 
; De Marchi, Lucas ; 
Govindapillai, Vinod ; 
intel-gfx@lists.freedesktop.org
Subject: [PATCH 2/2] drm/i915: disable fbc due to Wa_16023588340
> 
> On BMG-G21 we need to disable fbc due to complications around the WA.
> 
> Signed-off-by: Matthew Auld 
> Cc: Jonathan Cavitt 
> Cc: Matt Roper 
> Cc: Lucas De Marchi 
> Cc: Vinod Govindapillai 
> Cc: intel-gfx@lists.freedesktop.org
> ---
>  drivers/gpu/drm/i915/display/intel_display_wa.h |  8 
>  drivers/gpu/drm/i915/display/intel_fbc.c|  6 ++
>  drivers/gpu/drm/xe/Makefile |  4 +++-
>  drivers/gpu/drm/xe/display/xe_display_wa.c  | 16 
>  4 files changed, 33 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/xe/display/xe_display_wa.c
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display_wa.h 
> b/drivers/gpu/drm/i915/display/intel_display_wa.h
> index 63201d09852c..be644ab6ae00 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_wa.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_wa.h
> @@ -6,8 +6,16 @@
>  #ifndef __INTEL_DISPLAY_WA_H__
>  #define __INTEL_DISPLAY_WA_H__
>  
> +#include 
> +
>  struct drm_i915_private;
>  
>  void intel_display_wa_apply(struct drm_i915_private *i915);
>  
> +#ifdef I915
> +static inline bool intel_display_needs_wa_16023588340(struct 
> drm_i915_private *i915) { return false; }
> +#else
> +bool intel_display_needs_wa_16023588340(struct drm_i915_private *i915);
> +#endif
> +
>  #endif
> diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c 
> b/drivers/gpu/drm/i915/display/intel_fbc.c
> index 67116c9f1464..8488f82143a4 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
> @@ -56,6 +56,7 @@
>  #include "intel_display_device.h"
>  #include "intel_display_trace.h"
>  #include "intel_display_types.h"
> +#include "intel_display_wa.h"
>  #include "intel_fbc.h"
>  #include "intel_fbc_regs.h"
>  #include "intel_frontbuffer.h"
> @@ -1237,6 +1238,11 @@ static int intel_fbc_check_plane(struct 
> intel_atomic_state *state,
>   return 0;
>   }
>  
> + if (intel_display_needs_wa_16023588340(i915)) {
> + plane_state->no_fbc_reason = "Wa_16023588340";
> + return 0;
> + }
> +
>   /* WaFbcTurnOffFbcWhenHyperVisorIsUsed:skl,bxt */
>   if (i915_vtd_active(i915) && (IS_SKYLAKE(i915) || IS_BROXTON(i915))) {
>   plane_state->no_fbc_reason = "VT-d enabled";
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 0e16e5029081..f7521fd5db4c 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -34,7 +34,8 @@ uses_generated_oob := \
>   $(obj)/xe_ring_ops.o \
>   $(obj)/xe_vm.o \
>   $(obj)/xe_wa.o \
> - $(obj)/xe_ttm_stolen_mgr.o
> + $(obj)/xe_ttm_stolen_mgr.o \
> + $(obj)/display/xe_display_wa.o \
>  
>  $(uses_generated_oob): $(generated_oob)
>  
> @@ -192,6 +193,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>   display/xe_display.o \
>   display/xe_display_misc.o \
>   display/xe_display_rps.o \
> + display/xe_display_wa.o \
>   display/xe_dsb_buffer.o \
>   display/xe_fb_pin.o \
>   display/xe_hdcp_gsc.o \
> diff --git a/drivers/gpu/drm/xe/display/xe_display_wa.c 
> b/drivers/gpu/drm/xe/display/xe_display_wa.c
> new file mode 100644
> index ..68e3d1959ad6
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/display/xe_display_wa.c
> @@ -0,0 +1,16 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +#include "intel_display_wa.h"
> +
> +#include "xe_device.h"
> +#include "xe_wa.h"
> +
> +#include 
> +
> +bool intel_display_needs_wa_16023588340(struct drm_i915_private *i915)
> +{
> + return XE_WA(xe_root_mmio_gt(i915), 16023588340);

You'll probably want to get verification this is the correct WA number
before pushing this change, but other than that:
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> +}
> -- 
> 2.45.1
> 
> 


RE: [PATCH v2 0/2] Sparse errors on the i915_gem_stolen

2024-06-18 Thread Cavitt, Jonathan
-Original Message-
From: Andi Shyti  
Sent: Monday, June 17, 2024 11:43 AM
To: intel-gfx ; dri-devel 

Cc: Andi Shyti ; Cavitt, Jonathan 
; De Marchi, Lucas 
Subject: [PATCH v2 0/2] Sparse errors on the i915_gem_stolen
> 
> Hi Jonathan,
> 
> Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
> warning") produces two sparse warnings. The first one being a bit
> more sever as it might cause a segmentation fault.
> 
> The difference between v1 and v2 is that the patch should return
> a NULL, which won't cause any issues.
> 
> Andi

Sure.  Apply to all:
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> 
> Cc: Jonathan Cavitt 
> Cc: Lucas De Marchi 
> 
> Andi Shyti (2):
>   drm/i915/gem: Return NULL instead of '0'
>   drm/i915/gem: Use the correct format specifier for resource_size_t
> 
>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> -- 
> 2.45.1
> 
> 


RE: [PATCH] drm/i915/gt/uc: Fix typo in comment

2024-06-14 Thread Cavitt, Jonathan


> 
> -Original Message-
From: Vivi, Rodrigo  
Sent: Friday, June 14, 2024 12:46 PM
To: Cavitt, Jonathan 
Cc: Andi Shyti ; intel-gfx 
; dri-devel ; 
Harrison, John C 
Subject: Re: [PATCH] drm/i915/gt/uc: Fix typo in comment
> 
> On Fri, Jun 14, 2024 at 03:23:54PM +0000, Cavitt, Jonathan wrote:
> > -Original Message-
> > From: Andi Shyti  
> > Sent: Friday, June 14, 2024 8:22 AM
> > To: Andi Shyti 
> > Cc: intel-gfx ; dri-devel 
> > ; Harrison, John C 
> > ; Cavitt, Jonathan 
> > Subject: Re: [PATCH] drm/i915/gt/uc: Fix typo in comment
> > > 
> > > I guess sparse and potential CI errors won't minimally relate to
> > > this patch.
> > 
> > Yeah, I don't see how a change to a comment could ever be related
> > to any CI errors:
> > 
> > Reviewed-by: Jonathan Cavitt
> 
> Please ensure you use the full line.
> 1. People don't have to go to the header of your email to get your email.
> 2. People might not be entirely sure of your choice of email. Please notice 
> that
> even on Intel many folks have @intel.com and @linux.intel.com and sometimes 
> although
> responding from one, they use the other to sign things.
> 3. Tooling! (b4, patchwork, etc) There are many tools that get these tags 
> directly
>  from the email response and by using partial one you can complicate things.
> 
> If you also allow one advice, try to use a terminal mail client like mutt and
> open the replies in your favorite editor and add macros there to add your 
> lines
> based on shortcuts... For instance, on my case I use mutt+emacs and I just hit
> Ctrl+c-rev and that adds my full rv-b tag in the response.
> 

Whoops!

Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> Thanks,
> Rodrigo.
> 
> > -Jonathan Cavitt
> > 
> > > 
> > > Adding also Jonathan in Cc :-)
> > > 
> > > Thanks,   
> > > Andi
> > > 
> > > On Fri, Jun 14, 2024 at 12:28:37AM +0200, Andi Shyti wrote:
> > > > Replace "dynmically" with "dynamically".
> > > > 
> > > > Signed-off-by: Andi Shyti 
> > > > Cc: John Harrison 
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > > > index 14797e80bc92..263c9c3f6a03 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > > > @@ -295,7 +295,7 @@ struct guc_update_scheduling_policy_header {
> > > >  } __packed;
> > > >  
> > > >  /*
> > > > - * Can't dynmically allocate memory for the scheduling policy KLV 
> > > > because
> > > > + * Can't dynamically allocate memory for the scheduling policy KLV 
> > > > because
> > > >   * it will be sent from within the reset path. Need a fixed size lump 
> > > > on
> > > >   * the stack instead :(.
> > > >   *
> > > > -- 
> > > > 2.45.1
> > > 
> 


RE: [PATCH] drm/i915/gt/uc: Fix typo in comment

2024-06-14 Thread Cavitt, Jonathan
-Original Message-
From: Andi Shyti  
Sent: Friday, June 14, 2024 8:22 AM
To: Andi Shyti 
Cc: intel-gfx ; dri-devel 
; Harrison, John C 
; Cavitt, Jonathan 
Subject: Re: [PATCH] drm/i915/gt/uc: Fix typo in comment
> 
> I guess sparse and potential CI errors won't minimally relate to
> this patch.

Yeah, I don't see how a change to a comment could ever be related
to any CI errors:

Reviewed-by: Jonathan Cavitt
-Jonathan Cavitt

> 
> Adding also Jonathan in Cc :-)
> 
> Thanks,   
> Andi
> 
> On Fri, Jun 14, 2024 at 12:28:37AM +0200, Andi Shyti wrote:
> > Replace "dynmically" with "dynamically".
> > 
> > Signed-off-by: Andi Shyti 
> > Cc: John Harrison 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > index 14797e80bc92..263c9c3f6a03 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > @@ -295,7 +295,7 @@ struct guc_update_scheduling_policy_header {
> >  } __packed;
> >  
> >  /*
> > - * Can't dynmically allocate memory for the scheduling policy KLV because
> > + * Can't dynamically allocate memory for the scheduling policy KLV because
> >   * it will be sent from within the reset path. Need a fixed size lump on
> >   * the stack instead :(.
> >   *
> > -- 
> > 2.45.1
> 


RE: [PATCH] drm/i915/selftest_hangcheck: Fix potential UAF after HW fence revoke

2024-05-29 Thread Cavitt, Jonathan
-Original Message-
From: Janusz Krzysztofik  
Sent: Wednesday, May 29, 2024 4:37 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Jani Nikula ; 
Joonas Lahtinen ; Vivi, Rodrigo 
; Tvrtko Ursulin ; Andi Shyti 
; Cavitt, Jonathan ; 
Janusz Krzysztofik 
Subject: [PATCH] drm/i915/selftest_hangcheck: Fix potential UAF after HW fence 
revoke
> 
> CI is sporadically reporting the following issue triggered by
> igt@i915_selftest@live@hangcheck test case:
> 
> <6> [414.049203] i915: Running 
> intel_hangcheck_live_selftests/igt_reset_evict_fence
> ...
> <6> [414.068804] i915 :00:02.0: [drm] GT0: GUC: submission enabled
> <6> [414.068812] i915 :00:02.0: [drm] GT0: GUC: SLPC enabled
> <3> [414.070354] Unable to pin Y-tiled fence; err:-4
> <3> [414.071282] i915_vma_revoke_fence:301 
> GEM_BUG_ON(!i915_active_is_idle(>active))
> ...
> <4>[  609.603992] [ cut here ]
> <2>[  609.603995] kernel BUG at 
> drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
> <4>[  609.604003] invalid opcode:  [#1] PREEMPT SMP NOPTI
> <4>[  609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U  W 
>  6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
> <4>[  609.604008] Hardware name: Intel Corporation Alder Lake Client 
> Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 
> 01/20/2023
> <4>[  609.604010] Workqueue: i915 __i915_gem_free_work [i915]
> <4>[  609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
> ...
> <4>[  609.604271] Call Trace:
> <4>[  609.604273]  
> ...
> <4>[  609.604716]  __i915_vma_evict+0x2e9/0x550 [i915]
> <4>[  609.604852]  __i915_vma_unbind+0x7c/0x160 [i915]
> <4>[  609.604977]  force_unbind+0x24/0xa0 [i915]
> <4>[  609.605098]  i915_vma_destroy+0x2f/0xa0 [i915]
> <4>[  609.605210]  __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
> <4>[  609.605330]  __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
> <4>[  609.605440]  process_scheduled_works+0x351/0x690
> 
> Since no other tests nor users report that issue, I believe it is specific
> to that test case, which should just wait after reset it triggers for
> actual completion of a request that it forced to claim using a hardware
> fence before it releases allocated resources.  Fix it.
> 

+ Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10021

> Signed-off-by: Janusz Krzysztofik 

Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 9ce8ff1c04fe5..b47c99f38a525 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -1568,6 +1568,8 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
>  
>  out_rq:
>   i915_request_put(rq);
> + if (flags & EXEC_OBJECT_NEEDS_FENCE)
> + i915_active_wait(>fence->active);
>  out_obj:
>   i915_gem_object_put(obj);
>  fini:
> -- 
> 2.45.1
> 
> 


RE: [PATCH] drm/i915/selftests: Set always_coherent to false when reading from CPU

2024-05-16 Thread Cavitt, Jonathan
-Original Message-
From: Das, Nirmoy  
Sent: Thursday, May 16, 2024 8:14 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Das, Nirmoy ; Andi 
Shyti ; Janusz Krzysztofik 
; Cavitt, Jonathan 

Subject: [PATCH] drm/i915/selftests: Set always_coherent to false when reading 
from CPU
> 
> The previous commit 'commit 8d4ba9fc1c6c ("drm/i915/selftests: Pick
> correct caching mode.")' was not complete as for non LLC  sharing platforms
> cpu read can happen from LLC which probably doesn't have the latest
> changes made by GPU.
> 
> Cc: Andi Shyti 
> Cc: Janusz Krzysztofik 
> Cc: Jonathan Cavitt 
> Signed-off-by: Nirmoy Das 

I see no problem with this
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> ---
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> index 65a931ea80e9..3527b8f446fe 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> @@ -196,7 +196,7 @@ static int verify_access(struct drm_i915_private *i915,
>   if (err)
>   goto out_file;
>  
> - mode = intel_gt_coherent_map_type(to_gt(i915), native_obj, true);
> + mode = intel_gt_coherent_map_type(to_gt(i915), native_obj, false);
>   vaddr = i915_gem_object_pin_map_unlocked(native_obj, mode);
>   if (IS_ERR(vaddr)) {
>   err = PTR_ERR(vaddr);
> -- 
> 2.42.0
> 
> 


RE: [PATCH] drm/i915/gem: Downgrade stolen lmem setup warning

2024-04-25 Thread Cavitt, Jonathan
-Original Message-
From: Jani Nikula  
Sent: Monday, April 22, 2024 1:29 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Cavitt, Jonathan 
; Harrison, John C ; 
chris.p.wil...@linux.intel.com; andi.sh...@linux.intel.com; Das, Nirmoy 

Subject: Re: [PATCH] drm/i915/gem: Downgrade stolen lmem setup warning
> 
> On Fri, 19 Apr 2024, Jonathan Cavitt  wrote:
> > In the case where lmem_size < dsm_base, hardware is reporting that
> > stolen lmem is unusable.  In this case, instead of throwing a warning,
> > we can continue execution as normal by disabling stolen LMEM support.
> > For example, this change will allow the following error report from
> > ATS-M to no longer apply:
> >
> > <6> [144.859887] pcieport :4b:00.0: bridge window [mem 
> > 0xb100-0xb11f]
> > <6> [144.859900] pcieport :4b:00.0: bridge window [mem 
> > 0x3bbc-0x3bbc17ff 64bit pref]
> > <6> [144.859917] pcieport :4c:01.0: PCI bridge to [bus 4d-4e]
> > <6> [144.859932] pcieport :4c:01.0: bridge window [mem 
> > 0xb100-0xb11f]
> > <6> [144.859945] pcieport :4c:01.0: bridge window [mem 
> > 0x3bbc-0x3bbc17ff 64bit pref]
> > <6> [144.859984] i915 :4d:00.0: [drm] BAR2 resized to 256M
> > <6> [144.860640] i915 :4d:00.0: [drm] Using a reduced BAR size of 
> > 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the 
> > BIOS.
> > <4> [144.860719] ---[ cut here ]---
> > <4> [144.860727] WARNING: CPU: 17 PID: 1815 at 
> > drivers/gpu/drm/i915/gem/i915_gem_stolen.c:939 
> > i915_gem_stolen_lmem_setup+0x38c/0x430 [i915]
> > <4> [144.861430] Modules linked in: i915 snd_intel_dspcfg snd_hda_codec 
> > snd_hwdep snd_hda_core snd_pcm vgem drm_shmem_helper prime_numbers 
> > i2c_algo_bit ttm video drm_display_helper drm_buddy fuse 
> > x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe mdio irqbypass ptp 
> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pps_core i2c_i801 mei_me 
> > i2c_smbus mei wmi acpi_power_meter [last unloaded: i915]
> > <4> [144.861611] CPU: 17 PID: 1815 Comm: i915_module_loa Tainted: G U 
> > 6.8.0-rc5-drmtip_1515-g78f49af27723+ #1
> > <4> [144.861624] Hardware name: Intel Corporation WHITLEY/WHITLEY, BIOS 
> > SE5C6200.86B.0020.P41.2109300305 09/30/2021
> > <4> [144.861632] RIP: 0010:i915_gem_stolen_lmem_setup+0x38c/0x430 [i915]
> > <4> [144.862287] Code: ff 41 c1 e4 05 e9 ac fe ff ff 4d 63 e4 48 89 ef 48 
> > 85 ed 74 04 48 8b 7d 08 48 c7 c6 10 a3 7b a0 e8 e9 90 43 e1 e9 ee fd ff ff 
> > <0f> 0b 49 c7 c4 ed ff ff ff e9 e0 fd ff ff 0f b7 d2 48 c7 c6 00 d9
> > <4> [144.862299] RSP: 0018:c90005607980 EFLAGS: 00010207
> > <4> [144.862315] RAX: fff0 RBX: 0003 RCX: 
> > 
> >
> > Fixes: https://gitlab.freedesktop.org/drm/intel/-/issues/10833
> >
> 
> No blank lines between trailers please.
> 
> BR,
> Jani.

Fixed in latest revision.  Thank you for the revision notes.
-Jonathan Cavitt

> 
> > Suggested-by: Chris Wilson 
> > Signed-off-by: Jonathan Cavitt 
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> > index ad6dd7f3259bc..efa632a9e61c6 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> > @@ -936,8 +936,12 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private 
> > *i915, u16 type,
> > } else {
> > /* Use DSM base address instead for stolen memory */
> > dsm_base = intel_uncore_read64(uncore, GEN6_DSMBASE) & 
> > GEN11_BDSM_MASK;
> > -   if (WARN_ON(lmem_size < dsm_base))
> > +   if (lmem_size < dsm_base) {
> > +   drm_dbg(>drm,
> > +   "Disabling stolen memory support due to OOB 
> > placement: lmem_size = %lli vs dsm_base = %lli\n",
> > +   lmem_size, dsm_base);
> > return ERR_PTR(-ENODEV);
> > +   }
> > dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M);
> > }
> 
> -- 
> Jani Nikula, Intel
> 


RE: [PATCH] drm/i915/gem: Downgrade stolen lmem setup warning

2024-04-25 Thread Cavitt, Jonathan
-Original Message-
From: Vivekanandan, Balasubramani  
Sent: Sunday, April 21, 2024 11:29 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Harrison, John C 
; chris.p.wil...@linux.intel.com; 
andi.sh...@linux.intel.com; Das, Nirmoy 
Subject: Re: [PATCH] drm/i915/gem: Downgrade stolen lmem setup warning
> 
> On 19.04.2024 14:26, Jonathan Cavitt wrote:
> > In the case where lmem_size < dsm_base, hardware is reporting that
> > stolen lmem is unusable.  In this case, instead of throwing a warning,
> > we can continue execution as normal by disabling stolen LMEM support.
> > For example, this change will allow the following error report from
> > ATS-M to no longer apply:
> > 
> > <6> [144.859887] pcieport :4b:00.0: bridge window [mem 
> > 0xb100-0xb11f]
> > <6> [144.859900] pcieport :4b:00.0: bridge window [mem 
> > 0x3bbc-0x3bbc17ff 64bit pref]
> > <6> [144.859917] pcieport :4c:01.0: PCI bridge to [bus 4d-4e]
> > <6> [144.859932] pcieport :4c:01.0: bridge window [mem 
> > 0xb100-0xb11f]
> > <6> [144.859945] pcieport :4c:01.0: bridge window [mem 
> > 0x3bbc-0x3bbc17ff 64bit pref]
> > <6> [144.859984] i915 :4d:00.0: [drm] BAR2 resized to 256M
> > <6> [144.860640] i915 :4d:00.0: [drm] Using a reduced BAR size of 
> > 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the 
> > BIOS.
> > <4> [144.860719] ---[ cut here ]---
> > <4> [144.860727] WARNING: CPU: 17 PID: 1815 at 
> > drivers/gpu/drm/i915/gem/i915_gem_stolen.c:939 
> > i915_gem_stolen_lmem_setup+0x38c/0x430 [i915]
> > <4> [144.861430] Modules linked in: i915 snd_intel_dspcfg snd_hda_codec 
> > snd_hwdep snd_hda_core snd_pcm vgem drm_shmem_helper prime_numbers 
> > i2c_algo_bit ttm video drm_display_helper drm_buddy fuse 
> > x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe mdio irqbypass ptp 
> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pps_core i2c_i801 mei_me 
> > i2c_smbus mei wmi acpi_power_meter [last unloaded: i915]
> > <4> [144.861611] CPU: 17 PID: 1815 Comm: i915_module_loa Tainted: G U 
> > 6.8.0-rc5-drmtip_1515-g78f49af27723+ #1
> > <4> [144.861624] Hardware name: Intel Corporation WHITLEY/WHITLEY, BIOS 
> > SE5C6200.86B.0020.P41.2109300305 09/30/2021
> > <4> [144.861632] RIP: 0010:i915_gem_stolen_lmem_setup+0x38c/0x430 [i915]
> > <4> [144.862287] Code: ff 41 c1 e4 05 e9 ac fe ff ff 4d 63 e4 48 89 ef 48 
> > 85 ed 74 04 48 8b 7d 08 48 c7 c6 10 a3 7b a0 e8 e9 90 43 e1 e9 ee fd ff ff 
> > <0f> 0b 49 c7 c4 ed ff ff ff e9 e0 fd ff ff 0f b7 d2 48 c7 c6 00 d9
> > <4> [144.862299] RSP: 0018:c90005607980 EFLAGS: 00010207
> > <4> [144.862315] RAX: fff0 RBX: 0003 RCX: 
> > 
> > 
> > Fixes: https://gitlab.freedesktop.org/drm/intel/-/issues/10833
> > 
> > Suggested-by: Chris Wilson 
> > Signed-off-by: Jonathan Cavitt 
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> > index ad6dd7f3259bc..efa632a9e61c6 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> > @@ -936,8 +936,12 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private 
> > *i915, u16 type,
> > } else {
> > /* Use DSM base address instead for stolen memory */
> > dsm_base = intel_uncore_read64(uncore, GEN6_DSMBASE) & 
> > GEN11_BDSM_MASK;
> > -   if (WARN_ON(lmem_size < dsm_base))
> > +   if (lmem_size < dsm_base) {
> > +   drm_dbg(>drm,
> > +   "Disabling stolen memory support due to OOB 
> > placement: lmem_size = %lli vs dsm_base = %lli\n",
> > +   lmem_size, dsm_base);
> > return ERR_PTR(-ENODEV);
> The patch is still returning an error and aborting initialization. Only the
> warning is eliminated. But as per the commit description, the execution
> should continue as normal after disabling stolen lmem support.
> 

Fixed in latest revision.  Thank you for the revision notes.
-Jonathan Cavitt

> Regards,
> Bala
> 
> > +   }
> > dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M);
> > }
> >  
> > -- 
> > 2.25.1
> > 
> 


RE: [PATCH i-g-t v3 0/5] lib/kunit: Execute test cases synchronously

2024-03-19 Thread Cavitt, Jonathan
-Original Message-
From: Janusz Krzysztofik  
Sent: Monday, March 18, 2024 3:13 AM
To: igt-...@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org; intel...@lists.freedesktop.org; Kamil 
Konieczny ; Mauro Carvalho Chehab 
; Cavitt, Jonathan ; De Marchi, 
Lucas 
Subject: [PATCH i-g-t v3 0/5] lib/kunit: Execute test cases synchronously
> 
> Up to now we were loading a KUnit test module in test execution mode only
> once per subtest, in background, and then, in parallel with execution of
> test cases while the module was loading, we were looking through dmesg for
> KTAP results from each expected test case.  As a consequence, our IGT
> messages were more or less delayed, never in full sync with kernel
> messages.  Moreover, parsing of KTAP results from already completed test
> cases could be abandoned on a failure from loading the test module or
> kernel taint caused by a subsequent test case.  Also, parsing of KTAP
> results from all subsequent test cases could be abandoned on a failure of
> the parser caused by any test case.  Other than that, if a user requested
> a single dynamic sub-subtest, all test cases were executed anyway while
> results from only one of them that corresponded to the selected dynamic
> sub-subtest were reported.  That way, kernel messages from unrelated test
> cases, not only the selected one, could contribute to dmesg-fail or dmesg-
> warn CI results from that sub-subtest.
> 
> Since recent KUnit implementation is capable of executing only those test
> cases that match a user filter, stop executing all of them asynchronously
> and parsing their KTAP results as they appear.  Instead, reload the test
> module once per each dynamic sub-subtest with a filter that selects a
> specific test case and wait for its completion.  If successful and no
> kernel taint has occurred then parse the whole KTAP report from a single
> test case it has produced and translate it to IGT result of that single
> corresponding sub-subtest.
> 
> v3: Insert new patches 1-3 that fix an infinite loop when we try to get a
> list of test cases from an unexpectedly missing KTAP report.
> v2: Refresh the series on top of changes to KUnit filters handling,
>   - move the code of a new helper from a previous patch 1 to a previous
> patch 2 which now becomes patch 1,
>   - actually limit the scope of the helper to fetching a KTAP report from
> a file descriptor, and let the callers decide on how other steps, like
> setting up filters or loading a test module, and errors they return
> are handled,
>   - update commit description with a more detailed justification of why we
> need these changes,
>   - rebase the former patch 1 on top of the new patch 1, update its commit
> message and description and provide it as patch 2.
> 
> Janusz Krzysztofik (5):
>   lib/kunit: Store igt_ktap_results pointer in a central location
>   lib/kunit: Let igt_ktap_free() take care of pointer reset
>   lib/kunit: Time out promptly on missing KTAP report
>   lib/kunit: Execute test cases synchronously
>   lib/kunit: Minimize code duplication
> 


Acked-by: Jonathan Cavitt 
Ack applies to all patches in series.
It seems that Kamil is giving a proper review to all the individual patches,
so I'd wait until that's done before moving forward.
-Jonathan Cavitt


>  lib/igt_kmod.c  | 193 ++--
>  lib/igt_ktap.c  |   5 +-
>  lib/igt_ktap.h  |   2 +-
>  lib/tests/igt_ktap_parser.c |  24 ++---
>  4 files changed, 93 insertions(+), 131 deletions(-)
> 
> -- 
> 2.43.0
> 
> 


RE: [PATCH] drm/i915/selftests: Pick correct caching mode.

2024-03-12 Thread Cavitt, Jonathan
-Original Message-
From: Das, Nirmoy  
Sent: Tuesday, March 12, 2024 4:18 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Das, Nirmoy ; Andi 
Shyti ; Janusz Krzysztofik 
; Cavitt, Jonathan 

Subject: [PATCH] drm/i915/selftests: Pick correct caching mode.
> 
> Caching mode is HW dependent so pick a correct one using
> intel_gt_coherent_map_type().
> 
> Cc: Andi Shyti 
> Cc: Janusz Krzysztofik 
> Cc: Jonathan Cavitt 
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10249
> Signed-off-by: Nirmoy Das 

LGTM
Acked-by: Jonathan Cavitt 
-Jonathan Cavitt

> ---
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> index d684a70f2c04..65a931ea80e9 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> @@ -7,6 +7,7 @@
>  #include "i915_drv.h"
>  #include "i915_selftest.h"
>  #include "gem/i915_gem_context.h"
> +#include "gt/intel_gt.h"
>  
>  #include "mock_context.h"
>  #include "mock_dmabuf.h"
> @@ -155,6 +156,7 @@ static int verify_access(struct drm_i915_private *i915,
>   struct file *file;
>   u32 *vaddr;
>   int err = 0, i;
> + unsigned int mode;
>  
>   file = mock_file(i915);
>   if (IS_ERR(file))
> @@ -194,7 +196,8 @@ static int verify_access(struct drm_i915_private *i915,
>   if (err)
>   goto out_file;
>  
> - vaddr = i915_gem_object_pin_map_unlocked(native_obj, I915_MAP_WB);
> + mode = intel_gt_coherent_map_type(to_gt(i915), native_obj, true);
> + vaddr = i915_gem_object_pin_map_unlocked(native_obj, mode);
>   if (IS_ERR(vaddr)) {
>   err = PTR_ERR(vaddr);
>   goto out_file;
> -- 
> 2.42.0
> 
> 


RE: [PATCH i-g-t v2 0/2] lib/kunit: Execute test cases synchronously

2024-03-07 Thread Cavitt, Jonathan


All patches LGTM
Acked-by: Jonathan Cavitt 
-Jonathan Cavitt

-Original Message-
From: Intel-gfx  On Behalf Of Janusz 
Krzysztofik
Sent: Tuesday, February 27, 2024 7:11 AM
To: igt-...@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org; intel...@lists.freedesktop.org; Kamil 
Konieczny ; Mauro Carvalho Chehab 
; De Marchi, Lucas 
Subject: [PATCH i-g-t v2 0/2] lib/kunit: Execute test cases synchronously
> 
> Up to now we were loading a KUnit test module in test execution mode only
> once per subtest, in background, and then, in parallel with execution of
> test cases while the module was loading, we were looking through dmesg for
> KTAP results from each expected test case.  As a consequence, our IGT
> messages were more or less delayed, never in full sync with kernel
> messages.  Moreover, parsing of KTAP results from already completed test
> cases could be abandoned on a failure from loading the test module or
> kernel taint caused by a subsequent test case.  Also, parsing of KTAP
> results from all subsequent test cases could be abandoned on a failure of
> the parser caused by any test case.  Other than that, if a user requested
> a single dynamic sub-subtest, all test cases were executed anyway while
> results from only one of them that corresponded to the selected dynamic
> sub-subtest were reported.  That way, kernel messages from unrelated test
> cases, not only the selected one, could contribute to dmesg-fail or dmesg-
> warn CI results from that sub-subtest.
> 
> Since recent KUnit implementation is capable of executing only those test
> cases that match a user filter, stop executing all of them asynchronously
> and parsing their KTAP results as they appear.  Instead, reload the test
> module once per each dynamic sub-subtest with a filter that selects a
> specific test case and wait for its completion.  If successful and no
> kernel taint has occurred then parse the whole KTAP report from a single
> test case it has produced and translate it to IGT result of that single
> corresponding sub-subtest.
> 
> v2: Refresh the series on top of changes to KUnit filters handling,
>   - move the code of a new helper from a previous patch 1 to a previous
> patch 2 which now becomes patch 1,
>   - actually limit the scope of the helper to fetching a KTAP report from
> a file descriptor, and let the callers decide on how other steps, like
> setting up filters or loading a test module, and errors they return
> are handled,
>   - update commit description with a more detailed justification of why we
> need these changes,
>   - rebase the former patch 1 on top of the new patch 1, update its commit
> message and description and provide it as patch 2.
> 
> Janusz Krzysztofik (2):
>   lib/kunit: Execute test cases synchronously
>   lib/kunit: Minimize code duplication
> 
>  lib/igt_kmod.c | 172 -
>  1 file changed, 54 insertions(+), 118 deletions(-)
> 
> -- 
> 2.43.0
> 
> 


RE: [Intel-gfx] [PATCH] drm/i915/gem: Atomically invalidate userptr on mmu-notifier

2023-12-18 Thread Cavitt, Jonathan
-Original Message-
From: Andi Shyti  
Sent: Monday, December 18, 2023 8:06 AM
To: Cavitt, Jonathan 
Cc: intel-gfx@lists.freedesktop.org; Gupta, saurabhg 
; chris.p.wil...@linux.intel.com
Subject: Re: [Intel-gfx] [PATCH] drm/i915/gem: Atomically invalidate userptr on 
mmu-notifier
> 
> Hi Jonathan,
> 
> On Tue, Nov 28, 2023 at 08:25:05AM -0800, Jonathan Cavitt wrote:
> > Never block for outstanding work on userptr object upon receipt of a
> > mmu-notifier. The reason we originally did so was to immediately unbind
> > the userptr and unpin its pages, but since that has been dropped in
> > commit b4b9731b02c3c ("drm/i915: Simplify userptr locking"), we never
> > return the pages to the system i.e. never drop our page->mapcount and so
> > do not allow the page and CPU PTE to be revoked. Based on this history,
> > we know we are safe to drop the wait entirely.
> > 
> > Upon return from mmu-notifier, we will still have the userptr pages
> > pinned preventing the following PTE operation (such as try_to_unmap)
> > adjusting the vm_area_struct, so it is safe to keep the pages around for
> > as long as we still have i/o pending.
> > 
> > We do not have any means currently to asynchronously revalidate the
> > userptr pages, that is always prior to next use.
> > 
> > Signed-off-by: Chris Wilson 
> > Signed-off-by: Jonathan Cavitt 
> 
> Reviewed-by: Andi Shyti 


Thank you for the review!  I don't think I have permission to push this 
upstream,
though, so you or someone else will have to complete the push.
-Jonathan Cavitt


> 
> Thanks,
> Andi
> 


Re: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching into DMA for MTL

2023-11-03 Thread Cavitt, Jonathan
-Original Message-
From: Sripada, Radhakrishna  
Sent: Friday, November 3, 2023 1:02 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Cavitt, Jonathan 
; chris.p.wil...@linux.intel.com
Subject: RE: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching 
into DMA for MTL
> 
> Hi Jonathan,
> 
> > -Original Message-
> > From: Intel-gfx  On Behalf Of 
> > Jonathan
> > Cavitt
> > Sent: Thursday, November 2, 2023 10:59 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: Gupta, saurabhg ; Cavitt, Jonathan
> > ; chris.p.wil...@linux.intel.com
> > Subject: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching 
> > into
> > DMA for MTL
> > 
> > FIXME: It is suspected that some Address Translation Service (ATS)
> > issue on IOMMU is causing CAT errors to occur on some MTL workloads.
> > Applying a write barrier to the ppgtt set entry functions appeared
> > to have no effect, so we must temporarily use I915_MAP_WC in the
> > map_pt_dma class of functions on MTL until a proper ATS solution is
> > found.
> > 
> What is the performance impact here? Are we disabling caching only
> for the pte changes/scratch pages or does it extend beyond?


I don't actually know what map_pt_dma is used for, but if the name is
indicative of its purpose, it should only impact mappings into the dma
page table.
As for the performance impact, I don't imagine it'll be much.  Maybe
a single-digit percentage slowdown?  It might actually improve
performance if we're avoiding enough cache misses, but the true
performance impact would have to be measured empirically.
-Jonathan Cavitt


> 
> Regards,
> Radhakrishna(RK) Sripada 
> > Signed-off-by: Jonathan Cavitt 
> > CC: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gtt.c | 20 
> >  1 file changed, 20 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > index 4fbed27ef0ecc..21719563a602a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > @@ -95,6 +95,16 @@ int map_pt_dma(struct i915_address_space *vm, struct
> > drm_i915_gem_object *obj)
> > void *vaddr;
> > 
> > type = intel_gt_coherent_map_type(vm->gt, obj, true);
> > +   /*
> > +* FIXME: It is suspected that some Address Translation Service (ATS)
> > +* issue on IOMMU is causing CAT errors to occur on some MTL
> > workloads.
> > +* Applying a write barrier to the ppgtt set entry functions appeared
> > +* to have no effect, so we must temporarily use I915_MAP_WC here on
> > +* MTL until a proper ATS solution is found.
> > +*/
> > +   if (IS_METEORLAKE(vm->i915))
> > +   type = I915_MAP_WC;
> > +
> > vaddr = i915_gem_object_pin_map_unlocked(obj, type);
> > if (IS_ERR(vaddr))
> > return PTR_ERR(vaddr);
> > @@ -109,6 +119,16 @@ int map_pt_dma_locked(struct i915_address_space
> > *vm, struct drm_i915_gem_object
> > void *vaddr;
> > 
> > type = intel_gt_coherent_map_type(vm->gt, obj, true);
> > +   /*
> > +* FIXME: It is suspected that some Address Translation Service (ATS)
> > +* issue on IOMMU is causing CAT errors to occur on some MTL
> > workloads.
> > +* Applying a write barrier to the ppgtt set entry functions appeared
> > +* to have no effect, so we must temporarily use I915_MAP_WC here on
> > +* MTL until a proper ATS solution is found.
> > +*/
> > +   if (IS_METEORLAKE(vm->i915))
> > +   type = I915_MAP_WC;
> > +
> > vaddr = i915_gem_object_pin_map(obj, type);
> > if (IS_ERR(vaddr))
> > return PTR_ERR(vaddr);
> > --
> > 2.25.1
> 
> 


Re: [Intel-gfx] [PATCH v2] drm/i915/mcr: Hold GT forcewake during steering operations

2023-10-20 Thread Cavitt, Jonathan
-Original Message-
From: Sripada, Radhakrishna  
Sent: Friday, October 20, 2023 2:32 PM
To: Roper, Matthew D ; 
intel-gfx@lists.freedesktop.org
Cc: Cavitt, Jonathan 
Subject: RE: [PATCH v2] drm/i915/mcr: Hold GT forcewake during steering 
operations
> > -Original Message-
> > From: Roper, Matthew D 
> > Sent: Thursday, October 19, 2023 10:33 PM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: Roper, Matthew D ; Sripada, Radhakrishna
> > ; Cavitt, Jonathan 
> > 
> > Subject: [PATCH v2] drm/i915/mcr: Hold GT forcewake during steering 
> > operations
> > 
> > The steering control and semaphore registers are inside an "always on"
> > power domain with respect to RC6.  However there are some issues if
> > higher-level platform sleep states are entering/exiting at the same time
> > these registers are accessed.  Grabbing GT forcewake and holding it over
> > the entire lock/steer/unlock cycle ensures that those sleep states have
> > been fully exited before we access these registers.
> > 
> > This is expected to become a formally documented/numbered workaround
> > soon.
> > 
> > Note that this patch alone isn't expected to have an immediately
> > noticeable impact on MCR (mis)behavior; an upcoming pcode firmware
> > update will also be necessary to provide the other half of this
> > workaround.
> > 
> > v2:
> >  - Move the forcewake inside the Xe_LPG-specific IP version check.  This
> >should only be necessary on platforms that have a steering semaphore.
> > 
> LGTM,
> Reviewed-by: Radhakrishna Sripada 

Concurred.  While a WA number would be preferrable here, we can add once the WA
is finalized:
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> 
> 
> > Fixes: 4186e2185b4f ("drm/i915/gt: Add dedicated MCR lock")
> > Cc: Radhakrishna Sripada 
> > Cc: Jonathan Cavitt 
> > Signed-off-by: Matt Roper 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 24 ++--
> >  1 file changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index 326c2ed1d99b..34913912d8ae 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -376,9 +376,26 @@ void intel_gt_mcr_lock(struct intel_gt *gt, unsigned 
> > long
> > *flags)
> >  * driver threads, but also with hardware/firmware agents.  A dedicated
> >  * locking register is used.
> >  */
> > -   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
> > +   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70)) {
> > +   /*
> > +* The steering control and semaphore registers are inside an
> > +* "always on" power domain with respect to RC6.  However
> > there
> > +* are some issues if higher-level platform sleep states are
> > +* entering/exiting at the same time these registers are
> > +* accessed.  Grabbing GT forcewake and holding it over the
> > +* entire lock/steer/unlock cycle ensures that those sleep
> > +* states have been fully exited before we access these
> > +* registers.  This wakeref will be released in the unlock
> > +* routine.
> > +*
> > +* This is expected to become a formally documented/numbered
> > +* workaround soon.
> > +*/
> > +   intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_GT);
> > +
> > err = wait_for(intel_uncore_read_fw(gt->uncore,
> > MTL_STEER_SEMAPHORE)
> > == 0x1, 100);
> > +   }
> > 
> > /*
> >  * Even on platforms with a hardware lock, we'll continue to grab
> > @@ -415,8 +432,11 @@ void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned
> > long flags)
> >  {
> > spin_unlock_irqrestore(>mcr_lock, flags);
> > 
> > -   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
> > +   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70)) {
> > intel_uncore_write_fw(gt->uncore, MTL_STEER_SEMAPHORE,
> > 0x1);
> > +
> > +   intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_GT);
> > +   }
> >  }
> > 
> >  /**
> > --
> > 2.41.0
> 
> 


Re: [Intel-gfx] [PATCH v2 3/4] drm/i915: Add WABB blit for Wa_16018031267 / Wa_16018063123

2023-10-20 Thread Cavitt, Jonathan
-Original Message-
From: Hajda, Andrzej  
Sent: Friday, October 20, 2023 5:09 AM
To: intel-gfx@lists.freedesktop.org
Cc: Nirmoy Das ; Shyti, Andi 
; Cavitt, Jonathan ; Das, 
Nirmoy ; Hajda, Andrzej 
Subject: [PATCH v2 3/4] drm/i915: Add WABB blit for Wa_16018031267 / 
Wa_16018063123
> 
> From: Jonathan Cavitt 
> 
> Apply WABB blit for Wa_16018031267 / Wa_16018063123.
> Additionally, update the lrc selftest to exercise the new
> WABB changes.
> 
> Co-developed-by: Nirmoy Das 
> Signed-off-by: Jonathan Cavitt 
> Signed-off-by: Andrzej Hajda 
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_regs.h |   3 +
>  drivers/gpu/drm/i915/gt/intel_gt.h  |   4 +
>  drivers/gpu/drm/i915/gt/intel_gt_types.h|   2 +
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 100 +++-
>  drivers/gpu/drm/i915/gt/selftest_lrc.c  |  65 +
>  5 files changed, 153 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_regs.h 
> b/drivers/gpu/drm/i915/gt/intel_engine_regs.h
> index fdd4ddd3a978a2..b8618ee3e3041a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_regs.h
> @@ -118,6 +118,9 @@
>  #define   CCID_EXTENDED_STATE_RESTOREBIT(2)
>  #define   CCID_EXTENDED_STATE_SAVE   BIT(3)
>  #define RING_BB_PER_CTX_PTR(base)_MMIO((base) + 0x1c0) /* gen8+ 
> */
> +#define   PER_CTX_BB_FORCE   BIT(2)
> +#define   PER_CTX_BB_VALID   BIT(0)
> +
>  #define RING_INDIRECT_CTX(base)  _MMIO((base) + 0x1c4) 
> /* gen8+ */
>  #define RING_INDIRECT_CTX_OFFSET(base)   _MMIO((base) + 0x1c8) 
> /* gen8+ */
>  #define ECOSKPD(base)_MMIO((base) + 0x1d0)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
> b/drivers/gpu/drm/i915/gt/intel_gt.h
> index 970bedf6b78a7b..50989fc2b6debe 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.h
> @@ -82,6 +82,10 @@ struct drm_printer;
> ##__VA_ARGS__);   \
>  } while (0)
>  
> +#define NEEDS_FASTCOLOR_BLT_WABB(engine) ( \
> + IS_GFX_GT_IP_RANGE(engine->gt, IP_VER(12, 55), IP_VER(12, 71)) && \
> + engine->class == COPY_ENGINE_CLASS)
> +
>  static inline bool gt_is_root(struct intel_gt *gt)
>  {
>   return !gt->info.id;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
> b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index def7dd0eb6f196..4917633f299dd5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -307,6 +307,8 @@ enum intel_gt_scratch_field {
>  
>   /* 8 bytes */
>   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
> +
> + INTEL_GT_SCRATCH_FIELD_DUMMY_BLIT = 384,


This was an artifact from an older version of the patch when I was originally 
revising it.
I think it can safely be removed: It's not used anywhere in the patch series 
and is missing
a comment denoting the field size anyways.
Otherwise:
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt


>  };
>  
>  #define intel_gt_support_legacy_fencing(gt) ((gt)->ggtt->num_fences > 0)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
> b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index eaf66d90316655..96ef901113eae9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -828,6 +828,18 @@ lrc_ring_indirect_offset_default(const struct 
> intel_engine_cs *engine)
>   return 0;
>  }
>  
> +static void
> +lrc_setup_bb_per_ctx(u32 *regs,
> +  const struct intel_engine_cs *engine,
> +  u32 ctx_bb_ggtt_addr)
> +{
> + GEM_BUG_ON(lrc_ring_wa_bb_per_ctx(engine) == -1);
> + regs[lrc_ring_wa_bb_per_ctx(engine) + 1] =
> + ctx_bb_ggtt_addr |
> + PER_CTX_BB_FORCE |
> + PER_CTX_BB_VALID;
> +}
> +
>  static void
>  lrc_setup_indirect_ctx(u32 *regs,
>  const struct intel_engine_cs *engine,
> @@ -1020,7 +1032,13 @@ static u32 context_wa_bb_offset(const struct 
> intel_context *ce)
>   return PAGE_SIZE * ce->wa_bb_page;
>  }
>  
> -static u32 *context_indirect_bb(const struct intel_context *ce)
> +/*
> + * per_ctx below determines which WABB section is used.
> + * When true, the function returns the location of the
> + * PER_CTX_BB.  When false, the function returns the
> + * location of the INDIRECT_CTX.
> + */
> +static u32 *context_wabb(const struct intel_context *ce, bool per_ctx)
>  {
>   void *ptr;
>  
> @@ -1029,6 +1047,7 @@ static u32 *context_indirect

Re: [Intel-gfx] [PATCH v2 4/4] drm/i915: Set copy engine arbitration for Wa_16018031267 / Wa_16018063123

2023-10-20 Thread Cavitt, Jonathan
-Original Message-
From: Hajda, Andrzej  
Sent: Friday, October 20, 2023 5:09 AM
To: intel-gfx@lists.freedesktop.org
Cc: Nirmoy Das ; Shyti, Andi 
; Cavitt, Jonathan ; Das, 
Nirmoy 
Subject: [PATCH v2 4/4] drm/i915: Set copy engine arbitration for 
Wa_16018031267 / Wa_16018063123
> 
> From: Jonathan Cavitt 
> 
> Set copy engine arbitration into round robin mode
> for part of Wa_16018031267 / Wa_16018063123 mitigation.
> 
> Signed-off-by: Nirmoy Das 
> Signed-off-by: Jonathan Cavitt 

Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_regs.h | 3 +++
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +
>  2 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_regs.h 
> b/drivers/gpu/drm/i915/gt/intel_engine_regs.h
> index b8618ee3e3041a..c0c8c12edea104 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_regs.h
> @@ -124,6 +124,9 @@
>  #define RING_INDIRECT_CTX(base)  _MMIO((base) + 0x1c4) 
> /* gen8+ */
>  #define RING_INDIRECT_CTX_OFFSET(base)   _MMIO((base) + 0x1c8) 
> /* gen8+ */
>  #define ECOSKPD(base)_MMIO((base) + 0x1d0)
> +#define   XEHP_BLITTER_SCHEDULING_MODE_MASK  REG_GENMASK(12, 11)
> +#define   XEHP_BLITTER_ROUND_ROBIN_MODE  \
> + REG_FIELD_PREP(XEHP_BLITTER_SCHEDULING_MODE_MASK, 1)
>  #define   ECO_CONSTANT_BUFFER_SR_DISABLE REG_BIT(4)
>  #define   ECO_GATING_CX_ONLY REG_BIT(3)
>  #define   GEN6_BLITTER_FBC_NOTIFYREG_BIT(3)
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 192ac0e59afa13..108d9326735910 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -2782,6 +2782,11 @@ xcs_engine_wa_init(struct intel_engine_cs *engine, 
> struct i915_wa_list *wal)
>RING_SEMA_WAIT_POLL(engine->mmio_base),
>1);
>   }
> + /* Wa_16018031267, Wa_16018063123 */
> + if (NEEDS_FASTCOLOR_BLT_WABB(engine))
> + wa_masked_field_set(wal, ECOSKPD(engine->mmio_base),
> + XEHP_BLITTER_SCHEDULING_MODE_MASK,
> + XEHP_BLITTER_ROUND_ROBIN_MODE);
>  }
>  
>  static void
> -- 
> 2.34.1
> 
> 


Re: [Intel-gfx] [PATCH v2 1/4] drm/i915: Enable NULL PTE support for vm scratch

2023-10-20 Thread Cavitt, Jonathan
-Original Message-
From: Hajda, Andrzej  
Sent: Friday, October 20, 2023 5:09 AM
To: intel-gfx@lists.freedesktop.org
Cc: Nirmoy Das ; Shyti, Andi 
; Cavitt, Jonathan ; Chris 
Wilson ; Hajda, Andrzej 

Subject: [PATCH v2 1/4] drm/i915: Enable NULL PTE support for vm scratch
> 
> From: Jonathan Cavitt 
> 
> Enable NULL PTE support for vm scratch pages.
> 
> The use of NULL PTEs in teh vm scratch pages requires us to change how


Right.  I forgot about that typo from my version.  s/teh/the.
But that's just semantics, of course.  I won't block on it.
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt


> the i915 gem_contexts live selftest perform vm_isolation: instead of
> checking the scratch pages are isolated and don't affect each other, we
> check that all changes to the scratch pages are voided.
> 
> v2: fixed order of definitions
> 
> Signed-off-by: Jonathan Cavitt 
> Suggested-by: Chris Wilson 
> Signed-off-by: Andrzej Hajda 
> ---
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 ++
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 3 +++
>  drivers/gpu/drm/i915/gt/intel_gtt.h   | 1 +
>  drivers/gpu/drm/i915/i915_drv.h   | 2 ++
>  drivers/gpu/drm/i915/i915_pci.c   | 2 ++
>  drivers/gpu/drm/i915/intel_device_info.h  | 1 +
>  6 files changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> index 7021b6e9b219ef..48fc5990343bc7 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> @@ -1751,6 +1751,12 @@ static int check_scratch_page(struct i915_gem_context 
> *ctx, u32 *out)
>   if (!vm)
>   return -ENODEV;
>  
> + if (HAS_NULL_PAGE(vm->i915)) {
> + if (out)
> + *out = 0;
> + return 0;
> + }
> +
>   if (!vm->scratch[0]) {
>   pr_err("No scratch page!\n");
>   return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 9895e18df0435a..84aa29715e0aca 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -855,6 +855,9 @@ static int gen8_init_scratch(struct i915_address_space 
> *vm)
> I915_CACHE_NONE),
>  pte_flags);
>  
> + if (HAS_NULL_PAGE(vm->i915))
> + vm->scratch[0]->encode |= PTE_NULL_PAGE;
> +
>   for (i = 1; i <= vm->top; i++) {
>   struct drm_i915_gem_object *obj;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
> b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index b471edac269920..15c71da14d1d27 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -151,6 +151,7 @@ typedef u64 gen8_pte_t;
>  
>  #define GEN8_PAGE_PRESENTBIT_ULL(0)
>  #define GEN8_PAGE_RW BIT_ULL(1)
> +#define PTE_NULL_PAGEBIT_ULL(9)
>  
>  #define GEN8_PDE_IPS_64K BIT(11)
>  #define GEN8_PDE_PS_2M   BIT(7)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index cb60fc9cf87373..8f61137deb6cef 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -776,6 +776,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>   */
>  #define HAS_FLAT_CCS(i915)   (INTEL_INFO(i915)->has_flat_ccs)
>  
> +#define HAS_NULL_PAGE(dev_priv) (INTEL_INFO(dev_priv)->has_null_page)
> +
>  #define HAS_GT_UC(i915)  (INTEL_INFO(i915)->has_gt_uc)
>  
>  #define HAS_POOLED_EU(i915)  (RUNTIME_INFO(i915)->has_pooled_eu)
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 454467cfa52b9d..aa6e4559b0f0c7 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -642,6 +642,7 @@ static const struct intel_device_info jsl_info = {
>   GEN(12), \
>   TGL_CACHELEVEL, \
>   .has_global_mocs = 1, \
> + .has_null_page = 1, \
>   .has_pxp = 1, \
>   .max_pat_index = 3
>  
> @@ -719,6 +720,7 @@ static const struct intel_device_info adl_p_info = {
>   .has_logical_ring_contexts = 1, \
>   .has_logical_ring_elsq = 1, \
>   .has_mslice_steering = 1, \
> + .has_null_page = 1, \
>   .has_oa_bpc_reporting = 1, \
>   .has_oa_slice_contrib_limits = 1, \
>   .has_oam = 1, \
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
> b/drivers

Re: [Intel-gfx] [PATCH v2 2/4] drm/i915: Reserve some kernel space per vm

2023-10-20 Thread Cavitt, Jonathan
-Original Message-
From: Hajda, Andrzej  
Sent: Friday, October 20, 2023 5:09 AM
To: intel-gfx@lists.freedesktop.org
Cc: Nirmoy Das ; Shyti, Andi 
; Cavitt, Jonathan ; Hajda, 
Andrzej ; Chris Wilson 
Subject: [PATCH v2 2/4] drm/i915: Reserve some kernel space per vm
> 
> Reserve two pages in each vm for kernel space to use for things
> such as workarounds.
> 
> v2: use real memory, do not decrease vm.total


Would XY_FAST_COLOR_BLIT be able to target this memory region?  I'd have to ask 
Chris about this:
He's better versed in this kind of stuff than I am.
But I guess the answer must be "yes" if Chris suggested this change, so I won't 
block on this question.
Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt


> 
> Suggested-by: Chris Wilson 
> Signed-off-by: Andrzej Hajda 
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 38 
>  drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
>  2 files changed, 39 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 84aa29715e0aca..c25e1d4cceeb17 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -5,6 +5,7 @@
>  
>  #include 
>  
> +#include "gem/i915_gem_internal.h"
>  #include "gem/i915_gem_lmem.h"
>  
>  #include "gen8_ppgtt.h"
> @@ -953,6 +954,39 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
>   return ERR_PTR(err);
>  }
>  
> +static int gen8_init_rsvd(struct i915_address_space *vm)
> +{
> + const resource_size_t size = 2 * PAGE_SIZE;
> + struct drm_i915_private *i915 = vm->i915;
> + struct drm_i915_gem_object *obj;
> + struct i915_vma *vma;
> + int ret;
> +
> + obj = i915_gem_object_create_lmem(i915, size,
> +   I915_BO_ALLOC_VOLATILE |
> +   I915_BO_ALLOC_GPU_ONLY);
> + if (IS_ERR(obj))
> + obj = i915_gem_object_create_internal(i915, size);
> + if (IS_ERR(obj))
> + return PTR_ERR(obj);
> +
> + vma = i915_vma_instance(obj, vm, NULL);
> + if (IS_ERR(vma)) {
> + ret = PTR_ERR(vma);
> + goto unref;
> + }
> +
> + ret = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_HIGH);
> + if (ret)
> + goto unref;
> +
> + vm->rsvd = i915_vma_make_unshrinkable(vma);
> +
> +unref:
> + i915_gem_object_put(obj);
> + return ret;
> +}
> +
>  /*
>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP 
> registers
>   * with a net effect resembling a 2-level page table in normal x86 terms. 
> Each
> @@ -1034,6 +1068,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt 
> *gt,
>   if (intel_vgpu_active(gt->i915))
>   gen8_ppgtt_notify_vgt(ppgtt, true);
>  
> + err = gen8_init_rsvd(>vm);
> + if (err)
> + goto err_put;
> +
>   return ppgtt;
>  
>  err_put:
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
> b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 15c71da14d1d27..4a35ef24501b5f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -250,6 +250,7 @@ struct i915_address_space {
>   struct work_struct release_work;
>  
>   struct drm_mm mm;
> + struct i915_vma *rsvd;
>   struct intel_gt *gt;
>   struct drm_i915_private *i915;
>   struct device *dma;
> -- 
> 2.34.1
> 
> 


Re: [Intel-gfx] [PATCH v16 0/7] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-16 Thread Cavitt, Jonathan
-Original Message-
From: Cavitt, Jonathan  
Sent: Monday, October 16, 2023 7:51 AM
To: intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Cavitt, Jonathan 
; chris.p.wil...@linux.intel.com; Iddamsetty, 
Aravind ; Yang, Fei ; Shyti, 
Andi ; Harrison, John C ; Das, 
Nirmoy ; Krzysztofik, Janusz 
; Roper, Matthew D ; 
tvrtko.ursu...@linux.intel.com; jani.nik...@linux.intel.com
Subject: [PATCH v16 0/7] drm/i915: Define and use GuC and CTB TLB invalidation 
routines
> 
> Implement GuC-based TLB invalidations and use them on MTL.
> 
> Some complexity in the implementation was introduced early on
> and will be required for range-based TLB invalidations.
> RFC: https://patchwork.freedesktop.org/series/124922/
> 
> v2:
> - Add missing supporting patches.
> 
> v3:
> - Split suspend/resume changes and multi-gt support into separate
>   patches.
> - Only perform GuC TLB invalidation functions when supported.
> - Move intel_guc_is_enabled check function to usage location.
> - Address comments.
> 
> v4:
> - Change conditions for GuC-based tlb invalidation support
>   to a pci tag that's only active for MTL.
> - Address some FIXMEs and formatting issues.
> - Move suspend/resume changes to helper functions in intel_gt.h
> - Improve comment for ct_handle_event change.
> - Use cleaner if-else conditions.
> - Address comments.
> 
> v5:
> - Reintroduce missing change to selftest msleep duration
> - Move suspend/resume loops from intel_gt.h to intel_tlb.c,
>   making them no longer static inlines.
> - Remove superfluous blocking and error checks.
> - Move ct_handle_event exception to general case in
>   ct_process_request.
> - Explain usage of xa_alloc_cyclic_irq.
> - Modify explanation of purpose of
>   OUTSTANDING_GUC_TIMEOUT_PERIOD macro.
> - Explain purpose of performing tlb invalidation twice in
>   intel_gt_tlb_resume_all.
> 
> v6:
> - Add this cover letter.
> - Fix explanation of purpose of
>   OUTSTANDING_GUC_TIMEOUT_PERIOD macro again.
> - s/pci tags/pci flags
> - Enable GuC TLB Invalidations separately from adding the
>   flags to do so.
> 
> v7:
> - Eliminate pci terminology from patches.
> - Order new device info flag correctly.
> - Run gen8_ggtt_invalidate in more cases, specifically when
>   GuC-based TLB invalidation is not supported.
> - Use intel_uncore_write_fw instead of intel_uncore_write
>   during guc_ggtt_invalidate.
> - Remove duplicate request message clear in ct_process_request.
> - Remove faulty tag from series.
> 
> v8:
> - Simplify cover letter contents.
> - Fix miscellaneous formatting and typos.
> - Reorder device info flags and defines.
> - Reword commit message.
> - Rename TLB invalidation enums and functions.
> - Add comments explaining confusing points.
> - Add helper function getting expected delay of CT buffer.
> - Simplify intel_guc_tlb_invalidation_done by passing computed
>   values.
> - Remove helper functions for tlb suspend and resume.
> - Move tlb suspend and resume paths to uc.
> - Split suspend/resume and wedged into two patches.
> - Clarify purpose of sleep change in tlb selftest.
> 
> v9:
> - Explain complexity of GuC TLB invalidations as required for
>   range-based TLB invalidations, which will be platformed later.
> - Fix CHECKPATCH issues.
> - Explain intel_guc_is_ready tlb invalidation skip in
>   intel_gt_invalidate_tlb_full.
> - Reword comment for unlocked xa_for_each loop in
>   intel_guc_submission_reset.
> - Report all errors in init_tlb_lookup.
> - Remove debug message from fini_tlb_lookup.
> - Use standardized interface for
>   intel_guc_tlb_invalidation_done
> - Remove spurious changes.
> - Move wake_up_all_tlb_invalidate on wedge to correct patch.
> 
> v10:
> - Add lock to tlb_lookup on guc submission reset.
> - Add comment about why timeout increased from 10 ms to 20 ms
>   by default in gt_tlb selftest.
> - Remove spurious changes.
> 
> v11:
> - Update CT size delay helper to be clearer.
> - Reorder some function declarations.
> - Clarify some comments.
> - Produce error message if attempting to free a busy wait
>   during fini_tlb_lookup.
> - Revert default sleep back to 10 ms.
> - Link to RFC.
> 
> v12:
> - Add helper for checking if GuC TLB invalidation is
>   supported and guc is ready.
> - Prevent suspend/resume actions involving GuC TLB
>   invalidations if guc is not ready.
> - Add path for INTEL_GUC_ACTION_TLB_INVALIDATION_DONE
>   to immediately process in ct_process_request after
>   it is submitted to ct_handle_event.
> 
> v13:
> - Re-add error check in intel_guc_tlb_invalidation_done
>   for invalid length.
> - Remove intel_guc_is_ready requirement from
>   wake_up_all_tlb_invalidate.
> - Alig

Re: [Intel-gfx] [PATCH v13 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-13 Thread Cavitt, Jonathan
-Original Message-
From: Harrison, John C  
Sent: Thursday, October 12, 2023 6:11 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; chris.p.wil...@linux.intel.com; 
Iddamsetty, Aravind ; Yang, Fei 
; Shyti, Andi ; Das, Nirmoy 
; Krzysztofik, Janusz ; 
Roper, Matthew D ; tvrtko.ursu...@linux.intel.com; 
jani.nik...@linux.intel.com
Subject: Re: [PATCH v13 3/7] drm/i915: Define and use GuC and CTB TLB 
invalidation routines
> 
> On 10/12/2023 15:38, Jonathan Cavitt wrote:
> > From: Prathap Kumar Valsan 
> >
> > The GuC firmware had defined the interface for Translation Look-Aside
> > Buffer (TLB) invalidation.  We should use this interface when
> > invalidating the engine and GuC TLBs.
> > Add additional functionality to intel_gt_invalidate_tlb, invalidating
> > the GuC TLBs and falling back to GT invalidation when the GuC is
> > disabled.
> > The invalidation is done by sending a request directly to the GuC
> > tlb_lookup that invalidates the table.  The invalidation is submitted as
> > a wait request and is performed in the CT event handler.  This means we
> > cannot perform this TLB invalidation path if the CT is not enabled.
> > If the request isn't fulfilled in two seconds, this would constitute
> > an error in the invalidation as that would constitute either a lost
> > request or a severe GuC overload.
> >
> > With this new invalidation routine, we can perform GuC-based GGTT
> > invalidations.  GuC-based GGTT invalidation is incompatible with
> > MMIO invalidation so we should not perform MMIO invalidation when
> > GuC-based GGTT invalidation is expected.
> >
> > The additional complexity incurred in this patch will be necessary for
> > range-based tlb invalidations, which will be platformed in the future.
> >
> > Signed-off-by: Prathap Kumar Valsan 
> > Signed-off-by: Bruce Chang 
> > Signed-off-by: Chris Wilson 
> > Signed-off-by: Umesh Nerlige Ramappa 
> > Signed-off-by: Jonathan Cavitt 
> > Signed-off-by: Aravind Iddamsetty 
> > Signed-off-by: Fei Yang 
> > CC: Andi Shyti 
> > Reviewed-by: Andi Shyti 
> > Acked-by: Tvrtko Ursulin 
> > Acked-by: Nirmoy Das 
> > Reviewed-by: John Harrison 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_ggtt.c  |  33 ++-
> >   drivers/gpu/drm/i915/gt/intel_tlb.c   |  16 +-
> >   .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  33 +++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  22 ++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  11 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   1 +
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 195 +-
> >   7 files changed, 299 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> > b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index 4d7d88b92632b..7d145b2d3cb17 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -206,22 +206,37 @@ static void gen8_ggtt_invalidate(struct i915_ggtt 
> > *ggtt)
> > intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
> >   }
> >   
> > +static void guc_ggtt_ct_invalidate(struct intel_gt *gt)
> > +{
> > +   struct intel_uncore *uncore = gt->uncore;
> > +   intel_wakeref_t wakeref;
> > +
> > +   with_intel_runtime_pm_if_active(uncore->rpm, wakeref) {
> > +   struct intel_guc *guc = >uc.guc;
> > +
> > +   intel_guc_invalidate_tlb_guc(guc);
> > +   }
> > +}
> > +
> >   static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
> >   {
> > struct drm_i915_private *i915 = ggtt->vm.i915;
> > +   struct intel_gt *gt;
> >   
> > -   gen8_ggtt_invalidate(ggtt);
> > -
> > -   if (GRAPHICS_VER(i915) >= 12) {
> > -   struct intel_gt *gt;
> > +   if (!HAS_GUC_TLB_INVALIDATION(i915))
> > +   gen8_ggtt_invalidate(ggtt);
> This has not changed? As per comments from Matthew Roper and Nirmoy Das, 
> there needs to be a fixup patch first to stop gen8_ggtt_invalidate() 
> from being called on invalid platforms.


Given the sounds of things, it seems like this change here is irrelevant to 
this patch series, as the reason we're
guarding against gen8_ggtt_invalidate isn't related to GuC-based TLB 
invalidations at all.  Ergo, it would actually
make more sense for me to not skip it here and leave the respective guard 
change to a different patch series.
-Jonathan Cavitt


> 
> >   
> > -   list_for_each_entry(gt, >gt_list, ggt

Re: [Intel-gfx] [PATCH v13 4/7] drm/i915: No TLB invalidation on suspended GT

2023-10-13 Thread Cavitt, Jonathan
-Original Message-
From: Harrison, John C  
Sent: Thursday, October 12, 2023 6:08 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; chris.p.wil...@linux.intel.com; 
Iddamsetty, Aravind ; Yang, Fei 
; Shyti, Andi ; Das, Nirmoy 
; Krzysztofik, Janusz ; 
Roper, Matthew D ; tvrtko.ursu...@linux.intel.com; 
jani.nik...@linux.intel.com
Subject: Re: [PATCH v13 4/7] drm/i915: No TLB invalidation on suspended GT
> 
> On 10/12/2023 15:38, Jonathan Cavitt wrote:
> > In case of GT is suspended, don't allow submission of new TLB invalidation
> > request and cancel all pending requests. The TLB entries will be
> > invalidated either during GuC reload or on system resume.
> >
> > Signed-off-by: Fei Yang 
> > Signed-off-by: Jonathan Cavitt 
> > CC: John Harrison 
> > Reviewed-by: Andi Shyti 
> > Acked-by: Tvrtko Ursulin 
> > Acked-by: Nirmoy Das 
> > ---
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  1 +
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 22 ---
> >   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  7 ++
> >   3 files changed, 22 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index 0949628d69f8b..2b6dfe62c8f2a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -537,4 +537,5 @@ int intel_guc_invalidate_tlb_engines(struct intel_guc 
> > *guc);
> >   int intel_guc_invalidate_tlb_guc(struct intel_guc *guc);
> >   int intel_guc_tlb_invalidation_done(struct intel_guc *guc,
> > const u32 *payload, u32 len);
> > +void wake_up_all_tlb_invalidate(struct intel_guc *guc);
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 1377398afcdfa..3a0d20064878a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -1796,13 +1796,24 @@ static void __guc_reset_context(struct 
> > intel_context *ce, intel_engine_mask_t st
> > intel_context_put(parent);
> >   }
> >   
> > -void intel_guc_submission_reset(struct intel_guc *guc, intel_engine_mask_t 
> > stalled)
> > +void wake_up_all_tlb_invalidate(struct intel_guc *guc)
> >   {
> > struct intel_guc_tlb_wait *wait;
> > +   unsigned long i;
> > +
> > +   if (HAS_GUC_TLB_INVALIDATION(guc_to_gt(guc)->i915)) {
> Why the change from 'if(!is_available) return' to 'if(HAS_) {doStuff}'?


I feel like this question has two parts, so I'll answer them separately:

1. Why HAS_GUC_TLB_INVALIDATION and not intel_guc_tlb_invalidation_is_available?

Wake_up_all_tlb_invalidate is called during the suspend/resume path, 
specifically in the
middle of suspend.  It's required for it to be called here to clean up any 
invalidations left
in the queue during the suspend/resume phase because they are no longer valid 
requests.
However, the suspend/resume phase also resets GuC, so intel_guc_is_ready 
returns false.
In short, using intel_guc_invalidation_is_available was causing us to skip this 
code section
incorrectly, resulting in spurious GuC TLB invalidation timeout errors during 
gt reset.


2. Why use a positive check to perform and not a negative check to skip?

In patch 3, wake_up_all_tlb_invalidate was originally called universally on all 
platforms
during intel_guc_submission_reset, which is incorrect and not how was 
reimplemented here.
I discovered this was the case and retroactively corrected it, as seen below.
Because of how intel_guc_submission_reset is structured, a negative check to 
skip wouldn't
make much sense there, so I used a positive check to perform instead.  This is 
a holdover from
that implementation, and was kept to maintain consistency between patches 3 and 
4.  It's
probably not as big of a deal as I'm imagining, but I think it would be awkward 
if the initial
implementation in intel_guc_submission_reset and the reimplementation in
wake_up_all_tlb_invalidate weren't superficially the same, even if they were 
functionally
equivalent otherwise.


-Jonathan Cavitt


> 
> John.
> 
> > +   xa_lock_irq(>tlb_lookup);
> > +   xa_for_each(>tlb_lookup, i, wait)
> > +   wake_up(>wq);
> > +   xa_unlock_irq(>tlb_lookup);
> > +   }
> > +}
> > +
> > +void intel_guc_submission_reset(struct intel_guc *guc, intel_engine_mask_t 
> > stalled)
> > +{
> > struct intel_context *ce;
> > unsigned long index;
> > unsigned long flags;
> &

Re: [Intel-gfx] [PATCH v12 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-11 Thread Cavitt, Jonathan
-Original Message-
From: Cavitt, Jonathan  
Sent: Wednesday, October 11, 2023 1:53 PM
To: intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Cavitt, Jonathan 
; chris.p.wil...@linux.intel.com; Iddamsetty, 
Aravind ; Yang, Fei ; Shyti, 
Andi ; Harrison, John C ; Das, 
Nirmoy ; Krzysztofik, Janusz 
; Roper, Matthew D ; 
tvrtko.ursu...@linux.intel.com; jani.nik...@linux.intel.com
Subject: [PATCH v12 3/7] drm/i915: Define and use GuC and CTB TLB invalidation 
routines
> 
> From: Prathap Kumar Valsan 
> 
> The GuC firmware had defined the interface for Translation Look-Aside
> Buffer (TLB) invalidation.  We should use this interface when
> invalidating the engine and GuC TLBs.
> Add additional functionality to intel_gt_invalidate_tlb, invalidating
> the GuC TLBs and falling back to GT invalidation when the GuC is
> disabled.
> The invalidation is done by sending a request directly to the GuC
> tlb_lookup that invalidates the table.  The invalidation is submitted as
> a wait request and is performed in the CT event handler.  This means we
> cannot perform this TLB invalidation path if the CT is not enabled.
> If the request isn't fulfilled in two seconds, this would constitute
> an error in the invalidation as that would constitute either a lost
> request or a severe GuC overload.
> 
> With this new invalidation routine, we can perform GuC-based GGTT
> invalidations.  GuC-based GGTT invalidation is incompatible with
> MMIO invalidation so we should not perform MMIO invalidation when
> GuC-based GGTT invalidation is expected.
> 
> The additional complexity incurred in this patch will be necessary for
> range-based tlb invalidations, which will be platformed in the future.
> 
> Signed-off-by: Prathap Kumar Valsan 
> Signed-off-by: Bruce Chang 
> Signed-off-by: Chris Wilson 
> Signed-off-by: Umesh Nerlige Ramappa 
> Signed-off-by: Jonathan Cavitt 
> Signed-off-by: Aravind Iddamsetty 
> Signed-off-by: Fei Yang 
> CC: Andi Shyti 
> Reviewed-by: Andi Shyti 
> Acked-by: Tvrtko Ursulin 
> Acked-by: Nirmoy Das 


Hrmm... It seems in my haste I forgot to include John's RB here.
I guess it's fine: this patch got updated enough that I should request
a re-review of it anyways.
-Jonathan Cavitt


> ---
>  drivers/gpu/drm/i915/gt/intel_ggtt.c  |  33 ++-
>  drivers/gpu/drm/i915/gt/intel_tlb.c   |  16 +-
>  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  33 +++
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  22 ++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  11 +
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   1 +
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 191 +-
>  7 files changed, 295 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 4d7d88b92632b..7d145b2d3cb17 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -206,22 +206,37 @@ static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
>   intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
>  }
>  
> +static void guc_ggtt_ct_invalidate(struct intel_gt *gt)
> +{
> + struct intel_uncore *uncore = gt->uncore;
> + intel_wakeref_t wakeref;
> +
> + with_intel_runtime_pm_if_active(uncore->rpm, wakeref) {
> + struct intel_guc *guc = >uc.guc;
> +
> + intel_guc_invalidate_tlb_guc(guc);
> + }
> +}
> +
>  static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>  {
>   struct drm_i915_private *i915 = ggtt->vm.i915;
> + struct intel_gt *gt;
>  
> - gen8_ggtt_invalidate(ggtt);
> -
> - if (GRAPHICS_VER(i915) >= 12) {
> - struct intel_gt *gt;
> + if (!HAS_GUC_TLB_INVALIDATION(i915))
> + gen8_ggtt_invalidate(ggtt);
>  
> - list_for_each_entry(gt, >gt_list, ggtt_link)
> + list_for_each_entry(gt, >gt_list, ggtt_link) {
> + if (intel_guc_tlb_invalidation_is_available(>uc.guc)) {
> + guc_ggtt_ct_invalidate(gt);
> + } else if (GRAPHICS_VER(i915) >= 12) {
>   intel_uncore_write_fw(gt->uncore,
> GEN12_GUC_TLB_INV_CR,
> GEN12_GUC_TLB_INV_CR_INVALIDATE);
> - } else {
> - intel_uncore_write_fw(ggtt->vm.gt->uncore,
> -   GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> + } else {
> + intel_uncore_write_fw(gt->uncore,
> +   GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> + }
>   }
>  }

Re: [Intel-gfx] [PATCH v10 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-10 Thread Cavitt, Jonathan
-Original Message-
From: Harrison, John C  
Sent: Tuesday, October 10, 2023 2:51 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; chris.p.wil...@linux.intel.com; 
Iddamsetty, Aravind ; Yang, Fei 
; Shyti, Andi ; Das, Nirmoy 
; Krzysztofik, Janusz ; 
Roper, Matthew D ; tvrtko.ursu...@linux.intel.com; 
jani.nik...@linux.intel.com
Subject: Re: [PATCH v10 3/7] drm/i915: Define and use GuC and CTB TLB 
invalidation routines
> 
> On 10/10/2023 08:02, Jonathan Cavitt wrote:
> > From: Prathap Kumar Valsan 
> >
> > The GuC firmware had defined the interface for Translation Look-Aside
> > Buffer (TLB) invalidation.  We should use this interface when
> > invalidating the engine and GuC TLBs.
> > Add additional functionality to intel_gt_invalidate_tlb, invalidating
> > the GuC TLBs and falling back to GT invalidation when the GuC is
> > disabled.
> > The invalidation is done by sending a request directly to the GuC
> > tlb_lookup that invalidates the table.  The invalidation is submitted as
> > a wait request and is performed in the CT event handler.  This means we
> > cannot perform this TLB invalidation path if the CT is not enabled.
> > If the request isn't fulfilled in two seconds, this would constitute
> > an error in the invalidation as that would constitute either a lost
> > request or a severe GuC overload.
> >
> > With this new invalidation routine, we can perform GuC-based GGTT
> > invalidations.  GuC-based GGTT invalidation is incompatible with
> > MMIO invalidation so we should not perform MMIO invalidation when
> > GuC-based GGTT invalidation is expected.
> >
> > The additional complexity incurred in this patch will be necessary for
> > range-based tlb invalidations, which will be platformed in the future.
> >
> > Signed-off-by: Prathap Kumar Valsan 
> > Signed-off-by: Bruce Chang 
> > Signed-off-by: Chris Wilson 
> > Signed-off-by: Umesh Nerlige Ramappa 
> > Signed-off-by: Jonathan Cavitt 
> > Signed-off-by: Aravind Iddamsetty 
> > Signed-off-by: Fei Yang 
> > CC: Andi Shyti 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_ggtt.c  |  34 +++-
> >   drivers/gpu/drm/i915/gt/intel_tlb.c   |  16 +-
> >   .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  33 
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  22 +++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |   4 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   1 +
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 186 +-
> >   7 files changed, 284 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> > b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index 4d7d88b92632b..a1f7bdc602996 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -206,22 +206,38 @@ static void gen8_ggtt_invalidate(struct i915_ggtt 
> > *ggtt)
> > intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
> >   }
> >   
> > +static void guc_ggtt_ct_invalidate(struct intel_gt *gt)
> > +{
> > +   struct intel_uncore *uncore = gt->uncore;
> > +   intel_wakeref_t wakeref;
> > +
> > +   with_intel_runtime_pm_if_active(uncore->rpm, wakeref) {
> > +   struct intel_guc *guc = >uc.guc;
> > +
> > +   intel_guc_invalidate_tlb_guc(guc);
> > +   }
> > +}
> > +
> >   static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
> >   {
> > struct drm_i915_private *i915 = ggtt->vm.i915;
> > +   struct intel_gt *gt;
> >   
> > -   gen8_ggtt_invalidate(ggtt);
> > -
> > -   if (GRAPHICS_VER(i915) >= 12) {
> > -   struct intel_gt *gt;
> > +   if (!HAS_GUC_TLB_INVALIDATION(i915))
> > +   gen8_ggtt_invalidate(ggtt);
> >   
> > -   list_for_each_entry(gt, >gt_list, ggtt_link)
> > +   list_for_each_entry(gt, >gt_list, ggtt_link) {
> > +   if (HAS_GUC_TLB_INVALIDATION(i915) &&
> > +   intel_guc_is_ready(>uc.guc)) {
> > +   guc_ggtt_ct_invalidate(gt);
> > +   } else if (GRAPHICS_VER(i915) >= 12) {
> > intel_uncore_write_fw(gt->uncore,
> >   GEN12_GUC_TLB_INV_CR,
> >   GEN12_GUC_TLB_INV_CR_INVALIDATE);
> > -   } else {
> > -   intel_uncore_write_fw(ggtt->vm.gt->uncore,
> > - GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> > +  

Re: [Intel-gfx] [PATCH v10 4/7] drm/i915: No TLB invalidation on suspended GT

2023-10-10 Thread Cavitt, Jonathan
-Original Message-
From: Harrison, John C  
Sent: Tuesday, October 10, 2023 2:57 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; chris.p.wil...@linux.intel.com; 
Iddamsetty, Aravind ; Yang, Fei 
; Shyti, Andi ; Das, Nirmoy 
; Krzysztofik, Janusz ; 
Roper, Matthew D ; tvrtko.ursu...@linux.intel.com; 
jani.nik...@linux.intel.com
Subject: Re: [PATCH v10 4/7] drm/i915: No TLB invalidation on suspended GT
> 
> On 10/10/2023 08:02, Jonathan Cavitt wrote:
> > In case of GT is suspended, don't allow submission of new TLB invalidation
> > request and cancel all pending requests. The TLB entries will be
> > invalidated either during GuC reload or on system resume.
> >
> > Signed-off-by: Fei Yang 
> > Signed-off-by: Jonathan Cavitt 
> > CC: John Harrison 
> > ---
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  1 +
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> >   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  7 +++
> >   3 files changed, 23 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index 06c44f5c28776..ff7e7b90fd49b 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -536,4 +536,5 @@ void intel_guc_dump_time_info(struct intel_guc *guc, 
> > struct drm_printer *p);
> >   
> >   int intel_guc_sched_disable_gucid_threshold_max(struct intel_guc *guc);
> >   
> > +void wake_up_all_tlb_invalidate(struct intel_guc *guc);
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index e9854652c2b52..b9c168ea57270 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -1796,13 +1796,25 @@ static void __guc_reset_context(struct 
> > intel_context *ce, intel_engine_mask_t st
> > intel_context_put(parent);
> >   }
> >   
> > -void intel_guc_submission_reset(struct intel_guc *guc, intel_engine_mask_t 
> > stalled)
> > +void wake_up_all_tlb_invalidate(struct intel_guc *guc)
> >   {
> > struct intel_guc_tlb_wait *wait;
> > +   unsigned long i;
> > +
> > +   if (!HAS_GUC_TLB_INVALIDATION(guc_to_gt(guc)->i915))
> > +   return;
> > +
> > +   xa_lock_irq(>tlb_lookup);
> > +   xa_for_each(>tlb_lookup, i, wait)
> > +   wake_up(>wq);
> > +   xa_unlock_irq(>tlb_lookup);
> > +}
> > +
> > +void intel_guc_submission_reset(struct intel_guc *guc, intel_engine_mask_t 
> > stalled)
> What is changed on this line? Or is it just diff being confused and 
> seeing the move of the 'wait' declaration as being the anchor point 
> rather than the function declaration?

It's the latter, yes.  Diff is confused.
-Jonathan Cavitt

> 
> John.
> 
> 
> > +{
> > struct intel_context *ce;
> > unsigned long index;
> > unsigned long flags;
> > -   unsigned long i;
> >   
> > if (unlikely(!guc_submission_initialized(guc))) {
> > /* Reset called during driver load? GuC not yet initialised! */
> > @@ -1833,10 +1845,7 @@ void intel_guc_submission_reset(struct intel_guc 
> > *guc, intel_engine_mask_t stall
> >  * The full GT reset will have cleared the TLB caches and flushed the
> >  * G2H message queue; we can release all the blocked waiters.
> >  */
> > -   xa_lock_irq(>tlb_lookup);
> > -   xa_for_each(>tlb_lookup, i, wait)
> > -   wake_up(>wq);
> > -   xa_unlock_irq(>tlb_lookup);
> > +   wake_up_all_tlb_invalidate(guc);
> >   }
> >   
> >   static void guc_cancel_context_requests(struct intel_context *ce)
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 98b103375b7ab..750cb63503dd7 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -688,6 +688,8 @@ void intel_uc_suspend(struct intel_uc *uc)
> > /* flush the GSC worker */
> > intel_gsc_uc_flush_work(>gsc);
> >   
> > +   wake_up_all_tlb_invalidate(guc);
> > +
> > if (!intel_guc_is_ready(guc)) {
> > guc->interrupts.enabled = false;
> > return;
> > @@ -736,6 +738,11 @@ static int __uc_resume(struct intel_uc *uc, bool 
> > enable_communication)
> >   
> > intel_gsc_uc_resume(>gsc);
> >   
> > +   if (HAS_GUC_TLB_INVALIDATION(gt->i915)) {
> > +   intel_guc_invalidate_tlb_engines(guc);
> > +   intel_guc_invalidate_tlb_guc(guc);
> > +   }
> > +
> > return 0;
> >   }
> >   
> 
> 


Re: [Intel-gfx] [PATCH dii-client 2/2] drm/i915: Use selective tlb invalidations where supported

2023-10-10 Thread Cavitt, Jonathan
Ignore this.  It's not a security hole: it's just a temporary patch I was 
using for rebasing purposes that got smuggled into this series
on accident.  It has a bad tag because of some stale gitconfig params
that I've since removed.
-Jonathan Cavitt

-Original Message-
From: Cavitt, Jonathan  
Sent: Tuesday, October 10, 2023 11:44 AM
To: intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Cavitt, Jonathan 
; Das, Nirmoy ; Shyti, Andi 
; tvrtko.ursu...@linux.intel.com; Harrison, John C 

Subject: [PATCH dii-client 2/2] drm/i915: Use selective tlb invalidations where 
supported
> 
> For platforms supporting selective tlb invalidations, we don't need to
> do a full tlb invalidation. Rather do a range based tlb invalidation for
> every unbind of purged vma belongs to an active vm.
> 
> Signed-off-by: Prathap Kumar Valsan 
> Cc: Niranjana Vishwanathapura 
> Cc: Fei Yang 
> Signed-off-by: Mauro Carvalho Chehab 
> Signed-off-by: Jonathan Cavitt 
> ---
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c |  2 +-
>  drivers/gpu/drm/i915/i915_vma.c   | 14 +-
>  drivers/gpu/drm/i915/i915_vma.h   |  3 ++-
>  3 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index d07a4f97b9434..b43dae3cbd59f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -211,7 +211,7 @@ void ppgtt_unbind_vma(struct i915_address_space *vm,
>   return;
>  
>   vm->clear_range(vm, vma_res->start, vma_res->vma_size);
> - vma_invalidate_tlb(vm, vma_res->tlb);
> + vma_invalidate_tlb(vm, vma_res->tlb, vma_res->start, vma_res->vma_size);
>  }
>  
>  static unsigned long pd_count(u64 size, int shift)
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index d09aad34ba37f..cb05d794f0d0f 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -1339,7 +1339,8 @@ I915_SELFTEST_EXPORT int i915_vma_get_pages(struct 
> i915_vma *vma)
>   return err;
>  }
>  
> -void vma_invalidate_tlb(struct i915_address_space *vm, u32 *tlb)
> +void vma_invalidate_tlb(struct i915_address_space *vm, u32 *tlb
> + u64 start, u64 size)
>  {
>   struct intel_gt *gt;
>   int id;
> @@ -1355,9 +1356,11 @@ void vma_invalidate_tlb(struct i915_address_space *vm, 
> u32 *tlb)
>* the most recent TLB invalidation seqno, and if we have not yet
>* flushed the TLBs upon release, perform a full invalidation.
>*/
> - for_each_gt(gt, vm->i915, id)
> - WRITE_ONCE(tlb[id],
> -intel_gt_next_invalidate_tlb_full(gt));
> + for_each_gt(gt, vm->i915, id) {
> + if (!intel_gt_invalidate_tlb_range(gt, start, size))
> + WRITE_ONCE(tlb[id],
> +intel_gt_next_invalidate_tlb_full(gt));
> + }
>  }
>  
>  static void __vma_put_pages(struct i915_vma *vma, unsigned int count)
> @@ -2041,7 +2044,8 @@ struct dma_fence *__i915_vma_evict(struct i915_vma 
> *vma, bool async)
>   dma_fence_put(unbind_fence);
>   unbind_fence = NULL;
>   }
> - vma_invalidate_tlb(vma->vm, vma->obj->mm.tlb);
> + vma_invalidate_tlb(vma->vm, vma->obj->mm.tlb,
> +vma->node.start, vma->size);
>   }
>  
>   /*
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index e356dfb883d34..5a604aad55dfe 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -260,7 +260,8 @@ bool i915_vma_misplaced(const struct i915_vma *vma,
>   u64 size, u64 alignment, u64 flags);
>  void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
>  void i915_vma_revoke_mmap(struct i915_vma *vma);
> -void vma_invalidate_tlb(struct i915_address_space *vm, u32 *tlb);
> +void vma_invalidate_tlb(struct i915_address_space *vm, u32 *tlb,
> + u64 start, u64 size);
>  struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async);
>  int __i915_vma_unbind(struct i915_vma *vma);
>  int __must_check i915_vma_unbind(struct i915_vma *vma);
> -- 
> 2.25.1
> 
> 


Re: [Intel-gfx] [PATCH dii-client v10 1/7] drm/i915: Add GuC TLB Invalidation device info flags

2023-10-10 Thread Cavitt, Jonathan
-Original Message-
From: Cavitt, Jonathan  
Sent: Tuesday, October 10, 2023 8:01 AM
To: intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg ; Cavitt, Jonathan 
; chris.p.wil...@linux.intel.com; Iddamsetty, 
Aravind ; Yang, Fei ; Shyti, 
Andi ; Harrison, John C ; Das, 
Nirmoy ; Krzysztofik, Janusz 
; Roper, Matthew D ; 
tvrtko.ursu...@linux.intel.com; jani.nik...@linux.intel.com
Subject: [PATCH dii-client v10 1/7] drm/i915: Add GuC TLB Invalidation device 
info flags
> 
> Add device info flags for if GuC TLB Invalidation is enabled.

Sorry!  I jumped the gun!  I'll send the proper version shortly!
-Jonathan Cavitt

> 
> Signed-off-by: Jonathan Cavitt 
> ---
>  drivers/gpu/drm/i915/i915_drv.h  | 2 ++
>  drivers/gpu/drm/i915/intel_device_info.h | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index cb60fc9cf8737..6a2a78c61f212 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -794,6 +794,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  #define HAS_GUC_DEPRIVILEGE(i915) \
>   (INTEL_INFO(i915)->has_guc_deprivilege)
>  
> +#define HAS_GUC_TLB_INVALIDATION(i915)   
> (INTEL_INFO(i915)->has_guc_tlb_invalidation)
> +
>  #define HAS_3D_PIPELINE(i915)(INTEL_INFO(i915)->has_3d_pipeline)
>  
>  #define HAS_ONE_EU_PER_FUSE_BIT(i915)
> (INTEL_INFO(i915)->has_one_eu_per_fuse_bit)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
> b/drivers/gpu/drm/i915/intel_device_info.h
> index 39817490b13fd..eba2f0b919c87 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -153,6 +153,7 @@ enum intel_ppgtt_type {
>   func(has_heci_pxp); \
>   func(has_heci_gscfi); \
>   func(has_guc_deprivilege); \
> + func(has_guc_tlb_invalidation); \
>   func(has_l3_ccs_read); \
>   func(has_l3_dpf); \
>   func(has_llc); \
> -- 
> 2.25.1
> 
> 


Re: [Intel-gfx] [PATCH v9 6/7] drm/i915/gt: Increase sleep in gt_tlb selftest sanitycheck

2023-10-10 Thread Cavitt, Jonathan
-Original Message-
From: Andi Shyti  
Sent: Tuesday, October 10, 2023 6:04 AM
To: Tvrtko Ursulin 
Cc: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org; Gupta, saurabhg ; 
yu.bruce.ch...@intel.com; chris.p.wil...@linux.intel.com; Iddamsetty, Aravind 
; Yang, Fei ; Harrison, John 
C ; Das, Nirmoy ; Krzysztofik, 
Janusz ; Roper, Matthew D 
; jani.nik...@linux.intel.com
Subject: Re: [PATCH v9 6/7] drm/i915/gt: Increase sleep in gt_tlb selftest 
sanitycheck
> 
> Hi Tvrtko,
> 
> > On 09/10/2023 18:29, Jonathan Cavitt wrote:
> > > For the gt_tlb live selftest, when operating on the GSC engine,
> > > increase the timeout from 10 ms to 200 ms because the GSC
> > > engine is a bit slower than the rest.
> > 
> > And others from 10ms to 20ms. By accident or deliberate?
> 
> yes, accident :-)


I should've clarified in the patch that this was to resolve a CHECKPATCH error:
-:29: WARNING:MSLEEP: msleep < 20ms can sleep for up to 20ms; see 
Documentation/timers/timers-howto.rst
But I wasn't sure if we were allowed to talk about such things on the mailing 
list.
-Jonathan Cavitt


> 
> Andi
> 
> > 
> > > Signed-off-by: Jonathan Cavitt 
> > > ---
> > >   drivers/gpu/drm/i915/gt/selftest_tlb.c | 11 +--
> > >   1 file changed, 9 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/selftest_tlb.c 
> > > b/drivers/gpu/drm/i915/gt/selftest_tlb.c
> > > index 7e41f69fc818f..24beb94aa7a37 100644
> > > --- a/drivers/gpu/drm/i915/gt/selftest_tlb.c
> > > +++ b/drivers/gpu/drm/i915/gt/selftest_tlb.c
> > > @@ -136,8 +136,15 @@ pte_tlbinv(struct intel_context *ce,
> > >   i915_request_get(rq);
> > >   i915_request_add(rq);
> > > - /* Short sleep to sanitycheck the batch is spinning before we begin */
> > > - msleep(10);
> > > + /*
> > > +  * Short sleep to sanitycheck the batch is spinning before we begin.
> > > +  * FIXME: Why is GSC so slow?
> > > +  */
> > > + if (ce->engine->class == OTHER_CLASS)
> > > + msleep(200);
> > > + else
> > > + msleep(20);
> > > +
> > >   if (va == vb) {
> > >   if (!i915_request_completed(rq)) {
> > >   pr_err("%s(%s): Semaphore sanitycheck failed 
> > > %llx, with alignment %llx, using PTE size %x (phys %x, sg %x)\n",
> 


Re: [Intel-gfx] [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-09 Thread Cavitt, Jonathan
-Original Message-
From: Tvrtko Ursulin  
Sent: Monday, October 9, 2023 1:57 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; Chang, Yu bruce 
; chris.p.wil...@linux.intel.com; Iddamsetty, Aravind 
; Yang, Fei ; Shyti, Andi 
; Harrison, John C ; Das, 
Nirmoy ; Krzysztofik, Janusz 
; Roper, Matthew D ; 
jani.nik...@linux.intel.com
Subject: Re: [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB 
invalidation routines
> 
> 
> On 06/10/2023 19:20, Jonathan Cavitt wrote:
> > From: Prathap Kumar Valsan 
> > 
> > The GuC firmware had defined the interface for Translation Look-Aside
> > Buffer (TLB) invalidation.  We should use this interface when
> > invalidating the engine and GuC TLBs.
> > Add additional functionality to intel_gt_invalidate_tlb, invalidating
> > the GuC TLBs and falling back to GT invalidation when the GuC is
> > disabled.
> > The invalidation is done by sending a request directly to the GuC
> > tlb_lookup that invalidates the table.  The invalidation is submitted as
> > a wait request and is performed in the CT event handler.  This means we
> > cannot perform this TLB invalidation path if the CT is not enabled.
> > If the request isn't fulfilled in two seconds, this would constitute
> > an error in the invalidation as that would constitute either a lost
> > request or a severe GuC overload.
> > 
> > With this new invalidation routine, we can perform GuC-based GGTT
> > invalidations.  GuC-based GGTT invalidation is incompatible with
> > MMIO invalidation so we should not perform MMIO invalidation when
> > GuC-based GGTT invalidation is expected.
> > 
> > Purpose of xarray:
> > The tlb_lookup table is allocated as an xarray because the set of
> > pending TLB invalidations may have no upper bound.  The consequence of
> > this is that all actions interfacing with this table need to use the
> > xarray functions, such as xa_alloc_cyclic_irq for array insertion.
> > 
> > Purpose of must_wait_woken:
> > Our wait for the G2H ack for the completion of a TLB invalidation is
> > mandatory; we must wait for the HW to confirm that the physical
> > addresses are no longer accessible before we return those to the system.
> > 
> > On switching to using the wait_woken() convenience routine, we
> > introduced ourselves to an issue where wait_woken() may complete early
> > under a kthread that is stopped. Since we send a TLB invalidation when
> > we try to release pages from the shrinker, we can be called from any
> > process; including kthreads.
> > 
> > Using wait_woken() from any process context causes another issue. The
> > use of is_kthread_should_stop() assumes that any task with PF_KTHREAD
> > set was made by kthread_create() and has called set_kthread_struct().
> > This is not true for the raw kernel_thread():
> 
> This explanation misses the main point of my ask - which is to explain 
> why a simpler scheme isn't sufficient. Simpler scheme aka not needed the 
> xarray or any flavour of wait_token().
> 
> In other words it is obvious we have to wait for the invalidation ack, 
> but not obvious why we need a complicated scheme.


Okay.  I'll remove these chunks and explain that the complexity is required
for range-based TLB invalidaitons, which will land upstream eventually.


> 
> > BUG: kernel NULL pointer dereference, address: 
> > [ 3089.759660] Call Trace:
> > [ 3089.762110]  wait_woken+0x4f/0x80
> > [ 3089.765496]  guc_send_invalidate_tlb+0x1fe/0x310 [i915]
> > [ 3089.770725]  ? syscall_return_via_sysret+0xf/0x7f
> > [ 3089.775426]  ? do_wait_intr_irq+0xb0/0xb0
> > [ 3089.779430]  ? __switch_to_asm+0x40/0x70
> > [ 3089.783349]  ? __switch_to_asm+0x34/0x70
> > [ 3089.787273]  ? __switch_to+0x7a/0x3e0
> > [ 3089.790930]  ? __switch_to_asm+0x34/0x70
> > [ 3089.794883]  intel_guc_invalidate_tlb_full+0x92/0xa0 [i915]
> > [ 3089.800487]  intel_invalidate_tlb_full+0x94/0x190 [i915]
> > [ 3089.805824]  intel_invalidate_tlb_full_sync+0x1b/0x30 [i915]
> > [ 3089.811508]  __i915_gem_object_unset_pages+0x138/0x150 [i915]
> > [ 3089.817279]  __i915_gem_object_put_pages+0x25/0x90 [i915]
> > [ 3089.822706]  i915_gem_shrink+0x532/0x7e0 [i915]
> > [ 3089.827264]  i915_gem_shrinker_scan+0x3d/0xd0 [i915]
> > [ 3089.832230]  do_shrink_slab+0x12c/0x2a0
> > [ 3089.836065]  shrink_slab+0xad/0x2b0
> > [ 3089.839550]  shrink_node+0xcc/0x410
> > [ 3089.843035]  do_try_to_free_pages+0xc6/0x380
> > [ 3089.847306]  try_to_free_pages+0xec/0x1c0
> > [ 3089.851312]  __alloc_pages_slowpath+0x3ad/0xd10
> > [ 3089.855845]  ? update_sd_lb_stats+0x636/0x710

Re: [Intel-gfx] [PATCH v7 2/5] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-06 Thread Cavitt, Jonathan
As far as I can tell, most if not all of the below comments
have now been addressed in version 8.  Please check to
verify this is correct.
-Jonathan Cavitt

-Original Message-
From: Tvrtko Ursulin  
Sent: Friday, October 6, 2023 6:05 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; Chang, Yu bruce 
; chris.p.wil...@linux.intel.com; Iddamsetty, Aravind 
; Yang, Fei ; Shyti, Andi 
; Harrison, John C ; Das, 
Nirmoy ; Krzysztofik, Janusz 
; Roper, Matthew D ; 
jani.nik...@linux.intel.com
Subject: Re: [PATCH v7 2/5] drm/i915: Define and use GuC and CTB TLB 
invalidation routines
> 
> 
> On 06/10/2023 11:11, Tvrtko Ursulin wrote:
> > 
> > Hi,
> > 
> > 
> > Andi asked me to summarize what I think is unaddressed review feedback 
> > so far in order to consolidate and enable hopefully things to move 
> > forward. So I will try to re-iterate the comments and questions below.
> > 
> > But also note that there is a bunch of new valid comments from John 
> > against v7 which I will not repeat.
> > 
> > On 05/10/2023 20:35, Jonathan Cavitt wrote:
> >> From: Prathap Kumar Valsan 
> >>
> >> The GuC firmware had defined the interface for Translation Look-Aside
> >> Buffer (TLB) invalidation.  We should use this interface when
> >> invalidating the engine and GuC TLBs.
> >> Add additional functionality to intel_gt_invalidate_tlb, invalidating
> >> the GuC TLBs and falling back to GT invalidation when the GuC is
> >> disabled.
> >> The invalidation is done by sending a request directly to the GuC
> >> tlb_lookup that invalidates the table.  The invalidation is submitted as
> >> a wait request and is performed in the CT event handler.  This means we
> >> cannot perform this TLB invalidation path if the CT is not enabled.
> >> If the request isn't fulfilled in two seconds, this would constitute
> >> an error in the invalidation as that would constitute either a lost
> >> request or a severe GuC overload.
> >> The tlb_lookup table is allocated as an xarray because the set of
> >> pending TLB invalidations may have no upper bound.  The consequence of
> >> this is that all actions interfacing with this table need to use the
> >> xarray functions, such as xa_alloc_cyclic_irq for array insertion.
> >>
> >> With this new invalidation routine, we can perform GuC-based GGTT
> >> invalidations.  GuC-based GGTT invalidation is incompatible with
> >> MMIO invalidation so we should not perform MMIO invalidation when
> >> GuC-based GGTT invalidation is expected.
> > 
> > On the commit message, I was asking that it describes the justification 
> > for the complexity patch adds with the wait queue management. It is 
> > non-trivial code, open-coded-almost-copy-of wait_token(), etc, so it 
> > needs explanation.
> > 
> > Today we have all threads serialize their invalidation under 
> > gt->tlb.invalidate_lock. With this patch that remains, but it allows a 
> > little bit of de-serialization in waiting. I suspect this is because 
> > with mmio i915 has direct access to invalidation, where with GuC the 
> > requests are competing for latency with other CT requests too (not 
> > invalidations).
> > 
> > Simpler patch could be doing the same as the GFP_ATOMIC fallback path in 
> > guc_send_invalidate_tlb - ie. serialize it all against one CT 
> > invalidation "slot". Are the gains of allowing multiple wait slots 
> > significant enough to warrant the complexity etc needs to be documented 
> > and the above problem space explained in the commit message.
> 
> Also, any gains from the invidual waiters are limited to 
> ggtt->invalidate() callers right? Because the put_pages invalidations 
> are serialized at the top-level in intel_gt_invalidate_tlb_full() anyway.
> 
> And how frequent or relevant are ggtt invalidations at runtime? It is 
> just context creation and such, no (rings, contexts state)? Are page 
> flips / framebuffers relevant too?
> 
> Question is whether a simpler scheme, with a single wait queue (no open 
> coded wait_token(), xarray etc) and just waking up all waiters on 
> processing CT done, where each waiters check the seqno and goes back to 
> sleep if it's invalidation hasn't been completed yet, would be sufficient.
> 
> Regards,
> 
> Tvrtko
> 
> > 
> >> Signed-off-by: Prathap Kumar Valsan 
> >> Signed-off-by: Bruce Chang 
> >> Signed-off-by: Chris Wilson 
> >> Signed-off-by: Umesh Nerlige Ramappa 
> >> Signed-off-by: Jonathan Cavitt 
> >> Signed-off-by: Aravind I

Re: [Intel-gfx] [PATCH v5 2/4] drm/i915: Define and use GuC and CTB TLB invalidation routines

2023-10-05 Thread Cavitt, Jonathan
-Original Message-
From: Tvrtko Ursulin  
Sent: Thursday, October 5, 2023 1:55 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; Chang, Yu bruce 
; chris.p.wil...@linux.intel.com; Iddamsetty, Aravind 
; Yang, Fei ; Shyti, Andi 
; Harrison, John C ; Das, 
Nirmoy ; Krzysztofik, Janusz 
; Roper, Matthew D 
Subject: Re: [PATCH v5 2/4] drm/i915: Define and use GuC and CTB TLB 
invalidation routines
> 
> 
> On 04/10/2023 19:36, Jonathan Cavitt wrote:
> > From: Prathap Kumar Valsan 
> > 
> > The GuC firmware had defined the interface for Translation Look-Aside
> > Buffer (TLB) invalidation.  We should use this interface when
> > invalidating the engine and GuC TLBs.
> > Add additional functionality to intel_gt_invalidate_tlb, invalidating
> > the GuC TLBs and falling back to GT invalidation when the GuC is
> > disabled.
> > The invalidation is done by sending a request directly to the GuC
> > tlb_lookup that invalidates the table.  The invalidation is submitted as
> > a wait request and is performed in the CT event handler.  This means we
> > cannot perform this TLB invalidation path if the CT is not enabled.
> > If the request isn't fulfilled in two seconds, this would constitute
> > an error in the invalidation as that would constitute either a lost
> > request or a severe GuC overload.
> > The tlb_lookup table is allocated as an xarray because the set of
> > pending TLB invalidations may have no upper bound.  The consequence of
> > this is that all actions interfacting with this table need to use the
> > xarray functions, such as xa_alloc_cyclic_irq for array insertion.
> > 
> > With this new invalidation routine, we can perform GuC-based GGTT
> > invalidations.  GuC-based GGTT invalidation is incompatible with
> > MMIO invalidation so we should not perform MMIO invalidation when
> > GuC-based GGTT invalidation is expected.
> 
> This respun (series) includes changes some but not all points that I 
> have raised, or at least asked about in the last round. There were no 
> replies, to explain either that I was wrong, or whatever, against the 
> points left un-modified. Some of those I raised more than once.
> 
> And while I appreciate addressing review feedback and quick respins, I 
> don't enjoy the silence on the unaddressed parts since that is not how 
> the review should work. Therefore I will not be reviewing this version 
> until all review comments are dealt with one way or the other.


I went back through your previous revision notes just now.  AFAICT, the only
revision note I did not address in this patch series pertains to the necessity 
for
if (!intel_guc_ct_enabled(>ct)) in guc_send_invalidate_tlb.  However, I
cannot see where this is guarded against in the code prior, so it still appears
necessary.  If this is incorrect, please inform me of what I may have missed.

I understand it's difficult to see what's changed between revisions, and I
recognize that was my fault for not including a cover letter: I was under the
impression I was to not include cover letters in upstream patches unless I
had some additional information to provide, but the way I write cover letters
gives the impression that this is not the case.  I added a cover letter to 
version
six that retroactively applies version change notes to all prior revisions.   
This
may have been missed because I accidentally left the dii-client tag (and a stale
patch duplicate patch) in the series, so it may have been seen as a series sent
to the wrong email address.  I'll aim to remove the tag for version seven.
-Jonathan Cavitt


> 
> Regards,
> 
> Tvrtko
> 
> > Signed-off-by: Prathap Kumar Valsan 
> > Signed-off-by: Bruce Chang 
> > Signed-off-by: Chris Wilson 
> > Signed-off-by: Umesh Nerlige Ramappa 
> > Signed-off-by: Jonathan Cavitt 
> > Signed-off-by: Aravind Iddamsetty 
> > Signed-off-by: Fei Yang 
> > CC: Andi Shyti 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_ggtt.c  |  42 ++--
> >   drivers/gpu/drm/i915/gt/intel_tlb.c   |  14 +-
> >   .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  33 +++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  22 ++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |   6 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   1 +
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 211 +-
> >   7 files changed, 313 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> > b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index 4d7d88b92632b..be8c216d29f91 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -2

Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add WABB blit for Wa_16018031267 / Wa_16018063123

2023-08-24 Thread Cavitt, Jonathan
-Original Message-
From: Andi Shyti  
Sent: Thursday, August 24, 2023 7:54 AM
To: Cavitt, Jonathan 
Cc: intel-gfx@lists.freedesktop.org; Mistat, Tomasz ; 
Vivi, Rodrigo ; Germano, Gregory F 
; Roper, Matthew D ; 
Das, Nirmoy 
Subject: Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add WABB blit for 
Wa_16018031267 / Wa_16018063123
> 
> Hi Jonathan,
> 
> few little things...
> 
> On Wed, Aug 23, 2023 at 11:51:03AM -0700, Jonathan Cavitt wrote:
> > From: Nirmoy Das 
> > 
> > Apply WABB blit for Wa_16018031267 / Wa_16018063123.
> > Additionally, update the lrc selftest to exercise the new
> > WABB changes.
> > 
> > Signed-off-by: Jonathan Cavitt 
> > Co-developed-by: Nirmoy Das 
> 
> As the sender of this patch, your SoB should be last and you also
> need to add Nirmoy's SoB above yours.
> 
> (Tags should be added in chronological order)
> 
> [...]
> 
> > +static u32 *
> > +xehp_emit_per_ctx_bb(const struct intel_context *ce, u32 *cs)
> > +{
> > +   /* Wa_16018031267, Wa_16018063123 */
> > +   if (ce->engine->class == COPY_ENGINE_CLASS &&
> > +   NEEDS_FASTCOLOR_BLT_WABB(ce->engine->i915))
> > +   cs = xehp_emit_fastcolor_blt_wabb(ce, cs);
> 
> I thought the trend was to have things like:
> 
>   ..._needs_wa_16018031267()
> 
> But we don't have a unified system yet
> 
> > +   return cs;
> > +}
> > +
> > +
> 
> two blank lines here
> 
> > +static void
> > +setup_per_ctx_bb(const struct intel_context *ce,
> > +const struct intel_engine_cs *engine,
> > +u32 *(*emit)(const struct intel_context *, u32 *))
> > +{
> 
> [...]
> 
> >  static u32 *
> > -emit_indirect_ctx_bb_canary(const struct intel_context *ce, u32 *cs)
> > +emit_wabb_ctx_canary(const struct intel_context *ce,
> > +   u32 *cs, bool per_ctx)
> 
> just a little alignment issue here.
> 
> >  {
> 
> [...]
> 
> Are the failures from CI coming from this series?


Yes.  This series has several failures associated with it, such as a module 
load failure for
DG2/ATSM, and several failures in the new live_lrc_per_ctx_bb selftest.  I'm 
not certain
what's causing either set of failures, to be honest, so if you have any 
guidance on what
might be incorrect in the way I'm setting up the PER_CTX_BB, I'd be willing to 
try just
about anything.
-Jonathan Cavitt


> 
> Andi
> 


Re: [Intel-gfx] [CI] drm/i915/gt: Refactor hangcheck selftest to use igt_spinner

2023-08-21 Thread Cavitt, Jonathan
-Original Message-
From: Andi Shyti  
Sent: Saturday, August 19, 2023 3:50 PM
To: Cavitt, Jonathan 
Cc: intel-gfx 
Subject: [CI] drm/i915/gt: Refactor hangcheck selftest to use igt_spinner
> 
> From: Jonathan Cavitt 
> 
> The hangcheck live selftest contains duplicate declarations of some
> functions that already exist in igt_spinner.c, such as the creation and
> deconstruction of a spinning batch buffer (spinner) that hangs an engine.
> It's undesireable to have such code duplicated, as the requirements for
> the spinner may change with hardware updates, necessitating both
> execution paths be updated.  To avoid this, have the hangcheck live
> selftest use the declaration from igt_spinner.  This eliminates the need
> for the declarations in the selftest itself, as well as the associated
> local helper structures, so we can erase those.
> 
> Suggested-by: Matt Roper 
> Signed-off-by: Jonathan Cavitt 


Test fails with -62 (ETIME) on intel_selftest_wait_for_rq
It looks like the cause might be me calling intel_context_put too early?
Let me move those calls to later and see if that helps.
Give me some time to implement those changes.
-Jonathan Cavitt


> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 457 ++-
>  drivers/gpu/drm/i915/selftests/igt_spinner.c |  15 +-
>  drivers/gpu/drm/i915/selftests/igt_spinner.h |   9 +
>  3 files changed, 155 insertions(+), 326 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 0dd4d00ee894e..36376a4ade8e4 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -29,281 +29,40 @@
>  
>  #define IGT_IDLE_TIMEOUT 50 /* ms; time to wait after flushing between tests 
> */
>  
> -struct hang {
> - struct intel_gt *gt;
> - struct drm_i915_gem_object *hws;
> - struct drm_i915_gem_object *obj;
> - struct i915_gem_context *ctx;
> - u32 *seqno;
> - u32 *batch;
> -};
> -
> -static int hang_init(struct hang *h, struct intel_gt *gt)
> -{
> - void *vaddr;
> - int err;
> -
> - memset(h, 0, sizeof(*h));
> - h->gt = gt;
> -
> - h->ctx = kernel_context(gt->i915, NULL);
> - if (IS_ERR(h->ctx))
> - return PTR_ERR(h->ctx);
> -
> - GEM_BUG_ON(i915_gem_context_is_bannable(h->ctx));
> -
> - h->hws = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
> - if (IS_ERR(h->hws)) {
> - err = PTR_ERR(h->hws);
> - goto err_ctx;
> - }
> -
> - h->obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
> - if (IS_ERR(h->obj)) {
> - err = PTR_ERR(h->obj);
> - goto err_hws;
> - }
> -
> - i915_gem_object_set_cache_coherency(h->hws, I915_CACHE_LLC);
> - vaddr = i915_gem_object_pin_map_unlocked(h->hws, I915_MAP_WB);
> - if (IS_ERR(vaddr)) {
> - err = PTR_ERR(vaddr);
> - goto err_obj;
> - }
> - h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> -
> - vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> -  intel_gt_coherent_map_type(gt, 
> h->obj, false));
> - if (IS_ERR(vaddr)) {
> - err = PTR_ERR(vaddr);
> - goto err_unpin_hws;
> - }
> - h->batch = vaddr;
> -
> - return 0;
> -
> -err_unpin_hws:
> - i915_gem_object_unpin_map(h->hws);
> -err_obj:
> - i915_gem_object_put(h->obj);
> -err_hws:
> - i915_gem_object_put(h->hws);
> -err_ctx:
> - kernel_context_close(h->ctx);
> - return err;
> -}
> -
> -static u64 hws_address(const struct i915_vma *hws,
> -const struct i915_request *rq)
> -{
> - return i915_vma_offset(hws) +
> -offset_in_page(sizeof(u32) * rq->fence.context);
> -}
> -
> -static struct i915_request *
> -hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> -{
> - struct intel_gt *gt = h->gt;
> - struct i915_address_space *vm = i915_gem_context_get_eb_vm(h->ctx);
> - struct drm_i915_gem_object *obj;
> - struct i915_request *rq = NULL;
> - struct i915_vma *hws, *vma;
> - unsigned int flags;
> - void *vaddr;
> - u32 *batch;
> - int err;
> -
> - obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
> - if (IS_ERR(obj)) {
> - i915_vm_put(vm);
> - return ERR_CAST(obj);
> - }
> -
> - vaddr = i915_gem_object_pin_map_unlocked(obj, 
> intel_gt_coherent_map_type(gt

Re: [Intel-gfx] [PATCH 1/2] drm/i915/selftests: Align igt_spinner_create_request with hangcheck

2023-08-16 Thread Cavitt, Jonathan
-Original Message-
From: Roper, Matthew D  
Sent: Tuesday, August 15, 2023 12:29 PM
To: Cavitt, Jonathan 
Cc: intel-gfx@lists.freedesktop.org; Dutt, Sudeep ; 
chris.p.wil...@linux.intel.com; Gupta, saurabhg ; 
Shyti, Andi ; Summers, Stuart ; 
Das, Nirmoy ; Belgaumkar, Vinay 
; Winiarski, Michal 
Subject: Re: [PATCH 1/2] drm/i915/selftests: Align igt_spinner_create_request 
with hangcheck
> 
> On Tue, Aug 15, 2023 at 09:53:44AM -0700, Jonathan Cavitt wrote:
> > Align igt_spinner_create_request with the hang_create_request
> > implementation in selftest_hangcheck.c.
> > 
> > Signed-off-by: Jonathan Cavitt 
> 
> Reviewed-by: Matt Roper 
> 
> 
> For the second patch in the series, the general direction looks good to
> me, but I'm not familiar enough with the spinner implementation and
> context handling to do a detailed review there.  Hopefully someone more
> familiar with that code can take a look.
> 


Thank you for the RB!  If it's not too much trouble, could you please push
this patch (independent of the rest of the series) upstream?  Sudeep would
like to see this patch upstreamed as soon as possible to get MTL prepared
for PV.  I figured I'd ask you first since you provided the RB.

Of course, if you're busy, that's understandable as well.
-Jonathan Cavitt


> 
> Matt
> 
> > ---
> >  drivers/gpu/drm/i915/selftests/igt_spinner.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c 
> > b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > index 0f064930ef11..8c3e1f20e5a1 100644
> > --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > @@ -179,6 +179,9 @@ igt_spinner_create_request(struct igt_spinner *spin,
> >  
> > *batch++ = arbitration_command;
> >  
> > +   memset32(batch, MI_NOOP, 128);
> > +   batch += 128;
> > +
> > if (GRAPHICS_VER(rq->i915) >= 8)
> > *batch++ = MI_BATCH_BUFFER_START | BIT(8) | 1;
> > else if (IS_HASWELL(rq->i915))
> > -- 
> > 2.25.1
> > 
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation
> 


Re: [Intel-gfx] [PATCH dii-client 1/2] drm/i915: Make i915_coherent_map_type GT-centric

2023-07-25 Thread Cavitt, Jonathan
-Original Message-
From: Tvrtko Ursulin  
Sent: Tuesday, July 25, 2023 9:23 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Shyti, Andi ; Roper, Matthew D 
; chris.p.wil...@linux.intel.com; Das, Nirmoy 

Subject: Re: [Intel-gfx] [PATCH dii-client 1/2] drm/i915: Make 
i915_coherent_map_type GT-centric
> 
> 
> On 25/07/2023 17:01, Jonathan Cavitt wrote:
> > Refactor i915_coherent_map_type to be GT-centric rather than
> > device-centric.  Each GT may require different coherency
> > handling due to hardware workarounds.
> > 
> > Since the function now takes a GT instead of the i915, the function is
> > renamed and moved to the gt folder.
> 
> What about the issue of fake gt passed to shmem_create_from_object I raised?


The function is, presently, only called in __engines_record_defaults, as a part 
of
intel_gt_init.  shmem_create_from_object uses i915_coherent_map_type to 
determine
the map_type to pass to i915_gem_object_pin_map_unlocked.  This creates a 
pointer
that we pass to shmem_create_from_data.  Aside from an i915_gem_object_is_shmem
check at the start, the function is otherwise just calling 
shmem_create_from_data...
which, itself, is only called by shmem_create_from_object.

I'd argue that any additional changes to shmem_create_from_object are 
unnecessary
as the function is only called from __engines_record_defaults.  Additionally, 
the function
is a part of the gt library (shmem_utils.h is in the gt folder), so taking a gt 
argument should
be expected.  However, if you still disagree, here's a few options for how we 
can rectify
this issue:

Debatably, we could just delete shmem_create_from_object and use the full 
expansion
in __engines_record_defaults.  Though this may come with some additional 
complications,
such as the lost helper function being desirable in the future and needing to 
expand the
definition of shmem_create_from_data to include an object pinning requirement.

The second option is that we pass the map_type to the shmem_create_from_object 
function
instead of the GT, bypassing the need for i915_coherent_map_type in the 
function by breaking
it out as a part of __engines_record_defaults.  I'll leave it to your judgement 
whether this would
be more or less confusing than just passing the GT.

Thirdly, we could just hard-code a specific map_type to use, though that seems 
ill-advised.

The last option is to rename the function to something more representative. 
Here's a few ideas
I can think of off the top of my head:

shmem_create
shmem_create_on_gt
shmem_create_from_gt
shmem_create_from_object_on_gt
intel_gt_create_shmem_from_object

If I had to recommend one approach over the rest, it would probably be the 
second option,
followed by the fourth.

-Jonathan Cavitt


>
> Regards,
> 
> Tvrtko
> 
> P.S. See if you can drop the dii-client part from the subject line going 
> forward.
> 
> > 
> > Suggested-by: Matt Roper 
> > Signed-off-by: Jonathan Cavitt 
> > ---
> >   drivers/gpu/drm/i915/display/intel_hdcp_gsc.c   |  3 ++-
> >   drivers/gpu/drm/i915/gem/i915_gem_object.h  |  4 
> >   drivers/gpu/drm/i915/gem/i915_gem_pages.c   | 15 ---
> >   .../drm/i915/gem/selftests/i915_gem_migrate.c   | 12 ++--
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c   |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_gt.c  | 17 -
> >   drivers/gpu/drm/i915/gt/intel_gt.h  |  3 +++
> >   drivers/gpu/drm/i915/gt/intel_gtt.c |  4 ++--
> >   drivers/gpu/drm/i915/gt/intel_lrc.c |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_ring.c|  3 ++-
> >   drivers/gpu/drm/i915/gt/selftest_context.c  |  2 +-
> >   drivers/gpu/drm/i915/gt/selftest_hangcheck.c|  4 ++--
> >   drivers/gpu/drm/i915/gt/selftest_lrc.c  |  2 +-
> >   drivers/gpu/drm/i915/gt/shmem_utils.c   |  7 ---
> >   drivers/gpu/drm/i915/gt/shmem_utils.h   |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c   |  3 +--
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.c  |  2 +-
> >   drivers/gpu/drm/i915/gt/uc/intel_huc_fw.c   |  3 +--
> >   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c|  3 ++-
> >   drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c  |  2 +-
> >   drivers/gpu/drm/i915/pxp/intel_pxp_tee.c|  3 ++-
> >   drivers/gpu/drm/i915/selftests/igt_spinner.c|  2 +-
> >   22 files changed, 53 insertions(+), 49 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_hdcp_gsc.c 
> > b/drivers/gpu/drm/i915/display/intel_hdcp_gsc.c
> > index ad0405375881..d753db3eef15 100644
> > --- a/drivers/gpu/drm/i915/display/intel_hdcp_gsc.c
> > +++ b/drivers/gpu/drm/i915/display/in

Re: [Intel-gfx] [PATCH v2 2/4] drm/i915/gt: Ensure memory quiesced before invalidation

2023-07-13 Thread Cavitt, Jonathan
-Original Message-
From: Nirmoy Das  
Sent: Thursday, July 13, 2023 7:12 AM
To: Andi Shyti 
Cc: Cavitt, Jonathan ; Intel GFX 
; Roper, Matthew D 
; Chris Wilson ; Mika 
Kuoppala 
Subject: Re: [Intel-gfx] [PATCH v2 2/4] drm/i915/gt: Ensure memory quiesced 
before invalidation
>
>Hi Andi,
>
>On 7/13/2023 2:31 PM, Andi Shyti wrote:
>> Hi Nirmoy and Jonathan,
>>
>>>>>> @@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, 
>>>>>> u32 mode)
>>>>>> {
>>>>>>  struct intel_engine_cs *engine = rq->engine;
>>>>>> +/*
>>>>>> + * Aux invalidations on Aux CCS platforms require
>>>>>> + * memory traffic is quiesced prior.
>>>>> I see that we are doing aux inval on EMIT_INVALIDATE so it make sense to
>>>>>
>>>>>    do if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915) )
>>>>>
>>>> This is agreeable, though I don't think there's any instances of us 
>>>> calling gen12_emit_flush_rcs with a blank mode,
>>>> since that wouldn't accomplish anything.  So I don't think the additional 
>>>> check/safety net is necessary, but it doesn't
>>>> hurt to have.
>> so... do we agree here that we don't add anything? I don't really
>> mind...
>
>Not a blocking objection but if you are sending another revision of this 
>then why not.


That's about my stance on it as well.  I'd defer the decision to Nirmoy here.


>
>
>> Or, I can queue up a patch 5 adding this "pedantic" check and we
>> can discuss it separately.


I wouldn't offer much in terms of the discussion, unfortunately, so I'm fairly 
certain the
only thing that would come from that is a series of questions asking why the 
patch
wasn't squashed with this one to begin with.


>>
>>>>>> + */
>>>>>> +if (!HAS_FLAT_CCS(engine->i915))
>>>>>> +mode |= EMIT_FLUSH;
>>>>> I think this generic EMIT_FLUSH is not enough. I seeing some missing
>>>>> flags for PIPE_CONTROL
>>>>>
>>>>> As per https://gfxspecs.intel.com/Predator/Home/Index/43904. It makes
>>>>> sense to move this to a
>>>>>
>>>>> new function given the complexity of PIPE_CONTROL flags requires for this.
>>>>>
>>>> I'm assuming when you're talking about the missing flags for PIPE_CONTROL, 
>>>> you're
>>>> referring to CCS Flush, correct?  Because every other flag is already 
>>>> covered in the
>>>> EMIT_FLUSH path.
>>> Yes, CCS Flush and I don't see a L3 fabric flush as well.


Does PIPE_CONTROL_FLUSH_L3 not do this?


>>>
>>>
>>>> I feel like I had this conversation with Matt while the internal version 
>>>> was
>>>> developed back in February, and the consensus was that the CCS Flush
>>>> requirement was already covered.
>>> Wasn't aware of this, would be nice to have a confirmation and a comment so
>>> we
>>>
>>> don't get confused in future.
>>>
>>>> On the other hand, it looks like the CCS Flush
>>>> requirement was only recently added back in May, so it might be worth
>>>> double-checking at the very least.
>>>>
>>>> Although... if CCS Flush is a missing flag, I wonder how we're supposed to 
>>>> set it,
>>>> as there doesn't appear to be a definition for such a flag in 
>>>> intel_gpu_commands.h...
>>>
>>> Yes, not yet but we should add a flag for that.
>> Is it OK if I add in the comment that EMIT_FLUSH covers the CCS
>> flushing?
>
>
>is it though ? I don't see that in the bspec, may be I am missing 
>something ?


That would certainly explain why there's no flag for it, if performing the CCS 
Flush
is a part of the EMIT_FLUSH pipeline by default.
-Jonathan Cavitt


>
>
>Regards,
>
>Nirmoy
>
>>
>> Andi
>


Re: [Intel-gfx] [PATCH v2 2/4] drm/i915/gt: Ensure memory quiesced before invalidation

2023-07-12 Thread Cavitt, Jonathan
-Original Message-
From: Nirmoy Das  
Sent: Wednesday, July 12, 2023 7:18 AM
To: Andi Shyti ; Cavitt, Jonathan 

Cc: Intel GFX ; Roper, Matthew D 
; Chris Wilson ; Mika 
Kuoppala 
Subject: Re: [Intel-gfx] [PATCH v2 2/4] drm/i915/gt: Ensure memory quiesced 
before invalidation
>
>Hi Andi and Jonathan,
>
>On 6/27/2023 11:43 AM, Andi Shyti wrote:
>> From: Jonathan Cavitt 
>>
>> All memory traffic must be quiesced before requesting
>> an aux invalidation on platforms that use Aux CCS.
>>
>> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all 
>> engines")
>> Signed-off-by: Jonathan Cavitt 
>> Signed-off-by: Andi Shyti 
>> Cc:  # v5.8+
>> ---
>>   drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
>> b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> index 563efee055602..e10e1ad0e841f 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> @@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
>> mode)
>>   {
>>  struct intel_engine_cs *engine = rq->engine;
>>   
>> +/*
>> + * Aux invalidations on Aux CCS platforms require
>> + * memory traffic is quiesced prior.
>
>I see that we are doing aux inval on EMIT_INVALIDATE so it make sense to
>
>  do if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915) )
>

This is agreeable, though I don't think there's any instances of us calling 
gen12_emit_flush_rcs with a blank mode,
since that wouldn't accomplish anything.  So I don't think the additional 
check/safety net is necessary, but it doesn't
hurt to have.

>> + */
>> +if (!HAS_FLAT_CCS(engine->i915))
>> +mode |= EMIT_FLUSH;
>
>I think this generic EMIT_FLUSH is not enough. I seeing some missing 
>flags for PIPE_CONTROL
>
>As per https://gfxspecs.intel.com/Predator/Home/Index/43904. It makes 
>sense to move this to a
>
>new function given the complexity of PIPE_CONTROL flags requires for this.
>

I'm assuming when you're talking about the missing flags for PIPE_CONTROL, 
you're
referring to CCS Flush, correct?  Because every other flag is already covered 
in the
EMIT_FLUSH path.

I feel like I had this conversation with Matt while the internal version was
developed back in February, and the consensus was that the CCS Flush
requirement was already covered.  On the other hand, it looks like the CCS Flush
requirement was only recently added back in May, so it might be worth
double-checking at the very least.

Although... if CCS Flush is a missing flag, I wonder how we're supposed to set 
it,
as there doesn’t appear to be a definition for such a flag in 
intel_gpu_commands.h...

-Jonathan Cavitt

>
>Regards,
>
>Nirmoy
>
>
>> +
>>  if (mode & EMIT_FLUSH) {
>>  u32 flags = 0;
>>  int err;
>


Re: [Intel-gfx] [PATCH v7 0/2] drm/i915: Hugepage manager and test for MTL

2023-04-26 Thread Cavitt, Jonathan
-Original Message-
From: Hajda, Andrzej  
Sent: Wednesday, April 26, 2023 2:37 PM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org; linux-ker...@vger.kernel.org; Auld, Matthew 

Subject: Re: [PATCH v7 0/2] drm/i915: Hugepage manager and test for MTL
>
>On 26.04.2023 17:31, Cavitt, Jonathan wrote:
>> -Original Message-
>> From: Hajda, Andrzej 
>> Sent: Wednesday, April 26, 2023 8:14 AM
>> To: intel-gfx@lists.freedesktop.org; linux-ker...@vger.kernel.org; Cavitt, 
>> Jonathan ; Hajda, Andrzej 
>> ; Auld, Matthew 
>> Subject: [PATCH v7 0/2] drm/i915: Hugepage manager and test for MTL
>>> This patchset patches sent by Jonathan and Andi, with
>>> addressed CI failures:
>>> 1. Fixed checking alignment of 64K pages on both Pre-Gen12 and Gen12.
>>> 2. Fixed start alignment of 2M pages.
>>>
>>> Regards
>>> Andrzej
>>>
>>> Jonathan Cavitt (2):
>>>   drm/i915: Migrate platform-dependent mock hugepage selftests to live
>>>   drm/i915: Use correct huge page manager for MTL
>>>
>>> .../gpu/drm/i915/gem/selftests/huge_pages.c   | 88 +++
>>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  3 +-
>>> 2 files changed, 71 insertions(+), 20 deletions(-)
>>>
>>> Cc: intel-gfx@lists.freedesktop.org
>>> Cc: linux-ker...@vger.kernel.org
>>> Cc: Jonathan Cavitt 
>>> Cc: Andrzej Hajda 
>>> Cc: Matthew Auld 
>>> -- 
>>> 2.39.2
>>>
>>> ---
>>> Jonathan Cavitt (2):
>>>   drm/i915: Migrate platform-dependent mock hugepage selftests to live
>>>   drm/i915: Use correct huge page manager for MTL
>>>
>>> drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 105 
>>> ++--
>>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c|   3 +-
>>> 2 files changed, 82 insertions(+), 26 deletions(-)
>>> ---
>>> base-commit: 4d0066a1c0763d50b6fb017e27d12b081ce21b57
>>> change-id: 20230425-hugepage-migrate-1869aaf31a6d
>>>
>>> Best regards,
>>> -- 
>>> Andrzej Hajda 
>>
>> Just reviewed the changes proper.  It's been a while, so I don't remember 
>> everything
>> about the prior version, but I think I recognized what was changed:
>>
>> - I wasn't aware a 21 bit alignment was required for 2M page sizes.  I'm 
>> glad you caught that.
>> - The extra debugging/error information will be helpful in the case of a 
>> failure.
>> - Grabbing the per-context VM instead of the ppgtt vm sounds good to me.
>>
>> Everything here looks amenable.
>> Acked-by: Jonathan Cavitt 
>> Though, given I'm also one of the signed-off-bys, I don't know if me acking 
>> it is exactly above-board.
>> -Jonathan Cavitt
>
>Thanks for looking at it. CI spotted another issue: 2M pages are 
>preferred over old-64K, if former are available.
>Fixed version sent.
>Regarding tags, I've kept your authorship, s-o-b, and added my 
>Co-developed, if it is OK to you I will keep it this way.
>If you prefer otherwise let me know.

That is entirely acceptable!  Thank you for your time.
-Jonathan Cavitt

>
>Regards
>Andrzej
>
>>
>
>


Re: [Intel-gfx] [PATCH v7 0/2] drm/i915: Hugepage manager and test for MTL

2023-04-26 Thread Cavitt, Jonathan
-Original Message-
From: Hajda, Andrzej  
Sent: Wednesday, April 26, 2023 8:14 AM
To: intel-gfx@lists.freedesktop.org; linux-ker...@vger.kernel.org; Cavitt, 
Jonathan ; Hajda, Andrzej ; 
Auld, Matthew 
Subject: [PATCH v7 0/2] drm/i915: Hugepage manager and test for MTL
>
>This patchset patches sent by Jonathan and Andi, with 
>addressed CI failures:
>1. Fixed checking alignment of 64K pages on both Pre-Gen12 and Gen12.
>2. Fixed start alignment of 2M pages.
>
>Regards
>Andrzej
>
>Jonathan Cavitt (2):
>  drm/i915: Migrate platform-dependent mock hugepage selftests to live
>  drm/i915: Use correct huge page manager for MTL
>
>.../gpu/drm/i915/gem/selftests/huge_pages.c   | 88 +++
> drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  3 +-
> 2 files changed, 71 insertions(+), 20 deletions(-)
>
>Cc: intel-gfx@lists.freedesktop.org
>Cc: linux-ker...@vger.kernel.org
>Cc: Jonathan Cavitt 
>Cc: Andrzej Hajda 
>Cc: Matthew Auld 
>-- 
>2.39.2
>
>---
>Jonathan Cavitt (2):
>  drm/i915: Migrate platform-dependent mock hugepage selftests to live
>  drm/i915: Use correct huge page manager for MTL
>
> drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 105 ++--
> drivers/gpu/drm/i915/gt/gen8_ppgtt.c|   3 +-
> 2 files changed, 82 insertions(+), 26 deletions(-)
>---
>base-commit: 4d0066a1c0763d50b6fb017e27d12b081ce21b57
>change-id: 20230425-hugepage-migrate-1869aaf31a6d
>
>Best regards,
>-- 
>Andrzej Hajda 


Just reviewed the changes proper.  It's been a while, so I don't remember 
everything
about the prior version, but I think I recognized what was changed:

- I wasn't aware a 21 bit alignment was required for 2M page sizes.  I'm glad 
you caught that.
- The extra debugging/error information will be helpful in the case of a 
failure.
- Grabbing the per-context VM instead of the ppgtt vm sounds good to me.

Everything here looks amenable.
Acked-by: Jonathan Cavitt 
Though, given I'm also one of the signed-off-bys, I don't know if me acking it 
is exactly above-board.
-Jonathan Cavitt


>


Re: [Intel-gfx] [PATCH v5 1/2] drm/i915: Migrate platform-dependent mock hugepage selftests to live

2023-03-13 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Thursday, March 2, 2023 2:36 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; thomas.hellst...@linux.intel.com; 
maarten.lankho...@linux.intel.com; Vetter, Daniel ; De 
Marchi, Lucas ; chris.p.wil...@linux.intel.com
Subject: Re: [PATCH v5 1/2] drm/i915: Migrate platform-dependent mock hugepage 
selftests to live
>
>On 28/02/2023 14:08, Matthew Auld wrote:
>> On 27/02/2023 17:19, Jonathan Cavitt wrote:
>>> Convert the igt_mock_ppgtt_huge_fill and igt_mock_ppgtt_64K mock 
>>> selftests into
>>> live selftests as their requirements have recently become 
>>> platform-dependent.
>>> Additionally, apply necessary platform dependency checks to these tests.
>>>
>>> v2: Reorder
>>>
>>> Signed-off-by: Jonathan Cavitt 
>> 
>> r-b still stands for the series. Note that CI is busted atm though, so 
>> we can't merge this yet. Likely need to re-trigger testing for the 
>> series once CI/drm-tip is working again.
>
>CI looks to be back. Can you trigger a retest through patchwork, or 
>resend the series?

The retest was submitted, but the mock hugepages subtest returned with a 
failure.
It didn't do so in the first run, nor did it fail in the prior revision (the 
one with the
incorrect patch order).  Do you have any guidance for forward progress?
-Jonathan Cavitt

>
>> 
>> 
>>> ---
>>>   .../gpu/drm/i915/gem/selftests/huge_pages.c   | 22 ++-
>>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
>>> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>>> index defece0bcb81..375f119ab261 100644
>>> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>>> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>>> @@ -710,7 +710,7 @@ static void close_object_list(struct list_head 
>>> *objects,
>>>   }
>>>   }
>>> -static int igt_mock_ppgtt_huge_fill(void *arg)
>>> +static int igt_ppgtt_huge_fill(void *arg)
>>>   {
>>>   struct i915_ppgtt *ppgtt = arg;
>>>   struct drm_i915_private *i915 = ppgtt->vm.i915;
>>> @@ -784,7 +784,8 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
>>>   GEM_BUG_ON(!expected_gtt);
>>>   GEM_BUG_ON(size);
>>> -    if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
>>> +    if (expected_gtt & I915_GTT_PAGE_SIZE_4K &&
>>> +    GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
>>>   expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
>>>   i915_vma_unpin(vma);
>>> @@ -831,7 +832,7 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
>>>   return err;
>>>   }
>>> -static int igt_mock_ppgtt_64K(void *arg)
>>> +static int igt_ppgtt_64K(void *arg)
>>>   {
>>>   struct i915_ppgtt *ppgtt = arg;
>>>   struct drm_i915_private *i915 = ppgtt->vm.i915;
>>> @@ -913,6 +914,17 @@ static int igt_mock_ppgtt_64K(void *arg)
>>>   unsigned int offset = objects[i].offset;
>>>   unsigned int flags = PIN_USER;
>>> +    /*
>>> + * For modern GTT models, the requirements for marking a 
>>> page-table
>>> + * as 64K have been relaxed.  Account for this.
>>> + */
>>> +
>>> +    if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
>>> +    expected_gtt = 0;
>>> +    expected_gtt |= size & (SZ_64K | SZ_2M) ? 
>>> I915_GTT_PAGE_SIZE_64K : 0;
>>> +    expected_gtt |= size & SZ_4K ? I915_GTT_PAGE_SIZE_4K : 0;
>>> +    }
>>> +
>>>   for (single = 0; single <= 1; single++) {
>>>   obj = fake_huge_pages_object(i915, size, !!single);
>>>   if (IS_ERR(obj))
>>> @@ -1910,8 +1922,6 @@ int i915_gem_huge_page_mock_selftests(void)
>>>   SUBTEST(igt_mock_exhaust_device_supported_pages),
>>>   SUBTEST(igt_mock_memory_region_huge_pages),
>>>   SUBTEST(igt_mock_ppgtt_misaligned_dma),
>>> -    SUBTEST(igt_mock_ppgtt_huge_fill),
>>> -    SUBTEST(igt_mock_ppgtt_64K),
>>>   };
>>>   struct drm_i915_private *dev_priv;
>>>   struct i915_ppgtt *ppgtt;
>>> @@ -1962,6 +1972,8 @@ int i915_gem_huge_page_live_selftests(struct 
>>> drm_i915_private *i915)
>>>   SUBTEST(igt_ppgtt_sanity_check),
>>>   SUBTEST(igt_ppgtt_compact),
>>>   SUBTEST(igt_ppgtt_mixed),
>>> +    SUBTEST(igt_ppgtt_huge_fill),
>>> +    SUBTEST(igt_ppgtt_64K),
>>>   };
>>>   if (!HAS_PPGTT(i915)) {
>


Re: [Intel-gfx] [PATCH v5 1/2] drm/i915: Migrate platform-dependent mock hugepage selftests to live

2023-03-02 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Thursday, March 2, 2023 2:36 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; thomas.hellst...@linux.intel.com; 
maarten.lankho...@linux.intel.com; Vetter, Daniel ; De 
Marchi, Lucas ; chris.p.wil...@linux.intel.com
Subject: Re: [PATCH v5 1/2] drm/i915: Migrate platform-dependent mock hugepage 
selftests to live
> 
> On 28/02/2023 14:08, Matthew Auld wrote:
> > On 27/02/2023 17:19, Jonathan Cavitt wrote:
> >> Convert the igt_mock_ppgtt_huge_fill and igt_mock_ppgtt_64K mock 
> >> selftests into
> >> live selftests as their requirements have recently become 
> >> platform-dependent.
> >> Additionally, apply necessary platform dependency checks to these tests.
> >>
> >> v2: Reorder
> >>
> >> Signed-off-by: Jonathan Cavitt 
> > 
> > r-b still stands for the series. Note that CI is busted atm though, so 
> > we can't merge this yet. Likely need to re-trigger testing for the 
> > series once CI/drm-tip is working again.
> 
> CI looks to be back. Can you trigger a retest through patchwork, or 
> resend the series?

Retest request submitted.
-Jonathan Cavitt

> 
> > 
> > 
> >> ---
> >>   .../gpu/drm/i915/gem/selftests/huge_pages.c   | 22 ++-
> >>   1 file changed, 17 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
> >> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> >> index defece0bcb81..375f119ab261 100644
> >> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> >> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> >> @@ -710,7 +710,7 @@ static void close_object_list(struct list_head 
> >> *objects,
> >>   }
> >>   }
> >> -static int igt_mock_ppgtt_huge_fill(void *arg)
> >> +static int igt_ppgtt_huge_fill(void *arg)
> >>   {
> >>   struct i915_ppgtt *ppgtt = arg;
> >>   struct drm_i915_private *i915 = ppgtt->vm.i915;
> >> @@ -784,7 +784,8 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
> >>   GEM_BUG_ON(!expected_gtt);
> >>   GEM_BUG_ON(size);
> >> -    if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
> >> +    if (expected_gtt & I915_GTT_PAGE_SIZE_4K &&
> >> +    GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
> >>   expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
> >>   i915_vma_unpin(vma);
> >> @@ -831,7 +832,7 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
> >>   return err;
> >>   }
> >> -static int igt_mock_ppgtt_64K(void *arg)
> >> +static int igt_ppgtt_64K(void *arg)
> >>   {
> >>   struct i915_ppgtt *ppgtt = arg;
> >>   struct drm_i915_private *i915 = ppgtt->vm.i915;
> >> @@ -913,6 +914,17 @@ static int igt_mock_ppgtt_64K(void *arg)
> >>   unsigned int offset = objects[i].offset;
> >>   unsigned int flags = PIN_USER;
> >> +    /*
> >> + * For modern GTT models, the requirements for marking a 
> >> page-table
> >> + * as 64K have been relaxed.  Account for this.
> >> + */
> >> +
> >> +    if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> >> +    expected_gtt = 0;
> >> +    expected_gtt |= size & (SZ_64K | SZ_2M) ? 
> >> I915_GTT_PAGE_SIZE_64K : 0;
> >> +    expected_gtt |= size & SZ_4K ? I915_GTT_PAGE_SIZE_4K : 0;
> >> +    }
> >> +
> >>   for (single = 0; single <= 1; single++) {
> >>   obj = fake_huge_pages_object(i915, size, !!single);
> >>   if (IS_ERR(obj))
> >> @@ -1910,8 +1922,6 @@ int i915_gem_huge_page_mock_selftests(void)
> >>   SUBTEST(igt_mock_exhaust_device_supported_pages),
> >>   SUBTEST(igt_mock_memory_region_huge_pages),
> >>   SUBTEST(igt_mock_ppgtt_misaligned_dma),
> >> -    SUBTEST(igt_mock_ppgtt_huge_fill),
> >> -    SUBTEST(igt_mock_ppgtt_64K),
> >>   };
> >>   struct drm_i915_private *dev_priv;
> >>   struct i915_ppgtt *ppgtt;
> >> @@ -1962,6 +1972,8 @@ int i915_gem_huge_page_live_selftests(struct 
> >> drm_i915_private *i915)
> >>   SUBTEST(igt_ppgtt_sanity_check),
> >>   SUBTEST(igt_ppgtt_compact),
> >>   SUBTEST(igt_ppgtt_mixed),
> >> +    SUBTEST(igt_ppgtt_huge_fill),
> >> +    SUBTEST(igt_ppgtt_64K),
> >>   };
> >>   if (!HAS_PPGTT(i915)) {
> 


Re: [Intel-gfx] [PATCH] drm/i915: Use correct huge page manager for MTL

2023-02-27 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Monday, February 27, 2023 3:20 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; thomas.hellst...@linux.intel.com; 
maarten.lankho...@linux.intel.com; Vetter, Daniel ; De 
Marchi, Lucas ; chris.p.wil...@linux.intel.com
Subject: Re: [PATCH] drm/i915: Use correct huge page manager for MTL
> 
> On 24/02/2023 17:40, Jonathan Cavitt wrote:
> > MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.  This 
> > is because
> > MTL reports as not supporting 64K pages, or more accurately, the system 
> > that reports
> > whether a platform has 64K pages reports false for MTL.  This is only half 
> > correct,
> > as the 64K page support reporting system only cares about 64K page support 
> > for LMEM,
> > which MTL doesn't have.
> > 
> > MTL should be using xehpsdv_ppgtt_insert_huge.  However, simply changing 
> > over to
> > using that manager doesn't resolve the issue because MTL is expecting the 
> > virtual
> > address space for the page table to be flushed after initialization, so we 
> > must also
> > add a flush statement there.
> > 
> > The changes made to the huge page manager selection indirectly impacted 
> > some of the
> > mock hugepage selftests.  Due to the use of pte level hints, rather than 
> > pde level
> > hints, we now expect 64K page sizes to be properly reported by the GTT, so 
> > we should
> > correct the expected test results to reflect this change.
> 
> For future submissions, please add the version number for each new 
> submission of the same patch, and also please include the changelog.

I thought we weren't supposed to include that information?  Or is that just a 
"from internal to upstream" thing?
-Jonathan Cavitt

> 
> > 
> > Signed-off-by: Jonathan Cavitt 
> > ---
> >   drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 11 ---
> >   drivers/gpu/drm/i915/gt/gen8_ppgtt.c|  3 ++-
> >   2 files changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
> > b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > index defece0bcb81..06554717495f 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > @@ -784,9 +784,6 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
> > GEM_BUG_ON(!expected_gtt);
> > GEM_BUG_ON(size);
> >   
> > -   if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
> > -   expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
> > -
> > i915_vma_unpin(vma);
> >   
> > if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
> > @@ -849,7 +846,7 @@ static int igt_mock_ppgtt_64K(void *arg)
> > },
> > {
> > .size = SZ_64K + SZ_4K,
> > -   .gtt = I915_GTT_PAGE_SIZE_4K,
> > +   .gtt = I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_4K,
> > .offset = 0,
> > },
> > {
> > @@ -864,7 +861,7 @@ static int igt_mock_ppgtt_64K(void *arg)
> > },
> > {
> > .size = SZ_2M - SZ_4K,
> > -   .gtt = I915_GTT_PAGE_SIZE_4K,
> > +   .gtt = I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_4K,
> > .offset = 0,
> > },
> > {
> > @@ -886,12 +883,12 @@ static int igt_mock_ppgtt_64K(void *arg)
> > {
> > .size = SZ_64K,
> > .offset = SZ_2M,
> > -   .gtt = I915_GTT_PAGE_SIZE_4K,
> > +   .gtt = I915_GTT_PAGE_SIZE_64K,
> > },
> > {
> > .size = SZ_128K,
> > .offset = SZ_2M - SZ_64K,
> > -   .gtt = I915_GTT_PAGE_SIZE_4K,
> > +   .gtt = I915_GTT_PAGE_SIZE_64K,
> > },
> 
> Did you consider the suggestion with possibly making this a live test 
> instead?
> 
> The first comment in igt_mock_ppgtt_64K() describing the test is:
> 
> /*
>   * Sanity check some of the trickiness with 64K pages -- either we can
>   * safely mark the whole page-table(2M block) as 64K, or we have to
>   * always fallback to 4K.
>   */
> 
> That doesn't really apply to the new 64K GTT model it seems (which is 
> why it now fails), so trying to adjust the test just because the mock 
> dev

Re: [Intel-gfx] [PATCH] drm/i915: Use correct huge page manager for MTL

2023-02-24 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Friday, February 24, 2023 7:51 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; thomas.hellst...@linux.intel.com; 
maarten.lankho...@linux.intel.com; Vetter, Daniel ; De 
Marchi, Lucas ; chris.p.wil...@linux.intel.com
Subject: Re: [PATCH] drm/i915: Use correct huge page manager for MTL
> 
> On 22/02/2023 15:56, Matthew Auld wrote:
> > On 22/02/2023 15:26, Jonathan Cavitt wrote:
> >> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.  
> >> This is because
> >> MTL reports as not supporting 64K pages, or more accurately, the 
> >> system that reports
> >> whether a platform has 64K pages reports false for MTL.  This is only 
> >> half correct,
> >> as the 64K page support reporting system only cares about 64K page 
> >> support for LMEM,
> >> which MTL doesn't have.
> >>
> >> MTL should be using xehpsdv_ppgtt_insert_huge.  However, simply 
> >> changing over to
> >> using that manager doesn't resolve the issue because MTL is expecting 
> >> the virtual
> >> address space for the page table to be flushed after initialization, 
> >> so we must also
> >> add a flush statement there.
> >>
> >> The changes made to the huge page manager selection indirectly 
> >> impacted some of the
> >> mock hugepage selftests.  Due to the use of pte level hints, rather 
> >> than pde level
> >> hints, we now expect 64K page sizes to be properly reported by the 
> >> GTT, so we should
> >> remove the check that asserts otherwise.
> >>
> >> Signed-off-by: Jonathan Cavitt 
> > 
> > Hopefully CI will be happy now,
> > Reviewed-by: Matthew Auld 
> 
> Looks it passes now, but then fails on the next subtest:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114259v1/shard-apl7/igt@i915_selftest@m...@hugepages.html
> 
> Maybe it's not too outrageous to just move igt_mock_ppgtt_64K() to the 
> live test section and skip running it with graphics ver >= (12, 50). 
> IIUC that subtest is mostly only interesting with the old model anyway. 
> I guess that test is much too HW specific to be considered good mock 
> test. And maybe we should do the same for igt_mock_ppgtt_huge_fill(). 
> But there we can just make the expected_gtt &= ~I915_GTT_PAGE_SIZE_64K 
> conditional on graphics ver < (12, 50). Also maybe split this into two 
> patches. Thoughts?


IMO, no matter how important it is to get this patch upstreamed, introducing 
regressions to "fix later"
with this patch is just going to erode the trust of the upstream Linux team 
more than it's already been.
So, frustratingly, I think we need to do what's necessary to get this resolved 
in the same patch.

Thankfully, this probably won't be too difficult: igt_mock_ppgtt_64K is the 
last mock test, and we just
need to mark every object .gtt with I915_GTT_PAGE_SIZE_64K when .size & SZ_64K 
is not zero.  We
can do this in one of two ways:

1. Mark it in the object (I.E. adding it directly to .gtt).  This makes the 
most sense, but is the most difficult.
2. Mark it separately (I.E. expected_gtt |= object[i].size & SZ_64K ? 
I915_GTT_PAGE_SIZE_64K : 0).
This is the easiest, but is very likely to be rejected.

We probably have to do fix #1, and a direct fix here probably won't repair the 
issue 100% (There might be
some other objects in the test that aren't correctly formatted).  However, if 
we take this one step at a time,
we'll eventually get this patch upstream-ready.
-Jonathan Cavitt

> 
> > 
> >> ---
> >>   drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 3 ---
> >>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c    | 3 ++-
> >>   2 files changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
> >> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> >> index defece0bcb81..1659ada4ce33 100644
> >> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> >> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> >> @@ -784,9 +784,6 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
> >>   GEM_BUG_ON(!expected_gtt);
> >>   GEM_BUG_ON(size);
> >> -    if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
> >> -    expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
> >> -
> >>   i915_vma_unpin(vma);
> >>   if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
> >> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> >> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >> index 4daaa6

Re: [Intel-gfx] [PATCH] gen8_ppgtt: Use correct huge page manager for MTL

2023-02-21 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Tuesday, February 21, 2023 9:46 AM
To: Cavitt, Jonathan 
Cc: Dutt, Sudeep ; Siddiqui, Ayaz A 
; intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL
> 
> On 21/02/2023 17:14, Cavitt, Jonathan wrote:
> > -Original Message-
> > From: Auld, Matthew 
> > Sent: Tuesday, February 21, 2023 8:33 AM
> > To: Cavitt, Jonathan 
> > Cc: Dutt, Sudeep ; Siddiqui, Ayaz A 
> > ; intel-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL
> >>
> >> On 21/02/2023 16:28, Cavitt, Jonathan wrote:
> >>> -Original Message-
> >>> From: Auld, Matthew 
> >>> Sent: Tuesday, February 21, 2023 8:06 AM
> >>> To: Cavitt, Jonathan ; 
> >>> intel-gfx@lists.freedesktop.org
> >>> Cc: Dutt, Sudeep ; Siddiqui, Ayaz A 
> >>> 
> >>> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL
> >>>>
> >>>> On 17/02/2023 19:18, Jonathan Cavitt wrote:
> >>>>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.  
> >>>>> This is because
> >>>>> MTL reports as not supporting 64K pages, or more accurately, the system 
> >>>>> that reports
> >>>>> whether a platform has 64K pages reports false for MTL.  This is only 
> >>>>> half correct,
> >>>>> as the 64K page support reporting system only cares about 64K page 
> >>>>> support for LMEM,
> >>>>> which MTL doesn't have.
> >>>>>
> >>>>> MTL should be using xehpsdv_ppgtt_insert_huge.  However, simply 
> >>>>> changing over to
> >>>>> using that manager doesn't resolve the issue because MTL is expecting 
> >>>>> the virtual
> >>>>> address space for the page table to be flushed after initialization, so 
> >>>>> we must also
> >>>>> add a flush statement there.
> >>>>>
> >>>>> Signed-off-by: Jonathan Cavitt 
> >>>> Reviewed-by: Matthew Auld 
> >>>>
> >>>> Although it looks like the hugepage mock tests are failing with this. I
> >>>> assume the mock device just uses some "max" gen version or so, which now
> >>>> triggers this path. Any ideas for that?
> >>>
> >>> With this patch applied, multiple calls to the hugepages live selftest 
> >>> result in a kernel panic.
> >>> If the mock tests are run immediately after the live ones, that would 
> >>> explain this behavior.
> >>> I was informed when this was initially debugged that the error was a 
> >>> known IOMMU issue
> >>> rather than some novel regression, though it's hard to tell if that was 
> >>> just hopeful optimism
> >>> or not at this point.
> >>
> >> In the test results we now get:
> >>
> >> 6> [183.420316] i915: Running
> >> i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages
> >> <6> [183.436978] i915: Running
> >> i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages
> >> <6> [183.445777] i915: Running
> >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma
> >> <6> [183.904531] i915: Running
> >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill
> >> <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes
> >> <3> [183.912784] i915/i915_gem_huge_page_mock_selftests:
> >> igt_mock_ppgtt_huge_fill failed with error -22
> > 
> >  if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
> >  expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
> > 
> > I don't know why we're doing that to expected_gtt, but that seems to be the 
> > cause of the
> > problem in this case.
> 
> I think it's due to the older huge page model, where 64K requires the 
> entire page-table to all use 64K pages underneath (pde level hint), so 
> if we see 4K in there somewhere then we don't expect to get back 64K 
> GTT. But on newer HW we now have have pte level hint, so I think the 
> above can just be removed with this patch, since that's what the mock 
> device now uses.

Seems right.  I guess that would be... what?  Is it:
A. Platform specific?  I.E. we need s generation check in the selftest to 
proceed, such as the following:

 

Re: [Intel-gfx] [PATCH] gen8_ppgtt: Use correct huge page manager for MTL

2023-02-21 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Tuesday, February 21, 2023 8:33 AM
To: Cavitt, Jonathan 
Cc: Dutt, Sudeep ; Siddiqui, Ayaz A 
; intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL
> 
> On 21/02/2023 16:28, Cavitt, Jonathan wrote:
> > -Original Message-
> > From: Auld, Matthew 
> > Sent: Tuesday, February 21, 2023 8:06 AM
> > To: Cavitt, Jonathan ; 
> > intel-gfx@lists.freedesktop.org
> > Cc: Dutt, Sudeep ; Siddiqui, Ayaz A 
> > 
> > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL
> >>
> >> On 17/02/2023 19:18, Jonathan Cavitt wrote:
> >>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.  This 
> >>> is because
> >>> MTL reports as not supporting 64K pages, or more accurately, the system 
> >>> that reports
> >>> whether a platform has 64K pages reports false for MTL.  This is only 
> >>> half correct,
> >>> as the 64K page support reporting system only cares about 64K page 
> >>> support for LMEM,
> >>> which MTL doesn't have.
> >>>
> >>> MTL should be using xehpsdv_ppgtt_insert_huge.  However, simply changing 
> >>> over to
> >>> using that manager doesn't resolve the issue because MTL is expecting the 
> >>> virtual
> >>> address space for the page table to be flushed after initialization, so 
> >>> we must also
> >>> add a flush statement there.
> >>>
> >>> Signed-off-by: Jonathan Cavitt 
> >> Reviewed-by: Matthew Auld 
> >>
> >> Although it looks like the hugepage mock tests are failing with this. I
> >> assume the mock device just uses some "max" gen version or so, which now
> >> triggers this path. Any ideas for that?
> > 
> > With this patch applied, multiple calls to the hugepages live selftest 
> > result in a kernel panic.
> > If the mock tests are run immediately after the live ones, that would 
> > explain this behavior.
> > I was informed when this was initially debugged that the error was a known 
> > IOMMU issue
> > rather than some novel regression, though it's hard to tell if that was 
> > just hopeful optimism
> > or not at this point.
> 
> In the test results we now get:
> 
> 6> [183.420316] i915: Running 
> i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages
> <6> [183.436978] i915: Running 
> i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages
> <6> [183.445777] i915: Running 
> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma
> <6> [183.904531] i915: Running 
> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill
> <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes
> <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: 
> igt_mock_ppgtt_huge_fill failed with error -22

if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;

I don't know why we're doing that to expected_gtt, but that seems to be the 
cause of the
problem in this case.
-Jonathan Cavitt

> 
> I didn't look any deeper than that though. Note that this a just a 
> mock/fake device. I don't think its IOMMU related.
> 
> > -Jonathan Cavitt
> > 
> >>
> >>> ---
> >>>drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++-
> >>>1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> >>> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >>> index 4daaa6f55668..9c571185395f 100644
> >>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space 
> >>> *vm,
> >>>   }
> >>>   } while (rem >= page_size && index < max);
> >>>
> >>> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
> >>>   vma_res->page_sizes_gtt |= page_size;
> >>>   } while (iter->sg && sg_dma_len(iter->sg));
> >>>}
> >>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct 
> >>> i915_address_space *vm,
> >>>   struct sgt_dma iter = sgt_dma(vma_res);
> >>>
> >>>   if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
> >>> - if (HAS_64K_PAGES(vm->i915))
> >>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50))
> >>>   xehpsdv_ppgtt_insert_huge(vm, vma_res, , 
> >>> cache_level, flags);
> >>>   else
> >>>   gen8_ppgtt_insert_huge(vm, vma_res, , 
> >>> cache_level, flags);
> >>
> 


Re: [Intel-gfx] [PATCH] gen8_ppgtt: Use correct huge page manager for MTL

2023-02-21 Thread Cavitt, Jonathan
-Original Message-
From: Auld, Matthew  
Sent: Tuesday, February 21, 2023 8:06 AM
To: Cavitt, Jonathan ; 
intel-gfx@lists.freedesktop.org
Cc: Dutt, Sudeep ; Siddiqui, Ayaz A 

Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL
> 
> On 17/02/2023 19:18, Jonathan Cavitt wrote:
> > MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.  This 
> > is because
> > MTL reports as not supporting 64K pages, or more accurately, the system 
> > that reports
> > whether a platform has 64K pages reports false for MTL.  This is only half 
> > correct,
> > as the 64K page support reporting system only cares about 64K page support 
> > for LMEM,
> > which MTL doesn't have.
> > 
> > MTL should be using xehpsdv_ppgtt_insert_huge.  However, simply changing 
> > over to
> > using that manager doesn't resolve the issue because MTL is expecting the 
> > virtual
> > address space for the page table to be flushed after initialization, so we 
> > must also
> > add a flush statement there.
> > 
> > Signed-off-by: Jonathan Cavitt 
> Reviewed-by: Matthew Auld 
> 
> Although it looks like the hugepage mock tests are failing with this. I 
> assume the mock device just uses some "max" gen version or so, which now 
> triggers this path. Any ideas for that?

With this patch applied, multiple calls to the hugepages live selftest result 
in a kernel panic.
If the mock tests are run immediately after the live ones, that would explain 
this behavior.
I was informed when this was initially debugged that the error was a known 
IOMMU issue
rather than some novel regression, though it's hard to tell if that was just 
hopeful optimism
or not at this point.
-Jonathan Cavitt

> 
> > ---
> >   drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> > b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > index 4daaa6f55668..9c571185395f 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
> > }
> > } while (rem >= page_size && index < max);
> >   
> > +   drm_clflush_virt_range(vaddr, PAGE_SIZE);
> > vma_res->page_sizes_gtt |= page_size;
> > } while (iter->sg && sg_dma_len(iter->sg));
> >   }
> > @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space 
> > *vm,
> > struct sgt_dma iter = sgt_dma(vma_res);
> >   
> > if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
> > -   if (HAS_64K_PAGES(vm->i915))
> > +   if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50))
> > xehpsdv_ppgtt_insert_huge(vm, vma_res, , 
> > cache_level, flags);
> > else
> > gen8_ppgtt_insert_huge(vm, vma_res, , cache_level, 
> > flags);
> 


Re: [Intel-gfx] [PATCH] drm/i915: Use uabi engines for the default engine map

2023-01-23 Thread Cavitt, Jonathan
-Original Message-
From: Cavitt, Jonathan  
Sent: Monday, January 23, 2023 7:25 AM
To: intel-gfx@lists.freedesktop.org
Cc: Cavitt, Jonathan ; Dutt, Sudeep 

Subject: [PATCH] drm/i915: Use uabi engines for the default engine map
> 
> From: Tvrtko Ursulin 
> 
> Default engine map is exactly about uabi engines so no excuse not to use
> the appropriate iterator to populate it.
> 
> Signed-off-by: Tvrtko Ursulin 
> Signed-off-by: Jonathan Cavitt 

Reviewed-by: Jonathan Cavitt 
-Jonathan Cavitt

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 454e73a433c8..42a39e103d7c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1096,16 +1096,15 @@ static struct i915_gem_engines 
> *alloc_engines(unsigned int count)
>  static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx,
>   struct intel_sseu rcs_sseu)
>  {
> - const struct intel_gt *gt = to_gt(ctx->i915);
> + const unsigned int max = I915_NUM_ENGINES;
>   struct intel_engine_cs *engine;
>   struct i915_gem_engines *e, *err;
> - enum intel_engine_id id;
>  
> - e = alloc_engines(I915_NUM_ENGINES);
> + e = alloc_engines(max);
>   if (!e)
>   return ERR_PTR(-ENOMEM);
>  
> - for_each_engine(engine, gt, id) {
> + for_each_uabi_engine(engine, ctx->i915) {
>   struct intel_context *ce;
>   struct intel_sseu sseu = {};
>   int ret;
> @@ -1113,7 +1112,7 @@ static struct i915_gem_engines *default_engines(struct 
> i915_gem_context *ctx,
>   if (engine->legacy_idx == INVALID_ENGINE)
>   continue;
>  
> - GEM_BUG_ON(engine->legacy_idx >= I915_NUM_ENGINES);
> + GEM_BUG_ON(engine->legacy_idx >= max);
>   GEM_BUG_ON(e->engines[engine->legacy_idx]);
>  
>   ce = intel_context_create(engine);
> -- 
> 2.25.1
> 
>