Re: [PATCH v1] drm/scheduler: Fix lockup in drm_sched_entity_kill()

2022-11-22 Thread Christian König

Am 23.11.22 um 01:13 schrieb Dmitry Osipenko:

drm_sched_entity_kill() is invoked twice by drm_sched_entity_destroy()
while a userspace process is exiting or being killed: first when the sched
entity is flushed, and again when the entity is released. This causes a
lockup within wait_for_completion(entity_idle) due to how the completion
API works.

Calling wait_for_completion() more times than complete() was invoked is an
error condition that causes a lockup, because a completion internally uses
a counter for complete/wait calls. complete_all() must be used instead
in such cases.

This patch fixes a lockup of the Panfrost driver that is reproducible by
killing any application in the middle of a 3D drawing operation.

Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: Dmitry Osipenko 


Oh, good point. Reviewed-by: Christian König 
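
The counting behaviour described in the commit message can be modelled in user space roughly as follows. This is only a sketch: struct model_completion and the model_*() helpers are hypothetical stand-ins, not the kernel API, and the "wait" returns false where the real wait_for_completion() would block.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's struct completion. */
struct model_completion {
	unsigned int done;	/* number of pending "complete" events */
};

static inline void model_init(struct model_completion *c)
{
	c->done = 0;
}

/* Like complete(): allows exactly one more waiter to pass. */
static inline void model_complete(struct model_completion *c)
{
	if (c->done != UINT_MAX)
		c->done++;
}

/* Like complete_all(): saturates the counter so every waiter passes. */
static inline void model_complete_all(struct model_completion *c)
{
	c->done = UINT_MAX;
}

/*
 * Like wait_for_completion(), except that it returns false instead of
 * blocking when no completion event is available.
 */
static inline bool model_try_wait(struct model_completion *c)
{
	if (c->done == 0)
		return false;		/* the real call would sleep here */
	if (c->done != UINT_MAX)
		c->done--;		/* consume one complete() */
	return true;
}
```

With a single model_complete(), the first model_try_wait() succeeds and the second one fails, which corresponds to the second drm_sched_entity_kill() call getting stuck. After model_complete_all(), any number of waits succeed.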


---
  drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
  drivers/gpu/drm/scheduler/sched_main.c   | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index fe09e5be79bd..15d04a0ec623 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -81,7 +81,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 	init_completion(&entity->entity_idle);
 
 	/* We start in an idle state. */
-	complete(&entity->entity_idle);
+	complete_all(&entity->entity_idle);
 
 	spin_lock_init(&entity->rq_lock);
 	spsc_queue_init(&entity->job_queue);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 6ce04c2e90c0..857ec20be9e8 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1026,7 +1026,7 @@ static int drm_sched_main(void *param)
 		sched_job = drm_sched_entity_pop_job(entity);
 
 		if (!sched_job) {
-			complete(&entity->entity_idle);
+			complete_all(&entity->entity_idle);
 			continue;
 		}
 
@@ -1037,7 +1037,7 @@ static int drm_sched_main(void *param)
 
 		trace_drm_run_job(sched_job, entity);
 		fence = sched->ops->run_job(sched_job);
-		complete(&entity->entity_idle);
+		complete_all(&entity->entity_idle);
 		drm_sched_fence_scheduled(s_fence);
 
 		if (!IS_ERR_OR_NULL(fence)) {




RE: [EXT] Re: [PATCH v4 01/10] drm: bridge: cadence: convert mailbox functions to macro functions

2022-11-22 Thread Sandor Yu

> -Original Message-
> From: Fabio Estevam 
> Sent: 23 November 2022 2:09
> To: Sandor Yu 
> Cc: andrzej.ha...@intel.com; neil.armstr...@linaro.org;
> robert.f...@linaro.org; laurent.pinch...@ideasonboard.com;
> jo...@kwiboo.se; jernej.skra...@gmail.com; airl...@gmail.com;
> dan...@ffwll.ch; robh...@kernel.org; krzysztof.kozlowski...@linaro.org;
> shawn...@kernel.org; s.ha...@pengutronix.de; kis...@ti.com;
> vk...@kernel.org; dri-devel@lists.freedesktop.org;
> devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> linux-ker...@vger.kernel.org; linux-...@lists.infradead.org;
> alexander.st...@ew.tq-group.com; ker...@pengutronix.de; dl-linux-imx
> ; Oliver Brown 
> Subject: [EXT] Re: [PATCH v4 01/10] drm: bridge: cadence: convert mailbox
> functions to macro functions
> 
> Hi Sandor,
> 
> On Mon, Nov 21, 2022 at 4:27 AM Sandor Yu  wrote:
> >
> > Mailbox access functions could be shared with other mhdp drivers and
> > HDP-TX HDMI/DP PHY drivers. Move those functions to the header file
> > include/drm/bridge/cdns-mhdp-mailbox.h and convert them to macro
> > functions.
> 
> What is the reason for converting the functions to macro?
Both the HDMI PHY driver and the HDMI bridge driver need the mailbox API
functions to access registers and get status from the firmware.
Converting those functions to macros lets them be easily reused by both the
PHY and bridge drivers.

B.R
Sandor



Re: linux-next: build failure after merge of the drm-misc tree

2022-11-22 Thread Stephen Rothwell
Hi Dave,

On Wed, 23 Nov 2022 15:35:50 +1000 David Airlie  wrote:
>
> Nothing has gone wrong as such, just the drm-misc-next pull request was
> sent on a regular weekly cadence, then I merged it a few days later.
> The fix for this is still in the drm-misc-next queue for the next PR
> which I will get this week.

There is nothing currently in the drm-misc tree in linux-next (relative
to the drm tree).  And there was never a fix in there for this problem,
the commit was just removed when I reported it.

If there was a fix for this in the drm-misc tree, I would not have seen
the build failure.
-- 
Cheers,
Stephen Rothwell




Re: [PATCH v4 09/11] drm/msm/dpu: add support for MDP_TOP blackhole

2022-11-22 Thread Abhinav Kumar




On 11/22/2022 3:12 PM, Dmitry Baryshkov wrote:

On sm8450 a register block was removed from MDP TOP. Accessing it during
snapshotting results in NoC errors / immediate reboot. Skip accessing
these registers during snapshot.

Tested-by: Vinod Koul 
Reviewed-by: Vinod Koul 
Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  1 +
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c| 11 +--
  2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 38aa38ab1568..4730f8268f2a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -92,6 +92,7 @@ enum {
DPU_MDP_UBWC_1_0,
DPU_MDP_UBWC_1_5,
DPU_MDP_AUDIO_SELECT,
+   DPU_MDP_PERIPH_0_REMOVED,
DPU_MDP_MAX
  };


Please update the enum documentation as already requested in the 
previous patchset.


  
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c

index f3660cd14f4f..67f2e5288b3c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -927,8 +927,15 @@ static void dpu_kms_mdp_snapshot(struct msm_disp_state *disp_state, struct msm_k
 		msm_disp_snapshot_add_block(disp_state, cat->wb[i].len,
 				dpu_kms->mmio + cat->wb[i].base, "wb_%d", i);
 
-	msm_disp_snapshot_add_block(disp_state, cat->mdp[0].len,
-			dpu_kms->mmio + cat->mdp[0].base, "top");
+	if (top->caps->features & BIT(DPU_MDP_PERIPH_0_REMOVED)) {
+		msm_disp_snapshot_add_block(disp_state, 0x380,
+				dpu_kms->mmio + cat->mdp[0].base, "top");
+		msm_disp_snapshot_add_block(disp_state, cat->mdp[0].len - 0x3a8,
+				dpu_kms->mmio + cat->mdp[0].base + 0x3a8, "top_2");
+	} else {
+		msm_disp_snapshot_add_block(disp_state, cat->mdp[0].len,
+				dpu_kms->mmio + cat->mdp[0].base, "top");
+	}
 
 	pm_runtime_put_sync(&dpu_kms->pdev->dev);

  }
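
For readers following the offsets, the range arithmetic in the hunk above can be sketched in plain C. The 0x380 and 0x3a8 constants are taken from the patch; struct snap_block, the macro names and mdp_top_snapshot_blocks() are hypothetical names used only for illustration, and the sketch assumes the MDP TOP length is larger than 0x3a8.

```c
#include <stddef.h>

/* Hypothetical description of one snapshot range inside MDP TOP. */
struct snap_block {
	size_t offset;	/* offset from the MDP TOP base */
	size_t len;	/* number of bytes to snapshot */
};

/* Offsets taken from the patch: the removed block spans 0x380..0x3a8. */
#define MODEL_PERIPH_0_START	0x380
#define MODEL_PERIPH_0_END	0x3a8

/*
 * Fill 'out' with the ranges to snapshot and return how many were written.
 * Assumes mdp_len > MODEL_PERIPH_0_END when periph_0_removed is set.
 */
static inline size_t mdp_top_snapshot_blocks(size_t mdp_len,
					     int periph_0_removed,
					     struct snap_block out[2])
{
	if (!periph_0_removed) {
		out[0].offset = 0;
		out[0].len = mdp_len;
		return 1;
	}

	/* "top": everything before the removed register block */
	out[0].offset = 0;
	out[0].len = MODEL_PERIPH_0_START;

	/* "top_2": everything after the removed register block */
	out[1].offset = MODEL_PERIPH_0_END;
	out[1].len = mdp_len - MODEL_PERIPH_0_END;
	return 2;
}
```

The two ranges together cover the whole MDP TOP region except the 0x28 bytes between 0x380 and 0x3a8, which are never touched.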


Re: linux-next: build failure after merge of the drm-misc tree

2022-11-22 Thread David Airlie
On Wed, Nov 23, 2022 at 3:21 PM Stephen Rothwell  wrote:
>
> Hi all,
>
> On Thu, 17 Nov 2022 18:32:14 +1100 Stephen Rothwell  
> wrote:
> >
> > After merging the drm-misc tree, today's linux-next build (powerpc
> > ppc44x_defconfig) failed like this:
> >
> > ld: drivers/video/fbdev/core/fbmon.o: in function `fb_modesetting_disabled':
> > fbmon.c:(.text+0x1e4): multiple definition of `fb_modesetting_disabled'; 
> > drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> > ld: drivers/video/fbdev/core/fbcmap.o: in function 
> > `fb_modesetting_disabled':
> > fbcmap.c:(.text+0x478): multiple definition of `fb_modesetting_disabled'; 
> > drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> > ld: drivers/video/fbdev/core/fbsysfs.o: in function 
> > `fb_modesetting_disabled':
> > fbsysfs.c:(.text+0xb64): multiple definition of `fb_modesetting_disabled'; 
> > drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> > ld: drivers/video/fbdev/core/modedb.o: in function 
> > `fb_modesetting_disabled':
> > modedb.c:(.text+0x129c): multiple definition of `fb_modesetting_disabled'; 
> > drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> > ld: drivers/video/fbdev/core/fbcvt.o: in function `fb_modesetting_disabled':
> > fbcvt.c:(.text+0x0): multiple definition of `fb_modesetting_disabled'; 
> > drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> >
> > Caused by commit
> >
> >   0ba2fa8cbd29 ("fbdev: Add support for the nomodeset kernel parameter")
> >
> > This build does not have CONFIG_VIDEO_NOMODESET set.
> >
> > I applied the following patch for today.
> >
> > From 63f957a050c62478ed1348c5b204bc65c68df4d7 Mon Sep 17 00:00:00 2001
> > From: Stephen Rothwell 
> > Date: Thu, 17 Nov 2022 18:19:22 +1100
> > Subject: [PATCH] fix up for "fbdev: Add support for the nomodeset kernel 
> > parameter"
> >
> > Signed-off-by: Stephen Rothwell 
> > ---
> >  include/linux/fb.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/linux/fb.h b/include/linux/fb.h
> > index 3a822e4357b1..ea421724f733 100644
> > --- a/include/linux/fb.h
> > +++ b/include/linux/fb.h
> > @@ -807,7 +807,7 @@ extern int fb_find_mode(struct fb_var_screeninfo *var,
> >  #if defined(CONFIG_VIDEO_NOMODESET)
> >  bool fb_modesetting_disabled(const char *drvname);
> >  #else
> > -bool fb_modesetting_disabled(const char *drvname)
> > +static inline bool fb_modesetting_disabled(const char *drvname)
> >  {
> >   return false;
> >  }
> > --
> > 2.35.1
>
> This commit went away for a couple of linux-next releases, but now has
> reappeared in the drm tree :-(  What went wrong?

Nothing has gone wrong as such, just the drm-misc-next pull request was
sent on a regular weekly cadence, then I merged it a few days later.
The fix for this is still in the drm-misc-next queue for the next PR
which I will get this week.

Dave.



Re: linux-next: build failure after merge of the drm-misc tree

2022-11-22 Thread Stephen Rothwell
Hi all,

On Thu, 17 Nov 2022 18:32:14 +1100 Stephen Rothwell  
wrote:
>
> After merging the drm-misc tree, today's linux-next build (powerpc
> ppc44x_defconfig) failed like this:
> 
> ld: drivers/video/fbdev/core/fbmon.o: in function `fb_modesetting_disabled':
> fbmon.c:(.text+0x1e4): multiple definition of `fb_modesetting_disabled'; 
> drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> ld: drivers/video/fbdev/core/fbcmap.o: in function `fb_modesetting_disabled':
> fbcmap.c:(.text+0x478): multiple definition of `fb_modesetting_disabled'; 
> drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> ld: drivers/video/fbdev/core/fbsysfs.o: in function `fb_modesetting_disabled':
> fbsysfs.c:(.text+0xb64): multiple definition of `fb_modesetting_disabled'; 
> drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> ld: drivers/video/fbdev/core/modedb.o: in function `fb_modesetting_disabled':
> modedb.c:(.text+0x129c): multiple definition of `fb_modesetting_disabled'; 
> drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> ld: drivers/video/fbdev/core/fbcvt.o: in function `fb_modesetting_disabled':
> fbcvt.c:(.text+0x0): multiple definition of `fb_modesetting_disabled'; 
> drivers/video/fbdev/core/fbmem.o:fbmem.c:(.text+0x1bac): first defined here
> 
> Caused by commit
> 
>   0ba2fa8cbd29 ("fbdev: Add support for the nomodeset kernel parameter")
> 
> This build does not have CONFIG_VIDEO_NOMODESET set.
> 
> I applied the following patch for today.
> 
> From 63f957a050c62478ed1348c5b204bc65c68df4d7 Mon Sep 17 00:00:00 2001
> From: Stephen Rothwell 
> Date: Thu, 17 Nov 2022 18:19:22 +1100
> Subject: [PATCH] fix up for "fbdev: Add support for the nomodeset kernel 
> parameter"
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  include/linux/fb.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/fb.h b/include/linux/fb.h
> index 3a822e4357b1..ea421724f733 100644
> --- a/include/linux/fb.h
> +++ b/include/linux/fb.h
> @@ -807,7 +807,7 @@ extern int fb_find_mode(struct fb_var_screeninfo *var,
>  #if defined(CONFIG_VIDEO_NOMODESET)
>  bool fb_modesetting_disabled(const char *drvname);
>  #else
> -bool fb_modesetting_disabled(const char *drvname)
> +static inline bool fb_modesetting_disabled(const char *drvname)
>  {
>   return false;
>  }
> -- 
> 2.35.1

This commit went away for a couple of linux-next releases, but now has
reappeared in the drm tree :-(  What went wrong?

I have reapplied the above patch...
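
For context, the fix-up follows the usual pattern for config-dependent stubs in a shared header. A minimal stand-alone sketch is shown here, assuming CONFIG_VIDEO_NOMODESET is not defined so that the stub branch is taken. Without "static inline", every object file that includes the header would emit its own external definition of the function, which is what the linker errors above report.

```c
#include <stdbool.h>

/*
 * Sketch of the header pattern. CONFIG_VIDEO_NOMODESET is deliberately left
 * undefined here, so the stub branch below is compiled.
 */
#if defined(CONFIG_VIDEO_NOMODESET)
/* Real implementation lives in exactly one .c file. */
bool fb_modesetting_disabled(const char *drvname);
#else
/*
 * "static inline" gives each includer a private copy, so no external
 * symbol is emitted and the linker never sees duplicate definitions.
 */
static inline bool fb_modesetting_disabled(const char *drvname)
{
	(void)drvname;	/* unused in the stub */
	return false;
}
#endif
```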

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v4] drm/i915/mtl: Media GT and Render GT share common GGTT

2022-11-22 Thread Iddamsetty, Aravind



On 23-11-2022 05:29, Matt Roper wrote:
> On Tue, Nov 22, 2022 at 12:31:26PM +0530, Aravind Iddamsetty wrote:
>> On XE_LPM+ platforms the media engines are carved out into a separate
>> GT but have a common GGTMMADR address range, which essentially makes
>> the GGTT address space shared between the media and render GTs. As a
>> result, any update to the GGTT must invalidate the TLBs of all GTs sharing
>> it, and similarly any operation on the GGTT requiring an action on a GT
>> has to involve all GTs sharing it. setup_private_pat was being done on a
>> per-GGTT basis; as it doesn't touch any GGTT structures, move it to a
>> per-GT basis.
>>
>> BSPEC: 63834
>>
>> v2:
>> 1. Add details to commit msg
>> 2. includes fix for failure to add item to ggtt->gt_list, as suggested
>> by Lucas
>> 3. As ggtt_flush() is used only for GGTT, drop the i915_is_ggtt check
>> within it.
>> 4. setup_private_pat moved out of intel_gt_tiles_init
>>
>> v3:
>> 1. Move out for_each_gt from i915_driver.c (Jani Nikula)
>>
>> v4: drop using RCU primitives on ggtt->gt_list as it is not an RCU list
>> (Matt Roper)
>>
>> Cc: Matt Roper 
>> Signed-off-by: Aravind Iddamsetty 
> 
> Reviewed-by: Matt Roper 

Thanks Matt, could you also help with merging the change?

Regards,
Aravind.
> 
>> ---
>>  drivers/gpu/drm/i915/gt/intel_ggtt.c  | 54 +--
>>  drivers/gpu/drm/i915/gt/intel_gt.c| 13 +-
>>  drivers/gpu/drm/i915/gt/intel_gt_types.h  |  3 ++
>>  drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 ++
>>  drivers/gpu/drm/i915/i915_driver.c| 12 ++---
>>  drivers/gpu/drm/i915/i915_gem.c   |  2 +
>>  drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++--
>>  drivers/gpu/drm/i915/i915_vma.c   |  5 ++-
>>  drivers/gpu/drm/i915/selftests/i915_gem.c |  2 +
>>  9 files changed, 111 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
>> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> index 8145851ad23d..7644738b9cdb 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> @@ -8,6 +8,7 @@
>>  #include 
>>  #include 
>>  
>> +#include 
>>  #include 
>>  #include 
>>  
>> @@ -196,10 +197,13 @@ void i915_ggtt_suspend_vm(struct i915_address_space 
>> *vm)
>>  
>>  void i915_ggtt_suspend(struct i915_ggtt *ggtt)
>>  {
>> +struct intel_gt *gt;
>> +
>>  i915_ggtt_suspend_vm(&ggtt->vm);
>>  ggtt->invalidate(ggtt);
>>  
>> -intel_gt_check_and_clear_faults(ggtt->vm.gt);
>> +list_for_each_entry(gt, &ggtt->gt_list, ggtt_link)
>> +intel_gt_check_and_clear_faults(gt);
>>  }
>>  
>>  void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
>> @@ -225,16 +229,21 @@ static void gen8_ggtt_invalidate(struct i915_ggtt 
>> *ggtt)
>>  
>>  static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>>  {
>> -struct intel_uncore *uncore = ggtt->vm.gt->uncore;
>>  struct drm_i915_private *i915 = ggtt->vm.i915;
>>  
>>  gen8_ggtt_invalidate(ggtt);
>>  
>> -if (GRAPHICS_VER(i915) >= 12)
>> -intel_uncore_write_fw(uncore, GEN12_GUC_TLB_INV_CR,
>> -  GEN12_GUC_TLB_INV_CR_INVALIDATE);
>> -else
>> -intel_uncore_write_fw(uncore, GEN8_GTCR, GEN8_GTCR_INVALIDATE);
>> +if (GRAPHICS_VER(i915) >= 12) {
>> +struct intel_gt *gt;
>> +
>> +list_for_each_entry(gt, &ggtt->gt_list, ggtt_link)
>> +intel_uncore_write_fw(gt->uncore,
>> +  GEN12_GUC_TLB_INV_CR,
>> +  GEN12_GUC_TLB_INV_CR_INVALIDATE);
>> +} else {
>> +intel_uncore_write_fw(ggtt->vm.gt->uncore,
>> +  GEN8_GTCR, GEN8_GTCR_INVALIDATE);
>> +}
>>  }
>>  
>>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>> @@ -986,8 +995,6 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>>  
>>  ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>>  
>> -setup_private_pat(ggtt->vm.gt);
>> -
>>  return ggtt_probe_common(ggtt, size);
>>  }
>>  
>> @@ -1196,7 +1203,14 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, 
>> struct intel_gt *gt)
>>   */
>>  int i915_ggtt_probe_hw(struct drm_i915_private *i915)
>>  {
>> -int ret;
>> +struct intel_gt *gt;
>> +int ret, i;
>> +
>> +for_each_gt(gt, i915, i) {
>> +ret = intel_gt_assign_ggtt(gt);
>> +if (ret)
>> +return ret;
>> +}
>>  
>>  ret = ggtt_probe_hw(to_gt(i915)->ggtt, to_gt(i915));
>>  if (ret)
>> @@ -1208,6 +1222,19 @@ int i915_ggtt_probe_hw(struct drm_i915_private *i915)
>>  return 0;
>>  }
>>  
>> +struct i915_ggtt *i915_ggtt_create(struct drm_i915_private *i915)
>> +{
>> +struct i915_ggtt *ggtt;
>> +
>> +ggtt = drmm_kzalloc(&i915->drm, sizeof(*ggtt), GFP_KERNEL);
>> +if (!ggtt)
>> +return ERR_PTR(-ENOMEM);
>> +
>> +INIT_LIST_HEAD(&ggtt->gt_list);
>> +
>> +return ggtt;
>> +}
>> +
>>  int i915_gg

[PATCH -next] fbdev: offb: allow build when DRM_OFDRM=m

2022-11-22 Thread Randy Dunlap
Fix build when CONFIG_FB_OF=y and CONFIG_DRM_OFDRM=m.
When the latter symbol is =m, kconfig downgrades (limits) the 'select's
under FB_OF to modular (=m). This causes undefined symbol references:

powerpc64-linux-ld: drivers/video/fbdev/offb.o:(.data.rel.ro+0x58): undefined 
reference to `cfb_fillrect'
powerpc64-linux-ld: drivers/video/fbdev/offb.o:(.data.rel.ro+0x60): undefined 
reference to `cfb_copyarea'
powerpc64-linux-ld: drivers/video/fbdev/offb.o:(.data.rel.ro+0x68): undefined 
reference to `cfb_imageblit'

Fix this by allowing FB_OF any time that DRM_OFDRM != y so that the
selected FB_CFB_* symbols will become =y instead of =m.

In tristate logic (for DRM_OFDRM), this changes the dependency from
!DRM_OFDRM  == 2 - 1 == 1 => modular only (or disabled)
to (boolean)
DRM_OFDRM != y == y, allowing the 'select's to cause the
FB_CFB_* symbols to =y instead of =m.

Fixes: c8a17756c425 ("drm/ofdrm: Add ofdrm for Open Firmware framebuffers")
Signed-off-by: Randy Dunlap 
Suggested-by: Masahiro Yamada 
Cc: Thomas Zimmermann 
Cc: Michal Suchánek 
Cc: linuxppc-...@lists.ozlabs.org
Cc: Daniel Vetter 
Cc: Helge Deller 
Cc: linux-fb...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
---
 drivers/video/fbdev/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -- a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -455,7 +455,7 @@ config FB_ATARI
 config FB_OF
bool "Open Firmware frame buffer device support"
depends on (FB = y) && PPC && (!PPC_PSERIES || PCI)
-   depends on !DRM_OFDRM
+   depends on DRM_OFDRM != y
select APERTURE_HELPERS
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
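
The tristate arithmetic in the commit message above can be checked with a small C model. This is only a sketch of the Kconfig expression rules (n = 0, m = 1, y = 2; !A is 2 - A; A && B is the minimum; A != B yields y or n), and the tri_*() helper names are hypothetical. It assumes that the value forced by a 'select' is limited by the selecting symbol's dependencies, as the commit message describes.

```c
/* Kconfig tristate values: n = 0, m = 1, y = 2. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* !A evaluates to 2 - A, so !m is still m. */
static inline enum tristate tri_not(enum tristate a)
{
	return (enum tristate)(2 - a);
}

/* A && B evaluates to min(A, B). */
static inline enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* A != B is a boolean comparison: y if the values differ, n otherwise. */
static inline enum tristate tri_neq(enum tristate a, enum tristate b)
{
	return a != b ? TRI_Y : TRI_N;
}

/*
 * Value forced onto a selected symbol (e.g. FB_CFB_FILLRECT): the selecting
 * symbol's value limited by its dependency expression.
 */
static inline enum tristate tri_selected(enum tristate selector,
					 enum tristate dep)
{
	return tri_and(selector, dep);
}
```

With FB_OF=y and DRM_OFDRM=m, the old dependency gives tri_selected(TRI_Y, tri_not(TRI_M)), which is m, while the new one gives tri_selected(TRI_Y, tri_neq(TRI_M, TRI_Y)), which is y.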


[PATCH v9 10/11] drm/virtio: Support memory shrinking

2022-11-22 Thread Dmitry Osipenko
Support the generic drm-shmem memory shrinker and add a new madvise IOCTL to
the VirtIO-GPU driver. The BO cache manager of the Mesa driver will mark BOs as
"don't need" using the new IOCTL to let the shrinker purge the marked BOs on
OOM. The shrinker will also evict unpurgeable shmem BOs from memory if the
guest supports a swap file or partition.

Signed-off-by: Daniel Almeida 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h|  18 +++-
 drivers/gpu/drm/virtio/virtgpu_gem.c|  52 ++
 drivers/gpu/drm/virtio/virtgpu_ioctl.c  |  37 +++
 drivers/gpu/drm/virtio/virtgpu_kms.c|   8 ++
 drivers/gpu/drm/virtio/virtgpu_object.c | 132 +++-
 drivers/gpu/drm/virtio/virtgpu_plane.c  |  22 +++-
 drivers/gpu/drm/virtio/virtgpu_vq.c |  40 +++
 include/uapi/drm/virtgpu_drm.h  |  14 +++
 8 files changed, 293 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index b7a64c7dcc2c..b7e9ca25a627 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -89,6 +89,7 @@ struct virtio_gpu_object {
uint32_t hw_res_handle;
bool dumb;
bool created;
+   bool detached;
bool host3d_blob, guest_blob;
uint32_t blob_mem, blob_flags;
 
@@ -274,7 +275,7 @@ struct virtio_gpu_fpriv {
 };
 
 /* virtgpu_ioctl.c */
-#define DRM_VIRTIO_NUM_IOCTLS 12
+#define DRM_VIRTIO_NUM_IOCTLS 13
 extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS];
 void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file);
 
@@ -310,6 +311,10 @@ void virtio_gpu_array_put_free(struct 
virtio_gpu_object_array *objs);
 void virtio_gpu_array_put_free_delayed(struct virtio_gpu_device *vgdev,
   struct virtio_gpu_object_array *objs);
 void virtio_gpu_array_put_free_work(struct work_struct *work);
+int virtio_gpu_array_prepare(struct virtio_gpu_device *vgdev,
+struct virtio_gpu_object_array *objs);
+int virtio_gpu_gem_host_mem_release(struct virtio_gpu_object *bo);
+int virtio_gpu_gem_madvise(struct virtio_gpu_object *obj, int madv);
 
 /* virtgpu_vq.c */
 int virtio_gpu_alloc_vbufs(struct virtio_gpu_device *vgdev);
@@ -321,6 +326,8 @@ void virtio_gpu_cmd_create_resource(struct 
virtio_gpu_device *vgdev,
struct virtio_gpu_fence *fence);
 void virtio_gpu_cmd_unref_resource(struct virtio_gpu_device *vgdev,
   struct virtio_gpu_object *bo);
+int virtio_gpu_cmd_release_resource(struct virtio_gpu_device *vgdev,
+   struct virtio_gpu_object *bo);
 void virtio_gpu_cmd_transfer_to_host_2d(struct virtio_gpu_device *vgdev,
uint64_t offset,
uint32_t width, uint32_t height,
@@ -341,6 +348,9 @@ void virtio_gpu_object_attach(struct virtio_gpu_device 
*vgdev,
  struct virtio_gpu_object *obj,
  struct virtio_gpu_mem_entry *ents,
  unsigned int nents);
+void virtio_gpu_object_detach(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object *obj,
+ struct virtio_gpu_fence *fence);
 int virtio_gpu_attach_status_page(struct virtio_gpu_device *vgdev);
 int virtio_gpu_detach_status_page(struct virtio_gpu_device *vgdev);
 void virtio_gpu_cursor_ping(struct virtio_gpu_device *vgdev,
@@ -453,6 +463,8 @@ int virtio_gpu_object_create(struct virtio_gpu_device 
*vgdev,
 
 bool virtio_gpu_is_shmem(struct virtio_gpu_object *bo);
 
+int virtio_gpu_reattach_shmem_object(struct virtio_gpu_object *bo);
+
 int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
   uint32_t *resid);
 /* virtgpu_prime.c */
@@ -483,4 +495,8 @@ void virtio_gpu_vram_unmap_dma_buf(struct device *dev,
   struct sg_table *sgt,
   enum dma_data_direction dir);
 
+/* virtgpu_gem_shrinker.c */
+int virtio_gpu_gem_shrinker_init(struct virtio_gpu_device *vgdev);
+void virtio_gpu_gem_shrinker_fini(struct virtio_gpu_device *vgdev);
+
 #endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c 
b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 7db48d17ee3a..8f65911b1e99 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -294,3 +294,55 @@ void virtio_gpu_array_put_free_work(struct work_struct 
*work)
}
spin_unlock(&vgdev->obj_free_lock);
 }
+
+int virtio_gpu_array_prepare(struct virtio_gpu_device *vgdev,
+struct virtio_gpu_object_array *objs)
+{
+   struct virtio_gpu_object *bo;
+   int ret = 0;
+   u32 i;
+
+   for (i = 0; i < objs->nents; i++) {
+   bo = gem_to_virtio_gpu_obj(objs->objs[i]);
+
+   if (virtio_g

[PATCH v9 02/11] drm/panfrost: Don't sync rpm suspension after mmu flushing

2022-11-22 Thread Dmitry Osipenko
Lockdep warns about a potential circular locking dependency of devfreq
with fs_reclaim, caused by immediate device suspension when a mapping is
released by the shrinker. Fix it by doing the suspension asynchronously.

Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index e246d914e7f6..99a0975f6f03 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -273,7 +273,7 @@ static void panfrost_mmu_flush_range(struct panfrost_device 
*pfdev,
if (pm_runtime_active(pfdev->dev))
mmu_hw_do_operation(pfdev, mmu, iova, size, 
AS_COMMAND_FLUSH_PT);
 
-   pm_runtime_put_sync_autosuspend(pfdev->dev);
+   pm_runtime_put_autosuspend(pfdev->dev);
 }
 
 static int mmu_map_sg(struct panfrost_device *pfdev, struct panfrost_mmu *mmu,
-- 
2.38.1



[PATCH v9 01/11] drm/msm/gem: Prevent blocking within shrinker loop

2022-11-22 Thread Dmitry Osipenko
Consider this scenario:

1. APP1 continuously creates lots of small GEMs
2. APP2 triggers `drop_caches`
3. Shrinker starts to evict APP1 GEMs, while APP1 produces new purgeable
   GEMs
4. msm_gem_shrinker_scan() returns a non-zero number of freed pages
   and causes the shrinker to try to shrink more
5. msm_gem_shrinker_scan() returns a non-zero number of freed pages again,
   go to 4
6. APP2 is blocked in `drop_caches` until APP1 stops producing
   purgeable GEMs

To prevent this blocking scenario, check the number of remaining pages
that the GPU shrinker couldn't release due to GEM locking contention
or shrinking rejection. If there are no remaining pages left to shrink,
then there is no need to free up more pages and the shrinker may break out
of the loop.

This problem was found during shrinker/madvise IOCTL testing of
virtio-gpu driver. The MSM driver is affected in the same way.
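
The stop condition can be summarised with a small stand-alone sketch. It mirrors the return expression of the msm_gem_shrinker_scan() hunk below; MODEL_SHRINK_STOP and model_scan_result() are hypothetical names, with the sentinel value assumed to match the kernel's SHRINK_STOP (~0UL).

```c
/* Stand-in for the kernel's SHRINK_STOP sentinel (assumed to be ~0UL). */
#define MODEL_SHRINK_STOP (~0UL)

/*
 * Decide what the scan callback reports to the shrinker core:
 * - freed:     pages released during this scan
 * - remaining: pages that could not be released (lock contention or
 *              rejection) and are therefore still worth another pass
 *
 * Only ask for another pass when progress was made and there is known
 * work left; otherwise tell the core to stop.
 */
static inline unsigned long model_scan_result(unsigned long freed,
					      unsigned long remaining)
{
	return (freed > 0 && remaining > 0) ? freed : MODEL_SHRINK_STOP;
}
```

In the scenario above, APP1's freshly created GEMs are freed without contention, so remaining stays 0 and the scan reports the stop value instead of looping back to step 4.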

Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem.c  | 9 +++--
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 8 ++--
 include/drm/drm_gem.h  | 4 +++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index b8db675e7fb5..299bca1390aa 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1375,10 +1375,13 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail);
  *
  * @lru: The LRU to scan
  * @nr_to_scan: The number of pages to try to reclaim
+ * @remaining: The number of pages left to reclaim
  * @shrink: Callback to try to shrink/reclaim the object.
  */
 unsigned long
-drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+drm_gem_lru_scan(struct drm_gem_lru *lru,
+unsigned int nr_to_scan,
+unsigned long *remaining,
 bool (*shrink)(struct drm_gem_object *obj))
 {
struct drm_gem_lru still_in_lru;
@@ -1417,8 +1420,10 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned 
nr_to_scan,
 * hit shrinker in response to trying to get backing pages
 * for this obj (ie. while it's lock is already held)
 */
-   if (!dma_resv_trylock(obj->resv))
+   if (!dma_resv_trylock(obj->resv)) {
+   *remaining += obj->size >> PAGE_SHIFT;
goto tail;
+   }
 
if (shrink(obj)) {
freed += obj->size >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c 
b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 1de14e67f96b..4c8b0ab61ce4 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -116,12 +116,14 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct 
shrink_control *sc)
};
long nr = sc->nr_to_scan;
unsigned long freed = 0;
+   unsigned long remaining = 0;
 
for (unsigned i = 0; (nr > 0) && (i < ARRAY_SIZE(stages)); i++) {
if (!stages[i].cond)
continue;
stages[i].freed =
-   drm_gem_lru_scan(stages[i].lru, nr, stages[i].shrink);
+   drm_gem_lru_scan(stages[i].lru, nr, &remaining,
+stages[i].shrink);
nr -= stages[i].freed;
freed += stages[i].freed;
}
@@ -132,7 +134,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct 
shrink_control *sc)
 stages[3].freed);
}
 
-   return (freed > 0) ? freed : SHRINK_STOP;
+   return (freed > 0 && remaining > 0) ? freed : SHRINK_STOP;
 }
 
 #ifdef CONFIG_DEBUG_FS
@@ -182,10 +184,12 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned 
long event, void *ptr)
NULL,
};
unsigned idx, unmapped = 0;
+   unsigned long remaining = 0;
 
for (idx = 0; lrus[idx] && unmapped < vmap_shrink_limit; idx++) {
unmapped += drm_gem_lru_scan(lrus[idx],
 vmap_shrink_limit - unmapped,
+&remaining,
 vmap_shrink);
}
 
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index a17c2f903f81..b46ade812443 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -475,7 +475,9 @@ int drm_gem_dumb_map_offset(struct drm_file *file, struct 
drm_device *dev,
 void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock);
 void drm_gem_lru_remove(struct drm_gem_object *obj);
 void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object 
*obj);
-unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
+  unsigned int nr_to_scan,
+  unsigned long *remaining,
   bool (*shrink)(struct drm_gem_object *obj));
 
 #

[PATCH v9 08/11] drm/shmem-helper: Add memory shrinker

2022-11-22 Thread Dmitry Osipenko
Introduce a common drm-shmem shrinker for DRM drivers.

To start using the drm-shmem shrinker, drivers should do the following:

1. Implement the evict() callback of the GEM object, where the driver should
   check whether the object is purgeable or evictable using drm-shmem helpers
   and perform the shrinking action

2. Initialize drm-shmem internals using drmm_gem_shmem_init(drm_device),
   which will register the drm-shmem shrinker

3. Implement a madvise IOCTL that will use drm_gem_shmem_madvise()

Signed-off-by: Daniel Almeida 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c| 465 --
 .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   9 +-
 include/drm/drm_device.h  |  10 +-
 include/drm/drm_gem_shmem_helper.h|  61 ++-
 4 files changed, 492 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index b4aa2d253f8e..705bd32a4c92 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -126,6 +127,57 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct 
drm_device *dev, size_t
 }
 EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
 
+static void drm_gem_shmem_resv_assert_held(struct drm_gem_shmem_object *shmem)
+{
+   /*
+* Destroying the object is a special case.. drm_gem_shmem_free()
+* calls many things that WARN_ON if the obj lock is not held.  But
+* acquiring the obj lock in drm_gem_shmem_free() can cause a locking
+* order inversion between reservation_ww_class_mutex and fs_reclaim.
+*
+* This deadlock is not actually possible, because no one should
+* be already holding the lock when drm_gem_shmem_free() is called.
+* Unfortunately lockdep is not aware of this detail.  So when the
+* refcount drops to zero, we pretend it is already locked.
+*/
+   if (kref_read(&shmem->base.refcount))
+   dma_resv_assert_held(shmem->base.resv);
+}
+
+static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
+{
+   dma_resv_assert_held(shmem->base.resv);
+
+   return (shmem->madv >= 0) && shmem->base.funcs->evict &&
+   shmem->pages_use_count && !shmem->pages_pin_count &&
+   !shmem->base.dma_buf && !shmem->base.import_attach &&
+   shmem->sgt && !shmem->evicted;
+}
+
+static void
+drm_gem_shmem_update_pages_state(struct drm_gem_shmem_object *shmem)
+{
+   struct drm_gem_object *obj = &shmem->base;
+   struct drm_gem_shmem *shmem_mm = obj->dev->shmem_mm;
+   struct drm_gem_shmem_shrinker *gem_shrinker = &shmem_mm->shrinker;
+
+   drm_gem_shmem_resv_assert_held(shmem);
+
+   if (!gem_shrinker || obj->import_attach)
+   return;
+
+   if (shmem->madv < 0)
+   drm_gem_lru_remove(&shmem->base);
+   else if (drm_gem_shmem_is_evictable(shmem) || 
drm_gem_shmem_is_purgeable(shmem))
+   drm_gem_lru_move_tail(&gem_shrinker->lru_evictable, 
&shmem->base);
+   else if (shmem->evicted)
+   drm_gem_lru_move_tail(&gem_shrinker->lru_evicted, &shmem->base);
+   else if (!shmem->pages)
+   drm_gem_lru_remove(&shmem->base);
+   else
+   drm_gem_lru_move_tail(&gem_shrinker->lru_pinned, &shmem->base);
+}
+
 /**
  * drm_gem_shmem_free - Free resources associated with a shmem GEM object
  * @shmem: shmem GEM object to free
@@ -140,7 +192,8 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
if (obj->import_attach) {
drm_prime_gem_destroy(obj, shmem->sgt);
} else {
-   dma_resv_lock(shmem->base.resv, NULL);
+   /* take out shmem GEM object from the memory shrinker */
+   drm_gem_shmem_madvise(shmem, -1);
 
drm_WARN_ON(obj->dev, shmem->vmap_use_count);
 
@@ -150,12 +203,10 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
*shmem)
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
}
-   if (shmem->pages)
+   if (shmem->pages_use_count)
drm_gem_shmem_put_pages(shmem);
 
drm_WARN_ON(obj->dev, shmem->pages_use_count);
-
-   dma_resv_unlock(shmem->base.resv);
}
 
drm_gem_object_release(obj);
@@ -163,19 +214,31 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
*shmem)
 }
 EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
 
-static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
+static int
+drm_gem_shmem_acquire_pages(struct drm_gem_shmem_object *shmem)
 {
struct drm_gem_object *obj = &shmem->base;
struct page **pages;
 
-   if (shmem->pages_use_count++ > 0)
+   dma_resv_assert_held(shmem->base.resv);
+
+   if (shmem->madv < 0) {
+  

[PATCH v9 07/11] drm/shmem-helper: Switch to reservation lock

2022-11-22 Thread Dmitry Osipenko
Replace all drm-shmem locks with a GEM reservation lock. This makes locking
consistent with the dma-buf locking convention, where importers are responsible
for holding the reservation lock for all operations performed over dma-bufs,
preventing deadlocks between dma-buf importers and exporters.

Suggested-by: Daniel Vetter 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c| 183 +++---
 drivers/gpu/drm/lima/lima_gem.c   |   8 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c   |   7 +-
 .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   6 +-
 drivers/gpu/drm/panfrost/panfrost_mmu.c   |  19 +-
 include/drm/drm_gem_shmem_helper.h|  14 +-
 6 files changed, 94 insertions(+), 143 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index ba9d9c5f1064..b4aa2d253f8e 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -86,8 +86,6 @@ __drm_gem_shmem_create(struct drm_device *dev, size_t size, 
bool private)
if (ret)
goto err_release;
 
-   mutex_init(&shmem->pages_lock);
-   mutex_init(&shmem->vmap_lock);
INIT_LIST_HEAD(&shmem->madv_list);
 
if (!private) {
@@ -139,11 +137,13 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
*shmem)
 {
struct drm_gem_object *obj = &shmem->base;
 
-   drm_WARN_ON(obj->dev, shmem->vmap_use_count);
-
if (obj->import_attach) {
drm_prime_gem_destroy(obj, shmem->sgt);
} else {
+   dma_resv_lock(shmem->base.resv, NULL);
+
+   drm_WARN_ON(obj->dev, shmem->vmap_use_count);
+
if (shmem->sgt) {
dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
  DMA_BIDIRECTIONAL, 0);
@@ -152,18 +152,18 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
*shmem)
}
if (shmem->pages)
drm_gem_shmem_put_pages(shmem);
-   }
 
-   drm_WARN_ON(obj->dev, shmem->pages_use_count);
+   drm_WARN_ON(obj->dev, shmem->pages_use_count);
+
+   dma_resv_unlock(shmem->base.resv);
+   }
 
drm_gem_object_release(obj);
-   mutex_destroy(&shmem->pages_lock);
-   mutex_destroy(&shmem->vmap_lock);
kfree(shmem);
 }
 EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
 
-static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
+static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
 {
struct drm_gem_object *obj = &shmem->base;
struct page **pages;
@@ -195,35 +195,16 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
 }
 
 /*
- * drm_gem_shmem_get_pages - Allocate backing pages for a shmem GEM object
+ * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a shmem GEM object
  * @shmem: shmem GEM object
  *
- * This function makes sure that backing pages exists for the shmem GEM object
- * and increases the use count.
- *
- * Returns:
- * 0 on success or a negative error code on failure.
+ * This function decreases the use count and puts the backing pages when use drops to zero.
  */
-int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
+void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
 {
struct drm_gem_object *obj = &shmem->base;
-   int ret;
 
-   drm_WARN_ON(obj->dev, obj->import_attach);
-
-   ret = mutex_lock_interruptible(&shmem->pages_lock);
-   if (ret)
-   return ret;
-   ret = drm_gem_shmem_get_pages_locked(shmem);
-   mutex_unlock(&shmem->pages_lock);
-
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_shmem_get_pages);
-
-static void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
-{
-   struct drm_gem_object *obj = &shmem->base;
+   dma_resv_assert_held(shmem->base.resv);
 
if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
return;
@@ -241,19 +222,6 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
  shmem->pages_mark_accessed_on_put);
shmem->pages = NULL;
 }
-
-/*
- * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a shmem GEM object
- * @shmem: shmem GEM object
- *
- * This function decreases the use count and puts the backing pages when use drops to zero.
- */
-void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
-{
-   mutex_lock(&shmem->pages_lock);
-   drm_gem_shmem_put_pages_locked(shmem);
-   mutex_unlock(&shmem->pages_lock);
-}
 EXPORT_SYMBOL(drm_gem_shmem_put_pages);
 
 /**
@@ -270,6 +238,8 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
 {
struct drm_gem_object *obj = &shmem->base;
 
+   dma_resv_assert_held(shmem->base.resv);
+
drm_WARN_ON(obj->dev, obj->import_attach);
 
return drm_

[PATCH v9 09/11] drm/gem: Add drm_gem_pin_unlocked()

2022-11-22 Thread Dmitry Osipenko
Add unlocked variants of the drm_gem_pin() and drm_gem_unpin() functions. These
new helpers take care of GEM dma-reservation locking for DRM drivers.

The VirtIO-GPU driver will use these helpers to pin shmem framebuffers,
preventing them from being evicted during scanout.

Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem.c | 29 +
 include/drm/drm_gem.h |  3 +++
 2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index c0510b8080d2..0ac5f69ee292 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1154,6 +1154,35 @@ void drm_gem_unpin(struct drm_gem_object *obj)
obj->funcs->unpin(obj);
 }
 
+int drm_gem_pin_unlocked(struct drm_gem_object *obj)
+{
+   int ret;
+
+   if (!obj->funcs->pin)
+   return 0;
+
+   ret = dma_resv_lock_interruptible(obj->resv, NULL);
+   if (ret)
+   return ret;
+
+   ret = obj->funcs->pin(obj);
+   dma_resv_unlock(obj->resv);
+
+   return ret;
+}
+EXPORT_SYMBOL(drm_gem_pin_unlocked);
+
+void drm_gem_unpin_unlocked(struct drm_gem_object *obj)
+{
+   if (!obj->funcs->unpin)
+   return;
+
+   dma_resv_lock(obj->resv, NULL);
+   obj->funcs->unpin(obj);
+   dma_resv_unlock(obj->resv);
+}
+EXPORT_SYMBOL(drm_gem_unpin_unlocked);
+
 int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
 {
int ret;
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index add1371453f0..8a3a07eae8fa 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -492,4 +492,7 @@ unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
 
 bool drm_gem_object_evict(struct drm_gem_object *obj);
 
+int drm_gem_pin_unlocked(struct drm_gem_object *obj);
+void drm_gem_unpin_unlocked(struct drm_gem_object *obj);
+
 #endif /* __DRM_GEM_H__ */
-- 
2.38.1
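
The split between locked callbacks and unlocked wrappers described in this patch can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the DRM implementation: `toy_lock`, `toy_obj` and every `toy_*` function are hypothetical stand-ins, and a plain flag replaces the real dma-resv lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reservation lock: it only tracks "held". */
struct toy_lock {
	bool held;
};

struct toy_obj {
	struct toy_lock resv;
	int pin_count;
	/* Optional callback, mirroring obj->funcs->pin in the patch. */
	int (*pin)(struct toy_obj *obj);
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* single-threaded model: no recursion */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* "Locked" callback: the caller must already hold the lock. */
static int toy_pin_locked(struct toy_obj *obj)
{
	assert(obj->resv.held);
	obj->pin_count++;
	return 0;
}

/* "Unlocked" variant: takes and drops the lock around the callback. */
static int toy_pin_unlocked(struct toy_obj *obj)
{
	int ret;

	if (!obj->pin)
		return 0;	/* nothing to do when no callback is set */

	toy_lock_acquire(&obj->resv);
	ret = obj->pin(obj);
	toy_lock_release(&obj->resv);

	return ret;
}
```

A caller that does not hold the lock would set `.pin = toy_pin_locked` on the object and call `toy_pin_unlocked()`; a caller that already holds it would invoke the callback directly.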



[PATCH v9 06/11] drm/shmem-helper: Don't use vmap_use_count for dma-bufs

2022-11-22 Thread Dmitry Osipenko
The DMA-buf core has its own refcounting of vmaps; use it instead of drm-shmem
counting. This change prepares drm-shmem for the addition of memory shrinker
support, where drm-shmem will use a single dma-buf reservation lock for
all operations performed over dma-bufs.

Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 35 +++---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 5504eeb61099..ba9d9c5f1064 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -299,24 +299,22 @@ static int drm_gem_shmem_vmap_locked(struct 
drm_gem_shmem_object *shmem,
struct drm_gem_object *obj = &shmem->base;
int ret = 0;
 
-   if (shmem->vmap_use_count++ > 0) {
-   iosys_map_set_vaddr(map, shmem->vaddr);
-   return 0;
-   }
-
if (obj->import_attach) {
ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
if (!ret) {
if (drm_WARN_ON(obj->dev, map->is_iomem)) {
dma_buf_vunmap(obj->import_attach->dmabuf, map);
-   ret = -EIO;
-   goto err_put_pages;
+   return -EIO;
}
-   shmem->vaddr = map->vaddr;
}
} else {
pgprot_t prot = PAGE_KERNEL;
 
+   if (shmem->vmap_use_count++ > 0) {
+   iosys_map_set_vaddr(map, shmem->vaddr);
+   return 0;
+   }
+
ret = drm_gem_shmem_get_pages(shmem);
if (ret)
goto err_zero_use;
@@ -382,15 +380,15 @@ static void drm_gem_shmem_vunmap_locked(struct 
drm_gem_shmem_object *shmem,
 {
struct drm_gem_object *obj = &shmem->base;
 
-   if (drm_WARN_ON_ONCE(obj->dev, !shmem->vmap_use_count))
-   return;
-
-   if (--shmem->vmap_use_count > 0)
-   return;
-
if (obj->import_attach) {
dma_buf_vunmap(obj->import_attach->dmabuf, map);
} else {
+   if (drm_WARN_ON_ONCE(obj->dev, !shmem->vmap_use_count))
+   return;
+
+   if (--shmem->vmap_use_count > 0)
+   return;
+
vunmap(shmem->vaddr);
drm_gem_shmem_put_pages(shmem);
}
@@ -652,7 +650,14 @@ void drm_gem_shmem_print_info(const struct 
drm_gem_shmem_object *shmem,
  struct drm_printer *p, unsigned int indent)
 {
 drm_printf_indent(p, indent, "pages_use_count=%u\n", shmem->pages_use_count);
-   drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
+
+   if (shmem->base.import_attach)
+   drm_printf_indent(p, indent, "vmap_use_count=%u\n",
+ shmem->base.dma_buf->vmapping_counter);
+   else
+   drm_printf_indent(p, indent, "vmap_use_count=%u\n",
+ shmem->vmap_use_count);
+
drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
 }
 EXPORT_SYMBOL(drm_gem_shmem_print_info);
-- 
2.38.1
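
The use-count logic that this patch keeps for non-imported objects can be sketched in userspace as follows. This is an illustrative model only: `toy_buf`, `toy_vmap()` and `toy_vunmap()` are hypothetical names, and `malloc()` stands in for the real kernel mapping call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_buf {
	unsigned int vmap_use_count;
	void *vaddr;
};

/* Returns the mapping, creating it only for the first user. */
static void *toy_vmap(struct toy_buf *buf)
{
	if (buf->vmap_use_count++ > 0)
		return buf->vaddr;		/* already mapped: reuse */

	buf->vaddr = malloc(64);		/* stand-in for the real mapping */
	if (!buf->vaddr)
		buf->vmap_use_count = 0;	/* undo the count on failure */

	return buf->vaddr;
}

/* Drops one reference and tears the mapping down with the last user. */
static void toy_vunmap(struct toy_buf *buf)
{
	if (buf->vmap_use_count == 0)
		return;				/* unbalanced call: ignore */

	if (--buf->vmap_use_count > 0)
		return;				/* other users remain */

	free(buf->vaddr);
	buf->vaddr = NULL;
}
```

Two successive `toy_vmap()` calls return the same address, and only the second `toy_vunmap()` releases it. For imported objects the patch delegates this bookkeeping to the dma-buf core, which the sketch does not model.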



[PATCH v9 11/11] drm/panfrost: Switch to generic memory shrinker

2022-11-22 Thread Dmitry Osipenko
Replace Panfrost's custom memory shrinker with a common drm-shmem
memory shrinker.

Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c|   2 -
 drivers/gpu/drm/panfrost/Makefile |   1 -
 drivers/gpu/drm/panfrost/panfrost_device.h|   4 -
 drivers/gpu/drm/panfrost/panfrost_drv.c   |  27 ++--
 drivers/gpu/drm/panfrost/panfrost_gem.c   |  30 ++--
 drivers/gpu/drm/panfrost/panfrost_gem.h   |   9 --
 .../gpu/drm/panfrost/panfrost_gem_shrinker.c  | 129 --
 drivers/gpu/drm/panfrost/panfrost_job.c   |  18 ++-
 include/drm/drm_gem_shmem_helper.h|   7 -
 9 files changed, 47 insertions(+), 180 deletions(-)
 delete mode 100644 drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 705bd32a4c92..70b25585cead 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -87,8 +87,6 @@ __drm_gem_shmem_create(struct drm_device *dev, size_t size, 
bool private)
if (ret)
goto err_release;
 
-   INIT_LIST_HEAD(&shmem->madv_list);
-
if (!private) {
/*
 * Our buffers are kept pinned, so allocating them
diff --git a/drivers/gpu/drm/panfrost/Makefile 
b/drivers/gpu/drm/panfrost/Makefile
index 7da2b3f02ed9..11622e22cf15 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -5,7 +5,6 @@ panfrost-y := \
panfrost_device.o \
panfrost_devfreq.o \
panfrost_gem.o \
-   panfrost_gem_shrinker.o \
panfrost_gpu.o \
panfrost_job.o \
panfrost_mmu.o \
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index 8b25278f34c8..fe04b21fc044 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -115,10 +115,6 @@ struct panfrost_device {
atomic_t pending;
} reset;
 
-   struct mutex shrinker_lock;
-   struct list_head shrinker_list;
-   struct shrinker shrinker;
-
struct panfrost_devfreq pfdevfreq;
 };
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 94b8e6de34b8..e6d293cd3494 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -160,7 +160,6 @@ panfrost_lookup_bos(struct drm_device *dev,
break;
}
 
-   atomic_inc(&bo->gpu_usecount);
job->mappings[i] = mapping;
}
 
@@ -392,7 +391,6 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, 
void *data,
 {
struct panfrost_file_priv *priv = file_priv->driver_priv;
struct drm_panfrost_madvise *args = data;
-   struct panfrost_device *pfdev = dev->dev_private;
struct drm_gem_object *gem_obj;
struct panfrost_gem_object *bo;
int ret = 0;
@@ -405,11 +403,15 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, 
void *data,
 
bo = to_panfrost_bo(gem_obj);
 
+   if (bo->is_heap) {
+   args->retained = 1;
+   goto out_put_object;
+   }
+
ret = dma_resv_lock_interruptible(bo->base.base.resv, NULL);
if (ret)
goto out_put_object;
 
-   mutex_lock(&pfdev->shrinker_lock);
mutex_lock(&bo->mappings.lock);
if (args->madv == PANFROST_MADV_DONTNEED) {
struct panfrost_gem_mapping *first;
@@ -435,17 +437,8 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, 
void *data,
 
args->retained = drm_gem_shmem_madvise(&bo->base, args->madv);
 
-   if (args->retained) {
-   if (args->madv == PANFROST_MADV_DONTNEED)
-   list_move_tail(&bo->base.madv_list,
-  &pfdev->shrinker_list);
-   else if (args->madv == PANFROST_MADV_WILLNEED)
-   list_del_init(&bo->base.madv_list);
-   }
-
 out_unlock_mappings:
mutex_unlock(&bo->mappings.lock);
-   mutex_unlock(&pfdev->shrinker_lock);
dma_resv_unlock(bo->base.base.resv);
 out_put_object:
drm_gem_object_put(gem_obj);
@@ -577,9 +570,6 @@ static int panfrost_probe(struct platform_device *pdev)
ddev->dev_private = pfdev;
pfdev->ddev = ddev;
 
-   mutex_init(&pfdev->shrinker_lock);
-   INIT_LIST_HEAD(&pfdev->shrinker_list);
-
err = panfrost_device_init(pfdev);
if (err) {
if (err != -EPROBE_DEFER)
@@ -601,10 +591,14 @@ static int panfrost_probe(struct platform_device *pdev)
if (err < 0)
goto err_out1;
 
-   panfrost_gem_shrinker_init(ddev);
+   err = drmm_gem_shmem_init(ddev);
+   if (err < 0)
+   goto err_out2;
 
return 0;
 
+err_out2:
+   drm_dev_unregister(ddev);
 err_out1:
  

[PATCH v9 04/11] drm/shmem: Put booleans in the end of struct drm_gem_shmem_object

2022-11-22 Thread Dmitry Osipenko
Group all 1-bit boolean members of struct drm_gem_shmem_object at the end
of the structure, allowing the compiler to pack data better and making the
code look more consistent.

Suggested-by: Thomas Zimmermann 
Signed-off-by: Dmitry Osipenko 
---
 include/drm/drm_gem_shmem_helper.h | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/drm/drm_gem_shmem_helper.h 
b/include/drm/drm_gem_shmem_helper.h
index a2201b2488c5..5994fed5e327 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -60,20 +60,6 @@ struct drm_gem_shmem_object {
 */
struct list_head madv_list;
 
-   /**
-* @pages_mark_dirty_on_put:
-*
-* Mark pages as dirty when they are put.
-*/
-   unsigned int pages_mark_dirty_on_put: 1;
-
-   /**
-* @pages_mark_accessed_on_put:
-*
-* Mark pages as accessed when they are put.
-*/
-   unsigned int pages_mark_accessed_on_put : 1;
-
/**
 * @sgt: Scatter/gather table for imported PRIME buffers
 */
@@ -97,10 +83,24 @@ struct drm_gem_shmem_object {
 */
unsigned int vmap_use_count;
 
+   /**
+* @pages_mark_dirty_on_put:
+*
+* Mark pages as dirty when they are put.
+*/
+   bool pages_mark_dirty_on_put : 1;
+
+   /**
+* @pages_mark_accessed_on_put:
+*
+* Mark pages as accessed when they are put.
+*/
+   bool pages_mark_accessed_on_put : 1;
+
/**
 * @map_wc: map object write-combined (instead of using shmem defaults).
 */
-   bool map_wc;
+   bool map_wc : 1;
 };
 
 #define to_drm_gem_shmem_obj(obj) \
-- 
2.38.1
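
The packing effect mentioned in the commit message can be observed with a small standalone comparison. The two structs below are hypothetical and deliberately exaggerated (the flags are scattered between pointers, which is not the exact pre-patch layout); the resulting sizes depend on the ABI, so the sketch only assumes that grouping never makes the struct larger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout with flags scattered between pointer members. */
struct toy_scattered {
	unsigned int dirty_on_put : 1;
	void *sgt;
	unsigned int accessed_on_put : 1;
	void *vaddr;
	bool map_wc;
};

/* Same members with the 1-bit flags grouped at the end. */
struct toy_grouped {
	void *sgt;
	void *vaddr;
	bool dirty_on_put : 1;
	bool accessed_on_put : 1;
	bool map_wc : 1;
};

/* Builds a grouped object; each flag occupies a single bit. */
static struct toy_grouped toy_make_flags(bool dirty, bool accessed, bool wc)
{
	struct toy_grouped g = { NULL, NULL, false, false, false };

	g.dirty_on_put = dirty;
	g.accessed_on_put = accessed;
	g.map_wc = wc;

	return g;
}
```

Comparing `sizeof(struct toy_scattered)` with `sizeof(struct toy_grouped)` on a given target shows how much padding the grouping removes there.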



[PATCH v9 03/11] drm/gem: Add evict() callback to drm_gem_object_funcs

2022-11-22 Thread Dmitry Osipenko
Add a new common evict() callback to drm_gem_object_funcs and a corresponding
drm_gem_object_evict() helper. This is a first step towards providing a
common GEM-shrinker API for DRM drivers.

Suggested-by: Thomas Zimmermann 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem.c | 15 +++
 include/drm/drm_gem.h | 12 
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 299bca1390aa..c0510b8080d2 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1458,3 +1458,18 @@ drm_gem_lru_scan(struct drm_gem_lru *lru,
return freed;
 }
 EXPORT_SYMBOL(drm_gem_lru_scan);
+
+/**
+ * drm_gem_object_evict - helper to evict backing pages for a GEM object
+ * @obj: obj in question
+ */
+bool
+drm_gem_object_evict(struct drm_gem_object *obj)
+{
+   dma_resv_assert_held(obj->resv);
+
+   if (obj->funcs->evict)
+   return obj->funcs->evict(obj);
+
+   return false;
+}
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index b46ade812443..add1371453f0 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -172,6 +172,16 @@ struct drm_gem_object_funcs {
 * This is optional but necessary for mmap support.
 */
const struct vm_operations_struct *vm_ops;
+
+   /**
+* @evict:
+*
+* Evicts gem object out from memory. Used by the drm_gem_object_evict()
+* helper. Returns true on success, false otherwise.
+*
+* This callback is optional.
+*/
+   bool (*evict)(struct drm_gem_object *obj);
 };
 
 /**
@@ -480,4 +490,6 @@ unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
   unsigned long *remaining,
   bool (*shrink)(struct drm_gem_object *obj));
 
+bool drm_gem_object_evict(struct drm_gem_object *obj);
+
 #endif /* __DRM_GEM_H__ */
-- 
2.38.1
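
The optional-callback pattern added by this patch can be modelled in a few lines of userspace C. The sketch below is an assumption-laden illustration: the `toy_*` types and functions are hypothetical, and the lock assertion of the real helper is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_gem_object;

struct toy_gem_funcs {
	/* Optional: returns true when the backing storage was evicted. */
	bool (*evict)(struct toy_gem_object *obj);
};

struct toy_gem_object {
	const struct toy_gem_funcs *funcs;
	bool resident;
};

/* Driver-side callback for objects that support eviction. */
static bool toy_evict_pages(struct toy_gem_object *obj)
{
	if (!obj->resident)
		return false;

	obj->resident = false;
	return true;
}

/* Common helper: objects without the callback are never evicted. */
static bool toy_gem_object_evict(struct toy_gem_object *obj)
{
	if (obj->funcs->evict)
		return obj->funcs->evict(obj);

	return false;
}
```

A driver that cannot evict simply leaves `.evict` unset, and the common helper reports failure without touching the object.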



[PATCH v9 05/11] drm/shmem: Switch to use drm_* debug helpers

2022-11-22 Thread Dmitry Osipenko
Ease debugging of a multi-GPU system by using drm_WARN_*() and
drm_dbg_kms() helpers that print out the DRM device name corresponding
to the shmem GEM.

Suggested-by: Thomas Zimmermann 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 38 +++---
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 35138f8a375c..5504eeb61099 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -139,7 +139,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
 {
struct drm_gem_object *obj = &shmem->base;
 
-   WARN_ON(shmem->vmap_use_count);
+   drm_WARN_ON(obj->dev, shmem->vmap_use_count);
 
if (obj->import_attach) {
drm_prime_gem_destroy(obj, shmem->sgt);
@@ -154,7 +154,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
drm_gem_shmem_put_pages(shmem);
}
 
-   WARN_ON(shmem->pages_use_count);
+   drm_WARN_ON(obj->dev, shmem->pages_use_count);
 
drm_gem_object_release(obj);
mutex_destroy(&shmem->pages_lock);
@@ -173,7 +173,8 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
 
pages = drm_gem_get_pages(obj);
if (IS_ERR(pages)) {
-   DRM_DEBUG_KMS("Failed to get pages (%ld)\n", PTR_ERR(pages));
+   drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
+   PTR_ERR(pages));
shmem->pages_use_count = 0;
return PTR_ERR(pages);
}
@@ -205,9 +206,10 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
  */
 int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
 {
+   struct drm_gem_object *obj = &shmem->base;
int ret;
 
-   WARN_ON(shmem->base.import_attach);
+   drm_WARN_ON(obj->dev, obj->import_attach);
 
ret = mutex_lock_interruptible(&shmem->pages_lock);
if (ret)
@@ -223,7 +225,7 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
 {
struct drm_gem_object *obj = &shmem->base;
 
-   if (WARN_ON_ONCE(!shmem->pages_use_count))
+   if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
return;
 
if (--shmem->pages_use_count > 0)
@@ -266,7 +268,9 @@ EXPORT_SYMBOL(drm_gem_shmem_put_pages);
  */
 int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
 {
-   WARN_ON(shmem->base.import_attach);
+   struct drm_gem_object *obj = &shmem->base;
+
+   drm_WARN_ON(obj->dev, obj->import_attach);
 
return drm_gem_shmem_get_pages(shmem);
 }
@@ -281,7 +285,9 @@ EXPORT_SYMBOL(drm_gem_shmem_pin);
  */
 void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)
 {
-   WARN_ON(shmem->base.import_attach);
+   struct drm_gem_object *obj = &shmem->base;
+
+   drm_WARN_ON(obj->dev, obj->import_attach);
 
drm_gem_shmem_put_pages(shmem);
 }
@@ -301,7 +307,7 @@ static int drm_gem_shmem_vmap_locked(struct 
drm_gem_shmem_object *shmem,
if (obj->import_attach) {
ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
if (!ret) {
-   if (WARN_ON(map->is_iomem)) {
+   if (drm_WARN_ON(obj->dev, map->is_iomem)) {
dma_buf_vunmap(obj->import_attach->dmabuf, map);
ret = -EIO;
goto err_put_pages;
@@ -326,7 +332,7 @@ static int drm_gem_shmem_vmap_locked(struct 
drm_gem_shmem_object *shmem,
}
 
if (ret) {
-   DRM_DEBUG_KMS("Failed to vmap pages, error %d\n", ret);
+   drm_dbg_kms(obj->dev, "Failed to vmap pages, error %d\n", ret);
goto err_put_pages;
}
 
@@ -376,7 +382,7 @@ static void drm_gem_shmem_vunmap_locked(struct 
drm_gem_shmem_object *shmem,
 {
struct drm_gem_object *obj = &shmem->base;
 
-   if (WARN_ON_ONCE(!shmem->vmap_use_count))
+   if (drm_WARN_ON_ONCE(obj->dev, !shmem->vmap_use_count))
return;
 
if (--shmem->vmap_use_count > 0)
@@ -461,7 +467,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object 
*shmem)
struct drm_gem_object *obj = &shmem->base;
struct drm_device *dev = obj->dev;
 
-   WARN_ON(!drm_gem_shmem_is_purgeable(shmem));
+   drm_WARN_ON(obj->dev, !drm_gem_shmem_is_purgeable(shmem));
 
dma_unmap_sgtable(dev->dev, shmem->sgt, DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
@@ -553,7 +559,7 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
mutex_lock(&shmem->pages_lock);
 
if (page_offset >= num_pages ||
-   WARN_ON_ONCE(!shmem->pages) ||
+   drm_WARN_ON_ONCE(obj->dev, !shmem->pages) ||
shmem->madv < 0) {
ret = VM_FAULT_SIGBUS;
 

[PATCH v9 00/11] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-11-22 Thread Dmitry Osipenko
Hello,

This series:

  1. Makes minor fixes for drm_gem_lru and Panfrost
  2. Brings refactoring for older code
  3. Adds common drm-shmem memory shrinker
  4. Enables shrinker for VirtIO-GPU driver
  5. Switches Panfrost driver to the common shrinker

Changelog:

v9: - Replaced struct drm_gem_shmem_shrinker with drm_gem_shmem and
      moved it to drm_device, as suggested by Thomas Zimmermann.

    - Replaced drm_gem_shmem_shrinker_register() with drmm_gem_shmem_init(),
      as suggested by Thomas Zimmermann.

    - Moved the evict() callback to drm_gem_object_funcs and added a common
      drm_gem_object_evict() helper, as suggested by Thomas Zimmermann.

    - The shmem object is now evictable by default, as suggested by
      Thomas Zimmermann. Dropped the set_evictable/purgeable() functions
      as well; drivers will decide whether a BO is evictable within their
      madvise IOCTL.

    - Added patches that convert drm-shmem code to use drm_WARN_ON() and
      drm_dbg_kms(), as requested by Thomas Zimmermann.

    - Turned drm_gem_shmem_object booleans into 1-bit bit fields, as
      suggested by Thomas Zimmermann.

    - Switched to using drm_dev->unique for the shmem shrinker name. Drivers
      don't need to specify the name explicitly anymore.

    - Re-added dma_resv_test_signaled() that was missing in v8 and also
      fixed its argument to DMA_RESV_USAGE_READ. See the comment on
      dma_resv_usage_rw().

    - Added a new fix for the Panfrost driver that silences a lockdep warning
      caused by the shrinker. Both the old Panfrost shrinker and the new
      shmem shrinker are affected.

v8: - Rebased on top of a recent linux-next that now has the dma-buf locking
      convention patches merged; their absence was blocking the shmem
      shrinker before.

    - The shmem shrinker now uses the new drm_gem_lru helper.

    - Dropped Steven Price's t-b from the Panfrost patch because the code
      has changed significantly since v6 and should be re-tested.

v7: - dma-buf locking convention

v6: https://lore.kernel.org/dri-devel/20220526235040.678984-1-dmitry.osipe...@collabora.com/

Related patches:

Mesa: https://gitlab.freedesktop.org/digetx/mesa/-/commits/virgl-madvise
igt:  https://gitlab.freedesktop.org/digetx/igt-gpu-tools/-/commits/virtio-madvise
      https://gitlab.freedesktop.org/digetx/igt-gpu-tools/-/commits/panfrost-madvise

I'm going to upstream the Mesa and igt patches once the kernel part lands.

Dmitry Osipenko (11):
  drm/msm/gem: Prevent blocking within shrinker loop
  drm/panfrost: Don't sync rpm suspension after mmu flushing
  drm/gem: Add evict() callback to drm_gem_object_funcs
  drm/shmem: Put booleans in the end of struct drm_gem_shmem_object
  drm/shmem: Switch to use drm_* debug helpers
  drm/shmem-helper: Don't use vmap_use_count for dma-bufs
  drm/shmem-helper: Switch to reservation lock
  drm/shmem-helper: Add memory shrinker
  drm/gem: Add drm_gem_pin_unlocked()
  drm/virtio: Support memory shrinking
  drm/panfrost: Switch to generic memory shrinker

 drivers/gpu/drm/drm_gem.c |  53 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c| 647 +-
 drivers/gpu/drm/lima/lima_gem.c   |   8 +-
 drivers/gpu/drm/msm/msm_gem_shrinker.c|   8 +-
 drivers/gpu/drm/panfrost/Makefile |   1 -
 drivers/gpu/drm/panfrost/panfrost_device.h|   4 -
 drivers/gpu/drm/panfrost/panfrost_drv.c   |  34 +-
 drivers/gpu/drm/panfrost/panfrost_gem.c   |  30 +-
 drivers/gpu/drm/panfrost/panfrost_gem.h   |   9 -
 .../gpu/drm/panfrost/panfrost_gem_shrinker.c  | 122 
 drivers/gpu/drm/panfrost/panfrost_job.c   |  18 +-
 drivers/gpu/drm/panfrost/panfrost_mmu.c   |  21 +-
 drivers/gpu/drm/virtio/virtgpu_drv.h  |  18 +-
 drivers/gpu/drm/virtio/virtgpu_gem.c  |  52 ++
 drivers/gpu/drm/virtio/virtgpu_ioctl.c|  37 +
 drivers/gpu/drm/virtio/virtgpu_kms.c  |   8 +
 drivers/gpu/drm/virtio/virtgpu_object.c   | 132 +++-
 drivers/gpu/drm/virtio/virtgpu_plane.c|  22 +-
 drivers/gpu/drm/virtio/virtgpu_vq.c   |  40 ++
 include/drm/drm_device.h  |  10 +-
 include/drm/drm_gem.h |  19 +-
 include/drm/drm_gem_shmem_helper.h| 112 +--
 include/uapi/drm/virtgpu_drm.h|  14 +
 23 files changed, 1010 insertions(+), 409 deletions(-)
 delete mode 100644 drivers/gpu/drm/panfrost/panfrost

Re: [PATCH] drm/i915/uc: Fix table order verification to check all FW types

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 3:33 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

It was noticed that the table order verification step was only being
run once rather than once per firmware type. Fix that.

Note that the long term plan is to convert this code to be a mock
selftest. It is already only compiled in when selftests are enabled.
And the work involved in the conversion was estimated to be
non-trivial. So that conversion is currently low on the priority list.

Signed-off-by: John Harrison 


Reviewed-by: Daniele Ceraolo Spurio 

Daniele


---
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 0c80ba51a4bdc..31613c7e0838b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -238,7 +238,7 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
[INTEL_UC_FW_TYPE_GUC] = { blobs_guc, ARRAY_SIZE(blobs_guc) },
[INTEL_UC_FW_TYPE_HUC] = { blobs_huc, ARRAY_SIZE(blobs_huc) },
};
-   static bool verified;
+   static bool verified[INTEL_UC_FW_NUM_TYPES];
const struct uc_fw_platform_requirement *fw_blobs;
enum intel_platform p = INTEL_INFO(i915)->platform;
u32 fw_count;
@@ -291,8 +291,8 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
}
  
  	/* make sure the list is ordered as expected */

-   if (IS_ENABLED(CONFIG_DRM_I915_SELFTEST) && !verified) {
-   verified = true;
+   if (IS_ENABLED(CONFIG_DRM_I915_SELFTEST) && !verified[uc_fw->type]) {
+   verified[uc_fw->type] = true;
  
  		for (i = 1; i < fw_count; i++) {

/* Next platform is good: */
@@ -343,7 +343,8 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
continue;
  
  bad:

-   drm_err(&i915->drm, "Invalid FW blob order: %s r%u %s%d.%d.%d comes before %s r%u %s%d.%d.%d\n",
+   drm_err(&i915->drm, "Invalid %s blob order: %s r%u %s%d.%d.%d comes before %s r%u %s%d.%d.%d\n",
+   intel_uc_fw_type_repr(uc_fw->type),
intel_platform_name(fw_blobs[i - 1].p), 
fw_blobs[i - 1].rev,
fw_blobs[i - 1].blob.legacy ? "L" : "v",
fw_blobs[i - 1].blob.major,




Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/guc: Add GuC CT specific debug print wrappers

2022-11-22 Thread John Harrison

On 11/22/2022 09:54, Michal Wajdeczko wrote:

On 18.11.2022 02:58, john.c.harri...@intel.com wrote:

From: John Harrison 

Re-work the existing GuC CT printers and extend as required to match
the new wrapping scheme.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +++---
  1 file changed, 113 insertions(+), 109 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 2b22065e87bf9..9d404fb377637 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -18,31 +18,49 @@ static inline struct intel_guc *ct_to_guc(struct 
intel_guc_ct *ct)
return container_of(ct, struct intel_guc, ct);
  }
  
-static inline struct intel_gt *ct_to_gt(struct intel_guc_ct *ct)

-{
-   return guc_to_gt(ct_to_guc(ct));
-}
-
  static inline struct drm_i915_private *ct_to_i915(struct intel_guc_ct *ct)
  {
-   return ct_to_gt(ct)->i915;
-}
+   struct intel_guc *guc = ct_to_guc(ct);
+   struct intel_gt *gt = guc_to_gt(guc);
  
-static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)

-{
-   return &ct_to_i915(ct)->drm;
+   return gt->i915;
  }
  
-#define CT_ERROR(_ct, _fmt, ...) \

-   drm_err(ct_to_drm(_ct), "CT: " _fmt, ##__VA_ARGS__)
+#define ct_err(_ct, _fmt, ...) \
+   guc_err(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
+
+#define ct_warn(_ct, _fmt, ...) \
+   guc_warn(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
+
+#define ct_notice(_ct, _fmt, ...) \
+   guc_notice(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
+
+#define ct_info(_ct, _fmt, ...) \
+   guc_info(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
+
  #ifdef CONFIG_DRM_I915_DEBUG_GUC
-#define CT_DEBUG(_ct, _fmt, ...) \
-   drm_dbg(ct_to_drm(_ct), "CT: " _fmt, ##__VA_ARGS__)
+#define ct_dbg(_ct, _fmt, ...) \
+   guc_dbg(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
  #else
-#define CT_DEBUG(...)  do { } while (0)
+#define ct_dbg(...)do { } while (0)
  #endif
-#define CT_PROBE_ERROR(_ct, _fmt, ...) \
-   i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__)
+
+#define ct_probe_error(_ct, _fmt, ...) \
+   do { \
+   if (i915_error_injected()) \
+   ct_dbg(_ct, _fmt, ##__VA_ARGS__); \
+   else \
+   ct_err(_ct, _fmt, ##__VA_ARGS__); \
+   } while (0)

guc_probe_error ?


+
+#define ct_WARN_ON(_ct, _condition) \
+   ct_WARN(_ct, _condition, "%s", "ct_WARN_ON(" __stringify(_condition) 
")")
+
+#define ct_WARN(_ct, _condition, _fmt, ...) \
+   guc_WARN(ct_to_guc(_ct), _condition, "CT " _fmt, ##__VA_ARGS__)
+
+#define ct_WARN_ONCE(_ct, _condition, _fmt, ...) \
+   guc_WARN_ONCE(ct_to_guc(_ct), _condition, "CT " _fmt, ##__VA_ARGS__)
  
  /**

   * DOC: CTB Blob
@@ -170,7 +188,7 @@ static int ct_control_enable(struct intel_guc_ct *ct, bool 
enable)
err = guc_action_control_ctb(ct_to_guc(ct), enable ?
 GUC_CTB_CONTROL_ENABLE : 
GUC_CTB_CONTROL_DISABLE);
if (unlikely(err))
-   CT_PROBE_ERROR(ct, "Failed to control/%s CTB (%pe)\n",
+   ct_probe_error(ct, "Failed to control/%s CTB (%pe)\n",
   str_enable_disable(enable), ERR_PTR(err));

btw, shouldn't we change all messages to start with a lowercase letter?

was:
"CT0: Failed to control/%s CTB (%pe)"
is:
"GT0: GuC CT Failed to control/%s CTB (%pe)"

unless we keep the colon (as suggested by Tvrtko), as then:

"GT0: GuC CT: Failed to control/%s CTB (%pe)"
Blanket adding the colon makes it messy when a string actually wants to 
start with the prefix. The rule I've been using is a lower case word when 
the prefix was part of the string, and an upper case word when the prefix 
is just being added as a prefix. I originally had the prefix raw, with no 
trailing space, so each individual print could decide to add a colon, a 
space, or whatever was appropriate. But that just makes for messy code, 
with some files having every string look like ": Stuff happened" and 
other files having every string look like " failed to ...". The current 
version seems to be the most readable from the point of view of both 
writing the code and reading the dmesg results.


And to be clear, the 'CT0' you have in your 'was' example only exists in 
the internal tree. It never made it to upstream. It is also just plain 
wrong. Each GT has two CTs - send and receive. So having 'CT1' meaning 
some random CT on GT1 (as opposed to the read channel on GT0, for 
example) was very confusing.


John.
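
[Editor's sketch] The layered prefix scheme discussed above can be illustrated in userspace C. This is only a sketch under stated assumptions: gt_fmt, guc_fmt and ct_fmt are hypothetical names that format into a caller-supplied buffer, they are not the i915 macros, and the "GT%u: " / "GuC " / "CT " prefixes are taken from the example strings quoted in this thread. Like the kernel macros, it relies on the GNU `##__VA_ARGS__` extension.

```c
#include <stdio.h>

/* Hypothetical base printer: formats "GT<n>: <message>" into a buffer. */
#define gt_fmt(_buf, _len, _gt_id, _fmt, ...) \
	snprintf(_buf, _len, "GT%u: " _fmt, (unsigned int)(_gt_id), ##__VA_ARGS__)

/* Each layer prepends its own prefix by string-literal concatenation. */
#define guc_fmt(_buf, _len, _gt_id, _fmt, ...) \
	gt_fmt(_buf, _len, _gt_id, "GuC " _fmt, ##__VA_ARGS__)

/* The CT layer adds "CT " with a trailing space and no colon. */
#define ct_fmt(_buf, _len, _gt_id, _fmt, ...) \
	guc_fmt(_buf, _len, _gt_id, "CT " _fmt, ##__VA_ARGS__)
```

Because the prefix ends in a plain space, a caller can pass either "Failed to ..." or ": something" as the format string, which is the flexibility being argued for in the reply above.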




Michal

  
  	return err;

@@ -201,7 +219,7 @@ static int ct_register_buffer(struct intel_guc_ct *ct, bool 
send,
   size);
if (unlikely(err))
  failed:
-   CT_PROBE_ERROR(ct, "Failed to register %s buffer (%pe)\n",
+   ct_probe_error(ct, "Failed to register %s buffer (%pe)\n",
   

Re: [PATCH v2 3/5] drm/i915/guc: Add GuC specific debug print wrappers

2022-11-22 Thread John Harrison

On 11/22/2022 09:42, Michal Wajdeczko wrote:

On 18.11.2022 02:58, john.c.harri...@intel.com wrote:

From: John Harrison 

Create a set of GuC printers and start using them.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc.c| 32 --
  drivers/gpu/drm/i915/gt/uc/intel_guc.h| 35 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  8 +--
  .../gpu/drm/i915/gt/uc/intel_guc_capture.c| 48 +-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 19 +++---
  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 37 ++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c |  7 +--
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 55 +++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 62 +--
  drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 34 +-
  .../drm/i915/gt/uc/selftest_guc_hangcheck.c   | 22 +++
  .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   | 10 +--
  12 files changed, 179 insertions(+), 190 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 52aede324788e..d9972510ee29b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -94,8 +94,8 @@ static void gen9_enable_guc_interrupts(struct intel_guc *guc)
assert_rpm_wakelock_held(&gt->i915->runtime_pm);
  
  	spin_lock_irq(gt->irq_lock);

-   WARN_ON_ONCE(intel_uncore_read(gt->uncore, GEN8_GT_IIR(2)) &
-gt->pm_guc_events);
+   guc_WARN_ON_ONCE(guc, intel_uncore_read(gt->uncore, GEN8_GT_IIR(2)) &
+gt->pm_guc_events);
gen6_gt_pm_enable_irq(gt, gt->pm_guc_events);
spin_unlock_irq(gt->irq_lock);
  
@@ -339,7 +339,7 @@ static void guc_init_params(struct intel_guc *guc)

params[GUC_CTL_DEVID] = guc_ctl_devid(guc);
  
  	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)

-   DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]);
+   guc_dbg(guc, "init param[%2d] = %#x\n", i, params[i]);
  }
  
  /*

@@ -451,7 +451,7 @@ int intel_guc_init(struct intel_guc *guc)
intel_uc_fw_fini(&guc->fw);
  out:
intel_uc_fw_change_status(&guc->fw, INTEL_UC_FIRMWARE_INIT_FAIL);
-   i915_probe_error(gt->i915, "failed with %d\n", ret);
+   guc_probe_error(guc, "init failed with %d\n", ret);
return ret;
  }
  
@@ -484,7 +484,6 @@ void intel_guc_fini(struct intel_guc *guc)

  int intel_guc_send_mmio(struct intel_guc *guc, const u32 *request, u32 len,
u32 *response_buf, u32 response_buf_size)
  {
-   struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
struct intel_uncore *uncore = guc_to_gt(guc)->uncore;
u32 header;
int i;
@@ -519,8 +518,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
*request, u32 len,
   10, 10, &header);
if (unlikely(ret)) {
  timeout:
-   drm_err(&i915->drm, "mmio request %#x: no reply %x\n",
-   request[0], header);
+   guc_err(guc, "mmio request %#x: no reply %x\n", request[0], 
header);
goto out;
}
  
@@ -541,8 +539,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *request, u32 len,

if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == 
GUC_HXG_TYPE_NO_RESPONSE_RETRY) {
u32 reason = FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, header);
  
-		drm_dbg(&i915->drm, "mmio request %#x: retrying, reason %u\n",

-   request[0], reason);
+   guc_dbg(guc, "mmio request %#x: retrying, reason %u\n", 
request[0], reason);
goto retry;
}
  
@@ -550,16 +547,14 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *request, u32 len,

u32 hint = FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, header);
u32 error = FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, header);
  
-		drm_err(&i915->drm, "mmio request %#x: failure %x/%u\n",

-   request[0], error, hint);
+   guc_err(guc, "mmio request %#x: failure %x/%u\n", request[0], 
error, hint);
ret = -ENXIO;
goto out;
}
  
  	if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) != GUC_HXG_TYPE_RESPONSE_SUCCESS) {

  proto:
-   drm_err(&i915->drm, "mmio request %#x: unexpected reply %#x\n",
-   request[0], header);
+   guc_err(guc, "mmio request %#x: unexpected reply %#x\n", 
request[0], header);
ret = -EPROTO;
goto out;
}
@@ -601,9 +596,9 @@ int intel_guc_to_host_process_recv_msg(struct intel_guc 
*guc,
msg = payload[0] & guc->msg_enabled_mask;
  
  	if (msg & INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED)

-   drm_err(&guc_to_gt(guc)->i915->drm, "Received early GuC crash dump 
notification!\n");
+   guc_err(guc, "early notification: Crash dump!\n");
if

[PATCH v1] drm/scheduler: Fix lockup in drm_sched_entity_kill()

2022-11-22 Thread Dmitry Osipenko
The drm_sched_entity_kill() is invoked twice by drm_sched_entity_destroy()
while a userspace process is exiting or being killed. The first time it's
invoked when the sched entity is flushed and the second time when the entity
is released. This causes a lockup within wait_for_completion(entity_idle) due
to how the completion API works.

Calling wait_for_completion() more times than complete() was invoked is an
error condition that causes a lockup because the completion internally uses
a counter for complete/wait calls. The complete_all() must be used instead
in such cases.

This patch fixes a lockup of the Panfrost driver that is reproducible by
killing any application in the middle of a 3D drawing operation.

Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
 drivers/gpu/drm/scheduler/sched_main.c   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index fe09e5be79bd..15d04a0ec623 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -81,7 +81,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
init_completion(&entity->entity_idle);
 
/* We start in an idle state. */
-   complete(&entity->entity_idle);
+   complete_all(&entity->entity_idle);
 
spin_lock_init(&entity->rq_lock);
spsc_queue_init(&entity->job_queue);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 6ce04c2e90c0..857ec20be9e8 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1026,7 +1026,7 @@ static int drm_sched_main(void *param)
sched_job = drm_sched_entity_pop_job(entity);
 
if (!sched_job) {
-   complete(&entity->entity_idle);
+   complete_all(&entity->entity_idle);
continue;
}
 
@@ -1037,7 +1037,7 @@ static int drm_sched_main(void *param)
 
trace_drm_run_job(sched_job, entity);
fence = sched->ops->run_job(sched_job);
-   complete(&entity->entity_idle);
+   complete_all(&entity->entity_idle);
drm_sched_fence_scheduled(s_fence);
 
if (!IS_ERR_OR_NULL(fence)) {
-- 
2.38.1
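
[Editor's sketch] The counter behaviour described in the commit message above can be modelled in userspace C. This is a minimal sketch, not the kernel implementation: struct completion_model and the model_* helpers are hypothetical names, and model_wait_would_proceed() is a non-blocking stand-in that returns false where a real wait_for_completion() would sleep.

```c
#include <limits.h>
#include <stdbool.h>

/* Userspace model of the counter inside a completion. */
struct completion_model {
	unsigned int done;
};

static void model_init(struct completion_model *c)
{
	c->done = 0;
}

/* complete(): let exactly one more waiter through. */
static void model_complete(struct completion_model *c)
{
	if (c->done != UINT_MAX)
		c->done++;
}

/* complete_all(): saturate the counter so every later waiter passes. */
static void model_complete_all(struct completion_model *c)
{
	c->done = UINT_MAX;
}

/*
 * Non-blocking stand-in for wait_for_completion(): returns true if the
 * waiter would proceed, false if it would sleep (the lockup case).
 */
static bool model_wait_would_proceed(struct completion_model *c)
{
	if (c->done == 0)
		return false;
	if (c->done != UINT_MAX)
		c->done--;	/* a normal wait consumes one complete() */
	return true;
}
```

In this model, one model_complete() followed by two waits leaves the second wait stuck, which mirrors drm_sched_entity_kill() being reached twice; after model_complete_all() both waits proceed.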



Re: [PATCH v2] drm/amdgpu: Fix potential double free and null pointer dereference

2022-11-22 Thread Luben Tuikov
amdgpu_xgmi_hive_type does provide a release method which frees the allocated
"hive", so we don't need a kfree() after a kobject_put().

Reviewed-by: Luben Tuikov 

Regards,
Luben

On 2022-11-21 23:28, Liang He wrote:
> In amdgpu_get_xgmi_hive(), we should not call kfree() after
> kobject_put() as the PUT will call kfree().
> 
> In amdgpu_device_ip_init(), we need to check the returned *hive*
> which can be NULL before we dereference it.
> 
> Signed-off-by: Liang He 
> ---
>  v1->v2: we need the extra GET to keep *hive* alive, it is
>  my fault to remove the GET in v1.
> 
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 2 --
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f1e9663b4051..00976e15b698 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2462,6 +2462,11 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
> *adev)
>   if (!amdgpu_sriov_vf(adev)) {
>   struct amdgpu_hive_info *hive = 
> amdgpu_get_xgmi_hive(adev);
>  
> + if (WARN_ON(!hive)) {
> + r = -ENOENT;
> + goto init_failed;
> + }
> +
>   if (!hive->reset_domain ||
>   
> !amdgpu_reset_get_reset_domain(hive->reset_domain)) {
>   r = -ENOENT;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index 47159e9a0884..4b9e7b050ccd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -386,7 +386,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct 
> amdgpu_device *adev)
>   if (ret) {
>   dev_err(adev->dev, "XGMI: failed initializing kobject for xgmi 
> hive\n");
>   kobject_put(&hive->kobj);
> - kfree(hive);
>   hive = NULL;
>   goto pro_end;
>   }
> @@ -410,7 +409,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct 
> amdgpu_device *adev)
>   dev_err(adev->dev, "XGMI: failed initializing 
> reset domain for xgmi hive\n");
>   ret = -ENOMEM;
>   kobject_put(&hive->kobj);
> - kfree(hive);
>   hive = NULL;
>   goto pro_end;
>   }
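
[Editor's sketch] The ownership rule behind this fix (the final put runs the release method, which frees the object, so the caller must not free it again) can be shown with a small userspace model. Everything here is hypothetical and only imitates the shape of a kobject: struct obj_model, obj_model_create(), obj_model_put() and hive_release_model() are illustration names, not amdgpu or kobject APIs.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a kobject: a refcount plus a release hook. */
struct obj_model {
	int refcount;
	void (*release)(struct obj_model *obj);
};

static int release_calls;	/* counts how often the release hook ran */

/* Release hook: the only place that frees the object. */
static void hive_release_model(struct obj_model *obj)
{
	release_calls++;
	free(obj);
}

static struct obj_model *obj_model_create(void)
{
	struct obj_model *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;
	obj->release = hive_release_model;
	return obj;
}

/* Drop one reference; the last put hands the object to release(). */
static void obj_model_put(struct obj_model *obj)
{
	if (--obj->refcount == 0)
		obj->release(obj);
}
```

After obj_model_put() drops the last reference the pointer is already dangling, so an additional free() on the error path would be a double free; setting the pointer to NULL, as the patch does with hive, is all that remains to do.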



Re: [PATCH v4] drm/i915/mtl: Media GT and Render GT share common GGTT

2022-11-22 Thread Matt Roper
On Tue, Nov 22, 2022 at 12:31:26PM +0530, Aravind Iddamsetty wrote:
> On XE_LPM+ platforms the media engines are carved out into a separate
> GT but have a common GGTMMADR address range, which essentially makes
> the GGTT address space shared between the media and render GTs. As a
> result, any update to the GGTT shall invalidate the TLBs of the GTs
> sharing it, and similarly any operation on the GGTT requiring an action
> on a GT will have to involve all GTs sharing it. setup_private_pat was
> being done on a per-GGTT basis; as it doesn't touch any GGTT structures,
> it is moved to a per-GT basis.
> 
> BSPEC: 63834
> 
> v2:
> 1. Add details to commit msg
> 2. includes fix for failure to add item to ggtt->gt_list, as suggested
> by Lucas
> 3. As ggtt_flush() is used only for GGTT, drop the i915_is_ggtt check
> within it.
> 4. setup_private_pat moved out of intel_gt_tiles_init
> 
> v3:
> 1. Move out for_each_gt from i915_driver.c (Jani Nikula)
> 
> v4: drop using RCU primitives on ggtt->gt_list as it is not an RCU list
> (Matt Roper)
> 
> Cc: Matt Roper 
> Signed-off-by: Aravind Iddamsetty 

Reviewed-by: Matt Roper 

> ---
>  drivers/gpu/drm/i915/gt/intel_ggtt.c  | 54 +--
>  drivers/gpu/drm/i915/gt/intel_gt.c| 13 +-
>  drivers/gpu/drm/i915/gt/intel_gt_types.h  |  3 ++
>  drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 ++
>  drivers/gpu/drm/i915/i915_driver.c| 12 ++---
>  drivers/gpu/drm/i915/i915_gem.c   |  2 +
>  drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++--
>  drivers/gpu/drm/i915/i915_vma.c   |  5 ++-
>  drivers/gpu/drm/i915/selftests/i915_gem.c |  2 +
>  9 files changed, 111 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 8145851ad23d..7644738b9cdb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -8,6 +8,7 @@
>  #include 
>  #include 
>  
> +#include 
>  #include 
>  #include 
>  
> @@ -196,10 +197,13 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
>  
>  void i915_ggtt_suspend(struct i915_ggtt *ggtt)
>  {
> + struct intel_gt *gt;
> +
>   i915_ggtt_suspend_vm(&ggtt->vm);
>   ggtt->invalidate(ggtt);
>  
> - intel_gt_check_and_clear_faults(ggtt->vm.gt);
> + list_for_each_entry(gt, &ggtt->gt_list, ggtt_link)
> + intel_gt_check_and_clear_faults(gt);
>  }
>  
>  void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
> @@ -225,16 +229,21 @@ static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
>  
>  static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>  {
> - struct intel_uncore *uncore = ggtt->vm.gt->uncore;
>   struct drm_i915_private *i915 = ggtt->vm.i915;
>  
>   gen8_ggtt_invalidate(ggtt);
>  
> - if (GRAPHICS_VER(i915) >= 12)
> - intel_uncore_write_fw(uncore, GEN12_GUC_TLB_INV_CR,
> -   GEN12_GUC_TLB_INV_CR_INVALIDATE);
> - else
> - intel_uncore_write_fw(uncore, GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> + if (GRAPHICS_VER(i915) >= 12) {
> + struct intel_gt *gt;
> +
> + list_for_each_entry(gt, &ggtt->gt_list, ggtt_link)
> + intel_uncore_write_fw(gt->uncore,
> +   GEN12_GUC_TLB_INV_CR,
> +   GEN12_GUC_TLB_INV_CR_INVALIDATE);
> + } else {
> + intel_uncore_write_fw(ggtt->vm.gt->uncore,
> +   GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> + }
>  }
>  
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> @@ -986,8 +995,6 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>  
>   ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>  
> - setup_private_pat(ggtt->vm.gt);
> -
>   return ggtt_probe_common(ggtt, size);
>  }
>  
> @@ -1196,7 +1203,14 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, 
> struct intel_gt *gt)
>   */
>  int i915_ggtt_probe_hw(struct drm_i915_private *i915)
>  {
> - int ret;
> + struct intel_gt *gt;
> + int ret, i;
> +
> + for_each_gt(gt, i915, i) {
> + ret = intel_gt_assign_ggtt(gt);
> + if (ret)
> + return ret;
> + }
>  
>   ret = ggtt_probe_hw(to_gt(i915)->ggtt, to_gt(i915));
>   if (ret)
> @@ -1208,6 +1222,19 @@ int i915_ggtt_probe_hw(struct drm_i915_private *i915)
>   return 0;
>  }
>  
> +struct i915_ggtt *i915_ggtt_create(struct drm_i915_private *i915)
> +{
> + struct i915_ggtt *ggtt;
> +
> + ggtt = drmm_kzalloc(&i915->drm, sizeof(*ggtt), GFP_KERNEL);
> + if (!ggtt)
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&ggtt->gt_list);
> +
> + return ggtt;
> +}
> +
>  int i915_ggtt_enable_hw(struct drm_i915_private *i915)
>  {
>   if (GRAPHICS_VER(i915) < 6)
> @@ -1296,9 +1323,11 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
>  
>  void i915_ggtt_resume(struct i915_ggtt *ggtt)
>  {
>

Re: [PATCH 2/2] drm/msm/disp/dpu1: add support for display on SM6115

2022-11-22 Thread Dmitry Baryshkov

On 20/11/2022 15:37, Adam Skladowski wrote:

Add required display hw catalog changes for SM6115.

Signed-off-by: Adam Skladowski 
---
  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 87 +++
  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  1 +
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |  1 +
  drivers/gpu/drm/msm/msm_mdss.c|  5 ++
  4 files changed, 94 insertions(+)



[skipped]


diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
index 6a4549ef34d4..86b28add1fff 100644
--- a/drivers/gpu/drm/msm/msm_mdss.c
+++ b/drivers/gpu/drm/msm/msm_mdss.c
@@ -280,6 +280,10 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss)
/* UBWC_2_0 */
msm_mdss_setup_ubwc_dec_20(msm_mdss, 0x1e);
break;
+   case DPU_HW_VER_630:
+   /* UBWC_2_0 */
+   msm_mdss_setup_ubwc_dec_20(msm_mdss, 0x11f);
+   break;


According to the vendor dtsi, the sm6115 is UBWC 1.0, not 2.0.

Could you please double-check?

Looks good to me otherwise.


case DPU_HW_VER_720:
msm_mdss_setup_ubwc_dec_40(msm_mdss, UBWC_3_0, 6, 1, 1, 1);
break;
@@ -509,6 +513,7 @@ static const struct of_device_id mdss_dt_match[] = {
{ .compatible = "qcom,sc7180-mdss" },
{ .compatible = "qcom,sc7280-mdss" },
{ .compatible = "qcom,sc8180x-mdss" },
+   { .compatible = "qcom,sm6115-mdss" },
{ .compatible = "qcom,sm8150-mdss" },
{ .compatible = "qcom,sm8250-mdss" },
{}


--
With best wishes
Dmitry



Re: [PATCH 1/2] dt-bindings: display/msm: add support for the display

2022-11-22 Thread Dmitry Baryshkov

On 20/11/2022 15:37, Adam Skladowski wrote:

Add DPU and MDSS schemas to describe MDSS and DPU blocks on the Qualcomm
SM6115 platform.
Configuration for DSI/PHY is shared with QCM2290 so compatibles are reused.
The lack of a DSI PHY supply in the example is intentional:
on qcm2290, sm6115 and sm6125 this PHY is supplied
via a power domain, not a regulator.

Signed-off-by: Adam Skladowski 
---
  .../bindings/display/msm/qcom,sm6115-dpu.yaml |  87 
  .../display/msm/qcom,sm6115-mdss.yaml | 187 ++
  2 files changed, 274 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/display/msm/qcom,sm6115-dpu.yaml
  create mode 100644 
Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml

diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm6115-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sm6115-dpu.yaml
new file mode 100644
index ..cc77675ec4f6
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm6115-dpu.yaml
@@ -0,0 +1,87 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/msm/qcom,sm6115-dpu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Display DPU dt properties for SM6115 target
+
+maintainers:
+  - Dmitry Baryshkov 
+
+$ref: /schemas/display/msm/dpu-common.yaml#
+
+properties:
+  compatible:
+items:
+  - const: qcom,sm6115-dpu
+
+  reg:
+items:
+  - description: Address offset and size for mdp register set
+  - description: Address offset and size for vbif register set
+
+  reg-names:
+items:
+  - const: mdp
+  - const: vbif
+
+  clocks:
+items:
+  - description: Display AXI clock from gcc
+  - description: Display AHB clock from dispcc
+  - description: Display core clock from dispcc
+  - description: Display lut clock from dispcc
+  - description: Display rotator clock from dispcc
+  - description: Display vsync clock from dispcc
+
+  clock-names:
+items:
+  - const: bus
+  - const: iface
+  - const: core
+  - const: lut
+  - const: rot
+  - const: vsync


Please add:

required:
  - compatible
  - reg
  - reg-names
  - clocks
  - clock-names

Per Krzysztof's request, these requirements are migrating from dpu-common 
to the individual dpu schemas.



+
+unevaluatedProperties: false
+
+examples:
+  - |
+#include 
+#include 
+#include 
+
+display-controller@5e01000 {
+compatible = "qcom,sm6115-dpu";
+reg = <0x05e01000 0x8f000>,
+  <0x05eb 0x2008>;
+reg-names = "mdp", "vbif";
+
+clocks = <&gcc GCC_DISP_HF_AXI_CLK>,
+ <&dispcc DISP_CC_MDSS_AHB_CLK>,
+ <&dispcc DISP_CC_MDSS_MDP_CLK>,
+ <&dispcc DISP_CC_MDSS_MDP_LUT_CLK>,
+ <&dispcc DISP_CC_MDSS_ROT_CLK>,
+ <&dispcc DISP_CC_MDSS_VSYNC_CLK>;
+clock-names = "bus", "iface", "core", "lut", "rot", "vsync";
+
+operating-points-v2 = <&mdp_opp_table>;
+power-domains = <&rpmpd SM6115_VDDCX>;
+
+interrupt-parent = <&mdss>;
+interrupts = <0>;
+
+ports {
+#address-cells = <1>;
+#size-cells = <0>;
+
+port@0 {
+reg = <0>;
+endpoint {
+remote-endpoint = <&dsi0_in>;
+};
+};
+};
+};
+...
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml
new file mode 100644
index ..af721aa05b22
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml
@@ -0,0 +1,187 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/msm/qcom,sm6115-mdss.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm SM6115 Display MDSS
+
+maintainers:
+  - Dmitry Baryshkov 
+
+description:
+  Device tree bindings for MSM Mobile Display Subsystem(MDSS) that encapsulates
+  sub-blocks like DPU display controller and DSI. Device tree bindings of MDSS
+  are mentioned for SM6115 target.
+
+$ref: /schemas/display/msm/mdss-common.yaml#
+
+properties:
+  compatible:
+items:
+  - const: qcom,sm6115-mdss
+
+  clocks:
+items:
+  - description: Display AHB clock from gcc
+  - description: Display AXI clock
+  - description: Display core clock
+
+  clock-names:
+items:
+  - const: iface
+  - const: bus
+  - const: core
+
+  iommus:
+maxItems: 2
+
+patternProperties:
+  "^display-controller@[0-9a-f]+$":
+type: object
+properties:
+  compatible:
+const: qcom,sm6115-dpu
+
+  "^dsi@[0-9a-f]+$":
+type: object
+properties:
+  compatible:
+const: qcom,dsi-ctrl-6g-qcm2290
+
+  "^phy@[0-9a-f]+$":
+type: object

Re: [Intel-gfx] [PATCH 3/3] drm/i915/guc: Use GuC submission API version number

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 12:09 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

The GuC firmware includes an extra version number to specify the
submission API level. So use that rather than the main firmware
version number for submission related checks.

Also, while it is guaranteed that GuC version number components are
only 8-bits in size, other firmwares do not have that restriction. So
stop making assumptions about them generically fitting in a u16
individually, or in a u32 as a combined 8.8.8.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 +++
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +--
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 91 ---
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h  | 10 +-
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h  |  3 +-
  5 files changed, 104 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 1bb3f98292866..bb4dfe707a7d0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -158,6 +158,9 @@ struct intel_guc {
bool submission_selected;
/** @submission_initialized: tracks whether GuC submission has been 
initialised */
bool submission_initialized;
+   /** @submission_version: Submission API version of the currently loaded 
firmware */
+   struct intel_uc_fw_ver submission_version;
+
/**
 * @rc_supported: tracks whether we support GuC rc on the current 
platform
 */
@@ -268,6 +271,14 @@ struct intel_guc {
  #endif
  };
  
+/*

+ * GuC version number components are only 8-bit, so converting to a 32bit 8.8.8
+ * integer works.
+ */
+#define MAKE_GUC_VER(maj, min, pat)	(((maj) << 16) | ((min) << 8) | (pat))
+#define MAKE_GUC_VER_STRUCT(ver)	MAKE_GUC_VER((ver).major, (ver).minor, (ver).patch)
+#define GUC_SUBMIT_VER(guc)		MAKE_GUC_VER_STRUCT((guc)->submission_version)
+
  static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
  {
return container_of(log, struct intel_guc, log);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0a42f1807f52c..53f7f599cde3a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1890,7 +1890,7 @@ int intel_guc_submission_init(struct intel_guc *guc)
if (guc->submission_initialized)
return 0;
  
-	if (GET_UC_VER(guc) < MAKE_UC_VER(70, 0, 0)) {

+   if (GUC_SUBMIT_VER(guc) < MAKE_GUC_VER(1, 0, 0)) {
ret = guc_lrc_desc_pool_create_v69(guc);
if (ret)
return ret;
@@ -2330,7 +2330,7 @@ static int register_context(struct intel_context *ce, 
bool loop)
GEM_BUG_ON(intel_context_is_child(ce));
trace_intel_context_register(ce);
  
-	if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0))

+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0))
ret = register_context_v70(guc, ce, loop);
else
ret = register_context_v69(guc, ce, loop);
@@ -2342,7 +2342,7 @@ static int register_context(struct intel_context *ce, 
bool loop)
set_context_registered(ce);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
  
-		if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0))

+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0))
guc_context_policy_init_v70(ce, loop);
}
  
@@ -2956,7 +2956,7 @@ static void __guc_context_set_preemption_timeout(struct intel_guc *guc,

 u16 guc_id,
 u32 preemption_timeout)
  {
-   if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0)) {
+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0)) {
struct context_policy policy;
  
  		__guc_context_policy_start_klv(&policy, guc_id);

@@ -3283,7 +3283,7 @@ static int guc_context_alloc(struct intel_context *ce)
  static void __guc_context_set_prio(struct intel_guc *guc,
   struct intel_context *ce)
  {
-   if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0)) {
+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0)) {
struct context_policy policy;
  
  		__guc_context_policy_start_klv(&policy, ce->guc_id.id);

@@ -4366,7 +4366,7 @@ static int guc_init_global_schedule_policy(struct 
intel_guc *guc)
intel_wakeref_t wakeref;
int ret = 0;
  
-	if (GET_UC_VER(guc) < MAKE_UC_VER(70, 3, 0))

+   if (GUC_SUBMIT_VER(guc) < MAKE_GUC_VER(1, 1, 0))
return 0;
  
  	__guc_scheduling_policy_start_klv(&policy);

@@ -4905,6 +4905,9 @@ void intel_guc_submission_print_info(struct intel_guc 
*guc,
if (!sched_engine)
return;
  
+	drm_printf(p, "GuC Submission API Version: %d.%d.%d\n"
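
[Editor's sketch] The 8.8.8 packing used by MAKE_GUC_VER in the patch above can be tried out in userspace C. This is a sketch under assumptions: struct fw_ver_model, pack_ver() and pack_ver_struct() are hypothetical names that only mirror the shape of the macros, and the 32-bit field widths are chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical mirror of the major/minor/patch triple used in the patch. */
struct fw_ver_model {
	uint32_t major;
	uint32_t minor;
	uint32_t patch;
};

/* Pack as 8.8.8; ordering is only preserved while each component is < 256. */
static uint32_t pack_ver(uint32_t major, uint32_t minor, uint32_t patch)
{
	return (major << 16) | (minor << 8) | patch;
}

/* Convenience wrapper taking the struct form. */
static uint32_t pack_ver_struct(const struct fw_ver_model *v)
{
	return pack_ver(v->major, v->minor, v->patch);
}
```

Packed values compare correctly as plain integers as long as every component fits in 8 bits. A component of 256 or more spills into the next field, so pack_ver(0, 256, 0) collides with pack_ver(1, 0, 0), which is why the commit message restricts this trick to GuC versions.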

[PATCH v2 3/5] arm64: dts: qcom: sm8450-hdk: enable display hardware

2022-11-22 Thread Dmitry Baryshkov
Enable MDSS/DPU/DSI0 on SM8450-HDK device. Note, there is no panel
configuration (yet).

Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sm8450-hdk.dts | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts 
b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
index 2dd4f8c8f931..75b7aecb7d8e 100644
--- a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
+++ b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
@@ -349,6 +349,28 @@ vreg_l7e_2p8: ldo7 {
};
 };
 
+&dispcc {
+   status = "okay";
+};
+
+&mdss {
+   status = "okay";
+};
+
+&mdss_dsi0 {
+   vdda-supply = <&vreg_l6b_1p2>;
+   status = "okay";
+};
+
+&mdss_dsi0_phy {
+   vdds-supply = <&vreg_l5b_0p88>;
+   status = "okay";
+};
+
+&mdss_mdp {
+   status = "okay";
+};
+
 &pcie0 {
status = "okay";
max-link-speed = <2>;
-- 
2.35.1



[PATCH v2 4/5] arm64: dts: qcom: sm8450-hdk: Add LT9611uxc HDMI bridge

2022-11-22 Thread Dmitry Baryshkov
From: Vinod Koul 

Add the LT9611uxc DSI-HDMI bridge and supplies

Signed-off-by: Vinod Koul 
Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sm8450-hdk.dts | 61 +
 1 file changed, 61 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts 
b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
index 75b7aecb7d8e..6b6dcd0e0052 100644
--- a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
+++ b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
@@ -20,6 +20,28 @@ chosen {
stdout-path = "serial0:115200n8";
};
 
+   lt9611_1v2: lt9611-vdd12-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "LT9611_1V2";
+
+   vin-supply = <&vph_pwr>;
+   regulator-min-microvolt = <120>;
+   regulator-max-microvolt = <120>;
+   gpio = <&tlmm 9 GPIO_ACTIVE_HIGH>;
+   enable-active-high;
+   };
+
+   lt9611_3v3: lt9611-3v3-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "LT9611_3V3";
+
+   vin-supply = <&vreg_bob>;
+   gpio = <&tlmm 109 GPIO_ACTIVE_HIGH>;
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   enable-active-high;
+   };
+
vph_pwr: vph-pwr-regulator {
compatible = "regulator-fixed";
regulator-name = "vph_pwr";
@@ -353,6 +375,27 @@ &dispcc {
status = "okay";
 };
 
+&i2c9 {
+   clock-frequency = <40>;
+   status = "okay";
+
+   lt9611_codec: hdmi-bridge@2b {
+   compatible = "lontium,lt9611uxc";
+   reg = <0x2b>;
+
+   interrupts-extended = <&tlmm 44 IRQ_TYPE_EDGE_FALLING>;
+
+   reset-gpios = <&tlmm 107 GPIO_ACTIVE_HIGH>;
+
+   vdd-supply = <&lt9611_1v2>;
+   vcc-supply = <&lt9611_3v3>;
+
+   pinctrl-names = "default";
+   pinctrl-0 = <&lt9611_irq_pin &lt9611_rst_pin>;
+
+   };
+};
+
 &mdss {
status = "okay";
 };
@@ -416,6 +459,10 @@ &qupv3_id_0 {
status = "okay";
 };
 
+&qupv3_id_1 {
+   status = "okay";
+};
+
 &sdhc_2 {
cd-gpios = <&tlmm 92 GPIO_ACTIVE_HIGH>;
pinctrl-names = "default", "sleep";
@@ -431,6 +478,20 @@ &sdhc_2 {
 &tlmm {
gpio-reserved-ranges = <28 4>, <36 4>;
 
+   lt9611_irq_pin: lt9611-irq {
+   pins = "gpio44";
+   function = "gpio";
+   bias-disable;
+   };
+
+   lt9611_rst_pin: lt9611-rst-state {
+   pins = "gpio107";
+   function = "normal";
+
+   output-high;
+   input-disable;
+   };
+
sdc2_card_det_n: sd-card-det-n-state {
pins = "gpio92";
function = "gpio";
-- 
2.35.1



[PATCH v2 5/5] arm64: dts: qcom: sm8450-hdk: Enable HDMI Display

2022-11-22 Thread Dmitry Baryshkov
From: Vinod Koul 

Add the HDMI display nodes and link them to DSI.

Signed-off-by: Vinod Koul 
Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sm8450-hdk.dts | 36 +
 1 file changed, 36 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts 
b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
index 6b6dcd0e0052..709cddaac781 100644
--- a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
+++ b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
@@ -20,6 +20,17 @@ chosen {
stdout-path = "serial0:115200n8";
};
 
+   hdmi-out {
+   compatible = "hdmi-connector";
+   type = "a";
+
+   port {
+   hdmi_con: endpoint {
+   remote-endpoint = <&lt9611_out>;
+   };
+   };
+   };
+
lt9611_1v2: lt9611-vdd12-regulator {
compatible = "regulator-fixed";
regulator-name = "LT9611_1V2";
@@ -393,6 +404,26 @@ lt9611_codec: hdmi-bridge@2b {
pinctrl-names = "default";
pinctrl-0 = <&lt9611_irq_pin &lt9611_rst_pin>;
 
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+
+   lt9611_a: endpoint {
+   remote-endpoint = <&dsi0_out>;
+   };
+   };
+
+   port@2 {
+   reg = <2>;
+
+   lt9611_out: endpoint {
+   remote-endpoint = <&hdmi_con>;
+   };
+   };
+   };
};
 };
 
@@ -405,6 +436,11 @@ &mdss_dsi0 {
status = "okay";
 };
 
+&mdss_dsi0_out {
+   remote-endpoint = <&lt9611_a>;
+   data-lanes = <0 1 2 3>;
+};
+
 &mdss_dsi0_phy {
vdds-supply = <&vreg_l5b_0p88>;
status = "okay";
-- 
2.35.1



[PATCH v2 2/5] arm64: dts: qcom: sm8450: add display hardware devices

2022-11-22 Thread Dmitry Baryshkov
Add device tree nodes describing display hardware on SM8450:
- Display Clock Controller
- MDSS
- MDP
- two DSI controllers and DSI PHYs

This does not provide support for DP controllers present on SM8450.

Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sm8450.dtsi | 284 ++-
 1 file changed, 280 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 8cc9f62f7645..0c3a3a5578b0 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -2394,6 +2394,282 @@ camcc: clock-controller@ade {
status = "disabled";
};
 
+   mdss: mdss@ae0 {
+   compatible = "qcom,sm8450-mdss";
+   reg = <0 0x0ae0 0 0x1000>;
+   reg-names = "mdss";
+
+   /* same path used twice */
+   interconnects = <&mmss_noc MASTER_MDP_DISP 0 &mc_virt 
SLAVE_EBI1_DISP 0>,
+   <&mmss_noc MASTER_MDP_DISP 0 &mc_virt 
SLAVE_EBI1_DISP 0>;
+   interconnect-names = "mdp0-mem", "mdp1-mem";
+
+   resets = <&dispcc DISP_CC_MDSS_CORE_BCR>;
+
+   power-domains = <&dispcc MDSS_GDSC>;
+
+   clocks = <&dispcc DISP_CC_MDSS_AHB_CLK>,
+<&gcc GCC_DISP_HF_AXI_CLK>,
+<&gcc GCC_DISP_SF_AXI_CLK>,
+<&dispcc DISP_CC_MDSS_MDP_CLK>;
+   clock-names = "iface", "bus", "nrt_bus", "core";
+
+   interrupts = ;
+   interrupt-controller;
+   #interrupt-cells = <1>;
+
+   iommus = <&apps_smmu 0x2800 0x402>;
+
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   status = "disabled";
+
+   mdss_mdp: display-controller@ae01000 {
+   compatible = "qcom,sm8450-dpu";
+   reg = <0 0x0ae01000 0 0x8f000>,
+ <0 0x0aeb 0 0x2008>;
+   reg-names = "mdp", "vbif";
+
+   clocks = <&gcc GCC_DISP_HF_AXI_CLK>,
+   <&gcc GCC_DISP_SF_AXI_CLK>,
+   <&dispcc DISP_CC_MDSS_AHB_CLK>,
+   <&dispcc DISP_CC_MDSS_MDP_LUT_CLK>,
+   <&dispcc DISP_CC_MDSS_MDP_CLK>,
+   <&dispcc DISP_CC_MDSS_VSYNC_CLK>;
+   clock-names = "bus",
+ "nrt_bus",
+ "iface",
+ "lut",
+ "core",
+ "vsync";
+
+   assigned-clocks = <&dispcc 
DISP_CC_MDSS_VSYNC_CLK>;
+   assigned-clock-rates = <1920>;
+
+   operating-points-v2 = <&mdp_opp_table>;
+   power-domains = <&rpmhpd SM8450_MMCX>;
+
+   interrupt-parent = <&mdss>;
+   interrupts = <0>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   dpu_intf1_out: endpoint {
+   remote-endpoint = 
<&mdss_dsi0_in>;
+   };
+   };
+
+   port@1 {
+   reg = <1>;
+   dpu_intf2_out: endpoint {
+   remote-endpoint = 
<&mdss_dsi1_in>;
+   };
+   };
+
+   };
+
+   mdp_opp_table: opp-table {
+   compatible = "operating-points-v2";
+
+   opp-17200 {
+   opp-hz = /bits/ 64 <17200>;
+   required-opps = 
<&rpmhpd_opp_low_svs_d1>;
+   };
+
+   opp-2 {
+   

[PATCH v2 1/5] arm64: dts: qcom: sm8450: add RPMH_REGULATOR_LEVEL_LOW_SVS_D1

2022-11-22 Thread Dmitry Baryshkov
Add another power saving state used on SM8450. Unfortunately, adding it
in the proper place causes renumbering of all the opp states in sm8450.dtsi.

Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sm8450.dtsi   | 20 
 include/dt-bindings/power/qcom-rpmpd.h |  1 +
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index f20db5456765..8cc9f62f7645 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -3211,35 +3211,39 @@ rpmhpd_opp_min_svs: opp2 {
opp-level = 
;
};
 
-   rpmhpd_opp_low_svs: opp3 {
+   rpmhpd_opp_low_svs_d1: opp3 {
+   opp-level = 
;
+   };
+
+   rpmhpd_opp_low_svs: opp4 {
opp-level = 
;
};
 
-   rpmhpd_opp_svs: opp4 {
+   rpmhpd_opp_svs: opp5 {
opp-level = 
;
};
 
-   rpmhpd_opp_svs_l1: opp5 {
+   rpmhpd_opp_svs_l1: opp6 {
opp-level = 
;
};
 
-   rpmhpd_opp_nom: opp6 {
+   rpmhpd_opp_nom: opp7 {
opp-level = 
;
};
 
-   rpmhpd_opp_nom_l1: opp7 {
+   rpmhpd_opp_nom_l1: opp8 {
opp-level = 
;
};
 
-   rpmhpd_opp_nom_l2: opp8 {
+   rpmhpd_opp_nom_l2: opp9 {
opp-level = 
;
};
 
-   rpmhpd_opp_turbo: opp9 {
+   rpmhpd_opp_turbo: opp10 {
opp-level = 
;
};
 
-   rpmhpd_opp_turbo_l1: opp10 {
+   rpmhpd_opp_turbo_l1: opp11 {
opp-level = 
;
};
};
diff --git a/include/dt-bindings/power/qcom-rpmpd.h b/include/dt-bindings/power/qcom-rpmpd.h
index 7b2e4b66419a..701401c8b945 100644
--- a/include/dt-bindings/power/qcom-rpmpd.h
+++ b/include/dt-bindings/power/qcom-rpmpd.h
@@ -174,6 +174,7 @@
 /* SDM845 Power Domain performance levels */
 #define RPMH_REGULATOR_LEVEL_RETENTION 16
 #define RPMH_REGULATOR_LEVEL_MIN_SVS   48
+#define RPMH_REGULATOR_LEVEL_LOW_SVS_D1 56
 #define RPMH_REGULATOR_LEVEL_LOW_SVS   64
 #define RPMH_REGULATOR_LEVEL_SVS   128
 #define RPMH_REGULATOR_LEVEL_SVS_L0 144
-- 
2.35.1



[PATCH v2 0/5] arm64: dts: qcom: sm8450-hdk: enable HDMI output

2022-11-22 Thread Dmitry Baryshkov
Add device tree nodes for MDSS, DPU and DSI devices on Qualcomm SM8450
platform. Enable these devices and add the HDMI bridge configuration on
SM8450 HDK.

Changes since v1:
- Reorder properties, making status the last one
- Rename opp nodes to follow the schema
- Renamed display-controller and phy device nodes
- Dropped phy-names for DSI PHYs
- Renamed DSI and DSI PHY labels to include mdss_ prefix
- Renamed 3v3 regulator device node to add -regulator suffix

Dmitry Baryshkov (3):
  arm64: dts: qcom: sm8450: add RPMH_REGULATOR_LEVEL_LOW_SVS_D1
  arm64: dts: qcom: sm8450: add display hardware devices
  arm64: dts: qcom: sm8450-hdk: enable display hardware

Vinod Koul (2):
  arm64: dts: qcom: sm8450-hdk: Add LT9611uxc HDMI bridge
  arm64: dts: qcom: sm8450-hdk: Enable HDMI Display

 arch/arm64/boot/dts/qcom/sm8450-hdk.dts | 119 ++
 arch/arm64/boot/dts/qcom/sm8450.dtsi| 304 +++-
 include/dt-bindings/power/qcom-rpmpd.h  |   1 +
 3 files changed, 412 insertions(+), 12 deletions(-)

-- 
2.35.1



[PATCH] drm/i915/uc: Fix table order verification to check all FW types

2022-11-22 Thread John . C . Harrison
From: John Harrison 

It was noticed that the table order verification step was only being
run once rather than once per firmware type. Fix that.

Note that the long term plan is to convert this code to be a mock
selftest. It is already only compiled in when selftests are enabled.
And the work involved in the conversion was estimated to be
non-trivial. So that conversion is currently low on the priority list.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 0c80ba51a4bdc..31613c7e0838b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -238,7 +238,7 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
[INTEL_UC_FW_TYPE_GUC] = { blobs_guc, ARRAY_SIZE(blobs_guc) },
[INTEL_UC_FW_TYPE_HUC] = { blobs_huc, ARRAY_SIZE(blobs_huc) },
};
-   static bool verified;
+   static bool verified[INTEL_UC_FW_NUM_TYPES];
const struct uc_fw_platform_requirement *fw_blobs;
enum intel_platform p = INTEL_INFO(i915)->platform;
u32 fw_count;
@@ -291,8 +291,8 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
}
 
/* make sure the list is ordered as expected */
-   if (IS_ENABLED(CONFIG_DRM_I915_SELFTEST) && !verified) {
-   verified = true;
+   if (IS_ENABLED(CONFIG_DRM_I915_SELFTEST) && !verified[uc_fw->type]) {
+   verified[uc_fw->type] = true;
 
for (i = 1; i < fw_count; i++) {
/* Next platform is good: */
@@ -343,7 +343,8 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
continue;
 
 bad:
-   drm_err(&i915->drm, "Invalid FW blob order: %s r%u 
%s%d.%d.%d comes before %s r%u %s%d.%d.%d\n",
+   drm_err(&i915->drm, "Invalid %s blob order: %s r%u 
%s%d.%d.%d comes before %s r%u %s%d.%d.%d\n",
+   intel_uc_fw_type_repr(uc_fw->type),
intel_platform_name(fw_blobs[i - 1].p), 
fw_blobs[i - 1].rev,
fw_blobs[i - 1].blob.legacy ? "L" : "v",
fw_blobs[i - 1].blob.major,
-- 
2.37.3
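
Editorial sketch of the bug the patch above describes, not part of the patch. The enum and function names are hypothetical stand-ins rather than the i915 definitions. It assumes the verification block is guarded by a function-local static flag, as in the hunks above, and shows why a single flag lets only the first firmware type through while a per-type array lets each type through once.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the firmware types; not the i915 enum. */
enum fw_type { FW_TYPE_GUC, FW_TYPE_HUC, FW_NUM_TYPES };

/*
 * Old shape: one flag shared by every firmware type.
 * Returns true if the (elided) table check would run on this call.
 */
static bool select_single_flag(enum fw_type type)
{
	static bool verified;

	(void)type; /* the type never influences the guard, which is the bug */
	if (verified)
		return false;
	verified = true;
	return true;
}

/*
 * New shape: one flag per firmware type, indexed by the type.
 * Returns true if the (elided) table check would run on this call.
 */
static bool select_per_type_flag(enum fw_type type)
{
	static bool verified[FW_NUM_TYPES];

	if (verified[type])
		return false;
	verified[type] = true;
	return true;
}
```

Starting from a fresh process, calling select_single_flag() for GuC and then for HuC runs the check only for GuC. With select_per_type_flag() the same two calls each run the check, and a repeated call for either type does not.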



Re: [Freedreno] [RFC PATCH 0/3] Support for Solid Fill Planes

2022-11-22 Thread Jessica Zhang




On 11/8/2022 12:52 AM, Ville Syrjälä wrote:

On Mon, Nov 07, 2022 at 07:34:43PM -0800, Rob Clark wrote:

On Mon, Nov 7, 2022 at 4:22 PM Jessica Zhang wrote:




On 11/7/2022 2:09 PM, Rob Clark wrote:

On Mon, Nov 7, 2022 at 1:32 PM Jessica Zhang  wrote:




On 11/7/2022 11:37 AM, Ville Syrjälä wrote:

On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote:

Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT
properties. When the color fill value is set, and the framebuffer is set
to NULL, memory fetch will be disabled.


Thinking a bit more universally I wonder if there should be
some kind of enum property:

enum plane_pixel_source {
FB,
COLOR,
LIVE_FOO,
LIVE_BAR,
...
}


Hi Ville,

Makes sense -- this way, we'd also lay some groundwork for cases where
drivers want to use other non-FB sources.




In addition, loosen the NULL FB checks within the atomic commit callstack
to allow a NULL FB when color_fill is nonzero and add FB checks in
methods where the FB was previously assumed to be non-NULL.

Finally, have the DPU driver use drm_plane_state.color_fill and
drm_plane_state.color_fill_format instead of dpu_plane_state.color_fill,
and add extra checks in the DPU atomic commit callstack to account for a
NULL FB in cases where color_fill is set.

Some drivers support hardware that has optimizations for solid fill
planes. This series aims to expose these capabilities to userspace as
some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
hardware composer HAL) that can be set by apps like the Android Gears
app.

Userspace can set the color_fill value by setting COLOR_FILL_FORMAT to a
DRM format, setting COLOR_FILL to a color fill value, and setting the
framebuffer to NULL.


Is there some real reason for the format property? Ie. why not
just do what was the plan for the crtc background color and
specify the color in full 16bpc format and just pick as many
msbs from that as the hw can use?


The format property was added because we can't assume that all hardware
will support/use the same color format for solid fill planes. Even for
just MSM devices, the hardware supports different variations of RGB
formats [1].


Sure, but the driver can convert the format into whatever the hw
wants.  A 1x1 color conversion is not going to be problematic ;-)


Hi Rob,

Hm... that's also a fair point. Just wondering, is there any advantage
of having the driver convert the format, other than not having to
implement an extra format property?

(In case we end up wrapping everything into a prop blob or something)



It keeps the uabi simpler.. for obvious reasons you don't want the
driver to do cpu color conversion for an arbitrary size plane, which
is why we go to all the complexity to expose formats and modifiers for
"real" planes, but we are dealing with a single pixel value here,
let's not make the uabi more complex than we need to.  I'd propose
making it float32[4] if float weren't a pita for kernel/uabi, but
u16[4] or u32[4] should be fine, and drivers can translate that easily
into whatever weird formats their hw wants for solid-fill.


u16[4] fits into a single u64 property value.

That was the plan for the background prop as well:
https://lore.kernel.org/all/20190703125442.gw5...@intel.com/T/


Got it, I think that's pretty reasonable then. Will probably use u32[4] 
instead since that is what Pekka and Simon are recommending to match 
Wayland's single-buffer protocol [1].


Thanks,

Jessica Zhang

[1] 
https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/104




--
Ville Syrjälä
Intel
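
Editorial sketch of the u16[4] idea discussed in the thread above, not code from any posted patch. The channel order (A, R, G, B from the most significant bits down) and all function names are assumptions made for illustration; the thread does not fix a layout, and the final series may use u32[4] instead. It shows four 16-bit channels packed into one 64-bit property value, and a driver keeping only as many most significant bits as its hardware can use.

```c
#include <stdint.h>

/*
 * Pack four 16-bit channels into a single 64-bit value.
 * Assumed layout: A in bits 63..48, R in 47..32, G in 31..16, B in 15..0.
 */
static uint64_t pack_argb16(uint16_t a, uint16_t r, uint16_t g, uint16_t b)
{
	return ((uint64_t)a << 48) | ((uint64_t)r << 32) |
	       ((uint64_t)g << 16) | (uint64_t)b;
}

/* Keep the `bits` most significant bits of a 16-bit channel (1 <= bits <= 16). */
static uint16_t channel_msbs(uint16_t value, unsigned int bits)
{
	return (uint16_t)(value >> (16 - bits));
}

/* Example conversion for hardware that takes 8 bits per channel (ARGB8888). */
static uint32_t argb16_to_argb8888(uint64_t packed)
{
	uint32_t a = channel_msbs((uint16_t)(packed >> 48), 8);
	uint32_t r = channel_msbs((uint16_t)(packed >> 32), 8);
	uint32_t g = channel_msbs((uint16_t)(packed >> 16), 8);
	uint32_t b = channel_msbs((uint16_t)packed, 8);

	return (a << 24) | (r << 16) | (g << 8) | b;
}
```

Since only one pixel value is converted, the work per commit is a handful of shifts, which is the point made above about a 1x1 conversion not being a problem for the driver.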


Re: [PATCH 2/3] drm/i915/uc: More refactoring of UC version numbers

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 12:09 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

As a precursor to a coming change (for adding a GuC submission API
version), abstract the UC version number into its own private
structure separate to the firmware filename.

Signed-off-by: John Harrison 


Reviewed-by: Daniele Ceraolo Spurio 

Daniele


---
  drivers/gpu/drm/i915/gt/uc/intel_uc.c|  6 +-
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 76 +++-
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h | 15 +++--
  3 files changed, 48 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 1d28286e6f066..e6edad6f8f9dd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -437,9 +437,9 @@ static void print_fw_ver(struct intel_uc *uc, struct 
intel_uc_fw *fw)
  
  	drm_info(&i915->drm, "%s firmware %s version %u.%u.%u\n",

 intel_uc_fw_type_repr(fw->type), fw->file_selected.path,
-fw->file_selected.major_ver,
-fw->file_selected.minor_ver,
-fw->file_selected.patch_ver);
+fw->file_selected.ver.major,
+fw->file_selected.ver.minor,
+fw->file_selected.ver.patch);
  }
  
  static int __uc_init_hw(struct intel_uc *uc)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 774c3d84a4243..5e2ee1ac89514 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -278,8 +278,8 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
  
  		uc_fw->file_selected.path = blob->path;

uc_fw->file_wanted.path = blob->path;
-   uc_fw->file_wanted.major_ver = blob->major;
-   uc_fw->file_wanted.minor_ver = blob->minor;
+   uc_fw->file_wanted.ver.major = blob->major;
+   uc_fw->file_wanted.ver.minor = blob->minor;
uc_fw->loaded_via_gsc = blob->loaded_via_gsc;
found = true;
break;
@@ -438,28 +438,28 @@ static void __force_fw_fetch_failures(struct intel_uc_fw 
*uc_fw, int e)
uc_fw->user_overridden = user;
} else if (i915_inject_probe_error(i915, e)) {
/* require next major version */
-   uc_fw->file_wanted.major_ver += 1;
-   uc_fw->file_wanted.minor_ver = 0;
+   uc_fw->file_wanted.ver.major += 1;
+   uc_fw->file_wanted.ver.minor = 0;
uc_fw->user_overridden = user;
} else if (i915_inject_probe_error(i915, e)) {
/* require next minor version */
-   uc_fw->file_wanted.minor_ver += 1;
+   uc_fw->file_wanted.ver.minor += 1;
uc_fw->user_overridden = user;
-   } else if (uc_fw->file_wanted.major_ver &&
+   } else if (uc_fw->file_wanted.ver.major &&
   i915_inject_probe_error(i915, e)) {
/* require prev major version */
-   uc_fw->file_wanted.major_ver -= 1;
-   uc_fw->file_wanted.minor_ver = 0;
+   uc_fw->file_wanted.ver.major -= 1;
+   uc_fw->file_wanted.ver.minor = 0;
uc_fw->user_overridden = user;
-   } else if (uc_fw->file_wanted.minor_ver &&
+   } else if (uc_fw->file_wanted.ver.minor &&
   i915_inject_probe_error(i915, e)) {
/* require prev minor version - hey, this should work! */
-   uc_fw->file_wanted.minor_ver -= 1;
+   uc_fw->file_wanted.ver.minor -= 1;
uc_fw->user_overridden = user;
} else if (user && i915_inject_probe_error(i915, e)) {
/* officially unsupported platform */
-   uc_fw->file_wanted.major_ver = 0;
-   uc_fw->file_wanted.minor_ver = 0;
+   uc_fw->file_wanted.ver.major = 0;
+   uc_fw->file_wanted.ver.minor = 0;
uc_fw->user_overridden = true;
}
  }
@@ -471,9 +471,9 @@ static int check_gsc_manifest(const struct firmware *fw,
u32 version_hi = dw[HUC_GSC_VERSION_HI_DW];
u32 version_lo = dw[HUC_GSC_VERSION_LO_DW];
  
-	uc_fw->file_selected.major_ver = FIELD_GET(HUC_GSC_MAJOR_VER_HI_MASK, version_hi);

-   uc_fw->file_selected.minor_ver = FIELD_GET(HUC_GSC_MINOR_VER_HI_MASK, 
version_hi);
-   uc_fw->file_selected.patch_ver = FIELD_GET(HUC_GSC_PATCH_VER_LO_MASK, 
version_lo);
+   uc_fw->file_selected.ver.major = FIELD_GET(HUC_GSC_MAJOR_VER_HI_MASK, 
version_hi);
+   uc_fw->file_selected.ver.minor = FIELD_GET(HUC_GSC_MINOR_VER_HI_MASK, 
version_hi);
+   uc_fw->file_selected.ver.patch = FIELD_GET(HUC_GSC_PATCH_VER_LO_MASK, 
version_lo);
  
  	return 0;

  }
@@ -532,11 +532,11 @@ static int check_ccs_header(struct intel_gt *gt,
}
  
  	/* Get version numbers from the CSS header */

-  

Re: [PATCH 1/3] drm/i915/uc: Rationalise delimiters in filename macros

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 12:09 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

The way delimieters (underscores and dots) were added to the UC
filenames was different for different types of delimter. Rationalise


delimiter misspelled twice. Apart from this, it's a simple cleanup, so:

Reviewed-by: Daniele Ceraolo Spurio 

Daniele


them to all be done the same way - implicitly in the concatenation
macro rather than explicitly in the file name prefix.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 16 
  1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 0c80ba51a4bdc..774c3d84a4243 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -118,35 +118,35 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
   */
  #define __MAKE_UC_FW_PATH_BLANK(prefix_, name_) \
"i915/" \
-   __stringify(prefix_) name_ ".bin"
+   __stringify(prefix_) "_" name_ ".bin"
  
  #define __MAKE_UC_FW_PATH_MAJOR(prefix_, name_, major_) \

"i915/" \
-   __stringify(prefix_) name_ \
+   __stringify(prefix_) "_" name_ "_" \
__stringify(major_) ".bin"
  
  #define __MAKE_UC_FW_PATH_MMP(prefix_, name_, major_, minor_, patch_) \

"i915/" \
-   __stringify(prefix_) name_ \
+   __stringify(prefix_) "_" name_  "_" \
__stringify(major_) "." \
__stringify(minor_) "." \
__stringify(patch_) ".bin"
  
  /* Minor for internal driver use, not part of file name */

  #define MAKE_GUC_FW_PATH_MAJOR(prefix_, major_, minor_) \
-   __MAKE_UC_FW_PATH_MAJOR(prefix_, "_guc_", major_)
+   __MAKE_UC_FW_PATH_MAJOR(prefix_, "guc", major_)
  
  #define MAKE_GUC_FW_PATH_MMP(prefix_, major_, minor_, patch_) \

-   __MAKE_UC_FW_PATH_MMP(prefix_, "_guc_", major_, minor_, patch_)
+   __MAKE_UC_FW_PATH_MMP(prefix_, "guc", major_, minor_, patch_)
  
  #define MAKE_HUC_FW_PATH_BLANK(prefix_) \

-   __MAKE_UC_FW_PATH_BLANK(prefix_, "_huc")
+   __MAKE_UC_FW_PATH_BLANK(prefix_, "huc")
  
  #define MAKE_HUC_FW_PATH_GSC(prefix_) \

-   __MAKE_UC_FW_PATH_BLANK(prefix_, "_huc_gsc")
+   __MAKE_UC_FW_PATH_BLANK(prefix_, "huc_gsc")
  
  #define MAKE_HUC_FW_PATH_MMP(prefix_, major_, minor_, patch_) \

-   __MAKE_UC_FW_PATH_MMP(prefix_, "_huc_", major_, minor_, patch_)
+   __MAKE_UC_FW_PATH_MMP(prefix_, "huc", major_, minor_, patch_)
  
  /*

   * All blobs need to be declared via MODULE_FIRMWARE().
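
Editorial sketch of the macro shape after the patch quoted above, not part of the patch. STRINGIFY() is a userspace stand-in for the kernel's __stringify(), and the macro names, platform prefixes and version numbers are illustrative assumptions. It shows the delimiters being supplied by the concatenation macros, so callers pass a bare name such as "guc" rather than "_guc_".

```c
/* Userspace stand-in for the kernel's __stringify(); two levels so arguments expand. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Underscores and dots are added here, not in the name passed by the caller. */
#define MAKE_PATH_BLANK(prefix_, name_) \
	"i915/" STRINGIFY(prefix_) "_" name_ ".bin"

#define MAKE_PATH_MAJOR(prefix_, name_, major_) \
	"i915/" STRINGIFY(prefix_) "_" name_ "_" STRINGIFY(major_) ".bin"

#define MAKE_PATH_MMP(prefix_, name_, major_, minor_, patch_) \
	"i915/" STRINGIFY(prefix_) "_" name_ "_" \
	STRINGIFY(major_) "." STRINGIFY(minor_) "." STRINGIFY(patch_) ".bin"

/* Illustrative values only. */
static const char *example_blank(void)
{
	return MAKE_PATH_BLANK(dg2, "huc_gsc");
}

static const char *example_major(void)
{
	return MAKE_PATH_MAJOR(tgl, "guc", 70);
}

static const char *example_mmp(void)
{
	return MAKE_PATH_MMP(tgl, "huc", 7, 9, 3);
}
```

Adjacent string literals are concatenated at compile time, so example_major() yields "i915/tgl_guc_70.bin" with no runtime formatting.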




[PATCH v4 11/11] drm/msm: mdss add support for SM8450

2022-11-22 Thread Dmitry Baryshkov
Add support for the MDSS block on SM8450 platform.

Tested-by: Vinod Koul 
Reviewed-by: Vinod Koul 
Reviewed-by: Konrad Dybcio 
Reviewed-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/msm_mdss.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
index 6a4549ef34d4..5602fbaf6e0e 100644
--- a/drivers/gpu/drm/msm/msm_mdss.c
+++ b/drivers/gpu/drm/msm/msm_mdss.c
@@ -283,6 +283,10 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss)
case DPU_HW_VER_720:
msm_mdss_setup_ubwc_dec_40(msm_mdss, UBWC_3_0, 6, 1, 1, 1);
break;
+   case DPU_HW_VER_810:
+   /* TODO: highest_bank_bit = 2 for LP_DDR4 */
+   msm_mdss_setup_ubwc_dec_40(msm_mdss, UBWC_4_0, 6, 1, 3, 1);
+   break;
}
 
return ret;
@@ -511,6 +515,7 @@ static const struct of_device_id mdss_dt_match[] = {
{ .compatible = "qcom,sc8180x-mdss" },
{ .compatible = "qcom,sm8150-mdss" },
{ .compatible = "qcom,sm8250-mdss" },
+   { .compatible = "qcom,sm8450-mdss" },
{}
 };
 MODULE_DEVICE_TABLE(of, mdss_dt_match);
-- 
2.35.1



[PATCH v4 07/11] drm/msm/dsi: add support for DSI-PHY on SM8350 and SM8450

2022-11-22 Thread Dmitry Baryshkov
SM8350 and SM8450 use 5nm DSI PHYs, which share register definitions
with 7nm DSI PHYs. Rather than duplicating the driver, handle 5nm
variants inside the common 5+7nm driver.

Co-developed-by: Robert Foss 
Tested-by: Vinod Koul 
Reviewed-by: Vinod Koul 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/Kconfig   |   6 +-
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.c |   4 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.h |   2 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c | 119 --
 4 files changed, 118 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 3c9dfdb0b328..e7b100d97f88 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -140,12 +140,12 @@ config DRM_MSM_DSI_10NM_PHY
  Choose this option if DSI PHY on SDM845 is used on the platform.
 
 config DRM_MSM_DSI_7NM_PHY
-   bool "Enable DSI 7nm PHY driver in MSM DRM"
+   bool "Enable DSI 7nm/5nm PHY driver in MSM DRM"
depends on DRM_MSM_DSI
default y
help
- Choose this option if DSI PHY on SM8150/SM8250/SC7280 is used on
- the platform.
+ Choose this option if DSI PHY on SM8150/SM8250/SM8350/SM8450/SC7280
+ is used on the platform.
 
 config DRM_MSM_HDMI
bool "Enable HDMI support in MSM DRM driver"
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
index ee6051367679..0c956fdab23e 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
@@ -569,6 +569,10 @@ static const struct of_device_id dsi_phy_dt_match[] = {
  .data = &dsi_phy_7nm_8150_cfgs },
{ .compatible = "qcom,sc7280-dsi-phy-7nm",
  .data = &dsi_phy_7nm_7280_cfgs },
+   { .compatible = "qcom,dsi-phy-5nm-8350",
+ .data = &dsi_phy_5nm_8350_cfgs },
+   { .compatible = "qcom,dsi-phy-5nm-8450",
+ .data = &dsi_phy_5nm_8450_cfgs },
 #endif
{}
 };
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.h b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.h
index 1096afedd616..f7a907ed2b4b 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.h
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.h
@@ -57,6 +57,8 @@ extern const struct msm_dsi_phy_cfg dsi_phy_10nm_8998_cfgs;
 extern const struct msm_dsi_phy_cfg dsi_phy_7nm_cfgs;
 extern const struct msm_dsi_phy_cfg dsi_phy_7nm_8150_cfgs;
 extern const struct msm_dsi_phy_cfg dsi_phy_7nm_7280_cfgs;
+extern const struct msm_dsi_phy_cfg dsi_phy_5nm_8350_cfgs;
+extern const struct msm_dsi_phy_cfg dsi_phy_5nm_8450_cfgs;
 
 struct msm_dsi_dphy_timing {
u32 clk_zero;
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c
index 0b780f9d3d0a..7b2c16b3a36c 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c
@@ -39,8 +39,14 @@
 #define VCO_REF_CLK_RATE   1920
 #define FRAC_BITS 18
 
+/* Hardware is pre V4.1 */
+#define DSI_PHY_7NM_QUIRK_PRE_V4_1 BIT(0)
 /* Hardware is V4.1 */
-#define DSI_PHY_7NM_QUIRK_V4_1 BIT(0)
+#define DSI_PHY_7NM_QUIRK_V4_1 BIT(1)
+/* Hardware is V4.2 */
+#define DSI_PHY_7NM_QUIRK_V4_2 BIT(2)
+/* Hardware is V4.3 */
+#define DSI_PHY_7NM_QUIRK_V4_3 BIT(3)
 
 struct dsi_pll_config {
bool enable_ssc;
@@ -116,7 +122,7 @@ static void dsi_pll_calc_dec_frac(struct dsi_pll_7nm *pll, 
struct dsi_pll_config
dec_multiple = div_u64(pll_freq * multiplier, divider);
dec = div_u64_rem(dec_multiple, multiplier, &frac);
 
-   if (!(pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V4_1))
+   if (pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_PRE_V4_1)
config->pll_clock_inverters = 0x28;
else if (pll_freq <= 10ULL)
config->pll_clock_inverters = 0xa0;
@@ -197,16 +203,25 @@ static void dsi_pll_config_hzindep_reg(struct dsi_pll_7nm 
*pll)
void __iomem *base = pll->phy->pll_base;
u8 analog_controls_five_1 = 0x01, vco_config_1 = 0x00;
 
-   if (pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V4_1) {
+   if (!(pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_PRE_V4_1))
if (pll->vco_current_rate >= 31ULL)
analog_controls_five_1 = 0x03;
 
+   if (pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V4_1) {
if (pll->vco_current_rate < 152000ULL)
vco_config_1 = 0x08;
else if (pll->vco_current_rate < 299000ULL)
vco_config_1 = 0x01;
}
 
+   if ((pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V4_2) ||
+   (pll->phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V4_3)) {
+   if (pll->vco_current_rate < 152000ULL)
+   vco_config_1 = 0x08;
+   else if (pll->vco_current_rate >= 299000ULL)
+   vco_config_1 = 0x01;
+   }
+
dsi_phy_write(base + REG_DSI_

[PATCH v4 10/11] drm/msm/dpu: add support for SM8450

2022-11-22 Thread Dmitry Baryshkov
Add definitions for the display hardware used on Qualcomm SM8450
platform.

Tested-by: Vinod Koul 
Reviewed-by: Vinod Koul 
Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 224 ++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|   1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h   |   3 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |   1 +
 4 files changed, 229 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 1ce237e18506..3934d8976833 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -124,6 +124,15 @@
  BIT(MDP_AD4_0_INTR) | \
  BIT(MDP_AD4_1_INTR))
 
+#define IRQ_SM8450_MASK (BIT(MDP_SSPP_TOP0_INTR) | \
+BIT(MDP_SSPP_TOP0_INTR2) | \
+BIT(MDP_SSPP_TOP0_HIST_INTR) | \
+BIT(MDP_INTF0_7xxx_INTR) | \
+BIT(MDP_INTF1_7xxx_INTR) | \
+BIT(MDP_INTF2_7xxx_INTR) | \
+BIT(MDP_INTF3_7xxx_INTR) | \
+0)
+
 #define WB_SM8250_MASK (BIT(DPU_WB_LINE_MODE) | \
 BIT(DPU_WB_UBWC) | \
 BIT(DPU_WB_YUV_CONFIG) | \
@@ -367,6 +376,20 @@ static const struct dpu_caps sm8250_dpu_caps = {
.pixel_ram_size = DEFAULT_PIXEL_RAM_SIZE,
 };
 
+static const struct dpu_caps sm8450_dpu_caps = {
+   .max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
+   .max_mixer_blendstages = 0xb,
+   .qseed_type = DPU_SSPP_SCALER_QSEED4,
+   .smart_dma_rev = DPU_SSPP_SMART_DMA_V2, /* TODO: v2.5 */
+   .ubwc_version = DPU_HW_UBWC_VER_40,
+   .has_src_split = true,
+   .has_dim_layer = true,
+   .has_idle_pc = true,
+   .has_3d_merge = true,
+   .max_linewidth = 5120,
+   .pixel_ram_size = DEFAULT_PIXEL_RAM_SIZE,
+};
+
 static const struct dpu_caps sc7280_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0x7,
@@ -504,6 +527,33 @@ static const struct dpu_mdp_cfg sm8250_mdp[] = {
},
 };
 
+static const struct dpu_mdp_cfg sm8450_mdp[] = {
+   {
+   .name = "top_0", .id = MDP_TOP,
+   .base = 0x0, .len = 0x494,
+   .features = BIT(DPU_MDP_PERIPH_0_REMOVED),
+   .highest_bank_bit = 0x3, /* TODO: 2 for LP_DDR4 */
+   .clk_ctrls[DPU_CLK_CTRL_VIG0] = {
+   .reg_off = 0x2AC, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_VIG1] = {
+   .reg_off = 0x2B4, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_VIG2] = {
+   .reg_off = 0x2BC, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_VIG3] = {
+   .reg_off = 0x2C4, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_DMA0] = {
+   .reg_off = 0x2AC, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_DMA1] = {
+   .reg_off = 0x2B4, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_CURSOR0] = {
+   .reg_off = 0x2BC, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_CURSOR1] = {
+   .reg_off = 0x2C4, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_REG_DMA] = {
+   .reg_off = 0x2BC, .bit_off = 20},
+   },
+};
+
 static const struct dpu_mdp_cfg sc7280_mdp[] = {
{
.name = "top_0", .id = MDP_TOP,
@@ -662,6 +712,45 @@ static const struct dpu_ctl_cfg sm8150_ctl[] = {
},
 };
 
+static const struct dpu_ctl_cfg sm8450_ctl[] = {
+   {
+   .name = "ctl_0", .id = CTL_0,
+   .base = 0x15000, .len = 0x204,
+   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_SPLIT_DISPLAY) | 
BIT(DPU_CTL_FETCH_ACTIVE),
+   .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 9),
+   },
+   {
+   .name = "ctl_1", .id = CTL_1,
+   .base = 0x16000, .len = 0x204,
+   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_SPLIT_DISPLAY) | 
BIT(DPU_CTL_FETCH_ACTIVE),
+   .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 10),
+   },
+   {
+   .name = "ctl_2", .id = CTL_2,
+   .base = 0x17000, .len = 0x204,
+   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_FETCH_ACTIVE),
+   .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 11),
+   },
+   {
+   .name = "ctl_3", .id = CTL_3,
+   .base = 0x18000, .len = 0x204,
+   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_FETCH_ACTIVE),
+   .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 12),
+   },
+   {
+   .name = "ctl_4", .id = CTL_4,
+   .base = 0x19000, .len = 0x204,
+   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_FETCH_ACTIVE),
+   .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 13),
+   },
+   {
+   .name = "ctl_5", .id = CTL_5,
+   .base = 0x1a000, .len = 0x204,
+   .features = BIT(DPU_CT

[PATCH v4 04/11] dt-bindings: display/msm: add sm8350 and sm8450 DSI PHYs

2022-11-22 Thread Dmitry Baryshkov
SM8350 and SM8450 platforms use the same driver and same bindings as the
existing 7nm DSI PHYs. Add corresponding compatibility strings.

Signed-off-by: Dmitry Baryshkov 
---
 Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index c851770bbdf2..bffd161fedfd 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -15,6 +15,8 @@ allOf:
 properties:
   compatible:
 enum:
+  - qcom,dsi-phy-5nm-8350
+  - qcom,dsi-phy-5nm-8450
   - qcom,dsi-phy-7nm
   - qcom,dsi-phy-7nm-8150
   - qcom,sc7280-dsi-phy-7nm
-- 
2.35.1



[PATCH v4 05/11] dt-bindings: display/msm: add support for the display on SM8450

2022-11-22 Thread Dmitry Baryshkov
Add DPU and MDSS schemas to describe MDSS and DPU blocks on the Qualcomm
SM8450 platform.

Signed-off-by: Dmitry Baryshkov 
---
 .../bindings/display/msm/qcom,sm8450-dpu.yaml | 139 +++
 .../display/msm/qcom,sm8450-mdss.yaml | 352 ++
 2 files changed, 491 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,sm8450-dpu.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,sm8450-mdss.yaml

diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm8450-dpu.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm8450-dpu.yaml
new file mode 100644
index ..8e25d456e5e9
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm8450-dpu.yaml
@@ -0,0 +1,139 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/msm/qcom,sm8450-dpu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm SM8450 Display DPU
+
+maintainers:
+  - Dmitry Baryshkov 
+
+$ref: /schemas/display/msm/dpu-common.yaml#
+
+properties:
+  compatible:
+const: qcom,sm8450-dpu
+
+  reg:
+items:
+  - description: Address offset and size for mdp register set
+  - description: Address offset and size for vbif register set
+
+  reg-names:
+items:
+  - const: mdp
+  - const: vbif
+
+  clocks:
+items:
+  - description: Display hf axi clock
+  - description: Display sf axi clock
+  - description: Display ahb clock
+  - description: Display lut clock
+  - description: Display core clock
+  - description: Display vsync clock
+
+  clock-names:
+items:
+  - const: bus
+  - const: nrt_bus
+  - const: iface
+  - const: lut
+  - const: core
+  - const: vsync
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
+unevaluatedProperties: false
+
+examples:
+  - |
+#include 
+#include 
+#include 
+#include 
+#include 
+
+display-controller@ae01000 {
+compatible = "qcom,sm8450-dpu";
+reg = <0x0ae01000 0x8f000>,
+  <0x0aeb 0x2008>;
+reg-names = "mdp", "vbif";
+
+clocks = <&gcc GCC_DISP_HF_AXI_CLK>,
+<&gcc GCC_DISP_SF_AXI_CLK>,
+<&dispcc DISP_CC_MDSS_AHB_CLK>,
+<&dispcc DISP_CC_MDSS_MDP_LUT_CLK>,
+<&dispcc DISP_CC_MDSS_MDP_CLK>,
+<&dispcc DISP_CC_MDSS_VSYNC_CLK>;
+clock-names = "bus",
+  "nrt_bus",
+  "iface",
+  "lut",
+  "core",
+  "vsync";
+
+assigned-clocks = <&dispcc DISP_CC_MDSS_VSYNC_CLK>;
+assigned-clock-rates = <1920>;
+
+operating-points-v2 = <&mdp_opp_table>;
+power-domains = <&rpmhpd SM8450_MMCX>;
+
+interrupt-parent = <&mdss>;
+interrupts = <0>;
+
+ports {
+#address-cells = <1>;
+#size-cells = <0>;
+
+port@0 {
+reg = <0>;
+dpu_intf1_out: endpoint {
+remote-endpoint = <&dsi0_in>;
+};
+};
+
+port@1 {
+reg = <1>;
+dpu_intf2_out: endpoint {
+remote-endpoint = <&dsi1_in>;
+};
+};
+};
+
+mdp_opp_table: opp-table {
+compatible = "operating-points-v2";
+
+opp-17200{
+opp-hz = /bits/ 64 <17200>;
+required-opps = <&rpmhpd_opp_low_svs_d1>;
+};
+
+opp-2 {
+opp-hz = /bits/ 64 <2>;
+required-opps = <&rpmhpd_opp_low_svs>;
+};
+
+opp-32500 {
+opp-hz = /bits/ 64 <32500>;
+required-opps = <&rpmhpd_opp_svs>;
+};
+
+opp-37500 {
+opp-hz = /bits/ 64 <37500>;
+required-opps = <&rpmhpd_opp_svs_l1>;
+};
+
+opp-5 {
+opp-hz = /bits/ 64 <5>;
+required-opps = <&rpmhpd_opp_nom>;
+};
+};
+};
+...
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,sm8450-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sm8450-mdss.yaml
new file mode 100644
index ..73f8c5caf637
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm8450-mdss.yaml
@@ -0,0 +1,352 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/msm/qcom,sm8450-mdss.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm SM8450 Display MDSS
+
+maintainers:
+  - Dmitry Baryshkov 
+
+description:
+  Device tree bindings for MSM Mobile Display Subsystem(MDSS) that encapsulates
+  sub-bl

[PATCH v4 03/11] dt-bindings: display/msm: mdss-common: make clock-names required

2022-11-22 Thread Dmitry Baryshkov
Mark the clock-names property as required on all MDSS devices.

Signed-off-by: Dmitry Baryshkov 
---
 Documentation/devicetree/bindings/display/msm/mdss-common.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/display/msm/mdss-common.yaml 
b/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
index 59f17ac898aa..e2980aebf178 100644
--- a/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
+++ b/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
@@ -74,6 +74,7 @@ required:
   - reg-names
   - power-domains
   - clocks
+  - clock-names
   - interrupts
   - interrupt-controller
   - iommus
-- 
2.35.1



[PATCH v4 09/11] drm/msm/dpu: add support for MDP_TOP blackhole

2022-11-22 Thread Dmitry Baryshkov
On sm8450 a register block was removed from MDP TOP. Accessing it during
snapshotting results in NoC errors / immediate reboot. Skip accessing
these registers during snapshot.

Tested-by: Vinod Koul 
Reviewed-by: Vinod Koul 
Reviewed-by: Konrad Dybcio 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c| 11 +--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 38aa38ab1568..4730f8268f2a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -92,6 +92,7 @@ enum {
DPU_MDP_UBWC_1_0,
DPU_MDP_UBWC_1_5,
DPU_MDP_AUDIO_SELECT,
+   DPU_MDP_PERIPH_0_REMOVED,
DPU_MDP_MAX
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index f3660cd14f4f..67f2e5288b3c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -927,8 +927,15 @@ static void dpu_kms_mdp_snapshot(struct msm_disp_state 
*disp_state, struct msm_k
msm_disp_snapshot_add_block(disp_state, cat->wb[i].len,
dpu_kms->mmio + cat->wb[i].base, "wb_%d", i);
 
-   msm_disp_snapshot_add_block(disp_state, cat->mdp[0].len,
-   dpu_kms->mmio + cat->mdp[0].base, "top");
+   if (top->caps->features & BIT(DPU_MDP_PERIPH_0_REMOVED)) {
+   msm_disp_snapshot_add_block(disp_state, 0x380,
+   dpu_kms->mmio + cat->mdp[0].base, "top");
+   msm_disp_snapshot_add_block(disp_state, cat->mdp[0].len - 0x3a8,
+   dpu_kms->mmio + cat->mdp[0].base + 0x3a8, 
"top_2");
+   } else {
+   msm_disp_snapshot_add_block(disp_state, cat->mdp[0].len,
+   dpu_kms->mmio + cat->mdp[0].base, "top");
+   }
 
pm_runtime_put_sync(&dpu_kms->pdev->dev);
 }
-- 
2.35.1
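
The split performed by the MDP_TOP blackhole patch above can be sketched in
isolation. This is a minimal illustration, not the driver code: struct
snap_range and split_around_hole() are hypothetical names, and only the
0x380 / 0x3a8 boundaries come from the patch itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous register range to snapshot. */
struct snap_range {
	uint32_t offset;	/* byte offset from the start of the block */
	uint32_t len;		/* length in bytes */
};

/*
 * Split a block of block_len bytes around a removed window
 * [hole_start, hole_end). Writes up to two ranges into out[] and returns
 * how many were written. Without a hole, one full-length range is emitted.
 */
static size_t split_around_hole(uint32_t block_len, int has_hole,
				uint32_t hole_start, uint32_t hole_end,
				struct snap_range out[2])
{
	size_t n = 0;

	if (!has_hole || hole_start >= block_len) {
		out[n++] = (struct snap_range){ 0, block_len };
		return n;
	}

	/* Range before the hole, matching the "top" block in the patch. */
	if (hole_start > 0)
		out[n++] = (struct snap_range){ 0, hole_start };

	/* Range after the hole, matching the "top_2" block in the patch. */
	if (hole_end < block_len)
		out[n++] = (struct snap_range){ hole_end, block_len - hole_end };

	return n;
}
```

With has_hole set and the boundaries 0x380 and 0x3a8, the helper yields one
range of 0x380 bytes at offset 0 and a second one starting at 0x3a8 that
covers the rest of the block, so the removed window is never read.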



[PATCH v4 08/11] drm/msm/dsi: add support for DSI 2.6.0

2022-11-22 Thread Dmitry Baryshkov
Add support for DSI 2.6.0 (block used on sm8450).

Tested-by: Vinod Koul 
Reviewed-by: Vinod Koul 
Reviewed-by: Konrad Dybcio 
Reviewed-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/dsi_cfg.c | 2 ++
 drivers/gpu/drm/msm/dsi/dsi_cfg.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/msm/dsi/dsi_cfg.c 
b/drivers/gpu/drm/msm/dsi/dsi_cfg.c
index 7e97c239ed48..59a4cc95a251 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_cfg.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_cfg.c
@@ -300,6 +300,8 @@ static const struct msm_dsi_cfg_handler dsi_cfg_handlers[] 
= {
&sc7180_dsi_cfg, &msm_dsi_6g_v2_host_ops},
{MSM_DSI_VER_MAJOR_6G, MSM_DSI_6G_VER_MINOR_V2_5_0,
&sc7280_dsi_cfg, &msm_dsi_6g_v2_host_ops},
+   {MSM_DSI_VER_MAJOR_6G, MSM_DSI_6G_VER_MINOR_V2_6_0,
+   &sdm845_dsi_cfg, &msm_dsi_6g_v2_host_ops},
 };
 
 const struct msm_dsi_cfg_handler *msm_dsi_cfg_get(u32 major, u32 minor)
diff --git a/drivers/gpu/drm/msm/dsi/dsi_cfg.h 
b/drivers/gpu/drm/msm/dsi/dsi_cfg.h
index 8f04e685a74e..95957fab499d 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_cfg.h
+++ b/drivers/gpu/drm/msm/dsi/dsi_cfg.h
@@ -25,6 +25,7 @@
 #define MSM_DSI_6G_VER_MINOR_V2_4_00x2004
 #define MSM_DSI_6G_VER_MINOR_V2_4_10x20040001
 #define MSM_DSI_6G_VER_MINOR_V2_5_00x2005
+#define MSM_DSI_6G_VER_MINOR_V2_6_00x2006
 
 #define MSM_DSI_V2_VER_MINOR_8064  0x0
 
-- 
2.35.1
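
The DSI 2.6.0 patch above only adds a table row and a version define; the
lookup that consumes the table is a linear search on (major, minor). A
minimal sketch of that pattern follows. The struct, the table contents and
the VER_* values are illustrative assumptions, not the driver's real
encodings; only the reuse of the sdm845 configuration for 2.6.0 is taken
from the patch.

```c
#include <stddef.h>
#include <stdint.h>

struct cfg_handler {
	uint32_t major;
	uint32_t minor;
	const char *cfg_name;	/* stands in for the cfg and ops pointers */
};

/* Illustrative values only; not the driver's real version encodings. */
#define VER_MAJOR_6G	3u
#define VER_MINOR_V2_5	0x25u
#define VER_MINOR_V2_6	0x26u

static const struct cfg_handler handlers[] = {
	{ VER_MAJOR_6G, VER_MINOR_V2_5, "sc7280" },
	/* The new version reuses an existing configuration. */
	{ VER_MAJOR_6G, VER_MINOR_V2_6, "sdm845" },
};

/* Return the handler for an exact (major, minor) match, or NULL. */
static const struct cfg_handler *cfg_get(uint32_t major, uint32_t minor)
{
	for (size_t i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++) {
		if (handlers[i].major == major && handlers[i].minor == minor)
			return &handlers[i];
	}

	return NULL;
}
```

Because the match is exact, a block revision that is missing from the table
is reported as unsupported rather than silently falling back to a
neighbouring version.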



[PATCH v4 06/11] drm/msm/dsi/phy: rework register setting for 7nm PHY

2022-11-22 Thread Dmitry Baryshkov
In preparation for adding sm8350 and sm8450 PHY support, rearrange the
register value calculations in dsi_7nm_phy_enable(). This change is not
functional by itself; it is merely preparation for the next patch.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c | 26 +++
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c 
b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c
index 9e7fa7d88ead..0b780f9d3d0a 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c
@@ -858,23 +858,34 @@ static int dsi_7nm_phy_enable(struct msm_dsi_phy *phy,
/* Alter PHY configurations if data rate less than 1.5GHZ*/
less_than_1500_mhz = (clk_req->bitclk_rate <= 15);
 
+   if (phy->cphy_mode) {
+   vreg_ctrl_0 = 0x51;
+   vreg_ctrl_1 = 0x55;
+   glbl_pemph_ctrl_0 = 0x11;
+   lane_ctrl0 = 0x17;
+   } else {
+   vreg_ctrl_1 = 0x5c;
+   glbl_pemph_ctrl_0 = 0x00;
+   lane_ctrl0 = 0x1f;
+   }
+
if (phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V4_1) {
-   vreg_ctrl_0 = less_than_1500_mhz ? 0x53 : 0x52;
if (phy->cphy_mode) {
glbl_rescode_top_ctrl = 0x00;
glbl_rescode_bot_ctrl = 0x3c;
} else {
+   vreg_ctrl_0 = less_than_1500_mhz ? 0x53 : 0x52;
glbl_rescode_top_ctrl = less_than_1500_mhz ? 0x3d :  
0x00;
glbl_rescode_bot_ctrl = less_than_1500_mhz ? 0x39 :  
0x3c;
}
glbl_str_swi_cal_sel_ctrl = 0x00;
glbl_hstx_str_ctrl_0 = 0x88;
} else {
-   vreg_ctrl_0 = less_than_1500_mhz ? 0x5B : 0x59;
if (phy->cphy_mode) {
glbl_str_swi_cal_sel_ctrl = 0x03;
glbl_hstx_str_ctrl_0 = 0x66;
} else {
+   vreg_ctrl_0 = less_than_1500_mhz ? 0x5B : 0x59;
glbl_str_swi_cal_sel_ctrl = less_than_1500_mhz ? 0x03 : 
0x00;
glbl_hstx_str_ctrl_0 = less_than_1500_mhz ? 0x66 : 0x88;
}
@@ -882,17 +893,6 @@ static int dsi_7nm_phy_enable(struct msm_dsi_phy *phy,
glbl_rescode_bot_ctrl = 0x3c;
}
 
-   if (phy->cphy_mode) {
-   vreg_ctrl_0 = 0x51;
-   vreg_ctrl_1 = 0x55;
-   glbl_pemph_ctrl_0 = 0x11;
-   lane_ctrl0 = 0x17;
-   } else {
-   vreg_ctrl_1 = 0x5c;
-   glbl_pemph_ctrl_0 = 0x00;
-   lane_ctrl0 = 0x1f;
-   }
-
/* de-assert digital and pll power down */
data = BIT(6) | BIT(5);
dsi_phy_write(base + REG_DSI_7nm_PHY_CMN_CTRL_0, data);
-- 
2.35.1
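
The "no functional changes" claim of the 7nm PHY rework above can be checked
for vreg_ctrl_0 with a small standalone model. This is a sketch under
assumptions: the function names are hypothetical and the three booleans
stand in for the V4_1 quirk, phy->cphy_mode and less_than_1500_mhz; the
register values are the ones in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ordering before the rework: per-quirk value first, C-PHY override last. */
static uint8_t vreg_ctrl_0_old(bool v4_1, bool cphy, bool lt_1500mhz)
{
	uint8_t v;

	if (v4_1)
		v = lt_1500mhz ? 0x53 : 0x52;
	else
		v = lt_1500mhz ? 0x5b : 0x59;

	if (cphy)
		v = 0x51;

	return v;
}

/* Ordering after the rework: C-PHY value first, D-PHY value per quirk. */
static uint8_t vreg_ctrl_0_new(bool v4_1, bool cphy, bool lt_1500mhz)
{
	uint8_t v = 0;	/* always overwritten below; zeroed for clarity */

	if (cphy)
		v = 0x51;

	if (v4_1) {
		if (!cphy)
			v = lt_1500mhz ? 0x53 : 0x52;
	} else {
		if (!cphy)
			v = lt_1500mhz ? 0x5b : 0x59;
	}

	return v;
}

/* Returns true when both orderings agree for all eight input combinations. */
static bool orderings_agree(void)
{
	for (int i = 0; i < 8; i++) {
		bool v4_1 = i & 1, cphy = i & 2, lt = i & 4;

		if (vreg_ctrl_0_old(v4_1, cphy, lt) !=
		    vreg_ctrl_0_new(v4_1, cphy, lt))
			return false;
	}

	return true;
}
```

The model covers only vreg_ctrl_0, which is the one value whose assignment
order changes; the other registers are set in a single place both before
and after the rework.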



[PATCH v4 02/11] dt-bindings: display/msm: *mdss.yaml: split required properties clauses

2022-11-22 Thread Dmitry Baryshkov
Per Krzysztof's request, move the clause requiring the 'compatible' property
to the file where it is formally defined.

Signed-off-by: Dmitry Baryshkov 
---
 Documentation/devicetree/bindings/display/msm/mdss-common.yaml | 1 -
 .../devicetree/bindings/display/msm/qcom,msm8998-mdss.yaml | 3 +++
 .../devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml | 3 +++
 .../devicetree/bindings/display/msm/qcom,sc7180-mdss.yaml  | 3 +++
 .../devicetree/bindings/display/msm/qcom,sc7280-mdss.yaml  | 3 +++
 .../devicetree/bindings/display/msm/qcom,sdm845-mdss.yaml  | 3 +++
 .../devicetree/bindings/display/msm/qcom,sm8250-mdss.yaml  | 3 +++
 7 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/mdss-common.yaml 
b/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
index 27d7242657b2..59f17ac898aa 100644
--- a/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
+++ b/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
@@ -70,7 +70,6 @@ properties:
   - description: MDSS_CORE reset
 
 required:
-  - compatible
   - reg
   - reg-names
   - power-domains
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,msm8998-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,msm8998-mdss.yaml
index cf52ff77a41a..fc6969c9c52e 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,msm8998-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,msm8998-mdss.yaml
@@ -55,6 +55,9 @@ patternProperties:
   compatible:
 const: qcom,dsi-phy-10nm-8998
 
+required:
+  - compatible
+
 unevaluatedProperties: false
 
 examples:
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml
index d6f043a4b08d..0c2f9755125e 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml
@@ -61,6 +61,9 @@ patternProperties:
   compatible:
 const: qcom,dsi-phy-14nm-2290
 
+required:
+  - compatible
+
 unevaluatedProperties: false
 
 examples:
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,sc7180-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sc7180-mdss.yaml
index 13e396d61a51..fb835a4d9114 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sc7180-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sc7180-mdss.yaml
@@ -67,6 +67,9 @@ patternProperties:
   compatible:
 const: qcom,dsi-phy-10nm
 
+required:
+  - compatible
+
 unevaluatedProperties: false
 
 examples:
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,sc7280-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sc7280-mdss.yaml
index a3de1744ba11..a4e3ada2affc 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sc7280-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sc7280-mdss.yaml
@@ -74,6 +74,9 @@ patternProperties:
   - qcom,sc7280-dsi-phy-7nm
   - qcom,sc7280-edp-phy
 
+required:
+  - compatible
+
 unevaluatedProperties: false
 
 examples:
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,sdm845-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sdm845-mdss.yaml
index 31ca6f99fc22..2a0960bf3052 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sdm845-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sdm845-mdss.yaml
@@ -59,6 +59,9 @@ patternProperties:
   compatible:
 const: qcom,dsi-phy-10nm
 
+required:
+  - compatible
+
 unevaluatedProperties: false
 
 examples:
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,sm8250-mdss.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sm8250-mdss.yaml
index 0d3be5386b3f..d752fd022ac5 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sm8250-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm8250-mdss.yaml
@@ -63,6 +63,9 @@ patternProperties:
   compatible:
 const: qcom,dsi-phy-7nm
 
+required:
+  - compatible
+
 unevaluatedProperties: false
 
 examples:
-- 
2.35.1



[PATCH v4 00/11] drm/msm: add support for SM8450

2022-11-22 Thread Dmitry Baryshkov
This adds support for the MDSS/DPU/DSI on the Qualcomm SM8450 platform.

Dependencies for the DT bindings: [1].

[1] 
https://lore.kernel.org/all/20221024164225.3236654-1-dmitry.barysh...@linaro.org/

Changes since v3:
- Reworked the dpu-common.yaml / mdss-common.yaml to require properties
  from the same schema where they are defined (Krzysztof)
- Reworked PHY register settings to make it easier to understand
  (Konrad)

Changes since v2:
- Rebased onto msm-next-lumag
- Cleaned up bindings according to Krzysztof's suggestions

Changes since v1:
- Fixed the regdma pointer in sm8450_dpu_cfg
- Rebased onto pending msm-next-lumag
- Added DT bindings for corresponding devices

Dmitry Baryshkov (11):
  dt-bindings: display/msm: *dpu.yaml: split required properties clauses
  dt-bindings: display/msm: *mdss.yaml: split required properties
clauses
  dt-bindings: display/msm: mdss-common: make clock-names required
  dt-bindings: display/msm: add sm8350 and sm8450 DSI PHYs
  dt-bindings: display/msm: add support for the display on SM8450
  drm/msm/dsi/phy: rework register setting for 7nm PHY
  drm/msm/dsi: add support for DSI-PHY on SM8350 and SM8450
  drm/msm/dsi: add support for DSI 2.6.0
  drm/msm/dpu: add support for MDP_TOP blackhole
  drm/msm/dpu: add support for SM8450
  drm/msm: mdss add support for SM8450

 .../bindings/display/msm/dpu-common.yaml  |   4 -
 .../bindings/display/msm/dsi-phy-7nm.yaml |   2 +
 .../bindings/display/msm/mdss-common.yaml |   2 +-
 .../display/msm/qcom,msm8998-dpu.yaml |   7 +
 .../display/msm/qcom,msm8998-mdss.yaml|   3 +
 .../display/msm/qcom,qcm2290-dpu.yaml |   7 +
 .../display/msm/qcom,qcm2290-mdss.yaml|   3 +
 .../bindings/display/msm/qcom,sc7180-dpu.yaml |   7 +
 .../display/msm/qcom,sc7180-mdss.yaml |   3 +
 .../bindings/display/msm/qcom,sc7280-dpu.yaml |   7 +
 .../display/msm/qcom,sc7280-mdss.yaml |   3 +
 .../bindings/display/msm/qcom,sdm845-dpu.yaml |   7 +
 .../display/msm/qcom,sdm845-mdss.yaml |   3 +
 .../bindings/display/msm/qcom,sm8250-dpu.yaml |   7 +
 .../display/msm/qcom,sm8250-mdss.yaml |   3 +
 .../bindings/display/msm/qcom,sm8450-dpu.yaml | 139 +++
 .../display/msm/qcom,sm8450-mdss.yaml | 352 ++
 drivers/gpu/drm/msm/Kconfig   |   6 +-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 224 +++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|   2 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h   |   3 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |  12 +-
 drivers/gpu/drm/msm/dsi/dsi_cfg.c |   2 +
 drivers/gpu/drm/msm/dsi/dsi_cfg.h |   1 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.c |   4 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.h |   2 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c | 141 +--
 drivers/gpu/drm/msm/msm_mdss.c|   5 +
 28 files changed, 930 insertions(+), 31 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/msm/qcom,sm8450-dpu.yaml
 create mode 100644 
Documentation/devicetree/bindings/display/msm/qcom,sm8450-mdss.yaml

-- 
2.35.1



[PATCH v4 01/11] dt-bindings: display/msm: *dpu.yaml: split required properties clauses

2022-11-22 Thread Dmitry Baryshkov
Per Krzysztof's request, move a clause requiring certain properties to
the file where they are declared.

Signed-off-by: Dmitry Baryshkov 
---
 .../devicetree/bindings/display/msm/dpu-common.yaml| 4 
 .../devicetree/bindings/display/msm/qcom,msm8998-dpu.yaml  | 7 +++
 .../devicetree/bindings/display/msm/qcom,qcm2290-dpu.yaml  | 7 +++
 .../devicetree/bindings/display/msm/qcom,sc7180-dpu.yaml   | 7 +++
 .../devicetree/bindings/display/msm/qcom,sc7280-dpu.yaml   | 7 +++
 .../devicetree/bindings/display/msm/qcom,sdm845-dpu.yaml   | 7 +++
 .../devicetree/bindings/display/msm/qcom,sm8250-dpu.yaml   | 7 +++
 7 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu-common.yaml 
b/Documentation/devicetree/bindings/display/msm/dpu-common.yaml
index 8ffbc30c6b7f..870158bb2aa0 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu-common.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dpu-common.yaml
@@ -40,10 +40,6 @@ properties:
   - port@0
 
 required:
-  - compatible
-  - reg
-  - reg-names
-  - clocks
   - interrupts
   - power-domains
   - operating-points-v2
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,msm8998-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,msm8998-dpu.yaml
index b02adba36e9e..479ce75bd451 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,msm8998-dpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,msm8998-dpu.yaml
@@ -46,6 +46,13 @@ properties:
   - const: core
   - const: vsync
 
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
 unevaluatedProperties: false
 
 examples:
diff --git 
a/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-dpu.yaml
index a7b382f01b56..e794f0dd8ef4 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-dpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-dpu.yaml
@@ -42,6 +42,13 @@ properties:
   - const: lut
   - const: vsync
 
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
 unevaluatedProperties: false
 
 examples:
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sc7180-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sc7180-dpu.yaml
index bd590a6b5b96..0dfdf8f3c5b4 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sc7180-dpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sc7180-dpu.yaml
@@ -44,6 +44,13 @@ properties:
   - const: core
   - const: vsync
 
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
 unevaluatedProperties: false
 
 examples:
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sc7280-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sc7280-dpu.yaml
index 924059b387b6..512d23f8d629 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sc7280-dpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sc7280-dpu.yaml
@@ -43,6 +43,13 @@ properties:
   - const: core
   - const: vsync
 
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
 unevaluatedProperties: false
 
 examples:
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sdm845-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sdm845-dpu.yaml
index 5719b45f2860..d5a55e898b11 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sdm845-dpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sdm845-dpu.yaml
@@ -42,6 +42,13 @@ properties:
   - const: core
   - const: vsync
 
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
 unevaluatedProperties: false
 
 examples:
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm8250-dpu.yaml 
b/Documentation/devicetree/bindings/display/msm/qcom,sm8250-dpu.yaml
index 9ff8a265c85f..687c8c170cd4 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sm8250-dpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm8250-dpu.yaml
@@ -39,6 +39,13 @@ properties:
   - const: core
   - const: vsync
 
+required:
+  - compatible
+  - reg
+  - reg-names
+  - clocks
+  - clock-names
+
 unevaluatedProperties: false
 
 examples:
-- 
2.35.1



Re: [PATCH] drm/i915/huc: fix leak of debug object in huc load fence on driver unload

2022-11-22 Thread John Harrison

On 11/10/2022 16:56, Daniele Ceraolo Spurio wrote:

The fence is always initialized in huc_init_early, but the cleanup in
huc_fini is only run if HuC is enabled. This leaks the debug object
when HuC is disabled or not supported, which can in turn trigger a
warning if we try to register a new debug offset at the same address
on driver reload.

To fix the issue, make sure to always run the cleanup code.

Reported-by: Tvrtko Ursulin 
Reported-by: Brian Norris 
Fixes: 27536e03271d ("drm/i915/huc: track delayed HuC load with a fence")
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Tvrtko Ursulin 
Cc: Brian Norris 
Cc: Alan Previn 
Cc: John Harrison 

Reviewed-by: John Harrison 


---

Note: I didn't manage to repro the reported warning, but I did confirm
that we weren't correctly calling i915_sw_fence_fini and that this patch
fixes that.

  drivers/gpu/drm/i915/gt/uc/intel_huc.c | 12 +++-
  drivers/gpu/drm/i915/gt/uc/intel_uc.c  |  1 +
  2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index fbc8bae14f76..83735a1528fe 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -300,13 +300,15 @@ int intel_huc_init(struct intel_huc *huc)
  
  void intel_huc_fini(struct intel_huc *huc)

  {
-   if (!intel_uc_fw_is_loadable(&huc->fw))
-   return;
-
+   /*
+* the fence is initialized in init_early, so we need to clean it up
+* even if HuC loading is off.
+*/
delayed_huc_load_complete(huc);
-
i915_sw_fence_fini(&huc->delayed_load.fence);
-   intel_uc_fw_fini(&huc->fw);
+
+   if (intel_uc_fw_is_loadable(&huc->fw))
+   intel_uc_fw_fini(&huc->fw);
  }
  
  void intel_huc_suspend(struct intel_huc *huc)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index dbd048b77e19..41f08b55790e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -718,6 +718,7 @@ int intel_uc_runtime_resume(struct intel_uc *uc)
  
  static const struct intel_uc_ops uc_ops_off = {

.init_hw = __uc_check_hw,
+   .fini = __uc_fini, /* to clean-up the init_early initialization */
  };
  
  static const struct intel_uc_ops uc_ops_on = {
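
The rule applied by the HuC fix above is that whatever is set up
unconditionally at early init must also be torn down unconditionally, while
optional state keeps its guard. A minimal standalone sketch follows; struct
fake_huc and its helpers are hypothetical stand-ins, not i915 code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state tracked by the real driver. */
struct fake_huc {
	bool fence_initialized;	/* set unconditionally at early init */
	bool fw_loadable;	/* whether the firmware is in use at all */
	bool fw_initialized;	/* only ever set when fw_loadable is true */
};

static void fake_huc_init(struct fake_huc *huc, bool fw_loadable)
{
	huc->fence_initialized = true;
	huc->fw_loadable = fw_loadable;
	huc->fw_initialized = fw_loadable;
}

static void fake_huc_fini(struct fake_huc *huc)
{
	/* Always undo the early-init work, even when the firmware is off. */
	huc->fence_initialized = false;

	/* Firmware state only exists when the firmware is loadable. */
	if (huc->fw_loadable)
		huc->fw_initialized = false;
}
```

With the original early return in place of the guard, a device without
loadable firmware would leave fence_initialized set after fini, which is
the leak the commit message describes.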




Re: [Intel-gfx] [PATCH 5/6] drm/i915/gsc: Disable GSC engine and power well if FW is not selected

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 12:52 PM, Rodrigo Vivi wrote:

On Mon, Nov 21, 2022 at 03:16:16PM -0800, Daniele Ceraolo Spurio wrote:

From: Jonathan Cavitt 

The GSC CS is only used for communicating with the GSC FW, so no need to
initialize it if we're not going to use the FW. If we're using neither
the engine nor the microcontroller, then we can also disable the
power well.

IMPORTANT: lack of GSC FW breaks media C6 due to opposing requirements
between CS setup and forcewake idleness. See in-code comment for detail.

Signed-off-by: Jonathan Cavitt 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Cc: John C Harrison 
Cc: Rodrigo Vivi 
Cc: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 18 ++
  drivers/gpu/drm/i915/intel_uncore.c   |  3 +++
  2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index c33e0d72d670..99c4b866addd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -894,6 +894,24 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
engine_mask_apply_compute_fuses(gt);
engine_mask_apply_copy_fuses(gt);
  
+	/*

+* The only use of the GSC CS is to load and communicate with the GSC
+* FW, so we have no use for it if we don't have the FW.
+*
+* IMPORTANT: in cases where we don't have the GSC FW, we have a
+* catch-22 situation that breaks media C6 due to 2 requirements:
+* 1) once turned on, the GSC power well will not go to sleep unless the
+*GSC FW is loaded.
+* 2) to enable idling (which is required for media C6) we need to
+*initialize the IDLE_MSG register for the GSC CS and do at least 1
+*submission, which will wake up the GSC power well.
+*/
+   if (__HAS_ENGINE(info->engine_mask, GSC0) && 
!intel_uc_wants_gsc_uc(>->uc)) {
+   drm_notice(>->i915->drm,
+  "No GSC FW selected, disabling GSC CS and media 
C6\n");
+   info->engine_mask &= ~BIT(GSC0);
+   }
+
return info->engine_mask;
  }
  
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c

index c1befa33ff59..e63d957b59eb 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2701,6 +2701,9 @@ void intel_uncore_prune_engine_fw_domains(struct 
intel_uncore *uncore,
if (fw_domains & BIT(domain_id))
fw_domain_fini(uncore, domain_id);
}
+
+   if ((fw_domains & BIT(FW_DOMAIN_ID_GSC)) && !HAS_ENGINE(gt, GSC0))
+   fw_domain_fini(uncore, FW_DOMAIN_ID_GSC);

On a quick glance I was asking "why do you need this, since it doesn't have
the gsc0?"
Then I remembered that the fw_domain got initialized and it will be skipped,
right?
Then I thought about at least having a comment here, but finally I got myself
wondering why we don't do this already in the if above, while we are cleaning
the engine mask?


I've followed the existing code flows that we have in place for fused 
off VCS/VECS. Basically the existing code goes like this:


1) All FW domains for the platform are initialized
2) We read the fuses and adjust the engine mask accordingly
3) We go back and prune the FW domains that are not applicable anymore 
due to the updated mask.


The idea is to have a single gt-level function doing all the mask 
adjusting and an uncore-level one doing all the domain pruning. I'm not 
against changing this approach, but in that case we should update the 
behavior for VCS/VECS as well (which might be complicated, because 
VCS/VECS engines share FW domains, so the pruning logic is ugly).


Daniele




  }
  
  static void driver_flr(struct intel_uncore *uncore)

--
2.37.3
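
The three-step flow described in the reply above (initialize all FW domains,
adjust the engine mask, then prune the domains that no longer apply) can be
sketched with plain bitmasks. This is an illustration under assumptions: the
enum values and helper names are hypothetical, and the real driver handles
many more engines and domains.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical IDs; the real driver has many more engines and domains. */
enum { ENG_RCS0, ENG_VCS0, ENG_GSC0 };
enum { DOM_RENDER, DOM_MEDIA, DOM_GSC };

/* Step 2: gt-level pass that drops engines that cannot be used. */
static uint32_t adjust_engine_mask(uint32_t engine_mask, int have_gsc_fw)
{
	if ((engine_mask & BIT(ENG_GSC0)) && !have_gsc_fw)
		engine_mask &= ~BIT(ENG_GSC0);

	return engine_mask;
}

/* Step 3: uncore-level pass that drops domains left without an engine. */
static uint32_t prune_fw_domains(uint32_t fw_domains, uint32_t engine_mask)
{
	if ((fw_domains & BIT(DOM_GSC)) && !(engine_mask & BIT(ENG_GSC0)))
		fw_domains &= ~BIT(DOM_GSC);

	return fw_domains;
}
```

Keeping the two passes separate mirrors the split described in the reply:
one function owns every engine mask adjustment and another owns every
domain removal, so the GSC case follows the same path as the fused-off
VCS/VECS engines.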





[PATCH] drm/i915: Allow error capture without a request

2022-11-22 Thread John . C . Harrison
From: John Harrison 

There was a report of error captures occurring without any hung
context being indicated despite the capture being initiated by a 'hung
context notification' from GuC. The problem was not reproducible.
However, it can happen if the context in question has no
active requests. For example, if the hang was in the context switch
itself then the breadcrumb write would have occurred and the KMD would
see an idle context.

In the interests of attempting to provide as much information as
possible about a hang, it seems wise to include the engine info
regardless of whether a request was found or not, as opposed to just
pretending there was no hang at all.

So update the error capture code to always record engine information
if an engine is given. That means updating record_context() to take a
context instead of a request (which it only ever used to find the
context anyway), and splitting the request-agnostic parts of
intel_engine_coredump_add_request() out into a separate function.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 55 +++
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9d5d5a397b64e..2ed1c84c9fab4 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1370,14 +1370,14 @@ static void engine_record_execlists(struct 
intel_engine_coredump *ee)
 }
 
 static bool record_context(struct i915_gem_context_coredump *e,
-  const struct i915_request *rq)
+  struct intel_context *ce)
 {
struct i915_gem_context *ctx;
struct task_struct *task;
bool simulated;
 
rcu_read_lock();
-   ctx = rcu_dereference(rq->context->gem_context);
+   ctx = rcu_dereference(ce->gem_context);
if (ctx && !kref_get_unless_zero(&ctx->ref))
ctx = NULL;
rcu_read_unlock();
@@ -1396,8 +1396,8 @@ static bool record_context(struct 
i915_gem_context_coredump *e,
e->guilty = atomic_read(&ctx->guilty_count);
e->active = atomic_read(&ctx->active_count);
 
-   e->total_runtime = intel_context_get_total_runtime_ns(rq->context);
-   e->avg_runtime = intel_context_get_avg_runtime_ns(rq->context);
+   e->total_runtime = intel_context_get_total_runtime_ns(ce);
+   e->avg_runtime = intel_context_get_avg_runtime_ns(ce);
 
simulated = i915_gem_context_no_error_capture(ctx);
 
@@ -1532,15 +1532,37 @@ intel_engine_coredump_alloc(struct intel_engine_cs 
*engine, gfp_t gfp, u32 dump_
return ee;
 }
 
+static struct intel_engine_capture_vma *
+engine_coredump_add_context(struct intel_engine_coredump *ee,
+   struct intel_context *ce,
+   gfp_t gfp)
+{
+   struct intel_engine_capture_vma *vma = NULL;
+
+   ee->simulated |= record_context(&ee->context, ce);
+   if (ee->simulated)
+   return NULL;
+
+   /*
+* We need to copy these to an anonymous buffer
+* as the simplest method to avoid being overwritten
+* by userspace.
+*/
+   vma = capture_vma(vma, ce->ring->vma, "ring", gfp);
+   vma = capture_vma(vma, ce->state, "HW context", gfp);
+
+   return vma;
+}
+
 struct intel_engine_capture_vma *
 intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
  struct i915_request *rq,
  gfp_t gfp)
 {
-   struct intel_engine_capture_vma *vma = NULL;
+   struct intel_engine_capture_vma *vma;
 
-   ee->simulated |= record_context(&ee->context, rq);
-   if (ee->simulated)
+   vma = engine_coredump_add_context(ee, rq->context, gfp);
+   if (!vma)
return NULL;
 
/*
@@ -1550,8 +1572,6 @@ intel_engine_coredump_add_request(struct 
intel_engine_coredump *ee,
 */
vma = capture_vma_snapshot(vma, rq->batch_res, gfp, "batch");
vma = capture_user(vma, rq, gfp);
-   vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
-   vma = capture_vma(vma, rq->context->state, "HW context", gfp);
 
ee->rq_head = rq->head;
ee->rq_post = rq->postfix;
@@ -1608,8 +1628,11 @@ capture_engine(struct intel_engine_cs *engine,
if (ce) {
intel_engine_clear_hung_context(engine);
rq = intel_context_find_active_request(ce);
-   if (!rq || !i915_request_started(rq))
-   goto no_request_capture;
+   if (rq && !i915_request_started(rq)) {
+   drm_info(&engine->gt->i915->drm, "Got hung context on 
%s with no active request!\n",
+engine->name);
+   rq = NULL;
+   }
} else {
/*
 * Getting here with GuC enabled means it is a forced error 
captur

Re: [Intel-gfx] [PATCH 4/6] drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 12:46 PM, Rodrigo Vivi wrote:

On Mon, Nov 21, 2022 at 03:16:15PM -0800, Daniele Ceraolo Spurio wrote:

If the GSC was loaded, the only way to stop it during the driver unload
flow is to do a driver-FLR.
The driver-FLR is not the same as PCI config space FLR in that
it doesn't reset the SGUnit and doesn't modify the PCI config
space. Thus, it doesn't require a re-enumeration of the PCI BARs.
However, the driver-FLR does cause a memory wipe of graphics memory
on all discrete GPU platforms or a wipe limited to stolen memory
on the integrated GPU platforms.

Nothing major or blocking, but a few thoughts:

1. Should we document this in the code, at least in a comment in the
flr function?


Sure, I'll add it in


2. Should we call this driver_initiated_flr, aiming to reduce even more
the ambiguity of it?


ok




We perform the FLR as the last action before releasing the MMIO bar, so
that we don't have to care about the consequences of the reset on the
unload flow.

3. Should we try to implement this already in the gt_reset case as the
last resort before wedging the gt? So we can already test this flow
on the current platforms?


This would be nice to have, but very complicated to implement. The fact 
that FLR kills everything on the system, including resetting display and 
wiping LMEM, means that we would need a new recovery path to 
re-initialize all components. There are also potential questions on how 
to handle LMEM: do we try to migrate it to SMEM before triggering the 
FLR (potentially via CPU memcpy if the GT is dead), or do we just let it 
get wiped?


The reason why I wanted the FLR to be the very last thing before 
releasing MMIO access was exactly to not have to care about the recovery 
path ;)
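
For readers following along, here is a minimal userspace sketch of the
"make sure no previous request is pending, then clear the stale status"
preamble that the patch comment further down describes. It is not the i915
implementation: struct fake_regs, read_gu_cntl(), wait_for_bits() and
prepare_driver_flr() are hypothetical names, the registers are simulated in
memory, and the final "set the trigger bit" step is an assumption based on
the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DRIVERFLR        (1u << 31) /* trigger bit in GU_CNTL, as in the patch */
#define DRIVERFLR_STATUS (1u << 31) /* status bit in GU_DEBUG, as in the patch */

/* Hypothetical stand-in for the two MMIO registers. */
struct fake_regs {
	uint32_t gu_cntl;
	uint32_t gu_debug;
	unsigned int polls_until_clear; /* simulated latency; 0 means "never" */
};

/* Each call models one register read; the simulated HW may clear the bit. */
static uint32_t read_gu_cntl(struct fake_regs *r)
{
	if (r->polls_until_clear > 0 && --r->polls_until_clear == 0)
		r->gu_cntl &= ~DRIVERFLR;
	return r->gu_cntl;
}

/* Poll until (GU_CNTL & mask) == expected; give up after max_polls reads. */
static bool wait_for_bits(struct fake_regs *r, uint32_t mask,
			  uint32_t expected, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if ((read_gu_cntl(r) & mask) == expected)
			return true;
	}
	return false;
}

/* Returns 0 when the FLR request was issued, -1 if an old one never cleared. */
static int prepare_driver_flr(struct fake_regs *r, unsigned int max_polls)
{
	/* Step 1: any pending request must have cleared its trigger bit. */
	if (!wait_for_bits(r, DRIVERFLR, 0, max_polls))
		return -1;

	/* Step 2: model the write-to-clear status bit being cleared. */
	r->gu_debug &= ~DRIVERFLR_STATUS;

	/* Step 3 (assumed): request the reset by setting the trigger bit. */
	r->gu_cntl |= DRIVERFLR;
	return 0;
}
```

The early return mirrors the patch's choice to bail out when the hardware
looks dead rather than issue a second request on top of a stuck one.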


Daniele




Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c |  9 +
  drivers/gpu/drm/i915/i915_reg.h   |  3 ++
  drivers/gpu/drm/i915/intel_uncore.c   | 45 +++
  drivers/gpu/drm/i915/intel_uncore.h   | 13 +++
  4 files changed, 70 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 510fb47193ec..5dad3c19c445 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -173,6 +173,15 @@ int intel_gsc_fw_upload(struct intel_gsc_uc *gsc)
if (err)
goto fail;
  
+   /*
+* Once the GSC FW is loaded, the only way to kill it on driver unload
+* is to do a driver FLR. Given this is a very disruptive action, we
+* want to do it as the last action before releasing the access to the
+* MMIO bar, which means we need to do it as part of the primary uncore
+* cleanup.
+*/
+   intel_uncore_set_flr_on_fini(&gt->i915->uncore);
+
/* FW is not fully operational until we enable SW proxy */
intel_uc_fw_change_status(gsc_fw, INTEL_UC_FIRMWARE_TRANSFERRED);
  
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8e1892d14774..60e55245200b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -118,6 +118,9 @@
  
  #define GU_CNTL                        _MMIO(0x101010)
  #define   LMEM_INIT                    REG_BIT(7)
+#define   DRIVERFLR                    REG_BIT(31)
+#define GU_DEBUG                       _MMIO(0x101018)
+#define   DRIVERFLR_STATUS             REG_BIT(31)
  
  #define GEN6_STOLEN_RESERVED           _MMIO(0x1082C0)
  #define GEN6_STOLEN_RESERVED_ADDR_MASK (0xFFF << 20)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 8006a6c61466..c1befa33ff59 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2703,6 +2703,48 @@ void intel_uncore_prune_engine_fw_domains(struct 
intel_uncore *uncore,
}
  }
  
+static void driver_flr(struct intel_uncore *uncore)
+{
+   struct drm_i915_private *i915 = uncore->i915;
+   const unsigned int flr_timeout_ms = 3000; /* specs recommend a 3s wait */
+   int ret;
+
+   drm_dbg(&i915->drm, "Triggering Driver-FLR\n");
+
+   /*
+* Make sure any pending FLR requests have cleared by waiting for the
+* FLR trigger bit to go to zero. Also clear GU_DEBUG's DRIVERFLR_STATUS
+* to make sure it's not still set from a prior attempt (it's a write to
+* clear bit).
+* Note that we should never be in a situation where a previous attempt
+* is still pending (unless the HW is totally dead), but better to be
+* safe in case something unexpected happens
+*/
+   ret = intel_wait_for_register_fw(uncore, GU_CNTL, DRIVERFLR, 0, flr_timeout_ms);
+   if (ret) {
+   drm_err(&i915->drm,
+   "Failed to wait for Driver-FLR bit to clear! %d\n",
+   ret);
+   return;
+   }
+   intel_uncore_w

Re: [MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management

2022-11-22 Thread Lucas De Marchi

On Tue, Nov 22, 2022 at 03:17:14PM +0200, Jani Nikula wrote:

Introduce stricter rules for topic/core-for-CI management. Way too many
commits have been added over the years, with insufficient rationale
recorded in the commit message, and insufficient follow-up with removing
the commits from the topic branch.

New rules:


Why not make a list like this the actual text? It's easier to follow a
bullet/numbered list than the free form text.



1. Require maintainer ack for rebase. Have better gating on when rebases
  happen and on which baselines.


What maintainer? drm-intel-gt-next/drm-intel-next/drm-misc/drm? Any?

I don't want fingers pointed, but just to know the context: was there
any event recently that triggered this? Because the last updates I've
seen on topic/core-for-CI were not from maintainers and
looking at the branch I don't see any issue with the recent commits.
The issue actually seems to be the very old ones.  I'm not sure such
a measure will actually fix the problem.

I myself pushed recently to topic/core-for-CI so I want to know if **I**
caused any issue.



2. Require maintainer/committer ack for adding/removing commits. No
  single individual should decide.


s@maintainers/committer @@? Or just let it have the same requirement as
the drm-intel-* branches. It seems odd to raise the bar for
topic/core-for-CI above the requirement for drm-intel-* branches (even
though the latter is a r-b). From committer-drm-intel.rst:

* Reviewed-by/Acked-by/Tested-by must include the name and email of a real
  person for transparency. Anyone can give these, and therefore you have to
  value them according to the merits of the person. Quality matters, not
  quantity. Be suspicious of rubber stamps.

* Reviewed-by/Acked-by/Tested-by can be asked for and given informally (on the
  list, IRC, in person, in a meeting) but must be added to the commit.

* Reviewed-by. All patches must be reviewed, no exceptions. Please see
  "Reviewer's statement of oversight" in
  `Documentation/process/submitting-patches`_ and `review training`_.



3. Require gitlab issues for new commits added. Improve tracking for
  removing the commits.

Also use the stronger "must" for commit message requiring the
justification for the commit being in topic/core-for-CI.

Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Tvrtko Ursulin 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: intel-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: dim-to...@lists.freedesktop.org
Signed-off-by: Jani Nikula 
---
drm-tip.rst | 27 ---
1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drm-tip.rst b/drm-tip.rst
index deae95cdd2fe..24036e2ef576 100644
--- a/drm-tip.rst
+++ b/drm-tip.rst
@@ -203,11 +203,13 @@ justified exception. The primary goal is to fix issues 
originating from Linus'
tree. Issues that would need drm-next or other DRM subsystem tree as baseline
should be fixed in the offending DRM subsystem tree.

-Only rebase the branch if you really know what you're doing. When in doubt, ask
-the maintainers. You'll need to be able to handle any conflicts in non-drm code
-while rebasing.
+Only rebase the branch if you really know what you're doing. You'll need to be
+able to handle any conflicts in non-drm code while rebasing.

-Simply drop fixes that are already available in the new baseline.
+Always ask for maintainer ack before rebasing. IRC ack is sufficient.
+
+Simply drop fixes that are already available in the new baseline. Close the
+associated gitlab issue when removing commits.

Force pushing a rebased topic/core-for-CI requires passing the ``--force``
parameter to git::


There is a main issue here that is not being fixed: testing the merged
branch. I think it would be much better to have the instruction here
to rebuild drm-tip without pushing. This will use the local topic branch:

dim -d rebuild-tip topic/core-for-CI

It's the only way I ever update it because I don't want to push a branch
and have a small window to potentially solve the merge conflicts (while
leaving others wondering why the tip is broken).


@@ -225,11 +227,22 @@ judgement call.
Only add or remove commits if you really know what you're doing. When in doubt,
ask the maintainers.

-Apply new commits on top with regular push. The commit message needs to explain
-why the patch has been applied to topic/core-for-CI. If it's a cherry-pick from
+Always ask for maintainer/committer ack before adding/removing commits. IRC ack
+is sufficient. Record the ``Acked-by:`` in commits being added.
+
+Apply new commits on top with regular push. The commit message must explain why
+the patch has been applied to topic/core-for-CI. If it's a cherry-pick from
another subsystem, please reference the commit with ``

Re: git send-email friendly smtp provider anyone?

2022-11-22 Thread Konstantin Ryabitsev
On Tue, Nov 22, 2022 at 10:10:47PM +0100, Noralf Trønnes wrote:
> Konstantin found a workaround, so I was able to push the patches.

Yes, this uncovered quite a few bugs -- which is excellent for me, not so
excellent for you. :)

> Here's the result if anyone is interested in seeing the result of using
> b4 and the web endpoint:
> https://lore.kernel.org/dri-devel/20221122-gud-shadow-plane-v1-0-9de3afa33...@tronnes.org/
> 
> Patchwork gave me a new submitter ID:
> https://patchwork.freedesktop.org/series/111222/

Oooh, I see that patchwork is still not doing the right thing with
X-Original-From. It will only do the substitution when the From email address
is the same as the email address of the list.

https://github.com/getpatchwork/patchwork/blob/main/patchwork/parser.py#L437

There's unfortunately no fix for this that I can do on my end. :(

-K


Re: git send-email friendly smtp provider anyone?

2022-11-22 Thread Noralf Trønnes



On 22.11.2022 20.22, Noralf Trønnes wrote:
> 
> 
> On 22.11.2022 19.50, Konstantin Ryabitsev wrote:
>> On Tue, Nov 22, 2022 at 06:42:19PM +0100, Noralf Trønnes wrote:
>>> The first thing that strikes me is that everyone mentioned in one of the
>>> patches gets the entire patchset, even sta...@vger.kernel.org (cc'ed in a
>>> fixes patch). The first patch touches a core file and as a result a few
>>> drivers, so I've cc'ed the driver maintainers in that patch, but now
>>> they get the entire patchset where 5 of 6 patches are about a driver that
>>> I maintain. So from their point of view, they see a patchset about a
>>> driver they don't care about and a patch touching a core file, but from
>>> the subject it's not apparent that it touches their driver. I'm afraid
>>> that this might result in none of them looking at that patch. In this
>>> particular case it's not that important, but in another case it might be.
>>
>> I did some (unscientific) polling among kernel maintainers and, by a vast
>> margin, they always prefer to receive the entire series instead of
>> cherry-picked patches -- having the entire series helps provide important
>> context for the change they are looking at.
>>
>> So, this is deliberate and, for now at least, not configurable. Unless you're
>> sending 100+ patch series, I doubt anyone will have any problem with
>> receiving the whole series instead of individual patches.
>>
>>> As for the setting up the web endpoint, should I just follow the b4 docs
>>> on that?
>>>
>>> I use b4 version 0.10.1, is that recent enough?
>>
>> Yes. There will be a 0.10.2 in the near future, but the incoming fixes
>> shouldn't make much difference for the b4 send code.
>>
> 
> This is what I got:
> 
> $ b4 send --web-auth-verify 
> Signing challenge
> Submitting verification to https://lkml.kernel.org/_b4_submit
> Traceback (most recent call last):
>   File "/home/pi/.local/bin/b4", line 8, in <module>
> sys.exit(cmd())
>   File "/home/pi/.local/lib/python3.10/site-packages/b4/command.py",
> line 341, in cmd
> cmdargs.func(cmdargs)
>   File "/home/pi/.local/lib/python3.10/site-packages/b4/command.py",
> line 86, in cmd_send
> b4.ez.cmd_send(cmdargs)
>   File "/home/pi/.local/lib/python3.10/site-packages/b4/ez.py", line
> 1102, in cmd_send
> auth_verify(cmdargs)
>   File "/home/pi/.local/lib/python3.10/site-packages/b4/ez.py", line
> 188, in auth_verify
> res = ses.post(endpoint, json=req)
>   File "/usr/lib/python3/dist-packages/requests/sessions.py", line 590,
> in post
> return self.request('POST', url, data=data, json=json, **kwargs)
>   File "/usr/lib/python3/dist-packages/requests/sessions.py", line 528,
> in request
> prep = self.prepare_request(req)
>   File "/usr/lib/python3/dist-packages/requests/sessions.py", line 456,
> in prepare_request
> p.prepare(
>   File "/usr/lib/python3/dist-packages/requests/models.py", line 319, in
> prepare
> self.prepare_body(data, files, json)
>   File "/usr/lib/python3/dist-packages/requests/models.py", line 469, in
> prepare_body
> body = complexjson.dumps(json)
>   File "/usr/lib/python3.10/json/__init__.py", line 231, in dumps
> return _default_encoder.encode(obj)
>   File "/usr/lib/python3.10/json/encoder.py", line 199, in encode
> chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/lib/python3.10/json/encoder.py", line 257, in iterencode
> return _iterencode(o, 0)
>   File "/usr/lib/python3.10/json/encoder.py", line 179, in default
> raise TypeError(f'Object of type {o.__class__.__name__} '
> TypeError: Object of type bytes is not JSON serializable
> 

Konstantin found a workaround, so I was able to push the patches.

Here's the result if anyone is interested in seeing the result of using
b4 and the web endpoint:
https://lore.kernel.org/dri-devel/20221122-gud-shadow-plane-v1-0-9de3afa33...@tronnes.org/

Patchwork gave me a new submitter ID:
https://patchwork.freedesktop.org/series/111222/

Noralf.


[PATCH 6/6] drm/gud: Use the shadow plane helper

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

Use the shadow plane helper to take care of preparing the framebuffer for
CPU access. The synchronous flushing is now done inline without the use of
a worker. The async path now uses a shadow buffer to hold framebuffer
changes and it doesn't read the framebuffer behind userspace's back
anymore.
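
A rough illustration of the shadow-buffer idea, as a userspace sketch rather
than the driver code: the damaged rectangle is copied out of the framebuffer
at commit time, so a later asynchronous flush only ever reads the private
copy. struct rect and copy_damage() are hypothetical names; the real patch
uses iosys_map, drm_fb_clip_offset() and drm_fb_memcpy() for this step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Damage rectangle in pixels; x2 and y2 are exclusive. */
struct rect {
	unsigned int x1, y1, x2, y2;
};

/*
 * Copy only the damaged rows and columns from the framebuffer into the
 * shadow buffer, at the same byte offsets. Both buffers are assumed to
 * share the same pitch (bytes per row) and cpp (bytes per pixel).
 */
static void copy_damage(uint8_t *shadow, const uint8_t *fb, size_t pitch,
			unsigned int cpp, const struct rect *clip)
{
	size_t len = (size_t)(clip->x2 - clip->x1) * cpp;

	assert(clip->x1 <= clip->x2 && clip->y1 <= clip->y2);

	for (unsigned int y = clip->y1; y < clip->y2; y++) {
		size_t off = (size_t)y * pitch + (size_t)clip->x1 * cpp;

		memcpy(shadow + off, fb + off, len);
	}
}
```

Once the copy is done, later writes by userspace to the framebuffer do not
affect what the worker sends, which is the property the commit message refers
to.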

Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_drv.c  |  1 +
 drivers/gpu/drm/gud/gud_internal.h |  1 +
 drivers/gpu/drm/gud/gud_pipe.c | 69 --
 3 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index d57dab104358..5aac7cda0505 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -365,6 +365,7 @@ static void gud_debugfs_init(struct drm_minor *minor)
 static const struct drm_simple_display_pipe_funcs gud_pipe_funcs = {
.check  = gud_pipe_check,
.update = gud_pipe_update,
+   DRM_GEM_SIMPLE_DISPLAY_PIPE_SHADOW_PLANE_FUNCS
 };
 
 static const struct drm_mode_config_funcs gud_mode_config_funcs = {
diff --git a/drivers/gpu/drm/gud/gud_internal.h 
b/drivers/gpu/drm/gud/gud_internal.h
index e351a1f1420d..0d148a6f27aa 100644
--- a/drivers/gpu/drm/gud/gud_internal.h
+++ b/drivers/gpu/drm/gud/gud_internal.h
@@ -43,6 +43,7 @@ struct gud_device {
struct drm_framebuffer *fb;
struct drm_rect damage;
bool prev_flush_failed;
+   void *shadow_buf;
 };
 
 static inline struct gud_device *to_gud_device(struct drm_device *drm)
diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index dfada6eedc58..7686325f7ee7 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -358,10 +358,10 @@ static void gud_flush_damage(struct gud_device *gdrm, 
struct drm_framebuffer *fb
 void gud_flush_work(struct work_struct *work)
 {
struct gud_device *gdrm = container_of(work, struct gud_device, work);
-   struct iosys_map gem_map = { }, fb_map = { };
+   struct iosys_map shadow_map;
struct drm_framebuffer *fb;
struct drm_rect damage;
-   int idx, ret;
+   int idx;
 
if (!drm_dev_enter(&gdrm->drm, &idx))
return;
@@ -369,6 +369,7 @@ void gud_flush_work(struct work_struct *work)
mutex_lock(&gdrm->damage_lock);
fb = gdrm->fb;
gdrm->fb = NULL;
+   iosys_map_set_vaddr(&shadow_map, gdrm->shadow_buf);
damage = gdrm->damage;
gud_clear_damage(gdrm);
mutex_unlock(&gdrm->damage_lock);
@@ -376,33 +377,33 @@ void gud_flush_work(struct work_struct *work)
if (!fb)
goto out;
 
-   ret = drm_gem_fb_vmap(fb, &gem_map, &fb_map);
-   if (ret)
-   goto fb_put;
+   gud_flush_damage(gdrm, fb, &shadow_map, true, &damage);
 
-   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
-   if (ret)
-   goto vunmap;
-
-   /* Imported buffers are assumed to be WriteCombined with uncached reads 
*/
-   gud_flush_damage(gdrm, fb, &fb_map, !fb->obj[0]->import_attach, 
&damage);
-
-   drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
-vunmap:
-   drm_gem_fb_vunmap(fb, &gem_map);
-fb_put:
drm_framebuffer_put(fb);
 out:
drm_dev_exit(idx);
 }
 
-static void gud_fb_queue_damage(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
-   struct drm_rect *damage)
+static int gud_fb_queue_damage(struct gud_device *gdrm, struct drm_framebuffer 
*fb,
+  const struct iosys_map *map, struct drm_rect 
*damage)
 {
struct drm_framebuffer *old_fb = NULL;
+   struct iosys_map shadow_map;
 
mutex_lock(&gdrm->damage_lock);
 
+   if (!gdrm->shadow_buf) {
+   gdrm->shadow_buf = vzalloc(fb->pitches[0] * fb->height);
+   if (!gdrm->shadow_buf) {
+   mutex_unlock(&gdrm->damage_lock);
+   return -ENOMEM;
+   }
+   }
+
+   iosys_map_set_vaddr(&shadow_map, gdrm->shadow_buf);
+   iosys_map_incr(&shadow_map, drm_fb_clip_offset(fb->pitches[0], 
fb->format, damage));
+   drm_fb_memcpy(&shadow_map, fb->pitches, map, fb, damage);
+
if (fb != gdrm->fb) {
old_fb = gdrm->fb;
drm_framebuffer_get(fb);
@@ -420,6 +421,26 @@ static void gud_fb_queue_damage(struct gud_device *gdrm, 
struct drm_framebuffer
 
if (old_fb)
drm_framebuffer_put(old_fb);
+
+   return 0;
+}
+
+static void gud_fb_handle_damage(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
+const struct iosys_map *map, struct drm_rect 
*damage)
+{
+   int ret;
+
+   if (gdrm->flags & GUD_DISPLAY_FLAG_FULL_UPDATE)
+   drm_rect_init(damage, 0, 0, fb->width, fb->height);
+
+   if (gud_async_flush) {
+   ret = gud_fb_queue_damage(gdrm, fb, map, damage);
+

[PATCH 4/6] drm/gud: Split up gud_flush_work()

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

In preparation for inlining synchronous flushing split out the part of
gud_flush_work() that can be shared by the sync and async code paths.

Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_pipe.c | 72 +++---
 1 file changed, 39 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index ff1358815af5..d2af9947494f 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -333,15 +333,49 @@ void gud_clear_damage(struct gud_device *gdrm)
gdrm->damage.y2 = 0;
 }
 
+static void gud_flush_damage(struct gud_device *gdrm, struct drm_framebuffer 
*fb,
+struct drm_rect *damage)
+{
+   const struct drm_format_info *format;
+   unsigned int i, lines;
+   size_t pitch;
+   int ret;
+
+   format = fb->format;
+   if (format->format == DRM_FORMAT_XRGB8888 && gdrm->xrgb8888_emulation_format)
+   format = gdrm->xrgb8888_emulation_format;
+
+   /* Split update if it's too big */
+   pitch = drm_format_info_min_pitch(format, 0, drm_rect_width(damage));
+   lines = drm_rect_height(damage);
+
+   if (gdrm->bulk_len < lines * pitch)
+   lines = gdrm->bulk_len / pitch;
+
+   for (i = 0; i < DIV_ROUND_UP(drm_rect_height(damage), lines); i++) {
+   struct drm_rect rect = *damage;
+
+   rect.y1 += i * lines;
+   rect.y2 = min_t(u32, rect.y1 + lines, damage->y2);
+
+   ret = gud_flush_rect(gdrm, fb, format, &rect);
+   if (ret) {
+   if (ret != -ENODEV && ret != -ECONNRESET &&
+   ret != -ESHUTDOWN && ret != -EPROTO)
+   dev_err_ratelimited(fb->dev->dev,
+   "Failed to flush 
framebuffer: error=%d\n", ret);
+   gdrm->prev_flush_failed = true;
+   break;
+   }
+   }
+}
+
 void gud_flush_work(struct work_struct *work)
 {
struct gud_device *gdrm = container_of(work, struct gud_device, work);
-   const struct drm_format_info *format;
struct drm_framebuffer *fb;
struct drm_rect damage;
-   unsigned int i, lines;
-   int idx, ret = 0;
-   size_t pitch;
+   int idx;
 
if (!drm_dev_enter(&gdrm->drm, &idx))
return;
@@ -356,35 +390,7 @@ void gud_flush_work(struct work_struct *work)
if (!fb)
goto out;
 
-   format = fb->format;
-   if (format->format == DRM_FORMAT_XRGB8888 && gdrm->xrgb8888_emulation_format)
-   format = gdrm->xrgb8888_emulation_format;
-
-   /* Split update if it's too big */
-   pitch = drm_format_info_min_pitch(format, 0, drm_rect_width(&damage));
-   lines = drm_rect_height(&damage);
-
-   if (gdrm->bulk_len < lines * pitch)
-   lines = gdrm->bulk_len / pitch;
-
-   for (i = 0; i < DIV_ROUND_UP(drm_rect_height(&damage), lines); i++) {
-   struct drm_rect rect = damage;
-
-   rect.y1 += i * lines;
-   rect.y2 = min_t(u32, rect.y1 + lines, damage.y2);
-
-   ret = gud_flush_rect(gdrm, fb, format, &rect);
-   if (ret) {
-   if (ret != -ENODEV && ret != -ECONNRESET &&
-   ret != -ESHUTDOWN && ret != -EPROTO)
-   dev_err_ratelimited(fb->dev->dev,
-   "Failed to flush 
framebuffer: error=%d\n", ret);
-   gdrm->prev_flush_failed = true;
-   break;
-   }
-
-   gdrm->prev_flush_failed = false;
-   }
+   gud_flush_damage(gdrm, fb, &damage);
 
drm_framebuffer_put(fb);
 out:

-- 
b4 0.10.1
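
The "split update if it's too big" loop in gud_flush_damage() above can be
sketched in plain C as follows. This is only an illustration of the row
chunking arithmetic: struct span and split_rows() are hypothetical names, and
the early return for a transfer buffer smaller than one row is a guard added
for the sketch, not something the patch contains.

```c
#include <assert.h>
#include <stddef.h>

/* One chunk of rows to flush; y2 is exclusive. */
struct span {
	unsigned int y1, y2;
};

/*
 * Split rows [y1, y2) into chunks whose size (rows * pitch bytes) does not
 * exceed bulk_len. Writes up to max_out chunks and returns how many were
 * written.
 */
static size_t split_rows(unsigned int y1, unsigned int y2, size_t pitch,
			 size_t bulk_len, struct span *out, size_t max_out)
{
	unsigned int height = y2 - y1;
	unsigned int lines = height;
	size_t n = 0;

	if (pitch == 0 || height == 0)
		return 0;

	/* Same test as the driver: shrink the chunk if it would not fit. */
	if (bulk_len < (size_t)lines * pitch)
		lines = (unsigned int)(bulk_len / pitch);

	/* Sketch-only guard: not even a single row fits in the buffer. */
	if (lines == 0)
		return 0;

	for (unsigned int start = y1; start < y2 && n < max_out; start += lines) {
		out[n].y1 = start;
		out[n].y2 = (y2 - start < lines) ? y2 : start + lines;
		n++;
	}
	return n;
}
```

With a 10-row rectangle, a 100-byte pitch and a 350-byte buffer, this yields
three chunks of 3 rows and a final chunk of 1 row.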



[PATCH 5/6] drm/gud: Prepare buffer for CPU access in gud_flush_work()

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

In preparation for moving to the shadow plane helper prepare the
framebuffer for CPU access as early as possible.

Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_pipe.c | 67 +-
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index d2af9947494f..dfada6eedc58 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -152,32 +153,21 @@ static size_t gud_xrgb8888_to_color(u8 *dst, const struct drm_format_info *forma
 }
 
 static int gud_prep_flush(struct gud_device *gdrm, struct drm_framebuffer *fb,
+ const struct iosys_map *map, bool cached_reads,
  const struct drm_format_info *format, struct drm_rect 
*rect,
  struct gud_set_buffer_req *req)
 {
-   struct dma_buf_attachment *import_attach = fb->obj[0]->import_attach;
u8 compression = gdrm->compression;
-   struct iosys_map map[DRM_FORMAT_MAX_PLANES] = { };
-   struct iosys_map map_data[DRM_FORMAT_MAX_PLANES] = { };
struct iosys_map dst;
void *vaddr, *buf;
size_t pitch, len;
-   int ret = 0;
 
pitch = drm_format_info_min_pitch(format, 0, drm_rect_width(rect));
len = pitch * drm_rect_height(rect);
if (len > gdrm->bulk_len)
return -E2BIG;
 
-   ret = drm_gem_fb_vmap(fb, map, map_data);
-   if (ret)
-   return ret;
-
-   vaddr = map_data[0].vaddr;
-
-   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
-   if (ret)
-   goto vunmap;
+   vaddr = map[0].vaddr;
 retry:
if (compression)
buf = gdrm->compress_buf;
@@ -192,29 +182,27 @@ static int gud_prep_flush(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
if (format != fb->format) {
if (format->format == GUD_DRM_FORMAT_R1) {
len = gud_xrgb8888_to_r124(buf, format, vaddr, fb, rect);
-   if (!len) {
-   ret = -ENOMEM;
-   goto end_cpu_access;
-   }
+   if (!len)
+   return -ENOMEM;
} else if (format->format == DRM_FORMAT_R8) {
-   drm_fb_xrgb8888_to_gray8(&dst, NULL, map_data, fb, rect);
+   drm_fb_xrgb8888_to_gray8(&dst, NULL, map, fb, rect);
} else if (format->format == DRM_FORMAT_RGB332) {
-   drm_fb_xrgb8888_to_rgb332(&dst, NULL, map_data, fb, rect);
+   drm_fb_xrgb8888_to_rgb332(&dst, NULL, map, fb, rect);
} else if (format->format == DRM_FORMAT_RGB565) {
-   drm_fb_xrgb8888_to_rgb565(&dst, NULL, map_data, fb, rect,
+   drm_fb_xrgb8888_to_rgb565(&dst, NULL, map, fb, rect,
  gud_is_big_endian());
} else if (format->format == DRM_FORMAT_RGB888) {
-   drm_fb_xrgb8888_to_rgb888(&dst, NULL, map_data, fb, rect);
+   drm_fb_xrgb8888_to_rgb888(&dst, NULL, map, fb, rect);
} else {
len = gud_xrgb8888_to_color(buf, format, vaddr, fb, rect);
}
} else if (gud_is_big_endian() && format->cpp[0] > 1) {
-   drm_fb_swab(&dst, NULL, map_data, fb, rect, !import_attach);
-   } else if (compression && !import_attach && pitch == fb->pitches[0]) {
+   drm_fb_swab(&dst, NULL, map, fb, rect, cached_reads);
+   } else if (compression && cached_reads && pitch == fb->pitches[0]) {
/* can compress directly from the framebuffer */
buf = vaddr + rect->y1 * pitch;
} else {
-   drm_fb_memcpy(&dst, NULL, map_data, fb, rect);
+   drm_fb_memcpy(&dst, NULL, map, fb, rect);
}
 
memset(req, 0, sizeof(*req));
@@ -237,12 +225,7 @@ static int gud_prep_flush(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
req->compressed_length = cpu_to_le32(complen);
}
 
-end_cpu_access:
-   drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
-vunmap:
-   drm_gem_fb_vunmap(fb, map);
-
-   return ret;
+   return 0;
 }
 
 struct gud_usb_bulk_context {
@@ -285,6 +268,7 @@ static int gud_usb_bulk(struct gud_device *gdrm, size_t len)
 }
 
 static int gud_flush_rect(struct gud_device *gdrm, struct drm_framebuffer *fb,
+ const struct iosys_map *map, bool cached_reads,
  const struct drm_format_info *format, struct drm_rect 
*rect)
 {
struct gud_set_buffer_req req;
@@ -293,7 +277,7 @@ static int gud_flush_rect(struct gud_device *gdrm, s

[PATCH 0/6] drm/gud: Use the shadow plane helper

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

Hi,

I have started to look at igt for testing and want to use CRC tests. To
implement support for this I need to move away from the simple kms
helper.

When looking around for examples I came across Thomas' nice shadow
helper and thought, yes this is perfect for drm/gud. So I'll switch to
that before I move away from the simple kms helper.

The async framebuffer flushing code path now uses a shadow buffer and
doesn't touch the framebuffer when it shouldn't. I have also taken the
opportunity to inline the synchronous flush code path since this will be
the future default when userspace predominantly doesn't run all displays
in the same rendering loop. A shared rendering loop slows down all
displays to run at the speed of the slowest one.

Noralf.

Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Noralf Trønnes 

---
Noralf Trønnes (6):
  drm/gem: shadow_fb_access: Prepare imported buffers for CPU access
  drm/gud: Fix UBSAN warning
  drm/gud: Don't retry a failed framebuffer flush
  drm/gud: Split up gud_flush_work()
  drm/gud: Prepare buffer for CPU access in gud_flush_work()
  drm/gud: Use the shadow plane helper

 drivers/gpu/drm/drm_gem_atomic_helper.c |  13 ++-
 drivers/gpu/drm/gud/gud_drv.c   |   1 +
 drivers/gpu/drm/gud/gud_internal.h  |   1 +
 drivers/gpu/drm/gud/gud_pipe.c  | 198 +++-
 drivers/gpu/drm/solomon/ssd130x.c   |  10 +-
 drivers/gpu/drm/tiny/gm12u320.c |  10 +-
 drivers/gpu/drm/tiny/ofdrm.c|  10 +-
 drivers/gpu/drm/tiny/simpledrm.c|  10 +-
 drivers/gpu/drm/udl/udl_modeset.c   |  11 +-
 9 files changed, 117 insertions(+), 147 deletions(-)
---
base-commit: 7257702951305b1f0259c3468c39fc59d1ad4d8b
change-id: 20221122-gud-shadow-plane-ae37a95d4d8d

Best regards,
-- 
Noralf Trønnes 



Re: [Intel-gfx] [PATCH 3/6] drm/i915/gsc: GSC firmware loading

2022-11-22 Thread Rodrigo Vivi
On Tue, Nov 22, 2022 at 11:39:31AM -0800, Ceraolo Spurio, Daniele wrote:
> 
> 
> On 11/22/2022 11:01 AM, Rodrigo Vivi wrote:
> > On Mon, Nov 21, 2022 at 03:16:14PM -0800, Daniele Ceraolo Spurio wrote:
> > > GSC FW is loaded by submitting a dedicated command via the GSC engine.
> > > The memory area used for loading the FW is then re-purposed as local
> > > memory for the GSC itself, so we use a separate allocation instead of
> > > using the one where we keep the firmware stored for reload.
> > > 
> > > The GSC is not reset as part of GT reset, so we only need to load it on
> > > first boot and S3/S4 exit.
> > > 
> > > Note that the GSC load takes a lot of time (up to a few hundred ms).
> > > This patch loads it serially as part of driver init/resume, but, given
> > > that GSC is only required for PM and content-protection features
> > > (media C6, PXP, HDCP), we could move the load to a worker thread to 
> > > unblock
> > > non-CP userspace submissions earlier. This will be done as a follow up
> > > step, because there are extra init steps required to actually make use of
> > > the GSC (including a mei component) and it will be cleaner (and easier to
> > > review) if we implement the async load once all the pieces we need for GSC
> > > to work are in place. A TODO has been added to the code to mark this
> > > intention.
> > > 
> > > Bspec: 63347, 65346
> > > Signed-off-by: Daniele Ceraolo Spurio 
> > > Cc: Alan Previn 
> > > Cc: John Harrison 
> > > ---
> > >   drivers/gpu/drm/i915/Makefile|   1 +
> > >   drivers/gpu/drm/i915/gem/i915_gem_pm.c   |  14 +-
> > >   drivers/gpu/drm/i915/gt/intel_engine.h   |   2 +
> > >   drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   7 +
> > >   drivers/gpu/drm/i915/gt/intel_gt.c   |  11 ++
> > >   drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c| 186 +++
> > >   drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h|  13 ++
> > >   drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c|  35 +++-
> > >   drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h|   7 +
> > >   drivers/gpu/drm/i915/gt/uc/intel_uc.c|  15 ++
> > >   drivers/gpu/drm/i915/gt/uc/intel_uc.h|   2 +
> > >   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  20 +-
> > >   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h |   1 +
> > >   13 files changed, 307 insertions(+), 7 deletions(-)
> > >   create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
> > >   create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
> > > 
> > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > > index 92d37cf71e16..1d45a6f451fa 100644
> > > --- a/drivers/gpu/drm/i915/Makefile
> > > +++ b/drivers/gpu/drm/i915/Makefile
> > > @@ -206,6 +206,7 @@ i915-y += gt/uc/intel_uc.o \
> > > gt/uc/intel_huc.o \
> > > gt/uc/intel_huc_debugfs.o \
> > > gt/uc/intel_huc_fw.o \
> > > +   gt/uc/intel_gsc_fw.o \
> > > gt/uc/intel_gsc_uc.o
> > >   # graphics system controller (GSC) support
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c 
> > > b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > > index 0d812f4d787d..f77eb4009aba 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > > @@ -232,10 +232,22 @@ void i915_gem_resume(struct drm_i915_private *i915)
> > >* guarantee that the context image is complete. So let's just 
> > > reset
> > >* it and start again.
> > >*/
> > > - for_each_gt(gt, i915, i)
> > > + for_each_gt(gt, i915, i) {
> > >   if (intel_gt_resume(gt))
> > >   goto err_wedged;
> > > + /*
> > > +  * TODO: this is a long operation (up to ~200ms) and we don't
> > > +  * need to complete it before driver load/resume is done, so it
> > > +  * should be handled in a separate thread to unlock userspace
> > > +  * submission. However, there are a couple of other pieces that
> > > +  * are required for full GSC support that will complicate things
> > > +  * a bit, and it is easier to move everything to a worker at the
> > > +  * same time, so keep it here for now.
> > > +  */
> > > + intel_uc_init_hw_late(&gt->uc);
> > > + }
> > > +
> > >   ret = lmem_restore(i915, I915_TTM_BACKUP_ALLOW_GPU);
> > >   GEM_WARN_ON(ret);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
> > > b/drivers/gpu/drm/i915/gt/intel_engine.h
> > > index cbc8b857d5f7..0e24af5efee9 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> > > @@ -172,6 +172,8 @@ intel_write_status_page(struct intel_engine_cs 
> > > *engine, int reg, u32 value)
> > >   #define I915_GEM_HWS_MIGRATE(0x42 * sizeof(u32))
> > >   #define I915_GEM_HWS_PXP0x60
> > >   #define I915_GEM_HWS_PXP_ADDR   (I915_GEM_HWS_PXP * sizeof(u32))
> > > +#define I915_GEM_HWS

[PATCH 2/6] drm/gud: Fix UBSAN warning

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

UBSAN complains about invalid value for bool:

[  101.165172] [drm] Initialized gud 1.0.0 20200422 for 2-3.2:1.0 on minor 1
[  101.213360] gud 2-3.2:1.0: [drm] fb1: guddrmfb frame buffer device
[  101.213426] usbcore: registered new interface driver gud
[  101.989431] 

[  101.989441] UBSAN: invalid-load in 
/home/pi/linux/include/linux/iosys-map.h:253:9
[  101.989447] load of value 121 is not a valid value for type '_Bool'
[  101.989451] CPU: 1 PID: 455 Comm: kworker/1:6 Not tainted 
5.18.0-rc5-gud-5.18-rc5 #3
[  101.989456] Hardware name: Hewlett-Packard HP EliteBook 820 G1/1991, BIOS 
L71 Ver. 01.44 04/12/2018
[  101.989459] Workqueue: events_long gud_flush_work [gud]
[  101.989471] Call Trace:
[  101.989474]  <TASK>
[  101.989479]  dump_stack_lvl+0x49/0x5f
[  101.989488]  dump_stack+0x10/0x12
[  101.989493]  ubsan_epilogue+0x9/0x3b
[  101.989498]  __ubsan_handle_load_invalid_value.cold+0x44/0x49
[  101.989504]  dma_buf_vmap.cold+0x38/0x3d
[  101.989511]  ? find_busiest_group+0x48/0x300
[  101.989520]  drm_gem_shmem_vmap+0x76/0x1b0 [drm_shmem_helper]
[  101.989528]  drm_gem_shmem_object_vmap+0x9/0xb [drm_shmem_helper]
[  101.989535]  drm_gem_vmap+0x26/0x60 [drm]
[  101.989594]  drm_gem_fb_vmap+0x47/0x150 [drm_kms_helper]
[  101.989630]  gud_prep_flush+0xc1/0x710 [gud]
[  101.989639]  ? _raw_spin_lock+0x17/0x40
[  101.989648]  gud_flush_work+0x1e0/0x430 [gud]
[  101.989653]  ? __switch_to+0x11d/0x470
[  101.989664]  process_one_work+0x21f/0x3f0
[  101.989673]  worker_thread+0x200/0x3e0
[  101.989679]  ? rescuer_thread+0x390/0x390
[  101.989684]  kthread+0xfd/0x130
[  101.989690]  ? kthread_complete_and_exit+0x20/0x20
[  101.989696]  ret_from_fork+0x22/0x30
[  101.989706]  
[  101.989708] 


The source of this warning is in iosys_map_clear() called from
dma_buf_vmap(). It conditionally sets values based on map->is_iomem. The
iosys_map variables are allocated uninitialized on the stack leading to
->is_iomem having all kinds of values and not only 0/1.

Fix this by zeroing the iosys_map variables.
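
For illustration only, here is a minimal user-space sketch of why the
initializer helps. `struct demo_map` and the helper names are hypothetical
stand-ins, not the kernel's iosys_map API, and the sketch uses `{ 0 }` instead
of the patch's `{ }` so that it also builds as plain C99. With the initializer
every array element starts with a NULL address and a false flag, so later code
never reads an indeterminate bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_PLANES 4	/* hypothetical, stands in for DRM_FORMAT_MAX_PLANES */

/* Hypothetical stand-in for a mapping descriptor with a bool discriminator. */
struct demo_map {
	void *vaddr;
	bool is_iomem;
};

/* Returns true when every entry is in the cleared state. */
static bool demo_maps_are_cleared(const struct demo_map *maps, size_t n)
{
	assert(maps != NULL);

	for (size_t i = 0; i < n; i++) {
		if (maps[i].vaddr != NULL || maps[i].is_iomem)
			return false;
	}
	return true;
}

/* Mirrors the shape of the fix: the on-stack array is zero-initialized. */
static bool demo_prep(void)
{
	struct demo_map map[DEMO_MAX_PLANES] = { 0 };

	return demo_maps_are_cleared(map, DEMO_MAX_PLANES);
}
```

Without the initializer the loop above would read indeterminate values, which
is the same class of access UBSAN flagged in the trace.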

Fixes: 40e1a70b4aed ("drm: Add GUD USB Display driver")
Cc:  # v5.18+
Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_pipe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index 7c6dc2bcd14a..61f4abaf1811 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -157,8 +157,8 @@ static int gud_prep_flush(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
 {
struct dma_buf_attachment *import_attach = fb->obj[0]->import_attach;
u8 compression = gdrm->compression;
-   struct iosys_map map[DRM_FORMAT_MAX_PLANES];
-   struct iosys_map map_data[DRM_FORMAT_MAX_PLANES];
+   struct iosys_map map[DRM_FORMAT_MAX_PLANES] = { };
+   struct iosys_map map_data[DRM_FORMAT_MAX_PLANES] = { };
struct iosys_map dst;
void *vaddr, *buf;
size_t pitch, len;

-- 
b4 0.10.1



[PATCH 3/6] drm/gud: Don't retry a failed framebuffer flush

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

If a framebuffer flush fails the driver will do one retry by requeuing the
worker. Currently the worker is used even for synchronous flushing, but a
later patch will inline it, so this needs to change. Thinking about how to
solve this I came to the conclusion that this retry mechanism was a fix
for a problem that was only in the mind of the developer (me) and not
something that solved a real problem.

So let's remove this for now and revisit later should it become necessary.
gud_add_damage() has now only one caller so it can be inlined.
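
As a rough sketch of what the inlined min()/max() lines compute: the pending
damage is the bounding box of all queued rectangles. The names below are
hypothetical, and the "cleared" state (x1/y1 at INT_MAX, x2/y2 at 0) is an
assumption about how an empty accumulator is represented, chosen so that the
first merge simply yields the incoming rectangle.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))
#define DEMO_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical rectangle type; x2/y2 are exclusive, as in struct drm_rect. */
struct demo_rect {
	int x1, y1, x2, y2;
};

/* Assumed "empty" state: any real rectangle wins every min()/max(). */
static void demo_clear_damage(struct demo_rect *acc)
{
	assert(acc != NULL);
	acc->x1 = INT_MAX;
	acc->y1 = INT_MAX;
	acc->x2 = 0;
	acc->y2 = 0;
}

/* Grow the accumulated rectangle so it also covers "damage". */
static void demo_add_damage(struct demo_rect *acc, const struct demo_rect *damage)
{
	assert(acc != NULL && damage != NULL);
	acc->x1 = DEMO_MIN(acc->x1, damage->x1);
	acc->y1 = DEMO_MIN(acc->y1, damage->y1);
	acc->x2 = DEMO_MAX(acc->x2, damage->x2);
	acc->y2 = DEMO_MAX(acc->y2, damage->y2);
}
```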

Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_pipe.c | 48 +++---
 1 file changed, 8 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index 61f4abaf1811..ff1358815af5 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -333,37 +333,6 @@ void gud_clear_damage(struct gud_device *gdrm)
gdrm->damage.y2 = 0;
 }
 
-static void gud_add_damage(struct gud_device *gdrm, struct drm_rect *damage)
-{
-   gdrm->damage.x1 = min(gdrm->damage.x1, damage->x1);
-   gdrm->damage.y1 = min(gdrm->damage.y1, damage->y1);
-   gdrm->damage.x2 = max(gdrm->damage.x2, damage->x2);
-   gdrm->damage.y2 = max(gdrm->damage.y2, damage->y2);
-}
-
-static void gud_retry_failed_flush(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
-  struct drm_rect *damage)
-{
-   /*
-* pipe_update waits for the worker when the display mode is going to 
change.
-* This ensures that the width and height is still the same making it 
safe to
-* add back the damage.
-*/
-
-   mutex_lock(&gdrm->damage_lock);
-   if (!gdrm->fb) {
-   drm_framebuffer_get(fb);
-   gdrm->fb = fb;
-   }
-   gud_add_damage(gdrm, damage);
-   mutex_unlock(&gdrm->damage_lock);
-
-   /* Retry only once to avoid a possible storm in case of continues 
errors. */
-   if (!gdrm->prev_flush_failed)
-   queue_work(system_long_wq, &gdrm->work);
-   gdrm->prev_flush_failed = true;
-}
-
 void gud_flush_work(struct work_struct *work)
 {
struct gud_device *gdrm = container_of(work, struct gud_device, work);
@@ -407,14 +376,10 @@ void gud_flush_work(struct work_struct *work)
ret = gud_flush_rect(gdrm, fb, format, &rect);
if (ret) {
if (ret != -ENODEV && ret != -ECONNRESET &&
-   ret != -ESHUTDOWN && ret != -EPROTO) {
-   bool prev_flush_failed = 
gdrm->prev_flush_failed;
-
-   gud_retry_failed_flush(gdrm, fb, &damage);
-   if (!prev_flush_failed)
-   dev_err_ratelimited(fb->dev->dev,
-   "Failed to flush 
framebuffer: error=%d\n", ret);
-   }
+   ret != -ESHUTDOWN && ret != -EPROTO)
+   dev_err_ratelimited(fb->dev->dev,
+   "Failed to flush 
framebuffer: error=%d\n", ret);
+   gdrm->prev_flush_failed = true;
break;
}
 
@@ -439,7 +404,10 @@ static void gud_fb_queue_damage(struct gud_device *gdrm, 
struct drm_framebuffer
gdrm->fb = fb;
}
 
-   gud_add_damage(gdrm, damage);
+   gdrm->damage.x1 = min(gdrm->damage.x1, damage->x1);
+   gdrm->damage.y1 = min(gdrm->damage.y1, damage->y1);
+   gdrm->damage.x2 = max(gdrm->damage.x2, damage->x2);
+   gdrm->damage.y2 = max(gdrm->damage.y2, damage->y2);
 
mutex_unlock(&gdrm->damage_lock);
 

-- 
b4 0.10.1



[PATCH 1/6] drm/gem: shadow_fb_access: Prepare imported buffers for CPU access

2022-11-22 Thread Noralf Trønnes via B4 Submission Endpoint
From: Noralf Trønnes 

Complete the shadow fb access functions by also preparing imported buffers
for CPU access. Update the affected drivers that currently use
drm_gem_fb_begin_cpu_access().

Through this change the following SHMEM drivers will now also make sure
their imported buffers are prepared for CPU access: cirrus, hyperv,
mgag200, vkms
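
The begin/end pairing this patch adds can be sketched in user space as
follows. Everything here is hypothetical (the `demo_*` names, the struct, the
failure knob); it only illustrates the ordering: map first, then prepare CPU
access, undo the mapping if the second step fails, and tear down in reverse
order.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical framebuffer state used to observe the call ordering. */
struct demo_fb {
	int mapped;		/* 1 while the buffer is mapped */
	int cpu_access;		/* 1 while CPU access is prepared */
	bool fail_cpu_access;	/* test knob: make the second step fail */
};

static int demo_vmap(struct demo_fb *fb)
{
	fb->mapped = 1;
	return 0;
}

static void demo_vunmap(struct demo_fb *fb)
{
	fb->mapped = 0;
}

static int demo_begin_cpu_access(struct demo_fb *fb)
{
	if (fb->fail_cpu_access)
		return -EIO;
	fb->cpu_access = 1;
	return 0;
}

static void demo_end_cpu_access(struct demo_fb *fb)
{
	fb->cpu_access = 0;
}

/* Step one, then step two; unwind step one if step two fails. */
static int demo_begin_access(struct demo_fb *fb)
{
	int ret;

	assert(fb != NULL);

	ret = demo_vmap(fb);
	if (ret)
		return ret;

	ret = demo_begin_cpu_access(fb);
	if (ret)
		demo_vunmap(fb);

	return ret;
}

/* Teardown mirrors setup in reverse order. */
static void demo_end_access(struct demo_fb *fb)
{
	assert(fb != NULL);
	demo_end_cpu_access(fb);
	demo_vunmap(fb);
}
```

Keeping both steps in one helper is what lets the individual drivers drop
their own begin/end calls and the associated error paths.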

Cc: Thomas Zimmermann 
Cc: Javier Martinez Canillas 
Cc: Hans de Goede 
Cc: Dave Airlie 
Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/drm_gem_atomic_helper.c | 13 -
 drivers/gpu/drm/solomon/ssd130x.c   | 10 +-
 drivers/gpu/drm/tiny/gm12u320.c | 10 +-
 drivers/gpu/drm/tiny/ofdrm.c| 10 ++
 drivers/gpu/drm/tiny/simpledrm.c| 10 ++
 drivers/gpu/drm/udl/udl_modeset.c   | 11 ++-
 6 files changed, 20 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c 
b/drivers/gpu/drm/drm_gem_atomic_helper.c
index e42800718f51..0eef4bb30d25 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -368,6 +368,7 @@ EXPORT_SYMBOL(drm_gem_reset_shadow_plane);
  * maps all buffer objects of the plane's framebuffer into kernel address
  * space and stores them in struct &drm_shadow_plane_state.map. The first data
  * bytes are available in struct &drm_shadow_plane_state.data.
+ * It also prepares imported buffers for CPU access.
  *
  * See drm_gem_end_shadow_fb_access() for cleanup.
  *
@@ -378,11 +379,20 @@ int drm_gem_begin_shadow_fb_access(struct drm_plane 
*plane, struct drm_plane_sta
 {
struct drm_shadow_plane_state *shadow_plane_state = 
to_drm_shadow_plane_state(plane_state);
struct drm_framebuffer *fb = plane_state->fb;
+   int ret;
 
if (!fb)
return 0;
 
-   return drm_gem_fb_vmap(fb, shadow_plane_state->map, 
shadow_plane_state->data);
+   ret = drm_gem_fb_vmap(fb, shadow_plane_state->map, 
shadow_plane_state->data);
+   if (ret)
+   return ret;
+
+   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
+   if (ret)
+   drm_gem_fb_vunmap(fb, shadow_plane_state->map);
+
+   return ret;
 }
 EXPORT_SYMBOL(drm_gem_begin_shadow_fb_access);
 
@@ -404,6 +414,7 @@ void drm_gem_end_shadow_fb_access(struct drm_plane *plane, 
struct drm_plane_stat
if (!fb)
return;
 
+   drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
drm_gem_fb_vunmap(fb, shadow_plane_state->map);
 }
 EXPORT_SYMBOL(drm_gem_end_shadow_fb_access);
diff --git a/drivers/gpu/drm/solomon/ssd130x.c 
b/drivers/gpu/drm/solomon/ssd130x.c
index 53464afc2b9a..58a2f0113f24 100644
--- a/drivers/gpu/drm/solomon/ssd130x.c
+++ b/drivers/gpu/drm/solomon/ssd130x.c
@@ -544,7 +544,6 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, 
const struct iosys_m
struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
struct iosys_map dst;
unsigned int dst_pitch;
-   int ret = 0;
u8 *buf = NULL;
 
/* Align y to display page boundaries */
@@ -556,21 +555,14 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer 
*fb, const struct iosys_m
if (!buf)
return -ENOMEM;
 
-   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
-   if (ret)
-   goto out_free;
-
iosys_map_set_vaddr(&dst, buf);
drm_fb_xrgb_to_mono(&dst, &dst_pitch, vmap, fb, rect);
 
-   drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
-
ssd130x_update_rect(ssd130x, buf, rect);
 
-out_free:
kfree(buf);
 
-   return ret;
+   return 0;
 }
 
 static void ssd130x_primary_plane_helper_atomic_update(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/tiny/gm12u320.c b/drivers/gpu/drm/tiny/gm12u320.c
index 130fd07a967d..59aad4b468cc 100644
--- a/drivers/gpu/drm/tiny/gm12u320.c
+++ b/drivers/gpu/drm/tiny/gm12u320.c
@@ -252,7 +252,7 @@ static void gm12u320_32bpp_to_24bpp_packed(u8 *dst, u8 
*src, int len)
 
 static void gm12u320_copy_fb_to_blocks(struct gm12u320_device *gm12u320)
 {
-   int block, dst_offset, len, remain, ret, x1, x2, y1, y2;
+   int block, dst_offset, len, remain, x1, x2, y1, y2;
struct drm_framebuffer *fb;
void *vaddr;
u8 *src;
@@ -269,12 +269,6 @@ static void gm12u320_copy_fb_to_blocks(struct 
gm12u320_device *gm12u320)
y2 = gm12u320->fb_update.rect.y2;
vaddr = gm12u320->fb_update.src_map.vaddr; /* TODO: Use mapping 
abstraction properly */
 
-   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
-   if (ret) {
-   GM12U320_ERR("drm_gem_fb_begin_cpu_access err: %d\n", ret);
-   goto put_fb;
-   }
-
src = vaddr + y1 * fb->pitches[0] + x1 * 4;
 
x1 += (GM12U320_REAL_WIDTH - GM12U320_USER_WIDTH) / 2;
@@ -309,8 +303,6 @@ static void gm12u320_copy_fb_to_blocks(struct 
gm12u320_device *gm12u320)
src += fb->pitches[0];
 

Re: [Intel-gfx] [PATCH 6/6] drm/i915/mtl: MTL has one GSC CS on the media GT

2022-11-22 Thread Rodrigo Vivi
On Mon, Nov 21, 2022 at 03:16:17PM -0800, Daniele Ceraolo Spurio wrote:
> Now that we have the GSC FW support code as a user to the GSC CS, we
> can add the relevant flag to the engine mask. Note that the engine will
> still be disabled until we define the GSC FW binary file.
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Matt Roper 
> Cc: Rodrigo Vivi 

Reviewed-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/i915_pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 6da9784fe4a2..46acbf390195 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1124,7 +1124,7 @@ static const struct intel_gt_definition 
> xelpmp_extra_gt[] = {
>   .type = GT_MEDIA,
>   .name = "Standalone Media GT",
>   .gsi_offset = MTL_MEDIA_GSI_BASE,
> - .engine_mask = BIT(VECS0) | BIT(VCS0) | BIT(VCS2),
> + .engine_mask = BIT(VECS0) | BIT(VCS0) | BIT(VCS2) | BIT(GSC0),
>   },
>   {}
>  };
> -- 
> 2.37.3
> 


Re: [Intel-gfx] [PATCH 5/6] drm/i915/gsc: Disable GSC engine and power well if FW is not selected

2022-11-22 Thread Rodrigo Vivi
On Mon, Nov 21, 2022 at 03:16:16PM -0800, Daniele Ceraolo Spurio wrote:
> From: Jonathan Cavitt 
> 
> The GSC CS is only used for communicating with the GSC FW, so no need to
> initialize it if we're not going to use the FW. If we're using
> neither the engine nor the microcontroller, then we can also disable the
> power well.
> 
> IMPORTANT: lack of GSC FW breaks media C6 due to opposing requirements
> between CS setup and forcewake idleness. See in-code comment for detail.
> 
> Signed-off-by: Jonathan Cavitt 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Matt Roper 
> Cc: John C Harrison 
> Cc: Rodrigo Vivi 
> Cc: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 18 ++
>  drivers/gpu/drm/i915/intel_uncore.c   |  3 +++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index c33e0d72d670..99c4b866addd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -894,6 +894,24 @@ static intel_engine_mask_t init_engine_mask(struct 
> intel_gt *gt)
>   engine_mask_apply_compute_fuses(gt);
>   engine_mask_apply_copy_fuses(gt);
>  
> + /*
> +  * The only use of the GSC CS is to load and communicate with the GSC
> +  * FW, so we have no use for it if we don't have the FW.
> +  *
> +  * IMPORTANT: in cases where we don't have the GSC FW, we have a
> +  * catch-22 situation that breaks media C6 due to 2 requirements:
> +  * 1) once turned on, the GSC power well will not go to sleep unless the
> +  *GSC FW is loaded.
> +  * 2) to enable idling (which is required for media C6) we need to
> +  *initialize the IDLE_MSG register for the GSC CS and do at least 1
> +  *submission, which will wake up the GSC power well.
> +  */
> + if (__HAS_ENGINE(info->engine_mask, GSC0) && 
> !intel_uc_wants_gsc_uc(&gt->uc)) {
> + drm_notice(&gt->i915->drm,
> +"No GSC FW selected, disabling GSC CS and media 
> C6\n");
> + info->engine_mask &= ~BIT(GSC0);
> + }
> +
>   return info->engine_mask;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
> b/drivers/gpu/drm/i915/intel_uncore.c
> index c1befa33ff59..e63d957b59eb 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2701,6 +2701,9 @@ void intel_uncore_prune_engine_fw_domains(struct 
> intel_uncore *uncore,
>   if (fw_domains & BIT(domain_id))
>   fw_domain_fini(uncore, domain_id);
>   }
> +
> + if ((fw_domains & BIT(FW_DOMAIN_ID_GSC)) && !HAS_ENGINE(gt, GSC0))
> + fw_domain_fini(uncore, FW_DOMAIN_ID_GSC);

At a quick glance I was asking myself "why do you need this, since it doesn't
have the GSC0?"
Then I remembered that the fw_domain got initialized and it will be skipped,
right?
Then I thought about at least having a comment here, but finally I found
myself wondering: why don't we do this already in the if above, while we are
cleaning the engine mask?
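
To make the question concrete, a purely hypothetical user-space sketch of
"do both in one place" could look like the block below. The `demo_*` names and
bit layouts are invented for illustration, and whether the real init ordering
in the driver allows the forcewake domain to be dropped at that point is
exactly the open question, so this is not a proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BIT(n) (1u << (n))

/* Invented IDs; they do not match the real i915 enums. */
enum demo_engine { DEMO_VCS0, DEMO_VECS0, DEMO_GSC0 };
enum demo_fw_domain { DEMO_FW_RENDER, DEMO_FW_MEDIA, DEMO_FW_GSC };

struct demo_info {
	uint32_t engine_mask;	/* one bit per available engine */
	uint32_t fw_domains;	/* one bit per initialized forcewake domain */
};

/*
 * If the GSC engine is present but no GSC FW was selected, drop the
 * engine bit and its forcewake domain bit together.
 */
static void demo_prune_gsc(struct demo_info *info, bool wants_gsc_fw)
{
	assert(info != NULL);

	if ((info->engine_mask & DEMO_BIT(DEMO_GSC0)) && !wants_gsc_fw) {
		info->engine_mask &= ~DEMO_BIT(DEMO_GSC0);
		info->fw_domains &= ~DEMO_BIT(DEMO_FW_GSC);
	}
}
```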

>  }
>  
>  static void driver_flr(struct intel_uncore *uncore)
> -- 
> 2.37.3
> 


Re: [Intel-gfx] [PATCH 4/6] drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded

2022-11-22 Thread Rodrigo Vivi
On Mon, Nov 21, 2022 at 03:16:15PM -0800, Daniele Ceraolo Spurio wrote:
> If the GSC was loaded, the only way to stop it during the driver unload
> flow is to do a driver-FLR.
> The driver-FLR is not the same as PCI config space FLR in that
> it doesn't reset the SGUnit and doesn't modify the PCI config
> space. Thus, it doesn't require a re-enumeration of the PCI BARs.
> However, the driver-FLR does cause a memory wipe of graphics memory
> on all discrete GPU platforms or a wipe limited to stolen memory
> on the integrated GPU platforms.

Nothing major or blocking, but a few thoughts:

1. Should we document this in the code, at least in a comment in the
flr function?
2. Should we call this driver_initiated_flr, aiming to reduce even more
the ambiguity of it?

> 
> We perform the FLR as the last action before releasing the MMIO bar, so
> that we don't have to care about the consequences of the reset on the
> unload flow.

3. Should we try to implement this already in the gt_reset case as the
last resort before wedging the gt? That way we could already test this flow
on the current platforms.

> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Alan Previn 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c |  9 +
>  drivers/gpu/drm/i915/i915_reg.h   |  3 ++
>  drivers/gpu/drm/i915/intel_uncore.c   | 45 +++
>  drivers/gpu/drm/i915/intel_uncore.h   | 13 +++
>  4 files changed, 70 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
> index 510fb47193ec..5dad3c19c445 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
> @@ -173,6 +173,15 @@ int intel_gsc_fw_upload(struct intel_gsc_uc *gsc)
>   if (err)
>   goto fail;
>  
> + /*
> +  * Once the GSC FW is loaded, the only way to kill it on driver unload
> +  * is to do a driver FLR. Given this is a very disruptive action, we
> +  * want to do it as the last action before releasing the access to the
> +  * MMIO bar, which means we need to do it as part of the primary uncore
> +  * cleanup.
> +  */
> + intel_uncore_set_flr_on_fini(&gt->i915->uncore);
> +
>   /* FW is not fully operational until we enable SW proxy */
>   intel_uc_fw_change_status(gsc_fw, INTEL_UC_FIRMWARE_TRANSFERRED);
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 8e1892d14774..60e55245200b 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -118,6 +118,9 @@
>  
>  #define GU_CNTL  _MMIO(0x101010)
>  #define   LMEM_INIT  REG_BIT(7)
> +#define   DRIVERFLR  REG_BIT(31)
> +#define GU_DEBUG _MMIO(0x101018)
> +#define   DRIVERFLR_STATUS   REG_BIT(31)
>  
>  #define GEN6_STOLEN_RESERVED _MMIO(0x1082C0)
>  #define GEN6_STOLEN_RESERVED_ADDR_MASK   (0xFFF << 20)
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
> b/drivers/gpu/drm/i915/intel_uncore.c
> index 8006a6c61466..c1befa33ff59 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2703,6 +2703,48 @@ void intel_uncore_prune_engine_fw_domains(struct 
> intel_uncore *uncore,
>   }
>  }
>  
> +static void driver_flr(struct intel_uncore *uncore)
> +{
> + struct drm_i915_private *i915 = uncore->i915;
> + const unsigned int flr_timeout_ms = 3000; /* specs recommend a 3s wait 
> */
> + int ret;
> +
> + drm_dbg(&i915->drm, "Triggering Driver-FLR\n");
> +
> + /*
> +  * Make sure any pending FLR requests have cleared by waiting for the
> +  * FLR trigger bit to go to zero. Also clear GU_DEBUG's DRIVERFLR_STATUS
> +  * to make sure it's not still set from a prior attempt (it's a write to
> +  * clear bit).
> +  * Note that we should never be in a situation where a previous attempt
> +  * is still pending (unless the HW is totally dead), but better to be
> +  * safe in case something unexpected happens
> +  */
> + ret = intel_wait_for_register_fw(uncore, GU_CNTL, DRIVERFLR, 0, 
> flr_timeout_ms);
> + if (ret) {
> + drm_err(&i915->drm,
> + "Failed to wait for Driver-FLR bit to clear! %d\n",
> + ret);
> + return;
> + }
> + intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
> +
> + /* Trigger the actual Driver-FLR */
> + intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
> +
> + ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
> +  DRIVERFLR_STATUS, DRIVERFLR_STATUS,
> +  flr_timeout_ms);
> + if (ret) {
> + drm_err(&i915->drm, "wait for Driver-FLR completion failed! 
> %d\n", ret);
> + return;
> + }
> +
> + intel_uncore_write_fw(uncore, GU_DEBUG, DRIVE

RE: [PATCH 6/6] drm/i915: Bpp/timeslot calculation fixes for DP MST DSC

2022-11-22 Thread Navare, Manasi D
Thanks Stan for the explanation,
With that

Reviewed-by: Manasi Navare 

Manasi


-Original Message-
From: Lisovskiy, Stanislav  
Sent: Tuesday, November 22, 2022 2:40 AM
To: Navare, Manasi D 
Cc: intel-...@lists.freedesktop.org; Saarinen, Jani ; 
Nikula, Jani ; dri-devel@lists.freedesktop.org; 
Govindapillai, Vinod 
Subject: Re: [PATCH 6/6] drm/i915: Bpp/timeslot calculation fixes for DP MST DSC

On Thu, Nov 10, 2022 at 02:23:53PM -0800, Navare, Manasi wrote:
> On Thu, Nov 03, 2022 at 03:23:00PM +0200, Stanislav Lisovskiy wrote:
> > Fix intel_dp_dsc_compute_config, previously timeslots parameter was 
> > used in fact not as a timeslots, but more like a ratio timeslots/64, 
> > which of course didn't have any effect for SST DSC, but causes now 
> > issues for MST DSC.
> > Secondly we need to calculate pipe_bpp using 
> > intel_dp_dsc_compute_bpp only for SST DSC case, while for MST case 
> > it has been calculated earlier already with 
> > intel_dp_dsc_mst_compute_link_config.
> > > Third we also were wrongly determining sink min bpp/max bpp, those 
> > > limits should be intersected with our limits to find common 
> > acceptable bpp's, plus on top of that we should align those with 
> > VESA bpps and only then calculate required timeslots amount.
> > Some MST hubs started to work only after third change was made.
> > 
> > > v2: Make kernel test robot happy (claimed there was an uninitialized use,
> > while there is none)
> > 
> > Signed-off-by: Stanislav Lisovskiy 
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 69 ++---
> >  drivers/gpu/drm/i915/display/intel_dp.h |  3 +-
> >  drivers/gpu/drm/i915/display/intel_dp_mst.c | 69 
> > +
> >  3 files changed, 106 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> > b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 8288a30dbd51..82752b696498 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -716,9 +716,14 @@ u16 intel_dp_dsc_get_output_bpp(struct 
> > drm_i915_private *i915,
> >  * for SST -> TimeSlotsPerMTP is 1,
> >  * for MST -> TimeSlotsPerMTP has to be calculated
> >  */
> > -   bits_per_pixel = (link_clock * lane_count * 8) * timeslots /
> > -intel_dp_mode_to_fec_clock(mode_clock);
> > -   drm_dbg_kms(&i915->drm, "Max link bpp: %u\n", bits_per_pixel);
> > +   bits_per_pixel = DIV_ROUND_UP((link_clock * lane_count) * timeslots,
> > + intel_dp_mode_to_fec_clock(mode_clock) * 
> > 8);
> 
> Why did we remove the *8 in the numerator for the total bandwidth 
> link_clock * lane_count * 8 ?
> 
> Other than this clarification, all changes look good
> 
> Manasi

Hi Manasi,

Because previously this function was actually confusing the ratio timeslots/64
with the number of timeslots.

It was actually expecting the ratio timeslots/64, rather than the number of
timeslots.

For SST it didn't matter as timeslots was always 1, but for the MST case, if we
multiply by the number of timeslots, this formula will return some big
bogus bits_per_pixel number (I checked that).
Of course we could pass the ratio timeslots/64 here, but that isn't very
convenient or intuitive to manipulate.
So I made it use the "timeslots" parameter as the number of timeslots, so that
the ratio is calculated as part of the formula, i.e.:

((link_clock * lane_count * 8) * (timeslots / 64)) /  
intel_dp_mode_to_fec_clock(mode_clock);

which can be simplified as

((link_clock * lane_count * timeslots) / 8) / 
intel_dp_mode_to_fec_clock(mode_clock);

The whole formula comes from the requirement that
pipe_bpp * crtc_clock should be equal to link_total_bw * (timeslots / 64), i.e.
the timeslots/64 ratio defines how much of the link_total_bw (link_clock *
lane_count * 8) we have for the pipe_bpp * crtc_clock that we want to
accommodate there.

Obviously if we just multiplied link_total_bw by timeslots, we would get a 
situation that the more timeslots we allocate, the more total bw we get, which 
is wrong and will result in some bogus huge pipe_bpp numbers.
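
A small stand-alone sketch of the arithmetic, for illustration only: the
`demo_*` helpers are hypothetical, the FEC-adjusted clock is passed in
directly instead of calling intel_dp_mode_to_fec_clock(), and the input
numbers in the note below are made up rather than taken from real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division, same idea as the kernel's DIV_ROUND_UP(). */
static uint64_t demo_div_round_up(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/*
 * New form: "timeslots" is a count out of 64. The total bandwidth factor
 * of 8 and the /64 ratio collapse into a single /8 in the denominator.
 */
static uint32_t demo_max_link_bpp(uint32_t link_clock, uint32_t lane_count,
				  uint32_t timeslots, uint32_t fec_clock)
{
	uint64_t num = (uint64_t)link_clock * lane_count * timeslots;

	return (uint32_t)demo_div_round_up(num, (uint64_t)fec_clock * 8);
}

/*
 * Old form: total bandwidth multiplied by "timeslots" as-is, which only
 * makes sense if the caller passes a ratio (1 for SST).
 */
static uint32_t demo_old_link_bpp(uint32_t link_clock, uint32_t lane_count,
				  uint32_t timeslots, uint32_t fec_clock)
{
	assert(fec_clock != 0);
	return (uint32_t)(((uint64_t)link_clock * lane_count * 8 * timeslots) /
			  fec_clock);
}
```

With link_clock = 540000, lane_count = 4 and a FEC clock of 600000, the new
form gives 29 for all 64 timeslots and 15 for 32 timeslots, while the old form
gives 28 for timeslots = 1 but 1843 when handed timeslots = 64, which is the
kind of bogus value described above.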

Stan

> 
> > +
> > +   drm_dbg_kms(&i915->drm, "Max link bpp is %u for %u timeslots "
> > +   "total bw %u pixel clock %u\n",
> > +   bits_per_pixel, timeslots,
> > +   (link_clock * lane_count * 8),
> > +   intel_dp_mode_to_fec_clock(mode_clock));
> >  
> > /* Small Joiner Check: output bpp <= joiner RAM (bits) / Horiz. width */
> > > max_bpp_small_joiner_ram = small_joiner_ram_size_bits(i915) /
> > > @@ -1047,7 +1052,7 @@ intel_dp_mode_valid(struct drm_connector *_connector,
> > target_clock,
> > mode->hdisplay,
> > bigjoiner,
> > -   pipe_bpp

[linux-next:master] BUILD REGRESSION 771a207d1ee9f38da8c0cee1412228f18b900bac

2022-11-22 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 771a207d1ee9f38da8c0cee1412228f18b900bac  Add linux-next specific 
files for 20221122

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202211130053.np70vidn-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211211917.ylicunmb-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211221848.n0wn2gk3-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211221932.a1a12ylh-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211222131.h2kt55xh-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211222348.riebqcjq-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

ERROR: modpost: "__ld_r13_to_r22" [lib/zstd/zstd_decompress.ko] undefined!
ERROR: modpost: "devm_ioremap_resource" [drivers/dma/qcom/hdma.ko] undefined!
ERROR: modpost: "lockdep_is_held" [fs/dlm/dlm.ko] undefined!
arch/arm/mach-s3c/devs.c:32:10: fatal error: linux/platform_data/dma-s3c24xx.h: 
No such file or directory
drivers/clk/clk.c:1022:5: error: redefinition of 'clk_prepare'
drivers/clk/clk.c:1268:6: error: redefinition of 'clk_is_enabled_when_prepared'
drivers/clk/clk.c:941:6: error: redefinition of 'clk_unprepare'
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4968: warning: This comment 
starts with '/**', but isn't a kernel-doc comment. Refer 
Documentation/doc-guide/kernel-doc.rst
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:5075:24: warning: 
implicit conversion from 'enum ' to 'enum dc_status' 
[-Wenum-conversion]
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:139:43:
 warning: unused variable 'dmub_outbox_irq_info_funcs' [-Wunused-const-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: 
warning: no previous prototype for 'to_dal_irq_source_dcn201' 
[-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: 
warning: no previous prototype for function 'to_dal_irq_source_dcn201' 
[-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous 
prototype for 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous 
prototype for function 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous 
prototype for 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous 
prototype for function 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous 
prototype for 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous 
prototype for function 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype 
for 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype 
for function 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c:221:21: warning: variable 'loc' 
set but not used [-Wunused-but-set-variable]
fs/dlm/lowcomms.c:623: undefined reference to `lockdep_is_held'
include/net/sock.h:1713: undefined reference to `lockdep_is_held'
ld.lld: error: undefined symbol: devm_drm_of_get_bridge
ld.lld: error: undefined symbol: drm_atomic_get_new_connector_for_encoder
ld.lld: error: undefined symbol: drm_atomic_helper_bridge_destroy_state
ld.lld: error: undefined symbol: drm_atomic_helper_bridge_duplicate_state
ld.lld: error: undefined symbol: drm_atomic_helper_bridge_reset
ld.lld: error: undefined symbol: drm_bridge_add
ld.lld: error: undefined symbol: drm_bridge_attach
ld.lld: error: undefined symbol: drm_bridge_remove
ld.lld: error: undefined symbol: drm_of_get_data_lanes_count_ep
ld.lld: error: undefined symbol: lockdep_is_held
microblaze-linux-ld: (.text+0x158): undefined reference to `drm_bridge_add'
microblaze-linux-ld: drivers/gpu/drm/rcar-du/rzg2l_mipi_dsi.o:(.rodata+0x3b4): 
undefined reference to `drm_atomic_helper_bridge_duplicate_state'
microblaze-linux-ld: drivers/gpu/drm/rcar-du/rzg2l_mipi_dsi.o:(.rodata+0x3b8): 
undefined reference to `drm_atomic_helper_bridge_destroy_state'
microblaze-linux-ld: drivers/gpu/drm/rcar-du/rzg2l_mipi_dsi.o:(.rodata+0x3c8): 
undefined reference to `drm_atomic_helper_bridge_reset'

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/gpu/drm/nouveau/nvkm/falcon/base.c:47:23: warning: use of uninitialized 
value '' [CWE-457] [-Wa

[PATCH 2/4] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes

2022-11-22 Thread Matt Roper
Passing the GT rather than uncore to the lowest level MCR read and write
functions will make it easier to introduce dedicated MCR locking in a
following patch.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c 
b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index ea86c1ab5dc5..f4484bb18ec9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -221,7 +221,7 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
 
 /*
  * rw_with_mcr_steering_fw - Access a register with specific MCR steering
- * @uncore: pointer to struct intel_uncore
+ * @gt: GT to read register from
  * @reg: register being accessed
  * @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
  * @group: group number (documented as "sliceid" on older platforms)
@@ -232,10 +232,11 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
  *
  * Caller needs to make sure the relevant forcewake wells are up.
  */
-static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
+static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
   i915_mcr_reg_t reg, u8 rw_flag,
   int group, int instance, u32 value)
 {
+   struct intel_uncore *uncore = gt->uncore;
u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
 
lockdep_assert_held(&uncore->lock);
@@ -308,11 +309,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore 
*uncore,
return val;
 }
 
-static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
+static u32 rw_with_mcr_steering(struct intel_gt *gt,
i915_mcr_reg_t reg, u8 rw_flag,
int group, int instance,
u32 value)
 {
+   struct intel_uncore *uncore = gt->uncore;
enum forcewake_domains fw_domains;
u32 val;
 
@@ -325,7 +327,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
spin_lock_irq(&uncore->lock);
intel_uncore_forcewake_get__locked(uncore, fw_domains);
 
-   val = rw_with_mcr_steering_fw(uncore, reg, rw_flag, group, instance, 
value);
+   val = rw_with_mcr_steering_fw(gt, reg, rw_flag, group, instance, value);
 
intel_uncore_forcewake_put__locked(uncore, fw_domains);
spin_unlock_irq(&uncore->lock);
@@ -347,7 +349,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
  i915_mcr_reg_t reg,
  int group, int instance)
 {
-   return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, 
instance, 0);
+   return rw_with_mcr_steering(gt, reg, FW_REG_READ, group, instance, 0);
 }
 
 /**
@@ -364,7 +366,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
 void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 
value,
int group, int instance)
 {
-   rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, 
value);
+   rw_with_mcr_steering(gt, reg, FW_REG_WRITE, group, instance, value);
 }
 
 /**
@@ -588,7 +590,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, 
i915_mcr_reg_t reg)
for (type = 0; type < NUM_STEERING_TYPES; type++) {
if (reg_needs_read_steering(gt, reg, type)) {
get_nonterminated_steering(gt, type, &group, &instance);
-   return rw_with_mcr_steering_fw(gt->uncore, reg,
+   return rw_with_mcr_steering_fw(gt, reg,
   FW_REG_READ,
   group, instance, 0);
}
@@ -615,7 +617,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, 
i915_mcr_reg_t reg)
for (type = 0; type < NUM_STEERING_TYPES; type++) {
if (reg_needs_read_steering(gt, reg, type)) {
get_nonterminated_steering(gt, type, &group, &instance);
-   return rw_with_mcr_steering(gt->uncore, reg,
+   return rw_with_mcr_steering(gt, reg,
FW_REG_READ,
group, instance, 0);
}
-- 
2.38.1



[PATCH 3/4] drm/i915/gt: Add dedicated MCR lock

2022-11-22 Thread Matt Roper
We've been overloading uncore->lock to protect access to the MCR
steering register.  That's not really what uncore->lock is intended for,
and it would be better if we didn't need to hold such a high-traffic
spinlock for the whole sequence of (apply steering, access MCR register,
restore steering).  Let's create a dedicated MCR lock to protect the
steering control register over this critical section and stop relying on
the high-traffic uncore->lock.

For now the new lock is a software lock.  However, some platforms (MTL
and beyond) have a hardware-provided locking mechanism that can
serialize not only software accesses but also hardware/firmware
accesses; support for that hardware-level lock will be added in a
future patch.

Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Balasubramani Vivekanandan 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt.c  |  2 +
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 66 -
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h  |  2 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h|  8 +++
 drivers/gpu/drm/i915/gt/intel_mocs.c|  2 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  4 ++
 6 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index b5ad9caa5537..f823fc0b3827 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1094,6 +1094,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
 
+   intel_gt_mcr_lock(gt);
spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
 
awake = 0;
@@ -1129,6 +1130,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
intel_uncore_write_fw(uncore, GEN12_OA_TLB_INV_CR, 1);
 
spin_unlock_irq(&uncore->lock);
+   intel_gt_mcr_unlock(gt);
 
for_each_engine_masked(engine, gt, awake, tmp) {
struct reg_and_bit rb;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c 
b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index f4484bb18ec9..f9e722d91904 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -143,6 +143,8 @@ void intel_gt_mcr_init(struct intel_gt *gt)
unsigned long fuse;
int i;
 
+   spin_lock_init(&gt->mcr_lock);
+
/*
 * An mslice is unavailable only if both the meml3 for the slice is
 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
@@ -228,6 +230,7 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
  * @instance: instance number (documented as "subsliceid" on older platforms)
  * @value: register value to be written (ignored for read)
  *
+ * Context: The caller must hold the MCR lock
  * Return: 0 for write access. register value for read access.
  *
  * Caller needs to make sure the relevant forcewake wells are up.
@@ -239,7 +242,7 @@ static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
struct intel_uncore *uncore = gt->uncore;
u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
 
-   lockdep_assert_held(&uncore->lock);
+   lockdep_assert_held(&gt->mcr_lock);
 
if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
/*
@@ -324,6 +327,7 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
 GEN8_MCR_SELECTOR,
 FW_REG_READ | 
FW_REG_WRITE);
 
+   intel_gt_mcr_lock(gt);
spin_lock_irq(&uncore->lock);
intel_uncore_forcewake_get__locked(uncore, fw_domains);
 
@@ -331,10 +335,45 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
 
intel_uncore_forcewake_put__locked(uncore, fw_domains);
spin_unlock_irq(&uncore->lock);
+   intel_gt_mcr_unlock(gt);
 
return val;
 }
 
+/**
+ * intel_gt_mcr_lock - Acquire MCR steering lock
+ * @gt: GT structure
+ *
+ * Performs locking to protect the steering for the duration of an MCR
+ * operation.  Depending on the platform, this may be a software lock
+ * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
+ * access not only for the driver, but also for external hardware and
+ * firmware agents).
+ *
+ * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
+ *  function is called, although it may be acquired after this
+ *  function call.
+ */
+void intel_gt_mcr_lock(struct intel_gt *gt)
+{
+   lockdep_assert_not_held(&gt->uncore->lock);
+
+   spin_lock(&gt->mcr_lock);
+}
+
+/**
+ * intel_gt_mcr_unlock - Release MCR steering lock
+ * @gt: GT structure
+ *
+ * Releases the lock acquired by intel_gt_mcr_lock().
+ *
+ * Context: Releases gt->mcr_lock
+ */
+void intel_gt_mcr_unlock(struct intel_gt *gt)
+{
+   spin_unlock(&gt->mcr_lock);
+}
+
 /**
  * intel_gt_mcr_read - read a specific instance of an MCR register
  *

[PATCH 1/4] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg()

2022-11-22 Thread Matt Roper
The kerneldoc function name was not updated when this function was
converted to a non-fw form.

Fixes: 192bb40f030a ("drm/i915/gt: Manage uncore->lock while waiting on MCR 
register")
Reported-by: kernel test robot 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c 
b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index d9a8ff9e5e57..ea86c1ab5dc5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -702,7 +702,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, 
unsigned int dss,
 }
 
 /**
- * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected 
state
+ * intel_gt_mcr_wait_for_reg - wait until MCR register matches expected state
  * @gt: GT structure
  * @reg: the register to read
  * @mask: mask to apply to register value
-- 
2.38.1



[PATCH 4/4] drm/i915/mtl: Add hardware-level lock for steering

2022-11-22 Thread Matt Roper
Starting with MTL, the driver needs to not only protect the steering
control register from simultaneous software accesses, but also protect
against races with hardware/firmware agents.  The hardware provides a
dedicated locking mechanism to support this via the STEER_SEMAPHORE
register.  Reading the register acts as a 'trylock' operation; the read
will return 0x1 if the lock is acquired or 0x0 if something else is
already holding the lock; once acquired, writing 0x1 to the register
will release the lock.

We'll continue to grab the software lock as well, just so lockdep can
track our locking; assuming the hardware lock is behaving properly,
there should never be any contention on the software lock in this case.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 29 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  1 +
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c 
b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index f9e722d91904..fe5f5e0affdf 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -345,10 +345,9 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
  * @gt: GT structure
  *
  * Performs locking to protect the steering for the duration of an MCR
- * operation.  Depending on the platform, this may be a software lock
- * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
- * access not only for the driver, but also for external hardware and
- * firmware agents).
+ * operation.  On MTL and beyond, a hardware lock will also be taken to
+ * serialize access not only for the driver, but also for external hardware and
+ * firmware agents.
  *
  * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
  *  function is called, although it may be acquired after this
@@ -356,9 +355,28 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
  */
 void intel_gt_mcr_lock(struct intel_gt *gt)
 {
+   int err = 0;
+
lockdep_assert_not_held(&gt->uncore->lock);
 
+   /*
+* Starting with MTL, we need to coordinate not only with other
+* driver threads, but also with hardware/firmware agents.  A dedicated
+* locking register is used.
+*/
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   err = wait_for(intel_uncore_read_fw(gt->uncore,
+   STEER_SEMAPHORE) == 0x1, 1);
+
+   /*
+* Even on platforms with a hardware lock, we'll continue to grab
+* a software spinlock too for lockdep purposes.  If the hardware lock
+* was already acquired, there should never be contention on the
+* software lock.
+*/
spin_lock(&gt->mcr_lock);
+
+   drm_WARN_ON_ONCE(&gt->i915->drm, err == -ETIMEDOUT);
 }
 
 /**
@@ -372,6 +390,9 @@ void intel_gt_mcr_lock(struct intel_gt *gt)
 void intel_gt_mcr_unlock(struct intel_gt *gt)
 {
spin_unlock(&gt->mcr_lock);
+
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   intel_uncore_write_fw(gt->uncore, STEER_SEMAPHORE, 0x1);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 80a979e6f6be..412c0b399ebd 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -67,6 +67,7 @@
 #define GMD_ID_MEDIA   _MMIO(MTL_MEDIA_GSI_BASE + 
0xd8c)
 
 #define MCFG_MCR_SELECTOR  _MMIO(0xfd0)
+#define STEER_SEMAPHORE    _MMIO(0xfd0)
 #define MTL_MCR_SELECTOR   _MMIO(0xfd4)
 #define SF_MCR_SELECTOR    _MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR  _MMIO(0xfdc)
-- 
2.38.1
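
Editorial note on the STEER_SEMAPHORE protocol described in the commit
message above (a read acts as a trylock and returns 0x1 on success or 0x0
if the lock is already held; writing 0x1 releases it). The following is a
minimal, single-threaded userspace sketch of that protocol, not driver
code. The names ex_semaphore, ex_sem_read(), ex_sem_write() and
ex_sem_lock() are hypothetical, and the bounded poll count merely stands
in for the 1 ms wait_for() timeout used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the semaphore register (not i915 code). */
struct ex_semaphore {
	bool held;
};

/* A read is a trylock: 0x1 if this read took the lock, 0x0 if it was held. */
static uint32_t ex_sem_read(struct ex_semaphore *s)
{
	if (s->held)
		return 0x0;
	s->held = true;
	return 0x1;
}

/* Writing 0x1 releases the lock; other values are ignored in this model. */
static void ex_sem_write(struct ex_semaphore *s, uint32_t val)
{
	if (val == 0x1)
		s->held = false;
}

/*
 * Poll the register until the lock is acquired or the poll budget runs
 * out.  Returns 0 on success and -1 on timeout.
 */
static int ex_sem_lock(struct ex_semaphore *s, unsigned int max_polls)
{
	while (max_polls--) {
		if (ex_sem_read(s) == 0x1)
			return 0;
	}
	return -1;
}
```

In this model a second ex_sem_lock() call fails until ex_sem_write(s, 0x1)
has been issued, which mirrors why the patch warns once on -ETIMEDOUT
instead of spinning forever.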



[PATCH 0/4] i915: dedicated MCR locking and hardware semaphore

2022-11-22 Thread Matt Roper
We've been overloading uncore->lock to protect access to the MCR
steering register.  That's not really what uncore->lock is intended for,
and it would be better if we didn't need to hold such a high-traffic
spinlock for the whole sequence of (apply steering, access MCR register,
restore steering).  Switch to a dedicated MCR lock to protect the
steering control register over this critical section and stop relying on
the high-traffic uncore->lock.  On pre-MTL platforms the dedicated MCR
lock is just another software lock, but on MTL and beyond we also
utilize the hardware-provided STEER_SEMAPHORE that allows us to
synchronize with external hardware and firmware agents.

Matt Roper (4):
  drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg()
  drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes
  drm/i915/gt: Add dedicated MCR lock
  drm/i915/mtl: Add hardware-level lock for steering

 drivers/gpu/drm/i915/gt/intel_gt.c  |   2 +
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 107 ++--
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h  |   2 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |   1 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h|   8 ++
 drivers/gpu/drm/i915/gt/intel_mocs.c|   2 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c |   4 +
 7 files changed, 115 insertions(+), 11 deletions(-)

-- 
2.38.1
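
Editorial note on the critical section named in the cover letter above
(apply steering, access MCR register, restore steering). The following is
a rough userspace sketch of that sequence under a dedicated lock, not
driver code. All ex_* names are hypothetical, a C11 atomic_flag stands in
for the gt->mcr_lock spinlock, and forcewake and uncore->lock (which the
series acquires after the MCR lock) are left out.

```c
#include <stdatomic.h>
#include <stdint.h>

#define EX_NUM_INSTANCES 4

/* Hypothetical model of one GT with a single multicast register. */
struct ex_gt {
	atomic_flag mcr_lock;             /* stand-in for gt->mcr_lock */
	uint32_t selector;                /* stand-in for the steering register */
	uint32_t regs[EX_NUM_INSTANCES];  /* one register copy per instance */
};

static void ex_gt_init(struct ex_gt *gt, const uint32_t *vals)
{
	atomic_flag_clear(&gt->mcr_lock);
	gt->selector = 0;
	for (int i = 0; i < EX_NUM_INSTANCES; i++)
		gt->regs[i] = vals[i];
}

static void ex_mcr_lock(struct ex_gt *gt)
{
	/* Spin until the flag was previously clear, i.e. we now own it. */
	while (atomic_flag_test_and_set_explicit(&gt->mcr_lock,
						 memory_order_acquire))
		;
}

static void ex_mcr_unlock(struct ex_gt *gt)
{
	atomic_flag_clear_explicit(&gt->mcr_lock, memory_order_release);
}

/* Read one instance; the lock covers steer, access and restore together. */
static uint32_t ex_mcr_read(struct ex_gt *gt, uint32_t instance)
{
	uint32_t old, val;

	if (instance >= EX_NUM_INSTANCES)
		return 0;

	ex_mcr_lock(gt);
	old = gt->selector;           /* remember the current steering */
	gt->selector = instance;      /* apply steering */
	val = gt->regs[gt->selector]; /* access the steered register */
	gt->selector = old;           /* restore steering */
	ex_mcr_unlock(gt);

	return val;
}
```

The point of the sketch is only that the selector is never observable in
its temporary state by another lock holder; the lock protects the steering
register, not the register being read.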



Re: [Intel-gfx] [PATCH 1/6] drm/i915/uc: Introduce GSC FW

2022-11-22 Thread Jani Nikula
On Tue, 22 Nov 2022, "Ceraolo Spurio, Daniele" wrote:
> On 11/22/2022 1:03 AM, Jani Nikula wrote:
>> On Mon, 21 Nov 2022, Daniele Ceraolo Spurio wrote:
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h 
>>> b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
>>> index a8f38c2c60e2..5d0f1bcc381e 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
>>> @@ -6,6 +6,7 @@
>>>   #ifndef _INTEL_UC_H_
>>>   #define _INTEL_UC_H_
>>>   
>>> +#include "intel_gsc_uc.h"
>> And thus intel_gsc_uc.h becomes another file that causes the entire
>> driver to be rebuilt when modified.
>>
>> *sad trombone*
>
> I just followed the same pattern as what is done for GuC and HuC files. 
> What's the recommendation here? Should I split out gsc_uc_types.h from 
> gsc_uc.h ?

Sorry for not being clear, I'm not insisting you do anything at this
time.

But it is something that needs to be refactored eventually.

As an anecdotal data point: I just scripted all the #include
dependencies across all the files in the driver into a digraph and had
graphviz turn it into svg, and on my 80 cm wide 4k screen zoomed out as
far as Firefox can, it's still 15 screenfuls side by side. ;D

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


[PATCH 3/3] drm/i915/guc: Use GuC submission API version number

2022-11-22 Thread John . C . Harrison
From: John Harrison 

The GuC firmware includes an extra version number to specify the
submission API level. So use that rather than the main firmware
version number for submission related checks.

Also, while it is guaranteed that GuC version number components are
only 8-bits in size, other firmwares do not have that restriction. So
stop making assumptions about them generically fitting in a u16
individually, or in a u32 as a combined 8.8.8.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 91 ---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h  | 10 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h  |  3 +-
 5 files changed, 104 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 1bb3f98292866..bb4dfe707a7d0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -158,6 +158,9 @@ struct intel_guc {
bool submission_selected;
/** @submission_initialized: tracks whether GuC submission has been 
initialised */
bool submission_initialized;
+   /** @submission_version: Submission API version of the currently loaded 
firmware */
+   struct intel_uc_fw_ver submission_version;
+
/**
 * @rc_supported: tracks whether we support GuC rc on the current 
platform
 */
@@ -268,6 +271,14 @@ struct intel_guc {
 #endif
 };
 
+/*
+ * GuC version number components are only 8-bit, so converting to a 32bit 8.8.8
+ * integer works.
+ */
+#define MAKE_GUC_VER(maj, min, pat)    (((maj) << 16) | ((min) << 8) | (pat))
+#define MAKE_GUC_VER_STRUCT(ver)       MAKE_GUC_VER((ver).major, (ver).minor, (ver).patch)
+#define GUC_SUBMIT_VER(guc)            MAKE_GUC_VER_STRUCT((guc)->submission_version)
+
 static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
 {
return container_of(log, struct intel_guc, log);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0a42f1807f52c..53f7f599cde3a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1890,7 +1890,7 @@ int intel_guc_submission_init(struct intel_guc *guc)
if (guc->submission_initialized)
return 0;
 
-   if (GET_UC_VER(guc) < MAKE_UC_VER(70, 0, 0)) {
+   if (GUC_SUBMIT_VER(guc) < MAKE_GUC_VER(1, 0, 0)) {
ret = guc_lrc_desc_pool_create_v69(guc);
if (ret)
return ret;
@@ -2330,7 +2330,7 @@ static int register_context(struct intel_context *ce, 
bool loop)
GEM_BUG_ON(intel_context_is_child(ce));
trace_intel_context_register(ce);
 
-   if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0))
+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0))
ret = register_context_v70(guc, ce, loop);
else
ret = register_context_v69(guc, ce, loop);
@@ -2342,7 +2342,7 @@ static int register_context(struct intel_context *ce, 
bool loop)
set_context_registered(ce);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
 
-   if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0))
+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0))
guc_context_policy_init_v70(ce, loop);
}
 
@@ -2956,7 +2956,7 @@ static void __guc_context_set_preemption_timeout(struct 
intel_guc *guc,
 u16 guc_id,
 u32 preemption_timeout)
 {
-   if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0)) {
+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0)) {
struct context_policy policy;
 
__guc_context_policy_start_klv(&policy, guc_id);
@@ -3283,7 +3283,7 @@ static int guc_context_alloc(struct intel_context *ce)
 static void __guc_context_set_prio(struct intel_guc *guc,
   struct intel_context *ce)
 {
-   if (GET_UC_VER(guc) >= MAKE_UC_VER(70, 0, 0)) {
+   if (GUC_SUBMIT_VER(guc) >= MAKE_GUC_VER(1, 0, 0)) {
struct context_policy policy;
 
__guc_context_policy_start_klv(&policy, ce->guc_id.id);
@@ -4366,7 +4366,7 @@ static int guc_init_global_schedule_policy(struct 
intel_guc *guc)
intel_wakeref_t wakeref;
int ret = 0;
 
-   if (GET_UC_VER(guc) < MAKE_UC_VER(70, 3, 0))
+   if (GUC_SUBMIT_VER(guc) < MAKE_GUC_VER(1, 1, 0))
return 0;
 
__guc_scheduling_policy_start_klv(&policy);
@@ -4905,6 +4905,9 @@ void intel_guc_submission_print_info(struct intel_guc 
*guc,
if (!sched_engine)
return;
 
+   drm_printf(p, "GuC Submission API Version: %d.%d.%d\n",
+  guc->s

[PATCH 1/3] drm/i915/uc: Rationalise delimiters in filename macros

2022-11-22 Thread John . C . Harrison
From: John Harrison 

The way delimiters (underscores and dots) were added to the UC
filenames was different for different types of delimiter. Rationalise
them to all be done the same way - implicitly in the concatenation
macro rather than explicitly in the file name prefix.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 0c80ba51a4bdc..774c3d84a4243 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -118,35 +118,35 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
  */
 #define __MAKE_UC_FW_PATH_BLANK(prefix_, name_) \
"i915/" \
-   __stringify(prefix_) name_ ".bin"
+   __stringify(prefix_) "_" name_ ".bin"
 
 #define __MAKE_UC_FW_PATH_MAJOR(prefix_, name_, major_) \
"i915/" \
-   __stringify(prefix_) name_ \
+   __stringify(prefix_) "_" name_ "_" \
__stringify(major_) ".bin"
 
 #define __MAKE_UC_FW_PATH_MMP(prefix_, name_, major_, minor_, patch_) \
"i915/" \
-   __stringify(prefix_) name_ \
+   __stringify(prefix_) "_" name_  "_" \
__stringify(major_) "." \
__stringify(minor_) "." \
__stringify(patch_) ".bin"
 
 /* Minor for internal driver use, not part of file name */
 #define MAKE_GUC_FW_PATH_MAJOR(prefix_, major_, minor_) \
-   __MAKE_UC_FW_PATH_MAJOR(prefix_, "_guc_", major_)
+   __MAKE_UC_FW_PATH_MAJOR(prefix_, "guc", major_)
 
 #define MAKE_GUC_FW_PATH_MMP(prefix_, major_, minor_, patch_) \
-   __MAKE_UC_FW_PATH_MMP(prefix_, "_guc_", major_, minor_, patch_)
+   __MAKE_UC_FW_PATH_MMP(prefix_, "guc", major_, minor_, patch_)
 
 #define MAKE_HUC_FW_PATH_BLANK(prefix_) \
-   __MAKE_UC_FW_PATH_BLANK(prefix_, "_huc")
+   __MAKE_UC_FW_PATH_BLANK(prefix_, "huc")
 
 #define MAKE_HUC_FW_PATH_GSC(prefix_) \
-   __MAKE_UC_FW_PATH_BLANK(prefix_, "_huc_gsc")
+   __MAKE_UC_FW_PATH_BLANK(prefix_, "huc_gsc")
 
 #define MAKE_HUC_FW_PATH_MMP(prefix_, major_, minor_, patch_) \
-   __MAKE_UC_FW_PATH_MMP(prefix_, "_huc_", major_, minor_, patch_)
+   __MAKE_UC_FW_PATH_MMP(prefix_, "huc", major_, minor_, patch_)
 
 /*
  * All blobs need to be declared via MODULE_FIRMWARE().
-- 
2.37.3
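
Editorial note on the patch above: after the change, the "_" and "."
delimiters are supplied by the concatenation macros and the name argument
is a bare string. The following standalone sketch reproduces that layout
in userspace so the resulting strings can be inspected. The EX_* macros
and ex_* helpers are hypothetical stand-ins (EX_STRINGIFY mimics the
kernel's __stringify()), and the tgl prefix and version numbers are
illustrative inputs only.

```c
/* Local two-level stringify, mimicking the kernel's __stringify(). */
#define EX_STRINGIFY_1(x) #x
#define EX_STRINGIFY(x)   EX_STRINGIFY_1(x)

/* Delimiters live here, not in the name_ argument. */
#define EX_FW_PATH_BLANK(prefix_, name_) \
	"i915/" EX_STRINGIFY(prefix_) "_" name_ ".bin"

#define EX_FW_PATH_MAJOR(prefix_, name_, major_) \
	"i915/" EX_STRINGIFY(prefix_) "_" name_ "_" \
	EX_STRINGIFY(major_) ".bin"

#define EX_FW_PATH_MMP(prefix_, name_, major_, minor_, patch_) \
	"i915/" EX_STRINGIFY(prefix_) "_" name_ "_" \
	EX_STRINGIFY(major_) "." \
	EX_STRINGIFY(minor_) "." \
	EX_STRINGIFY(patch_) ".bin"

/* Adjacent string literals are concatenated at compile time. */
static const char *ex_huc_blank_path(void)
{
	return EX_FW_PATH_BLANK(tgl, "huc");
}

static const char *ex_guc_major_path(void)
{
	return EX_FW_PATH_MAJOR(tgl, "guc", 70);
}

static const char *ex_huc_mmp_path(void)
{
	return EX_FW_PATH_MMP(tgl, "huc", 7, 9, 3);
}
```

With the delimiters in one place, a caller cannot forget a leading or
trailing underscore in the name string, which is the inconsistency the
commit message describes.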



[PATCH 2/3] drm/i915/uc: More refactoring of UC version numbers

2022-11-22 Thread John . C . Harrison
From: John Harrison 

As a precursor to a coming change (for adding a GuC submission API
version), abstract the UC version number into its own private
structure separate to the firmware filename.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.c|  6 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 76 +++-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h | 15 +++--
 3 files changed, 48 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 1d28286e6f066..e6edad6f8f9dd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -437,9 +437,9 @@ static void print_fw_ver(struct intel_uc *uc, struct 
intel_uc_fw *fw)
 
drm_info(&i915->drm, "%s firmware %s version %u.%u.%u\n",
 intel_uc_fw_type_repr(fw->type), fw->file_selected.path,
-fw->file_selected.major_ver,
-fw->file_selected.minor_ver,
-fw->file_selected.patch_ver);
+fw->file_selected.ver.major,
+fw->file_selected.ver.minor,
+fw->file_selected.ver.patch);
 }
 
 static int __uc_init_hw(struct intel_uc *uc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 774c3d84a4243..5e2ee1ac89514 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -278,8 +278,8 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct 
intel_uc_fw *uc_fw)
 
uc_fw->file_selected.path = blob->path;
uc_fw->file_wanted.path = blob->path;
-   uc_fw->file_wanted.major_ver = blob->major;
-   uc_fw->file_wanted.minor_ver = blob->minor;
+   uc_fw->file_wanted.ver.major = blob->major;
+   uc_fw->file_wanted.ver.minor = blob->minor;
uc_fw->loaded_via_gsc = blob->loaded_via_gsc;
found = true;
break;
@@ -438,28 +438,28 @@ static void __force_fw_fetch_failures(struct intel_uc_fw 
*uc_fw, int e)
uc_fw->user_overridden = user;
} else if (i915_inject_probe_error(i915, e)) {
/* require next major version */
-   uc_fw->file_wanted.major_ver += 1;
-   uc_fw->file_wanted.minor_ver = 0;
+   uc_fw->file_wanted.ver.major += 1;
+   uc_fw->file_wanted.ver.minor = 0;
uc_fw->user_overridden = user;
} else if (i915_inject_probe_error(i915, e)) {
/* require next minor version */
-   uc_fw->file_wanted.minor_ver += 1;
+   uc_fw->file_wanted.ver.minor += 1;
uc_fw->user_overridden = user;
-   } else if (uc_fw->file_wanted.major_ver &&
+   } else if (uc_fw->file_wanted.ver.major &&
   i915_inject_probe_error(i915, e)) {
/* require prev major version */
-   uc_fw->file_wanted.major_ver -= 1;
-   uc_fw->file_wanted.minor_ver = 0;
+   uc_fw->file_wanted.ver.major -= 1;
+   uc_fw->file_wanted.ver.minor = 0;
uc_fw->user_overridden = user;
-   } else if (uc_fw->file_wanted.minor_ver &&
+   } else if (uc_fw->file_wanted.ver.minor &&
   i915_inject_probe_error(i915, e)) {
/* require prev minor version - hey, this should work! */
-   uc_fw->file_wanted.minor_ver -= 1;
+   uc_fw->file_wanted.ver.minor -= 1;
uc_fw->user_overridden = user;
} else if (user && i915_inject_probe_error(i915, e)) {
/* officially unsupported platform */
-   uc_fw->file_wanted.major_ver = 0;
-   uc_fw->file_wanted.minor_ver = 0;
+   uc_fw->file_wanted.ver.major = 0;
+   uc_fw->file_wanted.ver.minor = 0;
uc_fw->user_overridden = true;
}
 }
@@ -471,9 +471,9 @@ static int check_gsc_manifest(const struct firmware *fw,
u32 version_hi = dw[HUC_GSC_VERSION_HI_DW];
u32 version_lo = dw[HUC_GSC_VERSION_LO_DW];
 
-   uc_fw->file_selected.major_ver = FIELD_GET(HUC_GSC_MAJOR_VER_HI_MASK, 
version_hi);
-   uc_fw->file_selected.minor_ver = FIELD_GET(HUC_GSC_MINOR_VER_HI_MASK, 
version_hi);
-   uc_fw->file_selected.patch_ver = FIELD_GET(HUC_GSC_PATCH_VER_LO_MASK, 
version_lo);
+   uc_fw->file_selected.ver.major = FIELD_GET(HUC_GSC_MAJOR_VER_HI_MASK, 
version_hi);
+   uc_fw->file_selected.ver.minor = FIELD_GET(HUC_GSC_MINOR_VER_HI_MASK, 
version_hi);
+   uc_fw->file_selected.ver.patch = FIELD_GET(HUC_GSC_PATCH_VER_LO_MASK, 
version_lo);
 
return 0;
 }
@@ -532,11 +532,11 @@ static int check_ccs_header(struct intel_gt *gt,
}
 
/* Get version numbers from the CSS header */
-   uc_fw->file_selected.major_ver = FIELD_GET(CSS_SW_VERSION_UC_MAJOR,
+   uc_fw->file_selec

[PATCH 0/3] More GuC firmware version improvements

2022-11-22 Thread John . C . Harrison
From: John Harrison 

Start using the 'submission API version' for deciding which GuC API to
use in the submission code.

Correct version number manipulation code to support full 32bit
major/minor/patch components, except for GuC which is guaranteed to be
8bit safe.

Other minor code clean ups around version number handling.

Signed-off-by: John Harrison 


John Harrison (3):
  drm/i915/uc: Rationalise delimiters in filename macros
  drm/i915/uc: More refactoring of UC version numbers
  drm/i915/guc: Use GuC submission API version number

 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  11 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  15 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc.c |   6 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 173 --
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h  |  15 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h  |   3 +-
 6 files changed, 150 insertions(+), 73 deletions(-)

-- 
2.37.3
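
Editorial note on the cover letter's point about 8-bit versus full 32-bit
version components. Packing major/minor/patch into one 8.8.8 integer, as
MAKE_GUC_VER() does in patch 3/3, only orders versions correctly while
every component fits in 8 bits. The following sketch shows the failure
mode and a field-by-field comparison that avoids it. The struct and
function names are hypothetical and are not the helpers added by the
series.

```c
#include <stdint.h>

/* Hypothetical version triple with full 32-bit components. */
struct ex_fw_ver {
	uint32_t major;
	uint32_t minor;
	uint32_t patch;
};

/* 8.8.8 packing: ordering is only meaningful if each component is < 256. */
static uint32_t ex_pack_888(struct ex_fw_ver v)
{
	return (v.major << 16) | (v.minor << 8) | v.patch;
}

/*
 * Field-by-field comparison, valid for any 32-bit components.
 * Returns -1, 0 or 1 for a < b, a == b or a > b.
 */
static int ex_ver_cmp(struct ex_fw_ver a, struct ex_fw_ver b)
{
	if (a.major != b.major)
		return a.major < b.major ? -1 : 1;
	if (a.minor != b.minor)
		return a.minor < b.minor ? -1 : 1;
	if (a.patch != b.patch)
		return a.patch < b.patch ? -1 : 1;
	return 0;
}
```

For example, a patch level of 300 overflows into the minor field, so the
packed form of 1.0.300 compares greater than the packed form of 1.1.0,
while ex_ver_cmp() still reports 1.0.300 as the older version.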



[PATCH] dt-bindings: display: bridge: renesas, rzg2l-mipi-dsi: Document RZ/V2L support

2022-11-22 Thread Biju Das
Document RZ/V2L DSI bindings. RZ/V2L MIPI DSI is identical to one found on
the RZ/G2L SoC. No driver changes are required as generic compatible
string "renesas,rzg2l-mipi-dsi" will be used as a fallback.

Signed-off-by: Biju Das 
---
 .../devicetree/bindings/display/bridge/renesas,dsi.yaml  | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml 
b/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml
index 131d5b63ec4f..e08c24633926 100644
--- a/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml
+++ b/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml
@@ -22,6 +22,7 @@ properties:
 items:
   - enum:
   - renesas,r9a07g044-mipi-dsi # RZ/G2{L,LC}
+  - renesas,r9a07g054-mipi-dsi # RZ/V2L
   - const: renesas,rzg2l-mipi-dsi
 
   reg:
-- 
2.25.1



Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Daniel Vetter
On Tue, 22 Nov 2022 at 20:34, Jason Gunthorpe  wrote:
> On Tue, Nov 22, 2022 at 08:29:05PM +0100, Daniel Vetter wrote:
> > You nuke all the ptes. Drivers that move have slightly more than a
> > bare struct file, they also have a struct address_space so that
> > invalidate_mapping_range() works.
>
> Okay, this is one of the ways that this can be made to work correctly,
> as long as you never allow GUP/GUP_fast to succeed on the PTEs. (this
> was the DAX mistake)

Hence this patch, to enforce that no dma-buf exporter gets this wrong.
Which some did, and then blamed bug reporters for the resulting splats
:-) One of the things we've reverted was the ttm huge pte support,
since that doesn't have the pmd_special flag (yet) and so would let
gup_fast through.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [syzbot] inconsistent lock state in sync_info_debugfs_show

2022-11-22 Thread Daniel Vetter
On Sun, 20 Nov 2022 at 21:51, syzbot wrote:
>
> syzbot has bisected this issue to:
>
> commit 997acaf6b4b59c6a9c259740312a69ea549cc684
> Author: Mark Rutland 
> Date:   Mon Jan 11 15:37:07 2021 +
>
> lockdep: report broken irq restoration

Ok this looks funny. I'm pretty sure the code in
drivers/dma-buf/sw_sync.c around sync_timeline_fence_lock is correct.
And we don't do anything that this patch claims to catch, it's all
just plain spin_lock_irq and spin_lock_irqsave usage. Only thing that
crossed my mind here is that maybe lockdep somehow ends up with two
different keys for the same spinlock? I'm really confused ...
-Daniel

> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=115b350d88
> start commit:   84368d882b96 Merge tag 'soc-fixes-6.1-3' of git://git.kern..
> git tree:   upstream
> final oops: https://syzkaller.appspot.com/x/report.txt?x=135b350d88
> console output: https://syzkaller.appspot.com/x/log.txt?x=155b350d88
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6f4e5e9899396248
> dashboard link: https://syzkaller.appspot.com/bug?extid=007bfe0f3330f6e1e7d1
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=164376f988
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16cf096588
>
> Reported-by: syzbot+007bfe0f3330f6e1e...@syzkaller.appspotmail.com
> Fixes: 997acaf6b4b5 ("lockdep: report broken irq restoration")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/6] drm/i915/uc: Introduce GSC FW

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 1:03 AM, Jani Nikula wrote:

On Mon, 21 Nov 2022, Daniele Ceraolo Spurio  
wrote:

On MTL the GSC FW needs to be loaded on the media GT by the graphics
driver. We're going to treat it like a new uc_fw, so add the initial
defs and init/fini functions for it.

Similarly to the other FWs, the GSC FW path can be overridden via
modparam. The modparam can also be used to disable the GSC FW loading by
setting it to an empty string.

Note that the new structure has been called intel_gsc_uc to avoid
confusion with the existing intel_gsc, which instead represents the heci
gsc interfaces.

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Alan Previn 
Cc: John Harrison 
---
  drivers/gpu/drm/i915/Makefile |  3 +-
  drivers/gpu/drm/i915/gt/intel_gt.h|  5 ++
  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c | 70 +++
  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h | 36 
  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 17 ++
  drivers/gpu/drm/i915/gt/uc/intel_uc.h |  3 +
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 25 +++-
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h  |  7 ++-
  drivers/gpu/drm/i915/i915_params.c|  3 +
  drivers/gpu/drm/i915/i915_params.h|  1 +
  10 files changed, 164 insertions(+), 6 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 01974b82d205..92d37cf71e16 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -205,7 +205,8 @@ i915-y += gt/uc/intel_uc.o \
  gt/uc/intel_guc_submission.o \
  gt/uc/intel_huc.o \
  gt/uc/intel_huc_debugfs.o \
- gt/uc/intel_huc_fw.o
+ gt/uc/intel_huc_fw.o \
+ gt/uc/intel_gsc_uc.o

Comment near the top of the file:

# Please keep these build lists sorted!


My bad, dumb mistake.



  
  # graphics system controller (GSC) support

  i915-y += gt/intel_gsc.o
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
b/drivers/gpu/drm/i915/gt/intel_gt.h
index e0365d556248..d2f4fbde5f9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt.h
@@ -39,6 +39,11 @@ static inline struct intel_gt *huc_to_gt(struct intel_huc 
*huc)
return container_of(huc, struct intel_gt, uc.huc);
  }
  
+static inline struct intel_gt *gsc_uc_to_gt(struct intel_gsc_uc *gsc_uc)

+{
+   return container_of(gsc_uc, struct intel_gt, uc.gsc);
+}
+
  static inline struct intel_gt *gsc_to_gt(struct intel_gsc *gsc)
  {
return container_of(gsc, struct intel_gt, gsc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
new file mode 100644
index ..65cbf1ce9fa1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include "gt/intel_gt.h"
+#include "intel_gsc_uc.h"
+#include "i915_drv.h"
+
+static bool gsc_engine_supported(struct intel_gt *gt)
+{
+   intel_engine_mask_t mask;
+
+   /*
+* We reach here from i915_driver_early_probe for the primary GT before
+* its engine mask is set, so we use the device info engine mask for it.
+* For other GTs we expect the GT-specific mask to be set before we
+* call this function.
+*/
+   GEM_BUG_ON(!gt_is_root(gt) && !gt->info.engine_mask);
+
+   if (gt_is_root(gt))
+   mask = RUNTIME_INFO(gt->i915)->platform_engine_mask;
+   else
+   mask = gt->info.engine_mask;
+
+   return __HAS_ENGINE(mask, GSC0);
+}
+
+void intel_gsc_uc_init_early(struct intel_gsc_uc *gsc)
+{
+   intel_uc_fw_init_early(&gsc->fw, INTEL_UC_FW_TYPE_GSC);
+
+   /* we can arrive here from i915_driver_early_probe for primary
+* GT with it being not fully setup hence check device info's
+* engine mask
+*/
+   if (!gsc_engine_supported(gsc_uc_to_gt(gsc))){
+   intel_uc_fw_change_status(&gsc->fw, 
INTEL_UC_FIRMWARE_NOT_SUPPORTED);
+   return;
+   }
+}
+
+int intel_gsc_uc_init(struct intel_gsc_uc *gsc)
+{
+   struct drm_i915_private *i915 = gsc_uc_to_gt(gsc)->i915;
+   int err;
+
+   err = intel_uc_fw_init(&gsc->fw);
+   if (err)
+   goto out;
+
+   intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_LOADABLE);
+
+   return 0;
+
+out:
+   i915_probe_error(i915, "failed with %d\n", err);
+   return err;
+}
+
+void intel_gsc_uc_fini(struct intel_gsc_uc *gsc)
+{
+   if (!intel_uc_fw_is_loadable(&gsc->fw))
+   return;
+
+   intel_uc_fw_fini(&gsc->fw);
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h
new file mode 100644
index ..ea2b1c0713b8
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h

Re: [Intel-gfx] [PATCH 3/6] drm/i915/gsc: GSC firmware loading

2022-11-22 Thread Ceraolo Spurio, Daniele




On 11/22/2022 11:01 AM, Rodrigo Vivi wrote:

On Mon, Nov 21, 2022 at 03:16:14PM -0800, Daniele Ceraolo Spurio wrote:

GSC FW is loaded by submitting a dedicated command via the GSC engine.
The memory area used for loading the FW is then re-purposed as local
memory for the GSC itself, so we use a separate allocation instead of
using the one where we keep the firmware stored for reload.

The GSC is not reset as part of GT reset, so we only need to load it on
first boot and S3/S4 exit.

Note that the GSC load takes a lot of time (up to a few hundred ms).
This patch loads it serially as part of driver init/resume, but, given
that GSC is only required for PM and content-protection features
(media C6, PXP, HDCP), we could move the load to a worker thread to unblock
non-CP userspace submissions earlier. This will be done as a follow up
step, because there are extra init steps required to actually make use of
the GSC (including a mei component) and it will be cleaner (and easier to
review) if we implement the async load once all the pieces we need for GSC
to work are in place. A TODO has been added to the code to mark this
intention.

Bspec: 63347, 65346
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Alan Previn 
Cc: John Harrison 
---
  drivers/gpu/drm/i915/Makefile|   1 +
  drivers/gpu/drm/i915/gem/i915_gem_pm.c   |  14 +-
  drivers/gpu/drm/i915/gt/intel_engine.h   |   2 +
  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   7 +
  drivers/gpu/drm/i915/gt/intel_gt.c   |  11 ++
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c| 186 +++
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h|  13 ++
  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c|  35 +++-
  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h|   7 +
  drivers/gpu/drm/i915/gt/uc/intel_uc.c|  15 ++
  drivers/gpu/drm/i915/gt/uc/intel_uc.h|   2 +
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  20 +-
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h |   1 +
  13 files changed, 307 insertions(+), 7 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 92d37cf71e16..1d45a6f451fa 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -206,6 +206,7 @@ i915-y += gt/uc/intel_uc.o \
  gt/uc/intel_huc.o \
  gt/uc/intel_huc_debugfs.o \
  gt/uc/intel_huc_fw.o \
+ gt/uc/intel_gsc_fw.o \
  gt/uc/intel_gsc_uc.o
  
  # graphics system controller (GSC) support

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index 0d812f4d787d..f77eb4009aba 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -232,10 +232,22 @@ void i915_gem_resume(struct drm_i915_private *i915)
 * guarantee that the context image is complete. So let's just reset
 * it and start again.
 */
-   for_each_gt(gt, i915, i)
+   for_each_gt(gt, i915, i) {
if (intel_gt_resume(gt))
goto err_wedged;
  
+		/*
+* TODO: this is a long operation (up to ~200ms) and we don't
+* need to complete it before driver load/resume is done, so it
+* should be handled in a separate thread to unlock userspace
+* submission. However, there are a couple of other pieces that
+* are required for full GSC support that will complicate things
+* a bit, and it is easier to move everything to a worker at the
+* same time, so keep it here for now.
+*/
+   intel_uc_init_hw_late(&gt->uc);
+   }
+
ret = lmem_restore(i915, I915_TTM_BACKUP_ALLOW_GPU);
GEM_WARN_ON(ret);
  
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index cbc8b857d5f7..0e24af5efee9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -172,6 +172,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
  #define I915_GEM_HWS_MIGRATE  (0x42 * sizeof(u32))
  #define I915_GEM_HWS_PXP  0x60
  #define I915_GEM_HWS_PXP_ADDR (I915_GEM_HWS_PXP * sizeof(u32))
+#define I915_GEM_HWS_GSC   0x62
+#define I915_GEM_HWS_GSC_ADDR  (I915_GEM_HWS_GSC * sizeof(u32))
  #define I915_GEM_HWS_SCRATCH  0x80
  
  #define I915_HWS_CSB_BUF0_INDEX		0x10

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index f50ea92910d9..49ebda141266 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -21,6 +21,7 @@
  #define INSTR_CLIENT_SHIFT  29
  #define   INSTR_MI_CLIENT   0x0
  #define   INSTR_BC_CLIENT   0x2
+#define   INS

Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Jason Gunthorpe
On Tue, Nov 22, 2022 at 08:29:05PM +0100, Daniel Vetter wrote:

> You nuke all the ptes. Drivers that move have slightly more than a
> bare struct file, they also have a struct address_space so that
> invalidate_mapping_range() works.

Okay, this is one of the ways that this can be made to work correctly,
as long as you never allow GUP/GUP_fast to succeed on the PTEs. (this
was the DAX mistake)

Jason


Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Daniel Vetter
On Tue, 22 Nov 2022 at 19:50, Jason Gunthorpe  wrote:
>
> On Tue, Nov 22, 2022 at 07:08:25PM +0100, Daniel Vetter wrote:
> > On Tue, 22 Nov 2022 at 19:04, Jason Gunthorpe  wrote:
> > >
> > > On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote:
> > > > tldr; DMA buffers aren't normal memory, so expecting that you can use
> > > > them like that (e.g. that calling get_user_pages works, or that
> > > > they're accounted like any other normal memory) cannot be guaranteed.
> > > >
> > > > Since some userspace only runs on integrated devices, where all
> > > > buffers are actually resident in system memory, there's a huge
> > > > temptation to assume that a struct page is always present and usable,
> > > > as for any other pagecache-backed mmap. This has the potential to
> > > > result in a uapi nightmare.
> > > >
> > > > To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which
> > > > blocks get_user_pages and all the other struct page based
> > > > infrastructure for everyone. In spirit this is the uapi counterpart to
> > > > the kernel-internal CONFIG_DMABUF_DEBUG.
> > > >
> > > > Motivated by a recent patch which wanted to switch the system dma-buf
> > > > heap to vm_insert_page instead of vm_insert_pfn.
> > > >
> > > > v2:
> > > >
> > > > Jason brought up that we also want to guarantee that all ptes have the
> > > > pte_special flag set, to catch fast get_user_pages (on architectures
> > > > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> > > > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> > > >
> > > > From auditing the various functions to insert pfn pte entries
> > > > (vm_insert_pfn_prot, remap_pfn_range and all its callers like
> > > > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> > > > this should be the correct flag to check for.
> > >
> > > I didn't look at how this actually gets used, but it is a bit of a
> > > pain to insert a lifetime controlled object like a struct page as a
> > > special PTE/VM_PFNMAP
> > >
> > > How is the lifetime model implemented here? How do you know when
> > > userspace has finally unmapped the page?
> >
> > The vma has a filp which is the refcounted dma_buf. With dma_buf you
> > never get an individual page it's always the entire object. And it's
> > up to the allocator how exactly it wants to use or not use the page's
> > refcount. So if gup goes in and elevates the refcount, you can break
> > stuff, which is why I'm doing this.
>
> But how does move work?

You nuke all the ptes. Drivers that move have slightly more than a
bare struct file, they also have a struct address_space so that
invalidate_mapping_range() works. Refaulting and any coherency issues
when a refault races against a dma-buf migration is up to the
driver/exporter to handle correctly. None rely on struct page like mm/
moving stuff around for compaction/ksm/numa-balancing/whatever.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: git send-email friendly smtp provider anyone?

2022-11-22 Thread Noralf Trønnes



Den 22.11.2022 19.50, skrev Konstantin Ryabitsev:
> On Tue, Nov 22, 2022 at 06:42:19PM +0100, Noralf Trønnes wrote:
>> The first thing that strikes me is that everyone mentioned in one of the
>> patches gets the entire patchset, even sta...@vger.kernel.org (cc'ed in a
>> fixes patch). The first patch touches a core file and as a result a few
>> drivers, so I've cc'ed the driver maintainers in that patch, but now
>> they get the entire patchset where 5 of 6 patches are about a driver that
>> I maintain. So from their point of view, they see a patchset about a
>> driver they don't care about and a patch touching a core file, but from
>> the subject it's not apparent that it touches their driver. I'm afraid
>> that this might result in none of them looking at that patch. In this
>> particular case it's not that important, but in another case it might be.
> 
> I did some (unscientific) polling among kernel maintainers and, by a vast
> margin, they always prefer to receive the entire series instead of
> cherry-picked patches -- having the entire series helps provide important
> context for the change they are looking at.
> 
> So, this is deliberate and, for now at least, not configurable. Unless you're
> sending 100+ patch series, I doubt anyone will have any problem with receiving
> the whole series instead of individual patches.
> 
>> As for setting up the web endpoint, should I just follow the b4 docs
>> on that?
>>
>> I use b4 version 0.10.1, is that recent enough?
> 
> Yes. There will be a 0.10.2 in the near future, but the incoming fixes
> shouldn't make much difference for the b4 send code.
> 

This is what I got:

$ b4 send --web-auth-verify 
Signing challenge
Submitting verification to https://lkml.kernel.org/_b4_submit
Traceback (most recent call last):
  File "/home/pi/.local/bin/b4", line 8, in 
sys.exit(cmd())
  File "/home/pi/.local/lib/python3.10/site-packages/b4/command.py",
line 341, in cmd
cmdargs.func(cmdargs)
  File "/home/pi/.local/lib/python3.10/site-packages/b4/command.py",
line 86, in cmd_send
b4.ez.cmd_send(cmdargs)
  File "/home/pi/.local/lib/python3.10/site-packages/b4/ez.py", line
1102, in cmd_send
auth_verify(cmdargs)
  File "/home/pi/.local/lib/python3.10/site-packages/b4/ez.py", line
188, in auth_verify
res = ses.post(endpoint, json=req)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 590,
in post
return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 528,
in request
prep = self.prepare_request(req)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 456,
in prepare_request
p.prepare(
  File "/usr/lib/python3/dist-packages/requests/models.py", line 319, in
prepare
self.prepare_body(data, files, json)
  File "/usr/lib/python3/dist-packages/requests/models.py", line 469, in
prepare_body
body = complexjson.dumps(json)
  File "/usr/lib/python3.10/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
  File "/usr/lib/python3.10/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.10/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable

$ python3 --version
Python 3.10.6

Turning on debug output didn't add much:

$ b4 -d send --web-auth-verify 7ad470b4-f531-4632-8093-738d4d3e5d88
Running git --no-pager rev-parse --show-toplevel
Running git --no-pager config -z --get-regexp b4\..*
Running git --no-pager config -z --get-regexp gpg\..*
Running git --no-pager config -z --get-regexp user\..*
Signing challenge
Submitting verification to https://lkml.kernel.org/_b4_submit
Traceback (most recent call last):
  File "/home/pi/.local/bin/b4", line 8, in 
sys.exit(cmd())
  File "/home/pi/.local/lib/python3.10/site-packages/b4/command.py",
line 341, in cmd
cmdargs.func(cmdargs)
  File "/home/pi/.local/lib/python3.10/site-packages/b4/command.py",
line 86, in cmd_send
b4.ez.cmd_send(cmdargs)
  File "/home/pi/.local/lib/python3.10/site-packages/b4/ez.py", line
1102, in cmd_send
auth_verify(cmdargs)
  File "/home/pi/.local/lib/python3.10/site-packages/b4/ez.py", line
188, in auth_verify
res = ses.post(endpoint, json=req)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 590,
in post
return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 528,
in request
prep = self.prepare_request(req)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 456,
in prepare_request
p.prepare(
  File "/usr/lib/python3/dist-packages/requests/models.py", line 319, in
prepare
self.prepare_body(data, files, json)
  File "/usr/lib/python3/dist-packages/requ

[PATCH 6/6] drm/todo: update the debugfs clean up task

2022-11-22 Thread Maíra Canal
The structs drm_debugfs_info and drm_debugfs_entry introduced a new
debugfs structure to DRM, centered on drm_device instead of drm_minor.
Therefore, remove the tasks related to creating a new device-centered
debugfs structure and add a new task to replace the use of
drm_debugfs_create_files() with drm_debugfs_add_file() and
drm_debugfs_add_files().

Signed-off-by: Maíra Canal 
---
 Documentation/gpu/todo.rst | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index b2c6aaf1edf2..f64abf69f341 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -508,17 +508,14 @@ Clean up the debugfs support
 
 There's a bunch of issues with it:
 
-- The drm_info_list ->show() function doesn't even bother to cast to the drm
-  structure for you. This is lazy.
+- Convert drivers to support the drm_debugfs_add_files() function instead of
+  the drm_debugfs_create_files() function.
 
 - We probably want to have some support for debugfs files on crtc/connectors and
   maybe other kms objects directly in core. There's even drm_print support in
   the funcs for these objects to dump kms state, so it's all there. And then the
   ->show() functions should obviously give you a pointer to the right object.
 
-- The drm_info_list stuff is centered on drm_minor instead of drm_device. For
-  anything we want to print drm_device (or maybe drm_file) is the right thing.
-
 - The drm_driver->debugfs_init hooks we have is just an artifact of the old
   midlayered load sequence. DRM debugfs should work more like sysfs, where you
   can create properties/files for an object anytime you want, and the core
@@ -527,8 +524,6 @@ There's a bunch of issues with it:
   this (together with the drm_minor->drm_device move) would allow us to remove
   debugfs_init.
 
-Previous RFC that hasn't landed yet: https://lore.kernel.org/dri-devel/20200513114130.28641-2-wambui.karu...@gmail.com/
-
 Contact: Daniel Vetter
 
 Level: Intermediate
-- 
2.38.1



[PATCH 5/6] drm/vkms: use new debugfs device-centered functions

2022-11-22 Thread Maíra Canal
Replace the use of drm_debugfs_create_files() with the new
drm_debugfs_add_files() function, which centers the debugfs files
management on the drm_device instead of drm_minor.
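
As a rough illustration of what "device-centered" means for a show callback, here is a userspace sketch. It assumes simplified stand-ins: `struct mock_device`, `struct mock_entry`, `mock_config_show()` and `mock_open_and_show()` are hypothetical names and do not reproduce the DRM structures or the seq_file API. The point is only that the callback receives an entry that points straight at the device, with no minor in between.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical device with one piece of state to print. */
struct mock_device {
	int writeback;
};

/* Hypothetical per-file entry: it carries the device pointer directly. */
struct mock_entry {
	struct mock_device *dev;
	const char *name;
	int (*show)(char *buf, size_t len, void *priv);
};

/* Show callback: priv is the entry, and the device is one hop away. */
static int mock_config_show(char *buf, size_t len, void *priv)
{
	struct mock_entry *entry = priv;
	struct mock_device *dev = entry->dev;

	return snprintf(buf, len, "writeback=%d\n", dev->writeback);
}

/* Stand-in for the open path: hand the entry itself to the callback. */
static int mock_open_and_show(struct mock_entry *entry, char *buf, size_t len)
{
	assert(entry != NULL && entry->show != NULL);

	return entry->show(buf, len, entry);
}
```

A caller would fill in a `struct mock_entry` with a device pointer and `mock_config_show`, then pass a buffer to `mock_open_and_show()`.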

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/vkms/vkms_drv.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index 293dbca50c31..15e7e270fba2 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -91,8 +91,8 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
 
 static int vkms_config_show(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct vkms_device *vkmsdev = drm_device_to_vkms_device(dev);
 
seq_printf(m, "writeback=%d\n", vkmsdev->config->writeback);
@@ -102,14 +102,14 @@ static int vkms_config_show(struct seq_file *m, void *data)
return 0;
 }
 
-static const struct drm_info_list vkms_config_debugfs_list[] = {
+static const struct drm_debugfs_info vkms_config_debugfs_list[] = {
{ "vkms_config", vkms_config_show, 0 },
 };
 
 static void vkms_config_debugfs_init(struct drm_minor *minor)
 {
-   drm_debugfs_create_files(vkms_config_debugfs_list, ARRAY_SIZE(vkms_config_debugfs_list),
-minor->debugfs_root, minor);
+   drm_debugfs_add_files(minor->dev, vkms_config_debugfs_list,
+ ARRAY_SIZE(vkms_config_debugfs_list));
 }
 
 static const struct drm_driver vkms_driver = {
-- 
2.38.1



[PATCH 3/6] drm/vc4: use new debugfs device-centered functions

2022-11-22 Thread Maíra Canal
Currently, vc4 has its own debugfs infrastructure that adds the debugfs
files on drm_dev_register(). With the introduction of the new debugfs
functions, replace the vc4 debugfs structure with the DRM debugfs
device-centered function, drm_debugfs_add_file().

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/vc4/vc4_bo.c  |  6 +++---
 drivers/gpu/drm/vc4/vc4_debugfs.c | 30 --
 drivers/gpu/drm/vc4/vc4_drv.c |  1 -
 drivers/gpu/drm/vc4/vc4_drv.h | 16 
 drivers/gpu/drm/vc4/vc4_hdmi.c|  6 +++---
 drivers/gpu/drm/vc4/vc4_hvs.c | 12 ++--
 drivers/gpu/drm/vc4/vc4_v3d.c |  6 +++---
 7 files changed, 19 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 43d9b3a6a352..10823e054b83 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -69,8 +69,8 @@ static void vc4_bo_stats_print(struct drm_printer *p, struct vc4_dev *vc4)
 
 static int vc4_bo_stats_debugfs(struct seq_file *m, void *unused)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct vc4_dev *vc4 = to_vc4_dev(dev);
struct drm_printer p = drm_seq_file_printer(m);
 
@@ -998,7 +998,7 @@ int vc4_bo_debugfs_init(struct drm_minor *minor)
if (!vc4->v3d)
return -ENODEV;
 
-   ret = vc4_debugfs_add_file(minor, "bo_stats",
+   ret = drm_debugfs_add_file(drm, "bo_stats",
   vc4_bo_stats_debugfs, NULL);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/vc4/vc4_debugfs.c b/drivers/gpu/drm/vc4/vc4_debugfs.c
index 19cda4f91a82..53f308357442 100644
--- a/drivers/gpu/drm/vc4/vc4_debugfs.c
+++ b/drivers/gpu/drm/vc4/vc4_debugfs.c
@@ -34,9 +34,9 @@ vc4_debugfs_init(struct drm_minor *minor)
 
 static int vc4_debugfs_regset32(struct seq_file *m, void *unused)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *drm = node->minor->dev;
-   struct debugfs_regset32 *regset = node->info_ent->data;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *drm = entry->dev;
+   struct debugfs_regset32 *regset = entry->file.data;
struct drm_printer p = drm_seq_file_printer(m);
int idx;
 
@@ -50,31 +50,9 @@ static int vc4_debugfs_regset32(struct seq_file *m, void *unused)
return 0;
 }
 
-int vc4_debugfs_add_file(struct drm_minor *minor,
-const char *name,
-int (*show)(struct seq_file*, void*),
-void *data)
-{
-   struct drm_device *dev = minor->dev;
-   struct dentry *root = minor->debugfs_root;
-   struct drm_info_list *file;
-
-   file = drmm_kzalloc(dev, sizeof(*file), GFP_KERNEL);
-   if (!file)
-   return -ENOMEM;
-
-   file->name = name;
-   file->show = show;
-   file->data = data;
-
-   drm_debugfs_create_files(file, 1, root, minor);
-
-   return 0;
-}
-
 int vc4_debugfs_add_regset32(struct drm_minor *minor,
 const char *name,
 struct debugfs_regset32 *regset)
 {
-   return vc4_debugfs_add_file(minor, name, vc4_debugfs_regset32, regset);
+   return drm_debugfs_add_file(minor->dev, name, vc4_debugfs_regset32, regset);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index b66bf7aea632..6e21ae7240ce 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -320,7 +320,6 @@ static int vc4_drm_bind(struct device *dev)
 
drm = &vc4->base;
platform_set_drvdata(pdev, drm);
-   INIT_LIST_HEAD(&vc4->debugfs_list);
 
if (!is_vc5) {
ret = drmm_mutex_init(drm, &vc4->bin_bo_lock);
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 515228682e8e..e2bf76bc0843 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -221,11 +221,6 @@ struct vc4_dev {
struct drm_private_obj hvs_channels;
struct drm_private_obj load_tracker;
 
-   /* List of vc4_debugfs_info_entry for adding to debugfs once
-* the minor is available (after drm_dev_register()).
-*/
-   struct list_head debugfs_list;
-
/* Mutex for binner bo allocation. */
struct mutex bin_bo_lock;
/* Reference count for our binner bo. */
@@ -884,21 +879,10 @@ void vc4_crtc_get_margins(struct drm_crtc_state *state,
 /* vc4_debugfs.c */
 void vc4_debugfs_init(struct drm_minor *minor);
 #ifdef CONFIG_DEBUG_FS
-int vc4_debugfs_add_file(struct drm_minor *minor,
-const char *filename,
-int (*show)(struct seq_file*, void*),
-void *data);
 int vc

[PATCH 4/6] drm/v3d: use new debugfs device-centered functions

2022-11-22 Thread Maíra Canal
Replace the use of drm_debugfs_create_files() with the new
drm_debugfs_add_files() function, which centers the debugfs files
management on the drm_device instead of drm_minor.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_debugfs.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_debugfs.c b/drivers/gpu/drm/v3d/v3d_debugfs.c
index efbde124c296..330669f51fa7 100644
--- a/drivers/gpu/drm/v3d/v3d_debugfs.c
+++ b/drivers/gpu/drm/v3d/v3d_debugfs.c
@@ -79,8 +79,8 @@ static const struct v3d_reg_def v3d_csd_reg_defs[] = {
 
 static int v3d_v3d_debugfs_regs(struct seq_file *m, void *unused)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct v3d_dev *v3d = to_v3d_dev(dev);
int i, core;
 
@@ -126,8 +126,8 @@ static int v3d_v3d_debugfs_regs(struct seq_file *m, void *unused)
 
 static int v3d_v3d_debugfs_ident(struct seq_file *m, void *unused)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct v3d_dev *v3d = to_v3d_dev(dev);
u32 ident0, ident1, ident2, ident3, cores;
int core;
@@ -188,8 +188,8 @@ static int v3d_v3d_debugfs_ident(struct seq_file *m, void *unused)
 
 static int v3d_debugfs_bo_stats(struct seq_file *m, void *unused)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct v3d_dev *v3d = to_v3d_dev(dev);
 
mutex_lock(&v3d->bo_lock);
@@ -204,8 +204,8 @@ static int v3d_debugfs_bo_stats(struct seq_file *m, void *unused)
 
 static int v3d_measure_clock(struct seq_file *m, void *unused)
 {
-   struct drm_info_node *node = (struct drm_info_node *)m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct v3d_dev *v3d = to_v3d_dev(dev);
uint32_t cycles;
int core = 0;
@@ -236,7 +236,7 @@ static int v3d_measure_clock(struct seq_file *m, void *unused)
return 0;
 }
 
-static const struct drm_info_list v3d_debugfs_list[] = {
+static const struct drm_debugfs_info v3d_debugfs_list[] = {
{"v3d_ident", v3d_v3d_debugfs_ident, 0},
{"v3d_regs", v3d_v3d_debugfs_regs, 0},
{"measure_clock", v3d_measure_clock, 0},
@@ -246,7 +246,5 @@ static const struct drm_info_list v3d_debugfs_list[] = {
 void
 v3d_debugfs_init(struct drm_minor *minor)
 {
-   drm_debugfs_create_files(v3d_debugfs_list,
-ARRAY_SIZE(v3d_debugfs_list),
-minor->debugfs_root, minor);
+   drm_debugfs_add_files(minor->dev, v3d_debugfs_list, ARRAY_SIZE(v3d_debugfs_list));
 }
-- 
2.38.1



[PATCH 1/6] drm/debugfs: create device-centered debugfs functions

2022-11-22 Thread Maíra Canal
Introduce the ability to track requests for the addition of DRM debugfs
files at any time and have them added all at once during
drm_dev_register().

Drivers can add DRM debugfs files to a device-managed list and, during
drm_dev_register(), all added files will be created at once.

Now, the drivers can use the functions drm_debugfs_add_file() and
drm_debugfs_add_files() to create DRM debugfs files instead of using the
drm_debugfs_create_files() function.
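
The deferred-registration idea described above can be sketched in userspace C. This is a minimal sketch under stated assumptions, not the patch's implementation: `struct mock_device`, `struct mock_entry`, `mock_add_file()`, `mock_dev_register()` and `mock_dev_release()` are hypothetical names, a plain singly linked list stands in for the kernel list, and "creating" a file just records its name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MOCK_MAX_CREATED 16

/* One pending request to create a file. */
struct mock_entry {
	const char *name;
	struct mock_entry *next;
};

/* Hypothetical device: a pending list plus a record of created files. */
struct mock_device {
	struct mock_entry *pending;
	const char *created[MOCK_MAX_CREATED];
	int ncreated;
};

/* Can be called at any time before registration; nothing is created yet. */
static int mock_add_file(struct mock_device *dev, const char *name)
{
	struct mock_entry *entry = calloc(1, sizeof(*entry));

	if (!entry)
		return -ENOMEM;

	entry->name = name;
	/* Prepend, much as list_add() inserts at the head of a list. */
	entry->next = dev->pending;
	dev->pending = entry;

	return 0;
}

/* At registration time, walk the pending list and create everything at once. */
static void mock_dev_register(struct mock_device *dev)
{
	struct mock_entry *entry;

	for (entry = dev->pending; entry; entry = entry->next) {
		if (dev->ncreated == MOCK_MAX_CREATED)
			break;
		dev->created[dev->ncreated++] = entry->name;
	}
}

/* Free the pending list; the patch instead ties this to the device lifetime. */
static void mock_dev_release(struct mock_device *dev)
{
	while (dev->pending) {
		struct mock_entry *next = dev->pending->next;

		free(dev->pending);
		dev->pending = next;
	}
}
```

Because entries are prepended in this sketch, files come out in the reverse of the order in which they were requested.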

Co-developed-by: Wambui Karuga 
Signed-off-by: Wambui Karuga 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/drm_debugfs.c | 76 +++
 drivers/gpu/drm/drm_drv.c |  3 ++
 include/drm/drm_debugfs.h | 45 +
 include/drm/drm_device.h  | 15 +++
 4 files changed, 139 insertions(+)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index ee445f4605ba..ca27c2b05051 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "drm_crtc_internal.h"
 #include "drm_internal.h"
@@ -151,6 +152,21 @@ static int drm_debugfs_open(struct inode *inode, struct file *file)
return single_open(file, node->info_ent->show, node);
 }
 
+static int drm_debugfs_entry_open(struct inode *inode, struct file *file)
+{
+   struct drm_debugfs_entry *entry = inode->i_private;
+   struct drm_debugfs_info *node = &entry->file;
+
+   return single_open(file, node->show, entry);
+}
+
+static const struct file_operations drm_debugfs_entry_fops = {
+   .owner = THIS_MODULE,
+   .open = drm_debugfs_entry_open,
+   .read = seq_read,
+   .llseek = seq_lseek,
+   .release = single_release,
+};
 
 static const struct file_operations drm_debugfs_fops = {
.owner = THIS_MODULE,
@@ -207,6 +223,7 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 struct dentry *root)
 {
struct drm_device *dev = minor->dev;
+   struct drm_debugfs_entry *entry;
char name[64];
 
INIT_LIST_HEAD(&minor->debugfs_list);
@@ -230,6 +247,11 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
if (dev->driver->debugfs_init)
dev->driver->debugfs_init(minor);
 
+   list_for_each_entry(entry, &dev->debugfs_list, list) {
+   debugfs_create_file(entry->file.name, S_IFREG | S_IRUGO,
+   minor->debugfs_root, entry, &drm_debugfs_entry_fops);
+   }
+
return 0;
 }
 
@@ -281,6 +303,60 @@ void drm_debugfs_cleanup(struct drm_minor *minor)
minor->debugfs_root = NULL;
 }
 
+/**
+ * drm_debugfs_add_file - Add a given file to the DRM device debugfs file list
+ * @dev: drm device for the ioctl
+ * @name: debugfs file name
+ * @show: show callback
+ * @data: driver-private data, should not be device-specific
+ *
+ * Add a given file entry to the DRM device debugfs file list to be created on
+ * drm_debugfs_init.
+ */
+int drm_debugfs_add_file(struct drm_device *dev, const char *name,
+int (*show)(struct seq_file*, void*), void *data)
+{
+   struct drm_debugfs_entry *entry = drmm_kzalloc(dev, sizeof(*entry), GFP_KERNEL);
+
+   if (!entry)
+   return -ENOMEM;
+
+   entry->file.name = name;
+   entry->file.show = show;
+   entry->file.data = data;
+   entry->dev = dev;
+
+   mutex_lock(&dev->debugfs_mutex);
+   list_add(&entry->list, &dev->debugfs_list);
+   mutex_unlock(&dev->debugfs_mutex);
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_debugfs_add_file);
+
+/**
+ * drm_debugfs_add_files - Add an array of files to the DRM device debugfs file list
+ * @dev: drm device for the ioctl
+ * @files: The array of files to create
+ * @count: The number of files given
+ *
+ * Add a given set of debugfs files represented by an array of
+ * &struct drm_debugfs_info in the DRM device debugfs file list.
+ */
+int drm_debugfs_add_files(struct drm_device *dev, const struct drm_debugfs_info *files, int count)
+{
+   int i, ret = 0, err;
+
+   for (i = 0; i < count; i++) {
+   err = drm_debugfs_add_file(dev, files[i].name, files[i].show, files[i].data);
+   if (err)
+   ret = err;
+   }
+
+   return ret;
+}
+EXPORT_SYMBOL(drm_debugfs_add_files);
+
 static int connector_show(struct seq_file *m, void *data)
 {
struct drm_connector *connector = m->private;
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 8214a0b1ab7f..803942008fcb 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -575,6 +575,7 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
mutex_destroy(&dev->clientlist_mutex);
mutex_destroy(&dev->filelist_mutex);
mutex_destroy(&dev->struct_mutex);
+   mutex_destroy(&dev->debugfs_mutex);
drm_legacy_destroy_members(dev);
 }
 
@@ -608,12 +609

[PATCH 2/6] drm: use new debugfs device-centered functions on DRM core files

2022-11-22 Thread Maíra Canal
Replace the use of drm_debugfs_create_files() with the new
drm_debugfs_add_files() function in all DRM core files, centering the
debugfs files management on the drm_device instead of drm_minor.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/drm_atomic.c  | 11 +--
 drivers/gpu/drm/drm_client.c  | 11 +--
 drivers/gpu/drm/drm_debugfs.c | 18 --
 drivers/gpu/drm/drm_framebuffer.c | 11 +--
 drivers/gpu/drm/drm_gem_vram_helper.c | 11 +--
 5 files changed, 28 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index f197f59f6d99..c7f23cf2552c 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -1756,8 +1756,8 @@ EXPORT_SYMBOL(drm_state_dump);
 #ifdef CONFIG_DEBUG_FS
 static int drm_state_info(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = (struct drm_info_node *) m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct drm_printer p = drm_seq_file_printer(m);
 
__drm_state_dump(dev, &p, true);
@@ -1766,14 +1766,13 @@ static int drm_state_info(struct seq_file *m, void *data)
 }
 
 /* any use in debugfs files to dump individual planes/crtc/etc? */
-static const struct drm_info_list drm_atomic_debugfs_list[] = {
+static const struct drm_debugfs_info drm_atomic_debugfs_list[] = {
{"state", drm_state_info, 0},
 };
 
 void drm_atomic_debugfs_init(struct drm_minor *minor)
 {
-   drm_debugfs_create_files(drm_atomic_debugfs_list,
-ARRAY_SIZE(drm_atomic_debugfs_list),
-minor->debugfs_root, minor);
+   drm_debugfs_add_files(minor->dev, drm_atomic_debugfs_list,
+ ARRAY_SIZE(drm_atomic_debugfs_list));
 }
 #endif
diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index fd67efe37c63..262ec64d4397 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -480,8 +480,8 @@ EXPORT_SYMBOL(drm_client_framebuffer_flush);
 #ifdef CONFIG_DEBUG_FS
 static int drm_client_debugfs_internal_clients(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct drm_printer p = drm_seq_file_printer(m);
struct drm_client_dev *client;
 
@@ -493,14 +493,13 @@ static int drm_client_debugfs_internal_clients(struct seq_file *m, void *data)
return 0;
 }
 
-static const struct drm_info_list drm_client_debugfs_list[] = {
+static const struct drm_debugfs_info drm_client_debugfs_list[] = {
{ "internal_clients", drm_client_debugfs_internal_clients, 0 },
 };
 
 void drm_client_debugfs_init(struct drm_minor *minor)
 {
-   drm_debugfs_create_files(drm_client_debugfs_list,
-ARRAY_SIZE(drm_client_debugfs_list),
-minor->debugfs_root, minor);
+   drm_debugfs_add_files(minor->dev, drm_client_debugfs_list,
+ ARRAY_SIZE(drm_client_debugfs_list));
 }
 #endif
diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index ca27c2b05051..83f7530e7b46 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -51,9 +51,8 @@
 
 static int drm_name_info(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = (struct drm_info_node *) m->private;
-   struct drm_minor *minor = node->minor;
-   struct drm_device *dev = minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct drm_master *master;
 
mutex_lock(&dev->master_mutex);
@@ -73,8 +72,8 @@ static int drm_name_info(struct seq_file *m, void *data)
 
 static int drm_clients_info(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = (struct drm_info_node *) m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
struct drm_file *priv;
kuid_t uid;
 
@@ -125,8 +124,8 @@ static int drm_gem_one_name_info(int id, void *ptr, void *data)
 
 static int drm_gem_name_info(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = (struct drm_info_node *) m->private;
-   struct drm_device *dev = node->minor->dev;
+   struct drm_debugfs_entry *entry = m->private;
+   struct drm_device *dev = entry->dev;
 
seq_printf(m, "  name size handles refcount\n");
 
@@ -137,7 +136,7 @@ static int drm_gem_name_info(struct seq_file *m, void *data)
return 0;
 }
 
-static const struct drm_info_list drm_debugfs_list[] = {
+static const struct drm_debugfs_info drm_debugf

[PATCH 0/6] Introduce debugfs device-centered functions

2022-11-22 Thread Maíra Canal
This series introduces the initial structure to make DRM debugfs more
device-centered and it is the first step to drop the
drm_driver->debugfs_init hooks in the future [1].

Currently, DRM debugfs files are created using drm_debugfs_create_files()
on request. The first patch of this series makes it possible for DRM devices
to create debugfs files during drm_dev_register(). To do so, it introduces
two new functions that drivers can use: drm_debugfs_add_files()
and drm_debugfs_add_file(). The requests are added to a list and the files
are created all at once during drm_dev_register(). The first patch is based
on this RFC series [2].
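
The deferred scheme described above can be illustrated with a small
userspace model. This is only a sketch under assumptions: every name below
(model_device, model_add_file, model_register) is hypothetical and is not the
kernel API introduced by the series; it only shows requests being queued on
the device and then materialized in a single pass at registration time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_entry {
	const char *name;
	int (*show)(void *dev);      /* callback is handed the device */
	struct model_entry *next;
};

struct model_device {
	struct model_entry *pending; /* requests queued before registration */
	int registered;
	int files_created;
};

/* Queue a request; nothing is created yet. Returns -1 if allocation fails. */
static int model_add_file(struct model_device *dev, const char *name,
			  int (*show)(void *dev))
{
	struct model_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->name = name;
	e->show = show;
	e->next = dev->pending;
	dev->pending = e;
	return 0;
}

/* Registration walks the queue and "creates" every file in one pass. */
static void model_register(struct model_device *dev)
{
	struct model_entry *e = dev->pending;

	while (e) {
		struct model_entry *next = e->next;

		dev->files_created++;
		free(e);
		e = next;
	}
	dev->pending = NULL;
	dev->registered = 1;
}
```

In this model a caller may queue files at any point before registration, and
an allocation failure is reported to the caller instead of being dropped,
which mirrors the "returns memory allocation errors" point made in this
cover letter.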

The main difference between the RFC series and the current series is the
creation of a new fops structure to accommodate the new structs and, also,
the creation of a new drm_debugfs_open. Moreover, the new series uses
device-managed allocation, returns memory allocation errors, and converts
more drivers to the new structure.

Apart from the first patch, the remaining patches convert some drivers
to the new DRM debugfs structure, and the last patch updates the TODO task
related to it.

[1] https://cgit.freedesktop.org/drm/drm/tree/Documentation/gpu/todo.rst#n506
[2] https://lore.kernel.org/dri-devel/20200513114130.28641-2-wambui.karu...@gmail.com/

Best Regards,
- Maíra Canal

Maíra Canal (6):
  drm/debugfs: create device-centered debugfs functions
  drm: use new debugfs device-centered functions on DRM core files
  drm/vc4: use new debugfs device-centered functions
  drm/v3d: use new debugfs device-centered functions
  drm/vkms: use new debugfs device-centered functions
  drm/todo: update the debugfs clean up task

 Documentation/gpu/todo.rst|  9 +--
 drivers/gpu/drm/drm_atomic.c  | 11 ++--
 drivers/gpu/drm/drm_client.c  | 11 ++--
 drivers/gpu/drm/drm_debugfs.c | 94 ---
 drivers/gpu/drm/drm_drv.c |  3 +
 drivers/gpu/drm/drm_framebuffer.c | 11 ++--
 drivers/gpu/drm/drm_gem_vram_helper.c | 11 ++--
 drivers/gpu/drm/v3d/v3d_debugfs.c | 22 +++
 drivers/gpu/drm/vc4/vc4_bo.c  |  6 +-
 drivers/gpu/drm/vc4/vc4_debugfs.c | 30 ++---
 drivers/gpu/drm/vc4/vc4_drv.c |  1 -
 drivers/gpu/drm/vc4/vc4_drv.h | 16 -
 drivers/gpu/drm/vc4/vc4_hdmi.c|  6 +-
 drivers/gpu/drm/vc4/vc4_hvs.c | 12 ++--
 drivers/gpu/drm/vc4/vc4_v3d.c |  6 +-
 drivers/gpu/drm/vkms/vkms_drv.c   | 10 +--
 include/drm/drm_debugfs.h | 45 +
 include/drm/drm_device.h  | 15 +
 18 files changed, 203 insertions(+), 116 deletions(-)

-- 
2.38.1



Re: [Intel-gfx] [PATCH 3/6] drm/i915/gsc: GSC firmware loading

2022-11-22 Thread Rodrigo Vivi
On Mon, Nov 21, 2022 at 03:16:14PM -0800, Daniele Ceraolo Spurio wrote:
> GSC FW is loaded by submitting a dedicated command via the GSC engine.
> The memory area used for loading the FW is then re-purposed as local
> memory for the GSC itself, so we use a separate allocation instead of
> using the one where we keep the firmware stored for reload.
> 
> The GSC is not reset as part of GT reset, so we only need to load it on
> first boot and S3/S4 exit.
> 
> Note that the GSC load takes a lot of time (up to a few hundred ms).
> This patch loads it serially as part of driver init/resume, but, given
> that GSC is only required for PM and content-protection features
> (media C6, PXP, HDCP), we could move the load to a worker thread to unblock
> non-CP userspace submissions earlier. This will be done as a follow up
> step, because there are extra init steps required to actually make use of
> the GSC (including a mei component) and it will be cleaner (and easier to
> review) if we implement the async load once all the pieces we need for GSC
> to work are in place. A TODO has been added to the code to mark this
> intention.
> 
> Bspec: 63347, 65346
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Alan Previn 
> Cc: John Harrison 
> ---
>  drivers/gpu/drm/i915/Makefile|   1 +
>  drivers/gpu/drm/i915/gem/i915_gem_pm.c   |  14 +-
>  drivers/gpu/drm/i915/gt/intel_engine.h   |   2 +
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   7 +
>  drivers/gpu/drm/i915/gt/intel_gt.c   |  11 ++
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c| 186 +++
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h|  13 ++
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c|  35 +++-
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h|   7 +
>  drivers/gpu/drm/i915/gt/uc/intel_uc.c|  15 ++
>  drivers/gpu/drm/i915/gt/uc/intel_uc.h|   2 +
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  20 +-
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h |   1 +
>  13 files changed, 307 insertions(+), 7 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 92d37cf71e16..1d45a6f451fa 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -206,6 +206,7 @@ i915-y += gt/uc/intel_uc.o \
> gt/uc/intel_huc.o \
> gt/uc/intel_huc_debugfs.o \
> gt/uc/intel_huc_fw.o \
> +   gt/uc/intel_gsc_fw.o \
> gt/uc/intel_gsc_uc.o
>  
>  # graphics system controller (GSC) support
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> index 0d812f4d787d..f77eb4009aba 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> @@ -232,10 +232,22 @@ void i915_gem_resume(struct drm_i915_private *i915)
>* guarantee that the context image is complete. So let's just reset
>* it and start again.
>*/
> - for_each_gt(gt, i915, i)
> + for_each_gt(gt, i915, i) {
>   if (intel_gt_resume(gt))
>   goto err_wedged;
>  
> + /*
> +  * TODO: this is a long operation (up to ~200ms) and we don't
> +  * need to complete it before driver load/resume is done, so it
> +  * should be handled in a separate thread to unlock userspace
> +  * submission. However, there are a couple of other pieces that
> +  * are required for full GSC support that will complicate things
> +  * a bit, and it is easier to move everything to a worker at the
> +  * same time, so keep it here for now.
> +  */
> + intel_uc_init_hw_late(&gt->uc);
> + }
> +
>   ret = lmem_restore(i915, I915_TTM_BACKUP_ALLOW_GPU);
>   GEM_WARN_ON(ret);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index cbc8b857d5f7..0e24af5efee9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -172,6 +172,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
>  #define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32))
>  #define I915_GEM_HWS_PXP 0x60
>  #define I915_GEM_HWS_PXP_ADDR(I915_GEM_HWS_PXP * sizeof(u32))
> +#define I915_GEM_HWS_GSC 0x62
> +#define I915_GEM_HWS_GSC_ADDR(I915_GEM_HWS_GSC * sizeof(u32))
>  #define I915_GEM_HWS_SCRATCH 0x80
>  
>  #define I915_HWS_CSB_BUF0_INDEX  0x10
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index f50ea92910d9..49ebda141266 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -21,6 +21,7 @@
>  #define INSTR_

[PATCH v2 4/4] Revert "drm/i915: Improve on suspend / resume time with VT-d enabled"

2022-11-22 Thread Andi Shyti
This reverts commit 2ef6efa79fecd5e3457b324155d35524d95f2b6b.

Checking for the presence of IRST (Intel Rapid Start Technology)
through ACPI to decide whether or not to rebuild the GGTT
puts us at the mercy of the boot firmware and makes us rely
unnecessarily on third parties.

Because we no longer fill the entire GGTT with scratch pages, we
don't need this hack anymore.

Signed-off-by: Andi Shyti 
Cc: Thomas Hellström 
Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 ++--
 drivers/gpu/drm/i915/gt/intel_gtt.h  | 24 --
 drivers/gpu/drm/i915/i915_driver.c   | 16 ---
 3 files changed, 13 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 5ccec5c9206d2..9d76a573255f6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -26,13 +26,6 @@
 #include "intel_gtt.h"
 #include "gen8_ppgtt.h"
 
-static inline bool suspend_retains_ptes(struct i915_address_space *vm)
-{
-   return GRAPHICS_VER(vm->i915) >= 8 &&
-   !HAS_LMEM(vm->i915) &&
-   vm->is_ggtt;
-}
-
 static void i915_ggtt_color_adjust(const struct drm_mm_node *node,
   unsigned long color,
   u64 *start,
@@ -104,23 +97,6 @@ int i915_ggtt_init_hw(struct drm_i915_private *i915)
return 0;
 }
 
-/*
- * Return the value of the last GGTT pte cast to an u64, if
- * the system is supposed to retain ptes across resume. 0 otherwise.
- */
-static u64 read_last_pte(struct i915_address_space *vm)
-{
-   struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
-   gen8_pte_t __iomem *ptep;
-
-   if (!suspend_retains_ptes(vm))
-   return 0;
-
-   GEM_BUG_ON(GRAPHICS_VER(vm->i915) < 8);
-   ptep = (typeof(ptep))ggtt->gsm + (ggtt_total_entries(ggtt) - 1);
-   return readq(ptep);
-}
-
 /**
  * i915_ggtt_suspend_vm - Suspend the memory mappings for a GGTT or DPT VM
  * @vm: The VM to suspend the mappings for
@@ -184,10 +160,7 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
i915_gem_object_unlock(obj);
}
 
-   if (!suspend_retains_ptes(vm))
-   vm->clear_range(vm, 0, vm->total);
-   else
-   i915_vm_to_ggtt(vm)->probed_pte = read_last_pte(vm);
+   vm->clear_range(vm, 0, vm->total);
 
vm->skip_pte_rewrite = save_skip_rewrite;
 
@@ -536,8 +509,6 @@ static int init_ggtt(struct i915_ggtt *ggtt)
struct drm_mm_node *entry;
int ret;
 
-   ggtt->pte_lost = true;
-
/*
 * GuC requires all resources that we're sharing with it to be placed in
 * non-WOPCM memory. If GuC is not present or not in use we still need a
@@ -1236,20 +1207,11 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
 {
struct i915_vma *vma;
bool write_domain_objs = false;
-   bool retained_ptes;
 
drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt);
 
-   /*
-* First fill our portion of the GTT with scratch pages if
-* they were not retained across suspend.
-*/
-   retained_ptes = suspend_retains_ptes(vm) &&
-   !i915_vm_to_ggtt(vm)->pte_lost &&
-   !GEM_WARN_ON(i915_vm_to_ggtt(vm)->probed_pte != read_last_pte(vm));
-
-   if (!retained_ptes)
-   vm->clear_range(vm, 0, vm->total);
+   /* First fill our portion of the GTT with scratch pages */
+   vm->clear_range(vm, 0, vm->total);
 
/* clflush objects bound into the GGTT and rebind them. */
list_for_each_entry(vma, &vm->bound_list, vm_link) {
@@ -1258,16 +1220,16 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
atomic_read(&vma->flags) & I915_VMA_BIND_MASK;
 
GEM_BUG_ON(!was_bound);
-   if (!retained_ptes) {
-   /*
-* Clear the bound flags of the vma resource to allow
-* ptes to be repopulated.
-*/
-   vma->resource->bound_flags = 0;
-   vma->ops->bind_vma(vm, NULL, vma->resource,
-  obj ? obj->cache_level : 0,
-  was_bound);
-   }
+
+   /*
+* Clear the bound flags of the vma resource to allow
+* ptes to be repopulated.
+*/
+   vma->resource->bound_flags = 0;
+   vma->ops->bind_vma(vm, NULL, vma->resource,
+  obj ? obj->cache_level : 0,
+  was_bound);
+
if (obj) { /* only used during resume => exclusive access */
write_domain_objs |= fetch_and_zero(&obj->write_domain);
obj->read_domains |= I915_GEM_DOMAIN_GTT;
@@ -

[PATCH v2 3/4] drm/i915: Refine VT-d scanout workaround

2022-11-22 Thread Andi Shyti
From: Chris Wilson 

VT-d may cause overfetch of the scanout PTE, both before and after the
vma (depending on the scanout orientation). bspec recommends that we
provide a tile-row in either direction, and suggests using 168 PTE,
warning that the accesses will wrap around the ends of the GGTT.
Currently, we fill the entire GGTT with scratch pages when using VT-d to
always ensure there are valid entries around every vma, including
scanout. However, writing every PTE is slow as on recent devices we
perform 8MiB of uncached writes, incurring an extra 100ms during resume.

If instead we focus on only putting guard pages around scanout, we can
avoid touching the whole GGTT. To avoid having to introduce extra nodes
around each scanout vma, we adjust the scanout drm_mm_node to be smaller
than the allocated space, and fix up the extra PTE during dma binding.
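
The arithmetic behind this trade-off can be sketched in plain C. This is an
illustrative model only: the helper names are hypothetical, the 4 KiB GGTT
page size and 8-byte PTE are assumptions about the hardware in question, and
the 4 GiB GGTT used in the usage note is simply the size for which a full
fill amounts to the 8MiB of writes mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_GTT_PAGE_SIZE  4096u /* assumption: 4 KiB GGTT pages */
#define MODEL_PTE_BYTES      8u    /* assumption: 64-bit PTE */
#define MODEL_VTD_GUARD_PTES 168u  /* value quoted in the commit message */

/* Bytes of PTE writes needed to fill a whole GGTT with scratch pages. */
static uint64_t model_full_fill_bytes(uint64_t ggtt_size)
{
	return ggtt_size / MODEL_GTT_PAGE_SIZE * MODEL_PTE_BYTES;
}

/* Guard in bytes: the larger of 168 pages and one tile row. */
static uint32_t model_vtd_guard(uint32_t tile_row_size)
{
	uint32_t guard = MODEL_VTD_GUARD_PTES * MODEL_GTT_PAGE_SIZE;

	return tile_row_size > guard ? tile_row_size : guard;
}

/* PTEs written for one scanout of `size` bytes, guards on both sides. */
static uint32_t model_ptes_with_guard(uint32_t size, uint32_t guard)
{
	return (size + 2u * guard) / MODEL_GTT_PAGE_SIZE;
}
```

Under these assumptions a 4 GiB GGTT costs 8 MiB of PTE writes to fill
completely, whereas an 8 MiB linear scanout with the default guard touches
only 2048 + 2 * 168 = 2384 entries.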

Signed-off-by: Chris Wilson 
Signed-off-by: Tejas Upadhyay 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 13 +++
 drivers/gpu/drm/i915/gt/intel_ggtt.c   | 25 +-
 drivers/gpu/drm/i915/i915_vma.c|  9 
 3 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d44a152ce6800..882b91519f92b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -17,6 +17,8 @@
 #include "i915_gem_object.h"
 #include "i915_vma.h"
 
+#define VTD_GUARD (168u * I915_GTT_PAGE_SIZE) /* 168 or tile-row PTE padding */
+
 static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
struct drm_i915_private *i915 = to_i915(obj->base.dev);
@@ -424,6 +426,17 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
if (ret)
return ERR_PTR(ret);
 
+   /* VT-d may overfetch before/after the vma, so pad with scratch */
+   if (intel_scanout_needs_vtd_wa(i915)) {
+   unsigned int guard = VTD_GUARD;
+
+   if (i915_gem_object_is_tiled(obj))
+   guard = max(guard,
+   i915_gem_object_get_tile_row_size(obj));
+
+   flags |= PIN_OFFSET_GUARD | guard;
+   }
+
/*
 * As the user may map the buffer once pinned in the display plane
 * (e.g. libkms for the bootup splash), we have to ensure that we
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 133710258eae6..5ccec5c9206d2 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -367,27 +367,6 @@ static void nop_clear_range(struct i915_address_space *vm,
 {
 }
 
-static void gen8_ggtt_clear_range(struct i915_address_space *vm,
- u64 start, u64 length)
-{
-   struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
-   unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
-   unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
-   const gen8_pte_t scratch_pte = vm->scratch[0]->encode;
-   gen8_pte_t __iomem *gtt_base =
-   (gen8_pte_t __iomem *)ggtt->gsm + first_entry;
-   const int max_entries = ggtt_total_entries(ggtt) - first_entry;
-   int i;
-
-   if (WARN(num_entries > max_entries,
-"First entry = %d; Num entries = %d (max=%d)\n",
-first_entry, num_entries, max_entries))
-   num_entries = max_entries;
-
-   for (i = 0; i < num_entries; i++)
-   gen8_set_pte(&gtt_base[i], scratch_pte);
-}
-
 static void bxt_vtd_ggtt_wa(struct i915_address_space *vm)
 {
/*
@@ -959,8 +938,6 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
ggtt->vm.cleanup = gen6_gmch_remove;
ggtt->vm.insert_page = gen8_ggtt_insert_page;
ggtt->vm.clear_range = nop_clear_range;
-   if (intel_scanout_needs_vtd_wa(i915))
-   ggtt->vm.clear_range = gen8_ggtt_clear_range;
 
ggtt->vm.insert_entries = gen8_ggtt_insert_entries;
 
@@ -1121,7 +1098,7 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt)
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
ggtt->vm.clear_range = nop_clear_range;
-   if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
+   if (!HAS_FULL_PPGTT(i915))
ggtt->vm.clear_range = gen6_ggtt_clear_range;
ggtt->vm.insert_page = gen6_ggtt_insert_page;
ggtt->vm.insert_entries = gen6_ggtt_insert_entries;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 457e35e03895f..840c7daf8bb70 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -677,6 +677,10 @@ bool i915_vma_misplaced(const struct i915_vma *vma,
i915_vma_offset(vma) != (flags & PIN_OFFSET_MASK))
return true;
 
+   if (flags & PIN_OFFSET_GUARD &&
+   v

[PATCH v2 2/4] drm/i915: Introduce guard pages to i915_vma

2022-11-22 Thread Andi Shyti
From: Chris Wilson 

Introduce the concept of padding the i915_vma with guard pages before
and after. The major consequence is that all ordinary uses of i915_vma
must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size
directly, as the drm_mm_node will include the guard pages that surround
our object.

The biggest conundrum is how exactly to mix requesting a fixed address
with guard pages, particularly through the existing uABI. The user does
not know about guard pages, so such must be transparent to the user, and
so the execobj.offset must be that of the object itself excluding the
guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages.
The caveat is that some placements will be impossible with guard pages,
as wrap arounds need to be avoided, and the vma itself will require a
larger node. We must not report EINVAL but ENOSPC as these are unavailable
locations within the GTT rather than conflicting user requirements.

In the next patch, we start using guard pages for scanout objects. While
these are limited to GGTT vma, on a few platforms these vma (or at least
an alias of the vma) are shared with userspace, so we may leak the
existence of such guards if we are not careful to ensure that the
execobj.offset is transparent and excludes the guards. (On such platforms,
like ivb, without full-ppgtt, userspace has to use relocations so the
presence of more untouchable regions within its GTT should be of no further
issue.)
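
The relationship between the node and the user-visible placement can be
modelled in a few lines of C. This is a sketch under assumptions, not the
driver code: the struct and function names are hypothetical, and the bounds
check is one plausible way to express "some placements will be impossible
with guard pages" while returning ENOSPC as described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model types; the node covers object plus both guards. */
struct model_node { uint64_t start, size; };
struct model_vma  { struct model_node node; uint64_t guard; };

/* User-visible offset: start of the object, excluding the leading guard. */
static uint64_t model_vma_offset(const struct model_vma *vma)
{
	return vma->node.start + vma->guard;
}

/* User-visible size: node size minus the guard on each side. */
static uint64_t model_vma_size(const struct model_vma *vma)
{
	return vma->node.size - 2 * vma->guard;
}

/*
 * Fixed placement: `offset` is where the object itself must start.
 * Locations that leave no room for a guard are unavailable, not invalid,
 * hence -ENOSPC rather than -EINVAL.
 */
static int model_place_fixed(struct model_vma *vma, uint64_t offset,
			     uint64_t size, uint64_t guard, uint64_t gtt_end)
{
	if (offset < guard)
		return -ENOSPC; /* leading guard would wrap below zero */
	if (size > gtt_end || guard > gtt_end - size ||
	    offset > gtt_end - size - guard)
		return -ENOSPC; /* trailing guard would run past the end */

	vma->guard = guard;
	vma->node.start = offset - guard;
	vma->node.size = size + 2 * guard;
	return 0;
}
```

With a guard of one page, an object fixed at 0x100000 occupies a node
starting at 0xff000, yet model_vma_offset() still reports 0x100000, which is
the transparency the uABI requires.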

Signed-off-by: Chris Wilson 
Signed-off-by: Tejas Upadhyay 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 14 
 drivers/gpu/drm/i915/i915_gem_gtt.h  |  3 ++-
 drivers/gpu/drm/i915/i915_vma.c  | 27 ++--
 drivers/gpu/drm/i915/i915_vma.h  |  5 +++--
 drivers/gpu/drm/i915/i915_vma_resource.c |  4 ++--
 drivers/gpu/drm/i915/i915_vma_resource.h |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h|  3 ++-
 7 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 8145851ad23d5..133710258eae6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -287,8 +287,11 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 */
 
gte = (gen8_pte_t __iomem *)ggtt->gsm;
-   gte += vma_res->start / I915_GTT_PAGE_SIZE;
-   end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
+   gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE;
+   end = gte + vma_res->guard / I915_GTT_PAGE_SIZE;
+   while (gte < end)
+   gen8_set_pte(gte++, vm->scratch[0]->encode);
+   end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
 
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
gen8_set_pte(gte++, pte_encode | addr);
@@ -338,9 +341,12 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
dma_addr_t addr;
 
gte = (gen6_pte_t __iomem *)ggtt->gsm;
-   gte += vma_res->start / I915_GTT_PAGE_SIZE;
-   end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
+   gte += (vma_res->start - vma_res->guard) / I915_GTT_PAGE_SIZE;
 
+   end = gte + vma_res->guard / I915_GTT_PAGE_SIZE;
+   while (gte < end)
+   iowrite32(vm->scratch[0]->encode, gte++);
+   end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
iowrite32(vm->pte_encode(addr, level, flags), gte++);
GEM_BUG_ON(gte > end);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5ddaa..2434197830523 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -44,7 +44,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 #define PIN_HIGH   BIT_ULL(5)
 #define PIN_OFFSET_BIASBIT_ULL(6)
 #define PIN_OFFSET_FIXED   BIT_ULL(7)
-#define PIN_VALIDATE   BIT_ULL(8) /* validate placement only, no need to call unpin() */
+#define PIN_OFFSET_GUARD   BIT_ULL(8)
+#define PIN_VALIDATE   BIT_ULL(9) /* validate placement only, no need to call unpin() */
 
 #define PIN_GLOBAL BIT_ULL(10) /* I915_VMA_GLOBAL_BIND */
 #define PIN_USER   BIT_ULL(11) /* I915_VMA_LOCAL_BIND */
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 2232118babeb3..457e35e03895f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -419,7 +419,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
   obj->mm.rsgt, i915_gem_object_is_readonly(obj),
   i915_gem_object_is_lmem(obj), obj->mm.region,
   vma->ops, vma->private, __i915_vma_offset(vma),
-
