Re: [PATCH v10 2/2] drm: add kms driver for loongson display controller

2023-04-10 Thread Sui Jingfeng

Hi,

On 2023/4/4 22:10, Emil Velikov wrote:

+static void lsdc_crtc_reset(struct drm_crtc *crtc)
+{
+   struct lsdc_display_pipe *dispipe = crtc_to_display_pipe(crtc);
+   struct drm_device *ddev = crtc->dev;
+   struct lsdc_device *ldev = to_lsdc(ddev);
+   struct lsdc_crtc_state *priv_crtc_state;
+   unsigned int index = dispipe->index;
+   u32 val;
+
+   val = LSDC_PF_XRGB | CFG_RESET_N;
+   if (ldev->descp->chip == CHIP_LS7A2000)
+   val |= LSDC_DMA_STEP_64_BYTES;
+
+   lsdc_crtc_wreg32(ldev, LSDC_CRTC0_CFG_REG, index, val);
+
+   if (ldev->descp->chip == CHIP_LS7A2000) {
+   val = PHY_CLOCK_EN | PHY_DATA_EN;
+   lsdc_crtc_wreg32(ldev, LSDC_CRTC0_PANEL_CONF_REG, index, val);
+   }
+

AFAICT no other driver touches the HW in their reset callback. Should
the above be moved to another callback?


You may be correct in 95% of all cases.

After reading the comment on the reset callback of struct drm_crtc_funcs in
drm_crtc.h, it seems that it does not prohibit us from touching the hardware.
I copy that comment here to make it easier to read:

    /*
     * @reset:
     *
     * Reset CRTC hardware and software state to off. This function isn't
     * called by the core directly, only through drm_mode_config_reset().
     * It's not a helper hook only for historical reasons.
     *
     * Atomic drivers can use drm_atomic_helper_crtc_reset() to reset
     * atomic state using this hook.
     */


It seems allowable to reset the CRTC hardware in this callback, doesn't it?

What we know is that this reset callback (and others, such as the encoder's
reset) is called by drm_mode_config_reset(). It is the first set of callbacks
to be called, before the other hardware-related callbacks.

I had not looked at how other drivers implement this callback; after you
mentioned it, I skimmed over a few and found that tilcdc also writes to the
hardware in its tilcdc_crtc_reset() function. See drm/tilcdc/tilcdc_crtc.c.

In addition, the Loongson platform supports efifb. In order to light up the
monitor during the firmware and boot stages, the firmware touches the display
hardware registers directly. After efifb gets kicked out and the drm/loongson
driver takes over the hardware, those register settings still remain in the
hardware registers, and they may no longer be correct for subsequent
operations. What we are doing here is giving the hardware a basic, healthy
state, ready to be updated further. As the reset callback is called very
early, we found it to be the best fit.

The reset callback also gets called on resume (S3).

The problem is that we have not found a better place to move those settings
to at the moment.
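
For reference, the part of lsdc_crtc_reset() not shown in the quote above is
the usual software-state reset. A rough sketch only, assuming lsdc_crtc_state
embeds drm_crtc_state as its "base" member (as the quoted declaration of
priv_crtc_state suggests):

	if (crtc->state)
		crtc->funcs->atomic_destroy_state(crtc, crtc->state);

	priv_crtc_state = kzalloc(sizeof(*priv_crtc_state), GFP_KERNEL);
	if (priv_crtc_state)
		__drm_atomic_helper_crtc_reset(crtc, &priv_crtc_state->base);
	else
		__drm_atomic_helper_crtc_reset(crtc, NULL);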




linux-next: build failure after merge of the driver-core tree

2023-04-10 Thread Stephen Rothwell
Hi all,

After merging the driver-core tree, today's linux-next build (x86_64
allmodconfig) failed like this:

In file included from include/linux/linkage.h:7,
 from include/linux/kernel.h:17,
 from drivers/accel/qaic/mhi_qaic_ctrl.c:4:
drivers/accel/qaic/mhi_qaic_ctrl.c: In function 'mhi_qaic_ctrl_init':
include/linux/export.h:27:22: error: passing argument 1 of 'class_create' from 
incompatible pointer type [-Werror=incompatible-pointer-types]
   27 | #define THIS_MODULE (&__this_module)
  | ~^~~
  |  |
  |  struct module *
drivers/accel/qaic/mhi_qaic_ctrl.c:544:38: note: in expansion of macro 
'THIS_MODULE'
  544 | mqc_dev_class = class_create(THIS_MODULE, 
MHI_QAIC_CTRL_DRIVER_NAME);
  |  ^~~
In file included from include/linux/device.h:31,
 from include/linux/mhi.h:9,
 from drivers/accel/qaic/mhi_qaic_ctrl.c:5:
include/linux/device/class.h:229:54: note: expected 'const char *' but argument 
is of type 'struct module *'
  229 | struct class * __must_check class_create(const char *name);
  |  ^~~~
drivers/accel/qaic/mhi_qaic_ctrl.c:544:25: error: too many arguments to 
function 'class_create'
  544 | mqc_dev_class = class_create(THIS_MODULE, 
MHI_QAIC_CTRL_DRIVER_NAME);
  | ^~~~
include/linux/device/class.h:229:29: note: declared here
  229 | struct class * __must_check class_create(const char *name);
  | ^~~~

Caused by commit

  1aaba11da9aa ("driver core: class: remove module * from class_create()")

interacting with commit

  566fc96198b4 ("accel/qaic: Add mhi_qaic_cntl")

from the drm tree.

I have applied the following merge fix patch for today.

From: Stephen Rothwell 
Date: Tue, 11 Apr 2023 14:16:57 +1000
Subject: [PATCH] fixup for "driver core: class: remove module * from 
class_create()"

interacting with "accel/qaic: Add mhi_qaic_cntl"

Signed-off-by: Stephen Rothwell 
---
 drivers/accel/qaic/mhi_qaic_ctrl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/qaic/mhi_qaic_ctrl.c 
b/drivers/accel/qaic/mhi_qaic_ctrl.c
index 0c7e571f1f12..96db1580c72d 100644
--- a/drivers/accel/qaic/mhi_qaic_ctrl.c
+++ b/drivers/accel/qaic/mhi_qaic_ctrl.c
@@ -541,7 +541,7 @@ int mhi_qaic_ctrl_init(void)
return ret;
 
mqc_dev_major = ret;
-   mqc_dev_class = class_create(THIS_MODULE, MHI_QAIC_CTRL_DRIVER_NAME);
+   mqc_dev_class = class_create(MHI_QAIC_CTRL_DRIVER_NAME);
if (IS_ERR(mqc_dev_class)) {
ret = PTR_ERR(mqc_dev_class);
goto unregister_chrdev;
-- 
2.39.2

-- 
Cheers,
Stephen Rothwell


pgp_BrfPgv3IY.pgp
Description: OpenPGP digital signature


[PATCH 7/9] drm/i915: use pat_index instead of cache_level

2023-04-10 Thread fei . yang
From: Fei Yang 

Currently the KMD is using enum i915_cache_level to set the caching policy for
buffer objects. This is flaky because the PAT index, which really controls
the caching behavior in the PTE, has far more levels than what's defined in
the enum. In addition, the PAT index is platform dependent, so having to
translate between i915_cache_level and the PAT index is not reliable, and it
makes the code more complicated.

From the UMD's perspective there is also a need to set the caching policy for
performance fine tuning. It's much easier for the UMD to use the PAT index
directly, because the behavior of each PAT index is clearly defined in the
Bspec. Having the abstracted i915_cache_level sitting in between would only
cause more ambiguity.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For these simple cases, using cache_level
would help simplify the code.

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 27 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 52 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 25 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 76 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 82 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 20 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 ++---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 36 files changed, 383 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index c5eacfdba1a5..7c5fddb203ba 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void dpt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
-   enum i915_cache_level level,
+   unsigned int pat_index,
u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
 
gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-vm->pte_encode(addr, level, flags));
+vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
   struct i915_vma_resource *vma_res,
-  enum i915_cache_level level,
+  unsigned int pat_index,
   u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
-   const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
struct sgt_iter sgt_iter;
dma_addr_t addr;
int i;
@@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
 static void dpt_bind_vma(struct i915_address_space *vm,
 struct i915_vm_pt_stash *stash,
 struct i915_vma_resource 

[PATCH 5/9] drm/i915/mtl: end support for set caching ioctl

2023-04-10 Thread fei . yang
From: Fei Yang 

The design is to keep a buffer object's caching policy immutable throughout
its life cycle. This patch ends support for the set caching ioctl from MTL
onward. While doing that we also make BOs 1-way coherent at creation time,
because the GPU no longer automatically snoops the CPU cache.

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 9 -
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d2d5a24301b2..bb3575b1479f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -337,6 +337,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void 
*data,
if (IS_DGFX(i915))
return -ENODEV;
 
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   return -EOPNOTSUPP;
+
switch (args->caching) {
case I915_CACHING_NONE:
level = I915_CACHE_NONE;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..e602c323896b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -601,7 +601,14 @@ static int shmem_object_init(struct intel_memory_region 
*mem,
obj->write_domain = I915_GEM_DOMAIN_CPU;
obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (HAS_LLC(i915))
+   /*
+* MTL doesn't snoop the CPU cache by default for GPU access (namely
+* 1-way coherency). However some UMDs currently depend on that.
+* Make 1-way coherent the default setting for MTL. A follow-up
+* patch will extend the GEM_CREATE uAPI to allow UMDs to specify
+* the caching mode at BO creation time.
+*/
+   if (HAS_LLC(i915) || (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)))
/* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
-- 
2.25.1



[PATCH 0/9] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-10 Thread fei . yang
From: Fei Yang 

The series includes patches needed to enable MTL.
Also add new extension for GEM_CREATE uAPI to let
user space set cache policy for buffer objects.

Fei Yang (9):
  drm/i915/mtl: Set has_llc=0
  drm/i915/mtl: Define MOCS and PAT tables for MTL
  drm/i915/mtl: Add PTE encode function
  drm/i915/mtl: workaround coherency issue for Media
  drm/i915/mtl: end support for set caching ioctl
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: making mtl pte encode generic for gen12
  drm/i915: Allow user to set cache at BO creation

 drivers/gpu/drm/i915/display/intel_dpt.c  | 14 ++--
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 36 
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 30 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 67 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  9 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 81 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  6 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 84 +--
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 23 -
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 38 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c  | 76 -
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_mocs.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 18 ++--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 +---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_pci.c   | 76 +++--
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  6 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 include/uapi/drm/i915_drm.h   | 36 
 tools/include/uapi/drm/i915_drm.h | 36 
 51 files changed, 794 insertions(+), 231 deletions(-)

-- 
2.25.1



[PATCH 3/9] drm/i915/mtl: Add PTE encode function

2023-04-10 Thread fei . yang
From: Fei Yang 

PTE encode functions are platform dependent. This patch implements the
PTE functions for MTL, and ensures the correct PTE encode function is
used by calling the pte_encode function pointer instead of the
hardcoded gen8 version of PTE encode.

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 45 
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h |  3 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 36 +--
 4 files changed, 75 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index b8027392144d..c5eacfdba1a5 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
vm->vma_ops.bind_vma= dpt_bind_vma;
vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-   vm->pte_encode = gen8_ggtt_pte_encode;
+   vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
dpt->obj = dpt_obj;
dpt->obj->is_dpt = true;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4daaa6f55668..11b91e0453c8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
+static u64 mtl_pte_encode(dma_addr_t addr,
+ enum i915_cache_level level,
+ u32 flags)
+{
+   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+   if (unlikely(flags & PTE_READ_ONLY))
+   pte &= ~GEN8_PAGE_RW;
+
+   if (flags & PTE_LM)
+   pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
+
+   switch (level) {
+   case I915_CACHE_NONE:
+   pte |= GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_LLC:
+   case I915_CACHE_L3_LLC:
+   pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_WT:
+   pte |= GEN12_PPGTT_PTE_PAT0;
+   break;
+   }
+
+   return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
  u32 flags)
 {
struct i915_page_directory *pd;
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, 
flags);
gen8_pte_t *vaddr;
 
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
   enum i915_cache_level cache_level,
   u32 flags)
 {
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
unsigned int rem = sg_dma_len(iter->sg);
u64 start = vma_res->start;
 
@@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
GEM_BUG_ON(pt->is_compact);
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
@@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct 
i915_address_space *vm,
}
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
@@ -820,8 +848,8 @@ static int gen8_init_scratch(struct i915_address_space *vm)
pte_flags |= PTE_LM;
 
vm->scratch[0]->encode =
-   gen8_pte_encode(px_dma(vm->scratch[0]),
-   I915_CACHE_NONE, pte_flags);
+   vm->pte_encode(px_dma(vm->scratch[0]),
+  I915_CACHE_NONE, pte_flags);
 
for (i = 1; i <= vm->top; i++) {
struct drm_i915_gem_object *obj;
@@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   ppgtt->vm.pte_encode = gen8_pte_encode;
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   ppgtt->vm.pte_encode = mtl_pte_encode;
+   else
+   ppgtt->vm.pte_encode = gen8_pte_encode;
 
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries = gen8_ppgtt_insert;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h 

[PATCH 9/9] drm/i915: Allow user to set cache at BO creation

2023-04-10 Thread fei . yang
From: Fei Yang 

To comply with the design that buffer objects shall have an immutable
cache setting throughout their life cycle, the {set, get}_caching ioctls
are no longer supported from MTL onward. With that change, the caching
policy can only be set at object creation time. The current code
applies a default (platform dependent) cache setting for all objects.
However this is not optimal for performance tuning. The patch extends
the existing gem_create uAPI to let the user set the PAT index for the
object at creation time.
The new extension is platform independent, so UMDs can switch to using
this extension on older platforms as well, while {set, get}_caching are
still supported on those legacy platforms for compatibility reasons.
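
For illustration, userspace usage of the new extension would look roughly
like the sketch below. It is based on the uAPI structures added in this
patch; error handling is omitted and the pat_index value is a placeholder,
since valid indices are platform specific:

	struct drm_i915_gem_create_ext_set_pat set_pat = {
		.base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
		.pat_index = 3,	/* placeholder, platform specific (see Bspec) */
	};
	struct drm_i915_gem_create_ext create = {
		.size = 4096,
		.extensions = (uintptr_t)&set_pat,
	};

	/* ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create);
	 * On success, create.handle refers to a BO that uses the requested
	 * PAT index for the rest of its life cycle.
	 */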

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 36 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  6 
 include/uapi/drm/i915_drm.h| 36 ++
 tools/include/uapi/drm/i915_drm.h  | 36 ++
 4 files changed, 114 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index bfe1dbda4cb7..723c3ddd6c74 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -245,6 +245,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   unsigned int pat_index;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -394,11 +395,39 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_pat(struct i915_user_extension __user *base, void *data)
+{
+   struct create_ext *ext_data = data;
+   struct drm_i915_private *i915 = ext_data->i915;
+   struct drm_i915_gem_create_ext_set_pat ext;
+   unsigned int max_pat_index;
+
+   BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
+offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   max_pat_index = INTEL_INFO(i915)->max_pat_index;
+
+   if (ext.pat_index > max_pat_index) {
+   drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
+   ext.pat_index);
+   return -EINVAL;
+   }
+
+   ext_data->pat_index = ext.pat_index;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
 };
 
+#define PAT_INDEX_NOT_SET  0x
 /**
  * i915_gem_create_ext_ioctl - Creates a new mm object and returns a handle to 
it.
  * @dev: drm device pointer
@@ -418,6 +447,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
return -EINVAL;
 
+   ext_data.pat_index = PAT_INDEX_NOT_SET;
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
   create_extensions,
   ARRAY_SIZE(create_extensions),
@@ -454,5 +484,11 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (IS_ERR(obj))
return PTR_ERR(obj);
 
+   if (ext_data.pat_index != PAT_INDEX_NOT_SET) {
+   i915_gem_object_set_pat_index(obj, ext_data.pat_index);
+   /* Mark pat_index is set by UMD */
+   obj->cache_level = I915_CACHE_INVAL;
+   }
+
return i915_gem_publish(obj, file, &args->size, &args->handle);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 27c948350b5b..61651f7e5806 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -209,6 +209,12 @@ bool i915_gem_object_can_bypass_llc(struct 
drm_i915_gem_object *obj)
if (!(obj->flags & I915_BO_ALLOC_USER))
return false;
 
+   /*
+* Always flush cache for UMD objects at creation time.
+*/
+   if (obj->cache_level == I915_CACHE_INVAL)
+   return true;
+
/*
 * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
 * possible for userspace to bypass the GTT caching bits set by the
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index dba7c5a5b25e..03c5c314846e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
 *
 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 * struct drm_i915_gem_create_ext_protected_content.
+*
+* For I915_GEM_CREATE_EXT_SET_PAT 

[PATCH 2/9] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-10 Thread fei . yang
From: Fei Yang 

On MTL, the GT can no longer allocate on the LLC - only the CPU can.
This, along with the addition of support for the ADM/L4 cache, calls for a
MOCS/PAT table update.

BSpec: 44509, 45101, 44235

Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Madhumitha Tolakanahalli Pradeep 

Signed-off-by: Aravind Iddamsetty 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gt/intel_gtt.c | 23 +++-
 drivers/gpu/drm/i915/gt/intel_gtt.h | 20 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c| 76 +++--
 drivers/gpu/drm/i915/gt/selftest_mocs.c |  2 +-
 4 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 4f436ba7a3c8..1e1b34e22cf5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
}
 }
 
+static void mtl_setup_private_ppat(struct intel_uncore *uncore)
+{
+   intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
+  MTL_PPAT_L4_0_WB);
+   intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
+  MTL_PPAT_L4_1_WT);
+   intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
+  MTL_PPAT_L4_3_UC);
+   intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
+  MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
+   intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
+  MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
+
+   /*
+* Remaining PAT entries are left at the hardware-default
+* fully-cached setting
+*/
+}
+
 static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 {
/* TGL doesn't support LLC or AGE settings */
@@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
 
GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
-   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+   if (IS_METEORLAKE(i915))
+   mtl_setup_private_ppat(uncore);
+   else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
xehp_setup_private_ppat(gt);
else if (GRAPHICS_VER(i915) >= 12)
tgl_setup_private_ppat(uncore);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 69ce55f517f5..b632167eaf2e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES  REG_BIT(2)
 #define BYT_PTE_WRITEABLE  REG_BIT(1)
 
+#define GEN12_PPGTT_PTE_PAT3BIT_ULL(62)
 #define GEN12_PPGTT_PTE_LM BIT_ULL(11)
+#define GEN12_PPGTT_PTE_PAT2BIT_ULL(7)
+#define GEN12_PPGTT_PTE_NC  BIT_ULL(5)
+#define GEN12_PPGTT_PTE_PAT1BIT_ULL(4)
+#define GEN12_PPGTT_PTE_PAT0BIT_ULL(3)
 
-#define GEN12_GGTT_PTE_LM  BIT_ULL(1)
+#define GEN12_GGTT_PTE_LM  BIT_ULL(1)
+#define MTL_GGTT_PTE_PAT0  BIT_ULL(52)
+#define MTL_GGTT_PTE_PAT1  BIT_ULL(53)
+#define GEN12_GGTT_PTE_ADDR_MASK   GENMASK_ULL(45, 12)
+#define MTL_GGTT_PTE_PAT_MASK  GENMASK_ULL(53, 52)
 
 #define GEN12_PDE_64K BIT(6)
 #define GEN12_PTE_PS64 BIT(8)
@@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
 #define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
+#define MTL_PPAT_L4_CACHE_POLICY_MASK  REG_GENMASK(3, 2)
+#define MTL_PAT_INDEX_COH_MODE_MASKREG_GENMASK(1, 0)
+#define MTL_PPAT_L4_3_UC   REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
+#define MTL_PPAT_L4_1_WT   REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
+#define MTL_PPAT_L4_0_WB   REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
+#define MTL_3_COH_2W   REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
+#define MTL_2_COH_1W   REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
+#define MTL_0_COH_NON  REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
+
 enum i915_cache_level;
 
 struct drm_i915_gem_object;
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c 
b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 69b489e8dfed..89570f137b2c 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
 #define LE_COS(value)  ((value) << 15)
 #define LE_SSE(value)  ((value) << 17)
 
+/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
+#define _L4_CACHEABILITY(value)((value) << 2)
+#define IG_PAT(value)  ((value) << 8)
+
 /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
 #define L3_ESC(value)  ((value) << 0)
 #define L3_SCC(value)  ((value) << 1)
@@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
 /* Helper defines */
 #define GEN9_NUM_MOCS_ENTRIES  64  /* 63-64 are reserved, but configured. */
 #define PVC_NUM_MOCS_ENTRIES   3
+#define MTL_NUM_MOCS_ENTRIES   16
 
 /* (e)LLC caching options */
 /*
@@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
 #define L3_2_RESERVED  

[PATCH 4/9] drm/i915/mtl: workaround coherency issue for Media

2023-04-10 Thread fei . yang
From: Fei Yang 

This patch implements Wa_22016122933.

In MTL, memory writes initiated by the Media tile update the whole
cache line, even for partial writes. This creates a coherency
problem for cacheable memory if both the CPU and the GPU are writing data
to different locations within a single cache line. CTB communication
is impacted by this issue because the head and tail pointers are
adjacent words within a cache line (see struct guc_ct_buffer_desc),
where one is written by the GuC and the other by the host.
This patch circumvents the issue by making the CPU/GPU shared memory
uncacheable (WC on the CPU side, and PAT index 2 for the GPU). Also, for the
CTB, which is updated by both the CPU and the GuC, an mfence instruction
is added to make sure the CPU writes are visible to the GPU right away
(flushing the write-combining buffer).
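
For reference, the descriptor layout referred to above is roughly the
following (abridged sketch of struct guc_ct_buffer_desc from the GuC CTB
ABI headers; the point is that head and tail share a cache line):

	struct guc_ct_buffer_desc {
		u32 head;		/* advanced by the reader of the buffer */
		u32 tail;		/* advanced by the writer of the buffer */
		u32 status;
		u32 reserved[13];	/* pads the descriptor out */
	} __packed;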

While fixing the CTB issue, we noticed some random GSC firmware
loading failures because the shared buffers are cacheable (WB) on the CPU
side but uncached on the GPU side. To fix these issues we need to map
such shared buffers as WC on the CPU side. Since such allocations are
not all done through the GuC allocator, to avoid too many code changes
i915_coherent_map_type() is now hard coded to return WC for MTL.

BSpec: 45101

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 -
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  7 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 18 --
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index ecd86130b74f..89fc8ea6bcfc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct 
drm_i915_private *i915,
  struct drm_i915_gem_object *obj,
  bool always_coherent)
 {
-   if (i915_gem_object_is_lmem(obj))
+   /*
+* Wa_22016122933: always return I915_MAP_WC for MTL
+*/
+   if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
return I915_MAP_WC;
if (HAS_LLC(i915) || always_coherent)
return I915_MAP_WB;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 1d9fdfb11268..236673c02f9a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
if (obj->base.size < gsc->fw.size)
return -ENOSPC;
 
+   /*
+* Wa_22016122933: For MTL the shared memory needs to be mapped
+* as WC on CPU side and UC (PAT index 2) on GPU side
+*/
+   if (IS_METEORLAKE(i915))
+   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
dst = i915_gem_object_pin_map_unlocked(obj,
   i915_coherent_map_type(i915, 
obj, true));
if (IS_ERR(dst))
@@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
memset(dst, 0, obj->base.size);
memcpy(dst, src, gsc->fw.size);
 
+   /*
+* Wa_22016122933: Making sure the data in dst is
+* visible to GSC right away
+*/
+   intel_guc_write_barrier(>uc.guc);
+
i915_gem_object_unpin_map(gsc->fw.obj);
i915_gem_object_unpin_map(obj);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index d76508fa3af7..f9bddaa876d9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -743,6 +743,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc 
*guc, u32 size)
if (IS_ERR(obj))
return ERR_CAST(obj);
 
+   /*
+* Wa_22016122933: For MTL the shared memory needs to be mapped
+* as WC on CPU side and UC (PAT index 2) on GPU side
+*/
+   if (IS_METEORLAKE(gt->i915))
+   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
if (IS_ERR(vma))
goto err;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed64..98e682b7df07 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -415,12 +415,6 @@ static int ct_write(struct intel_guc_ct *ct,
}
GEM_BUG_ON(tail > size);
 
-   /*
-* make sure H2G buffer update and LRC tail update (if this triggering a
-* submission) are visible before updating the descriptor tail
-*/
-   intel_guc_write_barrier(ct_to_guc(ct));
-
/* update local copies */
ctb->tail = tail;
GEM_BUG_ON(atomic_read(&ctb->space) < len + 

[PATCH 8/9] drm/i915: making mtl pte encode generic for gen12

2023-04-10 Thread fei . yang
From: Fei Yang 

PTE encode is platform dependent. After replacing cache_level with
pat_index, the newly introduced mtl_pte_encode is actually generic
for all gen12 platforms, thus rename it to gen12_pte_encode and
apply it to all gen12 platforms.

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f76ec2cb29ef..e393e20b5894 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -60,7 +60,7 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
-static u64 mtl_pte_encode(dma_addr_t addr,
+static u64 gen12_pte_encode(dma_addr_t addr,
  unsigned int pat_index,
  u32 flags)
 {
@@ -999,8 +999,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
-   ppgtt->vm.pte_encode = mtl_pte_encode;
+   if (GRAPHICS_VER(gt->i915) >= 12)
+   ppgtt->vm.pte_encode = gen12_pte_encode;
else
ppgtt->vm.pte_encode = gen8_pte_encode;
 
-- 
2.25.1



[PATCH 1/9] drm/i915/mtl: Set has_llc=0

2023-04-10 Thread fei . yang
From: Fei Yang 

On MTL, the GT can no longer allocate on the LLC, so set has_llc=0.

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/i915_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index cddb6e197972..025d32c0b161 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1146,6 +1146,7 @@ static const struct intel_device_info mtl_info = {
.has_flat_ccs = 0,
.has_gmd_id = 1,
.has_guc_deprivilege = 1,
+   .has_llc = 0,
.has_mslice_steering = 0,
.has_snoop = 1,
.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
-- 
2.25.1



[PATCH 6/9] drm/i915: preparation for using PAT index

2023-04-10 Thread fei . yang
From: Fei Yang 

This patch is a preparation for replacing enum i915_cache_level with the PAT
index. The caching policy for buffer objects is set through the PAT index in
the PTE, and the old i915_cache_level is not sufficient to represent all the
caching modes supported by the hardware.

Prepare the transition by adding some platform dependent data structures
and helper functions to translate the cache_level to a pat_index.

cachelevel_to_pat: a platform dependent array mapping cache_level to
   pat_index.

max_pat_index: the maximum PAT index supported by the hardware. Needed for
   validating the PAT index passed in from user space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
  convenience of coding.
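
For illustration, the per-platform data ends up looking roughly like the
sketch below. The PAT values are placeholders only (the real numbers are
platform specific and live in the device info definitions in i915_pci.c):

	/* hypothetical device info entry */
	.max_pat_index = 3,
	.cachelevel_to_pat = {
		[I915_CACHE_NONE]	= 0,	/* placeholder values */
		[I915_CACHE_LLC]	= 1,
		[I915_CACHE_L3_LLC]	= 1,
		[I915_CACHE_WT]		= 2,
	},

Callers that used to pass a cache_level then convert it once, e.g.:

	pte = vm->pte_encode(addr,
			     i915_gem_get_pat_index(i915, I915_CACHE_NONE),
			     flags);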

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  9 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  6 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  6 ++
 drivers/gpu/drm/i915/i915_pci.c   | 75 +--
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  6 ++
 9 files changed, 104 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 4666bb82f312..8c70a0ec7d2f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -45,6 +45,15 @@ static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level)
+{
+   if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+   return 0;
+
+   return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 885ccde9dc3c..4c92e17b4337 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -20,6 +20,8 @@
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
struct drm_i915_gem_object *obj;
@@ -30,6 +32,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 830c11431ee8..41b35abccf88 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -194,6 +194,7 @@ enum i915_cache_level {
 * engine.
 */
I915_CACHE_WT,
+   I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index b1672e054b21..214763942aa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,8 +460,6 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private 
*i915,
fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will 
also
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 11b91e0453c8..7a4b1d1afce9 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -78,6 +78,12 @@ static u64 mtl_pte_encode(dma_addr_t addr,
case I915_CACHE_WT:
pte |= GEN12_PPGTT_PTE_PAT0;
break;
+   default:
+   /* This should never happen. Added to deal with the compile
+* error due to the addition of I915_MAX_CACHE_LEVEL. Will
+* be removed by the pat_index patch.
+*/
+   break;
}
 
return pte;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index ba3109338aee..91056b9a60a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -242,6 

linux-next: manual merge of the tip tree with the drm tree

2023-04-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/gpu/drm/i915/gt/intel_rc6.c

between commit:

  3735040978a4 ("drm/i915/mtl: Synchronize i915/BIOS on C6 enabling")

from the drm tree and commit:

  f7faedffa92c ("drm/i915/gt: use __xchg instead of internal helper")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/i915/gt/intel_rc6.c
index 8f3cd68d14f8,3d6109f1d05c..
--- a/drivers/gpu/drm/i915/gt/intel_rc6.c
+++ b/drivers/gpu/drm/i915/gt/intel_rc6.c
@@@ -733,11 -710,7 +733,11 @@@ void intel_rc6_fini(struct intel_rc6 *r
  
intel_rc6_disable(rc6);
  
 +  /* We want the BIOS C6 state preserved across loads for MTL */
 +  if (IS_METEORLAKE(rc6_to_i915(rc6)) && rc6->bios_state_captured)
 +  set(uncore, GEN6_RC_STATE, rc6->bios_rc_state);
 +
-   pctx = fetch_and_zero(&rc6->pctx);
+   pctx = __xchg(&rc6->pctx, 0);
if (pctx)
i915_gem_object_put(pctx);
  


pgp8XmFzbOtls.pgp
Description: OpenPGP digital signature


RE: [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-10 Thread Yang, Fei
>Subject: Re: [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Define MOCS and PAT tables 
>for MTL
>
> On Fri, Apr 07, 2023 at 12:12:29AM -0700, 
> fei.y...@intel.com wrote:
>> From: Fei Yang fei.y...@intel.com
>>
>> On MTL, GT can no longer allocate on LLC - only the CPU can.
>> This, along with addition of support for ADM/L4 cache calls a MOCS/PAT
>> table update. Also defines PTE encode functions for MTL as it has
>> different PAT index definition than previous platforms.
>
> It might be best to keep the PTE encoding as a separate patch from the
> MOCS/PAT tables.  It's a different enough topic that it probably deserves
> a patch of its own.

Will update in the next revision.

>>
>> BSpec: 44509, 45101, 44235
>>
>> Cc: Matt Roper matthew.d.ro...@intel.com
>> Cc: Lucas De Marchi lucas.demar...@intel.com
>> Signed-off-by: Madhumitha Tolakanahalli Pradeep
>> madhumitha.tolakanahalli.prad...@intel.com
>> Signed-off-by: Aravind Iddamsetty 
>> aravind.iddamse...@intel.com
>> Signed-off-by: Fei Yang fei.y...@intel.com
>> ---
>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c| 28 +
>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h|  3 +
>>  drivers/gpu/drm/i915/gt/intel_ggtt.c| 27 +
>>  drivers/gpu/drm/i915/gt/intel_gtt.c | 23 +++-
>>  drivers/gpu/drm/i915/gt/intel_gtt.h | 20 ++-
>>  drivers/gpu/drm/i915/gt/intel_mocs.c| 76 +++--
>>  drivers/gpu/drm/i915/gt/selftest_mocs.c |  2 +-
>>  drivers/gpu/drm/i915/i915_pci.c |  1 +
>>  8 files changed, 173 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> index 4daaa6f55668..df4073d32114 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>>  return pte;
>>  }
>>
>> +static u64 mtl_pte_encode(dma_addr_t addr,
>> +   enum i915_cache_level level,
>> +   u32 flags)
>> +{
>> +   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
>> +
>> +   if (unlikely(flags & PTE_READ_ONLY))
>> +  pte &= ~GEN8_PAGE_RW;
>> +
>> +   if (flags & PTE_LM)
>> +  pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
>> +
>> +   switch (level) {
>> +   case I915_CACHE_NONE:
>> +  pte |= GEN12_PPGTT_PTE_PAT1;
>> +  break;
>> +   case I915_CACHE_LLC:
>> +   case I915_CACHE_L3_LLC:
>> +  pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
>> +  break;
>> +   case I915_CACHE_WT:
>> +  pte |= GEN12_PPGTT_PTE_PAT0;
>> +  break;
>> +   }
>> +
>> +   return pte;
>> +}
>> +
>>  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool
>> create)  {
>>  struct drm_i915_private *i915 = ppgtt->vm.i915; diff --git
>> a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> index f541d19264b4..6b8ce7f4d25a 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt
>> *gt,
>>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>> enum i915_cache_level level,
>> u32 flags);
>> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>> + unsigned int pat_index,
>> + u32 flags);
>>
>>  #endif
>> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> index 3c7f1ed92f5b..4a16bfcde1de 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>>  }
>>  }
>>
>> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>> + enum i915_cache_level level,
>> + u32 flags)
>> +{
>> +   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
>> +
>> +   GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
>> +
>> +   if (flags & PTE_LM)
>> +  pte |= GEN12_GGTT_PTE_LM;
>> +
>> +   switch (level) {
>> +   case I915_CACHE_NONE:
>> +  pte |= MTL_GGTT_PTE_PAT1;
>> +  break;
>> +   case I915_CACHE_LLC:
>> +   case I915_CACHE_L3_LLC:
>> +  pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
>> +  break;
>> +   case I915_CACHE_WT:
>> +  pte |= MTL_GGTT_PTE_PAT0;
>> 

Re: [PATCH v10 2/2] drm: add kms driver for loongson display controller

2023-04-10 Thread Sui Jingfeng

Hi,

On 2023/4/4 22:10, Emil Velikov wrote:

--- /dev/null
+++ b/drivers/gpu/drm/loongson/lsdc_debugfs.c
+void lsdc_debugfs_init(struct drm_minor *minor)
+{
+#ifdef CONFIG_DEBUG_FS
+   drm_debugfs_create_files(lsdc_debugfs_list,
+ARRAY_SIZE(lsdc_debugfs_list),
+minor->debugfs_root,
+minor);
+#endif
+}

Should probably build the file when debugfs is enabled and provide
no-op stub in the header. See nouveau for an example.

But doing it that way introduces duplication: you end up writing two
implementations for the same function prototype - one real and one dummy.

Typically, skilled core framework writers like that approach for multiple
backends and multiple architectures, because the set of functions that has
to be implemented is large in those cases. We are just a driver built on the
DRM core, with only one single function here.

DEBUG_FS is enabled by default on our MIPS and LoongArch configurations, and
this code does not suffer from high-frequency changes. In this case
CONFIG_DEBUG_FS just boils down to "true", a nearly always-enabled
decoration.

We did implement debugfs support that way before[1], but we went for
compactness afterwards.

We could revise our driver if that is strongly recommended.


[1] https://patchwork.freedesktop.org/patch/480521/
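
For reference, the stub approach Emil suggests would look roughly like the
sketch below (the header name lsdc_debugfs.h is an assumption here; nouveau
does the equivalent):

	/* lsdc_debugfs.h */
	#ifdef CONFIG_DEBUG_FS
	void lsdc_debugfs_init(struct drm_minor *minor);
	#else
	static inline void lsdc_debugfs_init(struct drm_minor *minor) {}
	#endif

with the Makefile only building lsdc_debugfs.o when CONFIG_DEBUG_FS is set,
so the #ifdef disappears from the .c file.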



linux-next: manual merge of the drm tree with the mm-stable tree

2023-04-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the drm tree got a conflict in:

  drivers/gpu/drm/ttm/ttm_pool.c

between commit:

  23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely")

from the mm-stable tree and commit:

  322458c2bb1a ("drm/ttm: Reduce the number of used allocation orders for TTM 
pages")

from the drm tree.

I fixed it up (I just used the latter version - though I may have missed
something) and can carry the fix as necessary. This is now fixed as far as
linux-next is concerned, but any non trivial conflicts should be mentioned
to your upstream maintainer when your tree is submitted for merging.
You may also want to consider cooperating with the maintainer of the
conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


pgpWa5TUt2S4e.pgp
Description: OpenPGP digital signature


Re: [PATCH v5 4/4] drm/msm/dpu: use CTL_SC7280_MASK for sm8450's ctl_0

2023-04-10 Thread Abhinav Kumar




On 4/7/2023 5:27 PM, Dmitry Baryshkov wrote:

On sm8450 platform the CTL_0 doesn't differ from the rest of CTL blocks,
so switch it to CTL_SC7280_MASK too.

Some background: the original commit 100d7ef6995d ("drm/msm/dpu: add support
for SM8450") had all (relevant at that time) bits spelled out individually.
Then commit 0e91bcbb0016 ("drm/msm/dpu: Add SM8350 to hw catalog"),
despite being a mismerge, correctly changed all other CTL entries to use
CTL_SC7280_MASK, except CTL_0.

While the current BLOCK_SOC_MASK style is not ideal (and while we are
working on a better scheme), let's follow its usage as the path of least
surprise. For example, sc8280xp, a close relative of sm8450, also uses
CTL_SC7280_MASK.

Signed-off-by: Dmitry Baryshkov 
---


Although I don't totally agree with this, because sc8280xp also uses
the same, I am fine.



Reviewed-by: Abhinav Kumar 

But we either need to work on a better scheme or expand the macros; we should
not duplicate these for the next chipset that gets added.


Re: [PATCH v5 3/4] drm/msm/dpu: enable DSPP and DSC on sc8180x

2023-04-10 Thread Abhinav Kumar




On 4/7/2023 5:27 PM, Dmitry Baryshkov wrote:

Enable DSPP and DSC hardware blocks on sc8180x platform.

Signed-off-by: Dmitry Baryshkov 
---


Reviewed-by: Abhinav Kumar 


Re: [PATCH v5 2/4] drm/msm/dpu: enable DSPP_2/3 for LM_2/3 on sm8450

2023-04-10 Thread Abhinav Kumar




On 4/7/2023 5:27 PM, Dmitry Baryshkov wrote:

Mark DSPP_2 and DSPP_3 as used for LM_2 and LM_3

Fixes: 100d7ef6995d ("drm/msm/dpu: add support for SM8450")
Signed-off-by: Dmitry Baryshkov 
---


Reviewed-by: Abhinav Kumar 


Re: [PATCH v5 1/4] drm/msm/dpu: enable DPU_CTL_SPLIT_DISPLAY for sc8280xp

2023-04-10 Thread Abhinav Kumar




On 4/7/2023 5:27 PM, Dmitry Baryshkov wrote:

Theoretically, since sm8150 we should be using a single CTL for the
split panel case, but since we do not support that for now, fall back to
DPU_CTL_SPLIT_DISPLAY.

Signed-off-by: Dmitry Baryshkov 


Reviewed-by: Abhinav Kumar 


Re: [PATCH] drm/msm/adreno: warn if chip revn is verified before being set

2023-04-10 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2023-04-10 15:38:36)
> The commit 010c8bbad2cb ("drm: msm: adreno: Disable preemption on Adreno
> 510") tried to check the GPU's revn before revn was set. Add WARN_ON_ONCE
> to prevent such bugs from happening again. A separate helper is
> necessary so that the warning is displayed really just once, instead of
> being displayed for each of the comparisons.
>
> Suggested-by: Stephen Boyd 
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 63 -
>  1 file changed, 40 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index f62612a5c70f..47e21d44ac24 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -145,40 +145,51 @@ struct adreno_platform_config {
>
>  bool adreno_cmp_rev(struct adreno_rev rev1, struct adreno_rev rev2);
>
> +static inline bool adreno_cmp_revn(struct adreno_gpu *gpu, uint32_t revn)

'cmp' in the name makes me think it's comparing. Maybe 'is' is better
because we're testing for equality.

adreno_is_revn()

Also 'const struct adreno_gpu *' because it isn't changing?

> +{
> +   WARN_ON_ONCE(!gpu->revn);
> +
> +   return gpu->revn == revn;
> +}
> +
>  static inline bool adreno_is_a2xx(struct adreno_gpu *gpu)

And then these can all be const in a followup patch probably.
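
Putting the two suggestions together, the helper would then look roughly
like this (just a sketch of the proposed rename, not a posted patch):

	static inline bool adreno_is_revn(const struct adreno_gpu *gpu, uint32_t revn)
	{
		WARN_ON_ONCE(!gpu->revn);

		return gpu->revn == revn;
	}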


[GIT PULL] mediatek drm next for 6.4

2023-04-10 Thread Chun-Kuang Hu
Hi, Dave & Daniel:

This includes:

1. Add support for 10-bit overlays
2. Add MediaTek SoC DRM (vdosys1) support for mt8195
3. Change mmsys compatible for mt8195 mediatek-drm
4. Only trigger DRM HPD events if bridge is attached
5. Change the aux retries times when receiving AUX_DEFER

Regards,
Chun-Kuang.

The following changes since commit fe15c26ee26efa11741a7b632e9f23b01aca4cc6:

  Linux 6.3-rc1 (2023-03-05 14:52:03 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git 
tags/mediatek-drm-next-6.4

for you to fetch changes up to 9243d70e05c5989f84f840612965f96b524da925:

  drm/mediatek: dp: Change the aux retries times when receiving AUX_DEFER 
(2023-04-03 16:49:49 +)


Mediatek DRM Next for Linux 6.4

1. Add support for 10-bit overlays
2. Add MediaTek SoC DRM (vdosys1) support for mt8195
3. Change mmsys compatible for mt8195 mediatek-drm
4. Only trigger DRM HPD events if bridge is attached
5. Change the aux retries times when receiving AUX_DEFER


Alexandre Mergnat (1):
  dt-bindings: display: mediatek: clean unnecessary item

Chen-Yu Tsai (1):
  drm/mediatek: dp: Only trigger DRM HPD events if bridge is attached

Jason-JH.Lin (1):
  drm/mediatek: Change mmsys compatible for mt8195 mediatek-drm

Justin Green (3):
  drm/mediatek: Refactor pixel format logic
  drm/mediatek: Add support for AR30 and BA30 overlays
  drm/mediatek: Enable AR30 and BA30 overlays on MT8195

Nancy.Lin (9):
  dt-bindings: mediatek: add ethdr definition for mt8195
  drm/mediatek: Add ETHDR support for MT8195
  drm/mediatek: Add ovl_adaptor support for MT8195
  drm/mediatek: Add dma dev get function
  drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support
  drm/mediatek: Add drm ovl_adaptor sub driver for MT8195
  drm/mediatek: Add mediatek-drm of vdosys1 support for MT8195
  drm/mediatek: Add mdp_rdma get format function
  drm/mediatek: Add ovl_adaptor get format function

Nathan Lu (1):
  drm/mediatek: Add mediatek-drm of vdosys0 support for mt8188

Xinlei Lee (1):
  drm/mediatek: dp: Change the aux retries times when receiving AUX_DEFER

 .../bindings/display/mediatek/mediatek,ccorr.yaml  |   5 +-
 .../bindings/display/mediatek/mediatek,ethdr.yaml  | 182 +++
 drivers/gpu/drm/mediatek/Makefile  |   2 +
 drivers/gpu/drm/mediatek/mtk_disp_drv.h|  35 ++
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c|  94 
 drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c| 547 +
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c   |  38 ++
 drivers/gpu/drm/mediatek/mtk_dp.c  |  15 +-
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c|  89 +++-
 drivers/gpu/drm/mediatek/mtk_drm_crtc.h|   6 +-
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c| 135 +++--
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h|  78 ++-
 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 475 ++
 drivers/gpu/drm/mediatek/mtk_drm_drv.h |  30 +-
 drivers/gpu/drm/mediatek/mtk_drm_plane.c   |  24 +-
 drivers/gpu/drm/mediatek/mtk_drm_plane.h   |   3 +-
 drivers/gpu/drm/mediatek/mtk_ethdr.c   | 370 ++
 drivers/gpu/drm/mediatek/mtk_ethdr.h   |  25 +
 drivers/gpu/drm/mediatek/mtk_mdp_rdma.c|  24 +
 19 files changed, 1858 insertions(+), 319 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/mediatek/mediatek,ethdr.yaml
 create mode 100644 drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_ethdr.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_ethdr.h


[PATCH] drm: etnaviv: Replace of_platform.h with explicit includes

2023-04-10 Thread Rob Herring
Etnaviv doesn't use anything from of_platform.h, but it depends on of.h,
of_device.h, and platform_device.h, which are all currently included
implicitly; that implicit inclusion is going to be removed soon.

Signed-off-by: Rob Herring 
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 44ca803237a5..c68e83ed5a23 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -6,7 +6,9 @@
 #include <linux/component.h>
 #include <linux/dma-mapping.h>
 #include <linux/module.h>
-#include <linux/of_platform.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
 #include <linux/uaccess.h>
 
 #include <drm/drm_debugfs.h>
-- 
2.39.2



Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-10 Thread Dixit, Ashutosh
On Fri, 07 Apr 2023 04:04:06 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:22PM -0700, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..4343efb48e61b 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > char name[12];
> > int gt_n;
> > bool reset_in_progress;
> > +   wait_queue_head_t wqh;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long 
> > val)
> > int ret = 0;
> > u32 nval;
> >
> > +retry:
> > mutex_lock(&hwmon->hwmon_lock);
> > if (hwmon->ddat.reset_in_progress) {
> > -   ret = -EAGAIN;
> > -   goto unlock;
> > +   mutex_unlock(&hwmon->hwmon_lock);
> > +   ret = wait_event_interruptible(ddat->wqh,
> > +  !hwmon->ddat.reset_in_progress);
>
> this is indeed very clever!

Not clever, see below :/

> my fear is probably due to the lack of knowledge on this wait queue, but
> I'm wondering what could go wrong if due to some funny race you enter this
> check right after wake_up_all below has passed and then you will be here
> indefinitely waiting...

You are absolutely right, there is indeed a race in the patch because in
the above code when we drop the mutex (mutex_unlock) the wake_up_all can
happen before we have queued ourselves for the wake up.

Solving this race needs a more complicated prepare_to_wait/finish_wait
sequence which I have gone ahead and implemented in patch v2. The v2 code
is also a standard code pattern and the pattern I have implemented is
basically the same as that in intel_guc_wait_for_pending_msg() in i915
which I liked.

I have read in several places (e.g. in the Advanced Sleeping section in
https://static.lwn.net/images/pdf/LDD3/ch06.pdf and in kernel documentation
for try_to_wake_up()) that this sequence will avoid the race (between
schedule() and wake_up()). The crucial difference from the v1 patch is that
in v2 the mutex is dropped after we queue ourselves in prepare_to_wait()
just before calling schedule_timeout().
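
For reference, a minimal sketch of that ordering (generic names; the signal
and timeout handling of the real v2 patch, posted elsewhere in this thread,
is omitted here):

    DEFINE_WAIT(wait);

    for (;;) {
        mutex_lock(&lock);
        /* Queue ourselves on the waitqueue *before* dropping the lock
         * that protects the condition, so a concurrent wake_up_all()
         * cannot be missed. */
        prepare_to_wait(&wq, &wait, TASK_UNINTERRUPTIBLE);
        if (!reset_in_progress)
            break;          /* leave the loop with the lock still held */
        mutex_unlock(&lock);
        schedule();
    }
    finish_wait(&wq, &wait);
    /* ... do the protected work, then mutex_unlock(&lock); */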

> maybe just use the timeout version to be on the safeside and then return the
> -EAGAIN on timeout?

Also incorporated timeout in the new version. All code paths in the new
patch have been tested.

Thanks.
--
Ashutosh

> > +   if (ret)
> > +   return ret;
> > +   goto retry;
> > }
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > @@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> > intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > -unlock:
> > mutex_unlock(&hwmon->hwmon_lock);
> > return ret;
> >  }
> > @@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct 
> > drm_i915_private *i915, bool old)
> > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >  PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > hwmon->ddat.reset_in_progress = false;
> > +   wake_up_all(&hwmon->ddat.wqh);
> >
> > mutex_unlock(&hwmon->hwmon_lock);
> >  }
> > @@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > ddat->uncore = &i915->uncore;
> > snprintf(ddat->name, sizeof(ddat->name), "i915");
> > ddat->gt_n = -1;
> > +   init_waitqueue_head(&ddat->wqh);
> >
> > for_each_gt(gt, i915, i) {
> > ddat_gt = hwmon->ddat_gt + i;
> > --
> > 2.38.0
> >


[PATCH v2] drm/msm/adreno: fix sparse warnings in a6xx code

2023-04-10 Thread Dmitry Baryshkov
Sparse reports plenty of warnings against the a6xx code because of
a6xx_gmu::mmio and a6xx_gmu::rscc members. For some reason they were
defined as __iomem pointers rather than pointers to __iomem memory.
Correct the __iomem attribute.

Fixes: 02ef80c54e7c ("drm/msm/a6xx: update pdc/rscc GMU registers for 
A640/A650")
Reported-by: kernel test robot 
Link: https://lore.kernel.org/oe-kbuild-all/202304070550.nrbhjcvp-...@intel.com/
Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Stephen Boyd 
Signed-off-by: Dmitry Baryshkov 
---

Changes since v1: removed whitespace after the star (Stephen)

---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 0bc3eb443fec..4759a8ce51e4 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -51,8 +51,8 @@ struct a6xx_gmu {
 
struct msm_gem_address_space *aspace;
 
-   void * __iomem mmio;
-   void * __iomem rscc;
+   void __iomem *mmio;
+   void __iomem *rscc;
 
int hfi_irq;
int gmu_irq;
-- 
2.30.2



[PATCH] drm/msm/adreno: warn if chip revn is verified before being set

2023-04-10 Thread Dmitry Baryshkov
The commit 010c8bbad2cb ("drm: msm: adreno: Disable preemption on Adreno
510") tried to check GPU's revn before revn being set. Add WARN_ON_ONCE
to prevent such bugs from happening again. A separate helper is
necessary so that the warning is displayed really just once instead of
being displayed for each of the comparisons.

Suggested-by: Stephen Boyd 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 63 -
 1 file changed, 40 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index f62612a5c70f..47e21d44ac24 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -145,40 +145,51 @@ struct adreno_platform_config {
 
 bool adreno_cmp_rev(struct adreno_rev rev1, struct adreno_rev rev2);
 
+static inline bool adreno_cmp_revn(struct adreno_gpu *gpu, uint32_t revn)
+{
+   WARN_ON_ONCE(!gpu->revn);
+
+   return gpu->revn == revn;
+}
+
 static inline bool adreno_is_a2xx(struct adreno_gpu *gpu)
 {
+   WARN_ON_ONCE(!gpu->revn);
+
return (gpu->revn < 300);
 }
 
 static inline bool adreno_is_a20x(struct adreno_gpu *gpu)
 {
+   WARN_ON_ONCE(!gpu->revn);
+
return (gpu->revn < 210);
 }
 
 static inline bool adreno_is_a225(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 225;
+   return adreno_cmp_revn(gpu, 225);
 }
 
 static inline bool adreno_is_a305(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 305;
+   return adreno_cmp_revn(gpu, 305);
 }
 
 static inline bool adreno_is_a306(struct adreno_gpu *gpu)
 {
/* yes, 307, because a305c is 306 */
-   return gpu->revn == 307;
+   return adreno_cmp_revn(gpu, 307);
 }
 
 static inline bool adreno_is_a320(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 320;
+   return adreno_cmp_revn(gpu, 320);
 }
 
 static inline bool adreno_is_a330(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 330;
+   return adreno_cmp_revn(gpu, 330);
 }
 
 static inline bool adreno_is_a330v2(struct adreno_gpu *gpu)
@@ -188,77 +199,78 @@ static inline bool adreno_is_a330v2(struct adreno_gpu 
*gpu)
 
 static inline int adreno_is_a405(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 405;
+   return adreno_cmp_revn(gpu, 405);
 }
 
 static inline int adreno_is_a420(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 420;
+   return adreno_cmp_revn(gpu, 420);
 }
 
 static inline int adreno_is_a430(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 430;
+   return adreno_cmp_revn(gpu, 430);
 }
 
 static inline int adreno_is_a506(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 506;
+   return adreno_cmp_revn(gpu, 506);
 }
 
 static inline int adreno_is_a508(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 508;
+   return adreno_cmp_revn(gpu, 508);
 }
 
 static inline int adreno_is_a509(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 509;
+   return adreno_cmp_revn(gpu, 509);
 }
 
 static inline int adreno_is_a510(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 510;
+   return adreno_cmp_revn(gpu, 510);
 }
 
 static inline int adreno_is_a512(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 512;
+   return adreno_cmp_revn(gpu, 512);
 }
 
 static inline int adreno_is_a530(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 530;
+   return adreno_cmp_revn(gpu, 530);
 }
 
 static inline int adreno_is_a540(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 540;
+   return adreno_cmp_revn(gpu, 540);
 }
 
 static inline int adreno_is_a618(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 618;
+   return adreno_cmp_revn(gpu, 618);
 }
 
 static inline int adreno_is_a619(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 619;
+   return adreno_cmp_revn(gpu, 619);
 }
 
 static inline int adreno_is_a630(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 630;
+   return adreno_cmp_revn(gpu, 630);
 }
 
 static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
 {
-   return (gpu->revn == 640) || (gpu->revn == 680);
+   return adreno_cmp_revn(gpu, 640) ||
+   adreno_cmp_revn(gpu, 680);
 }
 
 static inline int adreno_is_a650(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 650;
+   return adreno_cmp_revn(gpu, 650);
 }
 
 static inline int adreno_is_7c3(struct adreno_gpu *gpu)
@@ -269,13 +281,16 @@ static inline int adreno_is_7c3(struct adreno_gpu *gpu)
 
 static inline int adreno_is_a660(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 660;
+   return adreno_cmp_revn(gpu, 660);
 }
 
 /* check for a615, a616, a618, a619 or any derivatives */
 static inline int adreno_is_a615_family(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 615 || gpu->revn == 616 || gpu->revn == 618 || 
gpu->revn == 619;
+   return adreno_cmp_revn(gpu, 615) ||
+   adreno_cmp_revn(gpu, 616) ||
+   

Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-04-10 Thread Dixit, Ashutosh
On Fri, 07 Apr 2023 04:08:31 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:21PM -0700, Ashutosh Dixit wrote:
> > On dGfx, the PL1 power limit being enabled and set to a low value results
> > in a low GPU operating freq. It also negates the freq raise operation which
> > is done before GuC firmware load. As a result GuC firmware load can time
> > out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> > limit was enabled and set to a low value). Therefore disable the PL1 power
> > limit when allowed by HW when loading GuC firmware.
> >
> > v2:
> >  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >  - Add hwm_power_max_restore to error return code path
> >
> > v3 (Jani N):
> >  - Add/remove explanatory comments
> >  - Function renames
> >  - Type corrections
> >  - Locking annotation
> >
> > v4:
> >  - Don't hold the lock across GuC reset (Rodrigo)
> >  - New locking scheme (suggested by Rodrigo)
> >  - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)
> >
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++
> >  drivers/gpu/drm/i915/i915_hwmon.c | 40 +++
> >  drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> >  3 files changed, 56 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -18,6 +18,7 @@
> >  #include "intel_uc.h"
> >
> >  #include "i915_drv.h"
> > +#include "i915_hwmon.h"
> >
> >  static const struct intel_uc_ops uc_ops_off;
> >  static const struct intel_uc_ops uc_ops_on;
> > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> > struct intel_guc *guc = &uc->guc;
> > struct intel_huc *huc = &uc->huc;
> > int ret, attempts;
> > +   bool pl1en;
>
> we need to initialize this to make warn free builds happy...
> what's our default btw? false? true? we need to read it back?

Yes this was a real bug caught by the kernel build robot. We don't know the
default till we read it back, which would mean exposing a new function. I
have avoided exposing the new function, i.e. I have fixed this by creating a
new (err_rps) label which will make sure that the variable is not used
unless it is initialized. I am not expecting to see warnings from the build
robot with this fix now.
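
Roughly, the point is just to order the unwind labels so that nothing reads
pl1en before it has been written. A sketch with made-up step functions (the
real change is the v5 patch later in this thread):

    static int example_init_hw(struct drm_i915_private *i915)
    {
        bool pl1en;
        int ret;

        ret = step_before_limit_is_touched();   /* hypothetical */
        if (ret)
            goto err_out;           /* pl1en not initialized yet */

        i915_hwmon_power_max_disable(i915, &pl1en);

        ret = step_that_may_fail();             /* hypothetical */
        if (ret)
            goto err_rps;           /* safe: pl1en is initialized */

        i915_hwmon_power_max_restore(i915, pl1en);
        return 0;

    err_rps:
        i915_hwmon_power_max_restore(i915, pl1en);
    err_out:
        return ret;
    }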

> >
> > GEM_BUG_ON(!intel_uc_supports_guc(uc));
> > GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> > else
> > attempts = 1;
> >
> > +   /* Disable a potentially low PL1 power limit to allow freq to be raised 
> > */
> > +   i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > +
> > intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >
> > while (attempts--) {
> > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > }
> >
> > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> > guc_info(guc, "submission %s\n", 
> > str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> > guc_info(guc, "SLPC %s\n", 
> > str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> >
> > @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > /* Return GT back to RPn */
> > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >
> > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> > __uc_sanitize(uc);
> >
> > if (!ret) {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 7f44e809ca155..9ab8971679fe3 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -50,6 +50,7 @@ struct hwm_drvdata {
> > struct hwm_energy_info ei;  /*  Energy info for 
> > energy1_input */
> > char name[12];
> > int gt_n;
> > +   bool reset_in_progress;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > u32 nval;
> >
> > mutex_lock(&hwmon->hwmon_lock);
> > +   if (hwmon->ddat.reset_in_progress) {
> > +   ret = -EAGAIN;
> > +   goto unlock;
> > +   }
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > /* Disable PL1 limit and verify, because the limit cannot be disabled 
> > on all platforms */
> > @@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> > intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > +unlock:
> > mutex_unlock(&hwmon->hwmon_lock);
> > return ret;
> >  }
> > @@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, 
> > int chan, long val)
> > }
> >  }
> >
> > +void 

[PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-10 Thread Ashutosh Dixit
Instead of erroring out when GuC reset is in progress, block waiting for
GuC reset to complete which is a more reasonable uapi behavior.

v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)

Signed-off-by: Ashutosh Dixit 
---
 drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 9ab8971679fe3..8471a667dfc71 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -51,6 +51,7 @@ struct hwm_drvdata {
char name[12];
int gt_n;
bool reset_in_progress;
+   wait_queue_head_t waitq;
 };
 
 struct i915_hwmon {
@@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
 static int
 hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
+#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
+
+   int ret = 0, timeout = GUC_RESET_TIMEOUT;
struct i915_hwmon *hwmon = ddat->hwmon;
intel_wakeref_t wakeref;
-   int ret = 0;
+   DEFINE_WAIT(wait);
u32 nval;
 
-   mutex_lock(&hwmon->hwmon_lock);
-   if (hwmon->ddat.reset_in_progress) {
-   ret = -EAGAIN;
-   goto unlock;
+   /* Block waiting for GuC reset to complete when needed */
+   for (;;) {
+   mutex_lock(&hwmon->hwmon_lock);
+
+   prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
+
+   if (!hwmon->ddat.reset_in_progress)
+   break;
+
+   if (signal_pending(current)) {
+   ret = -EINTR;
+   break;
+   }
+
+   if (!timeout) {
+   ret = -ETIME;
+   break;
+   }
+
+   mutex_unlock(&hwmon->hwmon_lock);
+
+   timeout = schedule_timeout(timeout);
}
+   finish_wait(&ddat->waitq, &wait);
+   if (ret)
+   goto unlock;
+
wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
/* Disable PL1 limit and verify, because the limit cannot be disabled 
on all platforms */
@@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private 
*i915, bool old)
intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
hwmon->ddat.reset_in_progress = false;
+   wake_up_all(&hwmon->ddat.waitq);
 
mutex_unlock(&hwmon->hwmon_lock);
 }
@@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
ddat->uncore = &i915->uncore;
snprintf(ddat->name, sizeof(ddat->name), "i915");
ddat->gt_n = -1;
+   init_waitqueue_head(&ddat->waitq);
 
for_each_gt(gt, i915, i) {
ddat_gt = hwmon->ddat_gt + i;
-- 
2.38.0



[PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write

2023-04-10 Thread Ashutosh Dixit
In preparation for follow-on patches, refactor hwm_power_max_write to take
hwmon_lock and runtime pm wakeref at start of the function and release them
at the end, therefore acquiring these just once each.

Signed-off-by: Ashutosh Dixit 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_hwmon.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 8e7dccc8d3a0e..7f44e809ca155 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -396,31 +396,33 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
struct i915_hwmon *hwmon = ddat->hwmon;
intel_wakeref_t wakeref;
+   int ret = 0;
u32 nval;
 
+   mutex_lock(&hwmon->hwmon_lock);
+   wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
+
/* Disable PL1 limit and verify, because the limit cannot be disabled 
on all platforms */
if (val == PL1_DISABLE) {
-   mutex_lock(&hwmon->hwmon_lock);
-   with_intel_runtime_pm(ddat->uncore->rpm, wakeref) {
-   intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
-PKG_PWR_LIM_1_EN, 0);
-   nval = intel_uncore_read(ddat->uncore, 
hwmon->rg.pkg_rapl_limit);
-   }
-   mutex_unlock(&hwmon->hwmon_lock);
+   intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
+PKG_PWR_LIM_1_EN, 0);
+   nval = intel_uncore_read(ddat->uncore, 
hwmon->rg.pkg_rapl_limit);
 
if (nval & PKG_PWR_LIM_1_EN)
-   return -ENODEV;
-   return 0;
+   ret = -ENODEV;
+   goto exit;
}
 
/* Computation in 64-bits to avoid overflow. Round to nearest. */
nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, 
SF_POWER);
nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
 
-   hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
-   PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
-   nval);
-   return 0;
+   intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
+PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
+exit:
+   intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
+   mutex_unlock(&hwmon->hwmon_lock);
+   return ret;
 }
 
 static int
-- 
2.38.0



[PATCH 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-04-10 Thread Ashutosh Dixit
Updates to Patch 2/3 and Patch 3/3 in this version.

Ashutosh Dixit (3):
  drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
  drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  drm/i915/hwmon: Block waiting for GuC reset to complete

 drivers/gpu/drm/i915/gt/uc/intel_uc.c | 13 +++-
 drivers/gpu/drm/i915/i915_hwmon.c | 94 +++
 drivers/gpu/drm/i915/i915_hwmon.h |  7 ++
 3 files changed, 100 insertions(+), 14 deletions(-)

-- 
2.38.0



[PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-04-10 Thread Ashutosh Dixit
On dGfx, the PL1 power limit being enabled and set to a low value results
in a low GPU operating freq. It also negates the freq raise operation which
is done before GuC firmware load. As a result GuC firmware load can time
out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
limit was enabled and set to a low value). Therefore disable the PL1 power
limit when allowed by HW when loading GuC firmware.

v2:
 - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
 - Add hwm_power_max_restore to error return code path

v3 (Jani N):
 - Add/remove explanatory comments
 - Function renames
 - Type corrections
 - Locking annotation

v4:
 - Don't hold the lock across GuC reset (Rodrigo)
 - New locking scheme (suggested by Rodrigo)
 - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)

v5:
 - Fix uninitialized pl1en variable compile warning reported by kernel
   build robot by creating new err_rps label

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
Signed-off-by: Ashutosh Dixit 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.c | 13 +++--
 drivers/gpu/drm/i915/i915_hwmon.c | 40 +++
 drivers/gpu/drm/i915/i915_hwmon.h |  7 +
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 4ccb4be4c9cba..996168312340e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -18,6 +18,7 @@
 #include "intel_uc.h"
 
 #include "i915_drv.h"
+#include "i915_hwmon.h"
 
 static const struct intel_uc_ops uc_ops_off;
 static const struct intel_uc_ops uc_ops_on;
@@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
struct intel_guc *guc = &uc->guc;
struct intel_huc *huc = &uc->huc;
int ret, attempts;
+   bool pl1en;
 
GEM_BUG_ON(!intel_uc_supports_guc(uc));
GEM_BUG_ON(!intel_uc_wants_guc(uc));
@@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
else
attempts = 1;
 
+   /* Disable a potentially low PL1 power limit to allow freq to be raised 
*/
+   i915_hwmon_power_max_disable(gt->i915, &pl1en);
+
intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
 
while (attempts--) {
@@ -500,7 +505,7 @@ static int __uc_init_hw(struct intel_uc *uc)
 */
ret = __uc_sanitize(uc);
if (ret)
-   goto err_out;
+   goto err_rps;
 
intel_huc_fw_upload(huc);
intel_guc_ads_reset(guc);
@@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
}
 
+   i915_hwmon_power_max_restore(gt->i915, pl1en);
+
guc_info(guc, "submission %s\n", 
str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
guc_info(guc, "SLPC %s\n", 
str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
 
@@ -559,10 +566,12 @@ static int __uc_init_hw(struct intel_uc *uc)
intel_guc_submission_disable(guc);
 err_log_capture:
__uc_capture_load_err_log(uc);
-err_out:
+err_rps:
/* Return GT back to RPn */
intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
 
+   i915_hwmon_power_max_restore(gt->i915, pl1en);
+err_out:
__uc_sanitize(uc);
 
if (!ret) {
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 7f44e809ca155..9ab8971679fe3 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -50,6 +50,7 @@ struct hwm_drvdata {
struct hwm_energy_info ei;  /*  Energy info for 
energy1_input */
char name[12];
int gt_n;
+   bool reset_in_progress;
 };
 
 struct i915_hwmon {
@@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
u32 nval;
 
mutex_lock(&hwmon->hwmon_lock);
+   if (hwmon->ddat.reset_in_progress) {
+   ret = -EAGAIN;
+   goto unlock;
+   }
wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
/* Disable PL1 limit and verify, because the limit cannot be disabled 
on all platforms */
@@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
 exit:
intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
+unlock:
mutex_unlock(&hwmon->hwmon_lock);
return ret;
 }
@@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int 
chan, long val)
}
 }
 
+void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
+{
+   struct i915_hwmon *hwmon = i915->hwmon;
+   u32 r;
+
+   if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+   return;
+
+   mutex_lock(&hwmon->hwmon_lock);
+
+   hwmon->ddat.reset_in_progress = true;
+   r = 

Re: [PATCH] drm/nouveau/fb: add missing sysmen flush callbacks

2023-04-10 Thread Ben Skeggs
On Wed, 5 Apr 2023 at 21:05, Karol Herbst  wrote:
>
> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/203
> Fixes: 5728d064190e1 ("drm/nouveau/fb: handle sysmem flush page from common 
> code")
> Signed-off-by: Karol Herbst 
Oops, that must've gotten lost in a rebase somehow.

Reviewed-by: Ben Skeggs 

> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf108.c | 1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.c | 1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c | 1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm107.c | 1 +
>  4 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf108.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf108.c
> index 76678dd60f93f..c4c6f67af7ccc 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf108.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf108.c
> @@ -31,6 +31,7 @@ gf108_fb = {
> .init = gf100_fb_init,
> .init_page = gf100_fb_init_page,
> .intr = gf100_fb_intr,
> +   .sysmem.flush_page_init = gf100_fb_sysmem_flush_page_init,
> .ram_new = gf108_ram_new,
> .default_bigpage = 17,
>  };
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.c
> index f73442ccb424b..433fa966ba231 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.c
> @@ -77,6 +77,7 @@ gk104_fb = {
> .init = gf100_fb_init,
> .init_page = gf100_fb_init_page,
> .intr = gf100_fb_intr,
> +   .sysmem.flush_page_init = gf100_fb_sysmem_flush_page_init,
> .ram_new = gk104_ram_new,
> .default_bigpage = 17,
> .clkgate_pack = gk104_fb_clkgate_pack,
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c
> index 45d6cdffafeed..4dc283dedf8b5 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c
> @@ -59,6 +59,7 @@ gk110_fb = {
> .init = gf100_fb_init,
> .init_page = gf100_fb_init_page,
> .intr = gf100_fb_intr,
> +   .sysmem.flush_page_init = gf100_fb_sysmem_flush_page_init,
> .ram_new = gk104_ram_new,
> .default_bigpage = 17,
> .clkgate_pack = gk110_fb_clkgate_pack,
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm107.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm107.c
> index de52462a92bf0..90bfff616d35b 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm107.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm107.c
> @@ -31,6 +31,7 @@ gm107_fb = {
> .init = gf100_fb_init,
> .init_page = gf100_fb_init_page,
> .intr = gf100_fb_intr,
> +   .sysmem.flush_page_init = gf100_fb_sysmem_flush_page_init,
> .ram_new = gm107_ram_new,
> .default_bigpage = 17,
>  };
> --
> 2.39.2
>


[PATCH v2 2/2] drm/msm: Add memory stats to fdinfo

2023-04-10 Thread Rob Clark
From: Rob Clark 

Use the new helper to export stats about memory usage.

v2: Drop unintended hunk

Signed-off-by: Rob Clark 
Reviewed-by: Emil Velikov 
---
 drivers/gpu/drm/msm/msm_drv.c | 25 -
 drivers/gpu/drm/msm/msm_gpu.c |  2 --
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 9b6f17b1261f..9acc3ebfbc08 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1043,17 +1043,40 @@ static const struct drm_ioctl_desc msm_ioctls[] = {
DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_QUERY, msm_ioctl_submitqueue_query, 
DRM_RENDER_ALLOW),
 };
 
+enum drm_gem_object_status gem_status(struct drm_gem_object *obj)
+{
+   struct msm_gem_object *msm_obj = to_msm_bo(obj);
+   enum drm_gem_object_status status = 0;
+
+   if (!dma_resv_test_signaled(obj->resv, dma_resv_usage_rw(true)))
+   status |= DRM_GEM_OBJECT_ACTIVE;
+
+   if (msm_obj->pages)
+   status |= DRM_GEM_OBJECT_RESIDENT;
+
+   if (msm_obj->madv == MSM_MADV_DONTNEED)
+   status |= DRM_GEM_OBJECT_PURGEABLE;
+
+   return status;
+}
+
 static void msm_fop_show_fdinfo(struct seq_file *m, struct file *f)
 {
struct drm_file *file = f->private_data;
struct drm_device *dev = file->minor->dev;
struct msm_drm_private *priv = dev->dev_private;
+   struct msm_file_private *ctx = file->driver_priv;
struct drm_printer p = drm_seq_file_printer(m);
 
if (!priv->gpu)
return;
 
-   msm_gpu_show_fdinfo(priv->gpu, file->driver_priv, &p);
+   drm_printf(&p, "drm-driver:\t%s\n", dev->driver->name);
+   drm_printf(&p, "drm-client-id:\t%u\n", ctx->seqno);
+
+   msm_gpu_show_fdinfo(priv->gpu, ctx, &p);
+
+   drm_print_memory_stats(file, &p, gem_status);
 }
 
 static const struct file_operations fops = {
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 26ebda40be4f..c403912d13ab 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -151,8 +151,6 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
 void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
 struct drm_printer *p)
 {
-   drm_printf(p, "drm-driver:\t%s\n", gpu->dev->driver->name);
-   drm_printf(p, "drm-client-id:\t%u\n", ctx->seqno);
drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
-- 
2.39.2



[PATCH v2 0/2] drm: fdinfo memory stats

2023-04-10 Thread Rob Clark
From: Rob Clark 

Similar motivation to another recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

[1] https://patchwork.freedesktop.org/series/112397/
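
For illustration, with both patches applied a DRM fd's fdinfo file
(/proc/<pid>/fdinfo/<fd>) would carry key/value pairs along these lines;
the key names come from this series, the values below are made up:

    drm-driver:             msm
    drm-client-id:          4
    drm-engine-gpu:         123456789 ns
    drm-cycles-gpu:         9876543
    drm-maxfreq-gpu:        800000000 Hz
    drm-shared-memory:      16 MiB
    drm-private-memory:     400 MiB
    drm-resident-memory:    416 MiB
    drm-purgeable-memory:   512 KiB
    drm-active-memory:      64 MiB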

Rob Clark (2):
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo

 Documentation/gpu/drm-usage-stats.rst | 21 +++
 drivers/gpu/drm/drm_file.c| 79 +++
 drivers/gpu/drm/msm/msm_drv.c | 25 -
 drivers/gpu/drm/msm/msm_gpu.c |  2 -
 include/drm/drm_file.h| 10 
 5 files changed, 134 insertions(+), 3 deletions(-)

-- 
2.39.2



[PATCH v2 1/2] drm: Add fdinfo memory stats

2023-04-10 Thread Rob Clark
From: Rob Clark 

Add a helper to dump memory stats to fdinfo.  For the things the drm
core isn't aware of, use a callback.

v2: Fix typos, change size units to match docs, use div_u64

Signed-off-by: Rob Clark 
Reviewed-by: Emil Velikov 
---
 Documentation/gpu/drm-usage-stats.rst | 21 +++
 drivers/gpu/drm/drm_file.c| 79 +++
 include/drm/drm_file.h| 10 
 3 files changed, 110 insertions(+)

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index b46327356e80..b5e7802532ed 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -105,6 +105,27 @@ object belong to this client, in the respective memory 
region.
 Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
 indicating kibi- or mebi-bytes.
 
+- drm-shared-memory:  [KiB|MiB]
+
+The total size of buffers that are shared with another file (ie. have more
+than a single handle).
+
+- drm-private-memory:  [KiB|MiB]
+
+The total size of buffers that are not shared with another file.
+
+- drm-resident-memory:  [KiB|MiB]
+
+The total size of buffers that are resident in system memory.
+
+- drm-purgeable-memory:  [KiB|MiB]
+
+The total size of buffers that are purgeable.
+
+- drm-active-memory:  [KiB|MiB]
+
+The total size of buffers that are active on one or more rings.
+
 - drm-cycles- 
 
 Engine identifier string must be the same as the one specified in the
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index a51ff8cee049..085b01842a87 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "drm_crtc_internal.h"
@@ -868,6 +869,84 @@ void drm_send_event(struct drm_device *dev, struct 
drm_pending_event *e)
 }
 EXPORT_SYMBOL(drm_send_event);
 
+static void print_size(struct drm_printer *p, const char *stat, size_t sz)
+{
+   const char *units[] = {"", " KiB", " MiB"};
+   unsigned u;
+
+   for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
+   if (sz < SZ_1K)
+   break;
+   sz = div_u64(sz, SZ_1K);
+   }
+
+   drm_printf(p, "%s:\t%zu%s\n", stat, sz, units[u]);
+}
+
+/**
+ * drm_print_memory_stats - Helper to print standard fdinfo memory stats
+ * @file: the DRM file
+ * @p: the printer to print output to
+ * @status: callback to get driver tracked object status
+ *
+ * Helper to iterate over GEM objects with a handle allocated in the specified
+ * file.  The optional status callback can return additional object state which
+ * determines which stats the object is counted against.  The callback is 
called
+ * under table_lock.  Racing against object status change is "harmless", and 
the
+ * callback can expect to not race against object destruction.
+ */
+void drm_print_memory_stats(struct drm_file *file, struct drm_printer *p,
+   enum drm_gem_object_status (*status)(struct 
drm_gem_object *))
+{
+   struct drm_gem_object *obj;
+   struct {
+   size_t shared;
+   size_t private;
+   size_t resident;
+   size_t purgeable;
+   size_t active;
+   } size = {0};
+   int id;
+
+   spin_lock(&file->table_lock);
+   idr_for_each_entry (&file->object_idr, obj, id) {
+   enum drm_gem_object_status s = 0;
+
+   if (status)
+   s = status(obj);
+
+   if (obj->handle_count > 1) {
+   size.shared += obj->size;
+   } else {
+   size.private += obj->size;
+   }
+
+   if (s & DRM_GEM_OBJECT_RESIDENT) {
+   size.resident += obj->size;
+   s &= ~DRM_GEM_OBJECT_PURGEABLE;
+   }
+
+   if (s & DRM_GEM_OBJECT_ACTIVE) {
+   size.active += obj->size;
+   s &= ~DRM_GEM_OBJECT_PURGEABLE;
+   }
+
+   if (s & DRM_GEM_OBJECT_PURGEABLE)
+   size.purgeable += obj->size;
+   }
+   spin_unlock(&file->table_lock);
+
+   print_size(p, "drm-shared-memory", size.shared);
+   print_size(p, "drm-private-memory", size.private);
+
+   if (status) {
+   print_size(p, "drm-resident-memory", size.resident);
+   print_size(p, "drm-purgeable-memory", size.purgeable);
+   print_size(p, "drm-active-memory", size.active);
+   }
+}
+EXPORT_SYMBOL(drm_print_memory_stats);
+
 /**
  * mock_drm_getfile - Create a new struct file for the drm device
  * @minor: drm minor to wrap (e.g. #drm_device.primary)
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 0d1f853092ab..7bd8a1374f39 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -41,6 +41,7 @@
 struct dma_fence;
 struct drm_file;
 struct drm_device;
+struct 

Re: [PATCH] drm/msm/a6xx: don't set IO_PGTABLE_QUIRK_ARM_OUTER_WBWA with coherent SMMU

2023-04-10 Thread David Heidelberg

Thank you!

Tested-by: David Heidelberg 

On 10/04/2023 20:52, Dmitry Baryshkov wrote:

If the Adreno SMMU is dma-coherent, allocation will fail unless we
disable IO_PGTABLE_QUIRK_ARM_OUTER_WBWA. Skip setting this quirk for the
coherent SMMUs (like we have on sm8350 platform).

Fixes: 54af0ceb7595 ("arm64: dts: qcom: sm8350: add GPU, GMU, GPU CC and SMMU 
nodes")
Reported-by: David Heidelberg 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 2942d2548ce6..f74495dcbd96 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1793,7 +1793,8 @@ a6xx_create_address_space(struct msm_gpu *gpu, struct 
platform_device *pdev)
 * This allows GPU to set the bus attributes required to use system
 * cache on behalf of the iommu page table walker.
 */
-   if (!IS_ERR_OR_NULL(a6xx_gpu->htw_llc_slice))
+   if (!IS_ERR_OR_NULL(a6xx_gpu->htw_llc_slice) &&
+   !device_iommu_capable(&pdev->dev, IOMMU_CAP_CACHE_COHERENCY))
quirks |= IO_PGTABLE_QUIRK_ARM_OUTER_WBWA;
  
  	return adreno_iommu_create_address_space(gpu, pdev, quirks);


--
David Heidelberg
Consultant Software Engineer



Re: [RFC PATCH] drm/msm/a5xx: really check for A510 in a5xx_gpu_init

2023-04-10 Thread Dmitry Baryshkov
On Mon, 10 Apr 2023 at 22:28, Stephen Boyd  wrote:
>
> Quoting Dmitry Baryshkov (2023-04-08 18:13:29)
> > The commit 010c8bbad2cb ("drm: msm: adreno: Disable preemption on Adreno
> > 510") added special handling for a510 (this SKU doesn't seem to support
> > preemption, so the driver should clamp nr_rings to 1). However the
> > gpu->revn is not yet set (it is set later, in adreno_gpu_init()) and
> > thus the condition is always false. Check config->rev instead.
> >
> > Fixes: 010c8bbad2cb ("drm: msm: adreno: Disable preemption on Adreno 510")
> > Reported-by: Adam Skladowski 
> > Signed-off-by: Dmitry Baryshkov 
> > ---
>
> Maybe as a followup you can put a WARN_ON_ONCE() inside a new function
> that gets gpu->revn and warns if the value is 0?

Sounds like a good idea.

-- 
With best wishes
Dmitry


Re: [RFC PATCH] drm/msm/a5xx: really check for A510 in a5xx_gpu_init

2023-04-10 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2023-04-08 18:13:29)
> The commit 010c8bbad2cb ("drm: msm: adreno: Disable preemption on Adreno
> 510") added special handling for a510 (this SKU doesn't seem to support
> preemption, so the driver should clamp nr_rings to 1). However the
> gpu->revn is not yet set (it is set later, in adreno_gpu_init()) and
> thus the condition is always false. Check config->rev instead.
>
> Fixes: 010c8bbad2cb ("drm: msm: adreno: Disable preemption on Adreno 510")
> Reported-by: Adam Skladowski 
> Signed-off-by: Dmitry Baryshkov 
> ---

Maybe as a followup you can put a WARN_ON_ONCE() inside a new function
that gets gpu->revn and warns if the value is 0?


[PATCH 1/2] drm/i915: Dump error capture to kernel log

2023-04-10 Thread John . C . Harrison
From: John Harrison 

This is useful for getting debug information out in certain
situations, such as failing kernel selftests and CI runs that don't
log error captures. It is especially useful for things like retrieving
GuC logs as GuC operation can't be tracked by adding printk or ftrace
entries.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 130 ++
 drivers/gpu/drm/i915/i915_gpu_error.h |   8 ++
 2 files changed, 138 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index f020c0086fbcd..500fec34188a0 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -2219,3 +2219,133 @@ void i915_disable_error_state(struct drm_i915_private 
*i915, int err)
i915->gpu_error.first_error = ERR_PTR(err);
spin_unlock_irq(&i915->gpu_error.lock);
 }
+
+void intel_klog_error_capture(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask)
+{
+   static int g_count;
+   struct drm_i915_private *i915 = gt->i915;
+   struct i915_gpu_coredump *error;
+   intel_wakeref_t wakeref;
+   size_t buf_size = PAGE_SIZE * 128;
+   size_t pos_err;
+   char *buf, *ptr, *next;
+   int l_count = g_count++;
+   int line = 0;
+
+   /* Can't allocate memory during a reset */
+   if (test_bit(I915_RESET_BACKOFF, &gt->reset.flags)) {
+   drm_err(&gt->i915->drm, "[Capture/%d.%d] Inside GT reset, 
skipping error capture :(\n",
+   l_count, line++);
+   return;
+   }
+
+   error = READ_ONCE(i915->gpu_error.first_error);
+   if (error) {
+   drm_err(&i915->drm, "[Capture/%d.%d] Clearing existing error 
capture first...\n",
+   l_count, line++);
+   i915_reset_error_state(i915);
+   }
+
+   with_intel_runtime_pm(&i915->runtime_pm, wakeref)
+   error = i915_gpu_coredump(gt, engine_mask, CORE_DUMP_FLAG_NONE);
+
+   if (IS_ERR(error)) {
+   drm_err(&i915->drm, "[Capture/%d.%d] Failed to capture error 
capture: %ld!\n",
+   l_count, line++, PTR_ERR(error));
+   return;
+   }
+
+   buf = kvmalloc(buf_size, GFP_KERNEL);
+   if (!buf) {
+   drm_err(&i915->drm, "[Capture/%d.%d] Failed to allocate buffer 
for error capture!\n",
+   l_count, line++);
+   i915_gpu_coredump_put(error);
+   return;
+   }
+
+   drm_info(&i915->drm, "[Capture/%d.%d] Dumping i915 error capture for 
%ps...\n",
+l_count, line++, __builtin_return_address(0));
+
+   /* Largest string length safe to print via dmesg */
+#  define MAX_CHUNK800
+
+   pos_err = 0;
+   while (1) {
+   ssize_t got = i915_gpu_coredump_copy_to_buffer(error, buf, 
pos_err, buf_size - 1);
+
+   if (got <= 0)
+   break;
+
+   buf[got] = 0;
+   pos_err += got;
+
+   ptr = buf;
+   while (got > 0) {
+   size_t count;
+   char tag[2];
+
+   next = strnchr(ptr, got, '\n');
+   if (next) {
+   count = next - ptr;
+   *next = 0;
+   tag[0] = '>';
+   tag[1] = '<';
+   } else {
+   count = got;
+   tag[0] = '}';
+   tag[1] = '{';
+   }
+
+   if (count > MAX_CHUNK) {
+   size_t pos;
+   char *ptr2 = ptr;
+
+   for (pos = MAX_CHUNK; pos < count; pos += 
MAX_CHUNK) {
+   char chr = ptr[pos];
+
+   ptr[pos] = 0;
+   drm_info(&i915->drm, "[Capture/%d.%d] 
}%s{\n",
+l_count, line++, ptr2);
+   ptr[pos] = chr;
+   ptr2 = ptr + pos;
+
+   /*
+* If spewing large amounts of data via 
a serial console,
+* this can be a very slow process. So 
be friendly and try
+* not to cause 'softlockup on CPU' 
problems.
+*/
+   cond_resched();
+   }
+
+   if (ptr2 < (ptr + count))
+   drm_info(&i915->drm, "[Capture/%d.%d] 
%c%s%c\n",
+l_count, line++, tag[0], ptr2, 
tag[1]);
+ 

[PATCH 2/2] drm/i915/guc: Dump error capture to dmesg on CTB error

2023-04-10 Thread John . C . Harrison
From: John Harrison 

In the past, there have been sporadic CTB failures which proved hard
to reproduce manually. The most effective solution was to dump the GuC
log at the point of failure and let the CI system do the repro. It is
preferable not to dump the GuC log via dmesg for all issues as it is
not always necessary and is not helpful for end users. But rather than
trying to re-invent the code to do this each time it is wanted, commit
the code but for DEBUG_GUC builds only.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 53 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 +++
 2 files changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed648..66a1818a3485f 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -13,6 +13,30 @@
 #include "intel_guc_ct.h"
 #include "intel_guc_print.h"
 
+#if defined(CONFIG_DRM_I915_DEBUG_GUC)
+enum {
+   CT_DEAD_ALIVE = 0,
+   CT_DEAD_SETUP,
+   CT_DEAD_WRITE,
+   CT_DEAD_DEADLOCK,
+   CT_DEAD_H2G_HAS_ROOM,
+   CT_DEAD_READ,
+   CT_DEAD_PROCESS_FAILED,
+};
+
+static void ct_dead_ct_worker_func(struct work_struct *w);
+
+#define CT_DEAD(ct, reason)\
+   do { \
+   if (!(ct)->dead_ct_reported) { \
+   (ct)->dead_ct_reason |= 1 << CT_DEAD_##reason; \
+   queue_work(system_unbound_wq, &(ct)->dead_ct_worker); \
+   } \
+   } while (0)
+#else
+#define CT_DEAD(ct, reason)do { } while (0)
+#endif
+
 static inline struct intel_guc *ct_to_guc(struct intel_guc_ct *ct)
 {
return container_of(ct, struct intel_guc, ct);
@@ -93,6 +117,9 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct)
spin_lock_init(&ct->requests.lock);
INIT_LIST_HEAD(&ct->requests.pending);
INIT_LIST_HEAD(&ct->requests.incoming);
+#if defined(CONFIG_DRM_I915_DEBUG_GUC)
+   INIT_WORK(&ct->dead_ct_worker, ct_dead_ct_worker_func);
+#endif
INIT_WORK(&ct->requests.worker, ct_incoming_request_worker_func);
tasklet_setup(&ct->receive_tasklet, ct_receive_tasklet_func);
init_waitqueue_head(&ct->wq);
@@ -319,11 +346,16 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
 
ct->enabled = true;
ct->stall_time = KTIME_MAX;
+#if defined(CONFIG_DRM_I915_DEBUG_GUC)
+   ct->dead_ct_reported = false;
+   ct->dead_ct_reason = CT_DEAD_ALIVE;
+#endif
 
return 0;
 
 err_out:
CT_PROBE_ERROR(ct, "Failed to enable CTB (%pe)\n", ERR_PTR(err));
+   CT_DEAD(ct, SETUP);
return err;
 }
 
@@ -434,6 +466,7 @@ static int ct_write(struct intel_guc_ct *ct,
 corrupted:
CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n",
 desc->head, desc->tail, desc->status);
+   CT_DEAD(ct, WRITE);
ctb->broken = true;
return -EPIPE;
 }
@@ -504,6 +537,7 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
CT_ERROR(ct, "Head: %u\n (Dwords)", ct->ctbs.recv.desc->head);
CT_ERROR(ct, "Tail: %u\n (Dwords)", ct->ctbs.recv.desc->tail);
 
+   CT_DEAD(ct, DEADLOCK);
ct->ctbs.send.broken = true;
}
 
@@ -552,6 +586,7 @@ static inline bool h2g_has_room(struct intel_guc_ct *ct, 
u32 len_dw)
 head, ctb->size);
desc->status |= GUC_CTB_STATUS_OVERFLOW;
ctb->broken = true;
+   CT_DEAD(ct, H2G_HAS_ROOM);
return false;
}
 
@@ -908,6 +943,7 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n",
 desc->head, desc->tail, desc->status);
ctb->broken = true;
+   CT_DEAD(ct, READ);
return -EPIPE;
 }
 
@@ -1057,6 +1093,7 @@ static bool ct_process_incoming_requests(struct 
intel_guc_ct *ct)
if (unlikely(err)) {
CT_ERROR(ct, "Failed to process CT message (%pe) %*ph\n",
 ERR_PTR(err), 4 * request->size, request->msg);
+   CT_DEAD(ct, PROCESS_FAILED);
ct_free_msg(request);
}
 
@@ -1233,3 +1270,19 @@ void intel_guc_ct_print_info(struct intel_guc_ct *ct,
drm_printf(p, "Tail: %u\n",
   ct->ctbs.recv.desc->tail);
 }
+
+#if defined(CONFIG_DRM_I915_DEBUG_GUC)
+static void ct_dead_ct_worker_func(struct work_struct *w)
+{
+   struct intel_guc_ct *ct = container_of(w, struct intel_guc_ct, 
dead_ct_worker);
+   struct intel_guc *guc = ct_to_guc(ct);
+
+   if (ct->dead_ct_reported)
+   return;
+
+   ct->dead_ct_reported = true;
+
+   guc_info(guc, "CTB is dead - reason=0x%X\n", ct->dead_ct_reason);
+   intel_klog_error_capture(guc_to_gt(guc), (intel_engine_mask_t)~0U);
+}
+#endif
diff --git 

[PATCH 0/2] Add support for dumping error captures via kernel logging

2023-04-10 Thread John . C . Harrison
From: John Harrison 

Sometimes, the only effective way to debug an issue is to dump all the
interesting information at the point of failure. So add support for
doing that.

Signed-off-by: John Harrison 


John Harrison (2):
  drm/i915: Dump error capture to kernel log
  drm/i915/guc: Dump error capture to dmesg on CTB error

 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  53 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   6 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 130 ++
 drivers/gpu/drm/i915/i915_gpu_error.h |   8 ++
 4 files changed, 197 insertions(+)

-- 
2.39.1



Re: [PATCH] drm/msm/adreno: fix sparse warnings in a6xx code

2023-04-10 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2023-04-06 18:07:41)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> index 0bc3eb443fec..84d345af126f 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> @@ -51,8 +51,8 @@ struct a6xx_gmu {
>
> struct msm_gem_address_space *aspace;
>
> -   void * __iomem mmio;
> -   void * __iomem rscc;
> +   void __iomem * mmio;
> +   void __iomem * rscc;

Should stick that * to the member name.

void __iomem *rscc;

with that

Reviewed-by: Stephen Boyd 


Re: [RFC 1/2] drm: Add fdinfo memory stats

2023-04-10 Thread Rob Clark
On Sat, Apr 8, 2023 at 5:20 AM Emil Velikov  wrote:
>
> Hey Rob,
>
> On Thu, 6 Apr 2023 at 22:59, Rob Clark  wrote:
>
> > +- drm-purgeable-memory:  [KiB|MiB]
> > +
> > +The total size of buffers that are purgable.
>
> s/purgable/purgeable/
>
>
> > +static void print_size(struct drm_printer *p, const char *stat, size_t sz)
> > +{
> > +   const char *units[] = {"B", "KiB", "MiB", "GiB"};
>
> The documentation says:
>
> > Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> > indicating kibi- or mebi-bytes.
>
> So I would drop the B and/or update the documentation to mention B && GiB.
>
> > +   unsigned u;
> > +
> > +   for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
> > +   if (sz < SZ_1K)
> > +   break;
> > +   sz /= SZ_1K;
>
> IIRC size_t can be 64bit, so we should probably use do_div() here.
>
> > +   }
> > +
> > +   drm_printf(p, "%s:\t%lu %s\n", stat, sz, units[u]);
> > +}
> > +
> > +/**
> > + * drm_print_memory_stats - Helper to print standard fdinfo memory stats
> > + * @file: the DRM file
> > + * @p: the printer to print output to
> > + * @status: callback to get driver tracked object status
> > + *
> > + * Helper to iterate over GEM objects with a handle allocated in the 
> > specified
> > + * file.  The optional status callback can return additional object state 
> > which
>
> s/return additional/return an additional/

"an" reads funny to me, as the state is plural (bitmask).. but agreed
on the other things

> > + * determines which stats the object is counted against.  The callback is 
> > called
> > + * under table_lock.  Racing against object status change is "harmless", 
> > and the
> > + * callback can expect to not race against object destroy.
>
> s/destroy/destruction/
>
> > + */
> > +void drm_print_memory_stats(struct drm_file *file, struct drm_printer *p,
> > +   enum drm_gem_object_status (*status)(struct 
> > drm_gem_object *))
> > +{
>
> > +   if (s & DRM_GEM_OBJECT_RESIDENT) {
> > +   size.resident += obj->size;
> > +   s &= ~DRM_GEM_OBJECT_PURGEABLE;
>
> Is MSM capable of marking the object as both purgeable and resident or
> is this to catch other drivers? Should we add a note to the
> documentation above - resident memory cannot be purgeable

It is just to simplify drivers so they don't have to repeat this
logic.  Ie. an object can be marked purgeable while it is still active
(so it will be eventually purgeable when it becomes idle).  Likewise
it doesn't make sense to count an object that has already been purged
(is no longer resident) as purgeable.

BR,
-R

> > +   }
> > +
> > +   if (s & DRM_GEM_OBJECT_ACTIVE) {
> > +   size.active += obj->size;
> > +   s &= ~DRM_GEM_OBJECT_PURGEABLE;
>
> Ditto.
>
> With the above nits, the patch is:
> Reviewed-by: Emil Velikov 
>
> HTH
> Emil


[PATCH] drm/msm/a6xx: don't set IO_PGTABLE_QUIRK_ARM_OUTER_WBWA with coherent SMMU

2023-04-10 Thread Dmitry Baryshkov
If the Adreno SMMU is dma-coherent, allocation will fail unless we
disable IO_PGTABLE_QUIRK_ARM_OUTER_WBWA. Skip setting this quirk for the
coherent SMMUs (like we have on sm8350 platform).

Fixes: 54af0ceb7595 ("arm64: dts: qcom: sm8350: add GPU, GMU, GPU CC and SMMU 
nodes")
Reported-by: David Heidelberg 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 2942d2548ce6..f74495dcbd96 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1793,7 +1793,8 @@ a6xx_create_address_space(struct msm_gpu *gpu, struct 
platform_device *pdev)
 * This allows GPU to set the bus attributes required to use system
 * cache on behalf of the iommu page table walker.
 */
-   if (!IS_ERR_OR_NULL(a6xx_gpu->htw_llc_slice))
+   if (!IS_ERR_OR_NULL(a6xx_gpu->htw_llc_slice) &&
+   !device_iommu_capable(&pdev->dev, IOMMU_CAP_CACHE_COHERENCY))
quirks |= IO_PGTABLE_QUIRK_ARM_OUTER_WBWA;
 
return adreno_iommu_create_address_space(gpu, pdev, quirks);
-- 
2.39.2



Re: [PATCH 2/2] vkms: Add support for multiple connectors

2023-04-10 Thread Melissa Wen
On 04/06, Marius Vlad wrote:
> Hi Thomas,
> 
> Thanks for the clarifications! Made a couple of remarks in-line.
> 
> On 4/6/23 14:29, Thomas Zimmermann wrote:
> > Hi
> > 
> > Am 06.04.23 um 11:17 schrieb Marius Vlad:
> > > Hi Maira,
> > > 
> > > Thanks a lot for taking a look. Yeah, indeed, this creates a new
> > > connector, a CRTC and planes for it. Terminology wise, multiple
> > > pipes. The idea is to have multiple outputs, each with its own
> > > connector, as to be able to simulate (a few) more outputs. At least
> > > that's what I'm looking for.
> > > 
> > > I'll adjust the commits description to clarify that.
> > > 
> > > Regarding the planes, it seemed a bit easier to just create a new
> > > tuple of planes for each output, as in to reuse vkms_output_init().
> > > So I guess that you're suggestion would be to make use the existing
> > > planes.
> > > 
> > > Seems a bit more work, not that keen on changing that, but I do have
> > > some follow-up questions for my own curiosity in case I do this:
> > > 
> > > -  Don't I need an entire pipe (a primary plane, crtc, encoder,
> > > connector) to power up the end side of the sink (display)?
> > 
> > Yes, you need at least one full pipeline. I don't see how you'd get
> > something displayed otherwise.
> > 
> > > - Can the primary one be sufficient for multiple outputs?
> > 
> > We have no concept of "primary pipelines." The individual components
> > have index numbers. There's a primary plane attached to each CRTC, but
> > even that concept comes more from HW limitations and historical designs,
> > than fundamental requirements.
> 
> Right, meant the primary plane, rather than pipeline.
> 
> > 
> > For multiple outputs, you can attach multiple encoders/connectors to the
> > same CRTC. They will then all display the same screen at the same
> > display resolution
> Yeah, this kind of sounds like cloning to me, and would like to display
> different things at the same time, on different outputs, to me it sounds I
> need primary plane and a CRTC for each connector. Right?
> 
> > 
> > > - Can the overlay planes be shared or I need to
> > >    discard the ones that are already in-use by other outputs?
> > 
> > Even if your hardware planes support it, you cannot share planes among
> > CRTCs with DRM. At least I'm not aware how to. Each plane struct has a
> > designated CRTC struct. For most flexibility I'd have to match HW planes
> > and DRM planes dynamically. For example if you have 2 CRTCs that can
> > share 10 HW planes, you can allocate 10 DRM planes for each CRTC. The
> > atomic_check functions have to implement the mapping from hardware to
> > software plane and fail if no more hardware planes are available.
> > 
> > If you want to display the same screen on multiple CRTCs, it's possible
> > to share the DRM framebuffers among similar the planes.
> 
> Aham, understood, thanks again!

Some drivers allow overlay planes to move between CRTCs. We have briefly
discussed on IRC that it would be interesting to have this (even for all
plane types) in vkms for testing and validation, but in a next step. As
it's not included in this proposal here, I'd suggest you to include this
feature/improvement in the vkms TODO in your next version, to keep it in
our minds for future works.
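
(For the record, the usual knob for making a plane usable on more than one
CRTC is the possible_crtcs bitmask passed at plane init time; a rough sketch
with generic names, not existing vkms code:

    u32 possible_crtcs = BIT(0) | BIT(1);   /* plane usable on CRTC 0 or 1 */

    ret = drm_universal_plane_init(dev, plane, possible_crtcs,
                                   &funcs, formats, ARRAY_SIZE(formats),
                                   NULL, DRM_PLANE_TYPE_OVERLAY, NULL);

Whether an enabled plane can really be moved between those CRTCs in a single
commit is then still up to the driver's atomic checks.)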

BR,

Melissa

> 
> > 
> > Best regards
> > Thomas
> > 
> > > 
> > > I'll have a look at that writeback test as well see what's causing that
> > > NULL pointer deref.
> > > 
> > > 
> > > On 4/5/23 16:51, Maíra Canal wrote:
> > > > Hi Marius,
> > > > 
> > > > > This patch adds support for creating multiple virtual connectors, in
> > > > > case one would need it. Use module parameters to specify how many,
> > > > > defaulting to just one, allocating from the start, a maximum of 4
> > > > > possible outputs.
> > > > 
> > > > I got a bit confused by this description. The commit message
> > > > says that you
> > > > are creating multiple connectors, but it seems like you are
> > > > creating multiple
> > > > pipes. From what I could see, for each new connector created,
> > > > you are also
> > > > creating a new CRTC and new planes.
> > > > 
> > > > Moreover, if your real intention is to create multiple pipes, I
> > > > believe that
> > > > you don't really need to duplicate the planes as well.
> > > > 
> > > > I ran the VKMS CI [1] with your patches applied and, although
> > > > most of the
> > > > tests are now passing with multiple pipes, which is really nice,
> > > > the test
> > > > kms_writeback crashes with the following error:
> > > > 
> > > > [ 1997.244347] [IGT] kms_writeback: starting subtest
> > > > writeback-check-output
> > > > [ 1997.250673] BUG: kernel NULL pointer dereference, address:
> > > > 016c
> > > > [ 1997.250691] #PF: supervisor read access in kernel mode
> > > > [ 1997.250693] #PF: error_code(0x) - not-present page
> > > > [ 1997.250694] PGD 80001a877067 P4D 80001a877067 PUD
> > > > 1a872067 PMD 0
> > > > [ 1997.250697] Oops:  [#1] PREEMPT SMP 

[PATCH] drm/pl111: fix drm and dev leak when irq request failed.

2023-04-10 Thread Gencen Gan
From: Gan Gecen 

Smatch reports:
pl111_amba_probe() warn: missing unwind goto?

Code segment for dev_put is:
dev_put:
drm_dev_put(drm);
of_reserved_mem_device_release(dev);

When an error happens, jumping to dev_put releases the
drm and dev resources allocated or initialized earlier.
But when devm_request_irq() fails, the function returns
directly without releasing drm and dev, which may cause
a memory leak.

Fixes: 3f9d6bccff6c ("drm/pl111: fix drm and dev leak when irq request failed")
Signed-off-by: Gan Gecen 
---
 drivers/gpu/drm/pl111/pl111_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/pl111/pl111_drv.c 
b/drivers/gpu/drm/pl111/pl111_drv.c
index 00deba0b7271..52945f7e0aef 100644
--- a/drivers/gpu/drm/pl111/pl111_drv.c
+++ b/drivers/gpu/drm/pl111/pl111_drv.c
@@ -297,7 +297,7 @@ static int pl111_amba_probe(struct amba_device *amba_dev,
   variant->name, priv);
if (ret != 0) {
dev_err(dev, "%s failed irq %d\n", __func__, ret);
-   return ret;
+   goto dev_put;
}
 
ret = pl111_modeset_init(drm);
-- 
2.34.1


Re: [PATCH 1/2] drm: sun4i/dsi: factor out DSI PHY poweron and poweroff

2023-04-10 Thread kernel test robot
Hi Brandon,

kernel test robot noticed the following build errors:

[auto build test ERROR on sunxi/sunxi/for-next]
[also build test ERROR on linus/master v6.3-rc6 next-20230406]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Brandon-Cheo-Fusi/drm-sun4i-dsi-factor-out-DSI-PHY-poweron-and-poweroff/20230410-165257
base:   https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux.git 
sunxi/for-next
patch link:
https://lore.kernel.org/r/20230410084750.164016-2-fusibrandon13%40gmail.com
patch subject: [PATCH 1/2] drm: sun4i/dsi: factor out DSI PHY poweron and 
poweroff
config: ia64-allyesconfig 
(https://download.01.org/0day-ci/archive/20230411/202304110110.zlinpepn-...@intel.com/config)
compiler: ia64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/afa9cb6821e4527f07c10a777ea44e380b524858
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Brandon-Cheo-Fusi/drm-sun4i-dsi-factor-out-DSI-PHY-poweron-and-poweroff/20230410-165257
git checkout afa9cb6821e4527f07c10a777ea44e380b524858
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=ia64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=ia64 SHELL=/bin/bash drivers/gpu/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 
| Link: 
https://lore.kernel.org/oe-kbuild-all/202304110110.zlinpepn-...@intel.com/

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:843:27: error: initialization of 
>> 'void (*)(struct drm_encoder *)' from incompatible pointer type 'void 
>> (*)(struct drm_encoder *, struct drm_atomic_state *)' 
>> [-Werror=incompatible-pointer-types]
 843 | .enable = sun6i_dsi_encoder_enable,
 |   ^~~~
   drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:843:27: note: (near initialization 
for 'sun6i_dsi_enc_helper_funcs.enable')
   cc1: some warnings being treated as errors


vim +843 drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c

133add5b5ad42b Maxime Ripard 2018-04-04  841  
133add5b5ad42b Maxime Ripard 2018-04-04  842  static const struct 
drm_encoder_helper_funcs sun6i_dsi_enc_helper_funcs = {
133add5b5ad42b Maxime Ripard 2018-04-04 @843.enable = 
sun6i_dsi_encoder_enable,
133add5b5ad42b Maxime Ripard 2018-04-04  844  };
133add5b5ad42b Maxime Ripard 2018-04-04  845  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


Re: [PATCH v2] drm/vkms: add module parameter to set background color

2023-04-10 Thread Melissa Wen
On 04/10, Simon Ser wrote:
> I think this should be a KMS property instead of a module parameter.
> Is there a reason why this patch uses a module parameter? It breaks
> user-space expectations.

Hi Simon,

a KMS property is what we have on vkms TODO [1] and the module parameter
was Maíra's first step to open a discussion for this property [2].
AFAIK, we would need to create the KMS property first, but it seems
there isn't an userspace case/need to support this API change.
Do you know any valid use cases to support a bkg color property?

BR,

Melissa

[1] https://dri.freedesktop.org/docs/drm/gpu/vkms.html#add-plane-features
[2] https://lore.kernel.org/dri-devel/20230406172002.124456-1-mca...@igalia.com/


signature.asc
Description: PGP signature


Re: [PATCH v7 6/8] drm/i915/uapi/pxp: Fix UAPI spec comments and add GET_PARAM for PXP

2023-04-10 Thread Ceraolo Spurio, Daniele




On 4/6/2023 10:44 AM, Alan Previn wrote:

1. UAPI update:
Without actually changing backward compatible behavior, update
i915's drm-uapi comments that describe the possible error values
when creating a context with I915_CONTEXT_PARAM_PROTECTED_CONTENT.
Since the first merge of PXP support on ADL, i915 returns
-ENXIO if a dependency such as firmware or component driver
was yet to be loaded or returns -EIO if the creation attempt
failed when requested by the PXP firmware (specific firmware
error responses are reported in dmesg).

2. GET_PARAM for PXP:
Because of the additional firmware, component-driver and
initialization dependencies required on the MTL platform before a
PXP context can be created, a UMD calling for PXP creation as a
way to get-caps can take a long time. An actual real-world
customer stack has seen this happen in the 4-to-8 second range
after the kernel starts (which sees MESA's init appear in the
middle of this range as the compositor comes up). To avoid
unnecessary delays experienced by the UMD for get-caps purposes,
add a GET_PARAM for I915_PARAM_PXP_SUPPORT.

However, some failures can still occur after all the dependencies
are met (such as firmware init flow failure, bios configurations
or SOC fusing not allowing PXP enablement). Those scenarios will
only be known to user space when it attempts creating a PXP context.

With this change, large delays are only met by user-space processes
that explicitly need to create a PXP context and boot very early.
There is no way to avoid this today.
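
For illustration only, user space could probe the new parameter roughly as in
the sketch below; the open i915 fd, the headers and the error handling are
assumptions, only the I915_PARAM_PXP_STATUS value comes from the diff in this
patch:

#include <errno.h>
#include <sys/ioctl.h>
#include <i915_drm.h>		/* i915 uAPI header, e.g. from libdrm's include path */

/* Sketch only: 'fd' is an already-open i915 DRM fd. */
static int query_pxp_status(int fd)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_PXP_STATUS,
		.value = &value,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return -errno;	/* e.g. -ENODEV: PXP not supported here */

	/* 0: PXP may be available; creating a protected context confirms it */
	return value;
}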

Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/i915_getparam.c |  5 +
  include/uapi/drm/i915_drm.h  | 22 ++
  2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 2238e096c957..9729384f033f 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -5,6 +5,8 @@
  #include "gem/i915_gem_mman.h"
  #include "gt/intel_engine_user.h"
  
+#include "pxp/intel_pxp.h"

+
  #include "i915_cmd_parser.h"
  #include "i915_drv.h"
  #include "i915_getparam.h"
@@ -102,6 +104,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
if (value < 0)
return value;
break;
+   case I915_PARAM_PXP_STATUS:
+   value = intel_pxp_is_enabled(i915->pxp) ? 0 : -ENODEV;
+   break;
case I915_PARAM_MMAP_GTT_VERSION:
/* Though we've started our numbering from 1, and so class all
 * earlier versions as 0, in effect their value is undefined as
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index dba7c5a5b25e..0c1729bd911d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -771,6 +771,20 @@ typedef struct drm_i915_irq_wait {
   */
  #define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57
  
+/*

+ * Query the status of PXP support in i915.
+ *
+ * The query can fail in the following scenarios with the listed error codes:
+ *  -ENODEV = PXP support is not available on the GPU device or in the kernel
+ *due to missing component drivers or kernel configs.
+ * If the IOCTL is successful, the returned parameter will be set to one of the
+ * following values:
+ *   0 = PXP support maybe available but underlying SOC fusing, BIOS or 
firmware
+ *   configuration is unknown and a PXP-context-creation would be required
+ *   for final verification of feature availibility.


Would it be useful to add:

1 = PXP support is available

And start returning that after we've successfully created our first 
session? Not sure if userspace would use this though, since they still 
need to handle the 0 case anyway.
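
A hypothetical variant just to illustrate the idea, reusing the arb_is_valid
state from patch 5; not something this patch has to do:

static int pxp_status_param(struct drm_i915_private *i915)
{
	if (!intel_pxp_is_enabled(i915->pxp))
		return -ENODEV;

	/* 1 once the arb session has been created successfully, else 0 */
	return i915->pxp->arb_is_valid ? 1 : 0;
}
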
I'm also ok with this patch as-is, as long as you get an ack from the 
userspace drivers for this interface behavior:


Reviewed-by: Daniele Ceraolo Spurio 

Daniele


+ */
+#define I915_PARAM_PXP_STATUS   58
+
  /* Must be kept compact -- no holes and well documented */
  
  /**

@@ -2096,6 +2110,14 @@ struct drm_i915_gem_context_param {
   *
   * -ENODEV: feature not available
   * -EPERM: trying to mark a recoverable or not bannable context as protected
+ * -ENXIO: A dependency such as a component driver or firmware is not yet
+ * loaded and user space may attempt again. Depending on the device
+ * this error may be reported if the protected context creation is
+ * attempted very early from kernel start (numbers vary depending on
+ * system and kernel config):
+ *- ADL/RPL: up to 3 seconds
+ *- MTL: up to 8 seconds
+ * -EIO: The firmware did not succeed in creating the protected context.
   */
  #define I915_CONTEXT_PARAM_PROTECTED_CONTENT0xd
  /* Must be kept compact -- no holes and well documented */




Re: [PATCH v7 5/8] drm/i915/pxp: Add ARB session creation and cleanup

2023-04-10 Thread Ceraolo Spurio, Daniele




On 4/6/2023 10:44 AM, Alan Previn wrote:

Add MTL's function for ARB session creation using PXP firmware
version 4.3 ABI structure format.

Also add MTL's function for ARB session invalidation but this
reuses PXP firmware version 4.2 ABI structure format.

For both cases, the back-end gsccs functions that send messages
to the firmware inspect the GSC-CS-Mem-Header's pending-bit, which
means the GSC firmware is busy and we should retry.

Given the last hw requirement, let's also update the functions in the
front-end layer that wait for session creation or teardown
completion to use the new worst-case timeout periods.

Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c |  10 ++
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h |   1 +
  drivers/gpu/drm/i915/pxp/intel_pxp.c  |  28 +++-
  .../drm/i915/pxp/intel_pxp_cmd_interface_43.h |  21 +++
  drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c| 130 ++
  drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h|   8 +-
  drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  11 +-
  7 files changed, 202 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 1d9fdfb11268..85f720af2f75 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -13,6 +13,7 @@
  #define GSC_FW_STATUS_REG _MMIO(0x116C40)
  #define GSC_FW_CURRENT_STATE  REG_GENMASK(3, 0)
  #define   GSC_FW_CURRENT_STATE_RESET  0
+#define   GSC_FW_PROXY_STATE_NORMAL5
  #define GSC_FW_INIT_COMPLETE_BIT  REG_BIT(9)
  
  static bool gsc_is_in_reset(struct intel_uncore *uncore)

@@ -31,6 +32,15 @@ bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc)
return fw_status & GSC_FW_INIT_COMPLETE_BIT;
  }
  
+bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc)

+{
+   struct intel_uncore *uncore = gsc_uc_to_gt(gsc)->uncore;
+   u32 fw_status = intel_uncore_read(uncore, GSC_FW_STATUS_REG);
+
+   return REG_FIELD_GET(GSC_FW_CURRENT_STATE, fw_status) ==
+  GSC_FW_PROXY_STATE_NORMAL;
+}
+
  static int emit_gsc_fw_load(struct i915_request *rq, struct intel_gsc_uc *gsc)
  {
u32 offset = i915_ggtt_offset(gsc->local);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
index f4c1106bb2a9..fff8928218df 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
@@ -13,5 +13,6 @@ struct intel_uncore;
  
  int intel_gsc_uc_fw_upload(struct intel_gsc_uc *gsc);

  bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc);
+bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc);
  
  #endif

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 8949d4be7882..550457bbb034 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -291,6 +291,8 @@ static bool pxp_component_bound(struct intel_pxp *pxp)
  
  static int __pxp_global_teardown_final(struct intel_pxp *pxp)

  {
+   int timeout;
+
if (!pxp->arb_is_valid)
return 0;
/*
@@ -300,7 +302,12 @@ static int __pxp_global_teardown_final(struct intel_pxp 
*pxp)
intel_pxp_mark_termination_in_progress(pxp);
intel_pxp_terminate(pxp, false);
  
-	if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(250)))

+   if (HAS_ENGINE(pxp->ctrl_gt, GSC0))
+   timeout = GSCFW_MAX_ROUND_TRIP_LATENCY_MS;
+   else
+   timeout = 250;
+
+   if (!wait_for_completion_timeout(&pxp->termination, 
msecs_to_jiffies(timeout)))
return -ETIMEDOUT;
  
  	return 0;

@@ -308,6 +315,8 @@ static int __pxp_global_teardown_final(struct intel_pxp 
*pxp)
  
  static int __pxp_global_teardown_restart(struct intel_pxp *pxp)

  {
+   int timeout;
+
if (pxp->arb_is_valid)
return 0;
/*
@@ -316,7 +325,12 @@ static int __pxp_global_teardown_restart(struct intel_pxp 
*pxp)
 */
pxp_queue_termination(pxp);
  
-	if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(250)))

+   if (HAS_ENGINE(pxp->ctrl_gt, GSC0))
+   timeout = GSCFW_MAX_ROUND_TRIP_LATENCY_MS;
+   else
+   timeout = 250;
+
+   if (!wait_for_completion_timeout(&pxp->termination, 
msecs_to_jiffies(timeout)))
return -ETIMEDOUT;
  
  	return 0;

@@ -354,8 +368,14 @@ int intel_pxp_start(struct intel_pxp *pxp)
if (!intel_pxp_is_enabled(pxp))
return -ENODEV;
  
-	if (wait_for(pxp_component_bound(pxp), 250))

-   return -ENXIO;
+   if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) {
+   /* Use a larger 1 second timeout considering all the 
dependencies. */
+   if (wait_for(intel_pxp_gsccs_is_ready_for_sessions(pxp), 1000))
+   return 

[PATCH] drm/msm/a6xx: initialize GMU mutex earlier

2023-04-10 Thread Dmitry Baryshkov
Move GMU mutex initialization earlier to make sure that it is always
initialized. a6xx_destroy can be called from the failure path before
GMU initialization.

This fixes the following backtrace:

[ cut here ]
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 0 PID: 58 at kernel/locking/mutex.c:582 __mutex_lock+0x1ec/0x3d0
Modules linked in:
CPU: 0 PID: 58 Comm: kworker/u16:1 Not tainted 6.3.0-rc5-00155-g187c06436519 
#565
Hardware name: Qualcomm Technologies, Inc. SM8350 HDK (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 6045 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __mutex_lock+0x1ec/0x3d0
lr : __mutex_lock+0x1ec/0x3d0
sp : 88993620
x29: 88993620 x28: 0002 x27: 47b253c52800
x26: 01000606 x25: 47b240bb2810 x24: fff4
x23:  x22: c38bba15ac14 x21: 0002
x20: 88993690 x19: 47b2430cc668 x18: fffe98f0
x17: 6f74616c75676572 x16: 20796d6d75642067 x15: 0038
x14:  x13: c38bbba050b8 x12: 0666
x11: 0222 x10: c38bbba603e8 x9 : c38bbba050b8
x8 : efff x7 : c38bbba5d0b8 x6 : 0222
x5 : bff4 x4 : 4000f222 x3 : 
x2 :  x1 :  x0 : 47b240cb1880
Call trace:
 __mutex_lock+0x1ec/0x3d0
 mutex_lock_nested+0x2c/0x38
 a6xx_destroy+0xa0/0x138
 a6xx_gpu_init+0x41c/0x618
 adreno_bind+0x188/0x290
 component_bind_all+0x118/0x248
 msm_drm_bind+0x1c0/0x670
 try_to_bring_up_aggregate_device+0x164/0x1d0
 __component_add+0xa8/0x16c
 component_add+0x14/0x20
 dsi_dev_attach+0x20/0x2c
 dsi_host_attach+0x9c/0x144
 devm_mipi_dsi_attach+0x34/0xac
 lt9611uxc_attach_dsi.isra.0+0x84/0xfc
 lt9611uxc_probe+0x5b8/0x67c
 i2c_device_probe+0x1ac/0x358
 really_probe+0x148/0x2ac
 __driver_probe_device+0x78/0xe0
 driver_probe_device+0x3c/0x160
 __device_attach_driver+0xb8/0x138
 bus_for_each_drv+0x84/0xe0
 __device_attach+0x9c/0x188
 device_initial_probe+0x14/0x20
 bus_probe_device+0xac/0xb0
 deferred_probe_work_func+0x8c/0xc8
 process_one_work+0x2bc/0x594
 worker_thread+0x228/0x438
 kthread+0x108/0x10c
 ret_from_fork+0x10/0x20
irq event stamp: 299345
hardirqs last  enabled at (299345): [] 
put_cpu_partial+0x1c8/0x22c
hardirqs last disabled at (299344): [] 
put_cpu_partial+0x1c0/0x22c
softirqs last  enabled at (296752): [] _stext+0x434/0x4e8
softirqs last disabled at (296741): [] 
do_softirq+0x10/0x1c
---[ end trace  ]---

Fixes: 4cd15a3e8b36 ("drm/msm/a6xx: Make GPU destroy a bit safer")
Cc: Douglas Anderson 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 --
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 7f5bc73b2040..611311b65b16 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1514,8 +1514,6 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
if (!pdev)
return -ENODEV;
 
-   mutex_init(&gmu->lock);
-
	gmu->dev = &pdev->dev;
 
of_dma_configure(gmu->dev, node, true);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 6faea5049f76..2942d2548ce6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1998,6 +1998,8 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
	adreno_gpu = &a6xx_gpu->base;
	gpu = &adreno_gpu->base;
 
+   mutex_init(&a6xx_gpu->gmu.lock);
+
adreno_gpu->registers = NULL;
 
/*
-- 
2.39.2



Re: [PATCH v2] drm/vkms: add module parameter to set background color

2023-04-10 Thread Simon Ser
I think this should be a KMS property instead of a module parameter.
Is there a reason why this patch uses a module parameter? It breaks
user-space expectations.


Re: [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-10 Thread Matt Roper
On Fri, Apr 07, 2023 at 12:12:29AM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> On MTL, GT can no longer allocate on LLC - only the CPU can.
> This, along with addition of support for ADM/L4 cache calls a
> MOCS/PAT table update. Also defines PTE encode functions for
> MTL as it has different PAT index definition than previous
> platforms.

It might be best to keep the PTE encoding as a separate patch from the
MOCS/PAT tables.  It's a different enough topic that it probably
deserves a patch of its own.

> 
> BSpec: 44509, 45101, 44235
> 
> Cc: Matt Roper 
> Cc: Lucas De Marchi 
> Signed-off-by: Madhumitha Tolakanahalli Pradeep 
> 
> Signed-off-by: Aravind Iddamsetty 
> Signed-off-by: Fei Yang 
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c| 28 +
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h|  3 +
>  drivers/gpu/drm/i915/gt/intel_ggtt.c| 27 +
>  drivers/gpu/drm/i915/gt/intel_gtt.c | 23 +++-
>  drivers/gpu/drm/i915/gt/intel_gtt.h | 20 ++-
>  drivers/gpu/drm/i915/gt/intel_mocs.c| 76 +++--
>  drivers/gpu/drm/i915/gt/selftest_mocs.c |  2 +-
>  drivers/gpu/drm/i915/i915_pci.c |  1 +
>  8 files changed, 173 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 4daaa6f55668..df4073d32114 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>   return pte;
>  }
>  
> +static u64 mtl_pte_encode(dma_addr_t addr,
> +   enum i915_cache_level level,
> +   u32 flags)
> +{
> + gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
> +
> + if (unlikely(flags & PTE_READ_ONLY))
> + pte &= ~GEN8_PAGE_RW;
> +
> + if (flags & PTE_LM)
> + pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
> +
> + switch (level) {
> + case I915_CACHE_NONE:
> + pte |= GEN12_PPGTT_PTE_PAT1;
> + break;
> + case I915_CACHE_LLC:
> + case I915_CACHE_L3_LLC:
> + pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
> + break;
> + case I915_CACHE_WT:
> + pte |= GEN12_PPGTT_PTE_PAT0;
> + break;
> + }
> +
> + return pte;
> +}
> +
>  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
>  {
>   struct drm_i915_private *i915 = ppgtt->vm.i915;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> index f541d19264b4..6b8ce7f4d25a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>enum i915_cache_level level,
>u32 flags);
> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> + unsigned int pat_index,
> + u32 flags);
>  
>  #endif
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 3c7f1ed92f5b..4a16bfcde1de 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>   }
>  }
>  
> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> + enum i915_cache_level level,
> + u32 flags)
> +{
> + gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> +
> + GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
> +
> + if (flags & PTE_LM)
> + pte |= GEN12_GGTT_PTE_LM;
> +
> + switch (level) {
> + case I915_CACHE_NONE:
> + pte |= MTL_GGTT_PTE_PAT1;
> + break;
> + case I915_CACHE_LLC:
> + case I915_CACHE_L3_LLC:
> + pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
> + break;
> + case I915_CACHE_WT:
> + pte |= MTL_GGTT_PTE_PAT0;
> + break;
> + }
> +
> + return pte;
> +}
> +
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>enum i915_cache_level level,
>u32 flags)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
> b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 4f436ba7a3c8..1e1b34e22cf5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
>   }
>  }
>  
> +static void mtl_setup_private_ppat(struct intel_uncore *uncore)
> +{
> + intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
> +MTL_PPAT_L4_0_WB);
> + intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
> +MTL_PPAT_L4_1_WT);
> + intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
> +MTL_PPAT_L4_3_UC);
> + intel_uncore_write(uncore, 

Re: [PATCH v7 4/8] drm/i915/pxp: Add GSC-CS backend to send GSC fw messages

2023-04-10 Thread Ceraolo Spurio, Daniele




On 4/6/2023 10:44 AM, Alan Previn wrote:

Add GSC engine based method for sending PXP firmware packets
to the GSC firmware for MTL (and future) products.

Use the newly added helpers to populate the GSC-CS memory
header and send the message packet to the FW by dispatching
the GSC_HECI_CMD_PKT instruction on the GSC engine.

We use non-privileged batches for submission to the GSC engine,
which requires two buffers for the request:
  - a buffer for the HECI packet that contains PXP FW commands
  - a batch-buffer that contains the engine instruction for
sending the HECI packet to the GSC firmware.

Thus, add the allocation and freeing of these buffers in gsccs
init and fini.

The GSC-fw may reply to commands with a SUCCESS but with an
additional pending-bit set in the reply packet. This bit
means the GSC-FW is currently busy and the caller needs to
try again with the gsc_message_handle the fw returned. Thus,
add a wrapper to continuously retry send_message while
replaying the gsc_message_handle. Retries need to follow the
arch-spec count and delay until GSC-FW replies with the real
SUCCESS or timeout after that spec'd delay.

The GSC-fw requires a non-zero host_session_handle provided
by the caller to enable gsc_message_handle tracking. Thus,
allocate the host_session_handle at init and destroy it
at fini (the latter requiring an FYI to the gsc-firmware).

Signed-off-by: Alan Previn 
---
  .../i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h |   1 +
  .../drm/i915/pxp/intel_pxp_cmd_interface_43.h |   3 +
  drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c| 240 +-
  drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h|   4 +
  drivers/gpu/drm/i915/pxp/intel_pxp_types.h|   6 +
  5 files changed, 253 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h
index 3addce861854..e4d6662339e8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h
@@ -51,6 +51,7 @@ struct intel_gsc_mtl_header {
 */
u32 flags;
  #define GSC_OUTFLAG_MSG_PENDING 1
+#define GSC_INFLAG_MSG_CLEANUP BIT(1)


For consistency those should all be BIT() defines

  
  	u32 status;

  } __packed;





@@ -38,18 +248,46 @@ gsccs_allocate_execution_resource(struct intel_pxp *pxp)
if (!engine)
return -ENODEV;
  
+	/*

+* Now, allocate, pin and map two objects, one for the heci message 
packet
+* and another for the batch buffer we submit into GSC engine (that 
includes the packet).
+* NOTE: GSC-CS backend is currently only supported on MTL, so we 
allocate shmem.
+*/
+   err = gsccs_create_buffer(pxp->ctrl_gt, "Heci Packet",
+ 2 * PXP43_MAX_HECI_INOUT_SIZE,
+ &exec_res->pkt_vma, &exec_res->pkt_vaddr);
+   if (err)
+   return err;
+
+   err = gsccs_create_buffer(pxp->ctrl_gt, "Batch Buffer", PAGE_SIZE,
+ &exec_res->bb_vma, &exec_res->bb_vaddr);
+   if (err)
+   goto free_pkt;
+
/* Finally, create an intel_context to be used during the submission */
ce = intel_context_create(engine);
if (IS_ERR(ce)) {
drm_err(>i915->drm, "Failed creating gsccs backend ctx\n");
-   return PTR_ERR(ce);
+   err = PTR_ERR(ce);
+   goto free_batch;
}
  
  	i915_vm_put(ce->vm);

ce->vm = i915_vm_get(pxp->ctrl_gt->vm);
exec_res->ce = ce;
  
+	/* initialize host-session-handle (for all i915-to-gsc-firmware PXP cmds) */

+   get_random_bytes(&exec_res->host_session_handle, 
sizeof(exec_res->host_session_handle));


Thinking back at this, maybe a possible solution to avoid randomly 
generated clashing values is to check if any of the existing exec_res 
already uses the generated value. Not a blocker because we only have 1 
exec_res for now, so no chance of clashing.
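
Just to sketch the idea (the struct name, the resource array and the helper
are hypothetical; with the single exec_res of today this degenerates to the
current behaviour):

static u64 pick_unique_host_session_handle(struct gsccs_session_resources *res,
					   int num_res)
{
	u64 handle;
	bool clash;
	int i;

	do {
		get_random_bytes(&handle, sizeof(handle));
		clash = !handle;	/* treat zero as reserved */
		for (i = 0; i < num_res && !clash; i++)
			clash = (res[i].host_session_handle == handle);
	} while (clash);

	return handle;
}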


With the define fixed:

Reviewed-by: Daniele Ceraolo Spurio 

Daniele


+
return 0;
+
+free_batch:
+   i915_vma_unpin_and_release(&exec_res->bb_vma, I915_VMA_RELEASE_MAP);
+free_pkt:
+   i915_vma_unpin_and_release(&exec_res->pkt_vma, I915_VMA_RELEASE_MAP);
+   memset(exec_res, 0, sizeof(*exec_res));
+
+   return err;
  }
  
  void intel_pxp_gsccs_fini(struct intel_pxp *pxp)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
index 354ea9a8f940..bd1c028bc80f 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
@@ -10,6 +10,10 @@
  
  struct intel_pxp;
  
+#define GSC_REPLY_LATENCY_MS 200

+#define GSC_PENDING_RETRY_MAXCOUNT 40
+#define GSC_PENDING_RETRY_PAUSE_MS 50
+
  #ifdef CONFIG_DRM_I915_PXP
  void intel_pxp_gsccs_fini(struct intel_pxp *pxp);
  int intel_pxp_gsccs_init(struct intel_pxp *pxp);
diff --git 

RE: [PATCH 2/2] drm/bridge: adv7533: Add option to disable lane switching

2023-04-10 Thread Biju Das
Hi All,

I would like to drop this patch as [1] and [2] fix the issue.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpu/drm/bridge/adv7511/adv7533.c?h=next-20230406=9a0cdcd6649b76f0b7ceec0e55b0a718321e34d3

[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpu/drm/bridge/adv7511/adv7533.c?h=next-20230406=ee0285e13455fdbce5de315bdbe91b5f198a2a06

Cheers,
Biju

> -Original Message-
> From: Biju Das 
> Sent: Wednesday, March 9, 2022 3:11 PM
> To: Andrzej Hajda ; Neil Armstrong
> ; Robert Foss ; David
> Airlie ; Daniel Vetter 
> Cc: Biju Das ; Laurent Pinchart
> ; Jonas Karlman ; Jernej
> Skrabec ; Maxime Ripard ; Sam
> Ravnborg ; Sia Jee Heng ; Abhinav
> Kumar ; Nicolas Boichat ;
> dri-devel@lists.freedesktop.org; Geert Uytterhoeven
> ; Chris Paterson ;
> Biju Das ; Prabhakar Mahadev Lad
> ; linux-renesas-...@vger.kernel.org
> Subject: [PATCH 2/2] drm/bridge: adv7533: Add option to disable lane
> switching
> 
> On Renesas RZ/{G2L,V2L} platforms changing the lanes from 4 to 3 at lower
> frequencies causes display instability. On such platforms, it is better to
> avoid switching lanes from 4 to 3 as it needs different set of PLL parameter
> constraints to make the display stable with 3 lanes.
> 
> This patch adds an option to disable lane switching if 'adi,disable-lanes-
> override' property is present in DT.
> 
> Signed-off-by: Biju Das 
> Reviewed-by: Lad Prabhakar 
> ---
>  drivers/gpu/drm/bridge/adv7511/adv7511.h | 1 +
> drivers/gpu/drm/bridge/adv7511/adv7533.c | 5 -
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511.h
> b/drivers/gpu/drm/bridge/adv7511/adv7511.h
> index 592ecfcf00ca..7505601f10fb 100644
> --- a/drivers/gpu/drm/bridge/adv7511/adv7511.h
> +++ b/drivers/gpu/drm/bridge/adv7511/adv7511.h
> @@ -368,6 +368,7 @@ struct adv7511 {
>   struct mipi_dsi_device *dsi;
>   u8 num_dsi_lanes;
>   bool use_timing_gen;
> + bool override_lanes;
> 
>   enum adv7511_type type;
>   struct platform_device *audio_pdev;
> diff --git a/drivers/gpu/drm/bridge/adv7511/adv7533.c
> b/drivers/gpu/drm/bridge/adv7511/adv7533.c
> index eb7579dec40a..7f6a8e95d70e 100644
> --- a/drivers/gpu/drm/bridge/adv7511/adv7533.c
> +++ b/drivers/gpu/drm/bridge/adv7511/adv7533.c
> @@ -108,7 +108,7 @@ void adv7533_mode_set(struct adv7511 *adv, const struct
> drm_display_mode *mode)
>   if (adv->num_dsi_lanes != 4)
>   return;
> 
> - if (mode->clock > 8)
> + if (!adv->override_lanes || mode->clock > 8)
>   lanes = 4;
>   else
>   lanes = 3;
> @@ -195,6 +195,9 @@ int adv7533_parse_dt(struct device_node *np, struct
> adv7511 *adv)
>   adv->use_timing_gen = !of_property_read_bool(np,
>   "adi,disable-timing-generator");
> 
> + adv->override_lanes = !of_property_read_bool(np,
> + "adi,disable-lanes-override");
> +
>   /* TODO: Check if these need to be parsed by DT or not */
>   adv->rgb = true;
>   adv->embedded_sync = false;
> --
> 2.17.1



Re: [PATCH 1/2] drm: sun4i/dsi: factor out DSI PHY poweron and poweroff

2023-04-10 Thread kernel test robot
Hi Brandon,

kernel test robot noticed the following build errors:

[auto build test ERROR on sunxi/sunxi/for-next]
[also build test ERROR on linus/master v6.3-rc6 next-20230406]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Brandon-Cheo-Fusi/drm-sun4i-dsi-factor-out-DSI-PHY-poweron-and-poweroff/20230410-165257
base:   https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux.git 
sunxi/for-next
patch link:
https://lore.kernel.org/r/20230410084750.164016-2-fusibrandon13%40gmail.com
patch subject: [PATCH 1/2] drm: sun4i/dsi: factor out DSI PHY poweron and 
poweroff
config: arm64-randconfig-r002-20230410 
(https://download.01.org/0day-ci/archive/20230411/202304110053.n5nu03yz-...@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 
2c57868e2e877f73c339796c3374ae660bb77f0d)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# 
https://github.com/intel-lab-lkp/linux/commit/afa9cb6821e4527f07c10a777ea44e380b524858
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Brandon-Cheo-Fusi/drm-sun4i-dsi-factor-out-DSI-PHY-poweron-and-poweroff/20230410-165257
git checkout afa9cb6821e4527f07c10a777ea44e380b524858
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/gpu/drm/sun4i/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 
| Link: 
https://lore.kernel.org/oe-kbuild-all/202304110053.n5nu03yz-...@intel.com/

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:843:13: error: incompatible function 
>> pointer types initializing 'void (*)(struct drm_encoder *)' with an 
>> expression of type 'void (struct drm_encoder *, struct drm_atomic_state *)' 
>> [-Wincompatible-function-pointer-types]
   .enable = sun6i_dsi_encoder_enable,
 ^~~~
   1 error generated.


vim +843 drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c

133add5b5ad42b Maxime Ripard 2018-04-04  841  
133add5b5ad42b Maxime Ripard 2018-04-04  842  static const struct 
drm_encoder_helper_funcs sun6i_dsi_enc_helper_funcs = {
133add5b5ad42b Maxime Ripard 2018-04-04 @843.enable = 
sun6i_dsi_encoder_enable,
133add5b5ad42b Maxime Ripard 2018-04-04  844  };
133add5b5ad42b Maxime Ripard 2018-04-04  845  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


RE: [PATCH 1/2] dt-bindings: drm: bridge: adi,adv7533: Document adi,disable-lanes-override property

2023-04-10 Thread Biju Das
Hi All,

I would like to drop this patch as [1] and [2] fix the issue.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpu/drm/bridge/adv7511/adv7533.c?h=next-20230406=9a0cdcd6649b76f0b7ceec0e55b0a718321e34d3

[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpu/drm/bridge/adv7511/adv7533.c?h=next-20230406=ee0285e13455fdbce5de315bdbe91b5f198a2a06

Cheers,
Biju


> -Original Message-
> From: Biju Das 
> Sent: Wednesday, March 9, 2022 3:11 PM
> To: David Airlie ; Daniel Vetter ; Rob
> Herring 
> Cc: Biju Das ; Ricardo Cañuelo
> ; Laurent Pinchart
> ; dri-devel@lists.freedesktop.org;
> devicet...@vger.kernel.org; Geert Uytterhoeven ;
> Chris Paterson ; Biju Das
> ; Prabhakar Mahadev Lad  lad...@bp.renesas.com>; linux-renesas-...@vger.kernel.org
> Subject: [PATCH 1/2] dt-bindings: drm: bridge: adi,adv7533: Document
> adi,disable-lanes-override property
> 
> On Renesas RZ/{G2L,V2L} platforms changing the lanes from 4 to 3 at lower
> frequencies causes display instability. On such platforms, it is better to
> avoid switching lanes from 4 to 3 as it needs different set of PLL parameter
> constraints to make the display stable with 3 lanes.
> 
> This patch introduces 'adi,disable-lanes-override' property to disable lane
> switching at lower frequencies.
> 
> Signed-off-by: Biju Das 
> Reviewed-by: Lad Prabhakar 
> ---
>  .../devicetree/bindings/display/bridge/adi,adv7533.yaml  | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git
> a/Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml
> b/Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml
> index f36209137c8a..2dc378039d21 100644
> --- a/Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml
> +++ b/Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml
> @@ -84,6 +84,11 @@ properties:
>timings for HDMI output.
>  type: boolean
> 
> +  adi,disable-lanes-override:
> +description:
> +  Disables the overriding lanes at lower frequencies.
> +type: boolean
> +
>adi,dsi-lanes:
>  description: Number of DSI data lanes connected to the DSI host.
>  $ref: /schemas/types.yaml#/definitions/uint32
> --
> 2.17.1



Re: [PATCH v7 3/8] drm/i915/pxp: Add MTL helpers to submit Heci-Cmd-Packet to GSC

2023-04-10 Thread Ceraolo Spurio, Daniele




On 4/6/2023 10:44 AM, Alan Previn wrote:

Add helper functions into a new file for heci-packet-submission.
The helpers will handle generating the MTL GSC-CS Memory-Header
and submission of the Heci-Cmd-Packet instructions to the engine.

NOTE1: These common functions for heci-packet-submission will be used
by different i915 callers:
  1- GSC-SW-Proxy: This is pending upstream publication awaiting
 a few remaining opens
  2- MTL-HDCP: An equivalent patch has also been published at:
 https://patchwork.freedesktop.org/series/111876/. (Patch 1)
  3- PXP: This series.

NOTE2: A difference in this patch vs what is appearing in bullet 2
above is that HDCP (and SW-Proxy) will be using privileged submission
(GGTT and common gsc-uc-context) while PXP will be using non-privileged
PPGTT, context and batch buffer. Therefore this patch will only slightly
overlap with the MTL-HDCP patches despite having very similar function
names (emit_foo vs emit_nonpriv_foo). This is because HECI_CMD_PKT
instructions require different flows and hw-specific code when done
via PPGTT based submission (not different from other engines). MTL-HDCP
contains the same intel_gsc_mtl_header_t structures as this but the
helpers there are different. Both add the same new file names.

NOTE3: Additional clarity about the heci-cmd-pkt layout and where the
common helpers come in:
  - On MTL, when an i915 subsystem needs to send a command request
to the security firmware, it will send that via the GSC-
engine-command-streamer.
  - However those commands, (let's call them "gsc_specific_fw_api"
calls), are not understood by the GSC command streamer hw.
  - The GSC CS only looks at the GSC_HECI_CMD_PKT instruction and
passes it along to the GSC firmware.
  - The GSC FW on the other hand needs additional metadata to know
which usage service is being called (PXP, HDCP, proxy, etc) along
with session specific info. Thus an extra header called GSC-CS
HECI Memory Header, (C) in below diagram is prepended before
the FW specific API, (D).
  - Thus, the structural layout of the request submitted would
need to look like the diagram below (for non-priv PXP).
  - In the diagram, the common helper for HDCP, (GSC-Sw-Proxy) and
PXP (i.e. new function intel_gsc_uc_heci_cmd_emit_mtl_header)
will populate blob (C) while additional helpers, different for
PPGTT (this patch) vs GGTT (HDCP series) will populate
blobs (A) and (B) below.
   ___
  (A)  |  MI_BATCH_BUFFER_START (ppgtt, batchbuff-addr, ...) |
   | |   |
   |_|   |
   | (B)| GSC_HECI_CMD_PKT (pkt-addr-in, pkt-size-in,|   |
   ||   pkt-addr-out, pkt-size-out)  |
   || MI_BATCH_BUFFER_END|   |   |
   |||   |   |
   | |   |
   |_|   |
 |
 -
 |
\|/
   __V___
   |   _|
   |(C)|   ||
   |   | struct intel_gsc_mtl_header { ||
   |   |   validity marker ||
   |   |   heci_clent_id   ||
   |   |   ... ||
   |   |  }||
   |   |___||
   |(D)|   ||
   |   | struct gsc_fw_specific_api_foobar {   ||
   |   | ...   ||
   |   | For an example, see   ||
   |   | 'struct pxp43_create_arb_in' at   ||
   |   | intel_pxp_cmd_interface_43.h  ||
   |   |   ||
   |   | } ||
   |   |  Struture depends on command type ||
   |   | struct gsc_fw_specific_api_foobar {   ||
   |   |___||
   ||

That said, this patch provides basic helpers but leaves the
PXP subsystem (i.e. the caller) to handle (D) and everything
else such as input/output size verification or handling the
responses from security firmware (for example, requiring a retry).

Signed-off-by: Alan Previn 
---
  .../i915/gt/uc/intel_gsc_uc_heci_cmd_submit.c | 102 

[pull] drm/msm: drm-msm-next-2023-04-10 for v6.4

2023-04-10 Thread Rob Clark
Hi Dave,

This is the main pull for v6.4, see below for description.  A bit big
this time because of (1) generated header updates and (2) dpu hw
catalog rework which split the increasingly unwieldy
big-giant-file-of-tables into per-SoC files.  But those are mainly
mechanical churn.

The following changes since commit e752ab11dcb48353727ea26eefd740155e028865:

  Merge remote-tracking branch 'drm/drm-next' into msm-next
(2023-03-20 10:31:25 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/msm.git tags/drm-msm-next-2023-04-10

for you to fetch changes up to ac7e7c9c65ecfb1fcc99de91cfd6b17a8d4cb9c1:

  drm/msm/dpu: drop unused macros from hw catalog (2023-04-07 03:54:50 +0300)


main pull request for v6.4

Core Display:

* Bugfixes for error handling during probe
* rework UBWC decoder programming
* prepare_commit cleanup
* bindings for SM8550 (MDSS, DPU), SM8450 (DP)
* timeout calculation fixup
* atomic: use drm_crtc_next_vblank_start() instead of our own
  custom thing to calculate the start of next vblank

DP:
==
* interrupts cleanup

DPU:
===
* DSPP sub-block flush on sc7280
* support AR30 in addition to XR30 format
* Allow using REC_0 and REC_1 to handle wide (4k) RGB planes
* Split the HW catalog into individual per-SoC files

DSI:
===
* rework DSI instance ID detection on obscure platforms

GPU:
===
* uapi C++ compatibility fix
* a6xx: More robust gdsc reset
* a3xx and a4xx devfreq support
* update generated headers
* various cleanups and fixes
* GPU and GEM updates to avoid allocations which could trigger
  reclaim (shrinker) in fence signaling path
* dma-fence deadline hint support and wait-boost
* a640 speedbin support
* a650 speedbin support


Abhinav Kumar (3):
  MAINTAINERS: Update the URI for MSM DRM bugs
  drm/msm/dpu: log the multirect_index in _dpu_crtc_blend_setup_pipe
  drm/msm/dpu: remove unused dpu_plane_validate_multirect_v2 function

Adam Skladowski (1):
  drm: msm: adreno: Disable preemption on Adreno 510

Akhil P Oommen (3):
  drm/msm/a6xx: Vote for cx gdsc from gpu driver
  drm/msm/a6xx: Remove cx gdsc polling using 'reset'
  drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

Arnd Bergmann (1):
  drm/msm/a6xx: add CONFIG_PM dependency

Colin Ian King (2):
  drm/msm/mdss: Fix spelling mistake "Unuspported" -> "Unsupported"
  drm/msm/dp: Fix spelling mistake "Capabiity" -> "Capability"

Danylo Piliaiev (1):
  drm/msm: Rename drm_msm_gem_submit_reloc::or in C++ code

Dmitry Baryshkov (67):
  drm/msm/adreno: stall translation on fault for all GPU families
  drm/msm/adreno: split a6xx fault handler into generic and a6xx parts
  drm/msm/a5xx: add devcoredump support to the fault handler
  drm/msm/mdss: convert UBWC setup to use match data
  drm/msm/mdss: add data for sc8180xp
  drm/msm/mdss: add the sdm845 data for completeness
  drm/msm/dpu: rename struct dpu_hw_pipe(_cfg) to dpu_hw_sspp(_cfg)
  drm/msm/dpu: move SSPP allocation to the RM
  drm/msm/dpu: move SSPP debugfs creation to dpu_kms.c
  drm/msm/dpu: drop EAGAIN check from dpu_format_populate_layout
  drm/msm/dpu: move pipe_hw to dpu_plane_state
  drm/msm/dpu: drop dpu_plane_pipe function
  drm/msm/dpu: introduce struct dpu_sw_pipe
  drm/msm/dpu: use dpu_sw_pipe for dpu_hw_sspp callbacks
  drm/msm/dpu: pass dpu_format to _dpu_hw_sspp_setup_scaler3()
  drm/msm/dpu: clean up SRC addresses when setting up SSPP for solid fill
  drm/msm/dpu: move stride programming to dpu_hw_sspp_setup_sourceaddress
  drm/msm/dpu: remove dpu_hw_fmt_layout from struct dpu_hw_sspp_cfg
  drm/msm/dpu: rename dpu_hw_sspp_cfg to dpu_sw_pipe_cfg
  drm/msm/dpu: drop src_split and multirect check from dpu_crtc_atomic_check
  drm/msm/dpu: don't use unsupported blend stages
  drm/msm/dpu: move the rest of plane checks to dpu_plane_atomic_check()
  drm/msm/dpu: drop redundant plane dst check from dpu_crtc_atomic_check()
  drm/msm/dpu: rewrite plane's QoS-related functions to take
dpu_sw_pipe and dpu_format
  drm/msm/dpu: make _dpu_plane_calc_clk accept mode directly
  drm/msm/dpu: add dpu_hw_sspp_cfg to dpu_plane_state
  drm/msm/dpu: simplify dpu_plane_validate_src()
  drm/msm/dpu: rework dpu_plane_sspp_atomic_update()
  drm/msm/dpu: rework dpu_plane_atomic_check()
  drm/msm/dpu: rework plane CSC setting
  drm/msm/dpu: rework static color fill code
  drm/msm/dpu: split pipe handling from _dpu_crtc_blend_setup_mixer
  drm/msm/dpu: add support for wide planes
  drm/msm/dpu: populate SmartDMA features in hw catalog
  drm/msm/dpu: drop smart_dma_rev from dpu_caps
  Merge branch 'msm-next-lumag-dpu' into msm-next-lumag
  Merge branches 'msm-next-lumag-dp', 'msm-next-lumag-dsi',
'msm-next-lumag-mdp5' 

Re: [PATCH] drm/amd/display: set variables dml*_funcs storage-class-specifier to static

2023-04-10 Thread Hamza Mahfooz

On 4/8/23 09:43, Tom Rix wrote:

smatch reports
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:44:24: warning: 
symbol
   'dml20_funcs' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:51:24: warning: 
symbol
   'dml20v2_funcs' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:58:24: warning: 
symbol
   'dml21_funcs' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:65:24: warning: 
symbol
   'dml30_funcs' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:72:24: warning: 
symbol
   'dml31_funcs' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:79:24: warning: 
symbol
   'dml314_funcs' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:86:24: warning: 
symbol
   'dml32_funcs' was not declared. Should it be static?

These variables are only used in one file so should be static.
Cleanup whitespace, use tabs consistently for indents.

Signed-off-by: Tom Rix 


Applied, thanks!


---
  .../drm/amd/display/dc/dml/display_mode_lib.c | 24 +--
  1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c 
b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c
index 4125d3d111d1..bdf3ac6cadd5 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c
@@ -41,51 +41,51 @@
  #include "dcn32/display_rq_dlg_calc_32.h"
  #include "dml_logger.h"
  
-const struct dml_funcs dml20_funcs = {

+static const struct dml_funcs dml20_funcs = {
.validate = dml20_ModeSupportAndSystemConfigurationFull,
.recalculate = dml20_recalculate,
.rq_dlg_get_dlg_reg = dml20_rq_dlg_get_dlg_reg,
.rq_dlg_get_rq_reg = dml20_rq_dlg_get_rq_reg
  };
  
-const struct dml_funcs dml20v2_funcs = {

+static const struct dml_funcs dml20v2_funcs = {
.validate = dml20v2_ModeSupportAndSystemConfigurationFull,
.recalculate = dml20v2_recalculate,
.rq_dlg_get_dlg_reg = dml20v2_rq_dlg_get_dlg_reg,
.rq_dlg_get_rq_reg = dml20v2_rq_dlg_get_rq_reg
  };
  
-const struct dml_funcs dml21_funcs = {

-.validate = dml21_ModeSupportAndSystemConfigurationFull,
-.recalculate = dml21_recalculate,
-.rq_dlg_get_dlg_reg = dml21_rq_dlg_get_dlg_reg,
-.rq_dlg_get_rq_reg = dml21_rq_dlg_get_rq_reg
+static const struct dml_funcs dml21_funcs = {
+   .validate = dml21_ModeSupportAndSystemConfigurationFull,
+   .recalculate = dml21_recalculate,
+   .rq_dlg_get_dlg_reg = dml21_rq_dlg_get_dlg_reg,
+   .rq_dlg_get_rq_reg = dml21_rq_dlg_get_rq_reg
  };
  
-const struct dml_funcs dml30_funcs = {

+static const struct dml_funcs dml30_funcs = {
.validate = dml30_ModeSupportAndSystemConfigurationFull,
.recalculate = dml30_recalculate,
.rq_dlg_get_dlg_reg = dml30_rq_dlg_get_dlg_reg,
.rq_dlg_get_rq_reg = dml30_rq_dlg_get_rq_reg
  };
  
-const struct dml_funcs dml31_funcs = {

+static const struct dml_funcs dml31_funcs = {
.validate = dml31_ModeSupportAndSystemConfigurationFull,
.recalculate = dml31_recalculate,
.rq_dlg_get_dlg_reg = dml31_rq_dlg_get_dlg_reg,
.rq_dlg_get_rq_reg = dml31_rq_dlg_get_rq_reg
  };
  
-const struct dml_funcs dml314_funcs = {

+static const struct dml_funcs dml314_funcs = {
.validate = dml314_ModeSupportAndSystemConfigurationFull,
.recalculate = dml314_recalculate,
.rq_dlg_get_dlg_reg = dml314_rq_dlg_get_dlg_reg,
.rq_dlg_get_rq_reg = dml314_rq_dlg_get_rq_reg
  };
  
-const struct dml_funcs dml32_funcs = {

+static const struct dml_funcs dml32_funcs = {
.validate = dml32_ModeSupportAndSystemConfigurationFull,
-.recalculate = dml32_recalculate,
+   .recalculate = dml32_recalculate,
.rq_dlg_get_dlg_reg_v2 = dml32_rq_dlg_get_dlg_reg,
.rq_dlg_get_rq_reg_v2 = dml32_rq_dlg_get_rq_reg
  };

--
Hamza



Re: [PATCH] drm/amd/display: set variables aperture_default_system and context0_default_system storage-class-specifier to static

2023-04-10 Thread Hamza Mahfooz

On 4/6/23 15:58, Tom Rix wrote:

smatch reports
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hubp.c:758:10: warning: 
symbol
   'aperture_default_system' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hubp.c:759:10: warning: 
symbol
   'context0_default_system' was not declared. Should it be static?

These variables are only used in one file so should be static.

Signed-off-by: Tom Rix 


Applied, thanks!


---
  drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
index a142a00bc432..bf399819ca80 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
@@ -755,8 +755,8 @@ bool hubp1_is_flip_pending(struct hubp *hubp)
return false;
  }
  
-uint32_t aperture_default_system = 1;

-uint32_t context0_default_system; /* = 0;*/
+static uint32_t aperture_default_system = 1;
+static uint32_t context0_default_system; /* = 0;*/
  
  static void hubp1_set_vm_system_aperture_settings(struct hubp *hubp,

struct vm_system_aperture_param *apt)

--
Hamza



Re: [PATCH] drm/amd/display: set variable dcn3_14_soc storage-class-specifier to static

2023-04-10 Thread Hamza Mahfooz

On 4/6/23 14:44, Tom Rix wrote:

smatch reports
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/dcn314_fpu.c:100:37: 
warning: symbol
   'dcn3_14_soc' was not declared. Should it be static?

This variable is only used in one file so should be static.

Signed-off-by: Tom Rix 


Applied, thanks!


---
  drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
index c52b76610bd2..44082f65de1f 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
@@ -97,7 +97,7 @@ struct _vcs_dpi_ip_params_st dcn3_14_ip = {
.dcc_supported = true,
  };
  
-struct _vcs_dpi_soc_bounding_box_st dcn3_14_soc = {

+static struct _vcs_dpi_soc_bounding_box_st dcn3_14_soc = {
/*TODO: correct dispclk/dppclk voltage level determination*/
.clock_limits = {
{

--
Hamza



Re: [PATCH] drm/amd/display: remove unused matching_stream_ptrs variable

2023-04-10 Thread Hamza Mahfooz

On 3/25/23 09:45, Tom Rix wrote:

clang with W=1 reports
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_enc_cfg.c:625:6: error:
   variable 'matching_stream_ptrs' set but not used 
[-Werror,-Wunused-but-set-variable]
 int matching_stream_ptrs = 0;
 ^
This variable is not used so remove it.

Signed-off-by: Tom Rix 


Applied, thanks!


---
  drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c
index 41198c729d90..30c0644d4418 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c
@@ -622,7 +622,6 @@ bool link_enc_cfg_validate(struct dc *dc, struct dc_state 
*state)
int i, j;
uint8_t valid_count = 0;
uint8_t dig_stream_count = 0;
-   int matching_stream_ptrs = 0;
int eng_ids_per_ep_id[MAX_PIPES] = {0};
int ep_ids_per_eng_id[MAX_PIPES] = {0};
int valid_bitmap = 0;
@@ -645,9 +644,7 @@ bool link_enc_cfg_validate(struct dc *dc, struct dc_state 
*state)
struct link_enc_assignment assignment = 
state->res_ctx.link_enc_cfg_ctx.link_enc_assignments[i];
  
  		if (assignment.valid) {

-   if (assignment.stream == state->streams[i])
-   matching_stream_ptrs++;
-   else
+   if (assignment.stream != state->streams[i])
valid_stream_ptrs = false;
}
}

--
Hamza



[PATCH v2] drm/vkms: add module parameter to set background color

2023-04-10 Thread Maíra Canal
After commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to
accept new formats") the composition is no longer performed on top of
the primary plane, but instead on top of the CRTC, which means that
now we have a background.

This opens up the possibility of coloring the background with a
personalized color. Therefore, create a module parameter that takes an
unsigned int number as an XRGB8888 color and set the background
color to it. The composition will then be performed on top of
this background color. By default, the background color is black.

Signed-off-by: Maíra Canal 
---

In order to test this functionality, I wrote some IGT tests to ensure that
it is working correctly [1]. The tests take the CRC of a colored primary
plane, offset the primary plane out of the screen, and take the CRC
of the colored background. The two CRCs must be equal.
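
As a side note on the conversion (illustrative only, not part of the patch):
each 8-bit channel of the XRGB8888 value is expanded to the 16-bit range used
by struct pixel_argb_u16 by multiplying by 257, since 0xff * 257 == 0xffff.
For example, background_color=0x0000ff ends up as .r = 0, .g = 0, .b = 0xffff,
i.e. a fully saturated blue. A hypothetical helper mirroring the expansion in
the blend() hunk below:

static u16 expand_channel(u32 xrgb, unsigned int shift)
{
	return ((xrgb >> shift) & 0xff) * 257;
}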

v1 -> v2 [2]

* Use XRGB8888 as input format to avoid compilation errors on PPC (kernel test 
robot)

[1] 
https://gitlab.freedesktop.org/mairacanal/igt-gpu-tools/-/tree/vkms/background-color
[2] 
https://lore.kernel.org/dri-devel/20230406172002.124456-1-mca...@igalia.com/T/

Best Regards,
- Maíra Canal

---
 Documentation/gpu/vkms.rst   |  2 --
 drivers/gpu/drm/vkms/vkms_composer.c | 20 ++--
 drivers/gpu/drm/vkms/vkms_drv.c  |  6 ++
 drivers/gpu/drm/vkms/vkms_drv.h  |  4 
 4 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/Documentation/gpu/vkms.rst b/Documentation/gpu/vkms.rst
index 49db221c0f52..dc01689d8c76 100644
--- a/Documentation/gpu/vkms.rst
+++ b/Documentation/gpu/vkms.rst
@@ -121,8 +121,6 @@ There's lots of plane features we could add support for:
 - ARGB format on primary plane: blend the primary plane into background with
   translucent alpha.
 
-- Add background color KMS property[Good to get started].
-
 - Full alpha blending on all planes.
 
 - Rotation, scaling.
diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index 8e53fa80742b..b0056fad908e 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -79,7 +79,8 @@ static void fill_background(const struct pixel_argb_u16 
*background_color,
  * from all planes, calculates the crc32 of the output from the former step,
  * and, if necessary, convert and store the output to the writeback buffer.
  */
-static void blend(struct vkms_writeback_job *wb,
+static void blend(struct vkms_device *vkms_dev,
+ struct vkms_writeback_job *wb,
  struct vkms_crtc_state *crtc_state,
  u32 *crc32, struct line_buffer *stage_buffer,
  struct line_buffer *output_buffer, size_t row_size)
@@ -87,7 +88,12 @@ static void blend(struct vkms_writeback_job *wb,
struct vkms_plane_state **plane = crtc_state->active_planes;
u32 n_active_planes = crtc_state->num_active_planes;
 
-   const struct pixel_argb_u16 background_color = { .a = 0xffff };
+   const struct pixel_argb_u16 background_color = {
+   .a =  0xffff,
+   .r = ((*vkms_dev->config->background_color >> 16) & 0xff) * 257,
+   .g = ((*vkms_dev->config->background_color >> 8) & 0xff) * 257,
+   .b = (*vkms_dev->config->background_color & 0xff) * 257,
+   };
 
size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
 
@@ -139,7 +145,8 @@ static int check_iosys_map(struct vkms_crtc_state 
*crtc_state)
return 0;
 }
 
-static int compose_active_planes(struct vkms_writeback_job *active_wb,
+static int compose_active_planes(struct vkms_device *vkms_dev,
+struct vkms_writeback_job *active_wb,
 struct vkms_crtc_state *crtc_state,
 u32 *crc32)
 {
@@ -178,7 +185,7 @@ static int compose_active_planes(struct vkms_writeback_job 
*active_wb,
goto free_stage_buffer;
}
 
-   blend(active_wb, crtc_state, crc32, &stage_buffer,
+   blend(vkms_dev, active_wb, crtc_state, crc32, &stage_buffer,
  &output_buffer, line_width * pixel_size);
 
kvfree(output_buffer.pixels);
@@ -205,6 +212,7 @@ void vkms_composer_worker(struct work_struct *work)
struct drm_crtc *crtc = crtc_state->base.crtc;
struct vkms_writeback_job *active_wb = crtc_state->active_writeback;
struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
+   struct vkms_device *vkms_dev = vkms_output_to_vkms_device(out);
bool crc_pending, wb_pending;
u64 frame_start, frame_end;
u32 crc32 = 0;
@@ -228,9 +236,9 @@ void vkms_composer_worker(struct work_struct *work)
return;
 
if (wb_pending)
-   ret = compose_active_planes(active_wb, crtc_state, &crc32);
+   ret = compose_active_planes(vkms_dev, active_wb, crtc_state, 
&crc32);
else
-   ret = compose_active_planes(NULL, crtc_state, &crc32);
+   ret = 

[git pull] habanalabs for drm-next-6.4

2023-04-10 Thread Oded Gabbay
Hi Dave, Daniel.

An additional pull request for 6.4.

Mostly bug fixes and cleanups.

Full details are in the signed tag.

Thanks,
Oded

The following changes since commit 4d877b1a6e855d1c8685fa0e27ad7a521b31b6ca:

  Merge tag 'drm-intel-next-2023-04-06' of 
git://anongit.freedesktop.org/drm/drm-intel into drm-next (2023-04-06 16:31:33 
+0200)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux.git 
tags/drm-habanalabs-next-2023-04-10

for you to fetch changes up to 56499c461589634f2c89ffbd9cfb78268191d349:

  accel/habanalabs: add missing error flow in hl_sysfs_init() (2023-04-08 
10:44:23 +0300)


This tag contains additional habanalabs driver changes for v6.4:

- uAPI changes:
  - Add a definition of a new Gaudi2 server type. This is used by userspace
to know what is the connectivity between the accelerators inside the
server

- New features and improvements:
  - speedup h/w queues test in Gaudi2 to reduce device initialization times.

- Firmware related fixes:
  - Fixes to the handshake protocol during f/w initialization.
  - Sync f/w events interrupt in hard reset to avoid warning message.
  - Improvements to extraction of the firmware version.

- Misc bug fixes and code cleanups. Notable fixes are:
  - Multiple fixes for interrupt handling in Gaudi2.
  - Unmap mapped memory in case TLB invalidation fails.


Cai Huoqing (1):
  accel/habanalabs: Remove redundant pci_clear_master

Dafna Hirschfeld (2):
  accel/habanalabs: check return value of add_va_block_locked
  accel/habanalabs: improvements to FW ver extraction

Dani Liberman (2):
  accel/habanalabs: fix access error clear event
  accel/habanalabs: fix handling of arc farm sei event

Koby Elbaz (3):
  accel/habanalabs: unmap mapped memory when TLB inv fails
  accel/habanalabs: change COMMS warning messages to error level
  accel/habanalabs: don't wait for STS_OK after sending COMMS WFE

Moti Haimovski (1):
  accel/habanalabs: speedup h/w queues test in Gaudi2

Oded Gabbay (1):
  accel/habanalabs/uapi: new Gaudi2 server type

Ofir Bitton (5):
  accel/habanalabs: fix HBM MMU interrupt handling
  accel/habanalabs: print raw binning masks in debug level
  accel/habanalabs: fix wrong reset and event flags
  accel/habanalabs: fixes for unexpected error interrupt
  accel/habanalabs: remove Gaudi1 multi MSI code

Tal Cohen (4):
  accel/habanalabs: print event type when device is disabled
  accel/habanalabs: remove duplicated disable pci msg
  accel/habanalabs: send disable pci when compute ctx is active
  accel/habanalabs: sync f/w events interrupt in hard reset

Tomer Tayar (3):
  accel/habanalabs: remove completion from abnormal interrupt work name
  accel/habanalabs: fix events mask of decoder abnormal interrupts
  accel/habanalabs: add missing error flow in hl_sysfs_init()

 drivers/accel/habanalabs/common/command_buffer.c   |  15 +-
 drivers/accel/habanalabs/common/decoder.c  |  40 ++-
 drivers/accel/habanalabs/common/device.c   |  54 ++--
 drivers/accel/habanalabs/common/firmware_if.c  |  17 +-
 drivers/accel/habanalabs/common/habanalabs.h   |  14 +-
 drivers/accel/habanalabs/common/irq.c  |  11 +-
 drivers/accel/habanalabs/common/memory.c   |  11 +-
 drivers/accel/habanalabs/common/mmu/mmu.c  |   8 +-
 drivers/accel/habanalabs/common/pci/pci.c  |   2 -
 drivers/accel/habanalabs/common/sysfs.c|   6 +-
 drivers/accel/habanalabs/gaudi/gaudi.c |  86 +
 drivers/accel/habanalabs/gaudi/gaudiP.h|  15 -
 drivers/accel/habanalabs/gaudi2/gaudi2.c   | 347 +++--
 drivers/accel/habanalabs/gaudi2/gaudi2P.h  |  17 +
 drivers/accel/habanalabs/goya/goya.c   |   1 +
 .../include/gaudi2/asic_reg/gaudi2_regs.h  |   4 +-
 include/uapi/drm/habanalabs_accel.h|   3 +-
 17 files changed, 382 insertions(+), 269 deletions(-)


Re: [PATCH 3/5] drm: shmobile: Switch to drm_crtc_init_with_planes()

2023-04-10 Thread Laurent Pinchart
Hi Geert,

On Mon, Apr 10, 2023 at 11:35:56AM +0200, Geert Uytterhoeven wrote:
> On Wed, Apr 5, 2023 at 5:59 AM Laurent Pinchart wrote:
> > On Fri, Mar 31, 2023 at 04:48:09PM +0200, Geert Uytterhoeven wrote:
> > > The SH-Mobile DRM driver uses the legacy drm_crtc_init(), which
> > > advertizes only the formats in safe_modeset_formats[] (XR24 and AR24) as
> > > being supported.
> > >
> > > Switch to drm_crtc_init_with_planes(), and advertize all supported
> > > (A)RGB modes, so we can use RGB565 as the default mode for the console.
> > >
> > > Signed-off-by: Geert Uytterhoeven 
> 
> > > --- a/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
> > > +++ b/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
> > > @@ -18,6 +18,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -478,16 +479,41 @@ static const struct drm_crtc_funcs crtc_funcs = {
> > >   .disable_vblank = shmob_drm_disable_vblank,
> > >  };
> > >
> > > +static const uint32_t modeset_formats[] = {
> > > + DRM_FORMAT_RGB565,
> > > + DRM_FORMAT_RGB888,
> > > + DRM_FORMAT_ARGB8888,
> > > + DRM_FORMAT_XRGB8888,
> > > +};
> > > +
> > > +static const struct drm_plane_funcs primary_plane_funcs = {
> > > + DRM_PLANE_NON_ATOMIC_FUNCS,
> > > +};
> > > +
> > >  int shmob_drm_crtc_create(struct shmob_drm_device *sdev)
> > >  {
> > >   struct drm_crtc *crtc = &sdev->crtc.crtc;
> > > + struct drm_plane *primary;
> > >   int ret;
> > >
> > >   sdev->crtc.dpms = DRM_MODE_DPMS_OFF;
> > >
> > > - ret = drm_crtc_init(sdev->ddev, crtc, &crtc_funcs);
> > > - if (ret < 0)
> > > + primary = __drm_universal_plane_alloc(sdev->ddev, sizeof(*primary), 
> > > 0,
> > > +   0, &primary_plane_funcs,
> > > +   modeset_formats,
> > > +   ARRAY_SIZE(modeset_formats),
> > > +   NULL, DRM_PLANE_TYPE_PRIMARY,
> > > +   NULL);
> > > + if (IS_ERR(primary))
> > > + return PTR_ERR(primary);
> >
> > This seems like a bit of a hack to me. Why don't you use the planes
> 
> I'm following what Thomas did in the nouveau driver
> 
> > created by shmob_drm_plane_create() instead of allocating a new one ?
> 
> Is that possible? shmob_drm_plane_create() creates overlay planes,
> while this is for the primary plane.

You're right, for some reason I overlooked that. Sorry for the noise.

It would be good to move handling of the primary plane to
shmob_drm_plane.c, but that's for later, when moving the driver to the
atomic API. For now,

Reviewed-by: Laurent Pinchart 

> > > +
> > > + ret = drm_crtc_init_with_planes(sdev->ddev, crtc, primary, NULL,
> > > + &crtc_funcs, NULL);
> > > + if (ret < 0) {
> > > + drm_plane_cleanup(primary);
> > > + kfree(primary);
> > >   return ret;
> > > + }
> > >
> > >   drm_crtc_helper_add(crtc, &crtc_helper_funcs);

-- 
Regards,

Laurent Pinchart
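
For comparison, a common alternative to allocating the primary plane
separately is to embed it in a driver structure and register it with
drm_universal_plane_init(). A minimal sketch, reusing the modeset_formats,
primary_plane_funcs and crtc_funcs tables from the patch and a hypothetical
demo_crtc embedding struct (not the shmob code):

#include <linux/kernel.h>
#include <drm/drm_crtc.h>
#include <drm/drm_plane.h>

/* Hypothetical embedding of the CRTC and its primary plane. */
struct demo_crtc {
	struct drm_crtc crtc;
	struct drm_plane primary;
};

static int demo_crtc_create(struct drm_device *dev, struct demo_crtc *dcrtc)
{
	int ret;

	/* Register the primary plane with the formats from the patch. */
	ret = drm_universal_plane_init(dev, &dcrtc->primary, 0,
				       &primary_plane_funcs,
				       modeset_formats,
				       ARRAY_SIZE(modeset_formats),
				       NULL, DRM_PLANE_TYPE_PRIMARY, NULL);
	if (ret)
		return ret;

	/* Tie the CRTC to its primary plane; no cursor plane here. */
	return drm_crtc_init_with_planes(dev, &dcrtc->crtc, &dcrtc->primary,
					 NULL, &crtc_funcs, NULL);
}

With the plane embedded, no separate kfree() is needed on the error path;
drm_plane_cleanup() alone unregisters it.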


Re: [PATCH 30/68] hwmon: lochnagar: constify pointers to hwmon_channel_info

2023-04-10 Thread Charles Keepax
On Thu, Apr 06, 2023 at 10:30:25PM +0200, Krzysztof Kozlowski wrote:
> Statically allocated array of pointers to hwmon_channel_info can be made
> const for safety.
> 
> Signed-off-by: Krzysztof Kozlowski 
> ---

Acked-by: Charles Keepax 

Thanks,
Charles
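
For readers outside hwmon, the change being acked boils down to the
following pattern (illustrative channel list, not the lochnagar driver):

#include <linux/hwmon.h>

/* Before: the pointed-to hwmon_channel_info is const, the array is not. */
static const struct hwmon_channel_info *demo_info[] = {
	HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT),
	NULL
};

/* After: the array of pointers is const as well, so it can live in .rodata. */
static const struct hwmon_channel_info * const demo_info_const[] = {
	HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT),
	NULL
};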


Re: [PATCH 5/5] drm: shmobile: Make DRM_SHMOBILE visible on Renesas SoC platforms

2023-04-10 Thread Geert Uytterhoeven
Hi Laurent,

On Wed, Apr 5, 2023 at 6:02 AM Laurent Pinchart
 wrote:
> On Fri, Mar 31, 2023 at 04:48:11PM +0200, Geert Uytterhoeven wrote:
> > The LCD Controller supported by the drm-shmob driver is not only present
> > on SuperH SH-Mobile SoCs, but also on Renesas ARM SH/R-Mobile SoCs.
> > Make its option visible, so the user can enable support for it.
> >
> > Signed-off-by: Geert Uytterhoeven 
> > ---
> >  drivers/gpu/drm/shmobile/Kconfig | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/shmobile/Kconfig 
> > b/drivers/gpu/drm/shmobile/Kconfig
> > index 4ec5dc74a6b0b880..719d4e7a5cd75aad 100644
> > --- a/drivers/gpu/drm/shmobile/Kconfig
> > +++ b/drivers/gpu/drm/shmobile/Kconfig
> > @@ -2,7 +2,7 @@
> >  config DRM_SHMOBILE
> >   tristate "DRM Support for SH Mobile"
> >   depends on DRM && ARM
>
> There shouldn't be anything ARM-dependent, could you drop "&& ARM" while
> at it ?

Oops, that was added back in 2014, when the driver stopped building on SH.
The build issue seems to be fixed, so I'll drop the dependency on ARM.

> Reviewed-by: Laurent Pinchart 

Thanks!

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 3/5] drm: shmobile: Switch to drm_crtc_init_with_planes()

2023-04-10 Thread Geert Uytterhoeven
Hi Laurent,

Thanks for your comments!

On Wed, Apr 5, 2023 at 5:59 AM Laurent Pinchart
 wrote:
> On Fri, Mar 31, 2023 at 04:48:09PM +0200, Geert Uytterhoeven wrote:
> > The SH-Mobile DRM driver uses the legacy drm_crtc_init(), which
> > advertizes only the formats in safe_modeset_formats[] (XR24 and AR24) as
> > being supported.
> >
> > Switch to drm_crtc_init_with_planes(), and advertize all supported
> > (A)RGB modes, so we can use RGB565 as the default mode for the console.
> >
> > Signed-off-by: Geert Uytterhoeven 

> > --- a/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
> > +++ b/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
> > @@ -18,6 +18,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -478,16 +479,41 @@ static const struct drm_crtc_funcs crtc_funcs = {
> >   .disable_vblank = shmob_drm_disable_vblank,
> >  };
> >
> > +static const uint32_t modeset_formats[] = {
> > + DRM_FORMAT_RGB565,
> > + DRM_FORMAT_RGB888,
> > + DRM_FORMAT_ARGB8888,
> > + DRM_FORMAT_XRGB8888,
> > +};
> > +
> > +static const struct drm_plane_funcs primary_plane_funcs = {
> > + DRM_PLANE_NON_ATOMIC_FUNCS,
> > +};
> > +
> >  int shmob_drm_crtc_create(struct shmob_drm_device *sdev)
> >  {
> >   struct drm_crtc *crtc = &sdev->crtc.crtc;
> > + struct drm_plane *primary;
> >   int ret;
> >
> >   sdev->crtc.dpms = DRM_MODE_DPMS_OFF;
> >
> > - ret = drm_crtc_init(sdev->ddev, crtc, &crtc_funcs);
> > - if (ret < 0)
> > + primary = __drm_universal_plane_alloc(sdev->ddev, sizeof(*primary), 0,
> > +   0, &primary_plane_funcs,
> > +   modeset_formats,
> > +   ARRAY_SIZE(modeset_formats),
> > +   NULL, DRM_PLANE_TYPE_PRIMARY,
> > +   NULL);
> > + if (IS_ERR(primary))
> > + return PTR_ERR(primary);
>
> This seems like a bit of a hack to me. Why don't you use the planes

I'm following what Thomas did in the nouveau driver

> created by shmob_drm_plane_create() instead of allocating a new one ?

Is that possible? shmob_drm_plane_create() creates overlay planes,
while this is for the primary plane.

>
> > +
> > + ret = drm_crtc_init_with_planes(sdev->ddev, crtc, primary, NULL,
> > + &crtc_funcs, NULL);
> > + if (ret < 0) {
> > + drm_plane_cleanup(primary);
> > + kfree(primary);
> >   return ret;
> > + }
> >
> >   drm_crtc_helper_add(crtc, &crtc_helper_funcs);

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-04-10 Thread Zhao Liu
Thanks all for your review!

On Fri, Mar 31, 2023 at 05:32:17PM +0200, Fabio M. De Francesco wrote:
> Date: Fri, 31 Mar 2023 17:32:17 +0200
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH v2 9/9] drm/i915: Use kmap_local_page() in
>  gem/i915_gem_execbuffer.c
> 
> On Friday, 31 March 2023 13:30:20 CEST Tvrtko Ursulin wrote:
> > On 31/03/2023 05:18, Ira Weiny wrote:
> 

[snip]

>  
> > However I am unsure if disabling pagefaulting is needed or not. Thomas,
> > Matt, being the last to touch this area, perhaps you could have a look?
> > Because I notice we have a fallback iomap path which still uses
> > io_mapping_map_atomic_wc. So if kmap_atomic to kmap_local conversion is
> > safe, does the iomap side also needs converting to
> > io_mapping_map_local_wc? Or they have separate requirements?
> 
> AFAIK, the requirements for io_mapping_map_local_wc() are the same as for 
> kmap_local_page(): the kernel virtual address is _only_ valid in the caller 
> context, and map/unmap nesting must be done in stack-based ordering (LIFO).
> 
> I think a follow up patch could safely switch to io_mapping_map_local_wc() / 
> io_mapping_unmap_local_wc since the address is local to context.
> 
> However, not being an expert, reading your note now I suspect that I'm 
> missing 
> something. Can I ask why you think that page-faults disabling might be 
> necessary? 


Could you please say more about why page-fault disabling might be needed
here? :-)

From the previous discussions and the commit history I couldn't find any
relevant information, and I lack the background knowledge on this...

If there is a reason to disable page faults, I will fix it and send a
refreshed version.

Thanks,
Zhao

> 
> Thanks,
> 
> Fabio
> 
> > Regards,
> > 
> > Tvrtko
> 
> 
> 
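
For context, the rule Fabio describes for kmap_local_page() and
io_mapping_map_local_wc() can be shown with a minimal sketch: mappings are
only valid in the acquiring context and must be released in reverse (LIFO)
order. The pages here are hypothetical, not the execbuffer code:

#include <linux/highmem.h>
#include <linux/string.h>

/* Copy between two pages using strictly nested local mappings. */
static void demo_copy_page(struct page *dst, struct page *src)
{
	void *src_va = kmap_local_page(src);
	void *dst_va = kmap_local_page(dst);

	memcpy(dst_va, src_va, PAGE_SIZE);

	kunmap_local(dst_va);	/* mapped last, unmapped first */
	kunmap_local(src_va);
}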


Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation

2023-04-10 Thread Jordan Justen
On 2023-04-05 13:26:43, Jordan Justen wrote:
> On 2023-04-05 00:45:24, Lionel Landwerlin wrote:
> > On 04/04/2023 19:04, Yang, Fei wrote:
> > >> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache 
> > >> at BO creation
> > >>
> > >> Just like the protected content uAPI, there is no way for userspace to 
> > >> tell
> > >> this feature is available other than trying using it.
> > >>
> > >> Given the issues with protected content, is it not thing we could want 
> > >> to add?
> > > Sorry I'm not aware of the issues with protected content, could you 
> > > elaborate?
> > > There was a long discussion on teams uAPI channel, could you comment 
> > > there if
> > > any concerns?
> > >
> > 
> > We wanted to have a getparam to detect protected support and were told 
> > to detect it by trying to create a context with it.
> > 
> 
> An extensions system where the detection mechanism is "just try it",
> and assume it's not supported if it fails. ??
> 

I guess no one wants to discuss the issues with this so-called
detection mechanism for i915 extensions. (Just try it and if it fails,
it must not be supported.)

I wonder how many ioctls we will be making a couple years down the
road just to see what the kernel supports.

Maybe we'll get more fun 8 second timeouts to deal with. Maybe these
probing ioctls failing or succeeding will alter the kmd's state in
some unexpected way.

It'll also be fun to debug cases where the driver is not starting up
with the noise of a bunch of probing ioctls flying by.

I thought about suggesting at least something like
I915_PARAM_CMD_PARSER_VERSION, but I don't know if that could have
prevented this 8 second timeout for creating a protected content
context. Maybe it's better than nothing though.

Of course, there was also the vague idea I threw out below for getting
a list of supported extensions.

-Jordan

> 
> This seem likely to get more and more problematic as a detection
> mechanism as more extensions are added.
> 
> > 
> > Now it appears trying to create a protected context can block for 
> > several seconds.
> > 
> > Since we have to report capabilities to the user even before it creates 
> > protected contexts, any app is at risk of blocking.
> > 
> 
> This failure path is not causing any re-thinking about using this as
> the extension detection mechanism?
> 
> Doesn't the ioctl# + input-struct-size + u64-extension# identify the
> extension such that the kernel could indicate if it is supported or
> not. (Or, perhaps return an array of the supported extensions so the
> umd doesn't have to potentially make many ioctls for each extension of
> interest.)
> 
> -Jordan
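
A minimal sketch of what the "return an array of supported extensions" idea
could look like as a query payload; everything here is hypothetical and
illustrative, not an existing i915 interface:

#include <stdint.h>

/*
 * Hypothetical query item: userspace passes a (possibly zero-length) buffer,
 * the kernel fills in the u64 extension names it recognises for the given
 * ioctl and reports how many there are, so the UMD can probe support with a
 * single call instead of one "try it and see" ioctl per extension.
 */
struct demo_query_supported_extensions {
	uint32_t ioctl_nr;       /* which ioctl the extensions apply to */
	uint32_t num_extensions; /* in: buffer capacity, out: count supported */
	uint64_t extensions[];   /* out: supported extension names */
};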