Re: [PATCH v16 12/20] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

2023-09-04 Thread Boris Brezillon
On Sun,  3 Sep 2023 20:07:28 +0300
Dmitry Osipenko  wrote:

> Add a lockless drm_gem_shmem_get_pages() helper that skips taking the
> reservation lock if pages_use_count is non-zero, leveraging the atomicity
> of refcount_t. Make drm_gem_shmem_mmap() use the new helper.
> 
> Suggested-by: Boris Brezillon 
> Signed-off-by: Dmitry Osipenko 

Reviewed-by: Boris Brezillon 

> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index a0faef3e762d..d93ebfef20c7 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -227,6 +227,20 @@ void drm_gem_shmem_put_pages_locked(struct 
> drm_gem_shmem_object *shmem)
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>  
> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +{
> + int ret;
> +
> + if (refcount_inc_not_zero(&shmem->pages_use_count))
> + return 0;
> +
> + dma_resv_lock(shmem->base.resv, NULL);
> + ret = drm_gem_shmem_get_pages_locked(shmem);
> + dma_resv_unlock(shmem->base.resv);
> +
> + return ret;
> +}
> +
>  static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
>  {
>   int ret;
> @@ -610,10 +624,7 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object 
> *shmem, struct vm_area_struct
>   return ret;
>   }
>  
> - dma_resv_lock(shmem->base.resv, NULL);
> - ret = drm_gem_shmem_get_pages_locked(shmem);
> - dma_resv_unlock(shmem->base.resv);
> -
> + ret = drm_gem_shmem_get_pages(shmem);
>   if (ret)
>   return ret;
>  
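The fast-path/slow-path shape of the helper above (skip the lock when the refcount is already non-zero, take it only for the 0 -> 1 transition) can be modeled in userspace with C11 atomics. This is an illustrative sketch, not kernel code: inc_not_zero(), struct object and slow_paths are stand-ins for refcount_inc_not_zero(), the shmem object and the dma_resv-protected allocation path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of refcount_inc_not_zero(): increment the count
 * only if it is currently non-zero; returns true when it incremented. */
static bool inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure the CAS reloads 'old', so we just retry. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

struct object {
	atomic_int pages_use_count;
	int slow_paths;		/* counts lock-protected page allocations */
};

/* Mirrors the shape of drm_gem_shmem_get_pages() above. */
static int get_pages(struct object *obj)
{
	if (inc_not_zero(&obj->pages_use_count))
		return 0;	/* fast path: pages already allocated */

	/* Slow path: here the kernel takes the reservation lock,
	 * allocates the pages and sets the refcount to 1. */
	obj->slow_paths++;
	atomic_store(&obj->pages_use_count, 1);
	return 0;
}
```

Only the first call (and any call racing with the count dropping to zero) pays for the lock; every later call is a single atomic compare-and-swap.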



Re: [PATCH v16 11/20] drm/shmem-helper: Use refcount_t for pages_use_count

2023-09-04 Thread Boris Brezillon
On Sun,  3 Sep 2023 20:07:27 +0300
Dmitry Osipenko  wrote:

> Use the atomic refcount_t helper for pages_use_count to optimize the pin/unpin
> functions by skipping reservation locking while the GEM pin refcount is > 1.
> 
> Suggested-by: Boris Brezillon 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c  | 35 +++--
>  drivers/gpu/drm/lima/lima_gem.c |  2 +-
>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  2 +-
>  include/drm/drm_gem_shmem_helper.h  |  2 +-
>  4 files changed, 19 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 7e1e674e2c9f..a0faef3e762d 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -152,12 +152,12 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
>   sg_free_table(shmem->sgt);
>   kfree(shmem->sgt);
>   }
> - if (shmem->pages) {
> + if (refcount_read(&shmem->pages_use_count)) {

As explained in my v15 review, I'm not convinced this is the right
thing to do. We should instead move the
drm_gem_shmem_put_pages_locked() call into the preceding 'if (shmem->sgt)'
block, because the sgt creation logic is where this implicit pages ref
was taken. If shmem->sgt == NULL, there's no reason to call
drm_gem_shmem_put_pages_locked() and we should let the following
drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count)) complain.

Besides, I don't think this change should be part of the
uint -> refcount_t conversion.

>   drm_gem_shmem_put_pages_locked(shmem);
>   drm_WARN_ON(obj->dev, !shmem->got_pages_sgt);
>   }
>  
> - drm_WARN_ON(obj->dev, shmem->pages_use_count);
> + drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
>  
>   dma_resv_unlock(shmem->base.resv);
>   }
> @@ -174,14 +174,13 @@ static int drm_gem_shmem_get_pages_locked(struct 
> drm_gem_shmem_object *shmem)
>  
>   dma_resv_assert_held(shmem->base.resv);
>  
> - if (shmem->pages_use_count++ > 0)
> + if (refcount_inc_not_zero(&shmem->pages_use_count))
>   return 0;
>  
>   pages = drm_gem_get_pages(obj);
>   if (IS_ERR(pages)) {
>   drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
>   PTR_ERR(pages));
> - shmem->pages_use_count = 0;
>   return PTR_ERR(pages);
>   }
>  
> @@ -197,6 +196,8 @@ static int drm_gem_shmem_get_pages_locked(struct 
> drm_gem_shmem_object *shmem)
>  
>   shmem->pages = pages;
>  
> + refcount_set(&shmem->pages_use_count, 1);
> +
>   return 0;
>  }
>  
> @@ -212,21 +213,17 @@ void drm_gem_shmem_put_pages_locked(struct 
> drm_gem_shmem_object *shmem)
>  
>   dma_resv_assert_held(shmem->base.resv);
>  
> - if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
> - return;
> -
> - if (--shmem->pages_use_count > 0)
> - return;
> -
> + if (refcount_dec_and_test(&shmem->pages_use_count)) {
>  #ifdef CONFIG_X86
> - if (shmem->map_wc)
> - set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> + if (shmem->map_wc)
> + set_pages_array_wb(shmem->pages, obj->size >> 
> PAGE_SHIFT);
>  #endif
>  
> - drm_gem_put_pages(obj, shmem->pages,
> -   shmem->pages_mark_dirty_on_put,
> -   shmem->pages_mark_accessed_on_put);
> - shmem->pages = NULL;
> + drm_gem_put_pages(obj, shmem->pages,
> +   shmem->pages_mark_dirty_on_put,
> +   shmem->pages_mark_accessed_on_put);
> + shmem->pages = NULL;
> + }
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>  
> @@ -553,8 +550,8 @@ static void drm_gem_shmem_vm_open(struct vm_area_struct 
> *vma)
>* mmap'd, vm_open() just grabs an additional reference for the new
>* mm the vma is getting copied into (ie. on fork()).
>*/
> - if (!drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
> - shmem->pages_use_count++;
> + drm_WARN_ON_ONCE(obj->dev,
> +  !refcount_inc_not_zero(&shmem->pages_use_count));
>  
>   dma_resv_unlock(shmem->base.resv);
>  
> @@ -642,7 +639,7 @@ void drm_gem_shmem_print_info(const struct 
> drm_gem_shmem_object *shmem,
>   return;
>  
>   drm_printf_indent(p, indent, "pages_pin_count=%u\n", 
> refcount_read(&shmem->pages_pin_count));
> - drm_printf_indent(p, indent, "pages_use_count=%u\n", 
> shmem->pages_use_count);
> + drm_printf_indent(p, indent, "pages_use_count=%u\n", 
> refcount_read(&shmem->pages_use_count));
>   drm_printf_indent(p, indent, "vmap_use_count=%u\n", 
> shmem->vmap_use_count);
>   drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);

Re: [PATCH v16 09/20] drm/shmem-helper: Remove obsoleted is_iomem test

2023-09-04 Thread Boris Brezillon
On Sun,  3 Sep 2023 20:07:25 +0300
Dmitry Osipenko  wrote:

> Everything that uses the mapped buffer should be agnostic to is_iomem.
> The only reason for the is_iomem test is that we're setting shmem->vaddr
> to the returned map->vaddr. Now that the shmem->vaddr code is gone, remove
> the obsoleted is_iomem test to clean up the code.
> 
> Suggested-by: Thomas Zimmermann 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 2b50d1a7f718..25e99468ced2 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -317,12 +317,6 @@ int drm_gem_shmem_vmap_locked(struct 
> drm_gem_shmem_object *shmem,
>  
>   if (obj->import_attach) {
>   ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
> - if (!ret) {
> - if (drm_WARN_ON(obj->dev, map->is_iomem)) {
> - dma_buf_vunmap(obj->import_attach->dmabuf, map);
> - return -EIO;
> - }
> - }

Given there's nothing to unroll for the dmabuf case, I think it'd be
good to return directly and skip all the error paths. It would also
allow you to get rid of one indentation level for the !dmabuf path.

if (obj->import_attach)
return dma_buf_vmap(obj->import_attach->dmabuf, map);

// non-dmabuf vmap logic here...



>   } else {
>   pgprot_t prot = PAGE_KERNEL;
>  



Re: [PATCH v16 10/20] drm/shmem-helper: Add and use pages_pin_count

2023-09-04 Thread Boris Brezillon
On Sun,  3 Sep 2023 20:07:26 +0300
Dmitry Osipenko  wrote:

> Add a separate pages_pin_count for tracking whether drm-shmem pages are
> movable or not. With the addition of memory shrinker support to drm-shmem,
> pages_use_count will no longer determine whether pages are hard-pinned
> in memory, but whether pages exist and are soft-pinned (and could be swapped
> out). A pages_pin_count > 0 will hard-pin pages in memory.
> 
> Suggested-by: Boris Brezillon 
> Signed-off-by: Dmitry Osipenko 

Reviewed-by: Boris Brezillon 

> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 24 
>  include/drm/drm_gem_shmem_helper.h | 10 ++
>  2 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 25e99468ced2..7e1e674e2c9f 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -236,18 +236,16 @@ static int drm_gem_shmem_pin_locked(struct 
> drm_gem_shmem_object *shmem)
>  
>   dma_resv_assert_held(shmem->base.resv);
>  
> + if (refcount_inc_not_zero(&shmem->pages_pin_count))
> + return 0;
> +
>   ret = drm_gem_shmem_get_pages_locked(shmem);
> + if (!ret)
> + refcount_set(&shmem->pages_pin_count, 1);
>  
>   return ret;
>  }
>  
> -static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
> -{
> - dma_resv_assert_held(shmem->base.resv);
> -
> - drm_gem_shmem_put_pages_locked(shmem);
> -}
> -
>  /**
>   * drm_gem_shmem_pin - Pin backing pages for a shmem GEM object
>   * @shmem: shmem GEM object
> @@ -265,6 +263,9 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
>  
>   drm_WARN_ON(obj->dev, obj->import_attach);
>  
> + if (refcount_inc_not_zero(&shmem->pages_pin_count))
> + return 0;
> +
>   ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
>   if (ret)
>   return ret;
> @@ -288,8 +289,14 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object 
> *shmem)
>  
>   drm_WARN_ON(obj->dev, obj->import_attach);
>  
> + if (refcount_dec_not_one(&shmem->pages_pin_count))
> + return;
> +
>   dma_resv_lock(shmem->base.resv, NULL);
> - drm_gem_shmem_unpin_locked(shmem);
> +
> + if (refcount_dec_and_test(&shmem->pages_pin_count))
> + drm_gem_shmem_put_pages_locked(shmem);
> +
>   dma_resv_unlock(shmem->base.resv);
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_unpin);
> @@ -634,6 +641,7 @@ void drm_gem_shmem_print_info(const struct 
> drm_gem_shmem_object *shmem,
>   if (shmem->base.import_attach)
>   return;
>  
> + drm_printf_indent(p, indent, "pages_pin_count=%u\n", 
> refcount_read(&shmem->pages_pin_count));
>   drm_printf_indent(p, indent, "pages_use_count=%u\n", 
> shmem->pages_use_count);
>   drm_printf_indent(p, indent, "vmap_use_count=%u\n", 
> shmem->vmap_use_count);
>   drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
> diff --git a/include/drm/drm_gem_shmem_helper.h 
> b/include/drm/drm_gem_shmem_helper.h
> index 808083279fd5..1cd74ae5761a 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -39,6 +39,16 @@ struct drm_gem_shmem_object {
>*/
>   unsigned int pages_use_count;
>  
> + /**
> +  * @pages_pin_count:
> +  *
> +  * Reference count on the pinned pages table.
> +  * The pages allowed to be evicted and purged by memory
> +  * shrinker only when the count is zero, otherwise pages
> +  * are hard-pinned in memory.
> +  */
> + refcount_t pages_pin_count;
> +
>   /**
>* @madv: State for madvise
>*
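The unpin fast path in the patch above relies on refcount_dec_not_one(): the lock is only taken when the count may actually drop to zero. A userspace sketch with C11 atomics (illustrative only; dec_not_one(), struct object, locked_puts and released stand in for refcount_dec_not_one(), the shmem object, the dma_resv-protected path and the actual put-pages work):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of refcount_dec_not_one(): decrement only if the
 * count is above one, i.e. this call cannot release the object.
 * Returns true when it decremented. Assumes pin/unpin calls are
 * balanced, so the count is never 0 on entry. */
static bool dec_not_one(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old > 1) {
		/* On failure the CAS reloads 'old', so we just retry. */
		if (atomic_compare_exchange_weak(ref, &old, old - 1))
			return true;
	}
	return false;
}

struct object {
	atomic_int pages_pin_count;
	int locked_puts;	/* lock-protected final-put attempts */
	bool released;
};

/* Mirrors the shape of drm_gem_shmem_unpin() above. */
static void unpin(struct object *obj)
{
	if (dec_not_one(&obj->pages_pin_count))
		return;		/* fast path: count stays >= 1 */

	/* Slow path: the kernel takes the reservation lock here. */
	obj->locked_puts++;
	if (atomic_fetch_sub(&obj->pages_pin_count, 1) == 1)
		obj->released = true;	/* dec_and_test() succeeded */
}
```

Note the second decrement under the lock: between dec_not_one() failing and the lock being taken, another thread may have pinned again, which is why the slow path re-checks with a dec-and-test rather than releasing unconditionally.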



Re: [RFC,drm-misc-next v4 3/9] drm/radeon: Implement .be_primary() callback

2023-09-04 Thread Christian König

On 04.09.23 21:57, Sui Jingfeng wrote:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which one
is primary at boot time.


The question is: why is that useful? Should we give users the ability to
control that?


I don't see a use case for this.

Regards,
Christian.


  This patch tries to solve the mentioned problem by
implementing the .be_primary() callback. Pass radeon.modeset=10 on the
kernel cmd line if you really want the device bound by radeon to be the
primary video adapter, no matter what VGAARB says.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Sui Jingfeng 
---
  drivers/gpu/drm/radeon/radeon_device.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 71f2ff39d6a1..b661cd3a8dc2 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1263,6 +1263,14 @@ static const struct vga_switcheroo_client_ops 
radeon_switcheroo_ops = {
.can_switch = radeon_switcheroo_can_switch,
  };
  
+static bool radeon_want_to_be_primary(struct pci_dev *pdev)

+{
+   if (radeon_modeset == 10)
+   return true;
+
+   return false;
+}
+
  /**
   * radeon_device_init - initialize the driver
   *
@@ -1425,7 +1433,7 @@ int radeon_device_init(struct radeon_device *rdev,
/* if we have > 1 VGA cards, then disable the radeon VGA resources */
/* this will fail for cards that aren't VGA class devices, just
 * ignore it */
-   vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL);
+   vga_client_register(rdev->pdev, radeon_vga_set_decode, 
radeon_want_to_be_primary);
  
  	if (rdev->flags & RADEON_IS_PX)

runtime = true;




RE: [PATCH v14 RESEND 1/6] dt-bindings: display: imx: Add i.MX8qxp/qm DPU binding

2023-09-04 Thread Ying Liu
On Thursday, August 24, 2023 5:48 PM, Maxime Ripard  wrote: 
> On Wed, Aug 23, 2023 at 08:47:51AM +, Ying Liu wrote:
> > > > This dt-binding just follows generic dt-binding rule to describe the DPU
> IP
> > > > hardware, not the software implementation.  DPU internal units do not
> > > > constitute separate devices.
> > >
> > > I mean, your driver does split them into separate devices so surely it
> > > constitutes separate devices.
> >
> > My driver treats them as DPU internal units, certainly not as Linux devices.
> >
> > Let's avoid Linuxisms when implementing this dt-binding and simply
> > describe the necessary properties exposed to the DPU's embodying system/SoC,
> > like reg, interrupts, clocks and power-domains.
> 
> Let's focus the conversation here, because it's redundant with the rest.
> 
> Your driver registers two additional devices, that have a different
> register space, different clocks, different interrupts, different power
> domains, etc. That has nothing to do with Linux, it's hardware
> properties.
> 
> That alone is a very good indication to me that these devices should be
> modeled as such. And your driver agrees.
> 
> Whether or not the other internal units need to be described as separate
> devices, I can't really tell, I don't have the datasheet.

i.MX8qxp and i.MX8qm SoC reference manuals can be found at (I think
registration is needed first):
https://www.nxp.com/webapp/Download?colCode=IMX8DQXPRM
https://www.nxp.com/webapp/Download?colCode=IMX8QMRM

Sorry for putting this briefly, but the DPU is one IP, so one dt-binding.

> 
> But at least the CRTC and the interrupt controller should be split away,
> or explained and detailed far better than "well it's just convenient".

CRTC is a Linuxism, which cannot be used to determine the dt-binding.

The DPU as a Display Controller is listed as a standalone module/IP in the RM.
This is how the IP was designed in the first place, not something done for
convenience.

Regards,
Liu Ying

> 
> Maxime


RE: [PATCH v14 RESEND 5/6] drm/imx: Introduce i.MX8qm/qxp DPU DRM

2023-09-04 Thread Ying Liu
On Tuesday, August 22, 2023 8:59 PM, Maxime Ripard  wrote:
> 
> Hi,

Hi,

> 
> Aside from the discussion on the binding and the general architecture, I
> have some comments there.

Thanks for your comments.

> 
> On Tue, Aug 22, 2023 at 04:59:48PM +0800, Liu Ying wrote:
> > +int dpu_cf_init(struct dpu_soc *dpu, unsigned int index,
> > +   unsigned int id, enum dpu_unit_type type,
> > +   unsigned long pec_base, unsigned long base)
> > +{
> > +   struct dpu_constframe *cf;
> > +
> > +   cf = devm_kzalloc(dpu->dev, sizeof(*cf), GFP_KERNEL);
> > +   if (!cf)
> > +   return -ENOMEM;
> > +
> > +   dpu->cf_priv[index] = cf;
> 
> You can't store structures related to KMS in a device managed structure.
> The DRM KMS device will stick around (and be accessible from userspace)
> after the device has been removed until the last application closed its
> file descriptor to the device.

The DRM device is registered after component_bind_all() is called in
dpu_drm_bind().  The CRTC components' platform devices are created
in the dpu_core_probe() where the device managed resources are
created.   So, it looks those resources are safe because the DRM device
will be unregistered before those resources are freed.

> 
> This can be checked by enabling KASAN and manually unbinding the driver
> through sysfs.

I enabled KASAN and manually unbound the dpu-core driver with command:

echo 5618.dpu > 
/sys/bus/platform/drivers/dpu-core/5618.dpu/driver/unbind 

KASAN didn't report any memory issue regarding those device-managed
resources.  However, it did report another issue in dpu_drm_unbind(),
where the drm_device should be obtained from drv_data->drm_dev instead of
dev_get_drvdata(dev).  I'll fix that in the next version.

BTW, the dpu-core driver was successfully bound again after unbinding with
command:

echo  5618.dpu > /sys/bus/platform/drivers/dpu-core/bind

> 
> > +   cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16);
> > +   if (!cf->pec_base)
> > +   return -ENOMEM;
> > +
> > +   cf->base = devm_ioremap(dpu->dev, base, SZ_32);
> > +   if (!cf->base)
> > +   return -ENOMEM;
> 
> For the same reason, you need to protect any access to a device managed
> resource (so clocks, registers, regulators, etc.) by a call to
> drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug instead
> of drm_dev_unregister.

That's a good point.  I've tried to do that, but it turns out that the
display controller cannot be enabled again after binding the dpu-core
driver manually again.  It seems that the display controller requires a
proper disablement procedure, but the "driver instance overview" kdoc
mentions the shortcoming of no proper disablement if drm_dev_unplug()
is used:

"""
* Drivers that want to support device unplugging (USB, DT overlay unload) should
 * use drm_dev_unplug() instead of drm_dev_unregister(). The driver must protect
 * regions that is accessing device resources to prevent use after they're
 * released. This is done using drm_dev_enter() and drm_dev_exit(). There is one
 * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before
 * drm_atomic_helper_shutdown() is called. This means that if the disable code
 * paths are protected, they will not run on regular driver module unload,
 * possibly leaving the hardware enabled.
"""

A DPU reset in dpu_core() might be helpful, but I'm not sure if there is any
reset line provided by the embodying system.

Even if the reset works, the 2nd DPU instance in i.MX8qm would be a problem,
because it won't be reset or properly disabled if the 1st DPU instance is unbound.
Although the two DPU instances could be wrapped by two DRM devices, I tend
not to do that because downstream bridges in future SoCs might be able to mux
to different DPU instances at runtime.

Due to the disablement issue, can we set drm_dev_enter/exit/unplug aside first?

> 
> > +static int dpu_crtc_pm_runtime_get_sync(struct dpu_crtc *dpu_crtc)
> > +{
> > +   int ret;
> > +
> > +   ret = pm_runtime_get_sync(dpu_crtc->dev->parent);
> > +   if (ret < 0) {
> > +   pm_runtime_put_noidle(dpu_crtc->dev->parent);
> > +   dpu_crtc_err(&dpu_crtc->base,
> > +"failed to get parent device RPM sync: %d\n", ret);
> > +   }
> > +
> > +   return ret;
> > +}
> 
> That's pm_runtime_resume_and_get.

Ok, will use it.
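For context, pm_runtime_resume_and_get() exists precisely because the open-coded pattern above (get_sync, then put_noidle on failure) was so common. A toy userspace model of the two call sequences (the struct and function names here are illustrative stand-ins, not the kernel API; a plain counter models the device's usage count):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a runtime-PM usage counter; not the kernel API. */
struct rpm_dev {
	int usage_count;
	bool resume_ok;		/* whether the modeled resume callback succeeds */
};

/* pm_runtime_get_sync() bumps the usage count even when resume fails,
 * which is why callers must drop the reference on error themselves. */
static int rpm_get_sync(struct rpm_dev *d)
{
	d->usage_count++;
	return d->resume_ok ? 0 : -1;
}

/* pm_runtime_put_noidle(): drop the count without idling the device. */
static void rpm_put_noidle(struct rpm_dev *d)
{
	d->usage_count--;
}

/* pm_runtime_resume_and_get(): the error handling is folded in, so the
 * caller never has to pair a failed get with an explicit put. */
static int rpm_resume_and_get(struct rpm_dev *d)
{
	int ret = rpm_get_sync(d);

	if (ret < 0)
		rpm_put_noidle(d);
	return ret;
}
```

On success the caller holds one reference; on failure the count is back where it started, so the caller's error path needs no put.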

> 
> > +static int dpu_crtc_pm_runtime_put(struct dpu_crtc *dpu_crtc)
> > +{
> > +   int ret;
> > +
> > +   ret = pm_runtime_put(dpu_crtc->dev->parent);
> > +   if (ret < 0)
> > +   dpu_crtc_err(&dpu_crtc->base,
> > +"failed to put parent device RPM: %d\n", ret);
> > +
> > +   return ret;
> > +}
> > +
> > +static void dpu_crtc_mode_set_nofb(struct drm_crtc *crtc)
> > +{
> > +   struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc);
> > +   struct drm_display_mode *adj = &crtc->state->adjusted_mode;
> > +   enum dpu_link_id cf_link;
> > +
> > +   dpu_crtc_dbg(crtc, "mode " DRM_MODE_FMT "

Re: [PATCH 0/4] ppc, fbdev: Clean up fbdev mmap helper

2023-09-04 Thread Michael Ellerman
Thomas Zimmermann  writes:
> Refactor fb_pgprotect() in PowerPC to work without struct file. Then
> clean up and rename fb_pgprotect(). This change has been discussed at
> [1] in the context of refactoring fbdev's mmap code.
>
> The first three patches adapt PowerPC's internal interfaces to
> provide a phys_mem_access_prot() that works without struct file. Neither
> the architecture code or fbdev helpers need the parameter.
>
> Patch 4 replaces fbdev's fb_pgprotect() with fb_pgprot_device() on
> all architectures. The new helper with its stream-lined interface
> enables more refactoring within fbdev's mmap implementation.

The content of this series is OK, but the way it's structured makes it a
real headache to merge, because it's mostly powerpc changes and then a
dependant cross architecture patch at the end.

It would be simpler if patch 4 was first and just passed file=NULL to
the powerpc helper, with an explanation that it's unused and will be
dropped in a future cleanup.

We could then put the first patch (previously patch 4) in a topic branch
that is shared between the powerpc tree and the fbdev tree, and then the
powerpc changes could be staged on top of that through the powerpc tree.

cheers


[PATCH v3 6/8] drm/msm/dpu: drop the dpu_caps::qseed_type field

2023-09-04 Thread Dmitry Baryshkov
The qseed_type field indicates the particular QSEED type implemented by
the scaler. However, this field is unused by the driver; the correct scaler
type is inferred from the features data. Drop the qseed_type field.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h  | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h  | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_4_sm6350.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 1 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h   | 2 --
 15 files changed, 16 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index 5ea938b57eda..1276981c16d2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -10,7 +10,6 @@
 static const struct dpu_caps msm8998_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0x7,
-   .qseed_type = DPU_SSPP_SCALER_QSEED3,
.has_src_split = true,
.has_dim_layer = true,
.has_idle_pc = true,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
index 440d49842f31..bfd2fa4d27ef 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
@@ -10,7 +10,6 @@
 static const struct dpu_caps sdm845_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0xb,
-   .qseed_type = DPU_SSPP_SCALER_QSEED3,
.has_src_split = true,
.has_dim_layer = true,
.has_idle_pc = true,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
index 619afa54c714..c873743d9123 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
@@ -10,7 +10,6 @@
 static const struct dpu_caps sm8150_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0xb,
-   .qseed_type = DPU_SSPP_SCALER_QSEED3,
.has_src_split = true,
.has_dim_layer = true,
.has_idle_pc = true,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
index 668b48e3c922..20e95a0d3e81 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
@@ -10,7 +10,6 @@
 static const struct dpu_caps sc8180x_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0xb,
-   .qseed_type = DPU_SSPP_SCALER_QSEED3,
.has_src_split = true,
.has_dim_layer = true,
.has_idle_pc = true,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
index 52f3884e587b..e1a06e609cc1 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
@@ -10,7 +10,6 @@
 static const struct dpu_caps sm8250_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0xb,
-   .qseed_type = DPU_SSPP_SCALER_QSEED4,
.has_src_split = true,
.has_dim_layer = true,
.has_idle_pc = true,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
index e76a6d329896..206e5a64e5e4 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
@@ -10,7 +10,6 @@
 static const struct dpu_caps sc7180_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0x9,
-   .qseed_type = DPU_SSPP_SCALER_QSEED4,
.has_dim_layer = true,
.has_idle_pc = true,
.max_linewidth = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
index 8fc938398aa6..1122a62acddf 100644
--- a/drivers

[PATCH v3 7/8] drm/msm/dpu: merge DPU_SSPP_SCALER_QSEED3, QSEED3LITE, QSEED4

2023-09-04 Thread Dmitry Baryshkov
Three different features, DPU_SSPP_SCALER_QSEED3, QSEED3LITE and QSEED4
are all related to different versions of the same HW scaling block.
Corresponding driver parts use scaler_blk.version to identify the
correct way to program the hardware. In order to simplify the driver
codepath, merge these three feature bits.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 4 ++--
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 6 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c| 9 ++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h| 4 +---
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c  | 3 +--
 5 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index b37b4076e53a..67d66319a825 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -31,10 +31,10 @@
(VIG_SDM845_MASK | BIT(DPU_SSPP_SMART_DMA_V2))
 
 #define VIG_SC7180_MASK \
-   (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | BIT(DPU_SSPP_SCALER_QSEED4))
+   (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | BIT(DPU_SSPP_SCALER_QSEED3))
 
 #define VIG_SM6125_MASK \
-   (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | BIT(DPU_SSPP_SCALER_QSEED3LITE))
+   (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | BIT(DPU_SSPP_SCALER_QSEED3))
 
 #define VIG_SC7180_MASK_SDMA \
(VIG_SC7180_MASK | BIT(DPU_SSPP_SMART_DMA_V2))
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 7ca6286756f6..8dbf0322394e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -51,9 +51,7 @@ enum {
 /**
  * SSPP sub-blocks/features
  * @DPU_SSPP_SCALER_QSEED2,  QSEED2 algorithm support
- * @DPU_SSPP_SCALER_QSEED3,  QSEED3 alogorithm support
- * @DPU_SSPP_SCALER_QSEED3LITE,  QSEED3 Lite alogorithm support
- * @DPU_SSPP_SCALER_QSEED4,  QSEED4 algorithm support
+ * @DPU_SSPP_SCALER_QSEED3,  QSEED3 alogorithm support (also QSEED3LITE and 
QSEED4)
  * @DPU_SSPP_SCALER_RGB, RGB Scaler, supported by RGB pipes
  * @DPU_SSPP_CSC,Support of Color space converion
  * @DPU_SSPP_CSC_10BIT,  Support of 10-bit Color space conversion
@@ -72,8 +70,6 @@ enum {
 enum {
DPU_SSPP_SCALER_QSEED2 = 0x1,
DPU_SSPP_SCALER_QSEED3,
-   DPU_SSPP_SCALER_QSEED3LITE,
-   DPU_SSPP_SCALER_QSEED4,
DPU_SSPP_SCALER_RGB,
DPU_SSPP_CSC,
DPU_SSPP_CSC_10BIT,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c
index f2192de93713..c20f37c8033c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c
@@ -603,9 +603,7 @@ static void _setup_layer_ops(struct dpu_hw_sspp *c,
test_bit(DPU_SSPP_SMART_DMA_V2, &c->cap->features))
c->ops.setup_multirect = dpu_hw_sspp_setup_multirect;
 
-   if (test_bit(DPU_SSPP_SCALER_QSEED3, &features) ||
-   test_bit(DPU_SSPP_SCALER_QSEED3LITE, &features) ||
-   test_bit(DPU_SSPP_SCALER_QSEED4, &features)) {
+   if (test_bit(DPU_SSPP_SCALER_QSEED3, &features)) {
c->ops.setup_scaler = _dpu_hw_sspp_setup_scaler3;
c->ops.get_scaler_ver = _dpu_hw_sspp_get_scaler3_ver;
}
@@ -640,10 +638,7 @@ int _dpu_hw_sspp_init_debugfs(struct dpu_hw_sspp *hw_pipe, 
struct dpu_kms *kms,
cfg->len,
kms);
 
-   if (cfg->features & BIT(DPU_SSPP_SCALER_QSEED3) ||
-   cfg->features & BIT(DPU_SSPP_SCALER_QSEED3LITE) ||
-   cfg->features & BIT(DPU_SSPP_SCALER_QSEED2) ||
-   cfg->features & BIT(DPU_SSPP_SCALER_QSEED4))
+   if (sblk->scaler_blk.len)
dpu_debugfs_create_regset32("scaler_blk", 0400,
debugfs_root,
sblk->scaler_blk.base + cfg->base,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h
index cbf4f95ff0fd..d7954e900296 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h
@@ -26,9 +26,7 @@ struct dpu_hw_sspp;
  */
 #define DPU_SSPP_SCALER (BIT(DPU_SSPP_SCALER_RGB) | \
 BIT(DPU_SSPP_SCALER_QSEED2) | \
-BIT(DPU_SSPP_SCALER_QSEED3) | \
-BIT(DPU_SSPP_SCALER_QSEED3LITE) | \
-BIT(DPU_SSPP_SCALER_QSEED4))
+BIT(DPU_SSPP_SCALER_QSEED3))
 
 /*
  * Define all CSC feature bits in catalog
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index c2aaaded07ed..109355275ec5 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -438,8 +43

[PATCH v3 8/8] drm/msm/gpu: drop duplicating VIG feature masks

2023-09-04 Thread Dmitry Baryshkov
After folding the QSEED3LITE and QSEED4 feature bits into QSEED3, several
VIG feature masks became equal. Drop these duplicates.

Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_4_sm6125.h|  2 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h|  8 
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h|  2 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h|  2 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_4_sm6350.h|  2 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h|  2 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h|  8 
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h  |  8 
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h|  8 
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h|  8 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 11 +--
 11 files changed, 26 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_4_sm6125.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_4_sm6125.h
index c5c44e15a8ea..e196f9e7fd82 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_4_sm6125.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_4_sm6125.h
@@ -68,7 +68,7 @@ static const struct dpu_sspp_cfg sm6125_sspp[] = {
{
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1f0,
-   .features = VIG_SM6125_MASK,
+   .features = VIG_SDM845_MASK,
.sblk = &dpu_vig_sblk_2_4,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
index e1a06e609cc1..c9576a7b8bef 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
@@ -74,7 +74,7 @@ static const struct dpu_sspp_cfg sm8250_sspp[] = {
{
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1f8,
-   .features = VIG_SC7180_MASK_SDMA,
+   .features = VIG_SDM845_MASK_SDMA,
.sblk = &dpu_vig_sblk_3_0,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
@@ -82,7 +82,7 @@ static const struct dpu_sspp_cfg sm8250_sspp[] = {
}, {
.name = "sspp_1", .id = SSPP_VIG1,
.base = 0x6000, .len = 0x1f8,
-   .features = VIG_SC7180_MASK_SDMA,
+   .features = VIG_SDM845_MASK_SDMA,
.sblk = &dpu_vig_sblk_3_0,
.xin_id = 4,
.type = SSPP_TYPE_VIG,
@@ -90,7 +90,7 @@ static const struct dpu_sspp_cfg sm8250_sspp[] = {
}, {
.name = "sspp_2", .id = SSPP_VIG2,
.base = 0x8000, .len = 0x1f8,
-   .features = VIG_SC7180_MASK_SDMA,
+   .features = VIG_SDM845_MASK_SDMA,
.sblk = &dpu_vig_sblk_3_0,
.xin_id = 8,
.type = SSPP_TYPE_VIG,
@@ -98,7 +98,7 @@ static const struct dpu_sspp_cfg sm8250_sspp[] = {
}, {
.name = "sspp_3", .id = SSPP_VIG3,
.base = 0xa000, .len = 0x1f8,
-   .features = VIG_SC7180_MASK_SDMA,
+   .features = VIG_SDM845_MASK_SDMA,
.sblk = &dpu_vig_sblk_3_0,
.xin_id = 12,
.type = SSPP_TYPE_VIG,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
index 206e5a64e5e4..7e1156f1ef54 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
@@ -51,7 +51,7 @@ static const struct dpu_sspp_cfg sc7180_sspp[] = {
{
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1f8,
-   .features = VIG_SC7180_MASK,
+   .features = VIG_SDM845_MASK,
.sblk = &dpu_vig_sblk_3_0,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
index 1122a62acddf..49d360d2b73b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
@@ -38,7 +38,7 @@ static const struct dpu_sspp_cfg sm6115_sspp[] = {
{
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1f8,
-   .features = VIG_SC7180_MASK,
+   .features = VIG_SDM845_MASK,
.sblk = &dpu_vig_sblk_3_0,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_4_sm6350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_4_sm6350.h
index 8aea53d5c86f..ce54e0c695d6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/

[PATCH v3 1/8] drm/msm/dpu: populate SSPP scaler block version

2023-09-04 Thread Dmitry Baryshkov
The function _dpu_hw_sspp_setup_scaler3() passes scaler_blk.version to
dpu_hw_setup_scaler3(), which uses it to determine how the scaler
(QSEED3) block should be programmed. However, up to now we were not
setting this field. Set it now, splitting the vig_sblk data into
variants with different version fields.

Reported-by: Marijn Suijten 
Fixes: 9b6f4fedaac2 ("drm/msm/dpu: Add SM6125 support")
Fixes: 27f0df03f3ff ("drm/msm/dpu: Add SM6375 support")
Fixes: 3186acba5cdc ("drm/msm/dpu: Add SM6350 support")
Fixes: efcd0107727c ("drm/msm/dpu: add support for SM8550")
Fixes: 4a352c2fc15a ("drm/msm/dpu: Introduce SC8280XP")
Fixes: 0e91bcbb0016 ("drm/msm/dpu: Add SM8350 to hw catalog")
Fixes: 100d7ef6995d ("drm/msm/dpu: add support for SM8450")
Fixes: 3581b7062cec ("drm/msm/disp/dpu1: add support for display on SM6115")
Fixes: dabfdd89eaa9 ("drm/msm/disp/dpu1: add inline rotation support for sc7280")
Fixes: f3af2d6ee9ab ("drm/msm/dpu: Add SC8180x to hw catalog")
Fixes: 94391a14fc27 ("drm/msm/dpu1: Add MSM8998 to hw catalog")
Fixes: af776a3e1c30 ("drm/msm/dpu: add SM8250 to hw catalog")
Fixes: 386fced3f76f ("drm/msm/dpu: add SM8150 to hw catalog")
Fixes: b75ab05a3479 ("msm:disp:dpu1: add scaler support on SC7180 display")
Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Dmitry Baryshkov 
---
 .../msm/disp/dpu1/catalog/dpu_5_0_sm8150.h|  8 +-
 .../msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h   |  8 +-
 .../msm/disp/dpu1/catalog/dpu_8_1_sm8450.h|  8 +-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 95 ++-
 4 files changed, 85 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
index 99acaf917e43..f0c3804f4258 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
@@ -77,7 +77,7 @@ static const struct dpu_sspp_cfg sm8150_sspp[] = {
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_0,
+   .sblk = &sm8150_vig_sblk_0,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG0,
@@ -85,7 +85,7 @@ static const struct dpu_sspp_cfg sm8150_sspp[] = {
.name = "sspp_1", .id = SSPP_VIG1,
.base = 0x6000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_1,
+   .sblk = &sm8150_vig_sblk_1,
.xin_id = 4,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG1,
@@ -93,7 +93,7 @@ static const struct dpu_sspp_cfg sm8150_sspp[] = {
.name = "sspp_2", .id = SSPP_VIG2,
.base = 0x8000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_2,
+   .sblk = &sm8150_vig_sblk_2,
.xin_id = 8,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG2,
@@ -101,7 +101,7 @@ static const struct dpu_sspp_cfg sm8150_sspp[] = {
.name = "sspp_3", .id = SSPP_VIG3,
.base = 0xa000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_3,
+   .sblk = &sm8150_vig_sblk_3,
.xin_id = 12,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG3,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
index f3de21025ca7..3ec954722a8e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
@@ -76,7 +76,7 @@ static const struct dpu_sspp_cfg sc8180x_sspp[] = {
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_0,
+   .sblk = &sm8150_vig_sblk_0,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG0,
@@ -84,7 +84,7 @@ static const struct dpu_sspp_cfg sc8180x_sspp[] = {
.name = "sspp_1", .id = SSPP_VIG1,
.base = 0x6000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_1,
+   .sblk = &sm8150_vig_sblk_1,
.xin_id = 4,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG1,
@@ -92,7 +92,7 @@ static const struct dpu_sspp_cfg sc8180x_sspp[] = {
.name = "sspp_2", .id = SSPP_VIG2,
.base = 0x8000, .len = 0x1f0,
.features = VIG_SDM845_MASK,
-   .sblk = &sdm845_vig_sblk_2,
+   .sblk = &sm8150_vig_sblk_2,
.xin_id 

[PATCH v3 5/8] drm/msm/dpu: drop DPU_HW_SUBBLK_INFO macro

2023-09-04 Thread Dmitry Baryshkov
As the sub-block info is now mostly gone, inline the remaining fields
and drop the DPU_HW_SUBBLK_INFO macro.

Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h| 40 ++-
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 7c08bbd2bdc6..63716ff5558f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -252,48 +252,50 @@ enum {
u32 len; \
unsigned long features
 
-/**
- * MACRO DPU_HW_SUBBLK_INFO - information of HW sub-block inside DPU
- * @name:  string name for debug purposes
- * @base:  offset of this sub-block relative to the block
- * offset
- * @lenregister block length of this sub-block
- */
-#define DPU_HW_SUBBLK_INFO \
-   char name[DPU_HW_BLK_NAME_LEN]; \
-   u32 base; \
-   u32 len
-
 /**
  * struct dpu_scaler_blk: Scaler information
- * @info:   HW register and features supported by this sub-blk
+ * @name: string name for debug purposes
+ * @base: offset of this sub-block relative to the block offset
+ * @len: register block length of this sub-block
  * @version: qseed block revision
  */
 struct dpu_scaler_blk {
-   DPU_HW_SUBBLK_INFO;
+   char name[DPU_HW_BLK_NAME_LEN];
+   u32 base;
+   u32 len;
u32 version;
 };
 
 struct dpu_csc_blk {
-   DPU_HW_SUBBLK_INFO;
+   char name[DPU_HW_BLK_NAME_LEN];
+   u32 base;
+   u32 len;
 };
 
 /**
  * struct dpu_pp_blk : Pixel processing sub-blk information
- * @info:   HW register and features supported by this sub-blk
+ * @name: string name for debug purposes
+ * @base: offset of this sub-block relative to the block offset
+ * @len: register block length of this sub-block
  * @version: HW Algorithm version
  */
 struct dpu_pp_blk {
-   DPU_HW_SUBBLK_INFO;
+   char name[DPU_HW_BLK_NAME_LEN];
+   u32 base;
+   u32 len;
u32 version;
 };
 
 /**
  * struct dpu_dsc_blk - DSC Encoder sub-blk information
- * @info:   HW register and features supported by this sub-blk
+ * @name: string name for debug purposes
+ * @base: offset of this sub-block relative to the block offset
+ * @len: register block length of this sub-block
  */
 struct dpu_dsc_blk {
-   DPU_HW_SUBBLK_INFO;
+   char name[DPU_HW_BLK_NAME_LEN];
+   u32 base;
+   u32 len;
 };
 
 /**
-- 
2.39.2



[PATCH v3 4/8] drm/msm/dpu: deduplicate some (most) of SSPP sub-blocks

2023-09-04 Thread Dmitry Baryshkov
As we have dropped the variadic parts of the SSPP sub-block
declarations, deduplicate them now, reducing memory cruft.

Signed-off-by: Dmitry Baryshkov 
---
 .../msm/disp/dpu1/catalog/dpu_3_0_msm8998.h   | 16 +--
 .../msm/disp/dpu1/catalog/dpu_4_0_sdm845.h| 16 +--
 .../msm/disp/dpu1/catalog/dpu_5_0_sm8150.h| 16 +--
 .../msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h   | 16 +--
 .../msm/disp/dpu1/catalog/dpu_5_4_sm6125.h|  6 +-
 .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h| 16 +--
 .../msm/disp/dpu1/catalog/dpu_6_2_sc7180.h|  8 +-
 .../msm/disp/dpu1/catalog/dpu_6_3_sm6115.h|  4 +-
 .../msm/disp/dpu1/catalog/dpu_6_4_sm6350.h|  8 +-
 .../msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h   |  4 +-
 .../msm/disp/dpu1/catalog/dpu_6_9_sm6375.h|  4 +-
 .../msm/disp/dpu1/catalog/dpu_7_0_sm8350.h| 16 +--
 .../msm/disp/dpu1/catalog/dpu_7_2_sc7280.h|  8 +-
 .../msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h  | 16 +--
 .../msm/disp/dpu1/catalog/dpu_8_1_sm8450.h| 16 +--
 .../msm/disp/dpu1/catalog/dpu_9_0_sm8550.h| 20 ++--
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 97 +--
 17 files changed, 120 insertions(+), 167 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index 43c47a19cd94..5ea938b57eda 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -70,7 +70,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_0", .id = SSPP_VIG0,
.base = 0x4000, .len = 0x1ac,
.features = VIG_MSM8998_MASK,
-   .sblk = &msm8998_vig_sblk_0,
+   .sblk = &dpu_vig_sblk_1_2,
.xin_id = 0,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG0,
@@ -78,7 +78,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_1", .id = SSPP_VIG1,
.base = 0x6000, .len = 0x1ac,
.features = VIG_MSM8998_MASK,
-   .sblk = &msm8998_vig_sblk_1,
+   .sblk = &dpu_vig_sblk_1_2,
.xin_id = 4,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG1,
@@ -86,7 +86,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_2", .id = SSPP_VIG2,
.base = 0x8000, .len = 0x1ac,
.features = VIG_MSM8998_MASK,
-   .sblk = &msm8998_vig_sblk_2,
+   .sblk = &dpu_vig_sblk_1_2,
.xin_id = 8,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG2,
@@ -94,7 +94,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_3", .id = SSPP_VIG3,
.base = 0xa000, .len = 0x1ac,
.features = VIG_MSM8998_MASK,
-   .sblk = &msm8998_vig_sblk_3,
+   .sblk = &dpu_vig_sblk_1_2,
.xin_id = 12,
.type = SSPP_TYPE_VIG,
.clk_ctrl = DPU_CLK_CTRL_VIG3,
@@ -102,7 +102,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_8", .id = SSPP_DMA0,
.base = 0x24000, .len = 0x1ac,
.features = DMA_MSM8998_MASK,
-   .sblk = &sdm845_dma_sblk_0,
+   .sblk = &dpu_dma_sblk,
.xin_id = 1,
.type = SSPP_TYPE_DMA,
.clk_ctrl = DPU_CLK_CTRL_DMA0,
@@ -110,7 +110,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_9", .id = SSPP_DMA1,
.base = 0x26000, .len = 0x1ac,
.features = DMA_MSM8998_MASK,
-   .sblk = &sdm845_dma_sblk_1,
+   .sblk = &dpu_dma_sblk,
.xin_id = 5,
.type = SSPP_TYPE_DMA,
.clk_ctrl = DPU_CLK_CTRL_DMA1,
@@ -118,7 +118,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_10", .id = SSPP_DMA2,
.base = 0x28000, .len = 0x1ac,
.features = DMA_CURSOR_MSM8998_MASK,
-   .sblk = &sdm845_dma_sblk_2,
+   .sblk = &dpu_dma_sblk,
.xin_id = 9,
.type = SSPP_TYPE_DMA,
.clk_ctrl = DPU_CLK_CTRL_DMA2,
@@ -126,7 +126,7 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
.name = "sspp_11", .id = SSPP_DMA3,
.base = 0x2a000, .len = 0x1ac,
.features = DMA_CURSOR_MSM8998_MASK,
-   .sblk = &sdm845_dma_sblk_3,
+   .sblk = &dpu_dma_sblk,
.xin_id = 13,
.type = SSPP_TYPE_DMA,
.clk_ctrl = DPU_CLK_CTRL_DMA3,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
index 88a5177dfdb7..440d49842f31 100644
--- a/drivers/gpu/drm/msm/disp/

[PATCH v3 3/8] drm/msm/dpu: drop the `smart_dma_priority' field from struct dpu_sspp_sub_blks

2023-09-04 Thread Dmitry Baryshkov
In preparation for deduplicating SSPP sub-blocks, drop the (unused)
`smart_dma_priority' field from struct dpu_sspp_sub_blks. If it is
needed later (e.g. for SmartDMA v1), it should be added to the SSPP
declarations themselves.

Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 112 +++---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|   2 -
 2 files changed, 40 insertions(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index ed7458991509..e9773274bdd6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -253,11 +253,10 @@ static const uint32_t wb2_formats[] = {
 #define SSPP_SCALER_VER(maj, min) (((maj) << 16) | (min))
 
 /* SSPP common configuration */
-#define _VIG_SBLK(sdma_pri, scaler_ver) \
+#define _VIG_SBLK(scaler_ver) \
{ \
.maxdwnscale = MAX_DOWNSCALE_RATIO, \
.maxupscale = MAX_UPSCALE_RATIO, \
-   .smart_dma_priority = sdma_pri, \
.scaler_blk = {.name = "scaler", \
.version = scaler_ver, \
.base = 0xa00, .len = 0xa0,}, \
@@ -270,11 +269,10 @@ static const uint32_t wb2_formats[] = {
.rotation_cfg = NULL, \
}
 
-#define _VIG_SBLK_ROT(sdma_pri, scaler_ver, rot_cfg) \
+#define _VIG_SBLK_ROT(scaler_ver, rot_cfg) \
{ \
.maxdwnscale = MAX_DOWNSCALE_RATIO, \
.maxupscale = MAX_UPSCALE_RATIO, \
-   .smart_dma_priority = sdma_pri, \
.scaler_blk = {.name = "scaler", \
.version = scaler_ver, \
.base = 0xa00, .len = 0xa0,}, \
@@ -287,11 +285,10 @@ static const uint32_t wb2_formats[] = {
.rotation_cfg = rot_cfg, \
}
 
-#define _DMA_SBLK(sdma_pri) \
+#define _DMA_SBLK() \
{ \
.maxdwnscale = SSPP_UNITY_SCALE, \
.maxupscale = SSPP_UNITY_SCALE, \
-   .smart_dma_priority = sdma_pri, \
.format_list = plane_formats, \
.num_formats = ARRAY_SIZE(plane_formats), \
.virt_format_list = plane_formats, \
@@ -299,17 +296,13 @@ static const uint32_t wb2_formats[] = {
}
 
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_0 =
-   _VIG_SBLK(0,
- SSPP_SCALER_VER(1, 2));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 2));
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_1 =
-   _VIG_SBLK(0,
- SSPP_SCALER_VER(1, 2));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 2));
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_2 =
-   _VIG_SBLK(0,
- SSPP_SCALER_VER(1, 2));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 2));
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_3 =
-   _VIG_SBLK(0,
- SSPP_SCALER_VER(1, 2));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 2));
 
 static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = {
.rot_maxheight = 1088,
@@ -318,107 +311,82 @@ static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = {
 };
 
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 =
-   _VIG_SBLK(5,
- SSPP_SCALER_VER(1, 3));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 3));
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_1 =
-   _VIG_SBLK(6,
- SSPP_SCALER_VER(1, 3));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 3));
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_2 =
-   _VIG_SBLK(7,
- SSPP_SCALER_VER(1, 3));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 3));
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_3 =
-   _VIG_SBLK(8,
- SSPP_SCALER_VER(1, 3));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 3));
 
 static const struct dpu_sspp_sub_blks sm8150_vig_sblk_0 =
-   _VIG_SBLK(5,
- SSPP_SCALER_VER(1, 4));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 4));
 static const struct dpu_sspp_sub_blks sm8150_vig_sblk_1 =
-   _VIG_SBLK(6,
- SSPP_SCALER_VER(1, 4));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 4));
 static const struct dpu_sspp_sub_blks sm8150_vig_sblk_2 =
-   _VIG_SBLK(7,
- SSPP_SCALER_VER(1, 4));
+   _VIG_SBLK(SSPP_SCALER_VER(1, 4));
 sta

[PATCH v3 2/8] drm/msm/dpu: drop the `id' field from DPU_HW_SUBBLK_INFO

2023-09-04 Thread Dmitry Baryshkov
The field `id' is not used for sub-blocks. The handling code usually
knows which sub-block it is now looking at. Drop the field completely.

Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 76 +--
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  2 -
 2 files changed, 36 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 77d09f961d86..ed7458991509 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -253,17 +253,15 @@ static const uint32_t wb2_formats[] = {
 #define SSPP_SCALER_VER(maj, min) (((maj) << 16) | (min))
 
 /* SSPP common configuration */
-#define _VIG_SBLK(sdma_pri, qseed_ver, scaler_ver) \
+#define _VIG_SBLK(sdma_pri, scaler_ver) \
{ \
.maxdwnscale = MAX_DOWNSCALE_RATIO, \
.maxupscale = MAX_UPSCALE_RATIO, \
.smart_dma_priority = sdma_pri, \
.scaler_blk = {.name = "scaler", \
-   .id = qseed_ver, \
.version = scaler_ver, \
.base = 0xa00, .len = 0xa0,}, \
.csc_blk = {.name = "csc", \
-   .id = DPU_SSPP_CSC_10BIT, \
.base = 0x1a00, .len = 0x100,}, \
.format_list = plane_formats_yuv, \
.num_formats = ARRAY_SIZE(plane_formats_yuv), \
@@ -272,17 +270,15 @@ static const uint32_t wb2_formats[] = {
.rotation_cfg = NULL, \
}
 
-#define _VIG_SBLK_ROT(sdma_pri, qseed_ver, scaler_ver, rot_cfg) \
+#define _VIG_SBLK_ROT(sdma_pri, scaler_ver, rot_cfg) \
{ \
.maxdwnscale = MAX_DOWNSCALE_RATIO, \
.maxupscale = MAX_UPSCALE_RATIO, \
.smart_dma_priority = sdma_pri, \
.scaler_blk = {.name = "scaler", \
-   .id = qseed_ver, \
.version = scaler_ver, \
.base = 0xa00, .len = 0xa0,}, \
.csc_blk = {.name = "csc", \
-   .id = DPU_SSPP_CSC_10BIT, \
.base = 0x1a00, .len = 0x100,}, \
.format_list = plane_formats_yuv, \
.num_formats = ARRAY_SIZE(plane_formats_yuv), \
@@ -303,16 +299,16 @@ static const uint32_t wb2_formats[] = {
}
 
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_0 =
-   _VIG_SBLK(0, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(0,
  SSPP_SCALER_VER(1, 2));
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_1 =
-   _VIG_SBLK(0, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(0,
  SSPP_SCALER_VER(1, 2));
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_2 =
-   _VIG_SBLK(0, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(0,
  SSPP_SCALER_VER(1, 2));
 static const struct dpu_sspp_sub_blks msm8998_vig_sblk_3 =
-   _VIG_SBLK(0, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(0,
  SSPP_SCALER_VER(1, 2));
 
 static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = {
@@ -322,29 +318,29 @@ static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = {
 };
 
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 =
-   _VIG_SBLK(5, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(5,
  SSPP_SCALER_VER(1, 3));
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_1 =
-   _VIG_SBLK(6, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(6,
  SSPP_SCALER_VER(1, 3));
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_2 =
-   _VIG_SBLK(7, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(7,
  SSPP_SCALER_VER(1, 3));
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_3 =
-   _VIG_SBLK(8, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(8,
  SSPP_SCALER_VER(1, 3));
 
 static const struct dpu_sspp_sub_blks sm8150_vig_sblk_0 =
-   _VIG_SBLK(5, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(5,
  SSPP_SCALER_VER(1, 4));
 static const struct dpu_sspp_sub_blks sm8150_vig_sblk_1 =
-   _VIG_SBLK(6, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(6,
  SSPP_SCALER_VER(1, 4));
 static const struct dpu_sspp_sub_blks sm8150_vig_sblk_2 =
-   _VIG_SBLK(7, DPU_SSPP_SCALER_QSEED3,
+   _VIG_SBLK(7,
 

[PATCH v3 0/8] drm/msm/dpu: simplify DPU sub-blocks info

2023-09-04 Thread Dmitry Baryshkov
The handling code also usually knows which sub-block it is looking at.
Drop the unused 'id' field and arguments and merge some of the
sub-block declarations.

While we are at it, also fix all VIG sub-blocks to contain the correct
scaler block version and drop the now-unused QSEED-related feature bits.

Changes since v2:
- Reworked the VIG SBLK definitions to set the scaler version (Marijn,
  Abhinav)
- Rebased the rest of the patches on top of this (intrusive) change.
- Folded QSEED3LITE and QSEED4 feature bits into QSEED3

Changes since v1:
- Dropped the patch dropping 'name' field (Abhinav).
- Deduplicate equivalent SBLK definitions.
- Dropped the dpu_csc_blk and dpu_dsc_blk merge.

Dmitry Baryshkov (8):
  drm/msm/dpu: populate SSPP scaler block version
  drm/msm/dpu: drop the `id' field from DPU_HW_SUBBLK_INFO
  drm/msm/dpu: drop the `smart_dma_priority' field from struct
dpu_sspp_sub_blks
  drm/msm/dpu: deduplicate some (most) of SSPP sub-blocks
  drm/msm/dpu: drop DPU_HW_SUBBLK_INFO macro
  drm/msm/dpu: drop the dpu_caps::qseed_type field
  drm/msm/dpu: merge DPU_SSPP_SCALER_QSEED3, QSEED3LITE, QSEED4
  drm/msm/gpu: drop duplicating VIG feature masks

 .../msm/disp/dpu1/catalog/dpu_3_0_msm8998.h   |  17 +-
 .../msm/disp/dpu1/catalog/dpu_4_0_sdm845.h|  17 +-
 .../msm/disp/dpu1/catalog/dpu_5_0_sm8150.h|  17 +-
 .../msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h   |  17 +-
 .../msm/disp/dpu1/catalog/dpu_5_4_sm6125.h|   8 +-
 .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h|  25 ++-
 .../msm/disp/dpu1/catalog/dpu_6_2_sc7180.h|  11 +-
 .../msm/disp/dpu1/catalog/dpu_6_3_sm6115.h|   7 +-
 .../msm/disp/dpu1/catalog/dpu_6_4_sm6350.h|  11 +-
 .../msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h   |   4 +-
 .../msm/disp/dpu1/catalog/dpu_6_9_sm6375.h|   7 +-
 .../msm/disp/dpu1/catalog/dpu_7_0_sm8350.h|  25 ++-
 .../msm/disp/dpu1/catalog/dpu_7_2_sc7280.h|   9 +-
 .../msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h  |  25 ++-
 .../msm/disp/dpu1/catalog/dpu_8_1_sm8450.h|  25 ++-
 .../msm/disp/dpu1/catalog/dpu_9_0_sm8550.h|  29 ++--
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 145 +++---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  52 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c   |   9 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h   |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c |   3 +-
 21 files changed, 198 insertions(+), 269 deletions(-)

-- 
2.39.2



Re: [PATCH 4/4] drm/doc/rfc: Mark GPU VA as complete.

2023-09-04 Thread Danilo Krummrich

On 8/31/23 21:17, Rodrigo Vivi wrote:

On Tue, Aug 29, 2023 at 12:30:04PM -0400, Rodrigo Vivi wrote:

Nouveau has already landed the GPU VA helpers, support and
documentation, and Xe is already using the upstream GPU VA.


Danilo, although this is more on the Xe side and I wouldn't ask you
to review our code entirely, I'd like to get your ack here as Daniel
recommended. Meaning that we are aligned there and not creating any
change on top of GPU VA. Xe is currently using GPU VA directly without
any customization.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/commit/ea4ae69e66b2940107e74f240ecb9dae87bf1ff1
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/commits/drm-xe-next?ref_type=heads


Acked-by: Danilo Krummrich 

Just one note: if we end up agreeing on [1], a few more adjustments are needed.

Otherwise, same as the other commit, where is the paragraph going?

- Danilo

[1] https://lore.kernel.org/dri-devel/202308221050.ktj8ufma-...@intel.com/T/#m7f3b5a7ff70723332adeea32671578cb95c62f7c





Signed-off-by: Rodrigo Vivi 
---
  Documentation/gpu/rfc/xe.rst | 36 ++--
  1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/Documentation/gpu/rfc/xe.rst b/Documentation/gpu/rfc/xe.rst
index a115526c03e0..b67f8e6a1825 100644
--- a/Documentation/gpu/rfc/xe.rst
+++ b/Documentation/gpu/rfc/xe.rst
@@ -88,24 +88,6 @@ depend on any other patch touching drm_scheduler itself that was not yet merged
 through drm-misc. This, by itself, already includes the reach of an agreement for
  uniform 1 to 1 relationship implementation / usage across drivers.
  
-GPU VA
-------
-Two main goals of Xe are meeting together here:
-
-1) Have an uAPI that aligns with modern UMD needs.
-
-2) Early upstream engagement.
-
-RedHat engineers working on Nouveau proposed a new DRM feature to handle keeping
-track of GPU virtual address mappings. This is still not merged upstream, but
-this aligns very well with our goals and with our VM_BIND. The engagement with
-upstream and the port of Xe towards GPUVA is already ongoing.
-
-As a key measurable result, Xe needs to be aligned with the GPU VA and working in
-our tree. Missing Nouveau patches should *not* block Xe and any needed GPUVA
-related patch should be independent and present on dri-devel or acked by
-maintainers to go along with the first Xe pull request towards drm-next.
-
  ASYNC VM_BIND
 -------------
 Although having a common DRM level IOCTL for VM_BIND is not a requirement to get
@@ -230,3 +212,21 @@ Xe merged, it is mandatory to enforce the overall locking scheme for all major
  structs and list (so vm and vma). So, a consensus is needed, and possibly some
  common helpers. If helpers are needed, they should be also documented in this
  document.
+
+GPU VA
+------
+Two main goals of Xe are meeting together here:
+
+1) Have an uAPI that aligns with modern UMD needs.
+
+2) Early upstream engagement.
+
+RedHat engineers working on Nouveau proposed a new DRM feature to handle keeping
+track of GPU virtual address mappings. This is still not merged upstream, but
+this aligns very well with our goals and with our VM_BIND. The engagement with
+upstream and the port of Xe towards GPUVA is already ongoing.
+
+As a key measurable result, Xe needs to be aligned with the GPU VA and working in
+our tree. Missing Nouveau patches should *not* block Xe and any needed GPUVA
+related patch should be independent and present on dri-devel or acked by
+maintainers to go along with the first Xe pull request towards drm-next.
--
2.41.0







Re: [PATCH 3/4] drm/doc/rfc: Mark DRM_VM_BIND as complete.

2023-09-04 Thread Danilo Krummrich

Hi Rodrigo,

On 8/31/23 21:10, Rodrigo Vivi wrote:

On Tue, Aug 29, 2023 at 12:30:03PM -0400, Rodrigo Vivi wrote:

The consensus is for individual per-driver VM_BIND uAPIs with
the GPUVA helpers that are already implemented and merged
upstream.

The merged GPUVA documentation also establish some overall
rules for the locking to be followed by the drivers.


Danilo, do you agree with this?
if nothing is missing on that front, could you please ack this patch?


I agree with the above, hence:

Acked-by: Danilo Krummrich 

However, the documentation below seems to be more about a common DRM_VM_BIND 
IOCTL?
I guess your commit refers to the end where it talks about common VM_BIND 
helpers.

Otherwise the patch is moving the "DRM_VM_BIND" paragraph somewhere below the
"Dev_coredump" paragraph. Is there some kind of "Done-Section" I'm missing?

- Danilo





Signed-off-by: Rodrigo Vivi 
---
  Documentation/gpu/rfc/xe.rst | 34 +-
  1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/Documentation/gpu/rfc/xe.rst b/Documentation/gpu/rfc/xe.rst
index bf60c5c82d0e..a115526c03e0 100644
--- a/Documentation/gpu/rfc/xe.rst
+++ b/Documentation/gpu/rfc/xe.rst
@@ -106,23 +106,6 @@ our tree. Missing Nouveau patches should *not* block Xe and any needed GPUVA
  related patch should be independent and present on dri-devel or acked by
  maintainers to go along with the first Xe pull request towards drm-next.
  
-DRM_VM_BIND
------------
-Nouveau, and Xe are all implementing ‘VM_BIND’ and new ‘Exec’ uAPIs in order to
-fulfill the needs of the modern uAPI. Xe merge should *not* be blocked on the
-development of a common new drm_infrastructure. However, the Xe team needs to
-engage with the community to explore the options of a common API.
-
-As a key measurable result, the DRM_VM_BIND needs to be documented in this file
-below, or this entire block deleted if the consensus is for independent drivers
-vm_bind ioctls.
-
-Although having a common DRM level IOCTL for VM_BIND is not a requirement to get
-Xe merged, it is mandatory to enforce the overall locking scheme for all major
-structs and list (so vm and vma). So, a consensus is needed, and possibly some
-common helpers. If helpers are needed, they should be also documented in this
-document.
-
  ASYNC VM_BIND
 -------------
  Although having a common DRM level IOCTL for VM_BIND is not a requirement to 
get
@@ -230,3 +213,20 @@ Later, when we are in-tree, the goal is to collaborate with devcoredump
  infrastructure with overall possible improvements, like multiple file support
  for better organization of the dumps, snapshot support, dmesg extra print,
  and whatever may make sense and help the overall infrastructure.
+
+DRM_VM_BIND
+-----------
+Nouveau, and Xe are all implementing ‘VM_BIND’ and new ‘Exec’ uAPIs in order to
+fulfill the needs of the modern uAPI. Xe merge should *not* be blocked on the
+development of a common new drm_infrastructure. However, the Xe team needs to
+engage with the community to explore the options of a common API.
+
+As a key measurable result, the DRM_VM_BIND needs to be documented in this file
+below, or this entire block deleted if the consensus is for independent drivers
+vm_bind ioctls.
+
+Although having a common DRM level IOCTL for VM_BIND is not a requirement to get
+Xe merged, it is mandatory to enforce the overall locking scheme for all major
+structs and list (so vm and vma). So, a consensus is needed, and possibly some
+common helpers. If helpers are needed, they should be also documented in this
+document.
--
2.41.0







Re: [PATCH] drm/tests: Add KUnit tests for drm_fb_blit()

2023-09-04 Thread Maira Canal

Hi Arthur,

On 9/1/23 14:08, Arthur Grillo wrote:

Insert a parameterized test for drm_fb_blit() to ensure correctness
and prevent future regressions.

The test works by calling drm_fb_blit() on every supported format.
Also, to fully test the function, add new format conversion tests.


Wouldn't it be better to separate this into two patches: one adding the
new format conversion tests and another adding the call to drm_fb_blit()?

Best Regards,
- Maíra



Signed-off-by: Arthur Grillo 
---
  drivers/gpu/drm/tests/drm_format_helper_test.c | 284 +
  1 file changed, 284 insertions(+)

diff --git a/drivers/gpu/drm/tests/drm_format_helper_test.c b/drivers/gpu/drm/tests/drm_format_helper_test.c
index 79bc9d4bbd71..889287245b1e 100644
--- a/drivers/gpu/drm/tests/drm_format_helper_test.c
+++ b/drivers/gpu/drm/tests/drm_format_helper_test.c
@@ -81,6 +81,16 @@ struct fb_swab_result {
const u32 expected[TEST_BUF_SIZE];
  };
  
+struct convert_to_xbgr_result {

+   unsigned int dst_pitch;
+   const u32 expected[TEST_BUF_SIZE];
+};
+
+struct convert_to_abgr_result {
+   unsigned int dst_pitch;
+   const u32 expected[TEST_BUF_SIZE];
+};
+
  struct convert_xrgb_case {
const char *name;
unsigned int pitch;
@@ -98,6 +108,8 @@ struct convert_xrgb_case {
struct convert_to_argb2101010_result argb2101010_result;
struct convert_to_mono_result mono_result;
struct fb_swab_result swab_result;
+   struct convert_to_xbgr_result xbgr_result;
+   struct convert_to_abgr_result abgr_result;
  };
  
  static struct convert_xrgb_case convert_xrgb_cases[] = {

@@ -155,6 +167,14 @@ static struct convert_xrgb_case 
convert_xrgb_cases[] = {
.dst_pitch =  TEST_USE_DEFAULT_PITCH,
.expected = { 0xFF01 },
},
+   .xbgr_result = {
+   .dst_pitch =  TEST_USE_DEFAULT_PITCH,
+   .expected = { 0x01FF },
+   },
+   .abgr_result = {
+   .dst_pitch =  TEST_USE_DEFAULT_PITCH,
+   .expected = { 0xFFFF },
+   },
},
{
.name = "single_pixel_clip_rectangle",
@@ -213,6 +233,14 @@ static struct convert_xrgb_case 
convert_xrgb_cases[] = {
.dst_pitch =  TEST_USE_DEFAULT_PITCH,
.expected = { 0xFF10 },
},
+   .xbgr_result = {
+   .dst_pitch =  TEST_USE_DEFAULT_PITCH,
+   .expected = { 0x10FF },
+   },
+   .abgr_result = {
+   .dst_pitch =  TEST_USE_DEFAULT_PITCH,
+   .expected = { 0xFFFF },
+   },
},
{
/* Well known colors: White, black, red, green, blue, magenta,
@@ -343,6 +371,24 @@ static struct convert_xrgb_case 
convert_xrgb_cases[] = {
0x0077, 0x0088,
},
},
+   .xbgr_result = {
+   .dst_pitch =  TEST_USE_DEFAULT_PITCH,
+   .expected = {
+   0x11FF, 0x2200,
+   0x33FF, 0x4400FF00,
+   0x55FF, 0x66FF00FF,
+   0x7700, 0x8800,
+   },
+   },
+   .abgr_result = {
+   .dst_pitch =  TEST_USE_DEFAULT_PITCH,
+   .expected = {
+   0x, 0xFF00,
+   0xFFFF, 0xFF00FF00,
+   0x, 0x00FF,
+   0xFF00, 0xFF00,
+   },
+   },
},
{
/* Randomly picked colors. Full buffer within the clip area. */
@@ -458,6 +504,22 @@ static struct convert_xrgb_case 
convert_xrgb_cases[] = {
0x0303A8C2, 0x73F06CD2, 0x9C440EA3, 0x, 
0x,
},
},
+   .xbgr_result = {
+   .dst_pitch =  20,
+   .expected = {
+   0xA19C440E, 0xB1054D11, 0xC103F3A8, 0x, 
0x,
+   0xD173F06C, 0xA29C440E, 0xB2054D11, 0x, 
0x,
+   0xC20303A8, 0xD273F06C, 0xA39C440E, 0x, 
0x,
+   },
+   },
+   .abgr_result = {
+   .dst_pitch =  20,
+   .expected = {
+   0xFF9C440E, 0xFF054D11, 0xFF03F3A8, 0x, 
0x,
+   

Re: [PATCH] drm/debugfs: Add inline to drm_debugfs_dev_init() to suppres -Wunused-function

2023-09-04 Thread Maira Canal

On 9/1/23 15:05, Arthur Grillo wrote:

When CONFIG_DEBUG_FS is not set, -Wunused-function warnings appear;
make the static function inline to suppress them.

Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202309012114.t8vlfaf8-...@intel.com/
Closes: 
https://lore.kernel.org/oe-kbuild-all/202309012131.feakbzej-...@intel.com/
Signed-off-by: Arthur Grillo 


Reviewed-by: Maíra Canal 

Best Regards,
- Maíra


---
  include/drm/drm_drv.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 9850fe73b739..e2640dc64e08 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -584,7 +584,7 @@ static inline bool drm_firmware_drivers_only(void)
  #if defined(CONFIG_DEBUG_FS)
  void drm_debugfs_dev_init(struct drm_device *dev, struct dentry *root);
  #else
-static void drm_debugfs_dev_init(struct drm_device *dev, struct dentry *root)
+static inline void drm_debugfs_dev_init(struct drm_device *dev, struct dentry 
*root)
  {
  }
  #endif

---
base-commit: 8e455145d8f163aefa6b9cc29478e0a9f82276e6
change-id: 20230901-debugfs-fix-unused-function-warning-9ebbecbd6a5a

Best regards,


Re: [PATCH] drm/tests: Zero initialize fourccs_out

2023-09-04 Thread Maira Canal

On 9/1/23 15:52, Arthur Grillo wrote:

The fourccs_out array is not initialized. As drm_fb_build_fourcc_list()
doesn't necessarily write the whole array, and the test compares all of
it, the comparison could fail if the array is not initialized.
Zero-initialize the array to fix this.

Fixes: 371e0b186a13 ("drm/tests: Add KUnit tests for 
drm_fb_build_fourcc_list()")
Signed-off-by: Arthur Grillo 


Reviewed-by: Maíra Canal 

Best Regards,
- Maíra


---
  drivers/gpu/drm/tests/drm_format_helper_test.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tests/drm_format_helper_test.c 
b/drivers/gpu/drm/tests/drm_format_helper_test.c
index 79bc9d4bbd71..1a6bd291345d 100644
--- a/drivers/gpu/drm/tests/drm_format_helper_test.c
+++ b/drivers/gpu/drm/tests/drm_format_helper_test.c
@@ -1165,7 +1165,7 @@ KUNIT_ARRAY_PARAM(fb_build_fourcc_list, 
fb_build_fourcc_list_cases, fb_build_fou
  static void drm_test_fb_build_fourcc_list(struct kunit *test)
  {
const struct fb_build_fourcc_list_case *params = test->param_value;
-   u32 fourccs_out[TEST_BUF_SIZE];
+   u32 fourccs_out[TEST_BUF_SIZE] = {0};
size_t nfourccs_out;
struct drm_device *drm;
struct device *dev;

---
base-commit: 8e455145d8f163aefa6b9cc29478e0a9f82276e6
change-id: 20230901-zero-init-fourcc-list-test-2c934b6b7eb8

Best regards,


[RFC, drm-misc-next v4 9/9] drm/gma500: Register as a VGA client by calling vga_client_register()

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

The display controller in the N2000/D2000 series can be VGA-compatible,
so let's register gma500 as a VGA client, even though the firmware may
alter the PCI class code of the IGD in a multi-GPU configuration. This is
harmless, because VGAARB only cares about VGA devices.

Note that the display controller in N2000/D2000 processors doesn't have a
valid VRAM BAR; the firmware puts the EFI framebuffer into the stolen
memory, so commit <86fd887b7fe3> ("vgaarb: Don't default exclusively to
first video device with mem+io") is not effective in such a case. A
benefit of the stolen memory is that it does not suffer from PCI resource
relocation, because it is carved out by the firmware and resides in
system RAM. Therefore, while at it, provide a naive firmware framebuffer
identification function and use the new mechanism just created.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/gma500/psb_drv.c | 57 ++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 8b64f61ffaf9..eb95d030d981 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -14,7 +14,7 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 
 #include 
@@ -36,6 +36,11 @@
 #include "psb_irq.h"
 #include "psb_reg.h"
 
+static int gma500_modeset = -1;
+
+MODULE_PARM_DESC(modeset, "Disable/Enable modesetting");
+module_param_named(modeset, gma500_modeset, int, 0400);
+
 static const struct drm_driver driver;
 static int psb_pci_probe(struct pci_dev *pdev, const struct pci_device_id 
*ent);
 
@@ -446,6 +451,49 @@ static int gma_remove_conflicting_framebuffers(struct 
pci_dev *pdev,
return __aperture_remove_legacy_vga_devices(pdev);
 }
 
+static bool gma_contain_firmware_fb(u64 ap_start, u64 ap_end)
+{
+   u64 fb_start;
+   u64 fb_size;
+   u64 fb_end;
+
+   if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
+   fb_start = (u64)screen_info.ext_lfb_base << 32 | 
screen_info.lfb_base;
+   else
+   fb_start = screen_info.lfb_base;
+
+   fb_size = screen_info.lfb_size;
+   fb_end = fb_start + fb_size - 1;
+
+   /* No firmware framebuffer support */
+   if (!fb_start || !fb_size)
+   return false;
+
+   if (fb_start >= ap_start && fb_end <= ap_end)
+   return true;
+
+   return false;
+}
+
+static bool gma_want_to_be_primary(struct pci_dev *pdev)
+{
+   struct drm_device *drm = pci_get_drvdata(pdev);
+   struct drm_psb_private *priv = to_drm_psb_private(drm);
+   u64 vram_base = priv->stolen_base;
+   u64 vram_size = priv->vram_stolen_size;
+
+   if (gma500_modeset == 10)
+   return true;
+
+   /* Stolen memory are not going to be moved */
+   if (gma_contain_firmware_fb(vram_base, vram_base + vram_size)) {
+   drm_dbg(drm, "Contains firmware FB in the stolen memory\n");
+   return true;
+   }
+
+   return false;
+}
+
 static int psb_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
struct drm_psb_private *dev_priv;
@@ -475,6 +523,8 @@ static int psb_pci_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
if (ret)
return ret;
 
+   vga_client_register(pdev, NULL, gma_want_to_be_primary);
+
psb_fbdev_setup(dev_priv);
 
return 0;
@@ -526,7 +576,10 @@ static struct pci_driver psb_pci_driver = {
 
 static int __init psb_init(void)
 {
-   if (drm_firmware_drivers_only())
+   if (drm_firmware_drivers_only() && (gma500_modeset == -1))
+   return -ENODEV;
+
+   if (!gma500_modeset)
return -ENODEV;
 
return pci_register_driver(&psb_pci_driver);
-- 
2.34.1



[RFC, drm-misc-next v4 8/9] drm/hibmc: Register as a VGA client by calling vga_client_register()

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

The display controller in the Hibmc chip is a VGA-compatible display
controller. Since ARM64 doesn't need the VGA console, it does not need to
worry about the side effects that come with VGA compatibility. However,
the real problem is that some ARM64 PCs and servers do not have good UEFI
firmware support; at least, it is not as good as UEFI firmware for x86.
The Huawei KunPeng 920 PC and Taishan 100 server are examples: when a
discrete GPU is mounted on such machines, the UEFI firmware still selects
the integrated display controller (in the BMC) as the primary GPU. This
is hardcoded, no options are provided for selection, and a Linux user has
no control at all.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c 
b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
index 8a98fa276e8a..73a3f1cb109a 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -27,6 +28,10 @@
 #include "hibmc_drm_drv.h"
 #include "hibmc_drm_regs.h"
 
+static int hibmc_modeset = -1;
+MODULE_PARM_DESC(modeset, "Disable/Enable modesetting");
+module_param_named(modeset, hibmc_modeset, int, 0400);
+
 DEFINE_DRM_GEM_FOPS(hibmc_fops);
 
 static irqreturn_t hibmc_interrupt(int irq, void *arg)
@@ -299,6 +304,14 @@ static int hibmc_load(struct drm_device *dev)
return ret;
 }
 
+static bool hibmc_want_to_be_primary(struct pci_dev *pdev)
+{
+   if (hibmc_modeset == 10)
+   return true;
+
+   return false;
+}
+
 static int hibmc_pci_probe(struct pci_dev *pdev,
   const struct pci_device_id *ent)
 {
@@ -339,6 +352,8 @@ static int hibmc_pci_probe(struct pci_dev *pdev,
goto err_unload;
}
 
+   vga_client_register(pdev, NULL, hibmc_want_to_be_primary);
+
drm_fbdev_generic_setup(dev, 32);
 
return 0;
-- 
2.34.1



[RFC, drm-misc-next v4 5/9] drm/i915: Implement .be_primary() callback

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which one
is primary at boot time. This patch tries to solve the mentioned problem by
implementing the .be_primary() callback. Pass i915.modeset=10 on the kernel
cmd line if you really want the device bound by i915 drm driver to be the
primary video adapter, no matter what VGAARB says.

Cc: Jani Nikula 
Cc: David Airlie 
Cc: Daniel Vetter 
Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/i915/display/intel_vga.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_vga.c 
b/drivers/gpu/drm/i915/display/intel_vga.c
index 98d7d4dffe9f..e3f78ba2668b 100644
--- a/drivers/gpu/drm/i915/display/intel_vga.c
+++ b/drivers/gpu/drm/i915/display/intel_vga.c
@@ -113,6 +113,17 @@ intel_vga_set_decode(struct pci_dev *pdev, bool 
enable_decode)
return VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
 }
 
+static bool intel_want_to_be_primary(struct pci_dev *pdev)
+{
+   struct drm_i915_private *i915 = pdev_to_i915(pdev);
+   struct i915_params *params = &i915->params;
+
+   if (params->modeset == 10)
+   return true;
+
+   return false;
+}
+
 int intel_vga_register(struct drm_i915_private *i915)
 {
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
@@ -126,7 +137,8 @@ int intel_vga_register(struct drm_i915_private *i915)
 * then we do not take part in VGA arbitration and the
 * vga_client_register() fails with -ENODEV.
 */
-   ret = vga_client_register(pdev, intel_vga_set_decode, NULL);
+   ret = vga_client_register(pdev, intel_vga_set_decode,
+ intel_want_to_be_primary);
if (ret && ret != -ENODEV)
return ret;
 
-- 
2.34.1



[RFC, drm-misc-next v4 7/9] drm/ast: Register as a VGA client by calling vga_client_register()

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

Because the display controller in the ASpeed BMC chip is a VGA-compatible
device (the AST2400 software programming guide says that it is fully
IBM VGA compliant), it should also participate in the arbitration.

Cc: Thomas Zimmermann 
Cc: Jocelyn Falempe 
Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/ast/ast_drv.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index e1224ef4ad83..1349f7bb5dfb 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -89,6 +90,34 @@ static const struct pci_device_id ast_pciidlist[] = {
 
 MODULE_DEVICE_TABLE(pci, ast_pciidlist);
 
+static bool ast_want_to_be_primary(struct pci_dev *pdev)
+{
+   if (ast_modeset == 10)
+   return true;
+
+   return false;
+}
+
+static unsigned int ast_vga_set_decode(struct pci_dev *pdev, bool state)
+{
+   struct drm_device *drm = pci_get_drvdata(pdev);
+   struct ast_device *ast = to_ast_device(drm);
+   unsigned int decode;
+
+   if (state) {
+   /* Enable standard VGA decode and Enable normal VGA decode */
+   ast_set_index_reg(ast, AST_IO_CRTC_PORT, 0xa1, 0x04);
+
+   decode = VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM |
+VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
+   } else {
+   ast_set_index_reg(ast, AST_IO_CRTC_PORT, 0xa1, 0x07);
+   decode = VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
+   }
+
+   return decode;
+}
+
 static int ast_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
struct ast_device *ast;
@@ -112,6 +141,8 @@ static int ast_pci_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
if (ret)
return ret;
 
+   vga_client_register(pdev, ast_vga_set_decode, ast_want_to_be_primary);
+
drm_fbdev_generic_setup(dev, 32);
 
return 0;
-- 
2.34.1



[RFC, drm-misc-next v4 6/9] drm/loongson: Implement .be_primary() callback

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which one
is primary at boot time. This patch tries to solve the mentioned problem by
implementing the .be_primary() callback. Pass loongson.modeset=10 on the
kernel cmd line if you really want the device bound by loongson drm driver
to be the primary video adapter, no matter what VGAARB says.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/loongson/loongson_module.c |  2 +-
 drivers/gpu/drm/loongson/loongson_module.h |  1 +
 drivers/gpu/drm/loongson/lsdc_drv.c| 10 +-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/loongson/loongson_module.c 
b/drivers/gpu/drm/loongson/loongson_module.c
index d2a51bd395f6..12f2a453adff 100644
--- a/drivers/gpu/drm/loongson/loongson_module.c
+++ b/drivers/gpu/drm/loongson/loongson_module.c
@@ -9,7 +9,7 @@
 
 #include "loongson_module.h"
 
-static int loongson_modeset = -1;
+int loongson_modeset = -1;
 MODULE_PARM_DESC(modeset, "Disable/Enable modesetting");
 module_param_named(modeset, loongson_modeset, int, 0400);
 
diff --git a/drivers/gpu/drm/loongson/loongson_module.h 
b/drivers/gpu/drm/loongson/loongson_module.h
index 931c17521bf0..afff51e7f34f 100644
--- a/drivers/gpu/drm/loongson/loongson_module.h
+++ b/drivers/gpu/drm/loongson/loongson_module.h
@@ -6,6 +6,7 @@
 #ifndef __LOONGSON_MODULE_H__
 #define __LOONGSON_MODULE_H__
 
+extern int loongson_modeset;
 extern int loongson_vblank;
 extern struct pci_driver lsdc_pci_driver;
 
diff --git a/drivers/gpu/drm/loongson/lsdc_drv.c 
b/drivers/gpu/drm/loongson/lsdc_drv.c
index d10a28c2c494..7183b0666167 100644
--- a/drivers/gpu/drm/loongson/lsdc_drv.c
+++ b/drivers/gpu/drm/loongson/lsdc_drv.c
@@ -257,6 +257,14 @@ static unsigned int lsdc_vga_set_decode(struct pci_dev 
*pdev, bool state)
return VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
 }
 
+static bool lsdc_want_to_be_primary(struct pci_dev *pdev)
+{
+   if (loongson_modeset == 10)
+   return true;
+
+   return false;
+}
+
 static int lsdc_pci_probe(struct pci_dev *pdev, const struct pci_device_id 
*ent)
 {
const struct lsdc_desc *descp;
@@ -289,7 +297,7 @@ static int lsdc_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
 
pci_set_drvdata(pdev, ddev);
 
-   vga_client_register(pdev, lsdc_vga_set_decode, NULL);
+   vga_client_register(pdev, lsdc_vga_set_decode, lsdc_want_to_be_primary);
 
drm_kms_helper_poll_init(ddev);
 
-- 
2.34.1



[RFC, drm-misc-next v4 4/9] drm/amdgpu: Implement .be_primary() callback

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which one
is primary at boot time. This patch tries to solve the mentioned problem by
implementing the .be_primary() callback. Pass amdgpu.modeset=10 on the
kernel cmd line if you really want the device bound by amdgpu drm driver to
be the primary video adapter, no matter what VGAARB says.

Cc: Alex Deucher 
Cc: Christian Konig 
Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 13 -
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ecc4564ceac0..59bde6972a8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3507,6 +3507,14 @@ static void amdgpu_device_set_mcbp(struct amdgpu_device 
*adev)
DRM_INFO("MCBP is enabled\n");
 }
 
+static bool amdgpu_want_to_be_primary(struct pci_dev *pdev)
+{
+   if (amdgpu_modeset == 10)
+   return true;
+
+   return false;
+}
+
 /**
  * amdgpu_device_init - initialize the driver
  *
@@ -3916,7 +3924,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 * ignore it
 */
if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
-   vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, 
NULL);
+   vga_client_register(adev->pdev, amdgpu_device_vga_set_decode,
+   amdgpu_want_to_be_primary);
 
px = amdgpu_device_supports_px(ddev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 81edf66dbea8..2592e24ce62c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -118,6 +118,7 @@
 #define KMS_DRIVER_MINOR   54
 #define KMS_DRIVER_PATCHLEVEL  0
 
+int amdgpu_modeset = -1;
 unsigned int amdgpu_vram_limit = UINT_MAX;
 int amdgpu_vis_vram_limit;
 int amdgpu_gart_size = -1; /* auto */
@@ -223,6 +224,13 @@ struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
.period = 0x0, /* default to 0x0 (timeout disable) */
 };
 
+/**
+ * DOC: modeset (int)
+ * Disable/Enable kernel modesetting (1 = enable, 0 = disable, -1 = auto 
(default)).
+ */
+MODULE_PARM_DESC(modeset, "Disable/Enable kernel modesetting");
+module_param_named(modeset, amdgpu_modeset, int, 0600);
+
 /**
  * DOC: vramlimit (int)
  * Restrict the total amount of VRAM in MiB for testing.  The default is 0 
(Use full VRAM).
@@ -2872,7 +2880,10 @@ static int __init amdgpu_init(void)
 {
int r;
 
-   if (drm_firmware_drivers_only())
+   if (drm_firmware_drivers_only() && amdgpu_modeset == -1)
+   return -EINVAL;
+
+   if (amdgpu_modeset == 0)
return -EINVAL;
 
r = amdgpu_sync_init();
-- 
2.34.1



[RFC, drm-misc-next v4 3/9] drm/radeon: Implement .be_primary() callback

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which one
is primary at boot time. This patch tries to solve the mentioned problem by
implementing the .be_primary() callback. Pass radeon.modeset=10 on the
kernel cmd line if you really want the device bound by radeon to be the
primary video adapter, no matter what VGAARB says.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/radeon/radeon_device.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 71f2ff39d6a1..b661cd3a8dc2 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1263,6 +1263,14 @@ static const struct vga_switcheroo_client_ops 
radeon_switcheroo_ops = {
.can_switch = radeon_switcheroo_can_switch,
 };
 
+static bool radeon_want_to_be_primary(struct pci_dev *pdev)
+{
+   if (radeon_modeset == 10)
+   return true;
+
+   return false;
+}
+
 /**
  * radeon_device_init - initialize the driver
  *
@@ -1425,7 +1433,7 @@ int radeon_device_init(struct radeon_device *rdev,
/* if we have > 1 VGA cards, then disable the radeon VGA resources */
/* this will fail for cards that aren't VGA class devices, just
 * ignore it */
-   vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL);
+   vga_client_register(rdev->pdev, radeon_vga_set_decode, 
radeon_want_to_be_primary);
 
if (rdev->flags & RADEON_IS_PX)
runtime = true;
-- 
2.34.1



[RFC, drm-misc-next v4 2/9] drm/nouveau: Implement .be_primary() callback

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which one
is primary at boot time. This patch tries to solve the mentioned problem by
implementing the .be_primary() callback. VGAARB will call back to Nouveau
when the drm/nouveau gets loaded successfully.

Pass nouveau.modeset=10 on the kernel cmd line if you really want the
device bound by Nouveau to be the primary video adapter. This overrides
whatever boot device was selected by VGAARB.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/nouveau/nouveau_vga.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c 
b/drivers/gpu/drm/nouveau/nouveau_vga.c
index 162b4f4676c7..4242188667e2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -80,6 +80,15 @@ nouveau_switcheroo_ops = {
.can_switch = nouveau_switcheroo_can_switch,
 };
 
+static bool
+nouveau_want_to_be_primary(struct pci_dev *pdev)
+{
+   if (nouveau_modeset == 10)
+   return true;
+
+   return false;
+}
+
 void
 nouveau_vga_init(struct nouveau_drm *drm)
 {
@@ -92,7 +101,7 @@ nouveau_vga_init(struct nouveau_drm *drm)
return;
pdev = to_pci_dev(dev->dev);
 
-   vga_client_register(pdev, nouveau_vga_set_decode, NULL);
+   vga_client_register(pdev, nouveau_vga_set_decode, 
nouveau_want_to_be_primary);
 
/* don't register Thunderbolt eGPU with vga_switcheroo */
if (pci_is_thunderbolt_attached(pdev))
-- 
2.34.1



[RFC, drm-misc-next v4 1/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve the above-mentioned
problem by introducing the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver has bound the device successfully, VGAARB will call back
to the device driver to query whether it wants to be primary or not.
Device drivers can just pass NULL if they have no such need.

Acked-by: Jani Nikula  # i915
Reviewed-by: Lyude Paul  # nouveau
Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/i915/display/intel_vga.c   |  3 +-
 drivers/gpu/drm/loongson/lsdc_drv.c|  2 +-
 drivers/gpu/drm/nouveau/nouveau_vga.c  |  2 +-
 drivers/gpu/drm/radeon/radeon_device.c |  2 +-
 drivers/pci/vgaarb.c   | 43 +++---
 drivers/vfio/pci/vfio_pci_core.c   |  2 +-
 include/linux/vgaarb.h |  8 ++--
 8 files changed, 49 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e77f048c99d8..ecc4564ceac0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3916,7 +3916,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 * ignore it
 */
if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
-   vga_client_register(adev->pdev, amdgpu_device_vga_set_decode);
+   vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, 
NULL);
 
px = amdgpu_device_supports_px(ddev);
 
diff --git a/drivers/gpu/drm/i915/display/intel_vga.c 
b/drivers/gpu/drm/i915/display/intel_vga.c
index 286a0bdd28c6..98d7d4dffe9f 100644
--- a/drivers/gpu/drm/i915/display/intel_vga.c
+++ b/drivers/gpu/drm/i915/display/intel_vga.c
@@ -115,7 +115,6 @@ intel_vga_set_decode(struct pci_dev *pdev, bool 
enable_decode)
 
 int intel_vga_register(struct drm_i915_private *i915)
 {
-
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
int ret;
 
@@ -127,7 +126,7 @@ int intel_vga_register(struct drm_i915_private *i915)
 * then we do not take part in VGA arbitration and the
 * vga_client_register() fails with -ENODEV.
 */
-   ret = vga_client_register(pdev, intel_vga_set_decode);
+   ret = vga_client_register(pdev, intel_vga_set_decode, NULL);
if (ret && ret != -ENODEV)
return ret;
 
diff --git a/drivers/gpu/drm/loongson/lsdc_drv.c 
b/drivers/gpu/drm/loongson/lsdc_drv.c
index 188ec82afcfb..d10a28c2c494 100644
--- a/drivers/gpu/drm/loongson/lsdc_drv.c
+++ b/drivers/gpu/drm/loongson/lsdc_drv.c
@@ -289,7 +289,7 @@ static int lsdc_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
 
pci_set_drvdata(pdev, ddev);
 
-   vga_client_register(pdev, lsdc_vga_set_decode);
+   vga_client_register(pdev, lsdc_vga_set_decode, NULL);
 
drm_kms_helper_poll_init(ddev);
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c 
b/drivers/gpu/drm/nouveau/nouveau_vga.c
index f8bf0ec26844..162b4f4676c7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -92,7 +92,7 @@ nouveau_vga_init(struct nouveau_drm *drm)
return;
pdev = to_pci_dev(dev->dev);
 
-   vga_client_register(pdev, nouveau_vga_set_decode);
+   vga_client_register(pdev, nouveau_vga_set_decode, NULL);
 
/* don't register Thunderbolt eGPU with vga_switcheroo */
if (pci_is_thunderbolt_attached(pdev))
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index afbb3a80c0c6..71f2ff39d6a1 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1425,7 +1425,7 @@ int radeon_device_init(struct radeon_device *rdev,
/* if we have > 1 VGA cards, then disable the radeon VGA resources */
/* this will fail for cards that aren't VGA class devices, just
 * ignore it */
-   vga_client_register(rdev->pdev, radeon_vga_set_decode);
+   vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL);
 
if (rdev->flags & RADEON_IS_PX)
runtime = true;
diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 5a696078b382..552ac7df10ee 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -53,6 +53,7 @@ struct vga_device {
bool bridge_has_one_vga;
bool is_firmware_default;   /* device selected by firmware */
unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);
+   bool (*be_primary)(struct pci_dev *pdev);
 };
 
 static LIST_HEAD(vga_list);
@@ -956,6 +957,10 @@ EXPORT_SYMBOL(vga_set_legacy_decoding);
  * @set_decode callback: If a client can disable its GPU VGA resource, it
  * wi

[RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-04 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve the above-mentioned
problem by introducing the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver has bound the device successfully, VGAARB will call back
to the device driver to query whether it wants to be primary or not.
Device drivers can just pass NULL if they have no such need.

Please note that:

1) ARM64, Loongarch, and MIPS servers have a lot of PCIe slots, and I would
   like to mount at least three video cards.

2) Typically, those non-x86 machines don't have good UEFI firmware
   support, which doesn't allow selecting the primary GPU at the firmware
   stage. Even on x86, there are old UEFI firmwares which have already
   made an undesired decision for you.

3) This series is an attempt to solve the remaining problems at the driver
   level, while another series[1] of mine targets the majority of the
   problems at the device level.

Tested (limited) on x86 with four video cards mounted; Intel UHD Graphics
630 is the default boot VGA, successfully overridden by the AST2400 with
ast.modeset=10 appended to the kernel cmd line.

$ lspci | grep VGA

 00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Caicos XTX [Radeon HD 8490 / R5 235X OEM]
 04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
Family (rev 30)
 05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] 
(rev a1)

$ sudo dmesg | grep vgaarb

 pci :00:02.0: vgaarb: setting as boot VGA device
 pci :00:02.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
 pci :01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 pci :04:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 vgaarb: loaded
 ast :04:00.0: vgaarb: Override as primary by driver
 i915 :00:02.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem
 radeon :01:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
 ast :04:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none

v2:
* Add a simple implementation for drm/i915 and drm/ast
* Pick up all tags (Mario)
v3:
* Fix a mistake in the drm/i915 implementation
* Fix patch-cannot-be-applied problem caused by a merge conflict.
v4:
* Focus on solving the real problem.

v1,v2 at https://patchwork.freedesktop.org/series/120059/
   v3 at https://patchwork.freedesktop.org/series/120562/

[1] https://patchwork.freedesktop.org/series/122845/

Sui Jingfeng (9):
  PCI/VGA: Allowing the user to select the primary video adapter at boot
time
  drm/nouveau: Implement .be_primary() callback
  drm/radeon: Implement .be_primary() callback
  drm/amdgpu: Implement .be_primary() callback
  drm/i915: Implement .be_primary() callback
  drm/loongson: Implement .be_primary() callback
  drm/ast: Register as a VGA client by calling vga_client_register()
  drm/hibmc: Register as a VGA client by calling vga_client_register()
  drm/gma500: Register as a VGA client by calling vga_client_register()

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 13 -
 drivers/gpu/drm/ast/ast_drv.c | 31 ++
 drivers/gpu/drm/gma500/psb_drv.c  | 57 ++-
 .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   | 15 +
 drivers/gpu/drm/i915/display/intel_vga.c  | 15 -
 drivers/gpu/drm/loongson/loongson_module.c|  2 +-
 drivers/gpu/drm/loongson/loongson_module.h|  1 +
 drivers/gpu/drm/loongson/lsdc_drv.c   | 10 +++-
 drivers/gpu/drm/nouveau/nouveau_vga.c | 11 +++-
 drivers/gpu/drm/radeon/radeon_device.c| 10 +++-
 drivers/pci/vgaarb.c  | 43 --
 drivers/vfio/pci/vfio_pci_core.c  |  2 +-
 include/linux/vgaarb.h|  8 ++-
 14 files changed, 210 insertions(+), 19 deletions(-)

-- 
2.34.1



[PATCH] accel/habanalabs/gaudi2: Fix incorrect string length computation in gaudi2_psoc_razwi_get_engines()

2023-09-04 Thread Christophe JAILLET
snprintf() returns the "number of characters which *would* be generated for
the given input", not the size *really* generated.

In order to avoid too large values for 'str_size' (and potential negative
values for "PSOC_RAZWI_ENG_STR_SIZE - str_size") use scnprintf()
instead of snprintf().

Fixes: c0e6df916050 ("accel/habanalabs: fix address decode RAZWI handling")
Signed-off-by: Christophe JAILLET 
---
 drivers/accel/habanalabs/gaudi2/gaudi2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c 
b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index d94acec63d95..9617c062b7ca 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
@@ -8277,11 +8277,11 @@ static int gaudi2_psoc_razwi_get_engines(struct 
gaudi2_razwi_info *razwi_info, u
eng_id[num_of_eng] = razwi_info[i].eng_id;
base[num_of_eng] = razwi_info[i].rtr_ctrl;
if (!num_of_eng)
-   str_size += snprintf(eng_name + str_size,
+   str_size += scnprintf(eng_name + str_size,
PSOC_RAZWI_ENG_STR_SIZE - 
str_size, "%s",
razwi_info[i].eng_name);
else
-   str_size += snprintf(eng_name + str_size,
+   str_size += scnprintf(eng_name + str_size,
PSOC_RAZWI_ENG_STR_SIZE - 
str_size, " or %s",
razwi_info[i].eng_name);
num_of_eng++;
-- 
2.34.1



Re: [PATCH v3 1/1] backlight: hid_bl: Add VESA VCP HID backlight driver

2023-09-04 Thread Julius Zint



On Mon, 4 Sep 2023, Thomas Weißschuh wrote:


+Cc Hans who is involved with the backlight subsystem

Hi Julius,

today I stumbled upon a mail from Hans [0], which explains that the
backlight subsystem is not actually a good fit (yet?) for external
displays.

It seems a new API is in the works that would better fit, but I'm not
sure about the state of this API. Maybe Hans can clarify.

This also ties back to my review question of how userspace can figure out
which display a backlight device applies to. So far it cannot.

[0] 
https://lore.kernel.org/lkml/7f2d88de-60c5-e2ff-9b22-acba35cfd...@redhat.com/



Hi Thomas,

thanks for the hint. I will make sure to give this a proper read and
see if it fits my use case better than the current backlight subsystem.

Especially since I wasn't able to properly address your other review
comments for now. You are right that the name should align better with
the kernel module and also that it is possible for multiple displays to
be attached.

In its current state, this would mean that you could only control the
backlight for the first HID device (enough for me :-).

The systemd-backlight@.service uses not only the file name, but also the
full bus path for storing/restoring backlights. I have not yet gotten
around to seeing how the desktops handle brightness control, but since
systemd-backlight@.service already uses the name, it's important that it
stays the same over multiple boots.

I would be able to get a handle on the underlying USB device and use its
serial to uniquely (and persistently) name the backlight. But it does
feel hacky doing it this way.

Anyway, this is where I am at. Thanks again for the support, and I will
try my best to come up with something better.

Julius

Re: [PATCH 06/10] drm/tests: Add test for drm_framebuffer_lookup()

2023-09-04 Thread Carlos

Hi Maíra,

On 8/26/23 11:13, Maíra Canal wrote:

Hi Carlos,

On 8/25/23 13:07, Carlos Eduardo Gallo Filho wrote:

Add a single KUnit test case for the drm_framebuffer_lookup function.

Signed-off-by: Carlos Eduardo Gallo Filho 
---
  drivers/gpu/drm/tests/drm_framebuffer_test.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/tests/drm_framebuffer_test.c 
b/drivers/gpu/drm/tests/drm_framebuffer_test.c

index 16d9cf4bed88..3d14d35b4c4d 100644
--- a/drivers/gpu/drm/tests/drm_framebuffer_test.c
+++ b/drivers/gpu/drm/tests/drm_framebuffer_test.c
@@ -8,6 +8,7 @@
  #include 
    #include 
+#include 
  #include 
  #include 
  #include 
@@ -370,6 +371,10 @@ static int drm_framebuffer_test_init(struct 
kunit *test)

  KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mock);
  dev = &mock->dev;
  +    dev->driver = kunit_kzalloc(test, sizeof(*dev->driver), 
GFP_KERNEL);

+    KUNIT_ASSERT_NOT_ERR_OR_NULL(test, dev->driver);
+
+    idr_init_base(&dev->mode_config.object_idr, 1);


Shouldn't we start to use drm_framebuffer_init()?

Do you mean replacing drm_mode_object_add() with drm_framebuffer_init()?
If so, what would be the advantage of using it? It seems to do the same
as drm_mode_object_add() (by actually calling it) while doing some extra
things that are not really needed by this test (like adding the fb to the
device's fb_list). Am I missing something important?

Thanks,
Carlos


Best Regards,
- Maíra


mutex_init(&dev->mode_config.fb_lock);
  INIT_LIST_HEAD(&dev->mode_config.fb_list);
  dev->mode_config.num_fb = 0;
@@ -530,8 +535,31 @@ static void drm_test_framebuffer_cleanup(struct 
kunit *test)

  KUNIT_ASSERT_EQ(test, dev->mode_config.num_fb, 0);
  }
  +static void drm_test_framebuffer_lookup(struct kunit *test)
+{
+    struct drm_mock *mock = test->priv;
+    struct drm_device *dev = &mock->dev;
+    struct drm_framebuffer fb1 = { };
+    struct drm_framebuffer *fb2;
+    uint32_t id = 0;
+    int ret;
+
+    ret = drm_mode_object_add(dev, &fb1.base, DRM_MODE_OBJECT_FB);
+    KUNIT_ASSERT_EQ(test, ret, 0);
+    id = fb1.base.id;
+
+    /* Looking for fb1 */
+    fb2 = drm_framebuffer_lookup(dev, NULL, id);
+    KUNIT_EXPECT_PTR_EQ(test, fb2, &fb1);
+
+    /* Looking for an inexistent framebuffer */
+    fb2 = drm_framebuffer_lookup(dev, NULL, id + 1);
+    KUNIT_EXPECT_NULL(test, fb2);
+}
+
  static struct kunit_case drm_framebuffer_tests[] = {
  KUNIT_CASE(drm_test_framebuffer_cleanup),
+    KUNIT_CASE(drm_test_framebuffer_lookup),
KUNIT_CASE(drm_test_framebuffer_modifiers_not_supported),
  KUNIT_CASE_PARAM(drm_test_framebuffer_check_src_coords, 
check_src_coords_gen_params),
  KUNIT_CASE_PARAM(drm_test_framebuffer_create, 
drm_framebuffer_create_gen_params),


Re: [PATCH v3 12/12] drm/bridge: tc358768: Attempt to fix DSI horizontal timings

2023-09-04 Thread Marcel Ziswiler
Hi Tomi

Looks good. Thanks! Tested both on Verdin AM62 as well as on Verdin iMX8M Mini.

Just a minor nit-pick in your code comment further below.

On Tue, 2023-08-22 at 19:19 +0300, Tomi Valkeinen wrote:
> The DSI horizontal timing calculations done by the driver seem to often
> lead to underflows or overflows, depending on the videomode.
> 
> There are two main things the current driver doesn't seem to get right:
> DSI HSW and HFP, and VSDly. However, even following Toshiba's
> documentation it seems we don't always get a working display.
> 
> This patch attempts to fix the horizontal timings for DSI event mode, and
> on a system with a DSI->HDMI encoder, a lot of standard HDMI modes now
> seem to work. The work relies on Toshiba's documentation, but also quite
> a bit on empirical testing.
> 
> This also adds timing related debug prints to make it easier to improve
> on this later.
> 
> The DSI pulse mode has only been tested with a fixed-resolution panel,
> which limits the testing of different modes on DSI pulse mode. However,
> as the VSDly calculation also affects pulse mode, this might cause a
> regression.
> 
> Reviewed-by: Peter Ujfalusi 
> Signed-off-by: Tomi Valkeinen 

For the whole series:

Tested-by: Marcel Ziswiler 

> ---
>  drivers/gpu/drm/bridge/tc358768.c | 211 
> +-
>  1 file changed, 183 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/tc358768.c 
> b/drivers/gpu/drm/bridge/tc358768.c
> index f41bf56b7d6b..b465e0a31d09 100644
> --- a/drivers/gpu/drm/bridge/tc358768.c
> +++ b/drivers/gpu/drm/bridge/tc358768.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -157,6 +158,7 @@ struct tc358768_priv {
> u32 frs;/* PLL Freqency range for HSCK (post divider) */
>  
> u32 dsiclk; /* pll_clk / 2 */
> +   u32 pclk;   /* incoming pclk rate */
>  };
>  
>  static inline struct tc358768_priv *dsi_host_to_tc358768(struct mipi_dsi_host
> @@ -380,6 +382,7 @@ static int tc358768_calc_pll(struct tc358768_priv *priv,
> priv->prd = best_prd;
> priv->frs = frs;
> priv->dsiclk = best_pll / 2;
> +   priv->pclk = mode->clock * 1000;
>  
> return 0;
>  }
> @@ -638,6 +641,28 @@ static u32 tc358768_ps_to_ns(u32 ps)
> return ps / 1000;
>  }
>  
> +static u32 tc358768_dpi_to_ns(u32 val, u32 pclk)
> +{
> +   return (u32)div_u64((u64)val * NANO, pclk);
> +}
> +
> +/* Convert value in DPI pixel clock units to DSI byte count */
> +static u32 tc358768_dpi_to_dsi_bytes(struct tc358768_priv *priv, u32 val)
> +{
> +   u64 m = (u64)val * priv->dsiclk / 4 * priv->dsi_lanes;
> +   u64 n = priv->pclk;
> +
> +   return (u32)div_u64(m + n - 1, n);
> +}
> +
> +static u32 tc358768_dsi_bytes_to_ns(struct tc358768_priv *priv, u32 val)
> +{
> +   u64 m = (u64)val * NANO;
> +   u64 n = priv->dsiclk / 4 * priv->dsi_lanes;
> +
> +   return (u32)div_u64(m, n);
> +}
> +
>  static void tc358768_bridge_pre_enable(struct drm_bridge *bridge)
>  {
> struct tc358768_priv *priv = bridge_to_tc358768(bridge);
> @@ -647,11 +672,19 @@ static void tc358768_bridge_pre_enable(struct 
> drm_bridge *bridge)
> s32 raw_val;
> const struct drm_display_mode *mode;
> u32 hsbyteclk_ps, dsiclk_ps, ui_ps;
> -   u32 dsiclk, hsbyteclk, video_start;
> -   const u32 internal_delay = 40;
> +   u32 dsiclk, hsbyteclk;
> int ret, i;
> struct videomode vm;
> struct device *dev = priv->dev;
> +   /* In pixelclock units */
> +   u32 dpi_htot, dpi_data_start;
> +   /* In byte units */
> +   u32 dsi_dpi_htot, dsi_dpi_data_start;
> +   u32 dsi_hsw, dsi_hbp, dsi_hact, dsi_hfp;
> +   const u32 dsi_hss = 4; /* HSS is a short packet (4 bytes) */
> +   /* In hsbyteclk units */
> +   u32 dsi_vsdly;
> +   const u32 internal_dly = 40;
>  
> if (mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS) {
> dev_warn_once(dev, "Non-continuous mode unimplemented, 
> falling back to continuous\n");
> @@ -686,27 +719,23 @@ static void tc358768_bridge_pre_enable(struct 
> drm_bridge *bridge)
> case MIPI_DSI_FMT_RGB888:
> val |= (0x3 << 4);
> hact = vm.hactive * 3;
> -   video_start = (vm.hsync_len + vm.hback_porch) * 3;
> data_type = MIPI_DSI_PACKED_PIXEL_STREAM_24;
> break;
> case MIPI_DSI_FMT_RGB666:
> val |= (0x4 << 4);
> hact = vm.hactive * 3;
> -   video_start = (vm.hsync_len + vm.hback_porch) * 3;
> data_type = MIPI_DSI_PACKED_PIXEL_STREAM_18;
> break;
>  
> case MIPI_DSI_FMT_RGB666_PACKED:
> val |= (0x4 << 4) | BIT(3);
> hact = vm.hactive * 18 / 8;
> -   video_start = (vm.hsync_len + vm.hback_porch) * 18 / 8;
> 

Re: [PATCH 3/4] drm/bridge: lt8912b: Manually disable HPD only if it was enabled

2023-09-04 Thread Marcel Ziswiler
Hi Tomi

Looks good. Thanks! Tested both on Verdin AM62 as well as on Verdin iMX8M Mini.

Just a minor nit-pick in your commit message.

On Fri, 2023-08-04 at 13:48 +0300, Tomi Valkeinen wrote:
> lt8912b only calls drm_bridge_hpd_enable() if it creates a connector and
> the next bridge has DRM_BRIDGE_OP_HPD set. However, when calling
> drm_bridge_hpd_disable() it misses checking if a connector was created,
> calling drm_bridge_hpd_disable() even if HPD was nenver enabled. I don't

was never enabled

> see any issues causing by this wrong call, though.

any issues caused by this wrong call

> Add the check to avoid wrongly calling drm_bridge_hpd_disable().
> 
> Fixes: 3b0a01a6a522 ("drm/bridge: lt8912b: Add hot plug detection")
> Signed-off-by: Tomi Valkeinen 

For the whole series:

Tested-by: Marcel Ziswiler 

> ---
>  drivers/gpu/drm/bridge/lontium-lt8912b.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c 
> b/drivers/gpu/drm/bridge/lontium-lt8912b.c
> index 2d752e083433..9ee639e75a1c 100644
> --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c
> +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c
> @@ -587,7 +587,7 @@ static void lt8912_bridge_detach(struct drm_bridge 
> *bridge)
>  
> lt8912_hard_power_off(lt);
>  
> -   if (lt->hdmi_port->ops & DRM_BRIDGE_OP_HPD)
> +   if (lt->connector.dev && lt->hdmi_port->ops & DRM_BRIDGE_OP_HPD)
> drm_bridge_hpd_disable(lt->hdmi_port);
>  }

Cheers

Marcel


[PATCH 17/17] drm/v3d: Create a CPU job extension for the copy performance query job

2023-09-04 Thread Maíra Canal
A CPU job is a type of job that performs operations that require CPU
intervention. A copy performance query job is a job that copies the complete
or partial result of a query to a buffer. In order to copy the result of
a performance query to a buffer, we need to get the values from the
performance monitors.

So, create a user extension for the CPU job that enables the creation
of a copy performance query job. This user extension will allow the creation
of a CPU job that copies the results of a performance query to a BO, with the
possibility of indicating availability with an availability bit.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h|  1 +
 drivers/gpu/drm/v3d/v3d_sched.c  | 66 ++
 drivers/gpu/drm/v3d/v3d_submit.c | 81 
 include/uapi/drm/v3d_drm.h   | 47 ++
 4 files changed, 195 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 0da88dcea01a..1852b144e737 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -297,6 +297,7 @@ enum v3d_cpu_job_type {
V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY,
V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY,
V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY,
+   V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY,
 };
 
 struct v3d_timestamp_query {
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index b1964bc75d02..2ca714e0df40 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -428,6 +428,71 @@ v3d_reset_performance_queries(struct v3d_cpu_job *job)
}
 }
 
+static void
+v3d_write_performance_query_result(struct v3d_cpu_job *job, void *data, u32 
query)
+{
+   struct v3d_performance_query_info *performance_query = 
&job->performance_query;
+   struct v3d_copy_query_results_info *copy = &job->copy;
+   struct v3d_file_priv *v3d_priv = job->base.file->driver_priv;
+   struct v3d_dev *v3d = job->base.v3d;
+   struct v3d_perfmon *perfmon;
+   u64 counter_values[V3D_PERFCNT_NUM];
+
+   for (int i = 0; i < performance_query->nperfmons; i++) {
+   perfmon = v3d_perfmon_find(v3d_priv,
+  
performance_query->queries[query].kperfmon_ids[i]);
+   if (!perfmon) {
+   DRM_DEBUG("Failed to find perfmon.");
+   continue;
+   }
+
+   v3d_perfmon_stop(v3d, perfmon, true);
+
+   memcpy(&counter_values[i * DRM_V3D_MAX_PERF_COUNTERS], 
perfmon->values,
+  perfmon->ncounters * sizeof(u64));
+
+   v3d_perfmon_put(perfmon);
+   }
+
+   for (int i = 0; i < performance_query->ncounters; i++)
+   write_to_buffer(data, i, copy->do_64bit, counter_values[i]);
+}
+
+
+static void
+v3d_copy_performance_query(struct v3d_cpu_job *job)
+{
+   struct v3d_performance_query_info *performance_query = 
&job->performance_query;
+   struct v3d_copy_query_results_info *copy = &job->copy;
+   struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]);
+   struct dma_fence *fence;
+   bool available, write_result;
+   u8 *data;
+
+   v3d_get_bo_vaddr(bo);
+
+   data = ((u8 *) bo->vaddr) + copy->offset;
+
+   for (int i = 0; i < performance_query->count; i++) {
+   fence = 
drm_syncobj_fence_get(performance_query->queries[i].syncobj);
+   available = fence ? dma_fence_is_signaled(fence) : false;
+
+   write_result = available || copy->do_partial;
+   if (write_result)
+   v3d_write_performance_query_result(job, data, i);
+
+   if (copy->availability_bit)
+   write_to_buffer(data, performance_query->ncounters,
+   copy->do_64bit, available ? 1u : 0u);
+
+   data += copy->stride;
+
+   dma_fence_put(fence);
+   }
+
+   v3d_put_bo_vaddr(bo);
+}
+
 static struct dma_fence *
 v3d_cpu_job_run(struct drm_sched_job *sched_job)
 {
@@ -440,6 +505,7 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
[V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 
v3d_reset_timestamp_queries,
[V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 
v3d_copy_query_results,
[V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = 
v3d_reset_performance_queries,
+   [V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY] = 
v3d_copy_performance_query,
};
 
v3d->cpu_job = job;
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index fb4c68352f4f..9f8d2caa69c7 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -677,6 +677,84 @@ v3d_get_cpu_reset_performance_params(struct drm_file 
*file_priv,
return 0;
 }
 
+static int
+v3d_get_cpu_copy_performance_query_params(struct drm_file *file_priv,
+  

[PATCH 16/17] drm/v3d: Create a CPU job extension for the reset performance query job

2023-09-04 Thread Maíra Canal
A CPU job is a type of job that performs operations that require CPU
intervention. A reset performance query job is a job that resets the
performance queries by resetting the values of the perfmons. Moreover,
we also reset the syncobjs related to the availability of the query.

So, create a user extension for the CPU job that enables the creation
of a reset performance query job. This user extension will allow the creation
of a CPU job that resets the perfmon values and resets the availability syncobj.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h| 31 ++
 drivers/gpu/drm/v3d/v3d_sched.c  | 36 
 drivers/gpu/drm/v3d/v3d_submit.c | 73 
 include/uapi/drm/v3d_drm.h   | 27 
 4 files changed, 167 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 0cb629b116f1..0da88dcea01a 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -241,6 +241,9 @@ struct v3d_job {
 */
struct v3d_perfmon *perfmon;
 
+   /* File descriptor of the process that submitted the job */
+   struct drm_file *file;
+
/* Callback for the freeing of the job on refcount going to 0. */
void (*free)(struct kref *ref);
 };
@@ -293,6 +296,7 @@ enum v3d_cpu_job_type {
V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY,
V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY,
V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY,
+   V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY,
 };
 
 struct v3d_timestamp_query {
@@ -303,6 +307,18 @@ struct v3d_timestamp_query {
struct drm_syncobj *syncobj;
 };
 
+/* Number of perfmons required to handle all supported performance counters */
+#define V3D_MAX_PERFMONS DIV_ROUND_UP(V3D_PERFCNT_NUM, \
+ DRM_V3D_MAX_PERF_COUNTERS)
+
+struct v3d_performance_query {
+   /* Performance monitor IDs for this query */
+   u32 kperfmon_ids[V3D_MAX_PERFMONS];
+
+   /* Syncobj that indicates the query availability */
+   struct drm_syncobj *syncobj;
+};
+
 struct v3d_indirect_csd_info {
/* Indirect CSD */
struct v3d_csd_job *job;
@@ -334,6 +350,19 @@ struct v3d_timestamp_query_info {
u32 count;
 };
 
+struct v3d_performance_query_info {
+   struct v3d_performance_query *queries;
+
+   /* Number of performance queries */
+   u32 count;
+
+   /* Number of performance monitors related to that query pool */
+   u32 nperfmons;
+
+   /* Number of performance counters related to that query pool */
+   u32 ncounters;
+};
+
 struct v3d_copy_query_results_info {
/* Define if should write to buffer using 64 or 32 bits */
bool do_64bit;
@@ -361,6 +390,8 @@ struct v3d_cpu_job {
struct v3d_timestamp_query_info timestamp_query;
 
struct v3d_copy_query_results_info copy;
+
+   struct v3d_performance_query_info performance_query;
 };
 
 struct v3d_submit_outsync {
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index b3664fb5aeef..b1964bc75d02 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -76,6 +76,7 @@ v3d_cpu_job_free(struct drm_sched_job *sched_job)
 {
struct v3d_cpu_job *job = to_cpu_job(sched_job);
struct v3d_timestamp_query_info *timestamp_query = 
&job->timestamp_query;
+   struct v3d_performance_query_info *performance_query = 
&job->performance_query;
 
if (timestamp_query->queries) {
for (int i = 0; i < timestamp_query->count; i++)
@@ -83,6 +84,12 @@ v3d_cpu_job_free(struct drm_sched_job *sched_job)
kvfree(timestamp_query->queries);
}
 
+   if (performance_query->queries) {
+   for (int i = 0; i < performance_query->count; i++)
+   drm_syncobj_put(performance_query->queries[i].syncobj);
+   kvfree(performance_query->queries);
+   }
+
v3d_job_cleanup(&job->base);
 }
 
@@ -393,6 +400,34 @@ v3d_copy_query_results(struct v3d_cpu_job *job)
v3d_put_bo_vaddr(bo);
 }
 
+static void
+v3d_reset_performance_queries(struct v3d_cpu_job *job)
+{
+   struct v3d_performance_query_info *performance_query = 
&job->performance_query;
+   struct v3d_file_priv *v3d_priv = job->base.file->driver_priv;
+   struct v3d_dev *v3d = job->base.v3d;
+   struct v3d_perfmon *perfmon;
+
+   for (int i = 0; i < performance_query->count; i++) {
+   for (int j = 0; j < performance_query->nperfmons; j++) {
+   perfmon = v3d_perfmon_find(v3d_priv,
+  
performance_query->queries[i].kperfmon_ids[j]);
+   if (!perfmon) {
+   DRM_DEBUG("Failed to find perfmon.");
+   continue;
+   }
+
+   v3d_perfmon_stop(v3d, perfmon, false);
+
+  

[PATCH 15/17] drm/v3d: Create a CPU job extension to copy timestamp query to a buffer

2023-09-04 Thread Maíra Canal
A CPU job is a type of job that performs operations that require CPU
intervention. A copy timestamp query job is a job that copies the complete
or partial result of a query to a buffer. As V3D doesn't provide any
mechanism to obtain a timestamp from the GPU, it is a job that needs
CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a copy timestamp query job. This user extension will allow the creation
of a CPU job that copies the results of a timestamp query to a BO, with the
possibility of indicating timestamp availability with an availability bit.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h| 20 ++
 drivers/gpu/drm/v3d/v3d_sched.c  | 54 +
 drivers/gpu/drm/v3d/v3d_submit.c | 68 
 include/uapi/drm/v3d_drm.h   | 41 +++
 4 files changed, 183 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 0ffcc8db155e..0cb629b116f1 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -292,6 +292,7 @@ enum v3d_cpu_job_type {
V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1,
V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY,
V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY,
+   V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY,
 };
 
 struct v3d_timestamp_query {
@@ -333,6 +334,23 @@ struct v3d_timestamp_query_info {
u32 count;
 };
 
+struct v3d_copy_query_results_info {
+   /* Define if should write to buffer using 64 or 32 bits */
+   bool do_64bit;
+
+   /* Define if it can write to buffer even if the query is not available 
*/
+   bool do_partial;
+
+   /* Define if it should write availability bit to buffer */
+   bool availability_bit;
+
+   /* Offset of the copy buffer in the BO */
+   u32 offset;
+
+   /* Stride of the copy buffer in the BO */
+   u32 stride;
+};
+
 struct v3d_cpu_job {
struct v3d_job base;
 
@@ -341,6 +359,8 @@ struct v3d_cpu_job {
struct v3d_indirect_csd_info indirect_csd;
 
struct v3d_timestamp_query_info timestamp_query;
+
+   struct v3d_copy_query_results_info copy;
 };
 
 struct v3d_submit_outsync {
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index da8a709ee590..b3664fb5aeef 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -340,6 +340,59 @@ v3d_reset_timestamp_queries(struct v3d_cpu_job *job)
v3d_put_bo_vaddr(bo);
 }
 
+static void
+write_to_buffer(void *dst, u32 idx, bool do_64bit, u64 value)
+{
+   if (do_64bit) {
+   u64 *dst64 = (u64 *) dst;
+   dst64[idx] = value;
+   } else {
+   u32 *dst32 = (u32 *) dst;
+   dst32[idx] = (u32) value;
+   }
+}
+
+static void
+v3d_copy_query_results(struct v3d_cpu_job *job)
+{
+   struct v3d_timestamp_query_info *timestamp_query = 
&job->timestamp_query;
+   struct v3d_timestamp_query *queries = timestamp_query->queries;
+   struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]);
+   struct v3d_bo *timestamp = to_v3d_bo(job->base.bo[1]);
+   struct v3d_copy_query_results_info *copy = &job->copy;
+   struct dma_fence *fence;
+   u8 *query_addr;
+   bool available, write_result;
+   u8 *data;
+   int i;
+
+   v3d_get_bo_vaddr(bo);
+   v3d_get_bo_vaddr(timestamp);
+
+   data = ((u8 *) bo->vaddr) + copy->offset;
+
+   for (i = 0; i < timestamp_query->count; i++) {
+   fence = drm_syncobj_fence_get(queries[i].syncobj);
+   available = fence ? dma_fence_is_signaled(fence) : false;
+
+   write_result = available || copy->do_partial;
+   if (write_result) {
+   query_addr = ((u8 *) timestamp->vaddr) + 
queries[i].offset;
+   write_to_buffer(data, 0, copy->do_64bit, *((u64 *) 
query_addr));
+   }
+
+   if (copy->availability_bit)
+   write_to_buffer(data, 1, copy->do_64bit, available ? 1u 
: 0u);
+
+   data += copy->stride;
+
+   dma_fence_put(fence);
+   }
+
+   v3d_put_bo_vaddr(timestamp);
+   v3d_put_bo_vaddr(bo);
+}
+
 static struct dma_fence *
 v3d_cpu_job_run(struct drm_sched_job *sched_job)
 {
@@ -350,6 +403,7 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
[V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 
v3d_rewrite_csd_job_wg_counts_from_indirect,
[V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query,
[V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 
v3d_reset_timestamp_queries,
+   [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 
v3d_copy_query_results,
};
 
v3d->cpu_job = job;
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index abd182b20021..9a41e8044011 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -543,6 +5

[PATCH 14/17] drm/v3d: Create a CPU job extension for the reset timestamp job

2023-09-04 Thread Maíra Canal
A CPU job is a type of job that performs operations that require CPU
intervention. A reset timestamp job is a job that resets the timestamp
queries based on the value offset of the first query. As V3D doesn't
provide any mechanism to obtain a timestamp from the GPU, it is a job
that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a reset timestamp job. This user extension will allow the creation of
a CPU job that resets the timestamp value in the timestamp BO and resets
the availability syncobj.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h|  1 +
 drivers/gpu/drm/v3d/v3d_sched.c  | 21 +
 drivers/gpu/drm/v3d/v3d_submit.c | 51 
 include/uapi/drm/v3d_drm.h   | 24 +++
 4 files changed, 97 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 2733f9c9f5f8..0ffcc8db155e 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -291,6 +291,7 @@ struct v3d_csd_job {
 enum v3d_cpu_job_type {
V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1,
V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY,
+   V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY,
 };
 
 struct v3d_timestamp_query {
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index f2fd7ffc8704..da8a709ee590 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -320,6 +320,26 @@ v3d_timestamp_query(struct v3d_cpu_job *job)
v3d_put_bo_vaddr(bo);
 }
 
+static void
+v3d_reset_timestamp_queries(struct v3d_cpu_job *job)
+{
+   struct v3d_timestamp_query_info *timestamp_query = 
&job->timestamp_query;
+   struct v3d_timestamp_query *queries = timestamp_query->queries;
+   struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]);
+   u8 *value_addr;
+
+   v3d_get_bo_vaddr(bo);
+
+   for (int i = 0; i < timestamp_query->count; i++) {
+   value_addr = ((u8 *) bo->vaddr) + queries[i].offset;
+   *((u64 *) value_addr) = 0;
+
+   drm_syncobj_replace_fence(queries[i].syncobj, NULL);
+   }
+
+   v3d_put_bo_vaddr(bo);
+}
+
 static struct dma_fence *
 v3d_cpu_job_run(struct drm_sched_job *sched_job)
 {
@@ -329,6 +349,7 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
void (*v3d_cpu_job_fn[])(struct v3d_cpu_job *job) = {
[V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 
v3d_rewrite_csd_job_wg_counts_from_indirect,
[V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query,
+   [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 
v3d_reset_timestamp_queries,
};
 
v3d->cpu_job = job;
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 5bf55276c4f0..abd182b20021 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -495,6 +495,54 @@ v3d_get_cpu_timestamp_query_params(struct drm_file 
*file_priv,
return 0;
 }
 
+static int
+v3d_get_cpu_reset_timestamp_params(struct drm_file *file_priv,
+  struct drm_v3d_extension __user *ext,
+  struct v3d_cpu_job *job)
+{
+   u32 __user *syncs;
+   struct drm_v3d_reset_timestamp_query reset;
+
+   if (!job) {
+   DRM_DEBUG("CPU job extension was attached to a GPU job.\n");
+   return -EINVAL;
+   }
+
+   if (job->job_type) {
+   DRM_DEBUG("Two CPU job extensions were added to the same CPU 
job.\n");
+   return -EINVAL;
+   }
+
+   if (copy_from_user(&reset, ext, sizeof(reset)))
+   return -EFAULT;
+
+   job->job_type = V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY;
+
+   job->timestamp_query.queries = kvmalloc_array(reset.count,
+ sizeof(struct 
v3d_timestamp_query),
+ GFP_KERNEL);
+   if (!job->timestamp_query.queries)
+   return -ENOMEM;
+
+   syncs = u64_to_user_ptr(reset.syncs);
+
+   for (int i = 0; i < reset.count; i++) {
+   u32 sync;
+
+   job->timestamp_query.queries[i].offset = reset.offset + 8 * i;
+
+   if (copy_from_user(&sync, syncs++, sizeof(sync))) {
+   kvfree(job->timestamp_query.queries);
+   return -EFAULT;
+   }
+
+   job->timestamp_query.queries[i].syncobj = 
drm_syncobj_find(file_priv, sync);
+   }
+   job->timestamp_query.count = reset.count;
+
+   return 0;
+}
+
 /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data
  * according to the extension id (name).
  */
@@ -526,6 +574,9 @@ v3d_get_extensions(struct drm_file *file_priv,
case DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY:
ret = v3d_get_cpu_timestamp_query_params(file_priv, 
user_ext, job);
b

[PATCH 13/17] drm/v3d: Create a CPU job extension for the timestamp query job

2023-09-04 Thread Maíra Canal
A CPU job is a type of job that performs operations that require CPU
intervention. A timestamp query job is a job that calculates the
query timestamp and updates the query availability by signaling a
syncobj. As V3D doesn't provide any mechanism to obtain a timestamp
from the GPU, it is a job that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a timestamp query job. This user extension will allow the creation of
a CPU job that performs the timestamp query calculation and updates the
timestamp BO with the proper value.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h| 17 +
 drivers/gpu/drm/v3d/v3d_sched.c  | 39 +++-
 drivers/gpu/drm/v3d/v3d_submit.c | 62 
 include/uapi/drm/v3d_drm.h   | 27 ++
 4 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 5e44542e7899..2733f9c9f5f8 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -290,6 +290,15 @@ struct v3d_csd_job {
 
 enum v3d_cpu_job_type {
V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1,
+   V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY,
+};
+
+struct v3d_timestamp_query {
+   /* Offset of this query in the timestamp BO for its value. */
+   u32 offset;
+
+   /* Syncobj that indicates the timestamp availability */
+   struct drm_syncobj *syncobj;
 };
 
 struct v3d_indirect_csd_info {
@@ -317,12 +326,20 @@ struct v3d_indirect_csd_info {
struct ww_acquire_ctx acquire_ctx;
 };
 
+struct v3d_timestamp_query_info {
+   struct v3d_timestamp_query *queries;
+
+   u32 count;
+};
+
 struct v3d_cpu_job {
struct v3d_job base;
 
enum v3d_cpu_job_type job_type;
 
struct v3d_indirect_csd_info indirect_csd;
+
+   struct v3d_timestamp_query_info timestamp_query;
 };
 
 struct v3d_submit_outsync {
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 7d3567c237dc..f2fd7ffc8704 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -18,6 +18,7 @@
  * semaphores to interlock between them.
  */
 
+#include 
 #include 
 
 #include "v3d_drv.h"
@@ -70,6 +71,21 @@ v3d_sched_job_free(struct drm_sched_job *sched_job)
v3d_job_cleanup(job);
 }
 
+static void
+v3d_cpu_job_free(struct drm_sched_job *sched_job)
+{
+   struct v3d_cpu_job *job = to_cpu_job(sched_job);
+   struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query;
+
+   if (timestamp_query->queries) {
+   for (int i = 0; i < timestamp_query->count; i++)
+   drm_syncobj_put(timestamp_query->queries[i].syncobj);
+   kvfree(timestamp_query->queries);
+   }
+
+   v3d_job_cleanup(&job->base);
+}
+
 static void
 v3d_switch_perfmon(struct v3d_dev *v3d, struct v3d_job *job)
 {
@@ -284,6 +300,26 @@ v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job)
v3d_put_bo_vaddr(bo);
 }
 
+static void
+v3d_timestamp_query(struct v3d_cpu_job *job)
+{
+   struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query;
+   struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]);
+   u8 *value_addr;
+
+   v3d_get_bo_vaddr(bo);
+
+   for (int i = 0; i < timestamp_query->count; i++) {
+   value_addr = ((u8 *) bo->vaddr) + timestamp_query->queries[i].offset;
+   *((u64 *) value_addr) = i == 0 ? ktime_get_ns() : 0ull;
+
+   drm_syncobj_replace_fence(timestamp_query->queries[i].syncobj,
+ job->base.done_fence);
+   }
+
+   v3d_put_bo_vaddr(bo);
+}
+
 static struct dma_fence *
 v3d_cpu_job_run(struct drm_sched_job *sched_job)
 {
@@ -292,6 +328,7 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
 
void (*v3d_cpu_job_fn[])(struct v3d_cpu_job *job) = {
+   [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect,
+   [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query,
};
 
v3d->cpu_job = job;
@@ -451,7 +488,7 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = {
 static const struct drm_sched_backend_ops v3d_cpu_sched_ops = {
.run_job = v3d_cpu_job_run,
.timedout_job = v3d_generic_job_timedout,
-   .free_job = v3d_sched_job_free
+   .free_job = v3d_cpu_job_free
 };
 
 int
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index cb62d5752d6d..5bf55276c4f0 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -437,6 +437,64 @@ v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv,
  NULL, &info->acquire_ctx);
 }
 
+/* Get data for the query timestamp job submission. */
+static int
+v3d_get_cpu_timestamp_query_params(struct drm_file *file_priv,
+

[PATCH 12/17] drm/v3d: Create a CPU job extension for an indirect CSD job

2023-09-04 Thread Maíra Canal
A CPU job is a type of job that performs operations that require CPU
intervention. An indirect CSD job is a job that, when executed in the
queue, will map the indirect buffer, read the dispatch parameters, and
submit a regular dispatch. Therefore, it is a job that needs CPU
intervention.

So, create a user extension for the CPU job that enables the creation
of an indirect CSD. This user extension will allow the creation of a CSD
job linked to a CPU job. The CPU job will wait for the indirect CSD job
dependencies and, once they are signaled, it will update the CSD job
parameters.

Co-developed-by: Melissa Wen 
Signed-off-by: Melissa Wen 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h|  31 +-
 drivers/gpu/drm/v3d/v3d_sched.c  |  41 -
 drivers/gpu/drm/v3d/v3d_submit.c | 100 ++-
 include/uapi/drm/v3d_drm.h   |  37 +++-
 4 files changed, 205 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 7940b6b1efd0..5e44542e7899 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -288,12 +288,41 @@ struct v3d_csd_job {
struct drm_v3d_submit_csd args;
 };
 
-enum v3d_cpu_job_type {};
+enum v3d_cpu_job_type {
+   V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1,
+};
+
+struct v3d_indirect_csd_info {
+   /* Indirect CSD */
+   struct v3d_csd_job *job;
+
+   /* Clean cache job associated to the Indirect CSD job */
+   struct v3d_job *clean_job;
+
+   /* Offset within the BO where the workgroup counts are stored */
+   u32 offset;
+
+   /* Workgroups size */
+   u32 wg_size;
+
+   /* Indices of the uniforms with the workgroup dispatch counts
+* in the uniform stream.
+*/
+   u32 wg_uniform_offsets[3];
+
+   /* Indirect BO */
+   struct drm_gem_object *indirect;
+
+   /* Context of the Indirect CSD job */
+   struct ww_acquire_ctx acquire_ctx;
+};
 
 struct v3d_cpu_job {
struct v3d_job base;
 
enum v3d_cpu_job_type job_type;
+
+   struct v3d_indirect_csd_info indirect_csd;
 };
 
 struct v3d_submit_outsync {
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 85c11e0fe057..7d3567c237dc 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -24,6 +24,8 @@
 #include "v3d_regs.h"
 #include "v3d_trace.h"
 
+#define V3D_CSD_CFG012_WG_COUNT_SHIFT 16
+
 static struct v3d_job *
 to_v3d_job(struct drm_sched_job *sched_job)
 {
@@ -247,13 +249,50 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
return fence;
 }
 
+static void
+v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job)
+{
+   struct v3d_indirect_csd_info *indirect_csd = &job->indirect_csd;
+   struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]);
+   struct v3d_bo *indirect = to_v3d_bo(indirect_csd->indirect);
+   struct drm_v3d_submit_csd *args = &indirect_csd->job->args;
+   u32 *wg_counts;
+
+   v3d_get_bo_vaddr(bo);
+   v3d_get_bo_vaddr(indirect);
+
+   wg_counts = (uint32_t *) (bo->vaddr + indirect_csd->offset);
+
+   if (wg_counts[0] == 0 || wg_counts[1] == 0 || wg_counts[2] == 0)
+   return;
+
+   args->cfg[0] = wg_counts[0] << V3D_CSD_CFG012_WG_COUNT_SHIFT;
+   args->cfg[1] = wg_counts[1] << V3D_CSD_CFG012_WG_COUNT_SHIFT;
+   args->cfg[2] = wg_counts[2] << V3D_CSD_CFG012_WG_COUNT_SHIFT;
+   args->cfg[4] = DIV_ROUND_UP(indirect_csd->wg_size, 16) *
+  (wg_counts[0] * wg_counts[1] * wg_counts[2]) - 1;
+
+   for (int i = 0; i < 3; i++) {
+   /* 0xffffffff indicates that the uniform rewrite is not needed */
+   if (indirect_csd->wg_uniform_offsets[i] != 0xffffffff) {
+   u32 uniform_idx = indirect_csd->wg_uniform_offsets[i];
+   ((uint32_t *) indirect->vaddr)[uniform_idx] = wg_counts[i];
+   }
+   }
+
+   v3d_put_bo_vaddr(indirect);
+   v3d_put_bo_vaddr(bo);
+}
+
 static struct dma_fence *
 v3d_cpu_job_run(struct drm_sched_job *sched_job)
 {
struct v3d_cpu_job *job = to_cpu_job(sched_job);
struct v3d_dev *v3d = job->base.v3d;
 
-   void (*v3d_cpu_job_fn[])(struct v3d_cpu_job *job) = { };
+   void (*v3d_cpu_job_fn[])(struct v3d_cpu_job *job) = {
+   [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect,
+   };
 
v3d->cpu_job = job;
 
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 72f006738132..cb62d5752d6d 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -395,6 +395,48 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv,
return 0;
 }
 
+/* Get data for the indirect CSD job submission. */
+static int
+v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv,
+   struct d

[PATCH 11/17] drm/v3d: Enable BO mapping

2023-09-04 Thread Maíra Canal
For the indirect CSD CPU job, we will need to access the internal
contents of the BO with the dispatch parameters. Therefore, create
methods to allow the mapping and unmapping of the BO.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_bo.c  | 18 ++
 drivers/gpu/drm/v3d/v3d_drv.h |  4 
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c
index 357a0da7e16a..1bdfac8beafd 100644
--- a/drivers/gpu/drm/v3d/v3d_bo.c
+++ b/drivers/gpu/drm/v3d/v3d_bo.c
@@ -33,6 +33,9 @@ void v3d_free_object(struct drm_gem_object *obj)
struct v3d_dev *v3d = to_v3d_dev(obj->dev);
struct v3d_bo *bo = to_v3d_bo(obj);
 
+   if (bo->vaddr)
+   v3d_put_bo_vaddr(bo);
+
v3d_mmu_remove_ptes(bo);
 
mutex_lock(&v3d->bo_lock);
@@ -134,6 +137,7 @@ struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv,
if (IS_ERR(shmem_obj))
return ERR_CAST(shmem_obj);
bo = to_v3d_bo(&shmem_obj->base);
+   bo->vaddr = NULL;
 
ret = v3d_bo_create_finish(&shmem_obj->base);
if (ret)
@@ -167,6 +171,20 @@ v3d_prime_import_sg_table(struct drm_device *dev,
return obj;
 }
 
+void v3d_get_bo_vaddr(struct v3d_bo *bo)
+{
+   struct drm_gem_shmem_object *obj = &bo->base;
+
+   bo->vaddr = vmap(obj->pages, obj->base.size >> PAGE_SHIFT, VM_MAP,
+pgprot_writecombine(PAGE_KERNEL));
+}
+
+void v3d_put_bo_vaddr(struct v3d_bo *bo)
+{
+   vunmap(bo->vaddr);
+   bo->vaddr = NULL;
+}
+
 int v3d_create_bo_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
 {
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 2a3f3beb272c..7940b6b1efd0 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -179,6 +179,8 @@ struct v3d_bo {
 * v3d_render_job->unref_list
 */
struct list_head unref_head;
+
+   void *vaddr;
 };
 
 static inline struct v3d_bo *
@@ -361,6 +363,8 @@ struct drm_gem_object *v3d_create_object(struct drm_device *dev, size_t size);
 void v3d_free_object(struct drm_gem_object *gem_obj);
 struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv,
 size_t size);
+void v3d_get_bo_vaddr(struct v3d_bo *bo);
+void v3d_put_bo_vaddr(struct v3d_bo *bo);
 int v3d_create_bo_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv);
 int v3d_mmap_bo_ioctl(struct drm_device *dev, void *data,
-- 
2.41.0



[PATCH 10/17] drm/v3d: Detach the CSD job BO setup

2023-09-04 Thread Maíra Canal
From: Melissa Wen 

Detach CSD job setup from CSD submission ioctl to reuse it in CPU
submission ioctl for indirect CSD job.

Signed-off-by: Melissa Wen 
Co-developed-by: Maíra Canal 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_submit.c | 52 +---
 1 file changed, 34 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 5402d8aacb71..72f006738132 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -268,6 +268,37 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file 
*file_priv,
}
 }
 
+static int
+v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
+  struct v3d_dev *v3d,
+  struct drm_v3d_submit_csd *args,
+  struct v3d_csd_job **job,
+  struct v3d_job **clean_job,
+  struct v3d_submit_ext *se,
+  struct ww_acquire_ctx *acquire_ctx)
+{
+   int ret;
+
+   ret = v3d_job_init(v3d, file_priv, (void *)job, sizeof(**job),
+  v3d_job_free, args->in_sync, se, V3D_CSD);
+   if (ret)
+   return ret;
+
+   ret = v3d_job_init(v3d, file_priv, (void *)clean_job, sizeof(**clean_job),
+  v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
+   if (ret)
+   return ret;
+
+   (*job)->args = *args;
+
+   ret = v3d_lookup_bos(&v3d->drm, file_priv, *clean_job,
+args->bo_handles, args->bo_handle_count);
+   if (ret)
+   return ret;
+
+   return v3d_lock_bo_reservations(*clean_job, acquire_ctx);
+}
+
 static void
 v3d_put_multisync_post_deps(struct v3d_submit_ext *se)
 {
@@ -696,24 +727,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
}
}
 
-   ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job),
-  v3d_job_free, args->in_sync, &se, V3D_CSD);
-   if (ret)
-   goto fail;
-
-   ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job),
-  v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
-   if (ret)
-   goto fail;
-
-   job->args = *args;
-
-   ret = v3d_lookup_bos(dev, file_priv, clean_job,
-args->bo_handles, args->bo_handle_count);
-   if (ret)
-   goto fail;
-
-   ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx);
+   ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d, args,
+&job, &clean_job, &se,
+&acquire_ctx);
if (ret)
goto fail;
 
-- 
2.41.0



[PATCH 09/17] drm/v3d: Create tracepoints to track the CPU job

2023-09-04 Thread Maíra Canal
Create tracepoints to track the three major events of a CPU job
lifetime:
1. Submission of a `v3d_submit_cpu` IOCTL
2. Beginning of the execution of a CPU job
3. Ending of the execution of a CPU job

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_sched.c  |  4 +++
 drivers/gpu/drm/v3d/v3d_submit.c |  2 ++
 drivers/gpu/drm/v3d/v3d_trace.h  | 57 
 3 files changed, 63 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 88c483da360c..85c11e0fe057 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -262,8 +262,12 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
return NULL;
}
 
+   trace_v3d_cpu_job_begin(&v3d->drm, job->job_type);
+
v3d_cpu_job_fn[job->job_type](job);
 
+   trace_v3d_cpu_job_end(&v3d->drm, job->job_type);
+
return NULL;
 }
 
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index ff8a77a4e2b0..5402d8aacb71 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -805,6 +805,8 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
goto fail;
}
 
+   trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type);
+
ret = v3d_job_init(v3d, file_priv, (void *)&cpu_job, sizeof(*cpu_job),
   v3d_job_free, 0, &se, V3D_CPU);
if (ret)
diff --git a/drivers/gpu/drm/v3d/v3d_trace.h b/drivers/gpu/drm/v3d/v3d_trace.h
index 7aa8dc356e54..06086ece6e9e 100644
--- a/drivers/gpu/drm/v3d/v3d_trace.h
+++ b/drivers/gpu/drm/v3d/v3d_trace.h
@@ -225,6 +225,63 @@ TRACE_EVENT(v3d_submit_csd,
  __entry->seqno)
 );
 
+TRACE_EVENT(v3d_submit_cpu_ioctl,
+  TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type),
+  TP_ARGS(dev, job_type),
+
+  TP_STRUCT__entry(
+   __field(u32, dev)
+   __field(enum v3d_cpu_job_type, job_type)
+   ),
+
+  TP_fast_assign(
+ __entry->dev = dev->primary->index;
+ __entry->job_type = job_type;
+ ),
+
+  TP_printk("dev=%u, job_type=%d",
+__entry->dev,
+__entry->job_type)
+);
+
+TRACE_EVENT(v3d_cpu_job_begin,
+   TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type),
+   TP_ARGS(dev, job_type),
+
+   TP_STRUCT__entry(
+__field(u32, dev)
+__field(enum v3d_cpu_job_type, job_type)
+),
+
+   TP_fast_assign(
+  __entry->dev = dev->primary->index;
+  __entry->job_type = job_type;
+  ),
+
+   TP_printk("dev=%u, job_type=%d",
+ __entry->dev,
+ __entry->job_type)
+);
+
+TRACE_EVENT(v3d_cpu_job_end,
+   TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type),
+   TP_ARGS(dev, job_type),
+
+   TP_STRUCT__entry(
+__field(u32, dev)
+__field(enum v3d_cpu_job_type, job_type)
+),
+
+   TP_fast_assign(
+  __entry->dev = dev->primary->index;
+  __entry->job_type = job_type;
+  ),
+
+   TP_printk("dev=%u, job_type=%d",
+ __entry->dev,
+ __entry->job_type)
+);
+
 TRACE_EVENT(v3d_cache_clean_begin,
TP_PROTO(struct drm_device *dev),
TP_ARGS(dev),
-- 
2.41.0



[PATCH 08/17] drm/v3d: Use v3d_get_extensions() to parse CPU job data

2023-09-04 Thread Maíra Canal
Currently, v3d_get_extensions() only parses multisync data and assigns
it to the `struct v3d_submit_ext`. But, to implement the CPU job with
user extensions, we want v3d_get_extensions() to be able to parse CPU
job data and assign it to the `struct v3d_cpu_job`.

Therefore, add a `struct v3d_cpu_job *` parameter to v3d_get_extensions().
If this parameter is NULL, the job is a GPU job and CPU job extensions
should be rejected.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_submit.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 40880b758071..ff8a77a4e2b0 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -335,10 +335,9 @@ v3d_get_multisync_post_deps(struct drm_file *file_priv,
 static int
 v3d_get_multisync_submit_deps(struct drm_file *file_priv,
  struct drm_v3d_extension __user *ext,
- void *data)
+ struct v3d_submit_ext *se)
 {
struct drm_v3d_multi_sync multisync;
-   struct v3d_submit_ext *se = data;
int ret;
 
if (se->in_sync_count || se->out_sync_count) {
@@ -352,7 +351,7 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv,
if (multisync.pad)
return -EINVAL;
 
-   ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count,
+   ret = v3d_get_multisync_post_deps(file_priv, se, multisync.out_sync_count,
  multisync.out_syncs);
if (ret)
return ret;
@@ -371,7 +370,8 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv,
 static int
 v3d_get_extensions(struct drm_file *file_priv,
   u64 ext_handles,
-  void *data)
+  struct v3d_submit_ext *se,
+  struct v3d_cpu_job *job)
 {
struct drm_v3d_extension __user *user_ext;
int ret;
@@ -387,15 +387,16 @@ v3d_get_extensions(struct drm_file *file_priv,
 
switch (ext.id) {
case DRM_V3D_EXT_ID_MULTI_SYNC:
-   ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data);
-   if (ret)
-   return ret;
+   ret = v3d_get_multisync_submit_deps(file_priv, user_ext, se);
break;
default:
DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id);
return -EINVAL;
}
 
+   if (ret)
+   return ret;
+
user_ext = u64_to_user_ptr(ext.next);
}
 
@@ -442,7 +443,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
 
if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
-   ret = v3d_get_extensions(file_priv, args->extensions, &se);
+   ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL);
if (ret) {
DRM_DEBUG("Failed to get extensions.\n");
return ret;
@@ -585,7 +586,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
}
 
if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
-   ret = v3d_get_extensions(file_priv, args->extensions, &se);
+   ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL);
if (ret) {
DRM_DEBUG("Failed to get extensions.\n");
return ret;
@@ -688,7 +689,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
}
 
if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
-   ret = v3d_get_extensions(file_priv, args->extensions, &se);
+   ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL);
if (ret) {
DRM_DEBUG("Failed to get extensions.\n");
return ret;
@@ -791,7 +792,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
return ret;
 
if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
-   ret = v3d_get_extensions(file_priv, args->extensions, &se);
+   ret = v3d_get_extensions(file_priv, args->extensions, &se, cpu_job);
if (ret) {
DRM_DEBUG("Failed to get extensions.\n");
goto fail;
-- 
2.41.0



[PATCH 07/17] drm/v3d: Add a CPU job submission

2023-09-04 Thread Maíra Canal
From: Melissa Wen 

Create a new type of job, a CPU job. A CPU job is a type of job that
performs operations that require CPU intervention. The overall idea is
to use user extensions to enable different types of CPU job, allowing
the CPU job to perform different operations according to the type of
user extension. The user extension ID identifies the type of CPU job
to be handled.

Having a CPU job is interesting for synchronization purposes, as a CPU
job has a queue like any other V3D job and can be synchronized by the
multisync extension.

Signed-off-by: Melissa Wen 
Co-developed-by: Maíra Canal 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.c|  4 ++
 drivers/gpu/drm/v3d/v3d_drv.h| 13 +-
 drivers/gpu/drm/v3d/v3d_sched.c  | 40 
 drivers/gpu/drm/v3d/v3d_submit.c | 79 
 include/uapi/drm/v3d_drm.h   | 17 +++
 5 files changed, 152 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c
index ffbbe9d527d3..6f6ef5af2bd0 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.c
+++ b/drivers/gpu/drm/v3d/v3d_drv.c
@@ -90,6 +90,9 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data,
case DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT:
args->value = 1;
return 0;
+   case DRM_V3D_PARAM_SUPPORTS_CPU_QUEUE:
+   args->value = 1;
+   return 0;
default:
DRM_DEBUG("Unknown parameter %d\n", args->param);
return -EINVAL;
@@ -156,6 +159,7 @@ static const struct drm_ioctl_desc v3d_drm_ioctls[] = {
DRM_IOCTL_DEF_DRV(V3D_PERFMON_CREATE, v3d_perfmon_create_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(V3D_PERFMON_DESTROY, v3d_perfmon_destroy_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(V3D_PERFMON_GET_VALUES, v3d_perfmon_get_values_ioctl, DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(V3D_SUBMIT_CPU, v3d_submit_cpu_ioctl, DRM_RENDER_ALLOW | DRM_AUTH),
 };
 
 static const struct drm_driver v3d_drm_driver = {
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 722a627e0a6e..2a3f3beb272c 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -19,7 +19,7 @@ struct reset_control;
 
 #define GMP_GRANULARITY (128 * 1024)
 
-#define V3D_MAX_QUEUES (V3D_CACHE_CLEAN + 1)
+#define V3D_MAX_QUEUES (V3D_CPU + 1)
 
 struct v3d_queue_state {
struct drm_gpu_scheduler sched;
@@ -106,6 +106,7 @@ struct v3d_dev {
struct v3d_render_job *render_job;
struct v3d_tfu_job *tfu_job;
struct v3d_csd_job *csd_job;
+   struct v3d_cpu_job *cpu_job;
 
struct v3d_queue_state queue[V3D_MAX_QUEUES];
 
@@ -285,6 +286,14 @@ struct v3d_csd_job {
struct drm_v3d_submit_csd args;
 };
 
+enum v3d_cpu_job_type {};
+
+struct v3d_cpu_job {
+   struct v3d_job base;
+
+   enum v3d_cpu_job_type job_type;
+};
+
 struct v3d_submit_outsync {
struct drm_syncobj *syncobj;
 };
@@ -387,6 +396,8 @@ int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
+int v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file_priv);
 
 /* v3d_irq.c */
 int v3d_irq_init(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 06238e6d7f5c..88c483da360c 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -54,6 +54,12 @@ to_csd_job(struct drm_sched_job *sched_job)
return container_of(sched_job, struct v3d_csd_job, base.base);
 }
 
+static struct v3d_cpu_job *
+to_cpu_job(struct drm_sched_job *sched_job)
+{
+   return container_of(sched_job, struct v3d_cpu_job, base.base);
+}
+
 static void
 v3d_sched_job_free(struct drm_sched_job *sched_job)
 {
@@ -241,6 +247,26 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
return fence;
 }
 
+static struct dma_fence *
+v3d_cpu_job_run(struct drm_sched_job *sched_job)
+{
+   struct v3d_cpu_job *job = to_cpu_job(sched_job);
+   struct v3d_dev *v3d = job->base.v3d;
+
+   void (*v3d_cpu_job_fn[])(struct v3d_cpu_job *job) = { };
+
+   v3d->cpu_job = job;
+
+   if (job->job_type >= ARRAY_SIZE(v3d_cpu_job_fn)) {
+   DRM_DEBUG_DRIVER("Unknown CPU job: %d\n", job->job_type);
+   return NULL;
+   }
+
+   v3d_cpu_job_fn[job->job_type](job);
+
+   return NULL;
+}
+
 static struct dma_fence *
 v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
 {
@@ -379,6 +405,12 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = {
.free_job = v3d_sched_job_free
 };
 
+static const struct drm_sched_backend_ops v3d_cpu_sched_ops = {
+   .run_job = v3d_cpu_job_run,
+   .timedout_job = v3d_generi

[PATCH 06/17] drm/v3d: Decouple job allocation from job initiation

2023-09-04 Thread Maíra Canal
We want to allow the IOCTLs to allocate the job without initiating it.
This will be useful for the CPU job submission IOCTL, as the CPU job
needs to use information from the user extensions. Currently, the
user extensions are parsed before the job allocation, making it
impossible to fill the CPU job while parsing the user extensions.
Therefore, decouple the job allocation from the job initiation.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_submit.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 2c39e2acf01b..dff4525e6fde 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -135,6 +135,21 @@ void v3d_job_put(struct v3d_job *job)
kref_put(&job->refcount, job->free);
 }
 
+static int
+v3d_job_allocate(void **container, size_t size)
+{
+   if (*container)
+   return 0;
+
+   *container = kcalloc(1, size, GFP_KERNEL);
+   if (!*container) {
+   DRM_ERROR("Cannot allocate memory for V3D job.\n");
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 void **container, size_t size, void (*free)(struct kref *ref),
@@ -145,11 +160,9 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
int ret, i;
 
-   *container = kcalloc(1, size, GFP_KERNEL);
-   if (!*container) {
-   DRM_ERROR("Cannot allocate memory for v3d job.");
-   return -ENOMEM;
-   }
+   ret = v3d_job_allocate(container, size);
+   if (ret)
+   return ret;
 
job = *container;
job->v3d = v3d;
-- 
2.41.0



[PATCH 05/17] drm/v3d: Don't allow two multisync extensions in the same job

2023-09-04 Thread Maíra Canal
Currently, two multisync extensions can be added to the same job, and
only the last multisync extension takes effect. To avoid this misuse,
don't allow two multisync extensions in the same job.

Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_submit.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index b6be34060d4f..2c39e2acf01b 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -328,6 +328,11 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv,
struct v3d_submit_ext *se = data;
int ret;

+   if (se->in_sync_count || se->out_sync_count) {
+   DRM_DEBUG("Two multisync extensions were added to the same job.");
+   return -EINVAL;
+   }
+
if (copy_from_user(&multisync, ext, sizeof(multisync)))
return -EFAULT;

--
2.41.0



[PATCH 04/17] drm/v3d: Simplify job refcount handling

2023-09-04 Thread Maíra Canal
From: Melissa Wen 

Instead of checking if the job is NULL every time we call the function,
check it inside the function.

Signed-off-by: Melissa Wen 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_submit.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 0886b3ec9aef..b6be34060d4f 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -129,6 +129,9 @@ void v3d_job_cleanup(struct v3d_job *job)
 
 void v3d_job_put(struct v3d_job *job)
 {
+   if (!job)
+   return;
+
kref_put(&job->refcount, job->free);
 }
 
@@ -516,11 +519,9 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 &se,
 last_job->done_fence);
 
-   if (bin)
-   v3d_job_put(&bin->base);
-   v3d_job_put(&render->base);
-   if (clean_job)
-   v3d_job_put(clean_job);
+   v3d_job_put((void *)bin);
+   v3d_job_put((void *)render);
+   v3d_job_put((void *)clean_job);
 
return 0;
 
@@ -620,7 +621,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 &se,
 job->base.done_fence);
 
-   v3d_job_put(&job->base);
+   v3d_job_put((void *)job);
 
return 0;
 
@@ -724,7 +725,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 &se,
 clean_job->done_fence);
 
-   v3d_job_put(&job->base);
+   v3d_job_put((void *)job);
v3d_job_put(clean_job);
 
return 0;
-- 
2.41.0



[PATCH 03/17] drm/v3d: Detach job submissions IOCTLs to a new specific file

2023-09-04 Thread Maíra Canal
From: Melissa Wen 

We will include a new job submission type, the CPU job submission. For
readability and maintainability, separate the job submission IOCTLs and
related operations from v3d_gem.c.

Minor fix in the CSD submission kernel doc:
CSD (texture formatting) -> CSD (compute shader).

Signed-off-by: Melissa Wen 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/Makefile |   3 +-
 drivers/gpu/drm/v3d/v3d_drv.h|  12 +-
 drivers/gpu/drm/v3d/v3d_gem.c| 734 --
 drivers/gpu/drm/v3d/v3d_submit.c | 743 +++
 4 files changed, 752 insertions(+), 740 deletions(-)
 create mode 100644 drivers/gpu/drm/v3d/v3d_submit.c

diff --git a/drivers/gpu/drm/v3d/Makefile b/drivers/gpu/drm/v3d/Makefile
index e8b314137020..f3b15a014ee4 100644
--- a/drivers/gpu/drm/v3d/Makefile
+++ b/drivers/gpu/drm/v3d/Makefile
@@ -11,7 +11,8 @@ v3d-y := \
v3d_mmu.o \
v3d_perfmon.o \
v3d_trace_points.o \
-   v3d_sched.o
+   v3d_sched.o \
+   v3d_submit.o

 v3d-$(CONFIG_DEBUG_FS) += v3d_debugfs.o

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 8257fb64506c..722a627e0a6e 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -374,17 +374,19 @@ struct dma_fence *v3d_fence_create(struct v3d_dev *v3d, enum v3d_queue queue);
 /* v3d_gem.c */
 int v3d_gem_init(struct drm_device *dev);
 void v3d_gem_destroy(struct drm_device *dev);
+void v3d_reset(struct v3d_dev *v3d);
+void v3d_invalidate_caches(struct v3d_dev *v3d);
+void v3d_clean_caches(struct v3d_dev *v3d);
+
+/* v3d_submit.c */
+void v3d_job_cleanup(struct v3d_job *job);
+void v3d_job_put(struct v3d_job *job);
 int v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv);
 int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
-void v3d_job_cleanup(struct v3d_job *job);
-void v3d_job_put(struct v3d_job *job);
-void v3d_reset(struct v3d_dev *v3d);
-void v3d_invalidate_caches(struct v3d_dev *v3d);
-void v3d_clean_caches(struct v3d_dev *v3d);

 /* v3d_irq.c */
 int v3d_irq_init(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index e6f4b7ffd958..63378004272b 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -11,8 +11,6 @@
 #include 

 #include 
-#include 
-#include 

 #include "v3d_drv.h"
 #include "v3d_regs.h"
@@ -241,738 +239,6 @@ v3d_invalidate_caches(struct v3d_dev *v3d)
v3d_invalidate_slices(v3d, 0);
 }

-/* Takes the reservation lock on all the BOs being referenced, so that
- * at queue submit time we can update the reservations.
- *
- * We don't lock the RCL the tile alloc/state BOs, or overflow memory
- * (all of which are on exec->unref_list).  They're entirely private
- * to v3d, so we don't attach dma-buf fences to them.
- */
-static int
-v3d_lock_bo_reservations(struct v3d_job *job,
-struct ww_acquire_ctx *acquire_ctx)
-{
-   int i, ret;
-
-   ret = drm_gem_lock_reservations(job->bo, job->bo_count, acquire_ctx);
-   if (ret)
-   return ret;
-
-   for (i = 0; i < job->bo_count; i++) {
-   ret = dma_resv_reserve_fences(job->bo[i]->resv, 1);
-   if (ret)
-   goto fail;
-
-   ret = drm_sched_job_add_implicit_dependencies(&job->base,
- job->bo[i], true);
-   if (ret)
-   goto fail;
-   }
-
-   return 0;
-
-fail:
-   drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
-   return ret;
-}
-
-/**
- * v3d_lookup_bos() - Sets up job->bo[] with the GEM objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @job: V3D job being set up
- * @bo_handles: GEM handles
- * @bo_count: Number of GEM handles passed in
- *
- * The command validator needs to reference BOs by their index within
- * the submitted job's BO list.  This does the validation of the job's
- * BO list and reference counting for the lifetime of the job.
- *
- * Note that this function doesn't need to unreference the BOs on
- * failure, because that will happen at v3d_exec_cleanup() time.
- */
-static int
-v3d_lookup_bos(struct drm_device *dev,
-  struct drm_file *file_priv,
-  struct v3d_job *job,
-  u64 bo_handles,
-  u32 bo_count)
-{
-   job->bo_count = bo_count;
-
-   if (!job->bo_count) {
-   /* See comment on bo_index for why we have to check
-* this.
-*/
-   DRM_DEBUG("Rendering requires BOs\n");
-   return -EINVAL;
-   }
-
-   return drm_gem_objects_lookup(file_

[PATCH 00/17] drm/v3d: Introduce CPU jobs

2023-09-04 Thread Maíra Canal
This patchset implements the basic infrastructure for a new type of
V3D job: a CPU job. A CPU job is a job that requires CPU intervention.
It is desirable to perform these operations in kernel space, as we
can attach multiple in/out syncobjs to the job.

Why do we want a CPU job in the kernel?
=======================================

There are some Vulkan commands that cannot be performed by the GPU, so
we implement those as CPU jobs on Mesa. But to synchronize a CPU job
in the user space, we need to hold part of the command submission flow
in order to correctly synchronize their execution.

By moving the CPU job to the kernel, we can make use of the DRM
scheduler queues and all the advantages they bring. This way,
instead of stalling the submission thread, we can use syncobjs to
synchronize the job, providing more effective management.

About the implementation
========================

After deciding that we would like to have a CPU job implementation
in the kernel, we considered two possible implementations for this
job: creating an IOCTL for each type of CPU job, or using a user
extension to provide polymorphic behavior through a single CPU job
IOCTL. We decided on the latter.

We have different types of CPU jobs (indirect CSD jobs, timestamp
query jobs, copy query results jobs...), and they share a common
infrastructure but perform different operations. Therefore, by using
a single IOCTL that is extended by a user extension, we can reuse the
common infrastructure - avoiding code repetition - and still use the
user extension ID to identify the type of job and, depending on that
type, perform the corresponding operation.

About the patchset
==================

This patchset introduces the basic infrastructure of a CPU job with a
new V3D queue (V3D_CPU) and new tracepoints. Moreover, it introduces six
types of CPU jobs: an indirect CSD job, a timestamp query job, a
reset timestamp queries job, a copy timestamp query results job, a reset
performance queries job, and a copy performance query results job.

An indirect CSD job is a job that, when executed in the queue, will
map the indirect buffer, read the dispatch parameters, and submit a
regular dispatch. So, the CSD job depends on the CPU job execution. We
attach the wait dependencies to the CPU job and once they are satisfied,
we read the dispatch parameters, rewrite the uniforms (if needed) and
enable the CSD job execution, which depends on the completion of the
CPU job.

A timestamp query job is a job that calculates the value of the
timestamp query and updates the availability of the query. In order to
implement this job, we had to change the Mesa implementation of the
timestamp. Now, the timestamp query value is tracked in a BO, instead
of using a memory address. Moreover, the timestamp query availability is
tracked with a syncobj, which is signaled when the query is available.

A reset timestamp queries job is a job that resets the timestamp queries by
zeroing the timestamp BO at the right positions. The right position in
the timestamp BO is found from the offset of the first query.

A reset performance queries job is a job that zeroes the values of the
performance monitors associated with that query. Moreover, it resets the
availability syncobj related to that query.

A copy query results job is a job that copies the results of a query to a
BO at a given offset with a given stride.

The patchset is divided as such:
 * #1 - #4: refactoring operations to prepare for the introduction of the
CPU job
 * #5: addressing a vulnerability in the multisync extension
 * #6: decouple job allocation from job initiation
 * #7 - #9: introduction of the CPU job
 * #10 - #11: refactoring operations to prepare for the introduction of the
  indirect CSD job
 * #12: introduction of the indirect CSD job
 * #13: introduction of the timestamp query job
 * #14: introduction of the reset timestamp queries job
 * #15: introduction of the copy timestamp query results job
 * #16: introduction of the reset performance queries job
 * #17: introduction of the copy performance query results job

This patchset has its Mesa counterpart, which is available on [1].

Both the kernel and Mesa implementations were tested with

 * `dEQP-VK.compute.pipeline.indirect_dispatch.*`,
 * `dEQP-VK.pipeline.monolithic.timestamp.*`,
 * `dEQP-VK.synchronization.*`,
 * `dEQP-VK.query_pool.*`
 * and `dEQP-VK.multiview.*`.

[1] https://gitlab.freedesktop.org/mairacanal/mesa/-/tree/v3dv/v4/cpu-job

Best Regards,
- Maíra

Maíra Canal (11):
  drm/v3d: Don't allow two multisync extensions in the same job
  drm/v3d: Decouple job allocation from job initiation
  drm/v3d: Use v3d_get_extensions() to parse CPU job data
  drm/v3d: Create tracepoints to track the CPU job
  drm/v3d: Enable BO mapping
  drm/v3d: Create a CPU job extension for an indirect CSD job
  drm/v3d: Create a CPU job extension for the timestamp query job
  drm/v3d: Create a CPU job extension for the reset timesta

[PATCH 02/17] drm/v3d: Move wait BO ioctl to the v3d_bo file

2023-09-04 Thread Maíra Canal
From: Melissa Wen 

IOCTLs related to BO operations reside in the file v3d_bo.c. The wait BO
ioctl is the only BO-related IOCTL that is placed in a different file.
So, move it to the v3d_bo.c file.

Signed-off-by: Melissa Wen 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_bo.c  | 33 +
 drivers/gpu/drm/v3d/v3d_drv.h |  4 ++--
 drivers/gpu/drm/v3d/v3d_gem.c | 33 -
 3 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c
index 8b3229a37c6d..357a0da7e16a 100644
--- a/drivers/gpu/drm/v3d/v3d_bo.c
+++ b/drivers/gpu/drm/v3d/v3d_bo.c
@@ -233,3 +233,36 @@ int v3d_get_bo_offset_ioctl(struct drm_device *dev, void *data,
drm_gem_object_put(gem_obj);
return 0;
 }
+
+int
+v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+   int ret;
+   struct drm_v3d_wait_bo *args = data;
+   ktime_t start = ktime_get();
+   u64 delta_ns;
+   unsigned long timeout_jiffies =
+   nsecs_to_jiffies_timeout(args->timeout_ns);
+
+   if (args->pad != 0)
+   return -EINVAL;
+
+   ret = drm_gem_dma_resv_wait(file_priv, args->handle,
+   true, timeout_jiffies);
+
+   /* Decrement the user's timeout, in case we got interrupted
+* such that the ioctl will be restarted.
+*/
+   delta_ns = ktime_to_ns(ktime_sub(ktime_get(), start));
+   if (delta_ns < args->timeout_ns)
+   args->timeout_ns -= delta_ns;
+   else
+   args->timeout_ns = 0;
+
+   /* Asked to wait beyond the jiffie/scheduler precision? */
+   if (ret == -ETIME && args->timeout_ns)
+   ret = -EAGAIN;
+
+   return ret;
+}
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index e8e45a9c705f..8257fb64506c 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -358,6 +358,8 @@ int v3d_mmap_bo_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv);
 int v3d_get_bo_offset_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv);
+int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_priv);
 struct drm_gem_object *v3d_prime_import_sg_table(struct drm_device *dev,
 struct dma_buf_attachment *attach,
 struct sg_table *sgt);
@@ -378,8 +380,6 @@ int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
-int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv);
 void v3d_job_cleanup(struct v3d_job *job);
 void v3d_job_put(struct v3d_job *job);
 void v3d_reset(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 2e94ce788c71..e6f4b7ffd958 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -363,39 +363,6 @@ void v3d_job_put(struct v3d_job *job)
kref_put(&job->refcount, job->free);
 }
 
-int
-v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv)
-{
-   int ret;
-   struct drm_v3d_wait_bo *args = data;
-   ktime_t start = ktime_get();
-   u64 delta_ns;
-   unsigned long timeout_jiffies =
-   nsecs_to_jiffies_timeout(args->timeout_ns);
-
-   if (args->pad != 0)
-   return -EINVAL;
-
-   ret = drm_gem_dma_resv_wait(file_priv, args->handle,
-   true, timeout_jiffies);
-
-   /* Decrement the user's timeout, in case we got interrupted
-* such that the ioctl will be restarted.
-*/
-   delta_ns = ktime_to_ns(ktime_sub(ktime_get(), start));
-   if (delta_ns < args->timeout_ns)
-   args->timeout_ns -= delta_ns;
-   else
-   args->timeout_ns = 0;
-
-   /* Asked to wait beyond the jiffie/scheduler precision? */
-   if (ret == -ETIME && args->timeout_ns)
-   ret = -EAGAIN;
-
-   return ret;
-}
-
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 void **container, size_t size, void (*free)(struct kref *ref),
-- 
2.41.0



[PATCH 01/17] drm/v3d: Remove unused function header

2023-09-04 Thread Maíra Canal
From: Melissa Wen 

The v3d_mmu_get_offset header was added, but the function was never
defined. Just remove it.

Signed-off-by: Melissa Wen 
Signed-off-by: Maíra Canal 
---
 drivers/gpu/drm/v3d/v3d_drv.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 7f664a4b2a75..e8e45a9c705f 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -393,8 +393,6 @@ void v3d_irq_disable(struct v3d_dev *v3d);
 void v3d_irq_reset(struct v3d_dev *v3d);
 
 /* v3d_mmu.c */
-int v3d_mmu_get_offset(struct drm_file *file_priv, struct v3d_bo *bo,
-  u32 *offset);
 int v3d_mmu_set_page_table(struct v3d_dev *v3d);
 void v3d_mmu_insert_ptes(struct v3d_bo *bo);
 void v3d_mmu_remove_ptes(struct v3d_bo *bo);
-- 
2.41.0



Re: [PATCH 07/10] drm/tests: Add test for drm_framebuffer_init()

2023-09-04 Thread Carlos

Hi Maíra,

On 8/26/23 11:16, Maíra Canal wrote:

Hi Carlos,

On 8/25/23 13:11, Carlos Eduardo Gallo Filho wrote:

Add a single KUnit test case for the drm_framebuffer_init function.

Signed-off-by: Carlos Eduardo Gallo Filho 
---
  drivers/gpu/drm/tests/drm_framebuffer_test.c | 52 
  1 file changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/tests/drm_framebuffer_test.c b/drivers/gpu/drm/tests/drm_framebuffer_test.c
index 3d14d35b4c4d..50d88bf3fa65 100644
--- a/drivers/gpu/drm/tests/drm_framebuffer_test.c
+++ b/drivers/gpu/drm/tests/drm_framebuffer_test.c
@@ -557,8 +557,60 @@ static void drm_test_framebuffer_lookup(struct kunit *test)

  KUNIT_EXPECT_NULL(test, fb2);
  }
  +static void drm_test_framebuffer_init(struct kunit *test)
+{
+    struct drm_mock *mock = test->priv;
+    struct drm_device *dev = &mock->dev;
+    struct drm_device wrong_drm = { };
+    struct drm_format_info format = { };
+    struct drm_framebuffer fb1 = { .dev = dev, .format = &format };
+    struct drm_framebuffer *fb2;
+    struct drm_framebuffer_funcs funcs = { };
+    int ret;
+
+    /* Fails if fb->dev doesn't point to the drm_device passed on first arg */

+    fb1.dev = &wrong_drm;
+    ret = drm_framebuffer_init(dev, &fb1, &funcs);
+    KUNIT_EXPECT_EQ(test, ret, -EINVAL);
+    fb1.dev = dev;
+
+    /* Fails if fb.format isn't set */
+    fb1.format = NULL;
+    ret = drm_framebuffer_init(dev, &fb1, &funcs);
+    KUNIT_EXPECT_EQ(test, ret, -EINVAL);
+    fb1.format = &format;
+
+    ret = drm_framebuffer_init(dev, &fb1, &funcs);
+    KUNIT_EXPECT_EQ(test, ret, 0);
+
+    /*
+ * Check if fb->funcs is actually set to the drm_framebuffer_funcs
+ * passed to it
+ */
+    KUNIT_EXPECT_PTR_EQ(test, fb1.funcs, &funcs);
+
+    /* The fb->comm must be set to the current running process */
+    KUNIT_EXPECT_STREQ(test, fb1.comm, current->comm);
+
+    /* The fb->base must be successfully initialized */
+    KUNIT_EXPECT_EQ(test, fb1.base.id, 1);
+    KUNIT_EXPECT_EQ(test, fb1.base.type, DRM_MODE_OBJECT_FB);
+    KUNIT_EXPECT_EQ(test, kref_read(&fb1.base.refcount), 1);
+    KUNIT_EXPECT_PTR_EQ(test, fb1.base.free_cb, &drm_framebuffer_free);
+
+    /* Checks if the fb is really published and findable */
+    fb2 = drm_framebuffer_lookup(dev, NULL, fb1.base.id);
+    KUNIT_EXPECT_PTR_EQ(test, fb2, &fb1);
+
+    /* There must be just that one fb initialized */
+    KUNIT_EXPECT_EQ(test, dev->mode_config.num_fb, 1);
+    KUNIT_EXPECT_PTR_EQ(test, dev->mode_config.fb_list.prev, &fb1.head);
+    KUNIT_EXPECT_PTR_EQ(test, dev->mode_config.fb_list.next, &fb1.head);


Shouldn't we clean the framebuffer object?

What did you mean by "clean"? At first I supposed that it would be about
freeing some dynamically allocated framebuffer, but it's statically
allocated, so I believe that isn't what you mean. Is there some
side effect I'm not taking into account?

Thanks,
Carlos


Best Regards,
- Maíra


+}
+
  static struct kunit_case drm_framebuffer_tests[] = {
  KUNIT_CASE(drm_test_framebuffer_cleanup),
+    KUNIT_CASE(drm_test_framebuffer_init),
  KUNIT_CASE(drm_test_framebuffer_lookup),
KUNIT_CASE(drm_test_framebuffer_modifiers_not_supported),
  KUNIT_CASE_PARAM(drm_test_framebuffer_check_src_coords, check_src_coords_gen_params),


Re: [PATCH 05/10] drm/tests: Add test for drm_framebuffer_cleanup()

2023-09-04 Thread Carlos

Hi Maíra,

On 8/26/23 11:06, Maíra Canal wrote:

Hi Carlos,

On 8/25/23 13:07, Carlos Eduardo Gallo Filho wrote:

Add a single KUnit test case for the drm_framebuffer_cleanup function.

Signed-off-by: Carlos Eduardo Gallo Filho 
---
  drivers/gpu/drm/tests/drm_framebuffer_test.c | 49 
  1 file changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/tests/drm_framebuffer_test.c b/drivers/gpu/drm/tests/drm_framebuffer_test.c
index 0e0e8216bbbc..16d9cf4bed88 100644
--- a/drivers/gpu/drm/tests/drm_framebuffer_test.c
+++ b/drivers/gpu/drm/tests/drm_framebuffer_test.c
@@ -370,6 +370,9 @@ static int drm_framebuffer_test_init(struct kunit *test)

  KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mock);
  dev = &mock->dev;
  +    mutex_init(&dev->mode_config.fb_lock);


What about drmm_mutex_init()?

I took a look into it, and as far as I understand it would be useful if
the drm_device was allocated with drmm_kalloc(), right? As we
are using kunit_kalloc here and the drm_device is automatically
deallocated when the test finishes, what would be the benefit of
using drmm_mutex_init?

It isn't that I don't want to use it, I just didn't understand how
exactly it works and how I could use it in that code. Should I
replace the drm_device allocation to use drmm?

Thanks,
Carlos


Best Regards,
- Maíra


+ INIT_LIST_HEAD(&dev->mode_config.fb_list);
+    dev->mode_config.num_fb = 0;
  dev->mode_config.min_width = MIN_WIDTH;
  dev->mode_config.max_width = MAX_WIDTH;
  dev->mode_config.min_height = MIN_HEIGHT;
@@ -380,6 +383,14 @@ static int drm_framebuffer_test_init(struct kunit *test)

  return 0;
  }
  +static void drm_framebuffer_test_exit(struct kunit *test)
+{
+    struct drm_mock *mock = test->priv;
+    struct drm_device *dev = &mock->dev;
+
+    mutex_destroy(&dev->mode_config.fb_lock);
+}
+
  static void drm_test_framebuffer_create(struct kunit *test)
  {
  const struct drm_framebuffer_test *params = test->param_value;
@@ -483,7 +494,44 @@ static void check_src_coords_test_to_desc(const struct check_src_coords_case *t,

  KUNIT_ARRAY_PARAM(check_src_coords, check_src_coords_cases,
    check_src_coords_test_to_desc);
  +static void drm_test_framebuffer_cleanup(struct kunit *test)
+{
+    struct drm_mock *mock = test->priv;
+    struct drm_device *dev = &mock->dev;
+    struct list_head *fb_list = &dev->mode_config.fb_list;
+    struct drm_framebuffer fb1 = { .dev = dev };
+    struct drm_framebuffer fb2 = { .dev = dev };
+
+    /* This must result on [fb_list] -> fb1 -> fb2 */
+    list_add_tail(&fb1.head, fb_list);
+    list_add_tail(&fb2.head, fb_list);
+    dev->mode_config.num_fb = 2;
+
+    KUNIT_ASSERT_PTR_EQ(test, fb_list->prev, &fb2.head);
+    KUNIT_ASSERT_PTR_EQ(test, fb_list->next, &fb1.head);
+    KUNIT_ASSERT_PTR_EQ(test, fb1.head.prev, fb_list);
+    KUNIT_ASSERT_PTR_EQ(test, fb1.head.next, &fb2.head);
+    KUNIT_ASSERT_PTR_EQ(test, fb2.head.prev, &fb1.head);
+    KUNIT_ASSERT_PTR_EQ(test, fb2.head.next, fb_list);
+
+    drm_framebuffer_cleanup(&fb1);
+
+    /* Now [fb_list] -> fb2 */
+    KUNIT_ASSERT_PTR_EQ(test, fb_list->prev, &fb2.head);
+    KUNIT_ASSERT_PTR_EQ(test, fb_list->next, &fb2.head);
+    KUNIT_ASSERT_PTR_EQ(test, fb2.head.prev, fb_list);
+    KUNIT_ASSERT_PTR_EQ(test, fb2.head.next, fb_list);
+    KUNIT_ASSERT_EQ(test, dev->mode_config.num_fb, 1);
+
+    drm_framebuffer_cleanup(&fb2);
+
+    /* Now fb_list is empty */
+    KUNIT_ASSERT_TRUE(test, list_empty(fb_list));
+    KUNIT_ASSERT_EQ(test, dev->mode_config.num_fb, 0);
+}
+
  static struct kunit_case drm_framebuffer_tests[] = {
+    KUNIT_CASE(drm_test_framebuffer_cleanup),
KUNIT_CASE(drm_test_framebuffer_modifiers_not_supported),
  KUNIT_CASE_PARAM(drm_test_framebuffer_check_src_coords, check_src_coords_gen_params),
  KUNIT_CASE_PARAM(drm_test_framebuffer_create, drm_framebuffer_create_gen_params),

@@ -493,6 +541,7 @@ static struct kunit_case drm_framebuffer_tests[] = {
  static struct kunit_suite drm_framebuffer_test_suite = {
  .name = "drm_framebuffer",
  .init = drm_framebuffer_test_init,
+    .exit = drm_framebuffer_test_exit,
  .test_cases = drm_framebuffer_tests,
  };


Re: [PATCH v2 2/7] drm: ci: Force db410c to host mode

2023-09-04 Thread Dmitry Baryshkov
On Mon, 4 Sept 2023 at 19:16, Vignesh Raman  wrote:
>
> Force db410c to host mode to fix a network issue which results in a failure
> to mount the root fs via NFS.
> See 
> https://gitlab.freedesktop.org/gfx-ci/linux/-/commit/cb72a629b8c15c80a54dda510743cefd1c4b65b8
>
> Use the fdtoverlay command to merge the base device tree with an overlay
> which contains the fix for the USB controllers to work in host mode.
>
> Signed-off-by: Vignesh Raman 
> ---
>
> v2:
>   - Use fdtoverlay command to merge overlay dtbo with the base dtb instead of 
> modifying the kernel sources
>
> ---
>  drivers/gpu/drm/ci/build.sh |  5 +
>  .../gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts  | 13 +
>  2 files changed, 18 insertions(+)
>  create mode 100644 drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
>
> diff --git a/drivers/gpu/drm/ci/build.sh b/drivers/gpu/drm/ci/build.sh
> index 7b014287a041..92ffd98cd09e 100644
> --- a/drivers/gpu/drm/ci/build.sh
> +++ b/drivers/gpu/drm/ci/build.sh
> @@ -92,6 +92,11 @@ done
>
>  if [[ -n ${DEVICE_TREES} ]]; then
>  make dtbs
> +if [[ -e arch/arm64/boot/dts/qcom/apq8016-sbc.dtb ]]; then
> +dtc -@ -I dts -O dtb -o 
> drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dtbo 
> drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
> +fdtoverlay -i arch/arm64/boot/dts/qcom/apq8016-sbc.dtb -o 
> arch/arm64/boot/dts/qcom/apq8016-sbc-overlay.dtb 
> drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dtbo
> +mv arch/arm64/boot/dts/qcom/apq8016-sbc-overlay.dtb 
> arch/arm64/boot/dts/qcom/apq8016-sbc.dtb
> +fi
>  cp ${DEVICE_TREES} /lava-files/.
>  fi
>
> diff --git a/drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts b/drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
> new file mode 100644
> index ..57b7604f1c23
> --- /dev/null
> +++ b/drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
> @@ -0,0 +1,13 @@
> +/dts-v1/;
> +/plugin/;
> +
> +/ {
> +fragment@0 {
> +target-path = "/soc@0";
> +__overlay__ {
> +usb@78d9000 {
> +dr_mode = "host";
> +};
> +};
> +};
> +};
> --
> 2.40.1

Can we use normal dtso syntax here instead of defining fragments manually?

-- 
With best wishes
Dmitry


Re: [PATCH 03/10] drm/tests: Add test case for drm_internal_framebuffer_create()

2023-09-04 Thread Carlos

Hi Maíra,

On 8/26/23 10:58, Maíra Canal wrote:

Hi Carlos,

On 8/25/23 13:07, Carlos Eduardo Gallo Filho wrote:

Introduce a test to cover the creation of framebuffer with
modifier on a device that doesn't support it.

Signed-off-by: Carlos Eduardo Gallo Filho 
---
  drivers/gpu/drm/tests/drm_framebuffer_test.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/tests/drm_framebuffer_test.c b/drivers/gpu/drm/tests/drm_framebuffer_test.c
index aeaf2331f9cc..b20871e88995 100644
--- a/drivers/gpu/drm/tests/drm_framebuffer_test.c
+++ b/drivers/gpu/drm/tests/drm_framebuffer_test.c
@@ -396,7 +396,35 @@ static void drm_framebuffer_test_to_desc(const struct drm_framebuffer_test *t, c
  KUNIT_ARRAY_PARAM(drm_framebuffer_create, drm_framebuffer_create_cases,
    drm_framebuffer_test_to_desc);
  +/*
+ * This test is very similar to drm_test_framebuffer_create, except that it
+ * sets the mock->mode_config.fb_modifiers_not_supported member to 1, covering
+ * the case of trying to create a framebuffer with modifiers without the
+ * device really supporting it.
+ * device really supporting it.
+ */
+static void drm_test_framebuffer_modifiers_not_supported(struct kunit *test)

+{
+    struct drm_mock *mock = test->priv;
+    struct drm_device *dev = &mock->dev;
+    int buffer_created = 0;
+
+    /* A valid cmd with modifier */
+    struct drm_mode_fb_cmd2 cmd = {
+    .width = MAX_WIDTH, .height = MAX_HEIGHT,
+    .pixel_format = DRM_FORMAT_ABGR, .handles = { 1, 0, 0 },
+    .offsets = { UINT_MAX / 2, 0, 0 }, .pitches = { 4 * MAX_WIDTH, 0, 0 },
+    .flags = DRM_MODE_FB_MODIFIERS,
+    };
+
+    mock->private = &buffer_created;
+    dev->mode_config.fb_modifiers_not_supported = 1;
+
+    drm_internal_framebuffer_create(dev, &cmd, NULL);
+    KUNIT_EXPECT_EQ(test, 0, buffer_created);
+}
+
  static struct kunit_case drm_framebuffer_tests[] = {
+    KUNIT_CASE(drm_test_framebuffer_modifiers_not_supported),


Could we preserve alphabetical order?


I've seen a lot of other test files ordered like this, with every KUNIT_CASE()
coming before KUNIT_CASE_PARAM(), and each set ordered among themselves.
Did you notice that, or are you suggesting ordering it anyway? Or maybe
you're referring to another unordered thing that I didn't notice?

Thanks,
Carlos


Best Regards,
- Maíra

KUNIT_CASE_PARAM(drm_test_framebuffer_create, drm_framebuffer_create_gen_params),

  { }
  };


Re: [PATCH v2 02/15] drm/panthor: Add uAPI

2023-09-04 Thread Robin Murphy

On 2023-09-04 17:16, Boris Brezillon wrote:

On Mon, 4 Sep 2023 16:22:19 +0100
Steven Price  wrote:


On 04/09/2023 10:26, Boris Brezillon wrote:

On Mon, 4 Sep 2023 08:42:08 +0100
Steven Price  wrote:
   

On 01/09/2023 17:10, Boris Brezillon wrote:

On Wed,  9 Aug 2023 18:53:15 +0200
Boris Brezillon  wrote:
 

+/**
+ * DOC: MMIO regions exposed to userspace.
+ *
+ * .. c:macro:: DRM_PANTHOR_USER_MMIO_OFFSET
+ *
+ * File offset for all MMIO regions being exposed to userspace. Don't use
+ * this value directly, use DRM_PANTHOR_USER__OFFSET values instead.
+ *
+ * .. c:macro:: DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET
+ *
+ * File offset for the LATEST_FLUSH_ID register. The Userspace driver controls
+ * GPU cache flushing through CS instructions, but the flush reduction
+ * mechanism requires a flush_id. This flush_id could be queried with an
+ * ioctl, but Arm provides a well-isolated register page containing only this
+ * read-only register, so let's expose this page through a static mmap offset
+ * and allow direct mapping of this MMIO region so we can avoid the
+ * user <-> kernel round-trip.
+ */
+#define DRM_PANTHOR_USER_MMIO_OFFSET   (0x1ull << 56)


I'm playing with a 32-bit kernel/userspace, and this is problematic,
because vm_pgoff is limited to 32-bit there, meaning we can only map up
to (1ull << (PAGE_SHIFT + 32)) - 1. Should we add a DEV_QUERY to let
userspace set the mmio range?


Hmm, I was rather hoping we could ignore 32 bit these days ;) But while
I can't see why anyone would be running a 32 bit kernel, I guess 32 bit
user space is likely to still be needed.


Uh, I just hit a new problem with 32-bit kernels: the io-pgtable
interface (io_pgtable_ops) passes device VAs as unsigned longs, meaning
the GPU VA space is limited to 4G on a 32-bit build :-(. Robin, any
chance you could advise me on what to do here?

1. assume this limitation is here for a good reason, and limit the GPU
VA space to 32-bits on 32-bit kernels

or

2. update the interface to make iova an u64


I'm not sure I can answer the question from a technical perspective,
hopefully Robin will be able to.


Had a quick chat with Robin, and he's recommending going for #1 too.



But why do we care about 32-bit kernels on a platform which is new
enough to have a CSF-GPU (and by extension a recent 64-bit CPU)?


Apparently the memory you save by switching to a 32-bit kernel matters
to some people. To clarify, the CPU is aarch64, but they want to use it
in 32-bit mode.



Given the other limitations present in a 32-bit kernel I'd be tempted to
say '1' just for simplicity. Especially since apparently we've lived
with this for panfrost which presumably has the same limitation (even
though all Bifrost/Midgard GPUs have at least 33 bits of VA space).


Well, Panfrost is simpler in that you don't have this kernel VA range,
and, IIRC, we are using the old format that naturally limits the GPU VA
space to 4G.


FWIW the legacy pagetable format itself should be fine going up to 
however many bits the GPU supports, however there were various ISA 
limitations around crossing 4GB boundaries, and the easiest way to avoid 
having to think about those was to just not use more than 4GB of VA at 
all (minus chunks at the ends for similar weird ISA reasons).


Cheers,
Robin.


Re: [PATCH v2 02/15] drm/panthor: Add uAPI

2023-09-04 Thread Boris Brezillon
On Mon, 4 Sep 2023 16:22:19 +0100
Steven Price  wrote:

> On 04/09/2023 10:26, Boris Brezillon wrote:
> > On Mon, 4 Sep 2023 08:42:08 +0100
> > Steven Price  wrote:
> >   
> >> On 01/09/2023 17:10, Boris Brezillon wrote:  
> >>> On Wed,  9 Aug 2023 18:53:15 +0200
> >>> Boris Brezillon  wrote:
> >>> 
>  +/**
>  + * DOC: MMIO regions exposed to userspace.
>  + *
>  + * .. c:macro:: DRM_PANTHOR_USER_MMIO_OFFSET
>  + *
>  + * File offset for all MMIO regions being exposed to userspace. Don't 
>  use
>  + * this value directly, use DRM_PANTHOR_USER__OFFSET values 
>  instead.
>  + *
>  + * .. c:macro:: DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET
>  + *
>  + * File offset for the LATEST_FLUSH_ID register. The Userspace driver 
>  controls
>  + * GPU cache flushing through CS instructions, but the flush reduction
>  + * mechanism requires a flush_id. This flush_id could be queried with an
>  + * ioctl, but Arm provides a well-isolated register page containing 
>  only this
>  + * read-only register, so let's expose this page through a static mmap 
>  offset
>  + * and allow direct mapping of this MMIO region so we can avoid the
>  + * user <-> kernel round-trip.
>  + */
>  +#define DRM_PANTHOR_USER_MMIO_OFFSET(0x1ull << 56)
> >>>
> >>> I'm playing with a 32-bit kernel/userspace, and this is problematic,
> >>> because vm_pgoff is limited to 32-bit there, meaning we can only map up
> >>> to (1ull << (PAGE_SHIFT + 32)) - 1. Should we add a DEV_QUERY to let
> >>> userspace set the mmio range?
> >>
> >> Hmm, I was rather hoping we could ignore 32 bit these days ;) But while
> >> I can't see why anyone would be running a 32 bit kernel, I guess 32 bit
> >> user space is likely to still be needed.  
> > 
> > Uh, I just hit a new problem with 32-bit kernels: the io-pgtable
> > interface (io_pgtable_ops) passes device VAs as unsigned longs, meaning
> > the GPU VA space is limited to 4G on a 32-bit build :-(. Robin, any
> > chance you could advise me on what to do here?
> > 
> > 1. assume this limitation is here for a good reason, and limit the GPU
> > VA space to 32-bits on 32-bit kernels
> > 
> > or
> > 
> > 2. update the interface to make iova an u64  
> 
> I'm not sure I can answer the question from a technical perspective,
> hopefully Robin will be able to.

Had a quick chat with Robin, and he's recommending going for #1 too.

> 
> But why do we care about 32-bit kernels on a platform which is new
> enough to have a CSF-GPU (and by extension a recent 64-bit CPU)?

Apparently the memory you save by switching to a 32-bit kernel matters
to some people. To clarify, the CPU is aarch64, but they want to use it
in 32-bit mode.

> 
> Given the other limitations present in a 32-bit kernel I'd be tempted to
> say '1' just for simplicity. Especially since apparently we've lived
> with this for panfrost which presumably has the same limitation (even
> though all Bifrost/Midgard GPUs have at least 33 bits of VA space).

Well, Panfrost is simpler in that you don't have this kernel VA range,
and, IIRC, we are using the old format that naturally limits the GPU VA
space to 4G.


[PATCH v2 7/7] drm: ci: Use scripts/config to enable/disable configs

2023-09-04 Thread Vignesh Raman
Instead of modifying files in git to enable/disable
configs, use scripts/config on the .config file which
will be used for building the kernel.

Signed-off-by: Vignesh Raman 
---

v2:
  - Added a new patch in the series to use scripts/config to enable/disable 
configs
  
---
 drivers/gpu/drm/ci/build.sh | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/ci/build.sh b/drivers/gpu/drm/ci/build.sh
index 92ffd98cd09e..c95f4daac221 100644
--- a/drivers/gpu/drm/ci/build.sh
+++ b/drivers/gpu/drm/ci/build.sh
@@ -70,19 +70,19 @@ if [ -z "$CI_MERGE_REQUEST_PROJECT_PATH" ]; then
 fi
 fi
 
-for opt in $ENABLE_KCONFIGS; do
-  echo CONFIG_$opt=y >> drivers/gpu/drm/ci/${KERNEL_ARCH}.config
-done
-for opt in $DISABLE_KCONFIGS; do
-  echo CONFIG_$opt=n >> drivers/gpu/drm/ci/${KERNEL_ARCH}.config
-done
-
 if [[ -n "${MERGE_FRAGMENT}" ]]; then
    ./scripts/kconfig/merge_config.sh ${DEFCONFIG} drivers/gpu/drm/ci/${MERGE_FRAGMENT}
 else
 make `basename ${DEFCONFIG}`
 fi
 
+for opt in $ENABLE_KCONFIGS; do
+./scripts/config --enable CONFIG_$opt
+done
+for opt in $DISABLE_KCONFIGS; do
+./scripts/config --disable CONFIG_$opt
+done
+
 make ${KERNEL_IMAGE_NAME}
 
 mkdir -p /lava-files/
-- 
2.40.1



[PATCH v2 5/7] drm: ci: Update xfails

2023-09-04 Thread Vignesh Raman
Update amdgpu-stoney-fails, mediatek-mt8173-flakes,
mediatek-mt8173-fails, rockchip-rk3399-fails, rockchip-rk3399-flakes,
rockchip-rk3288-flakes, i915-cml-fails, i915-cml-flakes,
msm-apq8016-flakes files.

Add tests that fail intermittently to the *-flakes files and tests
that fail consistently to the *-fails files.

Signed-off-by: Helen Koike 
Signed-off-by: Vignesh Raman 
---

v2:
  - No changes
  
---
 .../gpu/drm/ci/xfails/amdgpu-stoney-fails.txt|  1 -
 drivers/gpu/drm/ci/xfails/i915-cml-fails.txt |  1 -
 drivers/gpu/drm/ci/xfails/i915-cml-flakes.txt|  2 ++
 drivers/gpu/drm/ci/xfails/i915-glk-flakes.txt|  1 +
 .../gpu/drm/ci/xfails/mediatek-mt8173-fails.txt  |  2 --
 .../gpu/drm/ci/xfails/mediatek-mt8173-flakes.txt | 16 
 drivers/gpu/drm/ci/xfails/msm-apq8016-flakes.txt |  2 ++
 .../gpu/drm/ci/xfails/rockchip-rk3288-flakes.txt |  1 +
 .../gpu/drm/ci/xfails/rockchip-rk3399-fails.txt  |  4 ++--
 .../gpu/drm/ci/xfails/rockchip-rk3399-flakes.txt |  3 +++
 10 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ci/xfails/amdgpu-stoney-fails.txt 
b/drivers/gpu/drm/ci/xfails/amdgpu-stoney-fails.txt
index bd9392536e7c..58bfded8a3fc 100644
--- a/drivers/gpu/drm/ci/xfails/amdgpu-stoney-fails.txt
+++ b/drivers/gpu/drm/ci/xfails/amdgpu-stoney-fails.txt
@@ -1,7 +1,6 @@
 kms_addfb_basic@bad-pitch-65536,Fail
 kms_addfb_basic@bo-too-small,Fail
 kms_async_flips@invalid-async-flip,Fail
-kms_atomic@plane-immutable-zpos,Fail
 kms_atomic_transition@plane-toggle-modeset-transition,Fail
 kms_bw@linear-tiling-1-displays-2560x1440p,Fail
 kms_bw@linear-tiling-1-displays-3840x2160p,Fail
diff --git a/drivers/gpu/drm/ci/xfails/i915-cml-fails.txt 
b/drivers/gpu/drm/ci/xfails/i915-cml-fails.txt
index 6139b410e767..5f513c638beb 100644
--- a/drivers/gpu/drm/ci/xfails/i915-cml-fails.txt
+++ b/drivers/gpu/drm/ci/xfails/i915-cml-fails.txt
@@ -1,4 +1,3 @@
-kms_color@ctm-0-25,Fail
 kms_flip_scaled_crc@flip-32bpp-linear-to-64bpp-linear-downscaling,Fail
 kms_flip_scaled_crc@flip-32bpp-linear-to-64bpp-linear-upscaling,Fail
 kms_flip_scaled_crc@flip-32bpp-xtile-to-64bpp-xtile-downscaling,Fail
diff --git a/drivers/gpu/drm/ci/xfails/i915-cml-flakes.txt 
b/drivers/gpu/drm/ci/xfails/i915-cml-flakes.txt
index 0514a7b3fdb0..f06f1a5b16f9 100644
--- a/drivers/gpu/drm/ci/xfails/i915-cml-flakes.txt
+++ b/drivers/gpu/drm/ci/xfails/i915-cml-flakes.txt
@@ -7,6 +7,8 @@ kms_bw@linear-tiling-3-displays-3840x2160p
 kms_bw@linear-tiling-4-displays-1920x1080p
 kms_bw@linear-tiling-4-displays-2560x1440p
 kms_bw@linear-tiling-4-displays-3840x2160p
+kms_color@ctm-0-25
+kms_cursor_legacy@torture-move
 kms_draw_crc@draw-method-xrgb-render-xtiled
 kms_flip@flip-vs-suspend
 kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile-downscaling
diff --git a/drivers/gpu/drm/ci/xfails/i915-glk-flakes.txt 
b/drivers/gpu/drm/ci/xfails/i915-glk-flakes.txt
index fc41d13a2d56..3aee1f11ee90 100644
--- a/drivers/gpu/drm/ci/xfails/i915-glk-flakes.txt
+++ b/drivers/gpu/drm/ci/xfails/i915-glk-flakes.txt
@@ -8,6 +8,7 @@ kms_bw@linear-tiling-3-displays-3840x2160p
 kms_bw@linear-tiling-4-displays-1920x1080p
 kms_bw@linear-tiling-4-displays-2560x1440p
 kms_bw@linear-tiling-4-displays-3840x2160p
+kms_cursor_legacy@torture-bo
 kms_flip@blocking-wf_vblank
 kms_flip@wf_vblank-ts-check
 kms_flip@wf_vblank-ts-check-interruptible
diff --git a/drivers/gpu/drm/ci/xfails/mediatek-mt8173-fails.txt 
b/drivers/gpu/drm/ci/xfails/mediatek-mt8173-fails.txt
index 671916067dba..c8e64bbfd480 100644
--- a/drivers/gpu/drm/ci/xfails/mediatek-mt8173-fails.txt
+++ b/drivers/gpu/drm/ci/xfails/mediatek-mt8173-fails.txt
@@ -1,5 +1,4 @@
 kms_3d,Fail
-kms_addfb_basic@addfb25-bad-modifier,Fail
 kms_bw@linear-tiling-1-displays-1920x1080p,Fail
 kms_bw@linear-tiling-1-displays-2560x1440p,Fail
 kms_bw@linear-tiling-1-displays-3840x2160p,Fail
@@ -11,7 +10,6 @@ kms_bw@linear-tiling-3-displays-2560x1440p,Fail
 kms_bw@linear-tiling-3-displays-3840x2160p,Fail
 kms_color@pipe-A-invalid-gamma-lut-sizes,Fail
 kms_color@pipe-B-invalid-gamma-lut-sizes,Fail
-kms_force_connector_basic@force-connector-state,Fail
 kms_force_connector_basic@force-edid,Fail
 kms_force_connector_basic@force-load-detect,Fail
 kms_force_connector_basic@prune-stale-modes,Fail
diff --git a/drivers/gpu/drm/ci/xfails/mediatek-mt8173-flakes.txt 
b/drivers/gpu/drm/ci/xfails/mediatek-mt8173-flakes.txt
index e69de29bb2d1..9ed6722df2c2 100644
--- a/drivers/gpu/drm/ci/xfails/mediatek-mt8173-flakes.txt
+++ b/drivers/gpu/drm/ci/xfails/mediatek-mt8173-flakes.txt
@@ -0,0 +1,16 @@
+core_setmaster_vs_auth
+kms_addfb_basic@addfb25-bad-modifier
+kms_color@invalid-gamma-lut-sizes
+kms_cursor_legacy@cursor-vs-flip-atomic
+kms_cursor_legacy@cursor-vs-flip-legacy
+kms_force_connector_basic@force-connector-state
+kms_hdmi_inject@inject-4k
+kms_plane_scaling@plane-scaler-with-pixel-format-unity-scaling
+kms_plane_scaling@plane-upscale-with-modifiers-20x20
+kms_plane_scaling@plane-upscale-with-pixel-format-20x20
+kms_plane_scaling@pla

[PATCH v2 6/7] drm: ci: Enable new jobs

2023-09-04 Thread Vignesh Raman
Enable the following jobs, as the issues noted in the
TODO comments have been resolved. This will ensure that these jobs
are now included and executed as part of the CI/CD pipeline.

msm:apq8016:
TODO: current issue: it is not finding the NFS root. Fix and remove this rule.

mediatek:mt8173:
TODO: current issue: device is hanging. Fix and remove this rule.

virtio_gpu:none:
TODO: current issue: malloc(): corrupted top size. Fix and remove this rule.

Signed-off-by: Vignesh Raman 
---

v2:
  - Reworded the commit message
  
---
 drivers/gpu/drm/ci/gitlab-ci.yml | 2 +-
 drivers/gpu/drm/ci/test.yml  | 9 -
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/ci/gitlab-ci.yml b/drivers/gpu/drm/ci/gitlab-ci.yml
index 2c4df53f5dfe..d2aac4404914 100644
--- a/drivers/gpu/drm/ci/gitlab-ci.yml
+++ b/drivers/gpu/drm/ci/gitlab-ci.yml
@@ -248,4 +248,4 @@ sanity:
 
 # Jobs that need to pass before spending hardware resources on further testing
 .required-for-hardware-jobs:
-  needs: []
\ No newline at end of file
+  needs: []
diff --git a/drivers/gpu/drm/ci/test.yml b/drivers/gpu/drm/ci/test.yml
index d85add39f425..1771af21e2d9 100644
--- a/drivers/gpu/drm/ci/test.yml
+++ b/drivers/gpu/drm/ci/test.yml
@@ -108,9 +108,6 @@ msm:apq8016:
 RUNNER_TAG: google-freedreno-db410c
   script:
 - ./install/bare-metal/fastboot.sh
-  rules:
-# TODO: current issue: it is not fiding the NFS root. Fix and remove this 
rule.
-- when: never
 
 msm:apq8096:
   extends:
@@ -273,9 +270,6 @@ mediatek:mt8173:
 DEVICE_TYPE: mt8173-elm-hana
 GPU_VERSION: mt8173
 RUNNER_TAG: mesa-ci-x86-64-lava-mt8173-elm-hana
-  rules:
-# TODO: current issue: device is hanging. Fix and remove this rule.
-- when: never
 
 mediatek:mt8183:
   extends:
@@ -333,6 +327,3 @@ virtio_gpu:none:
 - debian/x86_64_test-gl
 - testing:x86_64
 - igt:x86_64
-  rules:
-# TODO: current issue: malloc(): corrupted top size. Fix and remove this 
rule.
-- when: never
\ No newline at end of file
-- 
2.40.1



[PATCH v2 4/7] drm: ci: Enable configs to fix mt8173 boot hang issue

2023-09-04 Thread Vignesh Raman
Enable regulator
Enable MT6397 RTC driver

Signed-off-by: Vignesh Raman 
---

v2:
  - No changes
  
---
 drivers/gpu/drm/ci/arm64.config | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/ci/arm64.config b/drivers/gpu/drm/ci/arm64.config
index 817e18ddfd4f..ea7a6cceff40 100644
--- a/drivers/gpu/drm/ci/arm64.config
+++ b/drivers/gpu/drm/ci/arm64.config
@@ -184,6 +184,8 @@ CONFIG_HW_RANDOM_MTK=y
 CONFIG_MTK_DEVAPC=y
 CONFIG_PWM_MTK_DISP=y
 CONFIG_MTK_CMDQ=y
+CONFIG_REGULATOR_DA9211=y
+CONFIG_RTC_DRV_MT6397=y
 
 # For nouveau.  Note that DRM must be a module so that it's loaded after NFS 
is up to provide the firmware.
 CONFIG_ARCH_TEGRA=y
-- 
2.40.1



[PATCH v2 3/7] drm: ci: virtio: update ci variables

2023-09-04 Thread Vignesh Raman
Update ci variables to fix the below error,
ERROR - Igt error: malloc(): corrupted top size
ERROR - Igt error: Received signal SIGABRT.
ERROR - Igt error: Stack trace:
ERROR - Igt error:  #0 [fatal_sig_handler+0x17b]

Signed-off-by: Vignesh Raman 
---

v2:
  - No changes
  
---
 drivers/gpu/drm/ci/test.yml | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ci/test.yml b/drivers/gpu/drm/ci/test.yml
index 6473cddaa7a9..d85add39f425 100644
--- a/drivers/gpu/drm/ci/test.yml
+++ b/drivers/gpu/drm/ci/test.yml
@@ -316,8 +316,11 @@ virtio_gpu:none:
   stage: virtio-gpu
   variables:
 CROSVM_GALLIUM_DRIVER: llvmpipe
-DRIVER_NAME: virtio_gpu
+DRIVER_NAME: virtio
 GPU_VERSION: none
+CROSVM_MEMORY: 12288
+CROSVM_CPU: $FDO_CI_CONCURRENT
+CROSVM_GPU_ARGS: 
"vulkan=true,gles=false,backend=virglrenderer,egl=true,surfaceless=true"
   extends:
 - .test-gl
   tags:
-- 
2.40.1



[PATCH v2 2/7] drm: ci: Force db410c to host mode

2023-09-04 Thread Vignesh Raman
Force db410c to host mode to fix a network issue which results in a failure
to mount the root fs via NFS.
See 
https://gitlab.freedesktop.org/gfx-ci/linux/-/commit/cb72a629b8c15c80a54dda510743cefd1c4b65b8

Use fdtoverlay command to merge base device tree with an overlay
which contains the fix for USB controllers to work in host mode.

Signed-off-by: Vignesh Raman 
---

v2:
  - Use fdtoverlay command to merge overlay dtbo with the base dtb instead of 
modifying the kernel sources
  
---
 drivers/gpu/drm/ci/build.sh |  5 +
 .../gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts  | 13 +
 2 files changed, 18 insertions(+)
 create mode 100644 drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts

diff --git a/drivers/gpu/drm/ci/build.sh b/drivers/gpu/drm/ci/build.sh
index 7b014287a041..92ffd98cd09e 100644
--- a/drivers/gpu/drm/ci/build.sh
+++ b/drivers/gpu/drm/ci/build.sh
@@ -92,6 +92,11 @@ done
 
 if [[ -n ${DEVICE_TREES} ]]; then
 make dtbs
+if [[ -e arch/arm64/boot/dts/qcom/apq8016-sbc.dtb ]]; then
+dtc -@ -I dts -O dtb -o 
drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dtbo 
drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
+fdtoverlay -i arch/arm64/boot/dts/qcom/apq8016-sbc.dtb -o 
arch/arm64/boot/dts/qcom/apq8016-sbc-overlay.dtb 
drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dtbo
+mv arch/arm64/boot/dts/qcom/apq8016-sbc-overlay.dtb 
arch/arm64/boot/dts/qcom/apq8016-sbc.dtb
+fi
 cp ${DEVICE_TREES} /lava-files/.
 fi
 
diff --git a/drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts 
b/drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
new file mode 100644
index ..57b7604f1c23
--- /dev/null
+++ b/drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts
@@ -0,0 +1,13 @@
+/dts-v1/;
+/plugin/;
+
+/ {
+fragment@0 {
+target-path = "/soc@0";
+__overlay__ {
+usb@78d9000 {
+dr_mode = "host";
+};
+};
+};
+};
-- 
2.40.1



[PATCH v2 1/7] drm: ci: igt_runner: remove todo

2023-09-04 Thread Vignesh Raman
/sys/kernel/debug/dri/*/state exist for every atomic KMS driver.
We do not test non-atomic drivers, so remove the todo.

Signed-off-by: Vignesh Raman 
---

v2:
  - No changes
  
---
 drivers/gpu/drm/ci/igt_runner.sh | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/ci/igt_runner.sh b/drivers/gpu/drm/ci/igt_runner.sh
index 2bb759165063..5bf130ac57c9 100755
--- a/drivers/gpu/drm/ci/igt_runner.sh
+++ b/drivers/gpu/drm/ci/igt_runner.sh
@@ -15,7 +15,6 @@ cat /sys/kernel/debug/device_component/*
 '
 
 # Dump drm state to confirm that kernel was able to find a connected display:
-# TODO this path might not exist for all drivers.. maybe run modetest instead?
 set +e
 cat /sys/kernel/debug/dri/*/state
 set -e
-- 
2.40.1



[PATCH v2 0/7] drm: ci: fixes

2023-09-04 Thread Vignesh Raman
This patch series contains improvements: it enables new CI jobs that add
testing for Mediatek MT8173, Qualcomm APQ8016 and VirtIO GPU, fixes
issues with the existing CI jobs, and updates the expectation files.
This series is intended for the drm branch topic/drm-ci.

v2:
  - Use fdtoverlay command to merge overlay dtbo with the base dtb instead of 
modifying the kernel sources
  - Reworded the commit message for enabling jobs
  - Added a new patch in the series to use scripts/config to enable/disable 
configs

Vignesh Raman (7):
  drm: ci: igt_runner: remove todo
  drm: ci: Force db410c to host mode
  drm: ci: virtio: update ci variables
  drm: ci: Enable configs to fix mt8173 boot hang issue
  drm: ci: Update xfails
  drm: ci: Enable new jobs
  drm: ci: Use scripts/config to enable/disable configs

 drivers/gpu/drm/ci/arm64.config   |  2 ++
 drivers/gpu/drm/ci/build.sh   | 19 ---
 .../ci/dt-overlays/apq8016-sbc-overlay.dts| 13 +
 drivers/gpu/drm/ci/gitlab-ci.yml  |  2 +-
 drivers/gpu/drm/ci/igt_runner.sh  |  1 -
 drivers/gpu/drm/ci/test.yml   | 14 --
 .../gpu/drm/ci/xfails/amdgpu-stoney-fails.txt |  1 -
 drivers/gpu/drm/ci/xfails/i915-cml-fails.txt  |  1 -
 drivers/gpu/drm/ci/xfails/i915-cml-flakes.txt |  2 ++
 drivers/gpu/drm/ci/xfails/i915-glk-flakes.txt |  1 +
 .../drm/ci/xfails/mediatek-mt8173-fails.txt   |  2 --
 .../drm/ci/xfails/mediatek-mt8173-flakes.txt  | 16 
 .../gpu/drm/ci/xfails/msm-apq8016-flakes.txt  |  2 ++
 .../drm/ci/xfails/rockchip-rk3288-flakes.txt  |  1 +
 .../drm/ci/xfails/rockchip-rk3399-fails.txt   |  4 ++--
 .../drm/ci/xfails/rockchip-rk3399-flakes.txt  |  3 +++
 16 files changed, 59 insertions(+), 25 deletions(-)
 create mode 100644 drivers/gpu/drm/ci/dt-overlays/apq8016-sbc-overlay.dts

-- 
2.40.1



Re: [PATCH v2 02/15] drm/panthor: Add uAPI

2023-09-04 Thread Robin Murphy

On 2023-08-09 17:53, Boris Brezillon wrote:
[...]

+/**
+ * struct drm_panthor_vm_create - Arguments passed to 
DRM_PANTHOR_IOCTL_VM_CREATE
+ */
+struct drm_panthor_vm_create {
+   /** @flags: VM flags, MBZ. */
+   __u32 flags;
+
+   /** @id: Returned VM ID. */
+   __u32 id;
+
+   /**
+* @kernel_va_range: Size of the VA space reserved for kernel objects.
+*
+* If kernel_va_range is zero, we pick half of the VA space for kernel 
objects.
+*
+* Kernel VA space is always placed at the top of the supported VA 
range.
+*/
+   __u64 kernel_va_range;


Off the back of the "IOVA as unsigned long" concern, Boris and I 
reasoned through the 64-bit vs. 32-bit vs. compat cases on IRC, and it 
seems like this kernel_va_range argument is a source of much of the pain.


Rather than have userspace specify a quantity which it shouldn't care 
about and depend on assumptions of kernel behaviour to infer the 
quantity which *is* relevant (i.e. how large the usable range of the VM 
will actually be), I think it would be considerably more logical for 
userspace to simply request the size of usable VM it actually wants. 
Then it would be straightforward and consistent to define the default 
value in terms of the minimum of half the GPU VA size or TASK_SIZE (the 
latter being the largest *meaningful* value in all 3 cases), and it's 
still easy enough for the kernel to deduce for itself whether there's a 
reasonable amount of space left between the requested limit and 
ULONG_MAX for it to use. 32-bit kernels should then get at least 1GB to 
play with, for compat the kernel BOs can get well out of the way into 
the >32-bit range, and it's only really 64-bit where userspace is liable 
to see "kernel" VA space impinging on usable process VAs. Even then 
we're not sure that's a significant concern beyond OpenCL SVM.
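A back-of-the-envelope sketch of the default Robin proposes, min(half the GPU VA size, TASK_SIZE) — all numbers below are hypothetical; a 48-bit GPU VA and a 2^48 TASK_SIZE are assumed purely for illustration:

```shell
# Hypothetical inputs: 48 GPU VA bits, and an arm64-style 2^48 user VA span.
gpu_va_bits=48
gpu_va_size=$(( 1 << gpu_va_bits ))
task_size=$(( 1 << 48 ))

# Default usable VM size = min(half the GPU VA size, TASK_SIZE).
half_gpu=$(( gpu_va_size / 2 ))
if [ "$half_gpu" -lt "$task_size" ]; then
    usable=$half_gpu        # usable VA range offered to userspace
else
    usable=$task_size
fi
echo "$usable"              # 2^47 = 140737488355328
```

The kernel would then be free to place its own objects anywhere between `usable` and the top of the supported GPU VA range, as long as enough space is left there.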


Thanks,
Robin.


Re: [PATCH v3 1/1] backlight: hid_bl: Add VESA VCP HID backlight driver

2023-09-04 Thread Thomas Weißschuh
+Cc Hans who is involved with the backlight subsystem

Hi Julius,

today I stumbled upon a mail from Hans [0], which explains that the
backlight subsystem is not actually a good fit (yet?) for external
displays.

It seems a new API is in the works that would better fit, but I'm not
sure about the state of this API. Maybe Hans can clarify.

This also ties back to my review question how userspace can figure out
to which display a backlight devices applies. So far it can not.

[0] 
https://lore.kernel.org/lkml/7f2d88de-60c5-e2ff-9b22-acba35cfd...@redhat.com/

Below the original PATCH for Hans' reference.

On 2023-08-20 11:41:18+0200, Julius Zint wrote:
> The HID spec defines the following Usage IDs (p. 345 ff):
> 
> - Monitor Page (0x80) -> Monitor Control (0x01)
> - VESA Virtual Controls Page (0x82) -> Brightness (0x10)
> 
> Apple made use of them in their Apple Studio Display and most likely on
> other external displays (LG UltraFine 5k, Pro Display XDR).
> 
> The driver will work for any HID device with a report, where the
> application matches the Monitor Control Usage ID and:
> 
> 1. An Input field in this report with the Brightness Usage ID (to get
>the current brightness)
> 2. A Feature field in this report with the Brightness Usage ID (to
>set the current brightness)
> 
> This driver has been developed and tested with the Apple Studio Display.
> Here is a small excerpt from the decoded HID descriptor showing the
> feature field for setting the brightness:
> 
>   Usage Page (Monitor VESA VCP),  ; Monitor VESA VPC (82h, monitor page)
>   Usage (10h, Brightness),
>   Logical Minimum (400),
>   Logical Maximum (6),
>   Unit (Centimeter^-2 * Candela),
>   Unit Exponent (14),
>   Report Size (32),
>   Report Count (1),
>   Feature (Variable, Null State),
> 
> The full HID descriptor dump is available as a comment in the source
> code.
> 
> Signed-off-by: Julius Zint 
> ---
>  drivers/video/backlight/Kconfig  |   8 +
>  drivers/video/backlight/Makefile |   1 +
>  drivers/video/backlight/hid_bl.c | 269 +++
>  3 files changed, 278 insertions(+)
>  create mode 100644 drivers/video/backlight/hid_bl.c
> 
> diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig
> index 51387b1ef012..b964a820956d 100644
> --- a/drivers/video/backlight/Kconfig
> +++ b/drivers/video/backlight/Kconfig
> @@ -472,6 +472,14 @@ config BACKLIGHT_LED
> If you have a LCD backlight adjustable by LED class driver, say Y
> to enable this driver.
>  
> +config BACKLIGHT_HID
> + tristate "VESA VCP HID Backlight Driver"
> + depends on HID
> + help
> +   If you have an external display with VESA compliant HID brightness
> +   controls then say Y to enable this backlight driver. Currently the
> +   only supported device is the Apple Studio Display.
> +
>  endif # BACKLIGHT_CLASS_DEVICE
>  
>  endmenu
> diff --git a/drivers/video/backlight/Makefile 
> b/drivers/video/backlight/Makefile
> index f72e1c3c59e9..835f9b8772c7 100644
> --- a/drivers/video/backlight/Makefile
> +++ b/drivers/video/backlight/Makefile
> @@ -58,3 +58,4 @@ obj-$(CONFIG_BACKLIGHT_WM831X)  += wm831x_bl.o
>  obj-$(CONFIG_BACKLIGHT_ARCXCNN)  += arcxcnn_bl.o
>  obj-$(CONFIG_BACKLIGHT_RAVE_SP)  += rave-sp-backlight.o
>  obj-$(CONFIG_BACKLIGHT_LED)  += led_bl.o
> +obj-$(CONFIG_BACKLIGHT_HID)  += hid_bl.o
> diff --git a/drivers/video/backlight/hid_bl.c 
> b/drivers/video/backlight/hid_bl.c
> new file mode 100644
> index ..b40f8f412ee2
> --- /dev/null
> +++ b/drivers/video/backlight/hid_bl.c
> @@ -0,0 +1,269 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define APPLE_STUDIO_DISPLAY_VENDOR_ID  0x05ac
> +#define APPLE_STUDIO_DISPLAY_PRODUCT_ID 0x1114
> +
> +#define HID_USAGE_MONITOR_CTRL   0x81
> +#define HID_USAGE_VESA_VCP_BRIGHTNESS0x820010
> +
> +/*
> + * Apple Studio Display HID report descriptor
> + *
> + * Usage Page (Monitor),   ; USB monitor (80h, monitor page)
> + * Usage (01h),
> + * Collection (Application),
> + * Report ID (1),
> + *
> + * Usage Page (Monitor VESA VCP),  ; Monitor VESA virtual control panel 
> (82h, monitor page)
> + * Usage (10h, Brightness),
> + * Logical Minimum (400),
> + * Logical Maximum (6),
> + * Unit (Centimeter^-2 * Candela),
> + * Unit Exponent (14),
> + * Report Size (32),
> + * Report Count (1),
> + * Feature (Variable, Null State),
> + *
> + * Usage Page (PID),   ; Physical interface device (0Fh)
> + * Usage (50h),
> + * Logical Minimum (0),
> + * Logical Maximum (2),
> + * Unit (1001h),
> + * Unit Exponent (13),
> + * Report Size (16),
> + * Feature (Variable, Null State),
> + *
> + * Usage Page (Monitor VESA VCP),  ; Monitor VESA virtual control panel 
> (82h, monitor page)
> + * Usage

Re: [RFC PATCH v1 07/12] soc: qcom: pmic_glink_altmode: report that this is a Type-C connector

2023-09-04 Thread Bjorn Andersson
On Mon, Sep 04, 2023 at 12:41:45AM +0300, Dmitry Baryshkov wrote:
> Set the bridge's path property to point out that this connector is
> wrapped into the Type-C port.
> 
> We cannot really identify the exact Type-C port because it is
> registered separately by another driver, which is not mandatory, and the
> corresponding device is not even present on some platforms, like
> sc8180x or sm8350. Thus we use the shortened version of the PATH, which
> includes just the 'typec:' part.

How would a properly resolved path look like?

As with the other patch, I'm okay with this going through the USB tree.

Acked-by: Bjorn Andersson 

Regards,
Bjorn


Re: [RFC PATCH v1 00/12] drm,usb/typec: uABI for USB-C DisplayPort connectors

2023-09-04 Thread Dmitry Baryshkov

On 04/09/2023 18:46, Bjorn Andersson wrote:

On Mon, Sep 04, 2023 at 12:41:38AM +0300, Dmitry Baryshkov wrote:

During the discussion regarding DisplayPort wrapped in the USB-C
connectors (via the USB-C altmode) it was pointed out that currently
there is no good way to let userspace know if the DRM connector in
question is the 'native' DP connector or if it is the USB-C connector.

An attempt to use DRM_MODE_CONNECTOR_USB for such connectors was
declined, as existing DP drivers (i915, AMD) use
DRM_MODE_CONNECTOR_DisplayPort. New drivers should behave in the same
way.



Sorry, didn't see the commit message before posting my complaint about
USB -> DisplayPort.


An attempt to use subconnector property was also declined. It is defined
to the type of the DP dongle connector rather than the host connector.

This attempt targets reusing the connector's PATH property. Currently
this property is only used for the DP MST connectors. This patchset
reuses it to point out to the corresponding registered typec port
device.



Still interested in understanding how the path string should look like.


As I wrote in the other email, on RB5 it is 'typec:port0'. If the machine
has two Type-C ports and two connected DP blocks, one of them will have
'typec:port0' and the other 'typec:port1'. This way one can further look
under /sys/class/typec/portN/physical_location/ and find the corresponding
location, etc.
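A minimal sketch of how userspace might consume such a PATH value (the property string is the RB5 example above; the sysfs layout under /sys/class/typec is assumed, and nothing here queries real hardware):

```shell
# Example PATH property value as described for RB5.
path_prop="typec:port0"

case "$path_prop" in
typec:*)
    # Strip the 'typec:' prefix to get the port device name...
    port="${path_prop#typec:}"
    # ...and build the sysfs path where the physical location would live.
    sysfs="/sys/class/typec/$port/physical_location"
    ;;
esac

echo "$port"    # port0
echo "$sysfs"   # /sys/class/typec/port0/physical_location
```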



Is the path expected to be consumed by machine, or is it only there for
human convenience?


As with DP MST it is expected that userspace will consume this 
information, possibly renaming the connector. For example, on my laptop 
I have DP-1, ... DP-5 connectors (with DP-2 -- DP-5 being DP MST ones). 
Xorg renames them to DP-1, DP-2, DP-1-1, DP-1-2, DP-1-3, because the MST 
ones are branches for the DP-1.


--
With best wishes
Dmitry



Re: [RFC PATCH v1 07/12] soc: qcom: pmic_glink_altmode: report that this is a Type-C connector

2023-09-04 Thread Dmitry Baryshkov

On 04/09/2023 18:43, Bjorn Andersson wrote:

On Mon, Sep 04, 2023 at 12:41:45AM +0300, Dmitry Baryshkov wrote:

Set the bridge's path property to point out that this connector is
wrapped into the Type-C port.

We cannot really identify the exact Type-C port because it is
registered separately by another driver, which is not mandatory, and the
corresponding device is not even present on some platforms, like
sc8180x or sm8350. Thus we use the shortened version of the PATH, which
includes just the 'typec:' part.


How would a properly resolved path look like?


On RB5 it is 'typec:port0', as the USB-C port is registered as 
/sys/class/typec/port0




As with the other patch, I'm okay with this going through the USB tree.

Acked-by: Bjorn Andersson 

Regards,
Bjorn


--
With best wishes
Dmitry



Re: [RFC PATCH v1 06/12] soc: qcom: pmic_glink_altmode: fix DRM connector type

2023-09-04 Thread Bjorn Andersson
On Mon, Sep 04, 2023 at 12:41:44AM +0300, Dmitry Baryshkov wrote:
> During discussions regarding USB-C vs native DisplayPort it was pointed
> out that other implementations (i915, AMD) are using
> DRM_MODE_CONNECTOR_DisplayPort for both native and USB-C-wrapped DP
> output. Follow this example and make the pmic_glink_altmode driver also
> report DisplayPort connector rather than the USB one.

I started off with this, but on devices with both USB Type-C and
(native) DisplayPort, it seemed much more reasonable to make the
distinction; and thereby get the outputs named "USB-n".

Similarly, it looks quite reasonable that the outputs on my laptop are
eDP-1, USB-1 and USB-2.


But, if this is not the way its intended to be used, feel free to pick
this together with the other patches.

Acked-by: Bjorn Andersson 

Regards,
Bjorn


Re: [RFC PATCH v1 00/12] drm,usb/typec: uABI for USB-C DisplayPort connectors

2023-09-04 Thread Bjorn Andersson
On Mon, Sep 04, 2023 at 12:41:38AM +0300, Dmitry Baryshkov wrote:
> During the discussion regarding DisplayPort wrapped in the USB-C
> connectors (via the USB-C altmode) it was pointed out that currently
> there is no good way to let userspace know if the DRM connector in
> question is the 'native' DP connector or if it is the USB-C connector.
> 
> An attempt to use DRM_MODE_CONNECTOR_USB for such connectors was
> declined, as existing DP drivers (i915, AMD) use
> DRM_MODE_CONNECTOR_DisplayPort. New drivers should behave in the same
> way.
> 

Sorry, didn't see the commit message before posting my complaint about
USB -> DisplayPort.

> An attempt to use subconnector property was also declined. It is defined
> to the type of the DP dongle connector rather than the host connector.
> 
> This attempt targets reusing the connector's PATH property. Currently
> this property is only used for the DP MST connectors. This patchset
> reuses it to point out to the corresponding registered typec port
> device.
> 

Still interested in understanding how the path string should look like.

Is the path expected to be consumed by machine, or is it only there for
human convenience?

Regards,
Bjorn


Re: [RFC PATCH 04/10] drm/panel_helper: Introduce drm_panel_helper

2023-09-04 Thread Maxime Ripard
Hi,

On Fri, Sep 01, 2023 at 06:42:42AM -0700, Doug Anderson wrote:
> On Fri, Sep 1, 2023 at 1:15 AM Maxime Ripard  wrote:
> > On Thu, Aug 31, 2023 at 11:18:49AM -0700, Doug Anderson wrote:
> > > Today this is explicitly _not_ refcounting, right? It is simply
> > > treating double-enables as no-ops and double-disables as no-ops. With
> > > our current understanding, the only thing we actually need to guard
> > > against is double-disable but at the moment we do guard against both.
> > > Specifically we believe the cases that are issues:
> > >
> > > a) At shutdown/remove time we want to disable the panel, but only if
> > > it was enabled (we wouldn't want to call disable if the panel was
> > > already off because userspace turned it off).
> >
> > Yeah, and that's doable with refcounting too.
> 
> I don't understand the benefit of switching to refcounting, though. We
> don't ever expect the "prepare" or "enable" function to be called more
> than once and all we're guarding against is a double-unprepare and a
> double-enable. Switching this to refcounting would make the reader
> think that there was a legitimate case for things to be prepared or
> enabled twice. As far as I know, there isn't.

Sure, eventually we'll want to remove it.

I even said it as such here:
https://lore.kernel.org/dri-devel/wwzbd7dt5qyimshnd7sbgkf5gxk7tq5dxtrerz76uw5p6s7tzt@cbiezkfeuqqn/

However, we have a number of panels following various anti-patterns
where disable and unprepare would be called multiple times. A boolean
would just ignore the second call, while refcounting would warn about it,
and that's what we want.

And that's exactly because there isn't a legitimate case for things to
be disabled or unprepared twice, yet many panel drivers do it anyway.

> In any case, I don't think there's any need to switch this to
> refcounting as part of this effort. Someone could, in theory, do it as
> a separate patch series.

I'm sorry, but I'll insist on getting a solution that will warn panels
that call drm_atomic_helper_shutdown or drm_panel_disable/unprepare by
hand. It doesn't have to be refcounting though if you have a better idea
in mind.

> > > The above solves the problem with panels wanting to power sequence
> > > themselves at remove() time, but not at shutdown() time. Thus we'd
> > > still have a dependency on having all drivers use
> > > drm_atomic_helper_shutdown() so that work becomes a dependency.
> >
> > Does it? I think it can be done in parallel?
> 
> I don't think it can be in parallel. While it makes sense for panels
> to call drm_panel_remove() at remove time, it doesn't make sense for
> them to call it at shutdown time. That means that the trick of having
> the panel get powered off in drm_panel_remove() won't help for
> shutdown. For shutdown, which IMO is the more important case, we need
> to wait until all drm drivers call drm_atomic_helper_shutdown()
> properly.

Right, my point was more that drivers that already don't disable the
panel in their shutdown implementation will still not do it. And drivers
that do will still do it, so there's no regression.

We obviously want to tend to having all drivers call
drm_atomic_helper_shutdown(), but not having it will not introduce any
regression.

Maxime


Re: [PATCH 2/8] fbdev/udlfb: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Am 04.09.23 um 15:05 schrieb Javier Martinez Canillas:
>> Thomas Zimmermann  writes:
>> 
>>> Generate callback functions for struct fb_ops with the fbdev macro
>>> FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(). Initialize struct fb_ops to
>>> the generated functions with fbdev initializer macros.
>>>
>>> Signed-off-by: Thomas Zimmermann 
>>> Cc: Bernie Thompson 
>>> ---
>> 
>> Acked-by: Javier Martinez Canillas 
>> 
>> [...]
>> 
>>> +static void dlfb_ops_damage_range(struct fb_info *info, off_t off, size_t 
>>> len)
>>> +{
>>> +   struct dlfb_data *dlfb = info->par;
>>> +   int start = max((int)(off / info->fix.line_length), 0);
>>> +   int lines = min((u32)((len / info->fix.line_length) + 1), 
>>> (u32)info->var.yres);
>>> +
>>> +   dlfb_handle_damage(dlfb, 0, start, info->var.xres, lines);
>>> +}
>>> +
>>> +static void dlfb_ops_damage_area(struct fb_info *info, u32 x, u32 y, u32 
>>> width, u32 height)
>>> +{
>>> +   struct dlfb_data *dlfb = info->par;
>>> +
>>> +   dlfb_offload_damage(dlfb, x, y, width, height);
>>> +}
>>> +
>> 
>> These two are very similar to the helpers you added for the smscufx driver
>> in patch #1. I guess there's room for further consolidation as follow-up ?
>
> Maybe. I had patches that take the rectangle computation from [1] and 
> turn it into a helper for these USB drivers. But it's an unrelated 
> change, so I dropped them from this patchset.
>

Great and yes, I meant as separate patch-set, not as a part of this one.

> Best regards
> Thomas
>

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH 1/8] fbdev/smscufx: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Hi Javier
>
> Am 04.09.23 um 14:59 schrieb Javier Martinez Canillas:
>> Thomas Zimmermann  writes:
>> 
>> Hello Thomas,
>> 
>>> Generate callback functions for struct fb_ops with the fbdev macro
>>> FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(). Initialize struct fb_ops to
>>> the generated functions with fbdev initializer macros.
>>>
>>> Signed-off-by: Thomas Zimmermann 
>>> Cc: Steve Glendinning 
>>> ---
>> 
>> The patch looks good to me, but I've a question below.
>> 
>> Acked-by: Javier Martinez Canillas 
>> 
>>>   drivers/video/fbdev/smscufx.c | 85 +--
>>>   1 file changed, 22 insertions(+), 63 deletions(-)
>>>
>>> diff --git a/drivers/video/fbdev/smscufx.c b/drivers/video/fbdev/smscufx.c
>> 
>> [...]
>> 
>>>   static const struct fb_ops ufx_ops = {
>>> .owner = THIS_MODULE,
>>> -   .fb_read = fb_sys_read,
>>> -   .fb_write = ufx_ops_write,
>>> +   __FB_DEFAULT_DEFERRED_OPS_RDWR(ufx_ops),
>>> .fb_setcolreg = ufx_ops_setcolreg,
>>> -   .fb_fillrect = ufx_ops_fillrect,
>>> -   .fb_copyarea = ufx_ops_copyarea,
>>> -   .fb_imageblit = ufx_ops_imageblit,
>>> +   __FB_DEFAULT_DEFERRED_OPS_DRAW(ufx_ops),
>>> .fb_mmap = ufx_ops_mmap,
>> 
>> There are no generated functions for .fb_mmap, I wonder what's the value
>> of __FB_DEFAULT_DEFERRED_OPS_MMAP() ? Maybe just removing that macro and
>> setting .fb_mmap = fb_deferred_io_mmap instead if there's no custom mmap
>> handler would be easier to read ?
>
> At least two drivers could use __FB_DEFAULT_DEFERRED_OPS_MMAP: 
> picolcd-fb and hyperv_fb. At some point, we might want to set/clear 
> fb_mmap depending on some Kconfig value. Having 
> __FB_DEFAULT_DEFERRED_OPS_MMAP might be helpful then.
>

Got it, thanks for the explanation.

>> 
>> Alternatively, __FB_DEFAULT_DEFERRED_OPS_MMAP() could still be left but
>> not taking a __prefix argument since that is not used anyways ?
>
> The driver optionally provides mmap without deferred I/O, hence the mmap 
> function. That makes no sense, as these writes to the buffer would never 
> make it to the device memory. But I didn't want to remove the code 
> either. So I just left the existing function as-is. Usually, the 
> deferred-I/O mmap is called immediately. [1]
>

Makes sense.

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 02/15] drm/panthor: Add uAPI

2023-09-04 Thread Steven Price
On 04/09/2023 10:26, Boris Brezillon wrote:
> On Mon, 4 Sep 2023 08:42:08 +0100
> Steven Price  wrote:
> 
>> On 01/09/2023 17:10, Boris Brezillon wrote:
>>> On Wed,  9 Aug 2023 18:53:15 +0200
>>> Boris Brezillon  wrote:
>>>   
 +/**
 + * DOC: MMIO regions exposed to userspace.
 + *
 + * .. c:macro:: DRM_PANTHOR_USER_MMIO_OFFSET
 + *
 + * File offset for all MMIO regions being exposed to userspace. Don't use
 + * this value directly, use DRM_PANTHOR_USER__OFFSET values instead.
 + *
 + * .. c:macro:: DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET
 + *
 + * File offset for the LATEST_FLUSH_ID register. The Userspace driver 
 controls
 + * GPU cache flushing through CS instructions, but the flush reduction
 + * mechanism requires a flush_id. This flush_id could be queried with an
 + * ioctl, but Arm provides a well-isolated register page containing only 
 this
 + * read-only register, so let's expose this page through a static mmap 
 offset
 + * and allow direct mapping of this MMIO region so we can avoid the
 + * user <-> kernel round-trip.
 + */
 +#define DRM_PANTHOR_USER_MMIO_OFFSET  (0x1ull << 56)  
>>>
>>> I'm playing with a 32-bit kernel/userspace, and this is problematic,
>>> because vm_pgoff is limited to 32-bit there, meaning we can only map up
>>> to (1ull << (PAGE_SHIFT + 32)) - 1. Should we add a DEV_QUERY to let
>>> userspace set the mmio range?  
>>
>> Hmm, I was rather hoping we could ignore 32 bit these days ;) But while
>> I can't see why anyone would be running a 32 bit kernel, I guess 32 bit
>> user space is likely to still be needed.
> 
> Uh, I just hit a new problem with 32-bit kernels: the io-pgtable
> interface (io_pgtable_ops) passes device VAs as unsigned longs, meaning
> the GPU VA space is limited to 4G on a 32-bit build :-(. Robin, any
> chance you could advise me on what to do here?
> 
> 1. assume this limitation is here for a good reason, and limit the GPU
> VA space to 32-bits on 32-bit kernels
> 
> or
> 
> 2. update the interface to make iova an u64

I'm not sure I can answer the question from a technical perspective,
hopefully Robin will be able to.

But why do we care about 32-bit kernels on a platform which is new
enough to have a CSF-GPU (and by extension a recent 64-bit CPU)?

Given the other limitations present in a 32-bit kernel I'd be tempted to
say '1' just for simplicity. Especially since apparently we've lived
with this for panfrost which presumably has the same limitation (even
though all Bifrost/Midgard GPUs have at least 33 bits of VA space).

Steve



Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time

2023-09-04 Thread Maxime Ripard
On Mon, Sep 04, 2023 at 08:55:43AM +0100, Russell King (Oracle) wrote:
> On Mon, Sep 04, 2023 at 09:36:10AM +0200, Maxime Ripard wrote:
> > On Sun, Sep 03, 2023 at 04:53:42PM +0100, Russell King (Oracle) wrote:
> > > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > > Based on grepping through the source code this driver appears to be
> > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > time. Among other things, this means that if a panel is in use that it
> > > > won't be cleanly powered off at system shutdown time.
> > > > 
> > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > instance overview" in drm_drv.c.
> > > > 
> > > > This driver was fairly easy to update. The drm_device is stored in the
> > > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > > device is not bound.
> > > 
> > > ... and there I think you have a misunderstanding of the driver model.
> > > Please have a look at device_unbind_cleanup() which will be called if
> > > probe fails, or when the device is removed (in other words, when it is
> > > not bound to a driver.)
> > > 
> > > Also, devices which aren't bound to a driver won't have their shutdown
> > > method called (because there is no driver currently bound to that
> > > device.) So, ->probe must have completed successfully, and ->remove
> > > must not have been called for that device.
> > > 
> > > So, I think that all these dev_set_drvdata(dev, NULL) that you're
> > > adding are just asking for a kernel janitor to come along later and
> > > remove them because they serve no purpose... so best not introduce
> > > them in the first place.
> > 
> > What would that hypothetical janitor clean up exactly? Code making sure
> > that there's no dangling pointer? Doesn't look very wise to me.
> 
> How can there be a dangling pointer when the driver core removes the
> pointer for the driver in these cases?

You can still access that pointer from remove after the call to
component_del(). It's unlikely, sure, but the issue is still there.

> If I were to accept the argument that the driver core _might_ "forget"
> to NULL out this pointer, then that argument by extension also means
> that no one should make use of the devm_* stuff either, just in case
> the driver core forgets to release that stuff. Best have every driver
> manually release those resources.

It's funny that you go for that argument, because using devm is known to
be a design issue in KMS (and the rest of the kernel to some extent), so
yeah, I very much agree with you there.

> Nope, that doesn't work, because driver authors tend to write buggy
> cleanup paths.

And using devm is just as buggy for a KMS driver. We even discourage its
use in the documentation.

But really, I'm not sure what your point is there. Does devm lead to
bugs? Sure. Is it less buggy than manually unrolling an exit path by
hand? Probably. I seriously don't get the relation to some code clearing
a pointer after it's been freed though.

> There are janitors that go around removing this stuff, and janitorial
> patches tend to keep coming even if one says nak at any particular
> point... and yes, janitors do go around removing this unnecessary
> junk from drivers.
> 
> You will find examples of this removal in commit
> ec3b1ce2ca34, 5cdade2d77dd, c7cb175bb1ef

Other people doing it doesn't make it right (or wrong). And really, I'm
not arguing that it's right, I'm saying that it's not wrong.

It's probably being over-cautious, especially in that driver, but it's
not wrong.

> Moreover:
> 
> 7efb10383181
> 
> is also removing unnecessary driver code. Testing for driver data being
> NULL when we know that a _successful_ probe has happened (because
> ->remove won't be called unless we were successful) and the probe
> always sets drvdata non-NULL is also useless code.

Again, I fail to see what the relationship is there.

> If ultimately you don't trust the driver model to do what it's been
> doing for more than the last decade, then I wonder whether you should
> be trusting the kernel to manage your hardware!

It's not the kernel driver model that I don't trust, it's C's (lack of)
memory safety and management. And like you said yourself, "driver
authors tend to write buggy"

> Anyway, I've said no to this patch for a driver that I'm marked as
> maintainer for, so at least do not make the changes I am objecting to
> to that driver. Thanks.

You're entitled to that opinion indeed.

Maxime


Re: [PATCH 1/8] fbdev/smscufx: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Thomas Zimmermann



Am 04.09.23 um 16:39 schrieb Thomas Zimmermann:
[...]
At least two drivers could use __FB_DEFAULT_DEFERRED_OPS_MMAP: 
picolcd-fb and hyperv_fb. At some point, we might want to set/clear 


Both drivers are already in this patchset.

fb_mmap depending on some Kconfig value. Having 
__FB_DEFAULT_DEFERRED_OPS_MMAP might be helpful then.




Alternatively, __FB_DEFAULT_DEFERRED_OPS_MMAP() could still be left but
not taking a __prefix argument since that is not used anyways ?


The driver optionally provides mmap without deferred I/O, hence the mmap 
function. That makes no sense, as these writes to the buffer would never 
make it to the device memory. But I didn't want to remove the code 
either. So I just left the existing function as-is. Usually, the 
deferred-I/O mmap is called immediately. [1]


Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5.1/source/drivers/video/fbdev/smscufx.c#L784








--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [PATCH v3 0/8] drm/msm/dpu: move INTF tearing checks to dpu_encoder_phys_cmd_ini

2023-09-04 Thread Dmitry Baryshkov
Of course this should be 'drm/msm/dpu: drop DPU_INTF_TE and 
DPU_PINGPONG_TE' series


On 04/09/2023 05:04, Dmitry Baryshkov wrote:

Drop two feature flags, DPU_INTF_TE and DPU_PINGPONG_TE, in favour of
performing the MDSS revision checks instead.

Changes since v2:
- Added guarding checks for hw_intf and hw_pp in debug print (Marijn)
- Removed extra empty lines (Marijn)

Changes since v1:
- Added missing patch
- Reworked commit messages (following suggestions by Marijn)
- Changed code to check for major & INTF type rather than checking for
   intr presence in catalog. Added WARN_ON()s instead. (Marijn)
- Added severall comments & TODO item.

Dmitry Baryshkov (8):
   drm/msm/dpu: inline _setup_pingpong_ops()
   drm/msm/dpu: enable PINGPONG TE operations only when supported by HW
   drm/msm/dpu: drop the DPU_PINGPONG_TE flag
   drm/msm/dpu: inline _setup_intf_ops()
   drm/msm/dpu: enable INTF TE operations only when supported by HW
   drm/msm/dpu: drop DPU_INTF_TE feature flag
   drm/msm/dpu: drop useless check from
 dpu_encoder_phys_cmd_te_rd_ptr_irq()
   drm/msm/dpu: move INTF tearing checks to dpu_encoder_phys_cmd_init

  .../drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c  | 52 +--
  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c|  3 +-
  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  6 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c   | 51 +-
  .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c   | 41 +++
  .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h   |  3 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c|  2 +-
  7 files changed, 75 insertions(+), 83 deletions(-)



--
With best wishes
Dmitry



Re: [PATCH 2/8] fbdev/udlfb: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Thomas Zimmermann



Am 04.09.23 um 15:05 schrieb Javier Martinez Canillas:

Thomas Zimmermann  writes:


Generate callback functions for struct fb_ops with the fbdev macro
FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(). Initialize struct fb_ops to
the generated functions with fbdev initializer macros.

Signed-off-by: Thomas Zimmermann 
Cc: Bernie Thompson 
---


Acked-by: Javier Martinez Canillas 

[...]


+static void dlfb_ops_damage_range(struct fb_info *info, off_t off, size_t len)
+{
+   struct dlfb_data *dlfb = info->par;
+   int start = max((int)(off / info->fix.line_length), 0);
+   int lines = min((u32)((len / info->fix.line_length) + 1), 
(u32)info->var.yres);
+
+   dlfb_handle_damage(dlfb, 0, start, info->var.xres, lines);
+}
+
+static void dlfb_ops_damage_area(struct fb_info *info, u32 x, u32 y, u32 
width, u32 height)
+{
+   struct dlfb_data *dlfb = info->par;
+
+   dlfb_offload_damage(dlfb, x, y, width, height);
+}
+


These two are very similar to the helpers you added for the smscufx driver
in patch #1. I guess there's room for further consolidation as follow-up ?


Maybe. I had patches that take the rectangle computation from [1] and 
turn it into a helper for these USB drivers. But it's an unrelated 
change, so I dropped them from this patchset.


Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5.1/source/drivers/gpu/drm/drm_fb_helper.c#L641






--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [PATCH 1/8] fbdev/smscufx: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Thomas Zimmermann

Hi Javier

Am 04.09.23 um 14:59 schrieb Javier Martinez Canillas:

Thomas Zimmermann  writes:

Hello Thomas,


Generate callback functions for struct fb_ops with the fbdev macro
FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(). Initialize struct fb_ops to
the generated functions with fbdev initializer macros.

Signed-off-by: Thomas Zimmermann 
Cc: Steve Glendinning 
---


The patch looks good to me, but I've a question below.

Acked-by: Javier Martinez Canillas 


  drivers/video/fbdev/smscufx.c | 85 +--
  1 file changed, 22 insertions(+), 63 deletions(-)

diff --git a/drivers/video/fbdev/smscufx.c b/drivers/video/fbdev/smscufx.c


[...]


  static const struct fb_ops ufx_ops = {
.owner = THIS_MODULE,
-   .fb_read = fb_sys_read,
-   .fb_write = ufx_ops_write,
+   __FB_DEFAULT_DEFERRED_OPS_RDWR(ufx_ops),
.fb_setcolreg = ufx_ops_setcolreg,
-   .fb_fillrect = ufx_ops_fillrect,
-   .fb_copyarea = ufx_ops_copyarea,
-   .fb_imageblit = ufx_ops_imageblit,
+   __FB_DEFAULT_DEFERRED_OPS_DRAW(ufx_ops),
.fb_mmap = ufx_ops_mmap,


There are no generated functions for .fb_mmap, I wonder what's the value
of __FB_DEFAULT_DEFERRED_OPS_MMAP() ? Maybe just removing that macro and
setting .fb_mmap = fb_deferred_io_mmap instead if there's no custom mmap
handler would be easier to read ?


At least two drivers could use __FB_DEFAULT_DEFERRED_OPS_MMAP: 
picolcd-fb and hyperv_fb. At some point, we might want to set/clear 
fb_mmap depending on some Kconfig value. Having 
__FB_DEFAULT_DEFERRED_OPS_MMAP might be helpful then.




Alternatively, __FB_DEFAULT_DEFERRED_OPS_MMAP() could still be left but
not taking a __prefix argument since that is not used anyways ?


The driver optionally provides mmap without deferred I/O, hence the mmap 
function. That makes no sense, as these writes to the buffer would never 
make it to the device memory. But I didn't want to remove the code 
either. So I just left the existing function as-is. Usually, the 
deferred-I/O mmap is called immediately. [1]


Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5.1/source/drivers/video/fbdev/smscufx.c#L784






--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




RE: [RFC 00/33] Add Support for Plane Color Pipeline

2023-09-04 Thread Shankar, Uma


> -Original Message-
> From: Sebastian Wick 
> Sent: Thursday, August 31, 2023 2:46 AM
> To: Shankar, Uma 
> Cc: Harry Wentland ; intel-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; wayland-
> de...@lists.freedesktop.org; Ville Syrjala ; 
> Pekka
> Paalanen ; Simon Ser ;
> Melissa Wen ; Jonas Ådahl ; Shashank
> Sharma ; Alexander Goins ;
> Naseer Ahmed ; Christopher Braga
> 
> Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> 
> On Wed, Aug 30, 2023 at 08:47:37AM +, Shankar, Uma wrote:
> >
> >
> > > -Original Message-
> > > From: Harry Wentland 
> > > Sent: Wednesday, August 30, 2023 12:56 AM
> > > To: Shankar, Uma ;
> > > intel-...@lists.freedesktop.org; dri- de...@lists.freedesktop.org
> > > Cc: wayland-de...@lists.freedesktop.org; Ville Syrjala
> > > ; Pekka Paalanen
> > > ; Simon Ser ;
> > > Melissa Wen ; Jonas Ådahl ;
> > > Sebastian Wick ; Shashank Sharma
> > > ; Alexander Goins ;
> > > Naseer Ahmed ; Christopher Braga
> > > 
> > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > >
> > > +CC Naseer and Chris, FYI
> > >
> > > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > >
> > > On 2023-08-29 12:03, Uma Shankar wrote:
> > > > Introduction
> > > > 
> > > >
> > > > Modern hardwares have various color processing capabilities both
> > > > at pre-blending and post-blending phases in the color pipeline.
> > > > The current drm implementation exposes only the post-blending
> > > > color hardware blocks. Support for pre-blending hardware is missing.
> > > > There are multiple use cases where pre-blending color hardware
> > > > will be
> > > > useful:
> > > > a) Linearization of input buffers encoded in various transfer
> > > >functions.
> > > > b) Color Space conversion
> > > > c) Tone mapping
> > > > d) Frame buffer format conversion
> > > > e) Non-linearization of buffer(apply transfer function)
> > > > f) 3D Luts
> > > >
> > > > and other miscellaneous color operations.
> > > >
> > > > Hence, there is a need to expose the color capabilities of the
> > > > hardware to user-space. This will help userspace/middleware to use
> > > > display hardware for color processing and blending instead of
> > > > doing it through GPU shaders.
> > > >
> > >
> > > Thanks, Uma, for sending this. I've been working on something
> > > similar but you beat me to it. :)
> >
> > Thanks Harry for the useful feedback and overall collaboration on this so 
> > far.
> >
> > > >
> > > > Work done so far and relevant references
> > > > 
> > > >
> > > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > > Broad consensus is there that we need a generic API at drm core to
> > > > suffice the use case of various HW vendors. Below are the links
> > > > capturing the discussion so far.
> > > >
> > > > Intel's Plane Color Implementation:
> > > > https://patchwork.freedesktop.org/series/90825/
> > > > AMD's Plane Color Implementation:
> > > > https://patchwork.freedesktop.org/series/116862/
> > > >
> > > >
> > > > Hackfest conclusions
> > > > 
> > > >
> > > > HDR/Color Hackfest was organised by Redhat to bring all the
> > > > industry stakeholders together and converge on a common uapi
> expectations.
> > > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia
> > > > and other prominent user-space developers and maintainers.
> > > >
> > > > Discussions happened on the uapi expectations, opens, nature of
> > > > hardware of multiple hardware vendors, challenges in generalizing
> > > > the same and the path forward. Consensus was made that drm core
> > > > should implement descriptive APIs and not go with prescriptive
> > > > APIs. DRM core should just expose the hardware capabilities;
> > > > enabling, customizing and programming the same should be done by
> > > > the user-space. Driver should just
> > > honor the user space request without doing any operations internally.
> > > >
> > > > Thanks to Simon Ser, for nicely documenting the design consensus
> > > > and an UAPI RFC which can be referred to here:
> > > >
> > > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze
> > > > _hD5
> > > >
> > >
> nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1
> Q
> > > Wn48
> > > > 8=@emersion.fr/
> > > >
> > > >
> > > > Design considerations
> > > > =
> > > >
> > > > Following are the important aspects taken into account while
> > > > designing the current RFC
> > > > proposal:
> > > >
> > > > 1. Individual HW blocks can be muxed. (e.g. out of two HW blocks
> > > > only one
> > > can be used)
> > > > 2. Position of the HW block in the pipeline can be programmable
> > > > 3. LUTs can be one dimensional or three dimensional
> > > > 4. Number of LUT entries can vary across platforms
> > > > 5. Preci

Re: [RFC][PATCH 0/2] drm/panic: Add a drm panic handler

2023-09-04 Thread Thomas Zimmermann

Hi Jocelyn,

thanks for moving this effort forward. It's much appreciated. I looked 
through the patches and tried the patchset on my test machine.


Am 09.08.23 um 21:17 schrieb Jocelyn Falempe:

This introduces a new drm panic handler, which displays a message when a panic 
occurs.
So when fbcon is disabled, you can still see a kernel panic.

This is one of the missing features when disabling VT/fbcon in the kernel:
https://www.reddit.com/r/linux/comments/10eccv9/config_vtn_in_2023/
Fbcon can be replaced by a userspace kms console, but the panic screen must be 
done in the kernel.

This is a proof of concept, and works only with simpledrm, using the drm_client 
API.
This implementation with the drm client API, allocates new framebuffers, and 
looks a bit too complex to run in a panic handler.
Maybe we should add an API to "steal" the current framebuffer instead, because 
in a panic handler user-space is already stopped.


Yes, that was also my first thought. I'd use an extra callback in struct 
drm_driver, like this:


struct drm_driver {
  int (*get_scanout_buffer)(/* return HW scanout */)
}

The scanout buffer would be described by kernel virtual address, 
resolution, color format and scanline pitch. And that's what the panic 
handler uses.


Any driver implementing this interface would support the panic handler. 
If there's a concurrent display update, we'd have to synchronize.




To test it, make sure you're using the simpledrm driver, and trigger a panic:
echo c > /proc/sysrq-trigger


The penguin was cute. :)

This only works if the display is already running. I had to start Gnome 
to set a display mode. Then let the panic handler take over the output.


But with simpledrm, we could even display a message without an output, 
as the framebuffer is always there.




One thing I don't know how to do is to unregister the drm_panic handler when 
the graphics driver is unloaded.
drm_client_register() says it will automatically unregister on driver unload. 
But then I don't know how to remove it from my linked list, and free the 
drm_client_dev struct.


Unregistering wouldn't be necessary with this proposed 
get_scanout_buffer. In the case of a panic, just remain silent if 
there's no driver that provides such a callback.




This is a first draft, so let me know what do you think about it.


One thing that will need serious work is the raw output. The current 
blitting for XRGB is really just a quick-and-dirty hack.


I think we should try to reuse fbdev's blitting code, if possible. The 
fbdev core, helpers and console come with all the features we need. We 
really only need to make them work without the struct fb_info, which is 
a full fbdev device.


In struct fb_ops, there are callbacks for modifying the framebuffer. [1] 
They are used by fbcon for drawing. But they operate on fb_info.


For a while I've been thinking about using something like a drawable to 
provide some abstractions:


struct drawable {
/* store buffer parameters here */
...

struct drawable_funcs *funcs;
};

struct drawable_funcs {
/* have function pointers similar to struct fb_ops */
fill_rect()
copy_area()
image_blit()
};

We cannot rewrite all the existing fbdev drivers. To make this work with 
fbdev, we'd need adapter code that converts from drawable to fb_info and 
forwards to the existing helpers in fb_ops.


But for DRM's panic output, drawable_funcs would have to point to the 
scanout buffer and compatible callback funcs, for which we have 
implementations in fbdev.


We might be able to create console-like output that is independent from 
the fb_info. Hence, we could possible reuse a good chunk of the current 
panic output.


Best regards
Thomas

[1] https://elixir.bootlin.com/linux/v6.5.1/source/include/linux/fb.h#L273



Best regards,




Jocelyn Falempe (2):
   drm/panic: Add a drm panic handler
   drm/simpledrm: Add drm_panic support

  drivers/gpu/drm/Kconfig  |  11 ++
  drivers/gpu/drm/Makefile |   1 +
  drivers/gpu/drm/drm_drv.c|   3 +
  drivers/gpu/drm/drm_panic.c  | 286 +++
  drivers/gpu/drm/tiny/simpledrm.c |   2 +
  include/drm/drm_panic.h  |  26 +++
  6 files changed, 329 insertions(+)
  create mode 100644 drivers/gpu/drm/drm_panic.c
  create mode 100644 include/drm/drm_panic.h


base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [PATCH 5/5] drm/bridge: samsung-dsim: calculate porches in Hz

2023-09-04 Thread Maxim Schwalm
Hi,

On 28.08.23 17:59, Michael Tretter wrote:
> Calculating the byte_clk in kHz is imprecise for a hs_clock of 55687500
> Hz, which may be used with a pixel clock of 74.25 MHz with mode
> 1920x1080-30.
> 
> Fix the calculation by using HZ instead of kHZ.
> 
> This requires to change the type to u64 to prevent overflows of the
> integer type.
> 
> Signed-off-by: Michael Tretter 
> ---
>  drivers/gpu/drm/bridge/samsung-dsim.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
> b/drivers/gpu/drm/bridge/samsung-dsim.c
> index 459be953be55..eb7aca2b9ab7 100644
> --- a/drivers/gpu/drm/bridge/samsung-dsim.c
> +++ b/drivers/gpu/drm/bridge/samsung-dsim.c
> @@ -973,10 +973,12 @@ static void samsung_dsim_set_display_mode(struct 
> samsung_dsim *dsi)
>   u32 reg;
>  
>   if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO) {
> - int byte_clk_khz = dsi->hs_clock / 1000 / 8;
> - int hfp = DIV_ROUND_UP((m->hsync_start - m->hdisplay) * 
> byte_clk_khz, m->clock);
> - int hbp = DIV_ROUND_UP((m->htotal - m->hsync_end) * 
> byte_clk_khz, m->clock);
> - int hsa = DIV_ROUND_UP((m->hsync_end - m->hsync_start) * 
> byte_clk_khz, m->clock);
> + u64 byte_clk = dsi->hs_clock / 8;
> + u64 pix_clk = m->clock * 1000;
> +
> + int hfp = DIV64_U64_ROUND_UP((m->hsync_start - m->hdisplay) * 
> byte_clk, pix_clk);
> + int hbp = DIV64_U64_ROUND_UP((m->htotal - m->hsync_end) * 
> byte_clk, pix_clk);
> + int hsa = DIV64_U64_ROUND_UP((m->hsync_end - m->hsync_start) * 
> byte_clk, pix_clk);

Wouldn't it make sense to use the videomode structure here?

>  
>   /* remove packet overhead when possible */
>   hfp = max(hfp - 6, 0);
> 

Best regards,
Maxim


RE: [RFC 02/33] drm: Add color operation structure

2023-09-04 Thread Shankar, Uma



> -Original Message-
> From: dri-devel  On Behalf Of Pekka
> Paalanen
> Sent: Wednesday, August 30, 2023 6:30 PM
> To: Shankar, Uma 
> Cc: intel-...@lists.freedesktop.org; Borah, Chaitanya Kumar
> ; dri-devel@lists.freedesktop.org; wayland-
> de...@lists.freedesktop.org
> Subject: Re: [RFC 02/33] drm: Add color operation structure
> 
> On Tue, 29 Aug 2023 21:33:51 +0530
> Uma Shankar  wrote:
> 
> > From: Chaitanya Kumar Borah 
> >
> > Each Color Hardware block will be represented uniquely in the color
> > pipeline. Define the structure to represent the same.
> >
> > These color operations will form the building blocks of a color
> > pipeline which best represents the underlying Hardware. Color
> > operations can be re-arranged, substracted or added to create distinct
> > color pipelines to accurately describe the Hardware blocks present in
> > the display engine.
> >
> > Co-developed-by: Uma Shankar 
> > Signed-off-by: Uma Shankar 
> > Signed-off-by: Chaitanya Kumar Borah 
> > ---
> >  include/uapi/drm/drm_mode.h | 72
> > +
> >  1 file changed, 72 insertions(+)
> >
> > diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
> > index ea1b639bcb28..882479f41745 100644
> > --- a/include/uapi/drm/drm_mode.h
> > +++ b/include/uapi/drm/drm_mode.h
> > @@ -943,6 +943,78 @@ struct hdr_output_metadata {
> > };
> >  };
> >
> > +/**
> > + * enum color_op_block
> > + *
> > + * Enums to identify hardware color blocks.
> > + *
> > + * @DRM_CB_PRE_CSC: LUT before the CTM unit
> > + * @DRM_CB_CSC: CTM hardware supporting 3x3 matrix
> > + * @DRM_CB_POST_CSC: LUT after the CTM unit
> > + * @DRM_CB_3D_LUT: LUT hardware with coefficients for all
> > + * color components
> > + * @DRM_CB_PRIVATE: Vendor specific hardware unit. Vendor
> > + *  can expose a custom hardware by defining a
> > + *  color operation block with this name as
> > + *  identifier
> 
> This naming scheme does not seem to work. It assumes a far too rigid pipeline,
> just like the old KMS property design. What if you have two other operations
> between PRE_CSC and CSC?
> 
> What sense do PRE_CSC and POST_CSC make if you don't happen to have a CSC
> operation?

Sure, we can re-look at the naming. However, it will be good to define some 
standard
operations common to all vendors and keep the rest as vendor private.

> What if a driver put POST_CSC before PRE_CSC in its pipeline?
> 
> What if your CSC is actually a series of three independent operations, and in
> addition you have PRE_CSC and POST_CSC?

We should try to standardized the operations as much as possible and leave rest 
as
vendor private. Current proposal allows us to do that.

> 3D_LUT is an operation category, not a name. The same could be said about
> private.

Sure, will fix this.

> Given that all these are also UAPI, do we also need protect old userspace from
> seeing values it does not understand?

For the values userspace doesn't understand, it can ignore the blocks. We 
should ensure that userspace always gets a clean state wrt color hardware 
and that no baggage from another client is left behind. With that, an older 
userspace carries no burden of having to disable that particular block.

> > + */
> > +enum color_op_block {
> > +   DRM_CB_INVAL = -1,
> > +
> > +   DRM_CB_PRE_CSC = 0,
> > +   DRM_CB_CSC,
> > +   DRM_CB_POST_CSC,
> > +   DRM_CB_3D_LUT,
> > +
> > +   /* Any new generic hardware block can be updated here */
> > +
> > +   /*
> > +* PRIVATE is kept at 255 to make it future proof and leave
> > +* scope for any new addition
> > +*/
> > +   DRM_CB_PRIVATE = 255,
> > +   DRM_CB_MAX = DRM_CB_PRIVATE,
> > +};
> > +
> > +/**
> > + * enum color_op_type
> > + *
> > + * These enums are to identify the mathematical operation that
> > + * a hardware block is capable of.
> > + * @CURVE_1D: It represents a one dimensional lookup table
> > + * @CURVE_3D: Represents lut value for each color component for 3d
> > +lut capable hardware
> > + * @MATRIX: It represents co-efficients for a CSC/CTM matrix hardware
> > +FIXED_FUNCTION: To enable and program any custom fixed function
> > +hardware unit
> > + */
> > +enum color_op_type {
> > +   CURVE_1D,
> > +   CURVE_3D,
> > +   MATRIX,
> > +   FIXED_FUNCTION,
> 
> My assumption was that a color_op_type would clearly and uniquely define the
> mathematical model of the operation and the UABI structure of the parameter
> blob. That means we need different values for uniform vs. exponentially vs.
> programmable distributed 1D LUT, etc.

In the hardware, the LUTs are programmed as they are received from userspace.
So once userspace knows the distribution of LUTs, segments, precision, and
number of LUT samples, it can create the LUT values to be programmed.

This information on the distribution of luts in the hardware can be extracted 
by the
drm_color_lut_range structure which is exposed as blob in the h

Re: [PATCH 0/5] drm/bridge: samsung-dsim: fix various modes with ADV7535 bridge

2023-09-04 Thread Frieder Schrempf
Hi Michael,

On 28.08.23 17:59, Michael Tretter wrote:
> I tested the i.MX8M Nano EVK with the NXP supplied MIPI-DSI adapter,
> which uses an ADV7535 MIPI-DSI to HDMI converter. I found that a few
> modes were working, but in many modes my monitor stayed dark.
> 
> This series fixes the Samsung DSIM bridge driver to bring up a few more
> modes:
> 
> The driver read the rate of the PLL ref clock only during probe.
> However, if the clock is re-parented to the VIDEO_PLL, changes to the
> pixel clock have an effect on the PLL ref clock. Therefore, the driver
> must read and potentially update the PLL ref clock on every modeset.
> 
> I also found that the rounding mode of the porches and active area has
> an effect on the working modes. If the driver rounds up instead of
> rounding down and calculates them in Hz instead of kHz, more modes
> start to work.
> 
> The following table shows the modes that were working in my test without
> this patch set and the modes that are working now:
> 
> | Mode            | Before | Now |
> | 1920x1080-60.00 |   X    |  X  |
> | 1920x1080-59.94 |        |  X  |
> | 1920x1080-50.00 |        |  X  |
> | 1920x1080-30.00 |        |  X  |
> | 1920x1080-29.97 |        |  X  |
> | 1920x1080-25.00 |        |  X  |
> | 1920x1080-24.00 |        |     |
> | 1920x1080-23.98 |        |     |
> | 1680x1050-59.88 |        |  X  |
> | 1280x1024-75.03 |   X    |  X  |
> | 1280x1024-60.02 |   X    |  X  |
> |  1200x960-59.99 |        |  X  |
> |  1152x864-75.00 |   X    |  X  |
> |  1280x720-60.00 |        |     |
> |  1280x720-59.94 |        |     |
> |  1280x720-50.00 |        |  X  |
> |  1024x768-75.03 |        |  X  |
> |  1024x768-60.00 |        |  X  |
> |   800x600-75.00 |   X    |  X  |
> |   800x600-60.32 |   X    |  X  |
> |   720x576-50.00 |   X    |  X  |
> |   720x480-60.00 |        |     |
> |   720x480-59.94 |   X    |     |
> |   640x480-75.00 |   X    |  X  |
> |   640x480-60.00 |        |  X  |
> |   640x480-59.94 |        |  X  |
> |   720x400-70.08 |        |     |
> 
> Interestingly, the 720x480-59.94 mode stopped working. However, I am
> able to bring up the 720x480 modes by manually hacking the active area
> (hsa) to 40 and carefully adjusting the clocks, but something still
> seems to be off.
> 
> Unfortunately, a few more modes are still not working at all. The NXP
> downstream kernel has some quirks to handle some of the modes especially
> wrt. to the porches, but I cannot figure out, what the driver should
> actually do in these cases. Maybe there is still an error in the
> calculation of the porches and someone at NXP can chime in.

Thanks for working on this! We tested these patches with our Kontron BL
i.MX8MM board and a "10.1inch HDMI LCD (E)" display from Waveshare  [1].

Without this series we don't get an image with the default mode of the
display (1024x600). With this series applied, it's now working.

For the whole series:

Tested-by: Frieder Schrempf  # Kontron BL i.MX8MM + Waveshare 10.1inch HDMI LCD (E)

Thanks
Frieder

[1] https://www.waveshare.com/10.1inch-hdmi-lcd-e.htm


Re: [RFC] drm/bridge: megachips-stdpxxxx-ge-b850v3-fw: switch to drm_do_get_edid()

2023-09-04 Thread Peter Senna Tschudin
On Mon, Sep 4, 2023 at 12:16 PM Jani Nikula  wrote:
>
> On Sat, 02 Sep 2023, Peter Senna Tschudin  wrote:
> > Good morning Jani,
> >
> > It has been a long time since I wrote the driver, and many many years
> > since I sent my last kernel patch, so my memory does not serve me very
> > well, but I will try to shed some light.
> >
> > On Fri, Sep 1, 2023 at 12:24 PM Jani Nikula  wrote:
> >>
> >> The driver was originally added in commit fcfa0ddc18ed ("drm/bridge:
> >> Drivers for megachips-stdp-ge-b850v3-fw (LVDS-DP++)"). I tried to
> >> look up the discussion, but didn't find anyone questioning the EDID
> >> reading part.
> >>
> >> Why does it not use drm_get_edid() or drm_do_get_edid()?
> >>
> >> I don't know where client->addr comes from, so I guess it could be
> >> different from DDC_ADDR, rendering drm_get_edid() unusable.
> >>
> >> There's also the comment:
> >>
> >> /* Yes, read the entire buffer, and do not skip the first
> >>  * EDID_LENGTH bytes.
> >>  */
> >>
> >> But again, there's not a word on *why*.
> >
> > The video pipeline has two hardware bridges between the LVDS from the
> > SoC and DP+ output. For reasons, we would get hot plug events from one
> > of these bridges, and EDID from the other. If I am not mistaken, I
> > documented this strangeness in the DTS readme file.
> >
> > Did this shed any light on the *why* or did I tell you something you
> > already knew?
>
> I guess that answers the question why it's necessary to specify the ddc
> to use, but not why drm_do_get_edid() could not be used. Is it really
> necessary to read the EDID in one go?

I have a very weak recollection about hotplug and EDID issues with the
LVDS driver. I am not very confident about this, but maybe I needed to
find ways to read EDID early enough to please the LVDS display driver
during the LVDS driver startup.


RE: [RFC 01/33] drm/doc/rfc: Add RFC document for proposed Plane Color Pipeline

2023-09-04 Thread Shankar, Uma


> -Original Message-
> From: dri-devel  On Behalf Of Pekka
> Paalanen
> Sent: Wednesday, August 30, 2023 5:59 PM
> To: Shankar, Uma 
> Cc: intel-...@lists.freedesktop.org; Borah, Chaitanya Kumar
> ; dri-devel@lists.freedesktop.org; wayland-
> de...@lists.freedesktop.org
> Subject: Re: [RFC 01/33] drm/doc/rfc: Add RFC document for proposed Plane
> Color Pipeline
> 
> On Wed, 30 Aug 2023 08:59:36 +
> "Shankar, Uma"  wrote:
> 
> > > -Original Message-
> > > From: Harry Wentland 
> > > Sent: Wednesday, August 30, 2023 1:10 AM
> > > To: Shankar, Uma ;
> > > intel-...@lists.freedesktop.org; dri- de...@lists.freedesktop.org
> > > Cc: Borah, Chaitanya Kumar ;
> > > wayland- de...@lists.freedesktop.org
> > > Subject: Re: [RFC 01/33] drm/doc/rfc: Add RFC document for proposed
> > > Plane Color Pipeline
> > >
> > >
> > >
> > > On 2023-08-29 12:03, Uma Shankar wrote:
> > > > Add the documentation for the new proposed Plane Color Pipeline.
> > > >
> > > > Co-developed-by: Chaitanya Kumar Borah
> > > > 
> > > > Signed-off-by: Chaitanya Kumar Borah
> > > > 
> > > > Signed-off-by: Uma Shankar 
> > > > ---
> > > >   .../gpu/rfc/plane_color_pipeline.rst  | 394 ++
> > > >   1 file changed, 394 insertions(+)
> > > >   create mode 100644
> > > > Documentation/gpu/rfc/plane_color_pipeline.rst
> > > >
> > > > diff --git a/Documentation/gpu/rfc/plane_color_pipeline.rst
> > > > b/Documentation/gpu/rfc/plane_color_pipeline.rst
> > > > new file mode 100644
> > > > index ..60ce515b6ea7
> > > > --- /dev/null
> > > > +++ b/Documentation/gpu/rfc/plane_color_pipeline.rst
> 
> ...
> 
> Hi Uma!

Thanks Pekka for the feedback and useful inputs.

> > > > +This color pipeline is then packaged within a blob for the user
> > > > +space to retrieve it. Details can be found in the next section
> > > > +
> > >
> > > Not sure I like blobs that contain other blob ids.
> >
> > It provides flexibility and helps with just one interface to
> > userspace. Its easy to handle and manage once we get the hang of it 😊.
> >
> > We can clearly define the steps of parsing and data structures to be
> > used while interpreting and parsing the blobs.
> 
> Don't forget extendability. Possibly every single struct will need
> some kind of versioning, and then it's not simple to parse anymore.
> Add to that new/old kernel vs. old/new userspace, and it seems a bit
> nightmarish to design.

The structure used to interpret the blob should be defined as UAPI only
and is not expected to change once agreed upon. It should be
interpreted like a standard property. So the structures used, say for
3D LUT, 1D LUT or CTM operations, should be standardized and fixed. No
versioning of the structures should be done, and the same
rules/restrictions as for a UAPI property should apply.

The new-vs-old-userspace problem exists even today, as you rightly
highlighted in the mail below; however, we are planning to propose that
the hardware state be cleaned once the userspace client switches, or
the same client switches the pipeline.

> Also since it's records inside a single blob, it's like a new file
> format: every record needs a standard header that allows skipping it
> appropriately if userspace does not understand it, or you need a
> standard index telling where everything is. Making all records the
> same size would waste space, and extendability requires variable size.

The design currently represents one hardware block by a struct
drm_color_op data structure. Multiple such blocks make up the pipeline.
So userspace just needs to get the pipeline and then parse the blocks
one by one. Blocks it doesn't understand it can simply skip, moving on
to the next one. Each block is identified by a unique "name",
standardized by an enum that will be part of the UAPI. Thus we have
scope for a variable-size blob to represent a particular hardware
pipeline, and userspace can parse and implement whichever blocks it
understands. The only rule defined by the UAPI is how the respective
block is to be parsed and programmed.

> I also would not assume that we can declare a standard set of blocks
> and that nothing else will be needed. The existing hardware is too
> diverse for that from what I have understood. I assume that some
> hardware have blocks unique to them, and they want to at least expose
> that functionality through a UAPI that allows at least generic
> enumeration of functionality, even if it needs specialized userspace
> code to actually make use of.

Yeah, this is right, and for that reason we came up with the idea of a
DRM_CB_PRIVATE name for the hardware block. This tells userspace that
this is a private hardware block for a particular hardware vendor.
Generic userspace will ignore this block; a vendor-specific HAL or
compositor implementation can parse and use it. To interpret the
blob_id and assign it to a respective data structure, private_flags
will be used. These private_flags will be agreed upon by the HAL and
vendor implem

Re: [PATCH 8/8] staging/fbtft: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Generate callback functions for struct fb_ops with the fbdev macro
> FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(). Initialize struct fb_ops to
> the generated functions with an fbdev initializer macro.
>
> Signed-off-by: Thomas Zimmermann 
> ---

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH 7/8] staging/fbtft: Initialize fb_op struct as static const

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Replace dynamic allocation of the fb_ops instance with static
> allocation. Initialize the fields at module-load time. The owner
> field changes to THIS_MODULE, as in all other fbdev drivers.
>
> Signed-off-by: Thomas Zimmermann 
> ---

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH 6/8] hid/picolcd: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Generate callback functions for struct fb_ops with the fbdev macro
> FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(). Initialize struct fb_ops to
> the generated functions with an fbdev initializer macro.
>
> Signed-off-by: Thomas Zimmermann 
> Cc: Jiri Kosina 
> Cc: Benjamin Tissoires 
> Cc: "Bruno Prémont" 
> ---

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH 5/8] hid: Remove trailing whitespace

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Fix coding style in Kconfig. No functional changes.
>
> Signed-off-by: Thomas Zimmermann 
> Cc: Jiri Kosina 
> Cc: Benjamin Tissoires 
> ---

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v16 20/20] drm/panfrost: Switch to generic memory shrinker

2023-09-04 Thread Steven Price
On 03/09/2023 18:07, Dmitry Osipenko wrote:
> Replace Panfrost's custom memory shrinker with a common drm-shmem
> memory shrinker.
> 
> Tested-by: Steven Price  # Firefly-RK3288

I just gave this version of the series a spin and I can trigger the following warning:

[  477.776163] [ cut here ]
[  477.781353] WARNING: CPU: 0 PID: 292 at drivers/gpu/drm/drm_gem_shmem_helper.c:227 drm_gem_shmem_free+0x1fc/0x200 [drm_shmem_helper]
[  477.794790] panfrost ffa3.gpu: drm_WARN_ON(refcount_read(&shmem->pages_use_count))
[  477.794797] Modules linked in: panfrost gpu_sched drm_shmem_helper
[  477.810942] CPU: 0 PID: 292 Comm: glmark2-es2-drm Not tainted 6.5.0-rc2-00527-gc8a0c16fa830 #1
[  477.820564] Hardware name: Rockchip (Device Tree)
[  477.825820]  unwind_backtrace from show_stack+0x10/0x14
[  477.831670]  show_stack from dump_stack_lvl+0x58/0x70
[  477.837319]  dump_stack_lvl from __warn+0x7c/0x1a4
[  477.842680]  __warn from warn_slowpath_fmt+0x134/0x1a0
[  477.848429]  warn_slowpath_fmt from drm_gem_shmem_free+0x1fc/0x200 [drm_shmem_helper]
[  477.857199]  drm_gem_shmem_free [drm_shmem_helper] from drm_gem_handle_delete+0x84/0xb0
[  477.866163]  drm_gem_handle_delete from drm_ioctl+0x214/0x4ec
[  477.872592]  drm_ioctl from sys_ioctl+0x568/0xd48
[  477.877857]  sys_ioctl from ret_fast_syscall+0x0/0x1c
[  477.883504] Exception stack(0xf0a49fa8 to 0xf0a49ff0)
[  477.889148] 9fa0:   005969c0 bef34880 0006 40086409 bef34880 0001
[  477.898289] 9fc0: 005969c0 bef34880 40086409 0036 bef34880 00590b64 00590aec 
[  477.907428] 9fe0: b6ec408c bef3485c b6ead42f b6c31f98
[  477.913188] irq event stamp: 37296889
[  477.917319] hardirqs last  enabled at (37296951): [] __up_console_sem+0x50/0x60
[  477.926531] hardirqs last disabled at (37296972): [] __up_console_sem+0x3c/0x60
[  477.935714] softirqs last  enabled at (37296986): [] __do_softirq+0x318/0x4d4
[  477.944708] softirqs last disabled at (37296981): [] __irq_exit_rcu+0x140/0x160
[  477.953878] ---[ end trace  ]---

So something, somewhere has gone wrong with the reference counts.

Steve

> Reviewed-by: Steven Price 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/panfrost/Makefile |   1 -
>  drivers/gpu/drm/panfrost/panfrost_device.h|   4 -
>  drivers/gpu/drm/panfrost/panfrost_drv.c   |  27 ++--
>  drivers/gpu/drm/panfrost/panfrost_gem.c   |  30 ++--
>  drivers/gpu/drm/panfrost/panfrost_gem.h   |   9 --
>  .../gpu/drm/panfrost/panfrost_gem_shrinker.c  | 129 --
>  drivers/gpu/drm/panfrost/panfrost_job.c   |  18 ++-
>  include/drm/drm_gem_shmem_helper.h|   7 -
>  8 files changed, 47 insertions(+), 178 deletions(-)
>  delete mode 100644 drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> 
> diff --git a/drivers/gpu/drm/panfrost/Makefile b/drivers/gpu/drm/panfrost/Makefile
> index 7da2b3f02ed9..11622e22cf15 100644
> --- a/drivers/gpu/drm/panfrost/Makefile
> +++ b/drivers/gpu/drm/panfrost/Makefile
> @@ -5,7 +5,6 @@ panfrost-y := \
>   panfrost_device.o \
>   panfrost_devfreq.o \
>   panfrost_gem.o \
> - panfrost_gem_shrinker.o \
>   panfrost_gpu.o \
>   panfrost_job.o \
>   panfrost_mmu.o \
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index b0126b9fbadc..dcc2571c092b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -116,10 +116,6 @@ struct panfrost_device {
>   atomic_t pending;
>   } reset;
>  
> - struct mutex shrinker_lock;
> - struct list_head shrinker_list;
> - struct shrinker shrinker;
> -
>   struct panfrost_devfreq pfdevfreq;
>  };
>  
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 175443eacead..8cf338c2a03b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -170,7 +170,6 @@ panfrost_lookup_bos(struct drm_device *dev,
>   break;
>   }
>  
> - atomic_inc(&bo->gpu_usecount);
>   job->mappings[i] = mapping;
>   }
>  
> @@ -395,7 +394,6 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
>  {
>   struct panfrost_file_priv *priv = file_priv->driver_priv;
>   struct drm_panfrost_madvise *args = data;
> - struct panfrost_device *pfdev = dev->dev_private;
>   struct drm_gem_object *gem_obj;
>   struct panfrost_gem_object *bo;
>   int ret = 0;
> @@ -408,11 +406,15 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
>  
>   bo = to_panfrost_bo(gem_obj);
>  
> + if (bo->is_heap) {
> + args->retained = 1;
> + goto out_put_object;
> + }
> +
>   ret = dma_resv_lock_interruptible(bo->base.base.resv, NULL);
>   if (ret)
>   goto

Re: [PATCH 4/8] fbdev/hyperv_fb: Use fb_ops helpers for deferred I/O

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Generate callback functions for struct fb_ops with the fbdev macro
> FB_GEN_DEFAULT_DEFERRED_IOMEM_OPS(). Initialize struct fb_ops to
> the generated functions with fbdev initializer macros.
>
> The hyperv_fb driver is incomplete in its handling of deferred I/O
> and damage framebuffers. Write operations do no trigger damage handling.
> Fixing this is beyond the scope of this patch.
>
> Signed-off-by: Thomas Zimmermann 

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH 3/8] fbdev: Add Kconfig macro FB_IOMEM_HELPERS_DEFERRED

2023-09-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> The new Kconfig macro FB_IOMEM_HELPERS_DEFERRED selects fbdev's
> helpers for device I/O memory and deferred I/O. Drivers should
> use it if they perform damage updates on device I/O memory.
>
> Signed-off-by: Thomas Zimmermann 
> ---

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [RFC] drm/bridge: megachips-stdpxxxx-ge-b850v3-fw: switch to drm_do_get_edid()

2023-09-04 Thread Laurent Pinchart
On Mon, Sep 04, 2023 at 01:04:40PM +0300, Jani Nikula wrote:
> On Sat, 02 Sep 2023, Peter Senna Tschudin wrote:
> > Good morning Jani,
> >
> > It has been a long time since I wrote the driver, and many many years
> > since I sent my last kernel patch, so my memory does not serve me very
> > well, but I will try to shed some light.
> >
> > On Fri, Sep 1, 2023 at 12:24 PM Jani Nikula wrote:
> >>
> >> The driver was originally added in commit fcfa0ddc18ed ("drm/bridge:
> >> Drivers for megachips-stdp-ge-b850v3-fw (LVDS-DP++)"). I tried to
> >> look up the discussion, but didn't find anyone questioning the EDID
> >> reading part.
> >>
> >> Why does it not use drm_get_edid() or drm_do_get_edid()?
> >>
> >> I don't know where client->addr comes from, so I guess it could be
> >> different from DDC_ADDR, rendering drm_get_edid() unusable.
> >>
> >> There's also the comment:
> >>
> >> /* Yes, read the entire buffer, and do not skip the first
> >>  * EDID_LENGTH bytes.
> >>  */
> >>
> >> But again, there's not a word on *why*.
> >
> > The video pipeline has two hardware bridges between the LVDS from the
> > SoC and DP+ output. For reasons, we would get hot plug events from one
> > of these bridges, and EDID from the other. If I am not mistaken, I
> > documented this strangeness in the DTS readme file.

This should be supported properly by the drm_bridge_connector helper,
which supports delegating HPD and EDID retrieval to different bridges.

> > Did this shed any light on the *why* or did I tell you something you
> > already knew?
> 
> I guess that answers the question why it's necessary to specify the ddc
> to use, but not why drm_do_get_edid() could not be used. Is it really
> necessary to read the EDID in one go?

-- 
Regards,

Laurent Pinchart

