RE: [PATCH 1/2] Enable buddy memory manager support

2021-09-22 Thread Paneer Selvam, Arunpravin
[AMD Public Use]

Hi Christian,

> And where is the patch to switch i915 and remove the Intel copy of this?
Creating a patch for the switch.

> In general I think that every public function here needs a kerneldoc 
> description what it is all about.
Making a kerneldoc description for each public function.

Thanks,
Arun
-Original Message-
From: Koenig, Christian  
Sent: Wednesday, September 22, 2021 12:28 PM
To: Paneer Selvam, Arunpravin ; 
dri-devel@lists.freedesktop.org; intel-...@lists.freedesktop.org; 
amd-...@lists.freedesktop.org; matthew.a...@intel.com; dan...@ffwll.ch; 
Deucher, Alexander 
Subject: Re: [PATCH 1/2] Enable buddy memory manager support

Am 20.09.21 um 21:20 schrieb Arunpravin:
> Port Intel buddy system manager to drm root folder. Add CPU 
> mappable/non-mappable region support to the drm buddy manager.

And where is the patch to switch i915 and remove the Intel copy of this?

In general I think that every public function here needs a kerneldoc 
description what it is all about.

Regards,
Christian.

>
> Signed-off-by: Arunpravin 
> ---
>   drivers/gpu/drm/Makefile|   2 +-
>   drivers/gpu/drm/drm_buddy.c | 465 
>   include/drm/drm_buddy.h | 154 
>   3 files changed, 620 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/gpu/drm/drm_buddy.c
>   create mode 100644 include/drm/drm_buddy.h
>
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 
> a118692a6df7..fe1a2fc09675 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -18,7 +18,7 @@ drm-y   :=  drm_aperture.o drm_auth.o drm_cache.o \
>   drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>   drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
>   drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> - drm_managed.o drm_vblank_work.o
> + drm_managed.o drm_vblank_work.o drm_buddy.o
>   
>   drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
> drm_dma.o \
>   drm_legacy_misc.o drm_lock.o drm_memory.o 
> drm_scatter.o \ 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c 
> new file mode 100644 index ..f07919a004b6
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -0,0 +1,465 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation  */
> +
> +#include 
> +#include 
> +
> +static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
> + struct drm_buddy_block *parent, unsigned int order,
> + u64 offset)
> +{
> + struct drm_buddy_block *block;
> +
> + BUG_ON(order > DRM_BUDDY_MAX_ORDER);
> +
> + block = kmem_cache_zalloc(mm->slab_blocks, GFP_KERNEL);
> + if (!block)
> + return NULL;
> +
> + block->header = offset;
> + block->header |= order;
> + block->parent = parent;
> + block->start = offset >> PAGE_SHIFT;
> + block->size = (mm->chunk_size << order) >> PAGE_SHIFT;
> +
> + BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
> + return block;
> +}
> +
> +static void drm_block_free(struct drm_buddy_mm *mm, struct 
> +drm_buddy_block *block) {
> + kmem_cache_free(mm->slab_blocks, block); }
> +
> +static void add_ordered(struct drm_buddy_mm *mm, struct 
> +drm_buddy_block *block) {
> + struct drm_buddy_block *node;
> +
> + if (list_empty(&mm->free_list[drm_buddy_block_order(block)])) {
> + list_add(&block->link,
> + &mm->free_list[drm_buddy_block_order(block)]);
> + return;
> + }
> +
> + list_for_each_entry(node, &mm->free_list[drm_buddy_block_order(block)], 
> link)
> + if (block->start > node->start)
> + break;
> +
> + __list_add(&block->link, node->link.prev, &node->link); }
> +
> +static void mark_allocated(struct drm_buddy_block *block) {
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_ALLOCATED;
> +
> + list_del(&block->link);
> +}
> +
> +static void mark_free(struct drm_buddy_mm *mm,
> +   struct drm_buddy_block *block) {
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_FREE;
> +
> + add_ordered(mm, block);
> +}
> +
> +static void mark_split(struct drm_buddy_block *block) {
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_SPLIT;
> +
> + list_del(&block->link);
> +}
> +
> +int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size) 
> +{
> + unsigned int i;
> + u64 offset;
> +
> + if (size < chunk_size)
> + return -EINVAL;
> +
> + if (chunk_size < PAGE_SIZE)
> + return -EINVAL;
> +
> + if (!is_power_of_2(chunk_size))
> + return -EINVAL;
> +
> + size = round_down(size, chunk_size);
> +
> + mm->size = size;
> + mm->chunk_size = chunk_size;
> + mm->max_order = ilog2(size) - ilog2(chunk_size);

Re: [PATCH v3 6/6] drm/ingenic: Attach bridge chain to encoders

2021-09-22 Thread H. Nikolaus Schaller
Hi Paul,
thanks for another update.

We have been delayed in reworking the CI20 HDMI code on top of your series,
but it basically works in some situations. There is, for example, a problem
if the EDID reports DRM_COLOR_FORMAT_YCRCB422, but that appears to be outside
the scope of your series.

The only issue we have is described below.

> Am 22.09.2021 um 22:55 schrieb Paul Cercueil :
> 
> Attach a top-level bridge to each encoder, which will be used for
> negotiating the bus format and flags.
> 
> All the bridges are now attached with DRM_BRIDGE_ATTACH_NO_CONNECTOR.
> 
> Signed-off-by: Paul Cercueil 
> ---
> drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 92 +--
> 1 file changed, 70 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
> b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> index a5e2880e07a1..a05a9fa6e115 100644
> --- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> +++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> @@ -21,6 +21,7 @@
> #include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> #include 
> @@ -108,6 +109,19 @@ struct ingenic_drm {
>   struct drm_private_obj private_obj;
> };
> 
> +struct ingenic_drm_bridge {
> + struct drm_encoder encoder;
> + struct drm_bridge bridge, *next_bridge;
> +
> + struct drm_bus_cfg bus_cfg;
> +};
> +
> +static inline struct ingenic_drm_bridge *
> +to_ingenic_drm_bridge(struct drm_encoder *encoder)
> +{
> + return container_of(encoder, struct ingenic_drm_bridge, encoder);
> +}
> +
> static inline struct ingenic_drm_private_state *
> to_ingenic_drm_priv_state(struct drm_private_state *state)
> {
> @@ -668,11 +682,10 @@ static void ingenic_drm_encoder_atomic_mode_set(struct 
> drm_encoder *encoder,
> {
>   struct ingenic_drm *priv = drm_device_get_priv(encoder->dev);
>   struct drm_display_mode *mode = &crtc_state->adjusted_mode;
> - struct drm_connector *conn = conn_state->connector;
> - struct drm_display_info *info = &conn->display_info;
> + struct ingenic_drm_bridge *bridge = to_ingenic_drm_bridge(encoder);
>   unsigned int cfg, rgbcfg = 0;
> 
> - priv->panel_is_sharp = info->bus_flags & DRM_BUS_FLAG_SHARP_SIGNALS;
> + priv->panel_is_sharp = bridge->bus_cfg.flags & 
> DRM_BUS_FLAG_SHARP_SIGNALS;
> 
>   if (priv->panel_is_sharp) {
>   cfg = JZ_LCD_CFG_MODE_SPECIAL_TFT_1 | JZ_LCD_CFG_REV_POLARITY;
> @@ -685,19 +698,19 @@ static void ingenic_drm_encoder_atomic_mode_set(struct 
> drm_encoder *encoder,
>   cfg |= JZ_LCD_CFG_HSYNC_ACTIVE_LOW;
>   if (mode->flags & DRM_MODE_FLAG_NVSYNC)
>   cfg |= JZ_LCD_CFG_VSYNC_ACTIVE_LOW;
> - if (info->bus_flags & DRM_BUS_FLAG_DE_LOW)
> + if (bridge->bus_cfg.flags & DRM_BUS_FLAG_DE_LOW)
>   cfg |= JZ_LCD_CFG_DE_ACTIVE_LOW;
> - if (info->bus_flags & DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE)
> + if (bridge->bus_cfg.flags & DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE)
>   cfg |= JZ_LCD_CFG_PCLK_FALLING_EDGE;
> 
>   if (!priv->panel_is_sharp) {
> - if (conn->connector_type == DRM_MODE_CONNECTOR_TV) {
> + if (conn_state->connector->connector_type == 
> DRM_MODE_CONNECTOR_TV) {
>   if (mode->flags & DRM_MODE_FLAG_INTERLACE)
>   cfg |= JZ_LCD_CFG_MODE_TV_OUT_I;
>   else
>   cfg |= JZ_LCD_CFG_MODE_TV_OUT_P;
>   } else {
> - switch (*info->bus_formats) {
> + switch (bridge->bus_cfg.format) {
>   case MEDIA_BUS_FMT_RGB565_1X16:
>   cfg |= JZ_LCD_CFG_MODE_GENERIC_16BIT;
>   break;
> @@ -723,20 +736,29 @@ static void ingenic_drm_encoder_atomic_mode_set(struct 
> drm_encoder *encoder,
>   regmap_write(priv->map, JZ_REG_LCD_RGBC, rgbcfg);
> }
> 
> -static int ingenic_drm_encoder_atomic_check(struct drm_encoder *encoder,
> - struct drm_crtc_state *crtc_state,
> - struct drm_connector_state 
> *conn_state)
> +static int ingenic_drm_bridge_attach(struct drm_bridge *bridge,
> +  enum drm_bridge_attach_flags flags)
> +{
> + struct ingenic_drm_bridge *ib = to_ingenic_drm_bridge(bridge->encoder);
> +
> + return drm_bridge_attach(bridge->encoder, ib->next_bridge,
> +  &ib->bridge, flags);
> +}
> +
> +static int ingenic_drm_bridge_atomic_check(struct drm_bridge *bridge,
> +struct drm_bridge_state 
> *bridge_state,
> +struct drm_crtc_state *crtc_state,
> +struct drm_connector_state 
> *conn_state)
> {
> - struct drm_display_info *info = &conn_state->connector->display_info;
>   struct drm_display_mode *mode = &crtc_state->adjusted_mode;
> + struct ingenic_drm_bridge *ib = 

RE: [PATCH 1/2] Enable buddy memory manager support

2021-09-22 Thread Paneer Selvam, Arunpravin
[AMD Public Use]

Hi Alex,
I will fix the name and send a document in my next version.

Thanks,
Arun
-Original Message-
From: Alex Deucher  
Sent: Tuesday, September 21, 2021 12:54 AM
To: Paneer Selvam, Arunpravin 
Cc: Maling list - DRI developers ; Intel 
Graphics Development ; amd-gfx list 
; Koenig, Christian ; 
Matthew Auld ; Daniel Vetter ; 
Deucher, Alexander 
Subject: Re: [PATCH 1/2] Enable buddy memory manager support

On Mon, Sep 20, 2021 at 3:21 PM Arunpravin  
wrote:

Please prefix the patch subject with drm.  E.g.,
drm: Enable buddy memory manager support

Same for the second patch, but make it drm/amdgpu instead.

Alex

>
> Port Intel buddy system manager to drm root folder. Add CPU 
> mappable/non-mappable region support to the drm buddy manager.
>
> Signed-off-by: Arunpravin 
> ---
>  drivers/gpu/drm/Makefile|   2 +-
>  drivers/gpu/drm/drm_buddy.c | 465 
>  include/drm/drm_buddy.h | 154 
>  3 files changed, 620 insertions(+), 1 deletion(-)  create mode 100644 
> drivers/gpu/drm/drm_buddy.c  create mode 100644 
> include/drm/drm_buddy.h
>
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 
> a118692a6df7..fe1a2fc09675 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -18,7 +18,7 @@ drm-y   :=drm_aperture.o drm_auth.o drm_cache.o 
> \
> drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
> drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
> drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> -   drm_managed.o drm_vblank_work.o
> +   drm_managed.o drm_vblank_work.o drm_buddy.o
>
>  drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
> drm_dma.o \
> drm_legacy_misc.o drm_lock.o drm_memory.o 
> drm_scatter.o \ diff --git a/drivers/gpu/drm/drm_buddy.c 
> b/drivers/gpu/drm/drm_buddy.c new file mode 100644 index 
> ..f07919a004b6
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -0,0 +1,465 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation  */
> +
> +#include 
> +#include 
> +
> +static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
> +   struct drm_buddy_block *parent, unsigned int order,
> +   u64 offset)
> +{
> +   struct drm_buddy_block *block;
> +
> +   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
> +
> +   block = kmem_cache_zalloc(mm->slab_blocks, GFP_KERNEL);
> +   if (!block)
> +   return NULL;
> +
> +   block->header = offset;
> +   block->header |= order;
> +   block->parent = parent;
> +   block->start = offset >> PAGE_SHIFT;
> +   block->size = (mm->chunk_size << order) >> PAGE_SHIFT;
> +
> +   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
> +   return block;
> +}
> +
> +static void drm_block_free(struct drm_buddy_mm *mm, struct 
> +drm_buddy_block *block) {
> +   kmem_cache_free(mm->slab_blocks, block); }
> +
> +static void add_ordered(struct drm_buddy_mm *mm, struct 
> +drm_buddy_block *block) {
> +   struct drm_buddy_block *node;
> +
> +   if (list_empty(&mm->free_list[drm_buddy_block_order(block)])) {
> +   list_add(&block->link,
> +   &mm->free_list[drm_buddy_block_order(block)]);
> +   return;
> +   }
> +
> +   list_for_each_entry(node, 
> &mm->free_list[drm_buddy_block_order(block)], link)
> +   if (block->start > node->start)
> +   break;
> +
> +   __list_add(&block->link, node->link.prev, &node->link); }
> +
> +static void mark_allocated(struct drm_buddy_block *block) {
> +   block->header &= ~DRM_BUDDY_HEADER_STATE;
> +   block->header |= DRM_BUDDY_ALLOCATED;
> +
> +   list_del(&block->link);
> +}
> +
> +static void mark_free(struct drm_buddy_mm *mm,
> + struct drm_buddy_block *block) {
> +   block->header &= ~DRM_BUDDY_HEADER_STATE;
> +   block->header |= DRM_BUDDY_FREE;
> +
> +   add_ordered(mm, block);
> +}
> +
> +static void mark_split(struct drm_buddy_block *block) {
> +   block->header &= ~DRM_BUDDY_HEADER_STATE;
> +   block->header |= DRM_BUDDY_SPLIT;
> +
> +   list_del(&block->link);
> +}
> +
> +int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size) 
> +{
> +   unsigned int i;
> +   u64 offset;
> +
> +   if (size < chunk_size)
> +   return -EINVAL;
> +
> +   if (chunk_size < PAGE_SIZE)
> +   return -EINVAL;
> +
> +   if (!is_power_of_2(chunk_size))
> +   return -EINVAL;
> +
> +   size = round_down(size, chunk_size);
> +
> +   mm->size = size;
> +   mm->chunk_size = chunk_size;
> +   mm->max_order = ilog2(size) - ilog2(chunk_size);
> +
> +   BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);
> +
> +   mm->slab_blocks = KMEM_CACHE(drm_buddy_block, 
> + 

RE: [PATCH 2/2] Add drm buddy manager support to amdgpu driver

2021-09-22 Thread Paneer Selvam, Arunpravin
[AMD Public Use]

Hi Christian,
Thanks for the review, I will send the next version fixing all the issues.

Regards,
Arun
-Original Message-
From: Christian König  
Sent: Wednesday, September 22, 2021 12:18 PM
To: Paneer Selvam, Arunpravin ; Koenig, 
Christian ; dri-devel@lists.freedesktop.org; 
intel-...@lists.freedesktop.org; amd-...@lists.freedesktop.org; 
matthew.a...@intel.com; dan...@ffwll.ch; Deucher, Alexander 

Subject: Re: [PATCH 2/2] Add drm buddy manager support to amdgpu driver

Am 21.09.21 um 17:51 schrieb Paneer Selvam, Arunpravin:
> [AMD Public Use]
>
> Hi Christian,
> Please find my comments.

A better mail client might be helpful for mailing list communication. I use 
Thunderbird, but Outlook with appropriate settings should do as well.

>
> Thanks,
> Arun
> -Original Message-
> From: Koenig, Christian 
> Sent: Tuesday, September 21, 2021 2:34 PM
> To: Paneer Selvam, Arunpravin ; 
> dri-devel@lists.freedesktop.org; intel-...@lists.freedesktop.org; 
> amd-...@lists.freedesktop.org; matthew.a...@intel.com; 
> dan...@ffwll.ch; Deucher, Alexander 
> Subject: Re: [PATCH 2/2] Add drm buddy manager support to amdgpu 
> driver
>
> Am 20.09.21 um 21:21 schrieb Arunpravin:
> [SNIP]
>> +struct list_head blocks;
>> +};
>> +
>> +static inline struct amdgpu_vram_mgr_node * 
>> +to_amdgpu_vram_mgr_node(struct ttm_resource *res) {
>> +return container_of(container_of(res, struct ttm_range_mgr_node, base),
>> +struct amdgpu_vram_mgr_node, tnode); }
>> +
> Maybe stuff that in a separate amdgpu_vram_mgr.h file together with all the 
> other defines for the vram manager.
>
> Arun - I thought about it, will create a new header file for vram 
> manager

Maybe make that a separate patch before this one here.

>> +if (mode == DRM_BUDDY_ALLOC_RANGE) {
>> +r = drm_buddy_alloc_range(mm, >blocks,
>> +(uint64_t)place->fpfn << PAGE_SHIFT, pages << 
>> PAGE_SHIFT);
> That handling won't work. It's possible that you need contiguous memory in a 
> specific range.
>
> Arun - the existing default backend range handler allocates contiguous 
> nodes in powers of 2, finding the MSBs of any given size. We get linked 
> nodes (depending on the requested size) in a contiguous address range.
> For example, for a 768-page request, we get a contiguous address range 
> split across 2 nodes of 512 + 256 pages.
>
> It works by passing the fpfn and the requested size; the backend handler 
> calculates the lpfn as fpfn + size.
> The drawbacks here are that we are not handling a specific lpfn value (as 
> of now it is calculated as fpfn + requested size) and are not following 
> the pages_per_node rule.
>
> Please let me know if this won't work for all specific fpfn / lpfn 
> cases

From your description, that sounds like it won't work at all for any case.

See the fpfn/lpfn specifies the range of allocation. For the most common case 
that's either 0..visible_vram or 0..start_of_some_hw_limitation.

When you always try to allocate the range from 0 you will quickly find that you 
clash with existing allocations.

What you need to do in general is to have a drm_buddy_alloc() which is able to 
limit the returned page to the desired range fpfn..lpfn.

>> +
>> +do {
>> +unsigned int order;
>> +
>> +order = fls(n_pages) - 1;
>> +BUG_ON(order > mm->max_order);
>> +
>> +spin_lock(>lock);
>> +block = drm_buddy_alloc(mm, order, 
>> bar_limit_enabled,
>> +
>> visible_pfn, mode);
> That doesn't seem to make much sense either. The backend allocator should not 
> care about the BAR size nor the visible_pfn.
>
> Arun - we are sending the BAR limit enable information (in the case of an APU 
> or a large BAR, we take a different approach) and the visible_pfn information.
>
> In case bar_limit_enabled is true, I thought visible_pfn was required 
> for the backend allocator to compare with the block start address and 
> find the desired blocks for the TOP-DOWN and BOTTOM-UP approaches (TOP-DOWN - 
> return blocks higher than the visible_pfn limit, BOTTOM-UP - return blocks 
> lower than the visible_pfn limit).
>
> In case bar_limit_enabled is false, we just return the top-most ordered 
> blocks and bottom-most blocks for TOP-DOWN and BOTTOM-UP respectively 
> (suitable for the APU and large-BAR cases).
>
> Please let me know if we have another way to fix this problem.

That is the completely wrong approach. The backend must not care about the BAR 
configuration and visibility of the VRAM.

What it should do instead is to take the fpfn..lpfn range into account and make 
sure that all allocated pages are in the desired range.

BOTTOM-UP vs. TOP-DOWN then just optimizes the algorithm because we insert all 
freed up TOP-DOWN pages at the end and all BOTTOM-UP pages at the 

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: remove unneeded clflush calls

2021-09-22 Thread Lucas De Marchi

On Tue, Sep 21, 2021 at 04:06:00PM +0300, Ville Syrjälä wrote:

On Mon, Sep 20, 2021 at 10:47:08PM -0700, Lucas De Marchi wrote:

On Wed, Sep 15, 2021 at 12:29:12PM -0700, John Harrison wrote:
>On 9/15/2021 12:24, Belgaumkar, Vinay wrote:
>>On 9/14/2021 12:51 PM, Lucas De Marchi wrote:
>>>The clflush calls here aren't doing anything since we are not writing
>>>something and flushing the cache lines to be visible to GuC. Here the
>>>intention seems to be to make sure whatever GuC has written is visible
>>>to the CPU before we read them. However a clflush from the CPU side is
>>>the wrong instruction to use.
>Is there a right instruction to use? Either we need to verify that no

how can there be a right instruction? If the GuC needs to flush, then
the GuC needs to do it, nothing to be done by the CPU.

Flushing the CPU cache line here does nothing to guarantee that what
was written by the GuC hit memory before we read it. Not sure why it
was actually added, but since it was added by Vinay and he reviewed this
patch, I'm assuming he also agrees.


clflush == writeback + invalidate. The invalidate is the important part
when the CPU has to read something written by something else that's not
cache coherent.


Although the invalidate would be the important part, how would that work
if there is still a flush? Wouldn't we be overriding whatever
was written by the other side? Or are we using the fact that we
shouldn't be writing to this cacheline so we know it's not dirty?



Now, I have no idea if the guc has its own (CPU invisible) caches or not.
If it does then it will need to trigger a writeback. But regardless, if
the guc bypasses the CPU caches the CPU will need to invalidate before
it reads anything in case it has stale data sitting in its cache.


Indeed, thanks... but another case would be if caches are coherent
through snoop.  Do you know what is the cache architecture with GuC
and CPU?

Another question comes to mind, but first some context: I'm looking
at this in order to support other archs besides x86... the only
platforms in which this would be relevant would be on the discrete ones
(I'm currently running an arm64 guest on qemu and using pci
passthrough). I see that for dgfx intel_guc_allocate_vma() uses
i915_gem_object_create_lmem() instead of i915_gem_object_create_shmem().
Would that make a difference?

thanks
Lucas De Marchi


Re: [RFC PATCH 3/4] rpmsg: Add support of AI Processor Unit (APU)

2021-09-22 Thread Bjorn Andersson
On Fri 17 Sep 07:59 CDT 2021, Alexandre Bailon wrote:

> Some Mediatek SoCs provide a hardware accelerator for AI / ML.
> This driver uses the DRM driver to manage the shared memory,
> and uses rpmsg to execute jobs on the APU.
> 
> Signed-off-by: Alexandre Bailon 
> ---
>  drivers/rpmsg/Kconfig |  10 +++
>  drivers/rpmsg/Makefile|   1 +
>  drivers/rpmsg/apu_rpmsg.c | 184 ++
>  3 files changed, 195 insertions(+)
>  create mode 100644 drivers/rpmsg/apu_rpmsg.c
> 
> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> index 0b4407abdf138..fc1668f795004 100644
> --- a/drivers/rpmsg/Kconfig
> +++ b/drivers/rpmsg/Kconfig
> @@ -73,4 +73,14 @@ config RPMSG_VIRTIO
>   select RPMSG_NS
>   select VIRTIO
>  
> +config RPMSG_APU
> + tristate "APU RPMSG driver"
> + select REMOTEPROC
> + select RPMSG_VIRTIO
> + select DRM_APU
> + help
> +   This provides an RPMSG driver with facilities to
> +   communicate with an accelerated processing unit (APU).
> +   It uses the APU DRM driver to manage memory and job scheduling.

Similar to how a driver for e.g. an I2C device doesn't live in
drivers/i2c, this doesn't belong in drivers/rpmsg. Probably rather
directly in the DRM driver.

> +
>  endmenu
> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> index 8d452656f0ee3..8b336b9a817c1 100644
> --- a/drivers/rpmsg/Makefile
> +++ b/drivers/rpmsg/Makefile
> @@ -9,3 +9,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
>  obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
>  obj-$(CONFIG_RPMSG_QCOM_SMD) += qcom_smd.o
>  obj-$(CONFIG_RPMSG_VIRTIO)   += virtio_rpmsg_bus.o
> +obj-$(CONFIG_RPMSG_APU)  += apu_rpmsg.o
> diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
> new file mode 100644
> index 0..7e504bd176a4d
> --- /dev/null
> +++ b/drivers/rpmsg/apu_rpmsg.c
> @@ -0,0 +1,184 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright 2020 BayLibre SAS
> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#include "rpmsg_internal.h"
> +
> +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
> +
> +struct rpmsg_apu {
> + struct apu_core *core;
> + struct rpmsg_device *rpdev;
> +};
> +
> +static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int 
> count,
> +   void *priv, u32 addr)
> +{
> + struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
> + struct apu_core *apu_core = apu->core;
> +
> + return apu_drm_callback(apu_core, data, count);
> +}
> +
> +static int apu_rpmsg_send(struct apu_core *apu_core, void *data, int len)
> +{
> + struct rpmsg_apu *apu = apu_drm_priv(apu_core);
> + struct rpmsg_device *rpdev = apu->rpdev;
> +
> + return rpmsg_send(rpdev->ept, data, len);

The rpmsg API is exposed outside drivers/rpmsg, so as I said above, just
implement this directly in your driver, no need to lug around a dummy
wrapper for things like this.

> +}
> +
> +static struct apu_drm_ops apu_rpmsg_ops = {
> + .send = apu_rpmsg_send,
> +};
> +
> +static int apu_init_iovad(struct rproc *rproc, struct rpmsg_apu *apu)
> +{
> + struct resource_table *table;
> + struct fw_rsc_carveout *rsc;
> + int i;
> +
> + if (!rproc->table_ptr) {
> + dev_err(&apu->rpdev->dev,
> + "No resource_table: has the firmware been loaded?\n");
> + return -ENODEV;
> + }
> +
> + table = rproc->table_ptr;
> + for (i = 0; i < table->num; i++) {
> + int offset = table->offset[i];
> + struct fw_rsc_hdr *hdr = (void *)table + offset;
> +
> + if (hdr->type != RSC_CARVEOUT)
> + continue;
> +
> + rsc = (void *)hdr + sizeof(*hdr);
> + if (apu_drm_reserve_iova(apu->core, rsc->da, rsc->len)) {
> + dev_err(&apu->rpdev->dev,
> + "failed to reserve iova\n");
> + return -ENOMEM;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
> +{
> + /*
> +  * To work, the APU RPMsg driver needs to get the rproc device.
> +  * Currently, we only use virtio so we could use that to find the
> +  * remoteproc parent.
> +  */
> + if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
> + dev_err(&rpdev->dev, "invalid rpmsg device\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
> + dev_err(&rpdev->dev, "unsupported bus\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
> +}
> +
> +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
> 

Re: [PATCH] vgaarb: Use ACPI HID name to find integrated GPU

2021-09-22 Thread Kai-Heng Feng
On Sat, Sep 18, 2021 at 12:55 AM Bjorn Helgaas  wrote:
>
> On Fri, Sep 17, 2021 at 11:49:45AM +0800, Kai-Heng Feng wrote:
> > On Fri, Sep 17, 2021 at 12:38 AM Bjorn Helgaas  wrote:
> > >
> > > [+cc Huacai, linux-pci]
> > >
> > > On Wed, May 19, 2021 at 09:57:23PM +0800, Kai-Heng Feng wrote:
> > > > Commit 3d42f1ddc47a ("vgaarb: Keep adding VGA device in queue") assumes
> > > > the first device is an integrated GPU. However, on AMD platforms an
> > > > integrated GPU can have higher PCI device number than a discrete GPU.
> > > >
> > > > Integrated GPU on ACPI platform generally has _DOD and _DOS method, so
> > > > use that as predicate to find integrated GPU. If the new strategy
> > > > doesn't work, fallback to use the first device as boot VGA.
> > > >
> > > > Signed-off-by: Kai-Heng Feng 
> > > > ---
> > > >  drivers/gpu/vga/vgaarb.c | 31 ++-
> > > >  1 file changed, 26 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
> > > > index 5180c5687ee5..949fde433ea2 100644
> > > > --- a/drivers/gpu/vga/vgaarb.c
> > > > +++ b/drivers/gpu/vga/vgaarb.c
> > > > @@ -50,6 +50,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >
> > > >  #include 
> > > >
> > > > @@ -1450,9 +1451,23 @@ static struct miscdevice vga_arb_device = {
> > > >   MISC_DYNAMIC_MINOR, "vga_arbiter", &vga_arb_device_fops
> > > >  };
> > > >
> > > > +#if defined(CONFIG_ACPI)
> > > > +static bool vga_arb_integrated_gpu(struct device *dev)
> > > > +{
> > > > + struct acpi_device *adev = ACPI_COMPANION(dev);
> > > > +
> > > > + return adev && !strcmp(acpi_device_hid(adev), ACPI_VIDEO_HID);
> > > > +}
> > > > +#else
> > > > +static bool vga_arb_integrated_gpu(struct device *dev)
> > > > +{
> > > > + return false;
> > > > +}
> > > > +#endif
> > > > +
> > > >  static void __init vga_arb_select_default_device(void)
> > > >  {
> > > > - struct pci_dev *pdev;
> > > > + struct pci_dev *pdev, *found = NULL;
> > > >   struct vga_device *vgadev;
> > > >
> > > >  #if defined(CONFIG_X86) || defined(CONFIG_IA64)
> > > > @@ -1505,20 +1520,26 @@ static void __init 
> > > > vga_arb_select_default_device(void)
> > > >  #endif
> > > >
> > > >   if (!vga_default_device()) {
> > > > - list_for_each_entry(vgadev, &vga_list, list) {
> > > > + list_for_each_entry_reverse(vgadev, &vga_list, list) {
> > >
> > > Hi Kai-Heng, do you remember why you changed the order of this list
> > > traversal?
> >
> > The descending order is to keep the original behavior.
> >
> > Before this patch, it breaks out of the loop as early as possible, so
> > the lower numbered device is picked.
> > This patch makes it only break out of the loop when ACPI_VIDEO_HID
> > device is found.
> > So if there are more than one device that meet "cmd & (PCI_COMMAND_IO
> > | PCI_COMMAND_MEMORY)", higher numbered device will be selected.
> > So the traversal order reversal is to keep the original behavior.
>
> Can you give an example of what you mean?  I don't quite follow how it
> keeps the original behavior.
>
> If we have this:
>
>   0  PCI_COMMAND_MEMORY set   ACPI_VIDEO_HID
>   1  PCI_COMMAND_MEMORY set   ACPI_VIDEO_HID
>
> Previously we didn't look for ACPI_VIDEO_HID, so we chose 0, now we
> choose 1, which seems wrong.  In the absence of other information, I
> would prefer the lower-numbered device.
>
> Or this:
>
>   0  PCI_COMMAND_MEMORY set
>   1  PCI_COMMAND_MEMORY set   ACPI_VIDEO_HID
>
> Previously we chose 0; now we choose 1, which does seem right, but
> we'd choose 1 regardless of the order.
>
> Or this:
>
>   0  PCI_COMMAND_MEMORY set   ACPI_VIDEO_HID
>   1  PCI_COMMAND_MEMORY set
>
> Previously we chose 0, now we still choose 0, which seems right but
> again doesn't depend on the order.
>
> The first case, where both devices are ACPI_VIDEO_HID, is the only one
> where the order matters, and I suggest that we should be using the
> original order, not the reversed order.

Consider this:
0  PCI_COMMAND_MEMORY set
1  PCI_COMMAND_MEMORY set

Originally, device 0 would be picked. If the forward traversal order were
kept, device 1 would be selected instead, because neither of them passes
vga_arb_integrated_gpu().

Kai-Heng

>
> > > I guess the list_add_tail() in vga_arbiter_add_pci_device() means
> > > vga_list is generally ordered with small device numbers first and
> > > large ones last.
> > >
> > > So you pick the integrated GPU with the largest device number.  Are
> > > there systems with more than one integrated GPU?  If so, I would
> > > naively expect that in the absence of an indication otherwise, we'd
> > > want the one with the *smallest* device number.
> >
> > There's only one integrated GPU on the affected system.
> >
> > The approach is to keep the list traversal in one pass.
> > Is there any regression introduce by this patch?
> > If that's the case, we can separate the logic and find the
> > ACPI_VIDEO_HID in second pass.
>
> No 

Re: [Intel-gfx] [PATCH 15/27] drm/i915/guc: Implement multi-lrc submission

2021-09-22 Thread Matthew Brost
On Wed, Sep 22, 2021 at 01:15:46PM -0700, John Harrison wrote:
> On 9/22/2021 09:25, Matthew Brost wrote:
> > On Mon, Sep 20, 2021 at 02:48:52PM -0700, John Harrison wrote:
> > > On 8/20/2021 15:44, Matthew Brost wrote:
> > > > Implement multi-lrc submission via a single workqueue entry and single
> > > > H2G. The workqueue entry contains an updated tail value for each
> > > > request, of all the contexts in the multi-lrc submission, and updates
> > > > these values simultaneously. As such, the tasklet and bypass path have
> > > > been updated to coalesce requests into a single submission.
> > > > 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc.c|  21 ++
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc.h|   8 +
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  24 +-
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   6 +-
> > > >.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 312 
> > > > +++---
> > > >drivers/gpu/drm/i915/i915_request.h   |   8 +
> > > >6 files changed, 317 insertions(+), 62 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > > > index fbfcae727d7f..879aef662b2e 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > > > @@ -748,3 +748,24 @@ void intel_guc_load_status(struct intel_guc *guc, 
> > > > struct drm_printer *p)
> > > > }
> > > > }
> > > >}
> > > > +
> > > > +void intel_guc_write_barrier(struct intel_guc *guc)
> > > > +{
> > > > +   struct intel_gt *gt = guc_to_gt(guc);
> > > > +
> > > > +   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > > > +   GEM_BUG_ON(guc->send_regs.fw_domains);
> > > Granted, this patch is just moving code from one file to another not
> > > changing it. However, I think it would be worth adding a blank line in 
> > > here.
> > > Otherwise the 'this register' comment below can be confusingly read as
> > > referring to the send_regs.fw_domain entry above.
> > > 
> > > And maybe add a comment why it is a bug for the send_regs value to be set?
> > > I'm not seeing any obvious connection between it and the rest of this 
> > > code.
> > > 
> > Can add a blank line. I think the GEM_BUG_ON relates to being able to
> > use intel_uncore_write_fw vs intel_uncore_write. Can add comment.
> > 
> > > > +   /*
> > > > +* This register is used by the i915 and GuC for MMIO 
> > > > based
> > > > +* communication. Once we are in this code CTBs are the 
> > > > only
> > > > +* method the i915 uses to communicate with the GuC so 
> > > > it is
> > > > +* safe to write to this register (a value of 0 is NOP 
> > > > for MMIO
> > > > +* communication). If we ever start mixing CTBs and 
> > > > MMIOs a new
> > > > +* register will have to be chosen.
> > > > +*/
> > > > +   intel_uncore_write_fw(gt->uncore, 
> > > > GEN11_SOFT_SCRATCH(0), 0);
> > > > +   } else {
> > > > +   /* wmb() sufficient for a barrier if in smem */
> > > > +   wmb();
> > > > +   }
> > > > +}
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > index 3f95b1b4f15c..0ead2406d03c 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > @@ -37,6 +37,12 @@ struct intel_guc {
> > > > /* Global engine used to submit requests to GuC */
> > > > struct i915_sched_engine *sched_engine;
> > > > struct i915_request *stalled_request;
> > > > +   enum {
> > > > +   STALL_NONE,
> > > > +   STALL_REGISTER_CONTEXT,
> > > > +   STALL_MOVE_LRC_TAIL,
> > > > +   STALL_ADD_REQUEST,
> > > > +   } submission_stall_reason;
> > > > /* intel_guc_recv interrupt related state */
> > > > spinlock_t irq_lock;
> > > > @@ -332,4 +338,6 @@ void intel_guc_submission_cancel_requests(struct 
> > > > intel_guc *guc);
> > > >void intel_guc_load_status(struct intel_guc *guc, struct drm_printer 
> > > > *p);
> > > > +void intel_guc_write_barrier(struct intel_guc *guc);
> > > > +
> > > >#endif
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > index 20c710a74498..10d1878d2826 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > @@ -377,28 +377,6 @@ static u32 ct_get_next_fence(struct intel_guc_ct 
> > > > *ct)
> > > > return ++ct->requests.last_fence;
> > > >}
> > > > -static void write_barrier(struct intel_guc_ct *ct)
> > > > -{
> > > > -   struct intel_guc *guc = ct_to_guc(ct);
> > > 

Re: [RFC PATCH 2/4] DRM: Add support of AI Processor Unit (APU)

2021-09-22 Thread Dave Airlie
On Sat, 18 Sept 2021 at 07:57, Alexandre Bailon  wrote:
>
> Some Mediatek SoCs provide a hardware accelerator for AI / ML.
> This driver provides the infrastructure to manage memory
> shared between the host CPU and the accelerator, and to submit
> jobs to the accelerator.
> The APU itself is managed by remoteproc, so this driver
> relies on remoteproc to find the APU and get some important data
> from it. But the driver is quite generic and it should be possible
> to manage accelerators in other ways.
> This driver doesn't itself manage the data transmissions.
> It must be registered by another driver implementing the transmissions.
>
> Signed-off-by: Alexandre Bailon 
> ---
>  drivers/gpu/drm/Kconfig|   2 +
>  drivers/gpu/drm/Makefile   |   1 +
>  drivers/gpu/drm/apu/Kconfig|  10 +
>  drivers/gpu/drm/apu/Makefile   |   7 +
>  drivers/gpu/drm/apu/apu_drm_drv.c  | 238 +++
>  drivers/gpu/drm/apu/apu_gem.c  | 232 +++
>  drivers/gpu/drm/apu/apu_internal.h |  89 
>  drivers/gpu/drm/apu/apu_sched.c| 634 +
>  include/drm/apu_drm.h  |  59 +++
>  include/uapi/drm/apu_drm.h | 106 +
>  10 files changed, 1378 insertions(+)
>  create mode 100644 drivers/gpu/drm/apu/Kconfig
>  create mode 100644 drivers/gpu/drm/apu/Makefile
>  create mode 100644 drivers/gpu/drm/apu/apu_drm_drv.c
>  create mode 100644 drivers/gpu/drm/apu/apu_gem.c
>  create mode 100644 drivers/gpu/drm/apu/apu_internal.h
>  create mode 100644 drivers/gpu/drm/apu/apu_sched.c
>  create mode 100644 include/drm/apu_drm.h
>  create mode 100644 include/uapi/drm/apu_drm.h
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 8fc40317f2b77..bcdca35c9eda5 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -382,6 +382,8 @@ source "drivers/gpu/drm/xlnx/Kconfig"
>
>  source "drivers/gpu/drm/gud/Kconfig"
>
> +source "drivers/gpu/drm/apu/Kconfig"
> +
>  config DRM_HYPERV
> tristate "DRM Support for Hyper-V synthetic video device"
> depends on DRM && PCI && MMU && HYPERV
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index ad11121548983..f3d8432976558 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -127,4 +127,5 @@ obj-$(CONFIG_DRM_MCDE) += mcde/
>  obj-$(CONFIG_DRM_TIDSS) += tidss/
>  obj-y  += xlnx/
>  obj-y  += gud/
> +obj-$(CONFIG_DRM_APU) += apu/
>  obj-$(CONFIG_DRM_HYPERV) += hyperv/
> diff --git a/drivers/gpu/drm/apu/Kconfig b/drivers/gpu/drm/apu/Kconfig
> new file mode 100644
> index 0..c8471309a0351
> --- /dev/null
> +++ b/drivers/gpu/drm/apu/Kconfig
> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +
> +config DRM_APU
> +   tristate "APU (AI Processor Unit)"
> +   select REMOTEPROC
> +   select DRM_SCHED
> +   help
> + This provides a DRM driver that provides some facilities to
> + communicate with an accelerated processing unit (APU).
> diff --git a/drivers/gpu/drm/apu/Makefile b/drivers/gpu/drm/apu/Makefile
> new file mode 100644
> index 0..3e97846b091c9
> --- /dev/null
> +++ b/drivers/gpu/drm/apu/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +apu_drm-y += apu_drm_drv.o
> +apu_drm-y += apu_sched.o
> +apu_drm-y += apu_gem.o
> +
> +obj-$(CONFIG_DRM_APU) += apu_drm.o
> diff --git a/drivers/gpu/drm/apu/apu_drm_drv.c 
> b/drivers/gpu/drm/apu/apu_drm_drv.c
> new file mode 100644
> index 0..91d8c99e373c0
> --- /dev/null
> +++ b/drivers/gpu/drm/apu/apu_drm_drv.c
> @@ -0,0 +1,238 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright 2020 BayLibre SAS
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#include "apu_internal.h"
> +
> +static LIST_HEAD(apu_devices);
> +
> +static const struct drm_ioctl_desc ioctls[] = {
> +   DRM_IOCTL_DEF_DRV(APU_GEM_NEW, ioctl_gem_new,
> + DRM_RENDER_ALLOW),
> +   DRM_IOCTL_DEF_DRV(APU_GEM_QUEUE, ioctl_gem_queue,
> + DRM_RENDER_ALLOW),
> +   DRM_IOCTL_DEF_DRV(APU_GEM_DEQUEUE, ioctl_gem_dequeue,
> + DRM_RENDER_ALLOW),
> +   DRM_IOCTL_DEF_DRV(APU_GEM_IOMMU_MAP, ioctl_gem_iommu_map,
> + DRM_RENDER_ALLOW),
> +   DRM_IOCTL_DEF_DRV(APU_GEM_IOMMU_UNMAP, ioctl_gem_iommu_unmap,
> + DRM_RENDER_ALLOW),
> +   DRM_IOCTL_DEF_DRV(APU_STATE, ioctl_apu_state,
> + DRM_RENDER_ALLOW),
> +};
> +
> +DEFINE_DRM_GEM_CMA_FOPS(apu_drm_ops);
> +
> +static struct drm_driver apu_drm_driver = {
> +   .driver_features = DRIVER_GEM | DRIVER_SYNCOBJ,
> +   .name = "drm_apu",
> +   .desc = "APU DRM driver",
> +   .date = "20210319",
> +   .major = 1,
> +   .minor 

Re: [PATCH v2 3/3] drm/bridge: ti-sn65dsi86: Add NO_CONNECTOR support

2021-09-22 Thread Laurent Pinchart
Hi Rob,

Thank you for the patch.

On Mon, Sep 20, 2021 at 03:58:00PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Slightly awkward to fish out the display_info when we aren't creating
> our own connector.  But I don't see an obvious better way.
> 
> v2: Remove error return with NO_CONNECTOR flag
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 39 ---
>  1 file changed, 29 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 6154bed0af5b..94c94cc8a4d8 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -667,11 +667,6 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>  .node = NULL,
>};
>  
> - if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
> - DRM_ERROR("Fix bridge driver to make connector optional!");
> - return -EINVAL;
> - }
> -
>   pdata->aux.drm_dev = bridge->dev;
>   ret = drm_dp_aux_register(&pdata->aux);
>   if (ret < 0) {
> @@ -679,9 +674,11 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>   return ret;
>   }
>  
> - ret = ti_sn_bridge_connector_init(pdata);
> - if (ret < 0)
> - goto err_conn_init;
> + if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)) {
> + ret = ti_sn_bridge_connector_init(pdata);
> + if (ret < 0)
> + goto err_conn_init;
> + }
>  
>   /*
>* TODO: ideally finding host resource and dsi dev registration needs
> @@ -743,7 +740,8 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>  err_dsi_attach:
>   mipi_dsi_device_unregister(dsi);
>  err_dsi_host:
> - drm_connector_cleanup(&pdata->connector);
> + if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR))
> + drm_connector_cleanup(>connector);

I wonder if we actually need this. The connector gets attached to the
encoder, won't it be destroyed by the DRM core in the error path ?

>  err_conn_init:
>   drm_dp_aux_unregister(&pdata->aux);
>   return ret;
> @@ -792,9 +790,30 @@ static void ti_sn_bridge_set_dsi_rate(struct 
> ti_sn65dsi86 *pdata)
>   regmap_write(pdata->regmap, SN_DSIA_CLK_FREQ_REG, val);
>  }
>  
> +/*
> + * Find the connector and fish out the bpc from display_info.  It would
> + * be nice if we could get this instead from drm_bridge_state, but that
> + * doesn't yet appear to be the case.

You already have a bus format in the bridge state, from which you can
derive the bpp. Could you give it a try ?

> + */
>  static unsigned int ti_sn_bridge_get_bpp(struct ti_sn65dsi86 *pdata)
>  {
> - if (pdata->connector.display_info.bpc <= 6)
> + struct drm_bridge *bridge = &pdata->bridge;
> + struct drm_connector_list_iter conn_iter;
> + struct drm_connector *connector;
> + unsigned bpc = 0;
> +
> + drm_connector_list_iter_begin(bridge->dev, &conn_iter);
> + drm_for_each_connector_iter(connector, &conn_iter) {
> + if (drm_connector_has_possible_encoder(connector, 
> bridge->encoder)) {
> + bpc = connector->display_info.bpc;
> + break;
> + }
> + }
> + drm_connector_list_iter_end(&conn_iter);
> +
> + WARN_ON(bpc == 0);
> +
> + if (bpc <= 6)
>   return 18;
>   else
>   return 24;

-- 
Regards,

Laurent Pinchart


Re: [PATCH 4/4] drm/bridge: ti-sn65dsi86: Add NO_CONNECTOR support

2021-09-22 Thread Laurent Pinchart
Hi Rob and Doug,

On Mon, Sep 20, 2021 at 11:32:02AM -0700, Rob Clark wrote:
> On Thu, Aug 12, 2021 at 1:08 PM Doug Anderson wrote:
> > On Thu, Aug 12, 2021 at 12:26 PM Laurent Pinchart wrote:
> > > On Wed, Aug 11, 2021 at 04:52:50PM -0700, Rob Clark wrote:
> > > > From: Rob Clark 
> > > >
> > > > Slightly awkward to fish out the display_info when we aren't creating
> > > > our own connector.  But I don't see an obvious better way.
> > >
> > > We need a bit more than this, to support the NO_CONNECTOR case, the
> > > bridge has to implement a few extra operations, and set the bridge .ops
> > > field. I've submitted two patches to do so a while ago:
> > >
> > > - [RFC PATCH 08/11] drm/bridge: ti-sn65dsi86: Implement bridge connector 
> > > operations ([1])
> >
> > Rob asked me about this over IRC, so if he left it out and it's needed
> > then it's my fault. However, I don't believe it's needed until your
> > series making this bridge chip support full DP. For the eDP case
> > the bridge chip driver in ToT no longer queries the EDID itself. It
> > simply provides an AUX bus to the panel driver and the panel driver
> > queries the EDID. I think that means we don't need to add
> > DRM_BRIDGE_OP_EDID, right?

That's right.

> > I was also wondering if in the full DP case we should actually model
> > the physical DP jack as a drm_bridge and have it work the same way. It
> > would get probed via the DP AUX bus just like a panel. I seem to
> > remember Stephen Boyd was talking about modeling the DP connector as a
> > drm_bridge because it would allow us to handle the fact that some TCPC
> > chips could only support HBR2 whereas others could support HBR3. Maybe
> > it would end up being a fairly elegant solution?

Physical connectors are already handled as bridges, see
drivers/gpu/drm/bridge/display-connector.c. I however don't think it
should handle EDID retrieval, because that's really not an operation
implemented by the connector itself.

> > > - [RFC PATCH 09/11] drm/bridge: ti-sn65dsi86: Make connector creation 
> > > optional ([2])
> > >
> > > The second patch is similar to the first half of this patch, but misses
> > > the cleanup code. I'll try to rebase this and resubmit, but it may take
> > > a bit of time.
> >
> > Whoops! You're right that Rob's patch won't work at all because we'll
> > just hit the "Fix bridge driver to make connector optional!" case. I
> > should have noticed that. :(
> 
> Yes, indeed.. once I fix that, I get no display..
> 
> Not sure if Laurent is still working on his series, otherwise I can
> try to figure out what bridge ops are missing

I am, but too slowly. I don't mind fast-tracking the changes you need
though.

-- 
Regards,

Laurent Pinchart


Re: [PATCH v3 6/6] drm: rcar-du: Add r8a779a0 device support

2021-09-22 Thread Kieran Bingham



On 23/09/2021 00:59, Laurent Pinchart wrote:
> Hi Kieran,
> 
> Thank you for the patch.
> 
> On Thu, Sep 23, 2021 at 12:47:26AM +0100, Kieran Bingham wrote:
>> From: Kieran Bingham 
>>
>> Extend the rcar_du_device_info structure and rcar_du_output enum to
>> support DSI outputs and utilise these additions to provide support for
>> the R8A779A0 V3U platform.
>>
>> While the DIDSR register field is now named "DSI/CSI-2-TX-IF0 Dot Clock
>> Select" the existing define LVDS0 is used, and is directly compatible
>> from other DU variants.
> 
> That's not true anymore :-) The paragraph can simply be dropped.
> 

Agreed.

>> Signed-off-by: Kieran Bingham 
>>
>> ---
>>
>> v3:
>>  - Introduce DIDSR_LDCS_DSI macro
>>
>> v2:
>>  - No longer requires a direct interface with the DSI encoder
>>  - Use correct field naming (LDCS)
>>  - Remove per-crtc clock feature.
>>
>>  drivers/gpu/drm/rcar-du/rcar_du_crtc.h  |  2 ++
>>  drivers/gpu/drm/rcar-du/rcar_du_drv.c   | 20 
>>  drivers/gpu/drm/rcar-du/rcar_du_drv.h   |  2 ++
>>  drivers/gpu/drm/rcar-du/rcar_du_group.c |  2 ++
>>  drivers/gpu/drm/rcar-du/rcar_du_regs.h  |  1 +
>>  5 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h 
>> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
>> index 440e6b4fbb58..26e79b74898c 100644
>> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
>> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
>> @@ -96,6 +96,8 @@ struct rcar_du_crtc_state {
>>  enum rcar_du_output {
>>  RCAR_DU_OUTPUT_DPAD0,
>>  RCAR_DU_OUTPUT_DPAD1,
>> +RCAR_DU_OUTPUT_DSI0,
>> +RCAR_DU_OUTPUT_DSI1,
>>  RCAR_DU_OUTPUT_HDMI0,
>>  RCAR_DU_OUTPUT_HDMI1,
>>  RCAR_DU_OUTPUT_LVDS0,
>> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
>> b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
>> index 8a094d5b9c77..8b4c8851b6bc 100644
>> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
>> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
>> @@ -489,6 +489,25 @@ static const struct rcar_du_device_info 
>> rcar_du_r8a7799x_info = {
>>  .lvds_clk_mask =  BIT(1) | BIT(0),
>>  };
>>  
>> +static const struct rcar_du_device_info rcar_du_r8a779a0_info = {
>> +.gen = 3,
>> +.features = RCAR_DU_FEATURE_CRTC_IRQ
>> +  | RCAR_DU_FEATURE_VSP1_SOURCE,
>> +.channels_mask = BIT(1) | BIT(0),
>> +.routes = {
>> +/* R8A779A0 has two MIPI DSI outputs. */
>> +[RCAR_DU_OUTPUT_DSI0] = {
>> +.possible_crtcs = BIT(0),
>> +.port = 0,
>> +},
>> +[RCAR_DU_OUTPUT_DSI1] = {
>> +.possible_crtcs = BIT(1),
>> +.port = 1,
>> +},
>> +},
>> +.dsi_clk_mask =  BIT(1) | BIT(0),
>> +};
>> +
>>  static const struct of_device_id rcar_du_of_table[] = {
>>  { .compatible = "renesas,du-r8a7742", .data = _du_r8a7790_info },
>>  { .compatible = "renesas,du-r8a7743", .data = _du_r8a7743_info },
>> @@ -513,6 +532,7 @@ static const struct of_device_id rcar_du_of_table[] = {
>>  { .compatible = "renesas,du-r8a77980", .data = _du_r8a77970_info },
>>  { .compatible = "renesas,du-r8a77990", .data = _du_r8a7799x_info },
>>  { .compatible = "renesas,du-r8a77995", .data = _du_r8a7799x_info },
>> +{ .compatible = "renesas,du-r8a779a0", .data = _du_r8a779a0_info },
>>  { }
>>  };
>>  
>> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.h 
>> b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
>> index 5fe9152454ff..cf98d43d72d0 100644
>> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.h
>> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
>> @@ -57,6 +57,7 @@ struct rcar_du_output_routing {
>>   * @routes: array of CRTC to output routes, indexed by output 
>> (RCAR_DU_OUTPUT_*)
>>   * @num_lvds: number of internal LVDS encoders
>>   * @dpll_mask: bit mask of DU channels equipped with a DPLL
>> + * @dsi_clk_mask: bitmask of channels that can use the DSI clock as dot 
>> clock
>>   * @lvds_clk_mask: bitmask of channels that can use the LVDS clock as dot 
>> clock
>>   */
>>  struct rcar_du_device_info {
>> @@ -67,6 +68,7 @@ struct rcar_du_device_info {
>>  struct rcar_du_output_routing routes[RCAR_DU_OUTPUT_MAX];
>>  unsigned int num_lvds;
>>  unsigned int dpll_mask;
>> +unsigned int dsi_clk_mask;
>>  unsigned int lvds_clk_mask;
>>  };
>>  
>> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c 
>> b/drivers/gpu/drm/rcar-du/rcar_du_group.c
>> index a984eef265d2..8665a1dd2186 100644
>> --- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
>> +++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
>> @@ -124,6 +124,8 @@ static void rcar_du_group_setup_didsr(struct 
>> rcar_du_group *rgrp)
>>  if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index))
>>  didsr |= DIDSR_LDCS_LVDS0(i)
>>|  DIDSR_PDCS_CLK(i, 0);
>> +else if (rcdu->info->dsi_clk_mask & BIT(rcrtc->index))
>> +didsr |= DIDSR_LDCS_DSI(i);
>>  else
>>

Re: [PATCH v2 2/3] drm/bridge: ti-sn65dsi86: Implement bridge->mode_valid()

2021-09-22 Thread Laurent Pinchart
Hi Rob,

Thank you for the patch.

On Mon, Sep 20, 2021 at 03:57:59PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> For the brave new world of bridges not creating their own connectors, we
> need to implement the max clock limitation via bridge->mode_valid()
> instead of connector->mode_valid().
> 
> v2: Drop unneeded connector->mode_valid()
> 
> Signed-off-by: Rob Clark 
> Reviewed-by: Douglas Anderson 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 25 +
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 41d48a393e7f..6154bed0af5b 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -615,20 +615,8 @@ static int ti_sn_bridge_connector_get_modes(struct 
> drm_connector *connector)
>   return drm_bridge_get_modes(pdata->next_bridge, connector);
>  }
>  
> -static enum drm_mode_status
> -ti_sn_bridge_connector_mode_valid(struct drm_connector *connector,
> -   struct drm_display_mode *mode)
> -{
> - /* maximum supported resolution is 4K at 60 fps */
> - if (mode->clock > 594000)
> - return MODE_CLOCK_HIGH;
> -
> - return MODE_OK;
> -}
> -
>  static struct drm_connector_helper_funcs ti_sn_bridge_connector_helper_funcs 
> = {
>   .get_modes = ti_sn_bridge_connector_get_modes,
> - .mode_valid = ti_sn_bridge_connector_mode_valid,
>  };
>  
>  static const struct drm_connector_funcs ti_sn_bridge_connector_funcs = {
> @@ -766,6 +754,18 @@ static void ti_sn_bridge_detach(struct drm_bridge 
> *bridge)
>   drm_dp_aux_unregister(&bridge_to_ti_sn65dsi86(bridge)->aux);
>  }
>  
> +static enum drm_mode_status
> +ti_sn_bridge_mode_valid(struct drm_bridge *bridge,
> + const struct drm_display_info *info,
> + const struct drm_display_mode *mode)
> +{
> + /* maximum supported resolution is 4K at 60 fps */
> + if (mode->clock > 594000)
> + return MODE_CLOCK_HIGH;
> +
> + return MODE_OK;
> +}
> +
>  static void ti_sn_bridge_disable(struct drm_bridge *bridge)
>  {
>   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
> @@ -1127,6 +1127,7 @@ static void ti_sn_bridge_post_disable(struct drm_bridge 
> *bridge)
>  static const struct drm_bridge_funcs ti_sn_bridge_funcs = {
>   .attach = ti_sn_bridge_attach,
>   .detach = ti_sn_bridge_detach,
> + .mode_valid = ti_sn_bridge_mode_valid,
>   .pre_enable = ti_sn_bridge_pre_enable,
>   .enable = ti_sn_bridge_enable,
>   .disable = ti_sn_bridge_disable,

-- 
Regards,

Laurent Pinchart


[PATCH] drm/i915/uncore: fwtable read handlers are now used on all forcewake platforms

2021-09-22 Thread Matt Roper
With the recent refactor of the uncore mmio handling, all
forcewake-based platforms (i.e., graphics version 6 and beyond) now use
the 'fwtable' read handlers.  Let's pull the assignment out of the
per-platform if/else ladder to make this more obvious.

Suggested-by: Tvrtko Ursulin 
Suggested-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index c8e7c71f0896..678a99de07fe 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2088,49 +2088,42 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
return ret;
forcewake_early_sanitize(uncore, 0);
 
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
+
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, dg2_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_CHERRYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __chv_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 8) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 6, 7)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
}
 
uncore->pmic_bus_access_nb.notifier_call = 
i915_pmic_bus_access_notifier;
-- 
2.33.0



Re: [PATCH 1/3] drm/bridge: Add a function to abstract away panels

2021-09-22 Thread Laurent Pinchart
Hi Maxime,

Thank you for the patch.

I know this has already been merged, but I have a question.

On Fri, Sep 10, 2021 at 03:09:39PM +0200, Maxime Ripard wrote:
> Display drivers so far need to have a lot of boilerplate to first
> retrieve either the panel or bridge that they are connected to using
> drm_of_find_panel_or_bridge(), and then either deal with each with ad-hoc
> functions or create a drm panel bridge through drm_panel_bridge_add.
> 
> In order to reduce the boilerplate and hopefully create a path of least
> resistance towards using the DRM panel bridge layer, let's create the
> function devm_drm_of_get_next to reduce that boilerplate.
>
> Signed-off-by: Maxime Ripard 
> ---
>  drivers/gpu/drm/drm_bridge.c | 42 
>  drivers/gpu/drm/drm_of.c |  3 +++
>  include/drm/drm_bridge.h |  2 ++
>  3 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
> index a8ed66751c2d..10ddca4638b0 100644
> --- a/drivers/gpu/drm/drm_bridge.c
> +++ b/drivers/gpu/drm/drm_bridge.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "drm_crtc_internal.h"
> @@ -51,10 +52,8 @@
>   *
>   * Display drivers are responsible for linking encoders with the first bridge
>   * in the chains. This is done by acquiring the appropriate bridge with
> - * of_drm_find_bridge() or drm_of_find_panel_or_bridge(), or creating it for 
> a
> - * panel with drm_panel_bridge_add_typed() (or the managed version
> - * devm_drm_panel_bridge_add_typed()). Once acquired, the bridge shall be
> - * attached to the encoder with a call to drm_bridge_attach().
> + * devm_drm_of_get_bridge(). Once acquired, the bridge shall be attached to 
> the
> + * encoder with a call to drm_bridge_attach().
>   *
>   * Bridges are responsible for linking themselves with the next bridge in the
>   * chain, if any. This is done the same way as for encoders, with the call to
> @@ -1233,6 +1232,41 @@ struct drm_bridge *of_drm_find_bridge(struct 
> device_node *np)
>   return NULL;
>  }
>  EXPORT_SYMBOL(of_drm_find_bridge);
> +
> +/**
> + * devm_drm_of_get_bridge - Return next bridge in the chain
> + * @dev: device to tie the bridge lifetime to
> + * @np: device tree node containing encoder output ports
> + * @port: port in the device tree node
> + * @endpoint: endpoint in the device tree node
> + *
> + * Given a DT node's port and endpoint number, finds the connected node
> + * and returns the associated bridge if any, or creates and returns a
> + * drm panel bridge instance if a panel is connected.
> + *
> + * Returns a pointer to the bridge if successful, or an error pointer
> + * otherwise.
> + */
> +struct drm_bridge *devm_drm_of_get_bridge(struct device *dev,
> +   struct device_node *np,
> +   unsigned int port,
> +   unsigned int endpoint)
> +{
> + struct drm_bridge *bridge;
> + struct drm_panel *panel;
> + int ret;
> +
> + ret = drm_of_find_panel_or_bridge(np, port, endpoint,
> +   &panel, &bridge);
> + if (ret)
> + return ERR_PTR(ret);
> +
> + if (panel)
> + bridge = devm_drm_panel_bridge_add(dev, panel);
> +
> + return bridge;

I really like the idea, I've wanted to do something like this for a long
time. I however wonder if this is the best approach, or if we could get
the panel core to register the bridge itself. The part that bothers me
here is the asymmetry in the lifetime of the bridges: the returned
pointer is either looked up or allocated.

Bridge lifetime is such a mess that it may not make a big difference,
but eventually we'll have to address that problem globally.

> +}
> +EXPORT_SYMBOL(devm_drm_of_get_bridge);
>  #endif
>  
>  MODULE_AUTHOR("Ajay Kumar ");
> diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
> index 997b8827fed2..37c34146eea8 100644
> --- a/drivers/gpu/drm/drm_of.c
> +++ b/drivers/gpu/drm/drm_of.c
> @@ -231,6 +231,9 @@ EXPORT_SYMBOL_GPL(drm_of_encoder_active_endpoint);
>   * return either the associated struct drm_panel or drm_bridge device. Either
>   * @panel or @bridge must not be NULL.
>   *
> + * This function is deprecated and should not be used in new drivers. Use
> + * devm_drm_of_get_bridge() instead.
> + *
>   * Returns zero if successful, or one of the standard error codes if it 
> fails.
>   */
>  int drm_of_find_panel_or_bridge(const struct device_node *np,
> diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
> index 46bdfa48c413..f70c88ca96ef 100644
> --- a/include/drm/drm_bridge.h
> +++ b/include/drm/drm_bridge.h
> @@ -911,6 +911,8 @@ struct drm_bridge *devm_drm_panel_bridge_add(struct 
> device *dev,
>  struct drm_bridge *devm_drm_panel_bridge_add_typed(struct device *dev,
>  struct drm_panel 

Re: [PATCH v3 6/6] drm: rcar-du: Add r8a779a0 device support

2021-09-22 Thread Laurent Pinchart
Hi Kieran,

Thank you for the patch.

On Thu, Sep 23, 2021 at 12:47:26AM +0100, Kieran Bingham wrote:
> From: Kieran Bingham 
> 
> Extend the rcar_du_device_info structure and rcar_du_output enum to
> support DSI outputs and utilise these additions to provide support for
> the R8A779A0 V3U platform.
> 
> While the DIDSR register field is now named "DSI/CSI-2-TX-IF0 Dot Clock
> Select" the existing define LVDS0 is used, and is directly compatible
> from other DU variants.

That's not true anymore :-) The paragraph can simply be dropped.

> Signed-off-by: Kieran Bingham 
> 
> ---
> 
> v3:
>  - Introduce DIDSR_LDCS_DSI macro
> 
> v2:
>  - No longer requires a direct interface with the DSI encoder
>  - Use correct field naming (LDCS)
>  - Remove per-crtc clock feature.
> 
>  drivers/gpu/drm/rcar-du/rcar_du_crtc.h  |  2 ++
>  drivers/gpu/drm/rcar-du/rcar_du_drv.c   | 20 
[PATCH v3 3/6] drm: rcar-du: Only initialise TVM_TVSYNC mode when supported

2021-09-22 Thread Kieran Bingham
The R-Car DU as found on the D3, E3, and V3U does not support an
external synchronisation method.

In these cases, the cached dsysr register should not be initialised to
DSYSR_TVM_TVSYNC, but should instead be left clear so that
DSYSR_TVM_MASTER is configured by default.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 

---
v2:
 - Remove parenthesis

 drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c 
b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
index ea7e39d03545..a0f837e8243a 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -1243,7 +1243,10 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, 
unsigned int swindex,
rcrtc->group = rgrp;
rcrtc->mmio_offset = mmio_offsets[hwindex];
rcrtc->index = hwindex;
-   rcrtc->dsysr = (rcrtc->index % 2 ? 0 : DSYSR_DRES) | DSYSR_TVM_TVSYNC;
+   rcrtc->dsysr = rcrtc->index % 2 ? 0 : DSYSR_DRES;
+
+   if (rcar_du_has(rcdu, RCAR_DU_FEATURE_TVM_SYNC))
+   rcrtc->dsysr |= DSYSR_TVM_TVSYNC;
 
if (rcar_du_has(rcdu, RCAR_DU_FEATURE_VSP1_SOURCE))
primary = >vsp->planes[rcrtc->vsp_pipe].plane;
-- 
2.30.2



[PATCH v3 4/6] drm: rcar-du: Fix DIDSR field name

2021-09-22 Thread Kieran Bingham
The DIDSR fields named LDCS were incorrectly defined as LCDS.
Both the Gen2 and Gen3 documentation refer to the fields as the "LVDS
Dot Clock Select".

Correct the definitions.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 

---
v2:
 - New patch

v3:
 - Collect tag

 drivers/gpu/drm/rcar-du/rcar_du_group.c | 4 ++--
 drivers/gpu/drm/rcar-du/rcar_du_regs.h  | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c 
b/drivers/gpu/drm/rcar-du/rcar_du_group.c
index 88a783ceb3e9..a984eef265d2 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
@@ -122,10 +122,10 @@ static void rcar_du_group_setup_didsr(struct 
rcar_du_group *rgrp)
didsr = DIDSR_CODE;
for (i = 0; i < num_crtcs; ++i, ++rcrtc) {
if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index))
-   didsr |= DIDSR_LCDS_LVDS0(i)
+   didsr |= DIDSR_LDCS_LVDS0(i)
  |  DIDSR_PDCS_CLK(i, 0);
else
-   didsr |= DIDSR_LCDS_DCLKIN(i)
+   didsr |= DIDSR_LDCS_DCLKIN(i)
  |  DIDSR_PDCS_CLK(i, 0);
}
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_regs.h 
b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
index fb9964949368..fb7c467aa484 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_regs.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
@@ -257,10 +257,10 @@
 
 #define DIDSR  0x20028
 #define DIDSR_CODE (0x7790 << 16)
-#define DIDSR_LCDS_DCLKIN(n)   (0 << (8 + (n) * 2))
-#define DIDSR_LCDS_LVDS0(n)(2 << (8 + (n) * 2))
-#define DIDSR_LCDS_LVDS1(n)(3 << (8 + (n) * 2))
-#define DIDSR_LCDS_MASK(n) (3 << (8 + (n) * 2))
+#define DIDSR_LDCS_DCLKIN(n)   (0 << (8 + (n) * 2))
+#define DIDSR_LDCS_LVDS0(n)(2 << (8 + (n) * 2))
+#define DIDSR_LDCS_LVDS1(n)(3 << (8 + (n) * 2))
+#define DIDSR_LDCS_MASK(n) (3 << (8 + (n) * 2))
 #define DIDSR_PDCS_CLK(n, clk) (clk << ((n) * 2))
 #define DIDSR_PDCS_MASK(n) (3 << ((n) * 2))
 
-- 
2.30.2



[PATCH v3 6/6] drm: rcar-du: Add r8a779a0 device support

2021-09-22 Thread Kieran Bingham
From: Kieran Bingham 

Extend the rcar_du_device_info structure and rcar_du_output enum to
support DSI outputs and utilise these additions to provide support for
the R8A779A0 V3U platform.

While the DIDSR register field is now named "DSI/CSI-2-TX-IF0 Dot Clock
Select", the new DIDSR_LDCS_DSI define reuses the encoding of the
existing LVDS0 value, which is directly compatible with the other DU
variants.

Signed-off-by: Kieran Bingham 

---

v3:
 - Introduce DIDSR_LDCS_DSI macro

v2:
 - No longer requires a direct interface with the DSI encoder
 - Use correct field naming (LDCS)
 - Remove per-crtc clock feature.

 drivers/gpu/drm/rcar-du/rcar_du_crtc.h  |  2 ++
 drivers/gpu/drm/rcar-du/rcar_du_drv.c   | 20 
 drivers/gpu/drm/rcar-du/rcar_du_drv.h   |  2 ++
 drivers/gpu/drm/rcar-du/rcar_du_group.c |  2 ++
 drivers/gpu/drm/rcar-du/rcar_du_regs.h  |  1 +
 5 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h 
b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index 440e6b4fbb58..26e79b74898c 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -96,6 +96,8 @@ struct rcar_du_crtc_state {
 enum rcar_du_output {
RCAR_DU_OUTPUT_DPAD0,
RCAR_DU_OUTPUT_DPAD1,
+   RCAR_DU_OUTPUT_DSI0,
+   RCAR_DU_OUTPUT_DSI1,
RCAR_DU_OUTPUT_HDMI0,
RCAR_DU_OUTPUT_HDMI1,
RCAR_DU_OUTPUT_LVDS0,
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index 8a094d5b9c77..8b4c8851b6bc 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -489,6 +489,25 @@ static const struct rcar_du_device_info 
rcar_du_r8a7799x_info = {
.lvds_clk_mask =  BIT(1) | BIT(0),
 };
 
+static const struct rcar_du_device_info rcar_du_r8a779a0_info = {
+   .gen = 3,
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_VSP1_SOURCE,
+   .channels_mask = BIT(1) | BIT(0),
+   .routes = {
+   /* R8A779A0 has two MIPI DSI outputs. */
+   [RCAR_DU_OUTPUT_DSI0] = {
+   .possible_crtcs = BIT(0),
+   .port = 0,
+   },
+   [RCAR_DU_OUTPUT_DSI1] = {
+   .possible_crtcs = BIT(1),
+   .port = 1,
+   },
+   },
+   .dsi_clk_mask =  BIT(1) | BIT(0),
+};
+
 static const struct of_device_id rcar_du_of_table[] = {
{ .compatible = "renesas,du-r8a7742", .data = _du_r8a7790_info },
{ .compatible = "renesas,du-r8a7743", .data = _du_r8a7743_info },
@@ -513,6 +532,7 @@ static const struct of_device_id rcar_du_of_table[] = {
{ .compatible = "renesas,du-r8a77980", .data = _du_r8a77970_info },
{ .compatible = "renesas,du-r8a77990", .data = _du_r8a7799x_info },
{ .compatible = "renesas,du-r8a77995", .data = _du_r8a7799x_info },
+   { .compatible = "renesas,du-r8a779a0", .data = _du_r8a779a0_info },
{ }
 };
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.h 
b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
index 5fe9152454ff..cf98d43d72d0 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
@@ -57,6 +57,7 @@ struct rcar_du_output_routing {
  * @routes: array of CRTC to output routes, indexed by output 
(RCAR_DU_OUTPUT_*)
  * @num_lvds: number of internal LVDS encoders
  * @dpll_mask: bit mask of DU channels equipped with a DPLL
+ * @dsi_clk_mask: bitmask of channels that can use the DSI clock as dot clock
  * @lvds_clk_mask: bitmask of channels that can use the LVDS clock as dot clock
  */
 struct rcar_du_device_info {
@@ -67,6 +68,7 @@ struct rcar_du_device_info {
struct rcar_du_output_routing routes[RCAR_DU_OUTPUT_MAX];
unsigned int num_lvds;
unsigned int dpll_mask;
+   unsigned int dsi_clk_mask;
unsigned int lvds_clk_mask;
 };
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c 
b/drivers/gpu/drm/rcar-du/rcar_du_group.c
index a984eef265d2..8665a1dd2186 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
@@ -124,6 +124,8 @@ static void rcar_du_group_setup_didsr(struct rcar_du_group 
*rgrp)
if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index))
didsr |= DIDSR_LDCS_LVDS0(i)
  |  DIDSR_PDCS_CLK(i, 0);
+   else if (rcdu->info->dsi_clk_mask & BIT(rcrtc->index))
+   didsr |= DIDSR_LDCS_DSI(i);
else
didsr |= DIDSR_LDCS_DCLKIN(i)
  |  DIDSR_PDCS_CLK(i, 0);
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_regs.h 
b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
index fb7c467aa484..9484215b51e2 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_regs.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
@@ -258,6 +258,7 @@
 #define DIDSR  0x20028
 #define DIDSR_CODE (0x7790 << 16)
 #define DIDSR_LDCS_DCLKIN(n)   (0 << 

[PATCH v3 5/6] drm: rcar-du: Split CRTC IRQ and Clock features

2021-09-22 Thread Kieran Bingham
Not all platforms require both per-crtc IRQ and per-crtc clock
management. In preparation for supporting such platforms, split the
feature macro to be able to specify both features independently.

The other feature values are renumbered accordingly, to keep the two
crtc features adjacent.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 

---
v2:
 - New patch

v3:
 - Collect tag

 drivers/gpu/drm/rcar-du/rcar_du_crtc.c |  4 +--
 drivers/gpu/drm/rcar-du/rcar_du_drv.c  | 48 +-
 drivers/gpu/drm/rcar-du/rcar_du_drv.h  |  9 ++---
 3 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c 
b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
index a0f837e8243a..5672830ca184 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -1206,7 +1206,7 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, 
unsigned int swindex,
int ret;
 
/* Get the CRTC clock and the optional external clock. */
-   if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_IRQ_CLOCK)) {
+   if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_CLOCK)) {
sprintf(clk_name, "du.%u", hwindex);
name = clk_name;
} else {
@@ -1272,7 +1272,7 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, 
unsigned int swindex,
drm_crtc_helper_add(crtc, _helper_funcs);
 
/* Register the interrupt handler. */
-   if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_IRQ_CLOCK)) {
+   if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_IRQ)) {
/* The IRQ's are associated with the CRTC (sw)index. */
irq = platform_get_irq(pdev, swindex);
irqflags = 0;
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index 4ac26d08ebb4..8a094d5b9c77 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -36,7 +36,8 @@
 
 static const struct rcar_du_device_info rzg1_du_r8a7743_info = {
.gen = 2,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_INTERLACED
  | RCAR_DU_FEATURE_TVM_SYNC,
.channels_mask = BIT(1) | BIT(0),
@@ -58,7 +59,8 @@ static const struct rcar_du_device_info rzg1_du_r8a7743_info 
= {
 
 static const struct rcar_du_device_info rzg1_du_r8a7745_info = {
.gen = 2,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_INTERLACED
  | RCAR_DU_FEATURE_TVM_SYNC,
.channels_mask = BIT(1) | BIT(0),
@@ -79,7 +81,8 @@ static const struct rcar_du_device_info rzg1_du_r8a7745_info 
= {
 
 static const struct rcar_du_device_info rzg1_du_r8a77470_info = {
.gen = 2,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_INTERLACED
  | RCAR_DU_FEATURE_TVM_SYNC,
.channels_mask = BIT(1) | BIT(0),
@@ -105,7 +108,8 @@ static const struct rcar_du_device_info 
rzg1_du_r8a77470_info = {
 
 static const struct rcar_du_device_info rcar_du_r8a774a1_info = {
.gen = 3,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_VSP1_SOURCE
  | RCAR_DU_FEATURE_INTERLACED
  | RCAR_DU_FEATURE_TVM_SYNC,
@@ -134,7 +138,8 @@ static const struct rcar_du_device_info 
rcar_du_r8a774a1_info = {
 
 static const struct rcar_du_device_info rcar_du_r8a774b1_info = {
.gen = 3,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_VSP1_SOURCE
  | RCAR_DU_FEATURE_INTERLACED
  | RCAR_DU_FEATURE_TVM_SYNC,
@@ -163,7 +168,8 @@ static const struct rcar_du_device_info 
rcar_du_r8a774b1_info = {
 
 static const struct rcar_du_device_info rcar_du_r8a774c0_info = {
.gen = 3,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_VSP1_SOURCE,
.channels_mask = BIT(1) | BIT(0),
.routes = {
@@ -189,7 +195,8 @@ static const struct rcar_du_device_info 
rcar_du_r8a774c0_info = {
 
 static const struct rcar_du_device_info rcar_du_r8a774e1_info = {
.gen = 3,
-   .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+   .features = RCAR_DU_FEATURE_CRTC_IRQ
+ | RCAR_DU_FEATURE_CRTC_CLOCK
  | RCAR_DU_FEATURE_VSP1_SOURCE
  | RCAR_DU_FEATURE_INTERLACED

[PATCH v3 2/6] drm: rcar-du: Sort the DU outputs

2021-09-22 Thread Kieran Bingham
From: Kieran Bingham 

Sort the DU outputs alphabetically, with the exception of the final
entry, which is there as a sentinel.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 

---
v2:
 - Collect tag

 drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h 
b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index 5f2940c42225..440e6b4fbb58 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -96,10 +96,10 @@ struct rcar_du_crtc_state {
 enum rcar_du_output {
RCAR_DU_OUTPUT_DPAD0,
RCAR_DU_OUTPUT_DPAD1,
-   RCAR_DU_OUTPUT_LVDS0,
-   RCAR_DU_OUTPUT_LVDS1,
RCAR_DU_OUTPUT_HDMI0,
RCAR_DU_OUTPUT_HDMI1,
+   RCAR_DU_OUTPUT_LVDS0,
+   RCAR_DU_OUTPUT_LVDS1,
RCAR_DU_OUTPUT_TCON,
RCAR_DU_OUTPUT_MAX,
 };
-- 
2.30.2



[PATCH v3 1/6] dt-bindings: display: renesas,du: Provide bindings for r8a779a0

2021-09-22 Thread Kieran Bingham
From: Kieran Bingham 

Extend the Renesas DU display bindings to support the r8a779a0 V3U.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 

---
v2:
 - Collected Laurent's tag
 - Remove clock-names requirement
 - Specify only a single clock

v3:
 - Use clocknames: 'du.0' instead of 'du' to remain consistent

 .../bindings/display/renesas,du.yaml  | 50 +++
 1 file changed, 50 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/renesas,du.yaml 
b/Documentation/devicetree/bindings/display/renesas,du.yaml
index e3ca5389c17d..6db6a3f15395 100644
--- a/Documentation/devicetree/bindings/display/renesas,du.yaml
+++ b/Documentation/devicetree/bindings/display/renesas,du.yaml
@@ -39,6 +39,7 @@ properties:
   - renesas,du-r8a77980 # for R-Car V3H compatible DU
   - renesas,du-r8a77990 # for R-Car E3 compatible DU
   - renesas,du-r8a77995 # for R-Car D3 compatible DU
+  - renesas,du-r8a779a0 # for R-Car V3U compatible DU
 
   reg:
 maxItems: 1
@@ -773,6 +774,55 @@ allOf:
 - reset-names
 - renesas,vsps
 
+  - if:
+  properties:
+compatible:
+  contains:
+enum:
+  - renesas,du-r8a779a0
+then:
+  properties:
+clocks:
+  items:
+- description: Functional clock
+
+clock-names:
+  maxItems: 1
+  items:
+- const: du.0
+
+interrupts:
+  maxItems: 2
+
+resets:
+  maxItems: 1
+
+reset-names:
+  items:
+- const: du.0
+
+ports:
+  properties:
+port@0:
+  description: DSI 0
+port@1:
+  description: DSI 1
+port@2: false
+port@3: false
+
+  required:
+- port@0
+- port@1
+
+renesas,vsps:
+  minItems: 2
+
+  required:
+- interrupts
+- resets
+- reset-names
+- renesas,vsps
+
 additionalProperties: false
 
 examples:
-- 
2.30.2



Re: [PATCH] drm/bridge: dw-hdmi-cec: Make use of the helper function devm_add_action_or_reset()

2021-09-22 Thread Laurent Pinchart
Hi Cai,

Thank you for the patch.

On Wed, Sep 22, 2021 at 08:59:08PM +0800, Cai Huoqing wrote:
> The helper function devm_add_action_or_reset() will internally
> call devm_add_action(), and if devm_add_action() fails then it will
> execute the action mentioned and return the error code. So
> use devm_add_action_or_reset() instead of devm_add_action()
> to simplify the error handling and reduce the code.
> 
> Signed-off-by: Cai Huoqing 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c 
> b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c
> index 70ab4fbdc23e..c8f44bcb298a 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c
> @@ -265,11 +265,9 @@ static int dw_hdmi_cec_probe(struct platform_device 
> *pdev)
>   /* override the module pointer */
>   cec->adap->owner = THIS_MODULE;
>  
> - ret = devm_add_action(>dev, dw_hdmi_cec_del, cec);
> - if (ret) {
> - cec_delete_adapter(cec->adap);
> + ret = devm_add_action_or_reset(>dev, dw_hdmi_cec_del, cec);
> + if (ret)
>   return ret;
> - }
>  
>   ret = devm_request_threaded_irq(>dev, cec->irq,
>   dw_hdmi_cec_hardirq,

-- 
Regards,

Laurent Pinchart


Re: [PATCH] drm: rcar-du: Make use of the helper function devm_platform_ioremap_resource()

2021-09-22 Thread Laurent Pinchart
Hi Cai,

Thank you for the patch.

On Tue, Aug 31, 2021 at 03:54:42PM +0800, Cai Huoqing wrote:
> Use the devm_platform_ioremap_resource() helper instead of
> calling platform_get_resource() and devm_ioremap_resource()
> separately.
> 
> Signed-off-by: Cai Huoqing 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/rcar-du/rcar_du_drv.c | 4 +---
>  drivers/gpu/drm/rcar-du/rcar_lvds.c   | 4 +---
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> index 4ac26d08ebb4..ebec4b7269d1 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> @@ -570,7 +570,6 @@ static void rcar_du_shutdown(struct platform_device *pdev)
>  static int rcar_du_probe(struct platform_device *pdev)
>  {
>   struct rcar_du_device *rcdu;
> - struct resource *mem;
>   int ret;
>  
>   /* Allocate and initialize the R-Car device structure. */
> @@ -585,8 +584,7 @@ static int rcar_du_probe(struct platform_device *pdev)
>   platform_set_drvdata(pdev, rcdu);
>  
>   /* I/O resources */
> - mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - rcdu->mmio = devm_ioremap_resource(>dev, mem);
> + rcdu->mmio = devm_platform_ioremap_resource(pdev, 0);
>   if (IS_ERR(rcdu->mmio))
>   return PTR_ERR(rcdu->mmio);
>  
> diff --git a/drivers/gpu/drm/rcar-du/rcar_lvds.c 
> b/drivers/gpu/drm/rcar-du/rcar_lvds.c
> index d061b8de748f..a64d910b0500 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_lvds.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_lvds.c
> @@ -802,7 +802,6 @@ static int rcar_lvds_probe(struct platform_device *pdev)
>  {
>   const struct soc_device_attribute *attr;
>   struct rcar_lvds *lvds;
> - struct resource *mem;
>   int ret;
>  
>   lvds = devm_kzalloc(>dev, sizeof(*lvds), GFP_KERNEL);
> @@ -825,8 +824,7 @@ static int rcar_lvds_probe(struct platform_device *pdev)
>   lvds->bridge.funcs = _lvds_bridge_ops;
>   lvds->bridge.of_node = pdev->dev.of_node;
>  
> - mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - lvds->mmio = devm_ioremap_resource(>dev, mem);
> + lvds->mmio = devm_platform_ioremap_resource(pdev, 0);
>   if (IS_ERR(lvds->mmio))
>   return PTR_ERR(lvds->mmio);
>  

-- 
Regards,

Laurent Pinchart


Re: [PATCH] drm/shmobile: Make use of the helper function devm_platform_ioremap_resource()

2021-09-22 Thread Laurent Pinchart
Hi Cai,

Thank you for the patch.

On Tue, Aug 31, 2021 at 09:57:30PM +0800, Cai Huoqing wrote:
> Use the devm_platform_ioremap_resource() helper instead of
> calling platform_get_resource() and devm_ioremap_resource()
> separately.
> 
> Signed-off-by: Cai Huoqing 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/shmobile/shmob_drm_drv.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/shmobile/shmob_drm_drv.c 
> b/drivers/gpu/drm/shmobile/shmob_drm_drv.c
> index 7db01904d18d..80078a9fd7f6 100644
> --- a/drivers/gpu/drm/shmobile/shmob_drm_drv.c
> +++ b/drivers/gpu/drm/shmobile/shmob_drm_drv.c
> @@ -192,7 +192,6 @@ static int shmob_drm_probe(struct platform_device *pdev)
>   struct shmob_drm_platform_data *pdata = pdev->dev.platform_data;
>   struct shmob_drm_device *sdev;
>   struct drm_device *ddev;
> - struct resource *res;
>   unsigned int i;
>   int ret;
>  
> @@ -215,8 +214,7 @@ static int shmob_drm_probe(struct platform_device *pdev)
>  
>   platform_set_drvdata(pdev, sdev);
>  
> - res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - sdev->mmio = devm_ioremap_resource(>dev, res);
> + sdev->mmio = devm_platform_ioremap_resource(pdev, 0);
>   if (IS_ERR(sdev->mmio))
>   return PTR_ERR(sdev->mmio);
>  

-- 
Regards,

Laurent Pinchart


Re: [PATCH v2 2/2] drm: rcar-du: Add R-Car DSI driver

2021-09-22 Thread Laurent Pinchart
Hi Andrzej,

On Wed, Sep 22, 2021 at 04:29:39AM +0300, Laurent Pinchart wrote:
> On Tue, Sep 21, 2021 at 09:42:11PM +0200, Andrzej Hajda wrote:
> > W dniu 23.06.2021 o 15:56, Laurent Pinchart pisze:
> > > From: LUU HOAI 
> > >
> > > The driver supports the MIPI DSI/CSI-2 TX encoder found in the R-Car V3U
> > > SoC. It currently supports DSI mode only.
> > >
> > > Signed-off-by: LUU HOAI 
> > > Signed-off-by: Laurent Pinchart 
> > > 
> > > Reviewed-by: Kieran Bingham 
> > > Tested-by: Kieran Bingham 
> > > ---
> > >   drivers/gpu/drm/rcar-du/Kconfig  |   6 +
> > >   drivers/gpu/drm/rcar-du/Makefile |   1 +
> > >   drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c  | 827 +++
> > >   drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h | 172 
> > >   4 files changed, 1006 insertions(+)
> > >   create mode 100644 drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c
> > >   create mode 100644 drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h
> > >
> > > diff --git a/drivers/gpu/drm/rcar-du/Kconfig 
> > > b/drivers/gpu/drm/rcar-du/Kconfig
> > > index b47e74421e34..8cb94fe90639 100644
> > > --- a/drivers/gpu/drm/rcar-du/Kconfig
> > > +++ b/drivers/gpu/drm/rcar-du/Kconfig
> > > @@ -38,6 +38,12 @@ config DRM_RCAR_LVDS
> > >   help
> > > Enable support for the R-Car Display Unit embedded LVDS 
> > > encoders.
> > >   
> > > +config DRM_RCAR_MIPI_DSI
> > > + tristate "R-Car DU MIPI DSI Encoder Support"
> > > + depends on DRM && DRM_BRIDGE && OF
> > > + help
> > > +   Enable support for the R-Car Display Unit embedded MIPI DSI encoders.
> > > +
> > >   config DRM_RCAR_VSP
> > >   bool "R-Car DU VSP Compositor Support" if ARM
> > >   default y if ARM64
> > > diff --git a/drivers/gpu/drm/rcar-du/Makefile 
> > > b/drivers/gpu/drm/rcar-du/Makefile
> > > index 4d1187ccc3e5..adc1b49d02cf 100644
> > > --- a/drivers/gpu/drm/rcar-du/Makefile
> > > +++ b/drivers/gpu/drm/rcar-du/Makefile
> > > @@ -19,6 +19,7 @@ obj-$(CONFIG_DRM_RCAR_CMM)  += rcar_cmm.o
> > >   obj-$(CONFIG_DRM_RCAR_DU)   += rcar-du-drm.o
> > >   obj-$(CONFIG_DRM_RCAR_DW_HDMI)  += rcar_dw_hdmi.o
> > >   obj-$(CONFIG_DRM_RCAR_LVDS) += rcar_lvds.o
> > > +obj-$(CONFIG_DRM_RCAR_MIPI_DSI)  += rcar_mipi_dsi.o
> > >   
> > >   # 'remote-endpoint' is fixed up at run-time
> > >   DTC_FLAGS_rcar_du_of_lvds_r8a7790 += -Wno-graph_endpoint
> > > diff --git a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c 
> > > b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c
> > > new file mode 100644
> > > index ..e94245029f95
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c
> > > @@ -0,0 +1,827 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * rcar_mipi_dsi.c  --  R-Car MIPI DSI Encoder
> > > + *
> > > + * Copyright (C) 2020 Renesas Electronics Corporation
> > > + */
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +#include "rcar_mipi_dsi_regs.h"
> > > +
> > > +struct rcar_mipi_dsi {
> > > + struct device *dev;
> > > + const struct rcar_mipi_dsi_device_info *info;
> > > + struct reset_control *rstc;
> > > +
> > > + struct mipi_dsi_host host;
> > > + struct drm_bridge bridge;
> > > + struct drm_bridge *next_bridge;
> > > + struct drm_connector connector;
> > > +
> > > + void __iomem *mmio;
> > > + struct {
> > > + struct clk *mod;
> > > + struct clk *pll;
> > > + struct clk *dsi;
> > > + } clocks;
> > > +
> > > + struct drm_display_mode display_mode;
> > > + enum mipi_dsi_pixel_format format;
> > > + unsigned int num_data_lanes;
> > > + unsigned int lanes;
> > > +};
> > > +
> > > +static inline struct rcar_mipi_dsi *
> > > +bridge_to_rcar_mipi_dsi(struct drm_bridge *bridge)
> > > +{
> > > + return container_of(bridge, struct rcar_mipi_dsi, bridge);
> > > +}
> > > +
> > > +static inline struct rcar_mipi_dsi *
> > > +host_to_rcar_mipi_dsi(struct mipi_dsi_host *host)
> > > +{
> > > + return container_of(host, struct rcar_mipi_dsi, host);
> > > +}
> > > +
> > > +static const u32 phtw[] = {
> > > + 0x01020114, 0x01600115, /* General testing */
> > > + 0x01030116, 0x0102011d, /* General testing */
> > > + 0x011101a4, 0x018601a4, /* 1Gbps testing */
> > > + 0x014201a0, 0x010001a3, /* 1Gbps testing */
> > > + 0x0101011f, /* 1Gbps testing */
> > > +};
> > > +
> > > +static const u32 phtw2[] = {
> > > + 0x010c0130, 0x010c0140, /* General testing */
> > > + 0x010c0150, 0x010c0180, /* General testing */
> > > + 0x010c0190,
> > > + 0x010a0160, 0x010a0170,
> > > + 0x01800164, 0x01800174, /* 1Gbps testing */
> > > +};
> > > +
> > > +static const u32 hsfreqrange_table[][2] = {
> > > + { 8000,   0x00 }, { 9000,   0x10 }, { 1,  0x20 },
> > > + { 

[PATCH 3/3] drm/msm: Extend gpu devcore dumps with pgtbl info

2021-09-22 Thread Rob Clark
From: Rob Clark 

In the case of iova fault triggered devcore dumps, include additional
debug information based on what we think are the current page tables,
including the TTBR0 value (which should match what we have in
adreno_smmu_fault_info unless things have gone horribly wrong), and
the pagetable entries traversed in the process of resolving the
faulting iova.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 10 ++
 drivers/gpu/drm/msm/msm_gpu.c   | 10 ++
 drivers/gpu/drm/msm/msm_gpu.h   |  8 
 drivers/gpu/drm/msm/msm_iommu.c | 17 +
 drivers/gpu/drm/msm/msm_mmu.h   |  2 ++
 5 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 42e522a60623..d3718982be77 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -707,6 +707,16 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state 
*state,
drm_printf(p, "  - dir: %s\n", info->flags & IOMMU_FAULT_WRITE 
? "WRITE" : "READ");
drm_printf(p, "  - type: %s\n", info->type);
drm_printf(p, "  - source: %s\n", info->block);
+
+   /* Information extracted from what we think are the current
+* pgtables.  Hopefully the TTBR0 matches what we've extracted
+* from the SMMU registers in smmu_info!
+*/
+   drm_puts(p, "pgtable-fault-info:\n");
+   drm_printf(p, "  - ttbr0: %.16llx\n", info->pgtbl_ttbr0);
+   drm_printf(p, "  - asid: %d\n", info->asid);
+   drm_printf(p, "  - ptes: %.16llx %.16llx %.16llx %.16llx\n",
+  info->ptes[0], info->ptes[1], info->ptes[2], 
info->ptes[3]);
}
 
drm_printf(p, "rbbm-status: 0x%08x\n", state->rbbm_status);
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 8a3a592da3a4..d1a16642ecd5 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -284,6 +284,16 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
if (submit) {
int i, nr = 0;
 
+   if (state->fault_info.smmu_info.ttbr0) {
+   struct msm_gpu_fault_info *info = >fault_info;
+   struct msm_mmu *mmu = submit->aspace->mmu;
+
+   msm_iommu_pagetable_params(mmu, >pgtbl_ttbr0,
+  >asid);
+   msm_iommu_pagetable_walk(mmu, info->iova, info->ptes,
+ARRAY_SIZE(info->ptes));
+   }
+
/* count # of buffers to dump: */
for (i = 0; i < submit->nr_bos; i++)
if (should_dump(submit, i))
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index a7a5a53536a8..32a859307e81 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -78,6 +78,14 @@ struct msm_gpu_fault_info {
int flags;
const char *type;
const char *block;
+
+   /* Information about what we think/expect is the current SMMU state,
+* for example expected_ttbr0 should match smmu_info.ttbr0 which
+* was read back from SMMU registers.
+*/
+   u64 pgtbl_ttbr0;
+   u64 ptes[4];
+   int asid;
 };
 
 /**
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index eed2a762e9dd..1bd985b56e35 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -116,6 +116,23 @@ int msm_iommu_pagetable_params(struct msm_mmu *mmu,
return 0;
 }
 
+int msm_iommu_pagetable_walk(struct msm_mmu *mmu, unsigned long iova,
+u64 *ptes, int num_ptes)
+{
+   struct msm_iommu_pagetable *pagetable;
+
+   if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
+   return -EINVAL;
+
+   pagetable = to_pagetable(mmu);
+
+   if (!pagetable->pgtbl_ops->pgtable_walk)
+   return -EINVAL;
+
+   return pagetable->pgtbl_ops->pgtable_walk(pagetable->pgtbl_ops, iova,
+ ptes, _ptes);
+}
+
 static const struct msm_mmu_funcs pagetable_funcs = {
.map = msm_iommu_pagetable_map,
.unmap = msm_iommu_pagetable_unmap,
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index de158e1bf765..519b749c61af 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -58,5 +58,7 @@ void msm_gpummu_params(struct msm_mmu *mmu, dma_addr_t 
*pt_base,
 
 int msm_iommu_pagetable_params(struct msm_mmu *mmu, phys_addr_t *ttbr,
int *asid);
+int msm_iommu_pagetable_walk(struct msm_mmu *mmu, unsigned long iova,
+u64 *ptes, int num_ptes);
 
 #endif /* __MSM_MMU_H__ 

[PATCH 2/3] drm/msm: Show all smmu info for iova fault devcore dumps

2021-09-22 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 25 +
 drivers/gpu/drm/msm/msm_gpu.h   |  2 +-
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4ac652c35c43..f6a4dbef796b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1269,7 +1269,7 @@ static int a6xx_fault_handler(void *arg, unsigned long 
iova, int flags, void *da
/* Turn off the hangcheck timer to keep it from bothering us */
del_timer(>hangcheck_timer);
 
-   gpu->fault_info.ttbr0 = info->ttbr0;
+   gpu->fault_info.smmu_info = *info;
gpu->fault_info.iova  = iova;
gpu->fault_info.flags = flags;
gpu->fault_info.type  = type;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 748665232d29..42e522a60623 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -685,19 +685,28 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
adreno_gpu->rev.major, adreno_gpu->rev.minor,
adreno_gpu->rev.patchid);
/*
-* If this is state collected due to iova fault, so fault related info
+* If this is state collected due to iova fault, show fault related
+* info
 *
-* TTBR0 would not be zero, so this is a good way to distinguish
+* TTBR0 would not be zero in this case, so this is a good way to
+* distinguish
 */
-   if (state->fault_info.ttbr0) {
+   if (state->fault_info.smmu_info.ttbr0) {
	const struct msm_gpu_fault_info *info = &state->fault_info;
+	const struct adreno_smmu_fault_info *smmu_info = &info->smmu_info;
 
drm_puts(p, "fault-info:\n");
-   drm_printf(p, "  - ttbr0=%.16llx\n", info->ttbr0);
-   drm_printf(p, "  - iova=%.16lx\n", info->iova);
-	drm_printf(p, "  - dir=%s\n", info->flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ");
-   drm_printf(p, "  - type=%s\n", info->type);
-   drm_printf(p, "  - source=%s\n", info->block);
+   drm_printf(p, "  - far: %.16llx\n", smmu_info->far);
+   drm_printf(p, "  - ttbr0: %.16llx\n", smmu_info->ttbr0);
+   drm_printf(p, "  - contextidr: %.8x\n", smmu_info->contextidr);
+   drm_printf(p, "  - fsr: %.8x\n", smmu_info->fsr);
+   drm_printf(p, "  - fsynr0: %.8x\n", smmu_info->fsynr0);
+   drm_printf(p, "  - fsynr1: %.8x\n", smmu_info->fsynr1);
+   drm_printf(p, "  - cbfrsynra: %.8x\n", smmu_info->cbfrsynra);
+   drm_printf(p, "  - iova: %.16lx\n", info->iova);
+	drm_printf(p, "  - dir: %s\n", info->flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ");
+   drm_printf(p, "  - type: %s\n", info->type);
+   drm_printf(p, "  - source: %s\n", info->block);
}
 
drm_printf(p, "rbbm-status: 0x%08x\n", state->rbbm_status);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index e031c9b495ed..a7a5a53536a8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -73,7 +73,7 @@ struct msm_gpu_funcs {
 
 /* Additional state for iommu faults: */
 struct msm_gpu_fault_info {
-   u64 ttbr0;
+   struct adreno_smmu_fault_info smmu_info;
unsigned long iova;
int flags;
const char *type;
-- 
2.31.1



[PATCH v3] drm: rcar-du: Allow importing non-contiguous dma-buf with VSP

2021-09-22 Thread Laurent Pinchart
On R-Car Gen3, the DU uses a separate IP core named VSP to perform DMA
from memory and composition of planes. The DU hardware then only handles
the video timings and the interface with the encoders. This differs from
Gen2, where the DU included a composer with DMA engines.

When sourcing from the VSP, the DU hardware performs no memory access,
and thus has no requirements on imported dma-buf memory types. The GEM
CMA helpers however still create a DMA mapping to the DU device, which
isn't used. The mapping to the VSP is done when processing the atomic
commits, in the plane .prepare_fb() handler.

When the system uses an IOMMU, the VSP device is attached to it, which
enables the VSP to use non physically contiguous memory. The DU, as it
performs no memory access, isn't connected to the IOMMU. The GEM CMA
drm_gem_cma_prime_import_sg_table() helper will in that case fail to map
non-contiguous imported dma-bufs, as the DMA mapping to the DU device
will have multiple entries in its sgtable. This prevents using non
physically contiguous memory for display.

The DRM PRIME and GEM CMA helpers are designed to create the sgtable
when the dma-buf is imported. By default, the device referenced by the
drm_device is used to create the dma-buf attachment. Drivers can use a
different device by using the drm_gem_prime_import_dev() function. While
the DU has access to the VSP device, this won't help here, as different
CRTCs use different VSP instances, connected to different IOMMU
channels. The driver doesn't know at import time with which CRTC a GEM object
will be used, and thus can't select the right VSP device to pass to
drm_gem_prime_import_dev().

To support non-contiguous memory, implement a custom
.gem_prime_import_sg_table() operation that accepts all imported dma-bufs
regardless of the number of scatterlist entries. The sgtable will be
mapped to the VSP at .prepare_fb() time, which will reject the
framebuffer if the VSP isn't connected to an IOMMU.

Signed-off-by: Laurent Pinchart 
Reviewed-by: Kieran Bingham 
---
Changes since v2:

- Inline error handling in rcar_du_gem_prime_import_sg_table()

Changes since v1:

- Rewrote commit message to explain issue in more details
- Duplicate the imported scatter gather table in
  rcar_du_vsp_plane_prepare_fb()
- Use separate loop counter j to avoid overwriting i
- Update to latest drm_gem_cma API
---
 drivers/gpu/drm/rcar-du/rcar_du_drv.c |  6 +++-
 drivers/gpu/drm/rcar-du/rcar_du_kms.c | 46 +++
 drivers/gpu/drm/rcar-du/rcar_du_kms.h |  7 
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 36 ++---
 4 files changed, 89 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index 62dfd1b66db0..806c68823a28 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -529,7 +529,11 @@ DEFINE_DRM_GEM_CMA_FOPS(rcar_du_fops);
 
 static const struct drm_driver rcar_du_driver = {
.driver_features= DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
-   DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE(rcar_du_dumb_create),
+   .dumb_create= rcar_du_dumb_create,
+   .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
+   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
+   .gem_prime_import_sg_table = rcar_du_gem_prime_import_sg_table,
+   .gem_prime_mmap = drm_gem_prime_mmap,
	.fops			= &rcar_du_fops,
.name   = "rcar-du",
.desc   = "Renesas R-Car Display Unit",
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_kms.c b/drivers/gpu/drm/rcar-du/rcar_du_kms.c
index ca29e4a62816..eacb1f17f747 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_kms.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_kms.c
@@ -19,6 +19,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -325,6 +326,51 @@ const struct rcar_du_format_info *rcar_du_format_info(u32 fourcc)
  * Frame buffer
  */
 
+static const struct drm_gem_object_funcs rcar_du_gem_funcs = {
+   .free = drm_gem_cma_free_object,
+   .print_info = drm_gem_cma_print_info,
+   .get_sg_table = drm_gem_cma_get_sg_table,
+   .vmap = drm_gem_cma_vmap,
+   .mmap = drm_gem_cma_mmap,
+	.vm_ops = &drm_gem_cma_vm_ops,
+};
+
+struct drm_gem_object *rcar_du_gem_prime_import_sg_table(struct drm_device *dev,
+   struct dma_buf_attachment *attach,
+   struct sg_table *sgt)
+{
+   struct rcar_du_device *rcdu = to_rcar_du_device(dev);
+   struct drm_gem_cma_object *cma_obj;
+   struct drm_gem_object *gem_obj;
+   int ret;
+
+   if (!rcar_du_has(rcdu, RCAR_DU_FEATURE_VSP1_SOURCE))
+   return drm_gem_cma_prime_import_sg_table(dev, attach, sgt);
+
+   /* Create a CMA GEM buffer. */
+   cma_obj = kzalloc(sizeof(*cma_obj), GFP_KERNEL);
+   if (!cma_obj)
+   return ERR_PTR(-ENOMEM);
+
+ 

[PATCH 1/3] iommu/io-pgtable-arm: Add way to debug pgtable walk

2021-09-22 Thread Rob Clark
From: Rob Clark 

Add an io-pgtable method to retrieve the raw PTEs that would be
traversed for a given iova access.

Signed-off-by: Rob Clark 
---
 drivers/iommu/io-pgtable-arm.c | 40 +++---
 include/linux/io-pgtable.h |  9 
 2 files changed, 41 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..5571d7203f11 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -638,38 +638,61 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);
 }
 
-static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
-unsigned long iova)
+static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned long iova,
+void *_ptes, int *num_ptes)
 {
struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
arm_lpae_iopte pte, *ptep = data->pgd;
+   arm_lpae_iopte *ptes = _ptes;
+   int max_ptes = *num_ptes;
int lvl = data->start_level;
 
+   *num_ptes = 0;
+
do {
+   if (*num_ptes >= max_ptes)
+   return -ENOSPC;
+
/* Valid IOPTE pointer? */
if (!ptep)
-   return 0;
+   return -EFAULT;
 
/* Grab the IOPTE we're interested in */
ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
pte = READ_ONCE(*ptep);
 
+   ptes[(*num_ptes)++] = pte;
+
/* Valid entry? */
if (!pte)
-   return 0;
+   return -EFAULT;
 
/* Leaf entry? */
if (iopte_leaf(pte, lvl, data->iop.fmt))
-   goto found_translation;
+   return 0;
 
/* Take it to the next level */
ptep = iopte_deref(pte, data);
} while (++lvl < ARM_LPAE_MAX_LEVELS);
 
-   /* Ran out of page tables to walk */
-   return 0;
+   return -EFAULT;
+}
+
+static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
+unsigned long iova)
+{
+   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   arm_lpae_iopte pte, ptes[ARM_LPAE_MAX_LEVELS];
+   int lvl, num_ptes = ARM_LPAE_MAX_LEVELS;
+   int ret;
+
	ret = arm_lpae_pgtable_walk(ops, iova, ptes, &num_ptes);
+   if (ret)
+   return 0;
+
+   pte = ptes[num_ptes - 1];
+   lvl = num_ptes - 1 + data->start_level;
 
-found_translation:
iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1);
return iopte_to_paddr(pte, data) | iova;
 }
@@ -752,6 +775,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
.map= arm_lpae_map,
.unmap  = arm_lpae_unmap,
.iova_to_phys   = arm_lpae_iova_to_phys,
+   .pgtable_walk   = arm_lpae_pgtable_walk,
};
 
return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 4d40dfa75b55..6cba731ed8d3 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -145,6 +145,13 @@ struct io_pgtable_cfg {
  * @map:  Map a physically contiguous memory region.
  * @unmap:Unmap a physically contiguous memory region.
  * @iova_to_phys: Translate iova to physical address.
+ * @pgtable_walk: Return details of a page table walk for a given iova.
+ *This returns the array of PTEs in a format that is
+ *specific to the page table format.  The number of
+ *PTEs can be format specific.  The num_ptes parameter
+ *on input specifies the size of the ptes array, and
+ *on output the number of PTEs filled in (which depends
+ *on the number of PTEs walked to resolve the iova)
  *
  * These functions map directly onto the iommu_ops member functions with
  * the same names.
@@ -156,6 +163,8 @@ struct io_pgtable_ops {
size_t size, struct iommu_iotlb_gather *gather);
phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
unsigned long iova);
+   int (*pgtable_walk)(struct io_pgtable_ops *ops, unsigned long iova,
+   void *ptes, int *num_ptes);
 };
 
 /**
-- 
2.31.1



[PATCH 0/3] io-pgtable-arm + drm/msm: Extend iova fault debugging

2021-09-22 Thread Rob Clark
From: Rob Clark 

This series extends io-pgtable-arm with a method to retrieve the page
table entries traversed in the process of address translation, and then
beefs up drm/msm gpu devcore dump to include this (and additional info)
in the devcore dump.

The motivation is tracking down an obscure iova fault triggered crash on
the address of the IB1 cmdstream.  This is one of the few places where
the GPU address written into the cmdstream is solely under control of the
kernel mode driver, so I don't think it can be a userspace bug.  The
logged cmdstream from the devcore's I've looked at look correct, and the
TTBR0 read back from arm-smmu agrees with the kernel emitted cmdstream.
Unfortunately it happens infrequently enough (something like once per
1000hrs of usage, from what I can tell from our telemetry) that actually
reproducing it with an instrumented debug kernel is not an option.  So
further spiffying out the devcore dumps and hoping we can spot a clue is
the plan I'm shooting for.

See https://gitlab.freedesktop.org/drm/msm/-/issues/8 for more info on
the issue I'm trying to debug.

Rob Clark (3):
  iommu/io-pgtable-arm: Add way to debug pgtable walk
  drm/msm: Show all smmu info for iova fault devcore dumps
  drm/msm: Extend gpu devcore dumps with pgtbl info

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 35 +-
 drivers/gpu/drm/msm/msm_gpu.c   | 10 +++
 drivers/gpu/drm/msm/msm_gpu.h   | 10 ++-
 drivers/gpu/drm/msm/msm_iommu.c | 17 +++
 drivers/gpu/drm/msm/msm_mmu.h   |  2 ++
 drivers/iommu/io-pgtable-arm.c  | 40 -
 include/linux/io-pgtable.h  |  9 ++
 8 files changed, 107 insertions(+), 18 deletions(-)

-- 
2.31.1



Re: [PATCH RESEND] drm/i915: Flush buffer pools on driver remove

2021-09-22 Thread Matt Roper
On Fri, Sep 03, 2021 at 04:23:20PM +0200, Janusz Krzysztofik wrote:
> In preparation for clean driver release, attempts to drain work queues
> and release freed objects are taken at driver remove time.  However, GT
> buffer pools are now not flushed before the driver release phase.
> Since unused objects may stay there for up to one second, some may
> survive until driver release is attempted.  That can potentially
> explain sporadic then hardly reproducible issues observed at driver
> release time, like non-zero shrink counter or outstanding address space

So just to make sure I'm understanding the description here:
 - We currently do an explicit flush of the buffer pools within the call
   path of drm_driver.release(); this removes all buffers, regardless of
   their age.
 - However there may be other code that runs *earlier* within the
   drm_driver.release() call chain that expects buffer pools have
   already been flushed and are already empty.
 - Since buffer pools auto-flush old buffers once per second in a worker
   thread, there's a small window where if we remove the driver while
   there are still buffers with an age of less than one second, the
   assumptions of the other release code may be violated.

So by moving the flush to driver remove (which executes earlier via the
pci_driver.remove() flow) you're ensuring that all buffers are flushed
before _any_ code in drm_driver.release() executes.

I found the wording of the commit message here somewhat confusing since
it's talking about flushes we do in driver release, but mentions
problems that arise during driver release due to lack of flushing.  You
might want to reword the commit message somewhat to help clarify.
Otherwise, the code change itself looks reasonable to me.

BTW, I do notice that drm_driver.release() in general is technically
deprecated at this point (with a suggestion in the drm_drv.h comments to
switch to using drmm_add_action(), drmm_kmalloc(), etc. to manage the
cleanup of resources).  At some point in the future we may want to
rework the i915 cleanup in general according to that guidance.


Matt

> areas.
> 
> Flush buffer pools on GT remove as a fix.  On driver release, don't
> flush the pools again, just assert that the flush was called and
> nothing added more in between.
> 
> Signed-off-by: Janusz Krzysztofik 
> Cc: Chris Wilson 
> ---
> Resending with Cc: dri-devel@lists.freedesktop.org as requested, and a
> typo in commit description fixed.
> 
> Thanks,
> Janusz
> 
>  drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++
>  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 62d40c986642..8f322a4ecd87 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
>   intel_uc_driver_remove(&gt->uc);
>  
>   intel_engines_release(gt);
> +
> + intel_gt_flush_buffer_pool(gt);
>  }
>  
>  void intel_gt_driver_unregister(struct intel_gt *gt)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> index aa0a59c5b614..acc49c56a9f3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> @@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
>   struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
>   int n;
>  
> - intel_gt_flush_buffer_pool(gt);
> -
>   for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
>   GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
>  }
> -- 
> 2.25.1
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [PATCH v2] drm: rcar-du: Allow importing non-contiguous dma-buf with VSP

2021-09-22 Thread Laurent Pinchart
Hi Kieran,

On Wed, Sep 22, 2021 at 10:37:29PM +0100, Kieran Bingham wrote:
> On 30/07/2021 03:05, Laurent Pinchart wrote:
> > On R-Car Gen3, the DU uses a separate IP core named VSP to perform DMA
> > from memory and composition of planes. The DU hardware then only handles
> > the video timings and the interface with the encoders. This differs from
> > Gen2, where the DU included a composer with DMA engines.
> > 
> > When sourcing from the VSP, the DU hardware performs no memory access,
> > and thus has no requirements on imported dma-buf memory types. The GEM
> > CMA helpers however still create a DMA mapping to the DU device, which
> > isn't used. The mapping to the VSP is done when processing the atomic
> > commits, in the plane .prepare_fb() handler.
> > 
> > When the system uses an IOMMU, the VSP device is attached to it, which
> > enables the VSP to use non physically contiguous memory. The DU, as it
> > performs no memory access, isn't connected to the IOMMU. The GEM CMA
> > drm_gem_cma_prime_import_sg_table() helper will in that case fail to map
> > non-contiguous imported dma-bufs, as the DMA mapping to the DU device
> > will have multiple entries in its sgtable. This prevents using non
> > physically contiguous memory for display.
> > 
> > The DRM PRIME and GEM CMA helpers are designed to create the sgtable
> > when the dma-buf is imported. By default, the device referenced by the
> > drm_device is used to create the dma-buf attachment. Drivers can use a
> > different device by using the drm_gem_prime_import_dev() function. While
> > the DU has access to the VSP device, this won't help here, as different
> > CRTCs use different VSP instances, connected to different IOMMU
> > channels. The driver doesn't know at import time with which CRTC a GEM object
> > will be used, and thus can't select the right VSP device to pass to
> > drm_gem_prime_import_dev().
> > 
> > To support non-contiguous memory, implement a custom
> > .gem_prime_import_sg_table() operation that accepts all imported dma-bufs
> > regardless of the number of scatterlist entries. The sgtable will be
> > mapped to the VSP at .prepare_fb() time, which will reject the
> > framebuffer if the VSP isn't connected to an IOMMU.
> 
> Wow - quite a lot to digest there.

I tried to make it clear, but there's lots to explain :-S

> > Signed-off-by: Laurent Pinchart 
> > ---
> > 
> > This can arguably be considered as a bit of a hack, as the GEM PRIME
> > import still maps the dma-buf attachment to the DU, which isn't
> > necessary. This is however not a big issue, as the DU isn't connected to
> > any IOMMU, the DMA mapping thus doesn't waste any resource such as I/O
> > memory space. Avoiding the mapping creation would require replacing the
> > helpers completely, resulting in lots of code duplication. If this type
> > of hardware setup was common we could create another set of helpers, but
> > I don't think it would be worth it to support a single device.
> > 
> > I have tested this patch with the cam application from libcamera, on a
> > R-Car H3 ES2.x Salvator-XS board, importing buffers from the vimc
> > driver:
> > 
> > cam -c 'platform/vimc.0 Sensor B' \
> > -s pixelformat=BGR888,width=1440,height=900 \
> > -C -D HDMI-A-1
> 
> Are VIMC buffers always physically non-contiguous to validate this test?

They're not, but with
https://lore.kernel.org/linux-media/20210730001939.30769-1-laurent.pinchart+rene...@ideasonboard.com/
they can be made to be, with a module parameter.

> > A set of patches to add DRM/KMS output support to cam has been posted.
> > Until it gets merged (hopefully soon), it can be found at [1].
> > 
> > As cam doesn't support DRM/KMS scaling and overlay planes yet, the
> > camera resolution needs to match the display resolution. Due to a
> > peculiarity of the vimc driver, the resolution has to be divisible by 3,
> > which may require changes to the resolution above depending on your
> > monitor.
> > 
> > A test patch is also needed for the kernel, to enable IOMMU support for
> > the VSP, which isn't done by default (yet ?) in mainline. I have pushed
> > a branch to [2] if anyone is interested.
> > 
> > [1] 
> > https://lists.libcamera.org/pipermail/libcamera-devel/2021-July/022815.html
> > [2] git://linuxtv.org/pinchartl/media.git drm/du/devel/gem/contig
> > 
> > ---
> > Changes since v1:
> > 
> > - Rewrote commit message to explain issue in more details
> > - Duplicate the imported scatter gather table in
> >   rcar_du_vsp_plane_prepare_fb()
> > - Use separate loop counter j to avoid overwriting i
> > - Update to latest drm_gem_cma API
> > ---
> >  drivers/gpu/drm/rcar-du/rcar_du_drv.c |  6 +++-
> >  drivers/gpu/drm/rcar-du/rcar_du_kms.c | 49 +++
> >  drivers/gpu/drm/rcar-du/rcar_du_kms.h |  7 
> >  drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 36 +---
> >  4 files changed, 92 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
> > 

[Bug 214029] [bisected] [amdgpu] Several memory leaks in amdgpu and ttm

2021-09-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=214029

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

Summary|[bisected] [NAVI] Several   |[bisected] [amdgpu] Several
   |memory leaks in amdgpu and  |memory leaks in amdgpu and
   |ttm |ttm

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 214029] [bisected] [NAVI] Several memory leaks in amdgpu and ttm

2021-09-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=214029

--- Comment #14 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 298927
  --> https://bugzilla.kernel.org/attachment.cgi?id=298927&action=edit
kernel dmesg (kernel 5.14.6, AMD Opteron 6386 SE)

Does not seem to be Navi specific after all as the leaks do happen with the
Radeon R7 360 in my Opteron box too.

[...]
unreferenced object 0x8afeddd0c2c0 (size 176):
  comm "Web Content", pid 1830253, jiffies 4302445561 (age 2701.157s)
  hex dump (first 32 bytes):
50 c3 d0 dd fe 8a ff ff 80 51 3a c0 ff ff ff ff  PQ:.
0f 89 14 e9 f1 16 00 00 48 fe b6 09 41 a7 ff ff  H...A...
  backtrace:
[] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
[] drm_sched_job_init+0x58/0xa0 [gpu_sched]
[] amdgpu_job_submit+0x21/0xe0 [amdgpu]
[] amdgpu_copy_buffer+0x1ea/0x290 [amdgpu]
[] amdgpu_ttm_copy_mem_to_mem+0x282/0x5b0 [amdgpu]
[] amdgpu_bo_move+0x130/0x7d8 [amdgpu]
[] ttm_bo_handle_move_mem+0x89/0x178 [ttm]
[] ttm_bo_validate+0xba/0x140 [ttm]
[] amdgpu_bo_fault_reserve_notify+0xb6/0x160 [amdgpu]
[] amdgpu_gem_fault+0x78/0x100 [amdgpu]
[] __do_fault+0x31/0xe8
[] __handle_mm_fault+0xe1a/0x1290
[] handle_mm_fault+0xb5/0x218
[] exc_page_fault+0x177/0x5d0
[] asm_exc_page_fault+0x1b/0x20
unreferenced object 0x8b01f00bd0c0 (size 72):
  comm "sdma0", pid 403, jiffies 4302445561 (age 2701.157s)
  hex dump (first 32 bytes):
e0 c7 64 13 ff 8a ff ff 00 1c 30 c1 ff ff ff ff  ..d...0.
65 59 16 e9 f1 16 00 00 58 28 b9 86 03 8b ff ff  eY..X(..
  backtrace:
[] amdgpu_fence_emit+0x2b/0x1f0 [amdgpu]
[] amdgpu_ib_schedule+0x2e3/0x4e8 [amdgpu]
[] amdgpu_job_run+0x8b/0x1e8 [amdgpu]
[] drm_sched_main+0x1b7/0x3d8 [gpu_sched]
[] kthread+0x122/0x140
[] ret_from_fork+0x22/0x30
unreferenced object 0x8b02ec1796c0 (size 176):
  comm "Renderer", pid 108402, jiffies 4302694486 (age 1871.424s)
  hex dump (first 32 bytes):
50 97 17 ec 02 8b ff ff 80 51 3a c0 ff ff ff ff  PQ:.
4f 9c 02 1a b3 17 00 00 48 fe b6 09 41 a7 ff ff  O...H...A...
  backtrace:
[] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
[] drm_sched_job_init+0x58/0xa0 [gpu_sched]
[] amdgpu_job_submit+0x21/0xe0 [amdgpu]
[] amdgpu_copy_buffer+0x1ea/0x290 [amdgpu]
[] amdgpu_ttm_copy_mem_to_mem+0x282/0x5b0 [amdgpu]
[] amdgpu_bo_move+0x130/0x7d8 [amdgpu]
[] ttm_bo_handle_move_mem+0x89/0x178 [ttm]
[] ttm_bo_validate+0xba/0x140 [ttm]
[] amdgpu_bo_fault_reserve_notify+0xb6/0x160 [amdgpu]
[] amdgpu_gem_fault+0x78/0x100 [amdgpu]
[] __do_fault+0x31/0xe8
[] __handle_mm_fault+0xe1a/0x1290
[] handle_mm_fault+0xb5/0x218
[] exc_page_fault+0x177/0x5d0
[] asm_exc_page_fault+0x1b/0x20


 # lspci -s 01:00.0 -v
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81) (prog-if 00 [VGA controller])
Subsystem: PC Partner Limited / Sapphire Technology Tobago PRO [Radeon
R7 360 / R9 360 OEM]
Flags: bus master, fast devsel, latency 0, IRQ 47, IOMMU group 11
Memory at d000 (64-bit, prefetchable) [size=256M]
Memory at cf80 (64-bit, prefetchable) [size=8M]
I/O ports at c000 [size=256]
Memory at fdc8 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at 000c [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010

Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Physical Resizable BAR
Capabilities: [270] Secondary PCI Express
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] Page Request Interface (PRI)
Capabilities: [2d0] Process Address Space ID (PASID)
Kernel driver in use: amdgpu
Kernel modules: radeon, amdgpu


Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization

2021-09-22 Thread Lucas De Marchi

On Mon, Sep 20, 2021 at 08:04:32AM +, Gupta, Anshuman wrote:




-Original Message-
From: Nikula, Jani 
Sent: Monday, September 20, 2021 1:12 PM
To: De Marchi, Lucas 
Cc: Auld, Matthew ; intel-...@lists.freedesktop.org;
dri-devel@lists.freedesktop.org; Gupta, Anshuman

Subject: Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization

On Fri, 17 Sep 2021, Lucas De Marchi  wrote:
> On Mon, May 17, 2021 at 02:57:33PM +0300, Jani Nikula wrote:
>>On Mon, 12 Apr 2021, Matthew Auld  wrote:
>>> From: Anshuman Gupta 
>>>
>>> Sanitize OPROM header, CPD signature and OPROM PCI version.
>>> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB
structures and
>>> PCI struct offsets are provided by GSC counterparts.
>>> These are yet to be Documented in B.Spec.
>>> After successful sanitization, extract VBT from opregion image.
>>
>>So I don't understand what the point is with two consecutive patches
>>where the latter rewrites a lot of the former.
>
> I actually wonder what's the point of this. Getting it from spi is
> already the fallback and looks much more complex. Yes, it's pretty
> detailed and document the format pretty well, but it still looks more
> complex than the initial code. Do you see additional benefit in this
> one?

Getting opregion image from spi is needed to get the intel_opregion and its 
mailboxes on discrete card.


The commit message doesn't really explain much. Anshuman?

I will get rework of the patches and float it again.



from this patch the only thing I see it's doing is to get the VBT from
inside opregion... it moves the read part to helper methods and
apparently it supports multiple images...?

The question here is not why we are reading from spi, but rather what
this is doing that the previous commit wasn't already.

Lucas De Marchi


Re: [PATCH v2] dt-bindings: display: renesas,du: Provide bindings for r8a779a0

2021-09-22 Thread Laurent Pinchart
Hello everybody,

On Tue, Sep 07, 2021 at 09:17:31PM +0200, Geert Uytterhoeven wrote:
> On Tue, Sep 7, 2021 at 8:45 PM Rob Herring wrote:
> > On Mon, Sep 06, 2021 at 10:13:07AM +0200, Geert Uytterhoeven wrote:
> > > On Thu, Sep 2, 2021 at 1:39 AM Kieran Bingham wrote:
> > > > From: Kieran Bingham 
> > > >
> > > > Extend the Renesas DU display bindings to support the r8a779a0 V3U.
> > > >
> > > > Reviewed-by: Laurent Pinchart 
> > > > Signed-off-by: Kieran Bingham 
> > > > ---
> > > > v2:
> > > >  - Collected Laurent's tag
> > > >  - Remove clock-names requirement
> > > >  - Specify only a single clock
> > >
> > > Thanks for the update!
> > >
> > > > --- a/Documentation/devicetree/bindings/display/renesas,du.yaml
> > > > +++ b/Documentation/devicetree/bindings/display/renesas,du.yaml
> > > > @@ -39,6 +39,7 @@ properties:
> > > >- renesas,du-r8a77980 # for R-Car V3H compatible DU
> > > >- renesas,du-r8a77990 # for R-Car E3 compatible DU
> > > >- renesas,du-r8a77995 # for R-Car D3 compatible DU
> > > > +  - renesas,du-r8a779a0 # for R-Car V3U compatible DU
> > > >
> > > >reg:
> > > >  maxItems: 1
> > > > @@ -773,6 +774,55 @@ allOf:
> > > >  - reset-names
> > > >  - renesas,vsps
> > > >
> > > > +  - if:
> > > > +  properties:
> > > > +compatible:
> > > > +  contains:
> > > > +enum:
> > > > +  - renesas,du-r8a779a0
> > > > +then:
> > > > +  properties:
> > > > +clocks:
> > > > +  items:
> > > > +- description: Functional clock
> > > > +
> > > > +clock-names:
> > > > +  maxItems: 1
> > > > +  items:
> > > > +- const: du
> > > > +
> > > > +interrupts:
> > > > +  maxItems: 2
> > > > +
> > > > +resets:
> > > > +  maxItems: 1
> > > > +
> > > > +reset-names:
> > > > +  items:
> > > > +- const: du.0
> > >
> > > This is now inconsistent with clock-names, which doesn't use a suffix.
> >
> > But it is consistent with all the other cases of 'reset-names'. The
> > problem is 'clock-names' is not consistent and should be 'du.0'.
> 
> True.

Looks fine to me. The only other SoC that has a similar shared clock
architecture is H1 (R8A7779), and we use du.0 there.

> > Ideally, the if/them schemas should not be defining the names. That
> > should be at the top level and the if/them schema just limits the number
> > of entries. That's not always possible, but I think is for clocks and
> > resets in this case.
> 
> It's a bit tricky.
> For clocks, there's usually one clock per channel, but not always.
> Plus clocks for external inputs, if present.

Yes, it's mostly the external clocks that mess things up here. Each DU
channel typically has one internal clock and one optional external
clock, but not always.

> For resets, there's one reset for a group of channels, with the number
> of channels in a group depending on the SoC family.
> And then there's the special casing for SoCs where there's a gap in
> the channel numbering...

For resets, H1 and M3-N are indeed special cases. H1 has no reset-names,
while M3-N has du.0 and du.3 due to a gap in hardware channel numbering.
All other SoCs have du.0 and optionally du.2.

> Still wondering if it would be better to have one device node per
> channel, and companion links...

The hardware design would make that too messy. There are too many
cross-channel dependencies.

-- 
Regards,

Laurent Pinchart
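Assembling the constraints discussed above (a single functional clock named "du", two interrupts, one reset named "du.0" covering both channels), a hypothetical node for the V3U could look as follows — the unit address, interrupt numbers, and clock/reset indices are placeholders for illustration, not values from the R8A779A0 datasheet:

```dts
du: display@feb00000 {
	compatible = "renesas,du-r8a779a0";
	reg = <0 0xfeb00000 0 0x40000>;			/* placeholder address/size */
	interrupts = <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>,	/* placeholder SPIs */
		     <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>;
	clocks = <&cpg CPG_MOD 411>;			/* single functional clock */
	clock-names = "du";
	resets = <&cpg 411>;				/* one reset for the group */
	reset-names = "du.0";
};
```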


Re: [PATCH v2 5/5] drm: rcar-du: Add r8a779a0 device support

2021-09-22 Thread Laurent Pinchart
Hi Kieran,

Thank you for the patch.

On Thu, Sep 02, 2021 at 12:49:07AM +0100, Kieran Bingham wrote:
> From: Kieran Bingham 
> 
> Extend the rcar_du_device_info structure and rcar_du_output enum to
> support DSI outputs and utilise these additions to provide support for
> the R8A779A0 V3U platform.
> 
> While the DIDSR register field is now named "DSI/CSI-2-TX-IF0 Dot Clock
> Select" the existing define LVDS0 is used, and is directly compatible
> from other DU variants.
> 
> Signed-off-by: Kieran Bingham 
> 
> ---
> 
> I can add a macro named DIDSR_LDCS_DSI0 duplicating DIDSR_LDCS_LVDS0 if
> it's deemed better.

I think I'd like that a bit better if you don't mind. I'd name the macro
DIDSR_LDCS_DSI though, as there's a single option (you can't pick one
DSI encoder or the other as the clock source, it's DSI0 for DU0 and DSI1
for DU1).

> 
> v2:
>  - No longer requires a direct interface with the DSI encoder
>  - Use correct field naming (LDCS)
>  - Remove per-crtc clock feature.
> 
>  drivers/gpu/drm/rcar-du/rcar_du_crtc.h  |  2 ++
>  drivers/gpu/drm/rcar-du/rcar_du_drv.c   | 20 
>  drivers/gpu/drm/rcar-du/rcar_du_drv.h   |  2 ++
>  drivers/gpu/drm/rcar-du/rcar_du_group.c |  2 ++
>  4 files changed, 26 insertions(+)
> 
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h 
> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> index 440e6b4fbb58..26e79b74898c 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> @@ -96,6 +96,8 @@ struct rcar_du_crtc_state {
>  enum rcar_du_output {
>   RCAR_DU_OUTPUT_DPAD0,
>   RCAR_DU_OUTPUT_DPAD1,
> + RCAR_DU_OUTPUT_DSI0,
> + RCAR_DU_OUTPUT_DSI1,
>   RCAR_DU_OUTPUT_HDMI0,
>   RCAR_DU_OUTPUT_HDMI1,
>   RCAR_DU_OUTPUT_LVDS0,
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> index 8a094d5b9c77..8b4c8851b6bc 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> @@ -489,6 +489,25 @@ static const struct rcar_du_device_info 
> rcar_du_r8a7799x_info = {
>   .lvds_clk_mask =  BIT(1) | BIT(0),
>  };
>  
> +static const struct rcar_du_device_info rcar_du_r8a779a0_info = {
> + .gen = 3,
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_VSP1_SOURCE,
> + .channels_mask = BIT(1) | BIT(0),
> + .routes = {
> + /* R8A779A0 has two MIPI DSI outputs. */
> + [RCAR_DU_OUTPUT_DSI0] = {
> + .possible_crtcs = BIT(0),
> + .port = 0,
> + },
> + [RCAR_DU_OUTPUT_DSI1] = {
> + .possible_crtcs = BIT(1),
> + .port = 1,
> + },
> + },
> + .dsi_clk_mask =  BIT(1) | BIT(0),
> +};
> +
>  static const struct of_device_id rcar_du_of_table[] = {
>   { .compatible = "renesas,du-r8a7742", .data = _du_r8a7790_info },
>   { .compatible = "renesas,du-r8a7743", .data = _du_r8a7743_info },
> @@ -513,6 +532,7 @@ static const struct of_device_id rcar_du_of_table[] = {
>   { .compatible = "renesas,du-r8a77980", .data = _du_r8a77970_info },
>   { .compatible = "renesas,du-r8a77990", .data = _du_r8a7799x_info },
>   { .compatible = "renesas,du-r8a77995", .data = _du_r8a7799x_info },
> + { .compatible = "renesas,du-r8a779a0", .data = _du_r8a779a0_info },

While this looks good, the DT bindings need a v3, so I can't include
this series in a pull request just yet :-( Could you please group the DT
bindings and driver patches in a single series for v3 ?

>   { }
>  };
>  
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.h 
> b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
> index 5fe9152454ff..cf98d43d72d0 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.h
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
> @@ -57,6 +57,7 @@ struct rcar_du_output_routing {
>   * @routes: array of CRTC to output routes, indexed by output 
> (RCAR_DU_OUTPUT_*)
>   * @num_lvds: number of internal LVDS encoders
>   * @dpll_mask: bit mask of DU channels equipped with a DPLL
> + * @dsi_clk_mask: bitmask of channels that can use the DSI clock as dot clock
>   * @lvds_clk_mask: bitmask of channels that can use the LVDS clock as dot 
> clock
>   */
>  struct rcar_du_device_info {
> @@ -67,6 +68,7 @@ struct rcar_du_device_info {
>   struct rcar_du_output_routing routes[RCAR_DU_OUTPUT_MAX];
>   unsigned int num_lvds;
>   unsigned int dpll_mask;
> + unsigned int dsi_clk_mask;
>   unsigned int lvds_clk_mask;
>  };
>  
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_group.c
> index a984eef265d2..27c912bab76e 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
> @@ -124,6 +124,8 @@ static void rcar_du_group_setup_didsr(struct 
> rcar_du_group *rgrp)
>   if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index))
>   didsr |= 

Re: [PATCH v2] drm: rcar-du: Allow importing non-contiguous dma-buf with VSP

2021-09-22 Thread Kieran Bingham
On 30/07/2021 03:05, Laurent Pinchart wrote:
> On R-Car Gen3, the DU uses a separate IP core named VSP to perform DMA
> from memory and composition of planes. The DU hardware then only handles
> the video timings and the interface with the encoders. This differs from
> Gen2, where the DU included a composer with DMA engines.
> 
> When sourcing from the VSP, the DU hardware performs no memory access,
> and thus has no requirements on imported dma-buf memory types. The GEM
> CMA helpers however still create a DMA mapping to the DU device, which
> isn't used. The mapping to the VSP is done when processing the atomic
> commits, in the plane .prepare_fb() handler.
> 
> When the system uses an IOMMU, the VSP device is attached to it, which
> enables the VSP to use non physically contiguous memory. The DU, as it
> performs no memory access, isn't connected to the IOMMU. The GEM CMA
> drm_gem_cma_prime_import_sg_table() helper will in that case fail to map
> non-contiguous imported dma-bufs, as the DMA mapping to the DU device
> will have multiple entries in its sgtable. This prevents using non
> physically contiguous memory for display.
> 
> The DRM PRIME and GEM CMA helpers are designed to create the sgtable
> when the dma-buf is imported. By default, the device referenced by the
> drm_device is used to create the dma-buf attachment. Drivers can use a
> different device by using the drm_gem_prime_import_dev() function. While
> the DU has access to the VSP device, this won't help here, as different
> CRTCs use different VSP instances, connected to different IOMMU
> channels. The driver doesn't know at import time which CRTC a GEM object
> will be used with, and thus can't select the right VSP device to pass to
> drm_gem_prime_import_dev().
> 
> To support non-contiguous memory, implement a custom
> .gem_prime_import_sg_table() operation that accepts all imported dma-buf
> regardless of the number of scatterlist entries. The sgtable will be
> mapped to the VSP at .prepare_fb() time, which will reject the
> framebuffer if the VSP isn't connected to an IOMMU.


Wow - quite a lot to digest there.


> Signed-off-by: Laurent Pinchart 
> ---
> 
> This can arguably be considered as a bit of a hack, as the GEM PRIME
> import still maps the dma-buf attachment to the DU, which isn't
> necessary. This is however not a big issue, as the DU isn't connected to
> any IOMMU, the DMA mapping thus doesn't waste any resource such as I/O
> memory space. Avoiding the mapping creation would require replacing the
> helpers completely, resulting in lots of code duplication. If this type
> of hardware setup was common we could create another set of helpers, but
> I don't think it would be worth it to support a single device.
> 
> I have tested this patch with the cam application from libcamera, on a
> R-Car H3 ES2.x Salvator-XS board, importing buffers from the vimc
> driver:
> 
> cam -c 'platform/vimc.0 Sensor B' \
>   -s pixelformat=BGR888,width=1440,height=900 \
>   -C -D HDMI-A-1

Are VIMC buffers always physically non-contiguous to validate this test?


> A set of patches to add DRM/KMS output support to cam has been posted.
> Until it gets merged (hopefully soon), it can be found at [1].
> 
> As cam doesn't support DRM/KMS scaling and overlay planes yet, the
> camera resolution needs to match the display resolution. Due to a
> peculiarity of the vimc driver, the resolution has to be divisible by 3,
> which may require changes to the resolution above depending on your
> monitor.
> 
> A test patch is also needed for the kernel, to enable IOMMU support for
> the VSP, which isn't done by default (yet ?) in mainline. I have pushed
> a branch to [2] if anyone is interested.
> 
> [1] 
> https://lists.libcamera.org/pipermail/libcamera-devel/2021-July/022815.html
> [2] git://linuxtv.org/pinchartl/media.git drm/du/devel/gem/contig
> 
> ---
> Changes since v1:
> 
> - Rewrote commit message to explain issue in more details
> - Duplicate the imported scatter gather table in
>   rcar_du_vsp_plane_prepare_fb()
> - Use separate loop counter j to avoid overwriting i
> - Update to latest drm_gem_cma API
> ---
>  drivers/gpu/drm/rcar-du/rcar_du_drv.c |  6 +++-
>  drivers/gpu/drm/rcar-du/rcar_du_kms.c | 49 +++
>  drivers/gpu/drm/rcar-du/rcar_du_kms.h |  7 
>  drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 36 +---
>  4 files changed, 92 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> index cb34b1e477bc..d1f8d51a10fe 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> @@ -511,7 +511,11 @@ DEFINE_DRM_GEM_CMA_FOPS(rcar_du_fops);
>  
>  static const struct drm_driver rcar_du_driver = {
>   .driver_features= DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
> - DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE(rcar_du_dumb_create),
> + .dumb_create= rcar_du_dumb_create,
> 

Re: [Intel-gfx] [PATCH] drm/i915: fix blank screen booting crashes

2021-09-22 Thread Lucas De Marchi

On Tue, Sep 21, 2021 at 06:40:41PM -0700, Matthew Brost wrote:

On Tue, Sep 21, 2021 at 04:29:31PM -0700, Lucas De Marchi wrote:

On Tue, Sep 21, 2021 at 03:55:15PM -0700, Matthew Brost wrote:
> On Tue, Sep 21, 2021 at 11:46:37AM -0700, Lucas De Marchi wrote:
> > On Tue, Sep 21, 2021 at 10:43:32AM -0700, Matthew Brost wrote:
> > > From: Hugh Dickins 
> > >
> > > 5.15-rc1 crashes with blank screen when booting up on two ThinkPads
> > > using i915.  Bisections converge convincingly, but arrive at different
> > > and surprising "culprits", none of them the actual culprit.
> > >
> > > netconsole (with init_netconsole() hacked to call i915_init() when
> > > logging has started, instead of by module_init()) tells the story:
> > >
> > > kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
> > > with RSI: 814d408b pointing to sw_fence_dummy_notify().
> > > I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that
> > > function needs to be 4-byte aligned.
> > >
> > > v2:
> > > (Jani Nikula)
> > >  - Change BUG_ON to WARN_ON
> > > v3:
> > > (Jani / Tvrtko)
> > >  - Short circuit __i915_sw_fence_init on WARN_ON
> > >
> > > Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> > > Signed-off-by: Hugh Dickins 
> > > Signed-off-by: Matthew Brost 
> > > Reviewed-by: Matthew Brost 
> > > ---
> > > drivers/gpu/drm/i915/gt/intel_context.c |  4 ++--
> > > drivers/gpu/drm/i915/i915_sw_fence.c| 17 ++---
> > > 2 files changed, 12 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
> > > index ff637147b1a9..e7f78bc7ebfc 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > > @@ -362,8 +362,8 @@ static int __intel_context_active(struct i915_active 
*active)
> > >  return 0;
> > > }
> > >
> >
> > > -static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
> > > - enum i915_sw_fence_notify state)
> > > +static int __i915_sw_fence_call
> > > +sw_fence_dummy_notify(struct i915_sw_fence *sf, enum 
i915_sw_fence_notify state)
> > > {
> > >  return NOTIFY_DONE;
> > > }
> > > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
> > > index c589a681da77..08cea73264e7 100644
> > > --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> > > +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> > > @@ -13,9 +13,9 @@
> > > #include "i915_selftest.h"
> > >
> > > #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
> > > -#define I915_SW_FENCE_BUG_ON(expr) BUG_ON(expr)
> > > +#define I915_SW_FENCE_WARN_ON(expr) WARN_ON(expr)
> > > #else
> > > -#define I915_SW_FENCE_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
> > > +#define I915_SW_FENCE_WARN_ON(expr) BUILD_BUG_ON_INVALID(expr)
> > > #endif
> > >
> > > static DEFINE_SPINLOCK(i915_sw_fence_lock);
> > > @@ -129,7 +129,10 @@ static int __i915_sw_fence_notify(struct 
i915_sw_fence *fence,
> > >  i915_sw_fence_notify_t fn;
> > >
> > >  fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
> > > -return fn(fence, state);
> > > +if (likely(fn))
> > > +return fn(fence, state);
> > > +else
> > > +return 0;
> >
> > since the knowledge of these being NULL (or having the wrong alignment)
> > is in the init/reinit functions, wouldn't it be better to just add a
> > fence_nop() and assign it there instead of this likely() here?
> >
>
> Maybe? I prefer the way it is.
>
> > > }
> > >
> > > #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
> > > @@ -242,9 +245,9 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
> > >const char *name,
> > >struct lock_class_key *key)
> > > {
> > > -BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
> > > -
> > >  __init_waitqueue_head(>wait, name, key);
> > > +if (WARN_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK))
> > > +return;
> >
> > like:
> >   if (WARN_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK))
> >   fence->flags = (unsigned long)sw_fence_dummy_notify;
> >   else
> >   fence->flags = (unsigned long)fn;
> >
> >
> > If you return here instead of calling i915_sw_fence_reinit(), aren't you
> > just going to use uninitialized memory later? At least in the selftests,
> > which allocate it with kmalloc()... I didn't check others.
> >
>
> I don't think so, maybe the fence won't work but it won't blow up
> either.
>
> >
> > For the bug fix we could just add the __aligned(4) and leave the rest to a
> > separate patch.
> >
>
> The bug was that sw_fence_dummy_notify in gt/intel_context.c was not
> 4-byte aligned, which triggered a BUG_ON during boot and blank screened a
> laptop. Jani / Tvrtko suggested that we change the BUG_ONs to WARN_ONs so
> if someone makes this mistake in the future the kernel should boot, albeit
> with a WARNING.

yes, I understood. But afaics with WARN_ON you 

Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-22 Thread Kirill A. Shutemov
On Wed, Sep 22, 2021 at 09:52:07PM +0200, Borislav Petkov wrote:
> On Wed, Sep 22, 2021 at 05:30:15PM +0300, Kirill A. Shutemov wrote:
> > Not fine, but waiting to blowup with random build environment change.
> 
> Why is it not fine?
> 
> Are you suspecting that the compiler might generate something else and
> not a rip-relative access?

Yes. We had it before for __supported_pte_mask and other users of
fixup_pointer().

See for instance 4a09f0210c8b ("x86/boot/64/clang: Use fixup_pointer() to
access '__supported_pte_mask'")

Unless we find another way to guarantee RIP-relative access, we must use
fixup_pointer() to access any global variables.

-- 
 Kirill A. Shutemov


[PATCH v3 6/6] drm/ingenic: Attach bridge chain to encoders

2021-09-22 Thread Paul Cercueil
Attach a top-level bridge to each encoder, which will be used for
negotiating the bus format and flags.

All the bridges are now attached with DRM_BRIDGE_ATTACH_NO_CONNECTOR.

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 92 +--
 1 file changed, 70 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index a5e2880e07a1..a05a9fa6e115 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -108,6 +109,19 @@ struct ingenic_drm {
struct drm_private_obj private_obj;
 };
 
+struct ingenic_drm_bridge {
+   struct drm_encoder encoder;
+   struct drm_bridge bridge, *next_bridge;
+
+   struct drm_bus_cfg bus_cfg;
+};
+
+static inline struct ingenic_drm_bridge *
+to_ingenic_drm_bridge(struct drm_encoder *encoder)
+{
+   return container_of(encoder, struct ingenic_drm_bridge, encoder);
+}
+
 static inline struct ingenic_drm_private_state *
 to_ingenic_drm_priv_state(struct drm_private_state *state)
 {
@@ -668,11 +682,10 @@ static void ingenic_drm_encoder_atomic_mode_set(struct 
drm_encoder *encoder,
 {
struct ingenic_drm *priv = drm_device_get_priv(encoder->dev);
struct drm_display_mode *mode = _state->adjusted_mode;
-   struct drm_connector *conn = conn_state->connector;
-   struct drm_display_info *info = >display_info;
+   struct ingenic_drm_bridge *bridge = to_ingenic_drm_bridge(encoder);
unsigned int cfg, rgbcfg = 0;
 
-   priv->panel_is_sharp = info->bus_flags & DRM_BUS_FLAG_SHARP_SIGNALS;
+   priv->panel_is_sharp = bridge->bus_cfg.flags & 
DRM_BUS_FLAG_SHARP_SIGNALS;
 
if (priv->panel_is_sharp) {
cfg = JZ_LCD_CFG_MODE_SPECIAL_TFT_1 | JZ_LCD_CFG_REV_POLARITY;
@@ -685,19 +698,19 @@ static void ingenic_drm_encoder_atomic_mode_set(struct 
drm_encoder *encoder,
cfg |= JZ_LCD_CFG_HSYNC_ACTIVE_LOW;
if (mode->flags & DRM_MODE_FLAG_NVSYNC)
cfg |= JZ_LCD_CFG_VSYNC_ACTIVE_LOW;
-   if (info->bus_flags & DRM_BUS_FLAG_DE_LOW)
+   if (bridge->bus_cfg.flags & DRM_BUS_FLAG_DE_LOW)
cfg |= JZ_LCD_CFG_DE_ACTIVE_LOW;
-   if (info->bus_flags & DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE)
+   if (bridge->bus_cfg.flags & DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE)
cfg |= JZ_LCD_CFG_PCLK_FALLING_EDGE;
 
if (!priv->panel_is_sharp) {
-   if (conn->connector_type == DRM_MODE_CONNECTOR_TV) {
+   if (conn_state->connector->connector_type == 
DRM_MODE_CONNECTOR_TV) {
if (mode->flags & DRM_MODE_FLAG_INTERLACE)
cfg |= JZ_LCD_CFG_MODE_TV_OUT_I;
else
cfg |= JZ_LCD_CFG_MODE_TV_OUT_P;
} else {
-   switch (*info->bus_formats) {
+   switch (bridge->bus_cfg.format) {
case MEDIA_BUS_FMT_RGB565_1X16:
cfg |= JZ_LCD_CFG_MODE_GENERIC_16BIT;
break;
@@ -723,20 +736,29 @@ static void ingenic_drm_encoder_atomic_mode_set(struct 
drm_encoder *encoder,
regmap_write(priv->map, JZ_REG_LCD_RGBC, rgbcfg);
 }
 
-static int ingenic_drm_encoder_atomic_check(struct drm_encoder *encoder,
-   struct drm_crtc_state *crtc_state,
-   struct drm_connector_state 
*conn_state)
+static int ingenic_drm_bridge_attach(struct drm_bridge *bridge,
+enum drm_bridge_attach_flags flags)
+{
+   struct ingenic_drm_bridge *ib = to_ingenic_drm_bridge(bridge->encoder);
+
+   return drm_bridge_attach(bridge->encoder, ib->next_bridge,
+>bridge, flags);
+}
+
+static int ingenic_drm_bridge_atomic_check(struct drm_bridge *bridge,
+  struct drm_bridge_state 
*bridge_state,
+  struct drm_crtc_state *crtc_state,
+  struct drm_connector_state 
*conn_state)
 {
-   struct drm_display_info *info = _state->connector->display_info;
struct drm_display_mode *mode = _state->adjusted_mode;
+   struct ingenic_drm_bridge *ib = to_ingenic_drm_bridge(bridge->encoder);
 
-   if (info->num_bus_formats != 1)
-   return -EINVAL;
+   ib->bus_cfg = bridge_state->output_bus_cfg;
 
if (conn_state->connector->connector_type == DRM_MODE_CONNECTOR_TV)
return 0;
 
-   switch (*info->bus_formats) {
+   switch (bridge_state->output_bus_cfg.format) {
case MEDIA_BUS_FMT_RGB888_3X8:
case MEDIA_BUS_FMT_RGB888_3X8_DELTA:
/*
@@ -900,8 +922,16 

[PATCH v3 5/6] drm/ingenic: Upload palette before frame

2021-09-22 Thread Paul Cercueil
When using C8 color mode, make sure that the palette is always uploaded
before a frame; otherwise the very first frame will have wrong colors.

Do that by changing the link order of the DMA descriptors.

v3: Fix ingenic_drm_get_new_priv_state() being called instead of
ingenic_drm_get_priv_state()

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 53 ---
 1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index cbc76cede99e..a5e2880e07a1 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -66,6 +66,7 @@ struct jz_soc_info {
 
 struct ingenic_drm_private_state {
struct drm_private_state base;
+   bool use_palette;
 };
 
 struct ingenic_drm {
@@ -113,6 +114,30 @@ to_ingenic_drm_priv_state(struct drm_private_state *state)
return container_of(state, struct ingenic_drm_private_state, base);
 }
 
+static struct ingenic_drm_private_state *
+ingenic_drm_get_priv_state(struct ingenic_drm *priv, struct drm_atomic_state 
*state)
+{
+   struct drm_private_state *priv_state;
+
+   priv_state = drm_atomic_get_private_obj_state(state, 
>private_obj);
+   if (IS_ERR(priv_state))
+   return ERR_CAST(priv_state);
+
+   return to_ingenic_drm_priv_state(priv_state);
+}
+
+static struct ingenic_drm_private_state *
+ingenic_drm_get_new_priv_state(struct ingenic_drm *priv, struct 
drm_atomic_state *state)
+{
+   struct drm_private_state *priv_state;
+
+   priv_state = drm_atomic_get_new_private_obj_state(state, 
>private_obj);
+   if (!priv_state)
+   return NULL;
+
+   return to_ingenic_drm_priv_state(priv_state);
+}
+
 static bool ingenic_drm_writeable_reg(struct device *dev, unsigned int reg)
 {
switch (reg) {
@@ -183,11 +208,18 @@ static void ingenic_drm_crtc_atomic_enable(struct 
drm_crtc *crtc,
   struct drm_atomic_state *state)
 {
struct ingenic_drm *priv = drm_crtc_get_priv(crtc);
+   struct ingenic_drm_private_state *priv_state;
+   unsigned int next_id;
+
+   priv_state = ingenic_drm_get_priv_state(priv, state);
+   if (WARN_ON(IS_ERR(priv_state)))
+   return;
 
regmap_write(priv->map, JZ_REG_LCD_STATE, 0);
 
-   /* Set address of our DMA descriptor chain */
-   regmap_write(priv->map, JZ_REG_LCD_DA0, dma_hwdesc_addr(priv, 0));
+   /* Set addresses of our DMA descriptor chains */
+   next_id = priv_state->use_palette ? HWDESC_PALETTE : 0;
+   regmap_write(priv->map, JZ_REG_LCD_DA0, dma_hwdesc_addr(priv, next_id));
regmap_write(priv->map, JZ_REG_LCD_DA1, dma_hwdesc_addr(priv, 1));
 
regmap_update_bits(priv->map, JZ_REG_LCD_CTRL,
@@ -393,6 +425,7 @@ static int ingenic_drm_plane_atomic_check(struct drm_plane 
*plane,
struct drm_plane_state *new_plane_state = 
drm_atomic_get_new_plane_state(state,

 plane);
struct ingenic_drm *priv = drm_device_get_priv(plane->dev);
+   struct ingenic_drm_private_state *priv_state;
struct drm_crtc_state *crtc_state;
struct drm_crtc *crtc = new_plane_state->crtc ?: old_plane_state->crtc;
int ret;
@@ -405,6 +438,10 @@ static int ingenic_drm_plane_atomic_check(struct drm_plane 
*plane,
if (WARN_ON(!crtc_state))
return -EINVAL;
 
+   priv_state = ingenic_drm_get_priv_state(priv, state);
+   if (IS_ERR(priv_state))
+   return PTR_ERR(priv_state);
+
ret = drm_atomic_helper_check_plane_state(new_plane_state, crtc_state,
  DRM_PLANE_HELPER_NO_SCALING,
  DRM_PLANE_HELPER_NO_SCALING,
@@ -423,6 +460,9 @@ static int ingenic_drm_plane_atomic_check(struct drm_plane 
*plane,
 (new_plane_state->src_h >> 16) != new_plane_state->crtc_h))
return -EINVAL;
 
+   priv_state->use_palette = new_plane_state->fb &&
+   new_plane_state->fb->format->format == DRM_FORMAT_C8;
+
/*
 * Require full modeset if enabling or disabling a plane, or changing
 * its position, size or depth.
@@ -583,6 +623,7 @@ static void ingenic_drm_plane_atomic_update(struct 
drm_plane *plane,
struct drm_plane_state *newstate = 
drm_atomic_get_new_plane_state(state, plane);
struct drm_plane_state *oldstate = 
drm_atomic_get_old_plane_state(state, plane);
unsigned int width, height, cpp, next_id, plane_id;
+   struct ingenic_drm_private_state *priv_state;
struct drm_crtc_state *crtc_state;
struct ingenic_dma_hwdesc *hwdesc;
dma_addr_t addr;
@@ -600,19 +641,19 @@ static void ingenic_drm_plane_atomic_update(struct 
drm_plane *plane,
height = 

[PATCH v3 4/6] drm/ingenic: Set DMA descriptor chain register when starting CRTC

2021-09-22 Thread Paul Cercueil
Setting the DMA descriptor chain register in the probe function has been
fine until now, because we only ever had one descriptor per foreground.

As the driver will soon have real descriptor chains, and the DMA
descriptor chain register updates itself to point to the current
descriptor being processed, this register needs to be reset after a full
modeset to point to the first descriptor of the chain.

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 5dbeca0f8f37..cbc76cede99e 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -186,6 +186,10 @@ static void ingenic_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
 
regmap_write(priv->map, JZ_REG_LCD_STATE, 0);
 
+   /* Set address of our DMA descriptor chain */
+   regmap_write(priv->map, JZ_REG_LCD_DA0, dma_hwdesc_addr(priv, 0));
+   regmap_write(priv->map, JZ_REG_LCD_DA1, dma_hwdesc_addr(priv, 1));
+
regmap_update_bits(priv->map, JZ_REG_LCD_CTRL,
   JZ_LCD_CTRL_ENABLE | JZ_LCD_CTRL_DISABLE,
   JZ_LCD_CTRL_ENABLE);
-- 
2.33.0



[PATCH v3 3/6] drm/ingenic: Move IPU scale settings to private state

2021-09-22 Thread Paul Cercueil
The IPU scaling information is computed in the plane's ".atomic_check"
callback, and used in the ".atomic_update" callback. As such, it is
state-specific, and should be moved to a private state structure.

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-ipu.c | 73 ---
 1 file changed, 54 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c 
b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index c819293b8317..2737fc521e15 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -47,6 +47,8 @@ struct soc_info {
 
 struct ingenic_ipu_private_state {
struct drm_private_state base;
+
+   unsigned int num_w, num_h, denom_w, denom_h;
 };
 
 struct ingenic_ipu {
@@ -58,8 +60,6 @@ struct ingenic_ipu {
const struct soc_info *soc_info;
bool clk_enabled;
 
-   unsigned int num_w, num_h, denom_w, denom_h;
-
dma_addr_t addr_y, addr_u, addr_v;
 
struct drm_property *sharpness_prop;
@@ -85,6 +85,30 @@ to_ingenic_ipu_priv_state(struct drm_private_state *state)
return container_of(state, struct ingenic_ipu_private_state, base);
 }
 
+static struct ingenic_ipu_private_state *
+ingenic_ipu_get_priv_state(struct ingenic_ipu *priv, struct drm_atomic_state 
*state)
+{
+   struct drm_private_state *priv_state;
+
+   priv_state = drm_atomic_get_private_obj_state(state, 
>private_obj);
+   if (IS_ERR(priv_state))
+   return ERR_CAST(priv_state);
+
+   return to_ingenic_ipu_priv_state(priv_state);
+}
+
+static struct ingenic_ipu_private_state *
+ingenic_ipu_get_new_priv_state(struct ingenic_ipu *priv, struct 
drm_atomic_state *state)
+{
+   struct drm_private_state *priv_state;
+
+   priv_state = drm_atomic_get_new_private_obj_state(state, 
>private_obj);
+   if (!priv_state)
+   return NULL;
+
+   return to_ingenic_ipu_priv_state(priv_state);
+}
+
 /*
  * Apply conventional cubic convolution kernel. Both parameters
  *  and return value are 15.16 signed fixed-point.
@@ -305,11 +329,16 @@ static void ingenic_ipu_plane_atomic_update(struct 
drm_plane *plane,
const struct drm_format_info *finfo;
u32 ctrl, stride = 0, coef_index = 0, format = 0;
bool needs_modeset, upscaling_w, upscaling_h;
+   struct ingenic_ipu_private_state *ipu_state;
int err;
 
if (!newstate || !newstate->fb)
return;
 
+   ipu_state = ingenic_ipu_get_new_priv_state(ipu, state);
+   if (WARN_ON(!ipu_state))
+   return;
+
finfo = drm_format_info(newstate->fb->format->format);
 
if (!ipu->clk_enabled) {
@@ -482,27 +511,27 @@ static void ingenic_ipu_plane_atomic_update(struct 
drm_plane *plane,
if (ipu->soc_info->has_bicubic)
ctrl |= JZ_IPU_CTRL_ZOOM_SEL;
 
-   upscaling_w = ipu->num_w > ipu->denom_w;
+   upscaling_w = ipu_state->num_w > ipu_state->denom_w;
if (upscaling_w)
ctrl |= JZ_IPU_CTRL_HSCALE;
 
-   if (ipu->num_w != 1 || ipu->denom_w != 1) {
+   if (ipu_state->num_w != 1 || ipu_state->denom_w != 1) {
if (!ipu->soc_info->has_bicubic && !upscaling_w)
-   coef_index |= (ipu->denom_w - 1) << 16;
+   coef_index |= (ipu_state->denom_w - 1) << 16;
else
-   coef_index |= (ipu->num_w - 1) << 16;
+   coef_index |= (ipu_state->num_w - 1) << 16;
ctrl |= JZ_IPU_CTRL_HRSZ_EN;
}
 
-   upscaling_h = ipu->num_h > ipu->denom_h;
+   upscaling_h = ipu_state->num_h > ipu_state->denom_h;
if (upscaling_h)
ctrl |= JZ_IPU_CTRL_VSCALE;
 
-   if (ipu->num_h != 1 || ipu->denom_h != 1) {
+   if (ipu_state->num_h != 1 || ipu_state->denom_h != 1) {
if (!ipu->soc_info->has_bicubic && !upscaling_h)
-   coef_index |= ipu->denom_h - 1;
+   coef_index |= ipu_state->denom_h - 1;
else
-   coef_index |= ipu->num_h - 1;
+   coef_index |= ipu_state->num_h - 1;
ctrl |= JZ_IPU_CTRL_VRSZ_EN;
}
 
@@ -513,13 +542,13 @@ static void ingenic_ipu_plane_atomic_update(struct 
drm_plane *plane,
/* Set the LUT index register */
regmap_write(ipu->map, JZ_REG_IPU_RSZ_COEF_INDEX, coef_index);
 
-   if (ipu->num_w != 1 || ipu->denom_w != 1)
+   if (ipu_state->num_w != 1 || ipu_state->denom_w != 1)
ingenic_ipu_set_coefs(ipu, JZ_REG_IPU_HRSZ_COEF_LUT,
- ipu->num_w, ipu->denom_w);
+ ipu_state->num_w, ipu_state->denom_w);
 
-   if (ipu->num_h != 1 || ipu->denom_h != 1)
+   if (ipu_state->num_h != 1 || ipu_state->denom_h != 1)
ingenic_ipu_set_coefs(ipu, JZ_REG_IPU_VRSZ_COEF_LUT,
-

[PATCH v3 2/6] drm/ingenic: Add support for private objects

2021-09-22 Thread Paul Cercueil
Until now, the ingenic-drm as well as the ingenic-ipu drivers used to
put state-specific information in their respective private structure.

Add boilerplate code to support private objects in the two drivers, so
that state-specific information can be put in the state-specific private
structure.

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 61 +++
 drivers/gpu/drm/ingenic/ingenic-ipu.c | 54 
 2 files changed, 115 insertions(+)

diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 95c12c2aba14..5dbeca0f8f37 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -64,6 +64,10 @@ struct jz_soc_info {
unsigned int num_formats_f0, num_formats_f1;
 };
 
+struct ingenic_drm_private_state {
+   struct drm_private_state base;
+};
+
 struct ingenic_drm {
struct drm_device drm;
/*
@@ -99,8 +103,16 @@ struct ingenic_drm {
struct mutex clk_mutex;
bool update_clk_rate;
struct notifier_block clock_nb;
+
+   struct drm_private_obj private_obj;
 };
 
+static inline struct ingenic_drm_private_state *
+to_ingenic_drm_priv_state(struct drm_private_state *state)
+{
+   return container_of(state, struct ingenic_drm_private_state, base);
+}
+
 static bool ingenic_drm_writeable_reg(struct device *dev, unsigned int reg)
 {
switch (reg) {
@@ -766,6 +778,28 @@ ingenic_drm_gem_create_object(struct drm_device *drm, 
size_t size)
return >base;
 }
 
+static struct drm_private_state *
+ingenic_drm_duplicate_state(struct drm_private_obj *obj)
+{
+   struct ingenic_drm_private_state *state = 
to_ingenic_drm_priv_state(obj->state);
+
+   state = kmemdup(state, sizeof(*state), GFP_KERNEL);
+   if (!state)
+   return NULL;
+
+   __drm_atomic_helper_private_obj_duplicate_state(obj, >base);
+
+   return >base;
+}
+
+static void ingenic_drm_destroy_state(struct drm_private_obj *obj,
+ struct drm_private_state *state)
+{
+   struct ingenic_drm_private_state *priv_state = 
to_ingenic_drm_priv_state(state);
+
+   kfree(priv_state);
+}
+
 DEFINE_DRM_GEM_CMA_FOPS(ingenic_drm_fops);
 
 static const struct drm_driver ingenic_drm_driver_data = {
@@ -836,6 +870,11 @@ static struct drm_mode_config_helper_funcs 
ingenic_drm_mode_config_helpers = {
.atomic_commit_tail = drm_atomic_helper_commit_tail,
 };
 
+static const struct drm_private_state_funcs ingenic_drm_private_state_funcs = {
+   .atomic_duplicate_state = ingenic_drm_duplicate_state,
+   .atomic_destroy_state = ingenic_drm_destroy_state,
+};
+
 static void ingenic_drm_unbind_all(void *d)
 {
struct ingenic_drm *priv = d;
@@ -877,9 +916,15 @@ static void ingenic_drm_configure_hwdesc_plane(struct 
ingenic_drm *priv,
ingenic_drm_configure_hwdesc(priv, plane, plane, 0xf0 | plane);
 }
 
+static void ingenic_drm_atomic_private_obj_fini(struct drm_device *drm, void 
*private_obj)
+{
+   drm_atomic_private_obj_fini(private_obj);
+}
+
 static int ingenic_drm_bind(struct device *dev, bool has_components)
 {
struct platform_device *pdev = to_platform_device(dev);
+   struct ingenic_drm_private_state *private_state;
const struct jz_soc_info *soc_info;
struct ingenic_drm *priv;
struct clk *parent_clk;
@@ -1148,6 +1193,20 @@ static int ingenic_drm_bind(struct device *dev, bool 
has_components)
goto err_devclk_disable;
}
 
+   private_state = kzalloc(sizeof(*private_state), GFP_KERNEL);
+   if (!private_state) {
+   ret = -ENOMEM;
+   goto err_clk_notifier_unregister;
+   }
+
+   drm_atomic_private_obj_init(drm, &priv->private_obj,
+   &private_state->base,
+   &ingenic_drm_private_state_funcs);
+
+   ret = drmm_add_action_or_reset(drm, ingenic_drm_atomic_private_obj_fini,
+  &priv->private_obj);
+   if (ret)
+   goto err_private_state_free;
+
ret = drm_dev_register(drm, 0);
if (ret) {
dev_err(dev, "Failed to register DRM driver\n");
@@ -1158,6 +1217,8 @@ static int ingenic_drm_bind(struct device *dev, bool 
has_components)
 
return 0;
 
+err_private_state_free:
+   kfree(private_state);
 err_clk_notifier_unregister:
	clk_notifier_unregister(parent_clk, &priv->clock_nb);
 err_devclk_disable:
diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c 
b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index aeb8a757d213..c819293b8317 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -45,6 +45,10 @@ struct soc_info {
  unsigned int weight, unsigned int offset);
 };
 
+struct ingenic_ipu_private_state {
+   struct drm_private_state base;
+};
+
 struct ingenic_ipu {
struct drm_plane plane;

[PATCH v3 1/6] drm/ingenic: Simplify code by using hwdescs array

2021-09-22 Thread Paul Cercueil
Instead of having one 'hwdesc' variable for the plane #0, one for the
plane #1 and one for the palette, use a 'hwdesc[3]' array, where the
DMA hardware descriptors are indexed by the plane's number.

v2: dma_hwdesc_addr() extended to support palette hwdesc. The palette
hwdesc is now hwdesc[3] to simplify things. Add
ingenic_drm_configure_hwdesc*() functions to factorize code.

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 78 ++-
 1 file changed, 48 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index a5df1c8d34cd..95c12c2aba14 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -41,6 +41,8 @@
 #include 
 #include 
 
+#define HWDESC_PALETTE 2
+
 struct ingenic_dma_hwdesc {
u32 next;
u32 addr;
@@ -49,9 +51,7 @@ struct ingenic_dma_hwdesc {
 } __aligned(16);
 
 struct ingenic_dma_hwdescs {
-   struct ingenic_dma_hwdesc hwdesc_f0;
-   struct ingenic_dma_hwdesc hwdesc_f1;
-   struct ingenic_dma_hwdesc hwdesc_pal;
+   struct ingenic_dma_hwdesc hwdesc[3];
u16 palette[256] __aligned(16);
 };
 
@@ -141,6 +141,14 @@ static inline struct ingenic_drm *drm_nb_get_priv(struct 
notifier_block *nb)
return container_of(nb, struct ingenic_drm, clock_nb);
 }
 
+static inline dma_addr_t dma_hwdesc_addr(const struct ingenic_drm *priv,
+unsigned int idx)
+{
+   u32 offset = offsetof(struct ingenic_dma_hwdescs, hwdesc[idx]);
+
+   return priv->dma_hwdescs_phys + offset;
+}
+
 static int ingenic_drm_update_pixclk(struct notifier_block *nb,
 unsigned long action,
 void *data)
@@ -558,9 +566,9 @@ static void ingenic_drm_plane_atomic_update(struct 
drm_plane *plane,
struct ingenic_drm *priv = drm_device_get_priv(plane->dev);
struct drm_plane_state *newstate = 
drm_atomic_get_new_plane_state(state, plane);
struct drm_plane_state *oldstate = 
drm_atomic_get_old_plane_state(state, plane);
+   unsigned int width, height, cpp, next_id, plane_id;
struct drm_crtc_state *crtc_state;
struct ingenic_dma_hwdesc *hwdesc;
-   unsigned int width, height, cpp, offset;
dma_addr_t addr;
u32 fourcc;
 
@@ -569,16 +577,14 @@ static void ingenic_drm_plane_atomic_update(struct 
drm_plane *plane,
drm_fb_cma_sync_non_coherent(&priv->drm, oldstate, 
newstate);
 
crtc_state = newstate->crtc->state;
+   plane_id = !!(priv->soc_info->has_osd && plane != &priv->f0);
 
addr = drm_fb_cma_get_gem_addr(newstate->fb, newstate, 0);
width = newstate->src_w >> 16;
height = newstate->src_h >> 16;
cpp = newstate->fb->format->cpp[0];
 
-   if (!priv->soc_info->has_osd || plane == &priv->f0)
-   hwdesc = &priv->dma_hwdescs->hwdesc_f0;
-   else
-   hwdesc = &priv->dma_hwdescs->hwdesc_f1;
+   hwdesc = &priv->dma_hwdescs->hwdesc[plane_id];
 
hwdesc->addr = addr;
hwdesc->cmd = JZ_LCD_CMD_EOF_IRQ | (width * height * cpp / 4);
@@ -588,12 +594,8 @@ static void ingenic_drm_plane_atomic_update(struct 
drm_plane *plane,
 
ingenic_drm_plane_config(priv->dev, plane, fourcc);
 
-   if (fourcc == DRM_FORMAT_C8)
-   offset = offsetof(struct ingenic_dma_hwdescs, 
hwdesc_pal);
-   else
-   offset = offsetof(struct ingenic_dma_hwdescs, 
hwdesc_f0);
-
-   priv->dma_hwdescs->hwdesc_f0.next = 
priv->dma_hwdescs_phys + offset;
+   next_id = fourcc == DRM_FORMAT_C8 ? HWDESC_PALETTE : 0;
+   priv->dma_hwdescs->hwdesc[0].next = 
dma_hwdesc_addr(priv, next_id);
 
crtc_state->color_mgmt_changed = fourcc == 
DRM_FORMAT_C8;
}
@@ -846,6 +848,35 @@ static void __maybe_unused ingenic_drm_release_rmem(void 
*d)
of_reserved_mem_device_release(d);
 }
 
+static void ingenic_drm_configure_hwdesc(struct ingenic_drm *priv,
+unsigned int hwdesc,
+unsigned int next_hwdesc, u32 id)
+{
+   struct ingenic_dma_hwdesc *desc = &priv->dma_hwdescs->hwdesc[hwdesc];
+
+   desc->next = dma_hwdesc_addr(priv, next_hwdesc);
+   desc->id = id;
+}
+
+static void ingenic_drm_configure_hwdesc_palette(struct ingenic_drm *priv)
+{
+   struct ingenic_dma_hwdesc *desc;
+
+   ingenic_drm_configure_hwdesc(priv, HWDESC_PALETTE, 0, 0xc0);
+
+   desc = &priv->dma_hwdescs->hwdesc[HWDESC_PALETTE];
+   desc->addr = priv->dma_hwdescs_phys
+   + offsetof(struct ingenic_dma_hwdescs, palette);
+   
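The descriptor-addressing trick introduced above (a hwdesc[] array plus the dma_hwdesc_addr() helper built on offsetof()) can be sketched outside the kernel as follows. This is a simplified stand-in, not the driver code: the struct layout is trimmed down and dma_hwdescs_phys is passed as a plain integer instead of living in struct ingenic_drm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified copies of the structures from the patch above (the real
 * ones are 16-byte aligned; four u32 fields give that size anyway). */
struct dma_hwdesc {
	uint32_t next, addr, id, cmd;
};

struct dma_hwdescs {
	struct dma_hwdesc hwdesc[3];	/* planes 0, 1 and HWDESC_PALETTE (2) */
	uint16_t palette[256];
};

/* Sketch of dma_hwdesc_addr(): the DMA address of descriptor idx is the
 * base DMA address of the whole block plus the offset of hwdesc[idx]
 * inside it, so no per-descriptor address needs to be stored. */
static uintptr_t dma_hwdesc_addr(uintptr_t dma_hwdescs_phys, unsigned int idx)
{
	return dma_hwdescs_phys + offsetof(struct dma_hwdescs, hwdesc[idx]);
}
```

Because all three descriptors live in one allocation, chaining them (as ingenic_drm_configure_hwdesc() does) only ever needs the single base address and an index.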

[PATCH v3 0/6] drm/ingenic: Various improvements v3

2021-09-22 Thread Paul Cercueil
Hi,

A V3 of my patchset for the ingenic-drm driver.

The patches "drm/ingenic: Remove dead code" and
"drm/ingenic: Use standard drm_atomic_helper_commit_tail"
that were present in V1 have been merged in drm-misc-next,
so they are not in this V3.

Changelog since V2:

[PATCH 5/6]:
Fix ingenic_drm_get_new_priv_state() called instead of
ingenic_drm_get_priv_state()

Cheers,
-Paul

Paul Cercueil (6):
  drm/ingenic: Simplify code by using hwdescs array
  drm/ingenic: Add support for private objects
  drm/ingenic: Move IPU scale settings to private state
  drm/ingenic: Set DMA descriptor chain register when starting CRTC
  drm/ingenic: Upload palette before frame
  drm/ingenic: Attach bridge chain to encoders

 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 278 +-
 drivers/gpu/drm/ingenic/ingenic-ipu.c | 127 --
 2 files changed, 333 insertions(+), 72 deletions(-)

-- 
2.33.0



Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Linus Torvalds
On Wed, Sep 22, 2021 at 1:19 PM Sudip Mukherjee
 wrote:
>
> I added some debugs to print the addresses, and I am getting:
> [   38.813809] sudip crtc 
>
> This is from struct drm_crtc *crtc = connector->state->crtc;

Yeah, that was my personal suspicion, because while the line number
implied "crtc->state" being NULL, the drm data structure documentation
and other drivers both imply that "crtc" was the more likely one.

I suspect a simple

if (!crtc)
return;

in vc4_hdmi_set_n_cts() is at least part of the fix for this all, but
I didn't check if there is possibly something else that needs to be
done too.

Linus
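The guard Linus suggests can be sketched in miniature as below. This is not VC4 code: the mock_* structures are invented stand-ins that only mirror the crtc -> state -> adjusted_mode chain from the quoted snippet, and the function returns the computed CTS value (0 when bailing out) so the behavior is observable.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal mocks of the structures involved; field names follow the
 * snippet quoted above, everything else is simplified. */
struct mock_mode { unsigned long clock; /* kHz, as in drm_display_mode */ };
struct mock_crtc_state { struct mock_mode adjusted_mode; };
struct mock_crtc { struct mock_crtc_state *state; };

/* Sketch of the suggested fix: bail out early when no CRTC is bound,
 * instead of dereferencing a NULL pointer further down. */
static unsigned long long set_n_cts(const struct mock_crtc *crtc,
				    unsigned int samplerate)
{
	unsigned long long n, tmp;

	if (!crtc)
		return 0; /* mirrors the proposed "if (!crtc) return;" */

	n = 128ULL * samplerate / 1000;
	tmp = (unsigned long long)crtc->state->adjusted_mode.clock * 1000 * n;
	return tmp / (128ULL * samplerate);
}
```

For a 148.5 MHz pixel clock and 48 kHz audio this computes CTS = 148500, while a disconnected (NULL) CRTC simply programs nothing.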


Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Sudip Mukherjee
On Wed, Sep 22, 2021 at 7:23 PM Linus Torvalds
 wrote:
>
> On Wed, Sep 22, 2021 at 10:02 AM Sudip Mukherjee
>  wrote:
> >
> >
> > Attached is a complete dmesg and also the decoded trace.
> > This is done on 4357f03d6611 ("Merge tag 'pm-5.15-rc2' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm")
>
> drivers/gpu/drm/vc4/vc4_hdmi.c:1214 is
>
> tmp = (u64)(mode->clock * 1000) * n;
>
> in vc4_hdmi_set_n_cts(), which has apparently been inlined from
> vc4_hdmi_audio_prepare() in vc4_hdmi.c:1398.
>
> So it looks like 'mode' is some offset off a NULL pointer.
>
> Which looks not impossible:
>
>   1207  struct drm_connector *connector = &vc4_hdmi->connector;
>   1208  struct drm_crtc *crtc = connector->state->crtc;
>   1209  const struct drm_display_mode *mode =
> &crtc->state->adjusted_mode;
>
> looks like crtc->state perhaps might be NULL.

I added some debugs to print the addresses, and I am getting:
[   38.813809] sudip crtc 

This is from struct drm_crtc *crtc = connector->state->crtc;

connector and connector->state had valid addresses.
[   38.805302] sudip connector 40bac578
[   38.809779] sudip state 57eb5400

This is the diff of the debug I added:
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 4a1115043114..2a8f06948094 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1205,11 +1205,20 @@ static void
vc4_hdmi_audio_set_mai_clock(struct vc4_hdmi *vc4_hdmi,
 static void vc4_hdmi_set_n_cts(struct vc4_hdmi *vc4_hdmi, unsigned
int samplerate)
 {
struct drm_connector *connector = &vc4_hdmi->connector;
-   struct drm_crtc *crtc = connector->state->crtc;
-   const struct drm_display_mode *mode = &crtc->state->adjusted_mode;
+   struct drm_crtc *crtc;
+   struct drm_display_mode *mode;
u32 n, cts;
u64 tmp;

+
+   pr_err("sudip connector %px\n", connector);
+   pr_err("sudip state %px\n", connector->state);
+   crtc = connector->state->crtc;
+
+   pr_err("sudip crtc %px\n", crtc);
+   pr_err("sudip state %px\n", crtc->state);
+   pr_err("state mode %px\n", >state->adjusted_mode);
+   mode = >state->adjusted_mode;
n = 128 * samplerate / 1000;
tmp = (u64)(mode->clock * 1000) * n;
do_div(tmp, 128 * samplerate);


-- 
Regards
Sudip


[PATCH] drm/rockchip: rgb: make connector a pointer in struct rockchip_rgb

2021-09-22 Thread Alex Bee
As reported at [1], Coverity complains about an unused value.

Let's make drm_connector a pointer in struct rockchip_rgb and "remove
redundant assignment of pointer connector".

[1] https://lkml.org/lkml/2021/9/22/432

Fixes: 2e87bf389e13 ("drm/rockchip: add DRM_BRIDGE_ATTACH_NO_CONNECTOR flag to 
drm_bridge_attach")
Addresses-Coverity: ("Unused value")
Reported-by: Colin Ian King 
Signed-off-by: Alex Bee 
---
 drivers/gpu/drm/rockchip/rockchip_rgb.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_rgb.c 
b/drivers/gpu/drm/rockchip/rockchip_rgb.c
index 09be9678f2bd..fe932c26c3e0 100644
--- a/drivers/gpu/drm/rockchip/rockchip_rgb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_rgb.c
@@ -28,7 +28,7 @@ struct rockchip_rgb {
struct drm_device *drm_dev;
struct drm_bridge *bridge;
struct drm_encoder encoder;
-   struct drm_connector connector;
+   struct drm_connector *connector;
int output_mode;
 };
 
@@ -82,7 +82,6 @@ struct rockchip_rgb *rockchip_rgb_init(struct device *dev,
int ret = 0, child_count = 0;
struct drm_panel *panel;
struct drm_bridge *bridge;
-   struct drm_connector *connector;
 
rgb = devm_kzalloc(dev, sizeof(*rgb), GFP_KERNEL);
if (!rgb)
@@ -150,17 +149,16 @@ struct rockchip_rgb *rockchip_rgb_init(struct device *dev,
if (ret)
goto err_free_encoder;
 
-   connector = &rgb->connector;
-   connector = drm_bridge_connector_init(rgb->drm_dev, encoder);
-   if (IS_ERR(connector)) {
+   rgb->connector = drm_bridge_connector_init(rgb->drm_dev, encoder);
+   if (IS_ERR(rgb->connector)) {
DRM_DEV_ERROR(drm_dev->dev,
  "failed to initialize bridge connector: %pe\n",
- connector);
-   ret = PTR_ERR(connector);
+ rgb->connector);
+   ret = PTR_ERR(rgb->connector);
goto err_free_encoder;
}
 
-   ret = drm_connector_attach_encoder(connector, encoder);
+   ret = drm_connector_attach_encoder(rgb->connector, encoder);
if (ret < 0) {
DRM_DEV_ERROR(drm_dev->dev,
  "failed to attach encoder: %d\n", ret);
@@ -170,7 +168,7 @@ struct rockchip_rgb *rockchip_rgb_init(struct device *dev,
return rgb;
 
 err_free_connector:
-   drm_connector_cleanup(connector);
+   drm_connector_cleanup(rgb->connector);
 err_free_encoder:
drm_encoder_cleanup(encoder);
return ERR_PTR(ret);
@@ -180,7 +178,7 @@ EXPORT_SYMBOL_GPL(rockchip_rgb_init);
 void rockchip_rgb_fini(struct rockchip_rgb *rgb)
 {
drm_panel_bridge_remove(rgb->bridge);
-   drm_connector_cleanup(&rgb->connector);
+   drm_connector_cleanup(rgb->connector);
drm_encoder_cleanup(&rgb->encoder);
 }
 EXPORT_SYMBOL_GPL(rockchip_rgb_fini);
-- 
2.30.2
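The bug class the patch above fixes (taking the address of an embedded member, then immediately overwriting that pointer with the object a helper allocates) is easy to sketch generically. Everything below is hypothetical stand-in code, not DRM: "connector", "rgb_dev" and bridge_connector_init() only mimic the shapes in the patch.

```c
#include <assert.h>
#include <stdlib.h>

struct connector { int id; };

struct rgb_dev {
	/* after the patch: a pointer filled in by the init helper, not an
	 * embedded struct whose address was immediately overwritten */
	struct connector *connector;
};

/* stand-in for drm_bridge_connector_init(): allocates the connector
 * object and returns it, so the caller must keep the returned handle */
static struct connector *bridge_connector_init(void)
{
	struct connector *c = malloc(sizeof(*c));

	if (c)
		c->id = 42;
	return c;
}

static int rgb_init(struct rgb_dev *rgb)
{
	rgb->connector = bridge_connector_init();
	return rgb->connector ? 0 : -1;
}

static void rgb_fini(struct rgb_dev *rgb)
{
	/* cleanup uses the stored pointer, as rockchip_rgb_fini() now does */
	free(rgb->connector);
}
```

With the embedded-struct version, `connector = &rgb->connector;` was dead code and rgb_fini() cleaned up an object that was never the one actually initialized; storing the returned pointer makes init and fini agree.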



Re: [Intel-gfx] [PATCH 15/27] drm/i915/guc: Implement multi-lrc submission

2021-09-22 Thread John Harrison

On 9/22/2021 09:25, Matthew Brost wrote:

On Mon, Sep 20, 2021 at 02:48:52PM -0700, John Harrison wrote:

On 8/20/2021 15:44, Matthew Brost wrote:

Implement multi-lrc submission via a single workqueue entry and single
H2G. The workqueue entry contains an updated tail value for each
request, of all the contexts in the multi-lrc submission, and updates
these values simultaneously. As such, the tasklet and bypass path have
been updated to coalesce requests into a single submission.

Signed-off-by: Matthew Brost 
---
   drivers/gpu/drm/i915/gt/uc/intel_guc.c|  21 ++
   drivers/gpu/drm/i915/gt/uc/intel_guc.h|   8 +
   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  24 +-
   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   6 +-
   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 312 +++---
   drivers/gpu/drm/i915/i915_request.h   |   8 +
   6 files changed, 317 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index fbfcae727d7f..879aef662b2e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -748,3 +748,24 @@ void intel_guc_load_status(struct intel_guc *guc, struct 
drm_printer *p)
}
}
   }
+
+void intel_guc_write_barrier(struct intel_guc *guc)
+{
+   struct intel_gt *gt = guc_to_gt(guc);
+
+   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
+   GEM_BUG_ON(guc->send_regs.fw_domains);

Granted, this patch is just moving code from one file to another not
changing it. However, I think it would be worth adding a blank line in here.
Otherwise the 'this register' comment below can be confusingly read as
referring to the send_regs.fw_domain entry above.

And maybe add a comment why it is a bug for the send_regs value to be set?
I'm not seeing any obvious connection between it and the rest of this code.


Can add a blank line. I think the GEM_BUG_ON relates to being able to
use intel_uncore_write_fw vs intel_uncore_write. Can add comment.


+   /*
+* This register is used by the i915 and GuC for MMIO based
+* communication. Once we are in this code CTBs are the only
+* method the i915 uses to communicate with the GuC so it is
+* safe to write to this register (a value of 0 is NOP for MMIO
+* communication). If we ever start mixing CTBs and MMIOs a new
+* register will have to be chosen.
+*/
+   intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
+   } else {
+   /* wmb() sufficient for a barrier if in smem */
+   wmb();
+   }
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 3f95b1b4f15c..0ead2406d03c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -37,6 +37,12 @@ struct intel_guc {
/* Global engine used to submit requests to GuC */
struct i915_sched_engine *sched_engine;
struct i915_request *stalled_request;
+   enum {
+   STALL_NONE,
+   STALL_REGISTER_CONTEXT,
+   STALL_MOVE_LRC_TAIL,
+   STALL_ADD_REQUEST,
+   } submission_stall_reason;
/* intel_guc_recv interrupt related state */
spinlock_t irq_lock;
@@ -332,4 +338,6 @@ void intel_guc_submission_cancel_requests(struct intel_guc 
*guc);
   void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
+void intel_guc_write_barrier(struct intel_guc *guc);
+
   #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 20c710a74498..10d1878d2826 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -377,28 +377,6 @@ static u32 ct_get_next_fence(struct intel_guc_ct *ct)
return ++ct->requests.last_fence;
   }
-static void write_barrier(struct intel_guc_ct *ct)
-{
-   struct intel_guc *guc = ct_to_guc(ct);
-   struct intel_gt *gt = guc_to_gt(guc);
-
-   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
-   GEM_BUG_ON(guc->send_regs.fw_domains);
-   /*
-* This register is used by the i915 and GuC for MMIO based
-* communication. Once we are in this code CTBs are the only
-* method the i915 uses to communicate with the GuC so it is
-* safe to write to this register (a value of 0 is NOP for MMIO
-* communication). If we ever start mixing CTBs and MMIOs a new
-* register will have to be chosen.
-*/
-   intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
-   } else {
-   /* wmb() sufficient for a barrier if in smem */
-   wmb();
-   }
-}
-
   static int ct_write(struct 

Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-22 Thread Borislav Petkov
On Wed, Sep 22, 2021 at 05:30:15PM +0300, Kirill A. Shutemov wrote:
> Not fine, but waiting to blowup with random build environment change.

Why is it not fine?

Are you suspecting that the compiler might generate something else and
not a rip-relative access?

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH] drm/i915: Fix bug in user proto-context creation that leaked contexts

2021-09-22 Thread Matthew Brost
Set number of engines before attempting to create contexts so the
function free_engines can clean up properly. Also check return of
alloc_engines for NULL.

v2:
 (Tvrtko)
  - Send as stand alone patch
 (John Harrison)
  - Check for alloc_engines returning NULL

Cc: Jason Ekstrand 
Fixes: d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle create 
parameters (v5)")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..9627c7aac6a3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -898,6 +898,11 @@ static struct i915_gem_engines *user_engines(struct 
i915_gem_context *ctx,
unsigned int n;
 
e = alloc_engines(num_engines);
+   if (!e) {
+   return ERR_PTR(-ENOMEM);
+   }
+   e->num_engines = num_engines;
+
for (n = 0; n < num_engines; n++) {
struct intel_context *ce;
int ret;
@@ -931,7 +936,6 @@ static struct i915_gem_engines *user_engines(struct 
i915_gem_context *ctx,
goto free_engines;
}
}
-   e->num_engines = num_engines;
 
return e;
 
-- 
2.32.0
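The fix's core idea (record the element count *before* the creation loop, so the error path can free however many contexts were created, and check the array allocation for NULL) can be sketched in plain C. This is a hypothetical miniature, not i915 code: fail_at simulates the mid-loop intel_context creation failure.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ENGINES 8U

struct engines {
	unsigned int num_engines;
	int *ctx[MAX_ENGINES];	/* stand-ins for intel_context pointers */
};

static void free_engines(struct engines *e)
{
	/* frees exactly num_engines slots; slots never populated are
	 * still NULL from calloc(), so free(NULL) is a no-op */
	for (unsigned int n = 0; n < e->num_engines; n++)
		free(e->ctx[n]);
	free(e);
}

/* Sketch of the fixed flow: check the allocation and set the engine
 * count up-front, so a mid-loop failure hands free_engines() a
 * structure it can clean up completely. */
static struct engines *user_engines(unsigned int num, unsigned int fail_at)
{
	struct engines *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->num_engines = num;	/* set before the loop, as in the fix */

	for (unsigned int n = 0; n < num; n++) {
		if (n == fail_at) {
			free_engines(e); /* cleans up the n contexts so far */
			return NULL;
		}
		e->ctx[n] = malloc(sizeof(int));
	}
	return e;
}
```

With num_engines assigned only after a fully successful loop (the old order), the error path would have seen num_engines == 0 and leaked every context created before the failure.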



Re: [PATCH][next] drm/rockchip: Remove redundant assignment of pointer connector

2021-09-22 Thread Alex Bee

Hi all,

Am 22.09.21 um 19:31 schrieb Alex Bee:

Hi Heiko,

Am 22.09.21 um 18:45 schrieb Heiko Stübner:

Hi Alex,

Am Mittwoch, 22. September 2021, 18:35:38 CEST schrieb Alex Bee:

Hi Colin,
Am 22.09.21 um 13:24 schrieb Colin King:

From: Colin Ian King 

The pointer connector is being assigned a value that is never
read, it is being updated immediately afterwards. The assignment
is redundant and can be removed.

The pointer to the connector is used in rockchip_rgb_fini for
drm_connector_cleanup.
It's pretty much the same for the encoder, btw.

I think the issue is more the two lines

connector = &rgb->connector;
  connector = drm_bridge_connector_init(rgb->drm_dev, encoder);

hence the connector = &rgb->connector being overwritten immediately 
after


Now that I look at it again, the whole approach looks strange.
drm_bridge_connector_init() creates the connector structure and
returns a pointer to it.


Totally agreed.

The main reason I did it that way is that rockchip_lvds.c already does 
it like this: there the connector already exists in struct 
rockchip_lvds (and is already used in the panel case; all places where 
it is used accept pointers too, btw), is *not* a pointer, and is 
already handled in this very strange way.


I wanted to re-use it for the bridge case and didn't want the coding 
in rockchip-rgb to differ too much.


The only reason I can think of why it was done that way is that we 
might need a pointer to a fully initialized struct drm_connector for 
some reason (drm_connector_cleanup?), which we wouldn't have with just 
a pointer if something goes wrong before drm_connector_init 
respectively drm_bridge_connector_init.


Alex



So the first line below sets the connector pointer to point to the
&rgb->connector element and the second line then sets a completely
different address into it.

So the connector element in rockchip_lvds and rockchip_rgb should
actually become a pointer itself, to hold the connector element
returned from drm_bridge_connector_init().
It turns out nothing bad happens (i.e. rockchip_rgb_fini, the only 
place where the connector is also used, isn't called if 
rockchip_rgb_init fails), so it is OK to make the connector a pointer 
in struct rockchip_rgb. But we'll need to keep it a "full" struct 
drm_connector in struct rockchip_lvds, since in the panel case it gets 
properties assigned before drm_connector_init is called and for that 
reason needs to be initialized beforehand.


I'll send a patch soon.

Alex




Heiko


Regards,

Alex

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
   drivers/gpu/drm/rockchip/rockchip_rgb.c | 1 -
   1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_rgb.c 
b/drivers/gpu/drm/rockchip/rockchip_rgb.c

index 09be9678f2bd..18fb84068a64 100644
--- a/drivers/gpu/drm/rockchip/rockchip_rgb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_rgb.c
@@ -150,7 +150,6 @@ struct rockchip_rgb *rockchip_rgb_init(struct 
device *dev,

   if (ret)
   goto err_free_encoder;
-    connector = &rgb->connector;
   connector = drm_bridge_connector_init(rgb->drm_dev, encoder);
   if (IS_ERR(connector)) {
   DRM_DEV_ERROR(drm_dev->dev,











Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Linus Torvalds
On Wed, Sep 22, 2021 at 10:02 AM Sudip Mukherjee
 wrote:
>
>
> Attached is a complete dmesg and also the decoded trace.
> This is done on 4357f03d6611 ("Merge tag 'pm-5.15-rc2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm")

drivers/gpu/drm/vc4/vc4_hdmi.c:1214 is

tmp = (u64)(mode->clock * 1000) * n;

in vc4_hdmi_set_n_cts(), which has apparently been inlined from
vc4_hdmi_audio_prepare() in vc4_hdmi.c:1398.

So it looks like 'mode' is some offset off a NULL pointer.

Which looks not impossible:

  1207  struct drm_connector *connector = &vc4_hdmi->connector;
  1208  struct drm_crtc *crtc = connector->state->crtc;
  1209  const struct drm_display_mode *mode =
&crtc->state->adjusted_mode;

looks like crtc->state perhaps might be NULL.

Although it's entirely possible that it's 'crtc' itself that is NULL
or one of the earlier indirection accesses.

The exact line information from the debug info is very useful and
mostly correct, but at the same time should always be taken with a
small pinch of salt.

Compiler optimizations mean that code gets munged and moved around,
and since this is the first access to 'mode', I would not be surprised
if some of the calculations and accesses needed to get 'mode' were
moved around too.

   Linus


Re: [RESEND] [PATCH v2 1/2] dt-bindings: display: bridge: Add binding for R-Car MIPI DSI/CSI-2 TX

2021-09-22 Thread Geert Uytterhoeven
Hi Laurent,

On Wed, Sep 22, 2021 at 10:08 AM Laurent Pinchart
 wrote:
> On Wed, Sep 22, 2021 at 08:43:57AM +0200, Geert Uytterhoeven wrote:
> > On Wed, Sep 22, 2021 at 3:27 AM Laurent Pinchart wrote:
> > > On Tue, Sep 21, 2021 at 05:53:52PM +0200, Geert Uytterhoeven wrote:
> > > > On Wed, Jul 28, 2021 at 6:26 PM Laurent Pinchart wrote:
> > > > > The R-Car MIPI DSI/CSI-2 TX is embedded in the Renesas R-Car V3U SoC. 
> > > > > It
> > > > > can operate in either DSI or CSI-2 mode, with up to four data lanes.
> > > > >
> > > > > Signed-off-by: Laurent Pinchart 
> > > > > 
> > > > > Reviewed-by: Kieran Bingham 
> > > >
> > > > Thanks for your patch!
> > > >
> > > > > --- /dev/null
> > > > > +++ 
> > > > > b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
> > > > > @@ -0,0 +1,118 @@
> > > > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > > > +%YAML 1.2
> > > > > +---
> > > > > +$id: 
> > > > > http://devicetree.org/schemas/display/bridge/renesas,dsi-csi2-tx.yaml#
> > > > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > > > +
> > > > > +title: Renesas R-Car MIPI DSI/CSI-2 Encoder
> > > > > +
> > > > > +maintainers:
> > > > > +  - Laurent Pinchart 
> > > > > +
> > > > > +description: |
> > > > > +  This binding describes the MIPI DSI/CSI-2 encoder embedded in the 
> > > > > Renesas
> > > > > +  R-Car V3U SoC. The encoder can operate in either DSI or CSI-2 
> > > > > mode, with up
> > > > > +  to four data lanes.
> > > > > +
> > > > > +properties:
> > > > > +  compatible:
> > > > > +enum:
> > > > > +  - renesas,r8a779a0-dsi-csi2-tx# for V3U
> > > > > +
> > > > > +  reg:
> > > > > +maxItems: 1
> > > > > +
> > > > > +  clocks:
> > > > > +items:
> > > > > +  - description: Functional clock
> > > > > +  - description: DSI (and CSI-2) functional clock
> > > > > +  - description: PLL reference clock
> > > > > +
> > > > > +  clock-names:
> > > > > +items:
> > > > > +  - const: fck
> > > > > +  - const: dsi
> > > > > +  - const: pll
> > > >
> > > > No interrupts?
> > > > The hardware manual says there are 9 interrupts.
> > >
> > > Who comes up with such insanely high numbers of interrupts ? :-)
> > >
> > > What the hardware manual doesn't document is how interrupts are mapped.
> > > There's indeed 9 of them, and there are 9 interrupt sources, but that's
> > > all we know. I can easily add a
> > >
> > >   interrupts:
> > > maxItems: 9
> > >
> > > but I can't add interrupt names without additional information. It may be
> > > possible to deduce some of the interrupt mappings from experiments, but
> > > not all of them. What do you think would be a good way forward ? Leave
> > > the interrupts out for now as we don't have the information ? Only list
> > > the interrupts but not their names ? Something else ?
> >
> > I think what we did in the past is not list the interrupts at all.
> > They can be added once we receive more documentation.
>
> Sounds good to me, as that's what this patch does already ;-) A R-b or
> A-b tag is welcome.

Your wish is my command...

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2 4/5] drm: rcar-du: Split CRTC IRQ and Clock features

2021-09-22 Thread Laurent Pinchart
Hi Kieran,

Thank you for the patch.

On Thu, Sep 02, 2021 at 12:49:06AM +0100, Kieran Bingham wrote:
> Not all platforms require both per-crtc IRQ and per-crtc clock
> management. In preparation for suppporting such platforms, split the
> feature macro to be able to specify both features independently.
> 
> The other features are incremented accordingly, to keep the two crtc
> features adjacent.
> 
> Signed-off-by: Kieran Bingham 

Reviewed-by: Laurent Pinchart 

> ---
> v2:
>  - New patch
> 
>  drivers/gpu/drm/rcar-du/rcar_du_crtc.c |  4 +--
>  drivers/gpu/drm/rcar-du/rcar_du_drv.c  | 48 +-
>  drivers/gpu/drm/rcar-du/rcar_du_drv.h  |  9 ++---
>  3 files changed, 39 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> index a0f837e8243a..5672830ca184 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> @@ -1206,7 +1206,7 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, 
> unsigned int swindex,
>   int ret;
>  
>   /* Get the CRTC clock and the optional external clock. */
> - if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_IRQ_CLOCK)) {
> + if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_CLOCK)) {
>   sprintf(clk_name, "du.%u", hwindex);
>   name = clk_name;
>   } else {
> @@ -1272,7 +1272,7 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, 
> unsigned int swindex,
>   drm_crtc_helper_add(crtc, &crtc_helper_funcs);
>  
>   /* Register the interrupt handler. */
> - if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_IRQ_CLOCK)) {
> + if (rcar_du_has(rcdu, RCAR_DU_FEATURE_CRTC_IRQ)) {
>   /* The IRQ's are associated with the CRTC (sw)index. */
>   irq = platform_get_irq(pdev, swindex);
>   irqflags = 0;
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> index 4ac26d08ebb4..8a094d5b9c77 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> @@ -36,7 +36,8 @@
>  
>  static const struct rcar_du_device_info rzg1_du_r8a7743_info = {
>   .gen = 2,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_CRTC_CLOCK
> | RCAR_DU_FEATURE_INTERLACED
> | RCAR_DU_FEATURE_TVM_SYNC,
>   .channels_mask = BIT(1) | BIT(0),
> @@ -58,7 +59,8 @@ static const struct rcar_du_device_info 
> rzg1_du_r8a7743_info = {
>  
>  static const struct rcar_du_device_info rzg1_du_r8a7745_info = {
>   .gen = 2,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_CRTC_CLOCK
> | RCAR_DU_FEATURE_INTERLACED
> | RCAR_DU_FEATURE_TVM_SYNC,
>   .channels_mask = BIT(1) | BIT(0),
> @@ -79,7 +81,8 @@ static const struct rcar_du_device_info 
> rzg1_du_r8a7745_info = {
>  
>  static const struct rcar_du_device_info rzg1_du_r8a77470_info = {
>   .gen = 2,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_CRTC_CLOCK
> | RCAR_DU_FEATURE_INTERLACED
> | RCAR_DU_FEATURE_TVM_SYNC,
>   .channels_mask = BIT(1) | BIT(0),
> @@ -105,7 +108,8 @@ static const struct rcar_du_device_info 
> rzg1_du_r8a77470_info = {
>  
>  static const struct rcar_du_device_info rcar_du_r8a774a1_info = {
>   .gen = 3,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_CRTC_CLOCK
> | RCAR_DU_FEATURE_VSP1_SOURCE
> | RCAR_DU_FEATURE_INTERLACED
> | RCAR_DU_FEATURE_TVM_SYNC,
> @@ -134,7 +138,8 @@ static const struct rcar_du_device_info 
> rcar_du_r8a774a1_info = {
>  
>  static const struct rcar_du_device_info rcar_du_r8a774b1_info = {
>   .gen = 3,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_CRTC_CLOCK
> | RCAR_DU_FEATURE_VSP1_SOURCE
> | RCAR_DU_FEATURE_INTERLACED
> | RCAR_DU_FEATURE_TVM_SYNC,
> @@ -163,7 +168,8 @@ static const struct rcar_du_device_info 
> rcar_du_r8a774b1_info = {
>  
>  static const struct rcar_du_device_info rcar_du_r8a774c0_info = {
>   .gen = 3,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
> + .features = RCAR_DU_FEATURE_CRTC_IRQ
> +   | RCAR_DU_FEATURE_CRTC_CLOCK
> | RCAR_DU_FEATURE_VSP1_SOURCE,
>   .channels_mask = BIT(1) | BIT(0),
>   .routes = {
> @@ -189,7 +195,8 @@ static const struct rcar_du_device_info 
> rcar_du_r8a774c0_info = {
>  
>  static const struct rcar_du_device_info rcar_du_r8a774e1_info = {
>   .gen = 3,
> - .features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK

[PATCH] drm/amd/display: Only define DP 2.0 symbols if not already defined

2021-09-22 Thread Harry Wentland
[Why]
For some reason we're defining DP 2.0 definitions inside our
driver. Now that patches to introduce relevant definitions
are slated to be merged into drm-next this is causing conflicts.

In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:33:
In file included from ./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:70:
In file included from ./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_mode.h:36:
./include/drm/drm_dp_helper.h:1322:9: error: 
'DP_MAIN_LINK_CHANNEL_CODING_PHY_REPEATER' macro redefined 
[-Werror,-Wmacro-redefined]
^
./drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dp_types.h:881:9: note: previous 
definition is here
^
1 error generated.

[How]
Guard all display driver defines with #ifndef for now. Once we pull
in the new definitions into amd-staging-drm-next we will follow
up and drop definitions from our driver and provide follow-up
header updates for any additional DP 2.0 definitions required
by our driver.

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h | 53 ++--
 1 file changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc_dp_types.h 
b/drivers/gpu/drm/amd/display/dc/dc_dp_types.h
index a5e798b5da79..74b8de616dcd 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_dp_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_dp_types.h
@@ -860,28 +860,71 @@ struct psr_caps {
 };
 
 #if defined(CONFIG_DRM_AMD_DC_DCN)
+#ifndef DP_MAIN_LINK_CHANNEL_CODING_CAP
 #define DP_MAIN_LINK_CHANNEL_CODING_CAP0x006
+#endif
+#ifndef DP_SINK_VIDEO_FALLBACK_FORMATS
 #define DP_SINK_VIDEO_FALLBACK_FORMATS 0x020
+#endif
+#ifndef DP_FEC_CAPABILITY_1
 #define DP_FEC_CAPABILITY_10x091
+#endif
+#ifndef DP_DFP_CAPABILITY_EXTENSION_SUPPORT
 #define DP_DFP_CAPABILITY_EXTENSION_SUPPORT0x0A3
+#endif
+#ifndef DP_DSC_CONFIGURATION
 #define DP_DSC_CONFIGURATION   0x161
+#endif
+#ifndef DP_PHY_SQUARE_PATTERN
 #define DP_PHY_SQUARE_PATTERN  0x249
+#endif
+#ifndef DP_128b_132b_SUPPORTED_LINK_RATES
 #define DP_128b_132b_SUPPORTED_LINK_RATES  0x2215
+#endif
+#ifndef DP_128b_132b_TRAINING_AUX_RD_INTERVAL
 #define DP_128b_132b_TRAINING_AUX_RD_INTERVAL  0x2216
+#endif
+#ifndef DP_TEST_264BIT_CUSTOM_PATTERN_7_0
 #define DP_TEST_264BIT_CUSTOM_PATTERN_7_0  0X2230
+#endif
+#ifndef DP_TEST_264BIT_CUSTOM_PATTERN_263_256
 #define DP_TEST_264BIT_CUSTOM_PATTERN_263_256  0X2250
+#endif
+#ifndef DP_DSC_SUPPORT_AND_DECODER_COUNT
 #define DP_DSC_SUPPORT_AND_DECODER_COUNT   0x2260
+#endif
+#ifndef DP_DSC_MAX_SLICE_COUNT_AND_AGGREGATION_0
 #define DP_DSC_MAX_SLICE_COUNT_AND_AGGREGATION_0   0x2270
-# define DP_DSC_DECODER_0_MAXIMUM_SLICE_COUNT_MASK (1 << 0)
-# define DP_DSC_DECODER_0_AGGREGATION_SUPPORT_MASK (0b111 << 1)
-# define DP_DSC_DECODER_0_AGGREGATION_SUPPORT_SHIFT1
-# define DP_DSC_DECODER_COUNT_MASK (0b111 << 5)
-# define DP_DSC_DECODER_COUNT_SHIFT5
+#endif
+#ifndef DP_DSC_DECODER_0_MAXIMUM_SLICE_COUNT_MASK
+#define DP_DSC_DECODER_0_MAXIMUM_SLICE_COUNT_MASK  (1 << 0)
+#endif
+#ifndef DP_DSC_DECODER_0_AGGREGATION_SUPPORT_MASK
+#define DP_DSC_DECODER_0_AGGREGATION_SUPPORT_MASK  (0b111 << 1)
+#endif
+#ifndef DP_DSC_DECODER_0_AGGREGATION_SUPPORT_SHIFT
+#define DP_DSC_DECODER_0_AGGREGATION_SUPPORT_SHIFT 1
+#endif
+#ifndef DP_DSC_DECODER_COUNT_MASK
+#define DP_DSC_DECODER_COUNT_MASK  (0b111 << 5)
+#endif
+#ifndef DP_DSC_DECODER_COUNT_SHIFT
+#define DP_DSC_DECODER_COUNT_SHIFT 5
+#endif
+#ifndef DP_MAIN_LINK_CHANNEL_CODING_SET
 #define DP_MAIN_LINK_CHANNEL_CODING_SET0x108
+#endif
+#ifndef DP_MAIN_LINK_CHANNEL_CODING_PHY_REPEATER
 #define DP_MAIN_LINK_CHANNEL_CODING_PHY_REPEATER   0xF0006
+#endif
+#ifndef DP_PHY_REPEATER_128b_132b_RATES
 #define DP_PHY_REPEATER_128b_132b_RATES0xF0007
+#endif
+#ifndef DP_128b_132b_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER1
 #define DP_128b_132b_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER10xF0022
+#endif
+#ifndef DP_INTRA_HOP_AUX_REPLY_INDICATION
 #define DP_INTRA_HOP_AUX_REPLY_INDICATION  (1 << 3)
+#endif
 /* TODO - Use DRM header to replace above once available */
 
 union dp_main_line_channel_coding_cap {
-- 
2.33.0



Re: [PATCH][next] drm/rockchip: Remove redundant assignment of pointer connector

2021-09-22 Thread Alex Bee

Hi Heiko,

On 22.09.21 at 18:45, Heiko Stübner wrote:

Hi Alex,

On Wednesday, 22 September 2021 at 18:35:38 CEST, Alex Bee wrote:

Hi Colin,
On 22.09.21 at 13:24, Colin King wrote:

From: Colin Ian King 

The pointer connector is being assigned a value that is never
read, it is being updated immediately afterwards. The assignment
is redundant and can be removed.

The pointer to the connector is used in rockchip_rgb_fini for
drm_connector_cleanup.
It's pretty much the same for the encoder, btw.

I think the issue is more the two lines

connector = &rgb->connector;
connector = drm_bridge_connector_init(rgb->drm_dev, encoder);

hence the connector = &rgb->connector being overwritten immediately after

Now that I look at it again, the whole approach looks strange.
drm_bridge_connector_init() creates the connector structure and
returns a pointer to it.


Totally agreed.

The main reason I did it that way was that it was already done like this 
in rockchip_lvds.c, where the connector already exists in the 
struct rockchip_lvds (and is already used in the panel case - all 
places where it is used accept pointers too, btw), is *no* pointer - 
and is already handled in this very strange way.


I wanted to re-use it for the bridge case and didn't want the code in 
rockchip-rgb to differ too much.


The only reason I can think of why it was done that way is that we 
might need a pointer to a fully initialized struct drm_connector for 
some reason (drm_connector_cleanup?), which we wouldn't have if we had 
just a pointer and something went wrong before drm_connector_init or 
drm_bridge_connector_init, respectively.


Alex



So the first line below sets the connector pointer to point to the
&rgb->connector element and the second line then sets a completely
different address into it.

So the connector element in rockchip_lvds and rockchip_rgb should actually
become a pointer itself to hold the connector element returned from
drm_bridge_connector_init() .


Heiko


Regards,

Alex

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
   drivers/gpu/drm/rockchip/rockchip_rgb.c | 1 -
   1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_rgb.c 
b/drivers/gpu/drm/rockchip/rockchip_rgb.c
index 09be9678f2bd..18fb84068a64 100644
--- a/drivers/gpu/drm/rockchip/rockchip_rgb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_rgb.c
@@ -150,7 +150,6 @@ struct rockchip_rgb *rockchip_rgb_init(struct device *dev,
if (ret)
goto err_free_encoder;
   
-	connector = &rgb->connector;

connector = drm_bridge_connector_init(rgb->drm_dev, encoder);
if (IS_ERR(connector)) {
DRM_DEV_ERROR(drm_dev->dev,









Re: [PATCH v2 3/5] drm: rcar-du: Fix DIDSR field name

2021-09-22 Thread Laurent Pinchart
Hi Kieran,

Thank you for the patch.

On Thu, Sep 02, 2021 at 12:49:05AM +0100, Kieran Bingham wrote:
> The DIDSR fields named LDCS were incorrectly defined as LCDS.
> Both the Gen2 and Gen3 documentation refer to the fields as the "LVDS
> Dot Clock Select".
> 
> Correct the definitions.
> 
> Signed-off-by: Kieran Bingham 

Reviewed-by: Laurent Pinchart 

> ---
> v2:
>  - New patch
> 
>  drivers/gpu/drm/rcar-du/rcar_du_group.c | 4 ++--
>  drivers/gpu/drm/rcar-du/rcar_du_regs.h  | 8 
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_group.c
> index 88a783ceb3e9..a984eef265d2 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
> @@ -122,10 +122,10 @@ static void rcar_du_group_setup_didsr(struct 
> rcar_du_group *rgrp)
>   didsr = DIDSR_CODE;
>   for (i = 0; i < num_crtcs; ++i, ++rcrtc) {
>   if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index))
> - didsr |= DIDSR_LCDS_LVDS0(i)
> + didsr |= DIDSR_LDCS_LVDS0(i)
> |  DIDSR_PDCS_CLK(i, 0);
>   else
> - didsr |= DIDSR_LCDS_DCLKIN(i)
> + didsr |= DIDSR_LDCS_DCLKIN(i)
> |  DIDSR_PDCS_CLK(i, 0);
>   }
>  
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_regs.h 
> b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
> index fb9964949368..fb7c467aa484 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_regs.h
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
> @@ -257,10 +257,10 @@
>  
>  #define DIDSR0x20028
>  #define DIDSR_CODE   (0x7790 << 16)
> -#define DIDSR_LCDS_DCLKIN(n) (0 << (8 + (n) * 2))
> -#define DIDSR_LCDS_LVDS0(n)  (2 << (8 + (n) * 2))
> -#define DIDSR_LCDS_LVDS1(n)  (3 << (8 + (n) * 2))
> -#define DIDSR_LCDS_MASK(n)   (3 << (8 + (n) * 2))
> +#define DIDSR_LDCS_DCLKIN(n) (0 << (8 + (n) * 2))
> +#define DIDSR_LDCS_LVDS0(n)  (2 << (8 + (n) * 2))
> +#define DIDSR_LDCS_LVDS1(n)  (3 << (8 + (n) * 2))
> +#define DIDSR_LDCS_MASK(n)   (3 << (8 + (n) * 2))
>  #define DIDSR_PDCS_CLK(n, clk)   (clk << ((n) * 2))
>  #define DIDSR_PDCS_MASK(n)   (3 << ((n) * 2))
>  

-- 
Regards,

Laurent Pinchart


Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Sudip Mukherjee
On Wed, Sep 22, 2021 at 12:28 PM Maxime Ripard  wrote:
>
> On Wed, Sep 22, 2021 at 11:10:34AM +0100, Sudip Mukherjee wrote:
> > On Wed, Sep 22, 2021 at 10:57 AM Maxime Ripard  wrote:
> > >



>
> Still works fine (and it required some mangling of the kernel command line).
>
> If we summarize:
>
>   - You initially just dumped a panic and a link to your QA, without any
> more context:

The SHA was also given, and I didn't know what else you would need.
The openQA link was given to show the dmesg.

>
>   - Then stating that you're not doing any test, really;

Yes, and I still say that: it's just a boot test.

>
>   - Well, except for booting Ubuntu, but no other modification
>
>   - But you're not booting the standard image
>
>   - And with a custom initrd

yes, something which has always worked in boot-testing LTS kernel or
mainline kernel.

>
>   - And that QA link states that you're booting from QEMU, but you're
> not.

I only found that the "WORKER_CLASS" has the name "qemu_rpi4"; that is
a name which I chose to give, as that worker laptop is connected to the
rpi4 and is also running qemu tests. If you want I can change the name of
the WORKER_CLASS. :)
IIUC, dmesg shows whether it's booting in qemu or on real hardware.

>
> Please provide a full documentation on what you're doing to generate
> that image, from scratch, in order to get that panic you reported
> previously.

I have now ordered another rpi4 board and will create the image from
scratch and give you the steps.

>
> I've spent an entire day trying to make sense of what you're doing
> exactly to get into that situation. I have other things to work on and I
> don't plan on figuring out any random CI system.

I am not really sure why you are trying to figure out a random CI
system. I can reproduce the problem in our setup every time I test with
that reverted commit and I have already said I am happy to test with a
debug patch or anything else.


-- 
Regards
Sudip


Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Sudip Mukherjee
On Wed, Sep 22, 2021 at 4:25 PM Linus Torvalds wrote:
>
> On Wed, Sep 22, 2021 at 3:11 AM Sudip Mukherjee
>  wrote:
> >
> > That test script is triggering the openqa job, but its running only
> > after lava is able to login. The trace is appearing before the login
> > prompt even, so test_mainline.sh should not matter here.
>
> Side note: the traces might be more legible if you have debug info in
> the kernel, and run the dmesg through the script in
>
>   scripts/decode_stacktrace.sh
>
> which should give line numbers and inlining information.

Attached is a complete dmesg and also the decoded trace.
This is done on 4357f03d6611 ("Merge tag 'pm-5.15-rc2' of
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm")


-- 
Regards
Sudip
[0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
[0.00] Linux version 5.15.0-rc1-4357f03d6611 (smukherjee@db7e030c489f) (aarch64-linux-gcc (GCC) 11.2.1 20210911, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Wed Sep 22 15:18:16 UTC 2021
[0.00] Machine model: Raspberry Pi 4 Model B
[0.00] efi: UEFI not found.
[0.00] Reserved memory: created CMA memory pool at 0x3280, size 64 MiB
[0.00] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x-0x3fff]
[0.00]   DMA32[mem 0x4000-0xfbff]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x3b3f]
[0.00]   node   0: [mem 0x4000-0xfbff]
[0.00] Initmem setup node 0 [mem 0x-0xfbff]
[0.00] percpu: Embedded 29 pages/cpu s80344 r8192 d30248 u118784
[0.00] pcpu-alloc: s80344 r8192 d30248 u118784 alloc=29*4096
[0.00] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 
[0.00] Detected PIPT I-cache on CPU0
[0.00] CPU features: detected: Spectre-v2
[0.00] CPU features: detected: Spectre-v3a
[0.00] CPU features: detected: Spectre-v4
[0.00] CPU features: detected: ARM errata 1165522, 1319367, or 1530923
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 996912
[0.00] Kernel command line: console=ttyS0,115200n8 root=/dev/ram0 coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 bcm2708_fb.fbwidth=0 bcm2708_fb.fbheight=0 bcm2708_fb.fbswap=1 smsc95xx.macaddr=DC:A6:32:BC:4B:41 vc_mem.mem_base=0x3ec0 vc_mem.mem_size=0x4000  dwc_otg.lpm_enable=0 root=LABEL=writable1 rootfstype=ext4 elevator=deadline rootwait fixrtc loglevel=7 ip=dhcp
[0.00] Kernel parameter elevator= does not have any effect anymore.
   Please use sysfs to set IO scheduler for individual devices.
[0.00] Unknown command line parameters: fixrtc
[0.00] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[0.00] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] software IO TLB: mapped [mem 0x2e80-0x3280] (64MB)
[0.00] Memory: 3742168K/4050944K available (12928K kernel code, 1960K rwdata, 4868K rodata, 4352K init, 843K bss, 243240K reserved, 65536K cma-reserved)
[0.00] random: get_random_u64 called from __kmem_cache_create+0x38/0x4e0 with crng_init=0
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[0.00] ftrace: allocating 40527 entries in 159 pages
[0.00] ftrace: allocated 159 pages with 6 groups
[0.00] trace event string verifier disabled
[0.00] rcu: Preemptible hierarchical RCU implementation.
[0.00] rcu: 	RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
[0.00] 	Trampoline variant of Tasks RCU enabled.
[0.00] 	Rude variant of Tasks RCU enabled.
[0.00] 	Tracing variant of Tasks RCU enabled.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[0.00] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[0.00] Root IRQ handler: gic_handle_irq
[0.00] GIC: Using split EOI/Deactivate mode
[0.00] irq_brcmstb_l2: registered L2 intc (/soc/interrupt-controller@7ef00100, parent irq: 10)
[0.00] arch_timer: cp15 timer(s) running at 54.00MHz (phys).
[0.00] clocksource: arch_sys_counter: mask: 0xff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
[0.01] sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
[0.000189] Console: colour dummy device 80x25
[0.000266] Calibrating delay loop (skipped), value calculated using timer frequency.. 108.00 BogoMIPS 

Re: [PATCH][next] drm/rockchip: Remove redundant assignment of pointer connector

2021-09-22 Thread Heiko Stübner
Hi Alex,

On Wednesday, 22 September 2021 at 18:35:38 CEST, Alex Bee wrote:
> Hi Colin,
> On 22.09.21 at 13:24, Colin King wrote:
> > From: Colin Ian King 
> > 
> > The pointer connector is being assigned a value that is never
> > read, it is being updated immediately afterwards. The assignment
> > is redundant and can be removed.
> 
> The pointer to the connector is used in rockchip_rgb_fini for 
> drm_connector_cleanup.
> It's pretty much the same for the encoder, btw.

I think the issue is more the two lines

connector = &rgb->connector;
connector = drm_bridge_connector_init(rgb->drm_dev, encoder);

hence the connector = &rgb->connector being overwritten immediately after

Now that I look at it again, the whole approach looks strange.
drm_bridge_connector_init() creates the connector structure and
returns a pointer to it.

So the first line below sets the connector pointer to point to the
&rgb->connector element and the second line then sets a completely
different address into it.

So the connector element in rockchip_lvds and rockchip_rgb should actually
become a pointer itself to hold the connector element returned from
drm_bridge_connector_init() .


Heiko

> 
> Regards,
> 
> Alex
> > 
> > Addresses-Coverity: ("Unused value")
> > Signed-off-by: Colin Ian King 
> > ---
> >   drivers/gpu/drm/rockchip/rockchip_rgb.c | 1 -
> >   1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/rockchip/rockchip_rgb.c 
> > b/drivers/gpu/drm/rockchip/rockchip_rgb.c
> > index 09be9678f2bd..18fb84068a64 100644
> > --- a/drivers/gpu/drm/rockchip/rockchip_rgb.c
> > +++ b/drivers/gpu/drm/rockchip/rockchip_rgb.c
> > @@ -150,7 +150,6 @@ struct rockchip_rgb *rockchip_rgb_init(struct device 
> > *dev,
> > if (ret)
> > goto err_free_encoder;
> >   
> > -   connector = &rgb->connector;
> > connector = drm_bridge_connector_init(rgb->drm_dev, encoder);
> > if (IS_ERR(connector)) {
> > DRM_DEV_ERROR(drm_dev->dev,
> > 
> 
> 






Re: [Intel-gfx] [PATCH 20/27] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-09-22 Thread Matthew Brost
On Mon, Sep 20, 2021 at 05:09:28PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > Introduce 'set parallel submit' extension to connect UAPI to GuC
> > multi-lrc interface. Kernel doc in new uAPI should explain it all.
> > 
> > IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071=1
> > media UMD: link to come
> Is this link still not available?
> 

Have it now: https://github.com/intel/media-driver/pull/1252

> Also, see 'kernel test robot' emails saying that sparse is complaining about
> something I don't understand but presumably needs to be fixed.
>

Yeah, those warnings need to be fixed.
 
> 
> > 
> > v2:
> >   (Daniel Vetter)
> >- Add IGT link and placeholder for media UMD link
> > 
> > Cc: Tvrtko Ursulin 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 220 +-
> >   .../gpu/drm/i915/gem/i915_gem_context_types.h |   6 +
> >   drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +-
> >   drivers/gpu/drm/i915/gt/intel_engine.h|  12 +-
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c |   6 +-
> >   .../drm/i915/gt/intel_execlists_submission.c  |   6 +-
> >   drivers/gpu/drm/i915/gt/selftest_execlists.c  |  12 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 114 -
> >   include/uapi/drm/i915_drm.h   | 128 ++
> >   9 files changed, 485 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index bcaaf514876b..de0fd145fb47 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -522,9 +522,149 @@ set_proto_ctx_engines_bond(struct i915_user_extension 
> > __user *base, void *data)
> > return 0;
> >   }
> > +static int
> > +set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user 
> > *base,
> > + void *data)
> > +{
> > +   struct i915_context_engines_parallel_submit __user *ext =
> > +   container_of_user(base, typeof(*ext), base);
> > +   const struct set_proto_ctx_engines *set = data;
> > +   struct drm_i915_private *i915 = set->i915;
> > +   u64 flags;
> > +   int err = 0, n, i, j;
> > +   u16 slot, width, num_siblings;
> > +   struct intel_engine_cs **siblings = NULL;
> > +   intel_engine_mask_t prev_mask;
> > +
> > +   /* Disabling for now */
> > +   return -ENODEV;
> > +
> > +   if (!(intel_uc_uses_guc_submission(&i915->gt.uc)))
> > +   return -ENODEV;
> This needs a FIXME comment to say that exec list will be added later.
> 

Sure.

> > +
> > +   if (get_user(slot, &ext->engine_index))
> > +   return -EFAULT;
> > +
> > +   if (get_user(width, &ext->width))
> > +   return -EFAULT;
> > +
> > +   if (get_user(num_siblings, &ext->num_siblings))
> > +   return -EFAULT;
> > +
> > +   if (slot >= set->num_engines) {
> > +   drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
> > +   slot, set->num_engines);
> > +   return -EINVAL;
> > +   }
> > +
> > +   if (set->engines[slot].type != I915_GEM_ENGINE_TYPE_INVALID) {
> > +   drm_dbg(&i915->drm,
> > +   "Invalid placement[%d], already occupied\n", slot);
> > +   return -EINVAL;
> > +   }
> > +
> > +   if (get_user(flags, &ext->flags))
> > +   return -EFAULT;
> > +
> > +   if (flags) {
> > +   drm_dbg(>drm, "Unknown flags 0x%02llx", flags);
> > +   return -EINVAL;
> > +   }
> > +
> > +   for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
> > +   err = check_user_mbz(&ext->mbz64[n]);
> > +   if (err)
> > +   return err;
> > +   }
> > +
> > +   if (width < 2) {
> > +   drm_dbg(>drm, "Width (%d) < 2\n", width);
> > +   return -EINVAL;
> > +   }
> > +
> > +   if (num_siblings < 1) {
> > +   drm_dbg(>drm, "Number siblings (%d) < 1\n",
> > +   num_siblings);
> > +   return -EINVAL;
> > +   }
> > +
> > +   siblings = kmalloc_array(num_siblings * width,
> > +sizeof(*siblings),
> > +GFP_KERNEL);
> > +   if (!siblings)
> > +   return -ENOMEM;
> > +
> > +   /* Create contexts / engines */
> > +   for (i = 0; i < width; ++i) {
> > +   intel_engine_mask_t current_mask = 0;
> > +   struct i915_engine_class_instance prev_engine;
> > +
> > +   for (j = 0; j < num_siblings; ++j) {
> > +   struct i915_engine_class_instance ci;
> > +
> > +   n = i * num_siblings + j;
> > +   if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
> > +   err = -EFAULT;
> > +   goto out_err;
> > +   }
> > +
> > +   siblings[n] =
> > +   intel_engine_lookup_user(i915, ci.engine_class,
> > +

Re: [PATCH][next] drm/rockchip: Remove redundant assignment of pointer connector

2021-09-22 Thread Alex Bee

Hi Colin,
On 22.09.21 at 13:24, Colin King wrote:

From: Colin Ian King 

The pointer connector is being assigned a value that is never
read, it is being updated immediately afterwards. The assignment
is redundant and can be removed.


The pointer to the connector is used in rockchip_rgb_fini for 
drm_connector_cleanup.

It's pretty much the same for the encoder, btw.

Regards,

Alex


Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
  drivers/gpu/drm/rockchip/rockchip_rgb.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_rgb.c 
b/drivers/gpu/drm/rockchip/rockchip_rgb.c
index 09be9678f2bd..18fb84068a64 100644
--- a/drivers/gpu/drm/rockchip/rockchip_rgb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_rgb.c
@@ -150,7 +150,6 @@ struct rockchip_rgb *rockchip_rgb_init(struct device *dev,
if (ret)
goto err_free_encoder;
  
-	connector = &rgb->connector;

connector = drm_bridge_connector_init(rgb->drm_dev, encoder);
if (IS_ERR(connector)) {
DRM_DEV_ERROR(drm_dev->dev,





Re: [Intel-gfx] [PATCH 15/27] drm/i915/guc: Implement multi-lrc submission

2021-09-22 Thread Matthew Brost
On Mon, Sep 20, 2021 at 02:48:52PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > Implement multi-lrc submission via a single workqueue entry and single
> > H2G. The workqueue entry contains an updated tail value for each
> > request, of all the contexts in the multi-lrc submission, and updates
> > these values simultaneously. As such, the tasklet and bypass path have
> > been updated to coalesce requests into a single submission.
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.c|  21 ++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|   8 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  24 +-
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   6 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 312 +++---
> >   drivers/gpu/drm/i915/i915_request.h   |   8 +
> >   6 files changed, 317 insertions(+), 62 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > index fbfcae727d7f..879aef662b2e 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > @@ -748,3 +748,24 @@ void intel_guc_load_status(struct intel_guc *guc, 
> > struct drm_printer *p)
> > }
> > }
> >   }
> > +
> > +void intel_guc_write_barrier(struct intel_guc *guc)
> > +{
> > +   struct intel_gt *gt = guc_to_gt(guc);
> > +
> > +   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > +   GEM_BUG_ON(guc->send_regs.fw_domains);
> Granted, this patch is just moving code from one file to another not
> changing it. However, I think it would be worth adding a blank line in here.
> Otherwise the 'this register' comment below can be confusingly read as
> referring to the send_regs.fw_domain entry above.
> 
> And maybe add a comment why it is a bug for the send_regs value to be set?
> I'm not seeing any obvious connection between it and the rest of this code.
> 

Can add a blank line. I think the GEM_BUG_ON relates to being able to
use intel_uncore_write_fw vs intel_uncore_write. Can add comment.

> > +   /*
> > +* This register is used by the i915 and GuC for MMIO based
> > +* communication. Once we are in this code CTBs are the only
> > +* method the i915 uses to communicate with the GuC so it is
> > +* safe to write to this register (a value of 0 is NOP for MMIO
> > +* communication). If we ever start mixing CTBs and MMIOs a new
> > +* register will have to be chosen.
> > +*/
> > +   intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
> > +   } else {
> > +   /* wmb() sufficient for a barrier if in smem */
> > +   wmb();
> > +   }
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index 3f95b1b4f15c..0ead2406d03c 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -37,6 +37,12 @@ struct intel_guc {
> > /* Global engine used to submit requests to GuC */
> > struct i915_sched_engine *sched_engine;
> > struct i915_request *stalled_request;
> > +   enum {
> > +   STALL_NONE,
> > +   STALL_REGISTER_CONTEXT,
> > +   STALL_MOVE_LRC_TAIL,
> > +   STALL_ADD_REQUEST,
> > +   } submission_stall_reason;
> > /* intel_guc_recv interrupt related state */
> > spinlock_t irq_lock;
> > @@ -332,4 +338,6 @@ void intel_guc_submission_cancel_requests(struct 
> > intel_guc *guc);
> >   void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
> > +void intel_guc_write_barrier(struct intel_guc *guc);
> > +
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 20c710a74498..10d1878d2826 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -377,28 +377,6 @@ static u32 ct_get_next_fence(struct intel_guc_ct *ct)
> > return ++ct->requests.last_fence;
> >   }
> > -static void write_barrier(struct intel_guc_ct *ct)
> > -{
> > -   struct intel_guc *guc = ct_to_guc(ct);
> > -   struct intel_gt *gt = guc_to_gt(guc);
> > -
> > -   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > -   GEM_BUG_ON(guc->send_regs.fw_domains);
> > -   /*
> > -* This register is used by the i915 and GuC for MMIO based
> > -* communication. Once we are in this code CTBs are the only
> > -* method the i915 uses to communicate with the GuC so it is
> > -* safe to write to this register (a value of 0 is NOP for MMIO
> > -* communication). If we ever start mixing CTBs and MMIOs a new
> > -* register will have to be chosen.
> > -*/
> > -   intel_uncore_write_fw(gt->uncore, 

Re: [Intel-gfx] [PATCH 17/27] drm/i915/guc: Implement multi-lrc reset

2021-09-22 Thread Matthew Brost
On Mon, Sep 20, 2021 at 03:44:18PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> 
> Update context and full GPU reset to work with multi-lrc. The idea is
> parent context tracks all the active requests inflight for itself and
> its' children. The parent context owns the reset replaying / canceling
> 
> its' -> its
> 
> 
> requests as needed.
> 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/intel_context.c   | 11 ++--
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 63 +--
>  2 files changed, 51 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> b/drivers/gpu/drm/i915/gt/intel_context.c
> index 00d1aee6d199..5615be32879c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -528,20 +528,21 @@ struct i915_request 
> *intel_context_create_request(struct intel_context *ce)
> 
>  struct i915_request *intel_context_find_active_request(struct 
> intel_context *ce)
>  {
> +   struct intel_context *parent = intel_context_to_parent(ce);
> struct i915_request *rq, *active = NULL;
> unsigned long flags;
> 
> GEM_BUG_ON(!intel_engine_uses_guc(ce->engine));
> 
> Should this not check the parent as well/instead?
> 

I don't think so. The 'ce' could be a non-parallel context, a parent
context, or a child context. 

> And to be clear, this can be called on regular contexts (where ce == parent)
> and on both the parent or child contexts of multi-LRC contexts (where ce may 
> or
> may not match parent)?
>

Right. The parent owns parent->guc_state.lock/requests, and we search
that list for the first non-completed request that matches the submitted
'ce'.
 
> 
> 
> 
> -   spin_lock_irqsave(&ce->guc_state.lock, flags);
> -   list_for_each_entry_reverse(rq, &ce->guc_state.requests,
> +   spin_lock_irqsave(&parent->guc_state.lock, flags);
> +   list_for_each_entry_reverse(rq, &parent->guc_state.requests,
> sched.link) {
> -   if (i915_request_completed(rq))
> +   if (i915_request_completed(rq) && rq->context == ce)
> 
> 'rq->context == ce' means:
> 
>  1. single-LRC context, rq is owned by ce
>  2. multi-LRC context, ce is child, rq really belongs to ce but is being
> tracked by parent
>  3. multi-LRC context, ce is parent, rq really is owned by ce
> 
> So when 'rq->ce != ce', it means that the request is owned by a different 
> child
> to 'ce' but within the same multi-LRC group. So we want to ignore that request
> and keep searching until we find one that is really owned by the target ce?
>

All correct.
 
> 
> break;
> 
> -   active = rq;
> +   active = (rq->context == ce) ? rq : active;
> 
> Would be clearer to say 'if(rq->ce != ce) continue;' and leave 'active = rq;'
> alone?
>

Yes, that is probably cleaner.
 
> And again, the intention is to ignore requests that are owned by other members
> of the same multi-LRC group?
> 
> Would be good to add some documentation to this function to explain the above
> (assuming my description is correct?).
>

Will add a comment explaining this.
 
> 
> }
> -   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> +   spin_unlock_irqrestore(&parent->guc_state.lock, flags);
> 
> return active;
>  }
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index f0b60fecf253..e34e0ea9136a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -670,6 +670,11 @@ static int rq_prio(const struct i915_request *rq)
> return rq->sched.attr.priority;
>  }
> 
> +static inline bool is_multi_lrc(struct intel_context *ce)
> +{
> +   return intel_context_is_parallel(ce);
> +}
> +
>  static bool is_multi_lrc_rq(struct i915_request *rq)
>  {
> return intel_context_is_parallel(rq->context);
> @@ -1179,10 +1184,13 @@ __unwind_incomplete_requests(struct intel_context 
> *ce)
> 
>  static void __guc_reset_context(struct intel_context *ce, bool stalled)
>  {
> +   bool local_stalled;
> struct i915_request *rq;
> unsigned long flags;
> u32 head;
> +   int i, number_children = ce->guc_number_children;
> 
> If this is a child context, does it not need to pull the child count from the
> parent? Likewise the list/link pointers below? Or does each child context have
> a full list of its siblings + parent?
> 

This function shouldn't be called by a child. Will add
GEM_BUG_ON(intel_context_is_child(ce)) to this function.

> 
> bool skip = false;
> +   struct intel_context *parent = ce;
> 
> 

Re: [PATCH 4/4] dt-bindings: display: mediatek: add MT8195 hdmi bindings

2021-09-22 Thread Chun-Kuang Hu
Hi, Guillaume:

Rob Herring wrote on Tue, Sep 7, 2021 at 10:51 PM:
>
> On Tue, 07 Sep 2021 10:37:21 +0200, Guillaume Ranquet wrote:
> > Add Mediatek HDMI and HDMI-DDC bindings for MT8195 SoC.

Move this patch before the driver patch which refers to it.

> >
> > Signed-off-by: Guillaume Ranquet 
> > ---
> >  .../mediatek/mediatek,mt8195-hdmi-ddc.yaml| 46 +
> >  .../mediatek/mediatek,mt8195-hdmi.yaml| 99 +++
> >  2 files changed, 145 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/display/mediatek/mediatek,mt8195-hdmi-ddc.yaml

I think this file should be merged into mediatek,hdmi-ddc.yaml [1].

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/display/mediatek/mediatek,hdmi-ddc.yaml?h=v5.15-rc2

> >  create mode 100644 
> > Documentation/devicetree/bindings/display/mediatek/mediatek,mt8195-hdmi.yaml

I think this file should be merged into mediatek,hdmi.yaml [2].

[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/display/mediatek/mediatek,hdmi.yaml?h=v5.15-rc2

Regards,
Chun-Kuang.

> >
>
> My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
> on your patch (DT_CHECKER_FLAGS is new in v5.13):
>
> yamllint warnings/errors:
>
> dtschema/dtc warnings/errors:
> Documentation/devicetree/bindings/display/mediatek/mediatek,mt8195-hdmi.example.dts:19:18:
>  fatal error: dt-bindings/clock/mt8195-clk.h: No such file or directory
>19 | #include <dt-bindings/clock/mt8195-clk.h>
>   |  ^~~~
> compilation terminated.
> make[1]: *** [scripts/Makefile.lib:379: 
> Documentation/devicetree/bindings/display/mediatek/mediatek,mt8195-hdmi.example.dt.yaml]
>  Error 1
> make[1]: *** Waiting for unfinished jobs
> make: *** [Makefile:1438: dt_binding_check] Error 2
>
> doc reference errors (make refcheckdocs):
>
> See https://patchwork.ozlabs.org/patch/1525170
>
> This check can fail if there are any dependencies. The base for a patch
> series is generally the most recent rc1.
>
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure 'yamllint' is installed and dt-schema is up to
> date:
>
> pip3 install dtschema --upgrade
>
> Please check and re-submit.
>


Re: [PATCH 3/4] dt-bindings: phy: Add binding for Mediatek MT8195 HDMI PHY

2021-09-22 Thread Chun-Kuang Hu
Hi, Guillaume:

Guillaume Ranquet wrote on Tue, Sep 7, 2021 at 4:39 PM:
>
> Add bindings to describe Mediatek MT8195 HDMI PHY

Move this patch before the driver patch which references it.

>
> Signed-off-by: Guillaume Ranquet 
> ---
>  .../phy/mediatek,mtk8195-hdmi-phy.yaml| 71 +++
>  1 file changed, 71 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/phy/mediatek,mtk8195-hdmi-phy.yaml

I think this file should be merged into mediatek,hdmi-phy.yaml [1].

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/phy/mediatek,hdmi-phy.yaml?h=v5.15-rc2

Regards,
Chun-Kuang.

>
> diff --git 
> a/Documentation/devicetree/bindings/phy/mediatek,mtk8195-hdmi-phy.yaml 
> b/Documentation/devicetree/bindings/phy/mediatek,mtk8195-hdmi-phy.yaml
> new file mode 100644
> index ..f03bd3af7fd8
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/phy/mediatek,mtk8195-hdmi-phy.yaml
> @@ -0,0 +1,71 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +# Copyright (c) 2020 MediaTek
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/phy/mediatek,hdmi-phy.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: MediaTek High Definition Multimedia Interface (HDMI) PHY binding for 
> mt8195
> +
> +maintainers:
> +  - Chun-Kuang Hu 
> +  - Philipp Zabel 
> +  - Chunfeng Yun 
> +
> +description: |
> +  The HDMI PHY serializes the HDMI encoder's three channel 10-bit parallel
> +  output and drives the HDMI pads.
> +
> +properties:
> +  $nodename:
> +pattern: "^hdmi-phy@[0-9a-f]+$"
> +
> +  compatible:
> +- const: mediatek,mt8195-hdmi-phy
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: PLL reference clock
> +
> +  clock-names:
> +items:
> +  - const: hdmi_xtal_sel
> +
> +  clock-output-names:
> +items:
> +  - const: hdmi_txpll
> +
> +  "#phy-cells":
> +const: 0
> +
> +  "#clock-cells":
> +const: 0
> +
> +required:
> +  - compatible
> +  - reg
> +  - clocks
> +  - clock-names
> +  - clock-output-names
> +  - "#phy-cells"
> +  - "#clock-cells"
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +#include <dt-bindings/clock/mt8195-clk.h>
> +hdmi_phy: hdmi-phy@11d5f000 {
> +compatible = "mediatek,mt8195-hdmi-phy";
> +reg = <0 0x11d5f000 0 0x100>;
> +clocks = < CLK_TOP_HDMI_XTAL>;
> +clock-names = "hdmi_xtal_sel";
> +clock-output-names = "hdmi_txpll";
> +#clock-cells = <0>;
> +#phy-cells = <0>;
> +};
> +
> +...
> --
> 2.31.1
>


[PATCH] drm/i915: Drop stealing of bits from i915_sw_fence function pointer

2021-09-22 Thread Matthew Brost
Rather than stealing bits from the i915_sw_fence function pointer, use
separate fields for the function pointer and flags. With two separate
fields, the 4 byte alignment requirement for the i915_sw_fence function
pointer can also be dropped.

v2:
 (CI)
  - Set new function field rather than flags in __i915_sw_fence_init
v3:
 (Tvrtko)
  - Remove BUG_ON(!fence->flags) in reinit as that will now blow up
  - Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is
defined

Signed-off-by: Matthew Brost 
Acked-by: Jani Nikula 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
 drivers/gpu/drm/i915/i915_request.c   |  4 +--
 drivers/gpu/drm/i915/i915_sw_fence.c  | 28 +++
 drivers/gpu/drm/i915/i915_sw_fence.h  | 23 +++
 drivers/gpu/drm/i915/i915_sw_fence_work.c |  2 +-
 .../gpu/drm/i915/selftests/i915_sw_fence.c|  2 +-
 drivers/gpu/drm/i915/selftests/lib_sw_fence.c |  8 +++---
 8 files changed, 39 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index a7ca38613f89..6d5bb55ffc82 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -10323,7 +10323,7 @@ static void intel_atomic_commit_work(struct work_struct 
*work)
intel_atomic_commit_tail(state);
 }
 
-static int __i915_sw_fence_call
+static int
 intel_atomic_commit_ready(struct i915_sw_fence *fence,
  enum i915_sw_fence_notify notify)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..df5fec5c3da8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -800,7 +800,7 @@ static void free_engines_rcu(struct rcu_head *rcu)
free_engines(engines);
 }
 
-static int __i915_sw_fence_call
+static int
 engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_gem_engines *engines =
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index ce446716d092..945d3025a0b6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -719,7 +719,7 @@ void i915_request_cancel(struct i915_request *rq, int error)
intel_context_cancel_request(rq->context, rq);
 }
 
-static int __i915_sw_fence_call
+static int
 submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_request *request =
@@ -755,7 +755,7 @@ submit_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
return NOTIFY_DONE;
 }
 
-static int __i915_sw_fence_call
+static int
 semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_request *rq = container_of(fence, typeof(*rq), semaphore);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..f10d31818ecc 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -18,7 +18,9 @@
 #define I915_SW_FENCE_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
 #endif
 
+#ifdef CONFIG_DRM_I915_SW_FENCE_CHECK_DAG
 static DEFINE_SPINLOCK(i915_sw_fence_lock);
+#endif
 
 #define WQ_FLAG_BITS \
BITS_PER_TYPE(typeof_member(struct wait_queue_entry, flags))
@@ -34,7 +36,7 @@ enum {
 
 static void *i915_sw_fence_debug_hint(void *addr)
 {
-   return (void *)(((struct i915_sw_fence *)addr)->flags & 
I915_SW_FENCE_MASK);
+   return (void *)(((struct i915_sw_fence *)addr)->fn);
 }
 
 #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -126,10 +128,7 @@ static inline void debug_fence_assert(struct i915_sw_fence 
*fence)
 static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
  enum i915_sw_fence_notify state)
 {
-   i915_sw_fence_notify_t fn;
-
-   fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
-   return fn(fence, state);
+   return fence->fn(fence, state);
 }
 
 #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -242,10 +241,13 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
  const char *name,
  struct lock_class_key *key)
 {
-   BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
+   BUG_ON(!fn);
 
__init_waitqueue_head(&fence->wait, name, key);
-   fence->flags = (unsigned long)fn;
+   fence->fn = fn;
+#ifdef CONFIG_DRM_I915_SW_FENCE_CHECK_DAG
+   fence->flags = 0;
+#endif
 
i915_sw_fence_reinit(fence);
 }
@@ -257,7 +259,6 @@ void i915_sw_fence_reinit(struct i915_sw_fence *fence)
atomic_set(&fence->pending, 1);
fence->error = 0;
 
-   I915_SW_FENCE_BUG_ON(!fence->flags);
I915_SW_FENCE_BUG_ON(!list_empty(&fence->wait.head));
 }
 
@@ -279,6 +280,7 @@ static int 

[PATCH 2/7] drm/i915: Make GEM contexts track DRM clients

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Make GEM contexts keep a reference to i915_drm_client for the whole of
their lifetime, which will come in handy in following patches.

v2: Don't bother supporting selftests contexts from debugfs. (Chris)
v3 (Lucas): Finish constructing ctx before adding it to the list
v4 (Ram): Rebase.
v5: Trivial rebase for proto ctx changes.
v6: Rebase after clients no longer track name and pid.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Chris Wilson  # v5
Reviewed-by: Aravind Iddamsetty  # v5
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 5 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..70340663136e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -956,6 +956,9 @@ static void i915_gem_context_release_work(struct 
work_struct *work)
if (vm)
i915_vm_put(vm);
 
+   if (ctx->client)
+   i915_drm_client_put(ctx->client);
+
mutex_destroy(&ctx->engines_mutex);
mutex_destroy(&ctx->lut_mutex);
 
@@ -1373,6 +1376,8 @@ static void gem_context_register(struct i915_gem_context 
*ctx,
ctx->file_priv = fpriv;
 
ctx->pid = get_task_pid(current, PIDTYPE_PID);
+   ctx->client = i915_drm_client_get(fpriv->client);
+
snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
 current->comm, pid_nr(ctx->pid));
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index c4617e4d9fa9..598c57ac5cdf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -277,6 +277,9 @@ struct i915_gem_context {
/** @link: place with &drm_i915_private.context_list */
struct list_head link;
 
+   /** @client: struct i915_drm_client */
+   struct i915_drm_client *client;
+
/**
 * @ref: reference count
 *
-- 
2.30.2



[PATCH 3/7] drm/i915: Track runtime spent in closed and unreachable GEM contexts

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

As contexts are abandoned we want to remember how much GPU time they used
(per class) so later we can use it for smarter purposes.

As GEM contexts are closed we want to have the DRM client remember how
much GPU time they used (per class) so later we can use it for smarter
purposes.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Aravind Iddamsetty 
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 +++--
 drivers/gpu/drm/i915/i915_drm_client.h  |  7 ++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 70340663136e..9b37723b70a9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -800,23 +800,44 @@ static void free_engines_rcu(struct rcu_head *rcu)
free_engines(engines);
 }
 
+static void accumulate_runtime(struct i915_drm_client *client,
+  struct i915_gem_engines *engines)
+{
+   struct i915_gem_engines_iter it;
+   struct intel_context *ce;
+
+   if (!client)
+   return;
+
+   /* Transfer accumulated runtime to the parent GEM context. */
+   for_each_gem_engine(ce, engines, it) {
+   unsigned int class = ce->engine->uabi_class;
+
+   GEM_BUG_ON(class >= ARRAY_SIZE(client->past_runtime));
+   atomic64_add(intel_context_get_total_runtime_ns(ce),
+&client->past_runtime[class]);
+   }
+}
+
 static int __i915_sw_fence_call
 engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_gem_engines *engines =
container_of(fence, typeof(*engines), fence);
+   struct i915_gem_context *ctx = engines->ctx;
 
switch (state) {
case FENCE_COMPLETE:
if (!list_empty(&engines->link)) {
-   struct i915_gem_context *ctx = engines->ctx;
unsigned long flags;
 
spin_lock_irqsave(&ctx->stale.lock, flags);
list_del(&engines->link);
spin_unlock_irqrestore(&ctx->stale.lock, flags);
}
-   i915_gem_context_put(engines->ctx);
+   accumulate_runtime(ctx->client, engines);
+   i915_gem_context_put(ctx);
+
break;
 
case FENCE_FREE:
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h 
b/drivers/gpu/drm/i915/i915_drm_client.h
index e8986ad51176..9d80d9f715ee 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -9,6 +9,8 @@
 #include 
 #include 
 
+#include "gt/intel_engine_types.h"
+
 struct drm_i915_private;
 
 struct i915_drm_clients {
@@ -24,6 +26,11 @@ struct i915_drm_client {
unsigned int id;
 
struct i915_drm_clients *clients;
+
+   /**
+* @past_runtime: Accumulation of pphwsp runtimes from closed contexts.
+*/
+   atomic64_t past_runtime[MAX_ENGINE_CLASS + 1];
 };
 
 void i915_drm_clients_init(struct i915_drm_clients *clients,
-- 
2.30.2



[PATCH 7/7] drm/i915: Expose client engine utilisation via fdinfo

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Similar to AMD commit
874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the
infrastructure added in previous patches, we add basic client info
and GPU engine utilisation for i915.

Example of the output:

  pos:0
  flags:  012
  mnt_id: 21
  drm-driver: i915
  drm-pdev:   0000:00:02.0
  drm-client-id:  7
  drm-engine-render:  9288864723 ns
  drm-engine-copy:2035071108 ns
  drm-engine-video:   0 ns
  drm-engine-video-enhance:   0 ns

v2:
 * Update for removal of name and pid.

v3:
 * Use drm_driver.name.

Signed-off-by: Tvrtko Ursulin 
Cc: David M Nieto 
Cc: Christian König 
Cc: Daniel Vetter 
Acked-by: Christian König 
---
 Documentation/gpu/drm-usage-stats.rst  |  6 +++
 Documentation/gpu/i915.rst | 27 ++
 drivers/gpu/drm/i915/i915_drm_client.c | 73 ++
 drivers/gpu/drm/i915/i915_drm_client.h |  4 ++
 drivers/gpu/drm/i915/i915_drv.c|  3 ++
 5 files changed, 113 insertions(+)

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index c669026be244..6952f8389d07 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -95,3 +95,9 @@ objects belonging to this client, in the respective memory region.
 
 Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
 indicating kibi- or mebi-bytes.
+
+===============================
+Driver specific implementations
+===============================
+
+:ref:`i915-usage-stats`
diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 311e10400708..36cd6f74fb1b 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -700,3 +700,30 @@ The style guide for ``i915_reg.h``.
 
 .. kernel-doc:: drivers/gpu/drm/i915/i915_reg.h
:doc: The i915 register macro definition style guide
+
+.. _i915-usage-stats:
+
+i915 DRM client usage stats implementation
+==========================================
+
+The drm/i915 driver implements the DRM client usage stats specification as
+documented in :ref:`drm-client-usage-stats`.
+
+Example of the output showing the implemented key value pairs and entirety of
+the currently possible format options:
+
+::
+
+  pos:0
+  flags:  012
+  mnt_id: 21
+  drm-driver: i915
+  drm-pdev:   0000:00:02.0
+  drm-client-id:  7
+  drm-engine-render:  9288864723 ns
+  drm-engine-copy:2035071108 ns
+  drm-engine-video:   0 ns
+  drm-engine-video-enhance:   0 ns
+
+Possible `drm-engine-` key names are: `render`, `copy`, `video` and
+`video-enhance`.
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c 
b/drivers/gpu/drm/i915/i915_drm_client.c
index 91a8559bebf7..06dbd20ce763 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -7,6 +7,11 @@
 #include 
 #include 
 
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
 #include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
@@ -68,3 +73,71 @@ void i915_drm_clients_fini(struct i915_drm_clients *clients)
GEM_BUG_ON(!xa_empty(&clients->xarray));
xa_destroy(&clients->xarray);
 }
+
+#ifdef CONFIG_PROC_FS
+static const char * const uabi_class_names[] = {
+   [I915_ENGINE_CLASS_RENDER] = "render",
+   [I915_ENGINE_CLASS_COPY] = "copy",
+   [I915_ENGINE_CLASS_VIDEO] = "video",
+   [I915_ENGINE_CLASS_VIDEO_ENHANCE] = "video-enhance",
+};
+
+static u64 busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+   struct i915_gem_engines_iter it;
+   struct intel_context *ce;
+   u64 total = 0;
+
+   for_each_gem_engine(ce, rcu_dereference(ctx->engines), it) {
+   if (ce->engine->uabi_class != class)
+   continue;
+
+   total += intel_context_get_total_runtime_ns(ce);
+   }
+
+   return total;
+}
+
+static void
+show_client_class(struct seq_file *m,
+ struct i915_drm_client *client,
+ unsigned int class)
+{
+   const struct list_head *list = &client->ctx_list;
+   u64 total = atomic64_read(&client->past_runtime[class]);
+   struct i915_gem_context *ctx;
+
+   rcu_read_lock();
+   list_for_each_entry_rcu(ctx, list, client_link)
+   total += busy_add(ctx, class);
+   rcu_read_unlock();
+
+   return seq_printf(m, "drm-engine-%s:\t%llu ns\n",
+ uabi_class_names[class], total);
+}
+
+void i915_drm_client_fdinfo(struct seq_file *m, struct file *f)
+{
+   struct drm_file *file = f->private_data;
+   struct drm_i915_file_private *file_priv = file->driver_priv;
+   struct drm_i915_private *i915 = file_priv->dev_priv;
+   struct i915_drm_client *client = file_priv->client;
+   struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
+   unsigned int i;
+
+   /*
+* **
+* For text output format description please 

[PATCH 5/7] drm/i915: Track context current active time

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Track context active (on hardware) status together with the start
timestamp.

This will be used to provide better granularity of context
runtime reporting in conjunction with already tracked pphwsp accumulated
runtime.

The latter is only updated on context save so does not give us visibility
to any currently executing work.

As part of the patch the existing runtime tracking data is moved under the
new ce->stats member and updated under the seqlock. This provides the
ability to atomically read out accumulated plus active runtime.

v2:
 * Rename and make __intel_context_get_active_time unlocked.

v3:
 * Use GRAPHICS_VER.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Aravind Iddamsetty  #  v1
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context.c   | 27 ++-
 drivers/gpu/drm/i915/gt/intel_context.h   | 15 ---
 drivers/gpu/drm/i915/gt/intel_context_types.h | 24 +++--
 .../drm/i915/gt/intel_execlists_submission.c  | 23 
 .../gpu/drm/i915/gt/intel_gt_clock_utils.c|  4 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 27 ++-
 drivers/gpu/drm/i915/gt/intel_lrc.h   | 24 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 10 +++
 drivers/gpu/drm/i915/i915_gpu_error.c |  9 +++
 drivers/gpu/drm/i915/i915_gpu_error.h |  2 +-
 10 files changed, 116 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index ff637147b1a9..ae97c311d65b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -382,7 +382,7 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
ce->ring = NULL;
ce->ring_size = SZ_4K;
 
-   ewma_runtime_init(&ce->runtime.avg);
+   ewma_runtime_init(&ce->stats.runtime.avg);
 
ce->vm = i915_vm_get(engine->gt->vm);
 
@@ -532,6 +532,31 @@ struct i915_request 
*intel_context_find_active_request(struct intel_context *ce)
return active;
 }
 
+u64 intel_context_get_total_runtime_ns(const struct intel_context *ce)
+{
+   u64 total, active;
+
+   total = ce->stats.runtime.total;
+   if (ce->ops->flags & COPS_RUNTIME_CYCLES)
+   total *= ce->engine->gt->clock_period_ns;
+
+   active = READ_ONCE(ce->stats.active);
+   if (active)
+   active = intel_context_clock() - active;
+
+   return total + active;
+}
+
+u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
+{
+   u64 avg = ewma_runtime_read(&ce->stats.runtime.avg);
+
+   if (ce->ops->flags & COPS_RUNTIME_CYCLES)
+   avg *= ce->engine->gt->clock_period_ns;
+
+   return avg;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index c41098950746..2aaffe1bb388 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -296,18 +296,13 @@ intel_context_clear_nopreempt(struct intel_context *ce)
clear_bit(CONTEXT_NOPREEMPT, &ce->flags);
 }
 
-static inline u64 intel_context_get_total_runtime_ns(struct intel_context *ce)
-{
-   const u32 period = ce->engine->gt->clock_period_ns;
-
-   return READ_ONCE(ce->runtime.total) * period;
-}
+u64 intel_context_get_total_runtime_ns(const struct intel_context *ce);
+u64 intel_context_get_avg_runtime_ns(struct intel_context *ce);
 
-static inline u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
+static inline u64 intel_context_clock(void)
 {
-   const u32 period = ce->engine->gt->clock_period_ns;
-
-   return mul_u32_u32(ewma_runtime_read(&ce->runtime.avg), period);
+   /* As we mix CS cycles with CPU clocks, use the raw monotonic clock. */
+   return ktime_get_raw_fast_ns();
 }
 
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 930569a1a01f..2a8a8d207691 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -35,6 +35,9 @@ struct intel_context_ops {
 #define COPS_HAS_INFLIGHT_BIT 0
 #define COPS_HAS_INFLIGHT BIT(COPS_HAS_INFLIGHT_BIT)
 
+#define COPS_RUNTIME_CYCLES_BIT 1
+#define COPS_RUNTIME_CYCLES BIT(COPS_RUNTIME_CYCLES_BIT)
+
int (*alloc)(struct intel_context *ce);
 
void (*ban)(struct intel_context *ce, struct i915_request *rq);
@@ -128,14 +131,19 @@ struct intel_context {
} lrc;
u32 tag; /* cookie passed to HW to track this context on submission */
 
-   /* Time on GPU as tracked by the hw. */
-   struct {
-   struct ewma_runtime avg;
-   u64 total;
-   u32 last;
-   I915_SELFTEST_DECLARE(u32 num_underflow);
-   

[PATCH 6/7] drm: Document fdinfo format specification

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Proposal to standardise the fdinfo text format as optionally output by DRM
drivers.

The idea is that a simple but well-defined spec will enable generic
userspace tools to be written while at the same time avoiding the more
heavy-handed approach of adding a mid-layer to DRM.

i915 implements a subset of the spec, everything apart from the memory
stats currently, and a matching intel_gpu_top tool exists.

An open question is whether AMD can migrate to the proposed GPU
utilisation key-value pairs; if they are not workable, whether to go
vendor specific, or whether a standardised alternative can be found which
works for both drivers.

Same for the memory utilisation key-value pairs proposal.

v2:
 * Update for removal of name and pid.

v3:
 * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)

Signed-off-by: Tvrtko Ursulin 
Cc: David M Nieto 
Cc: Christian König 
Cc: Daniel Vetter 
Cc: Daniel Stone 
Acked-by: Christian König 
---
 Documentation/gpu/drm-usage-stats.rst | 97 +++
 Documentation/gpu/index.rst   |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/gpu/drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
new file mode 100644
index ..c669026be244
--- /dev/null
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -0,0 +1,97 @@
+.. _drm-client-usage-stats:
+
+======================
+DRM client usage stats
+======================
+
+DRM drivers can choose to export partly standardised text output via the
+`fops->show_fdinfo()` as part of the driver specific file operations registered
+in the `struct drm_driver` object registered with the DRM core.
+
+One purpose of this output is to enable writing as generic as practically
+feasible `top(1)` like userspace monitoring tools.
+
+Given the differences between various DRM drivers the specification of the
+output is split between common and driver specific parts. Having said that,
+wherever possible effort should still be made to standardise as much as
+possible.
+
+File format specification
+=========================
+
+- File shall contain one key value pair per one line of text.
+- Colon character (`:`) must be used to delimit keys and values.
+- All keys shall be prefixed with `drm-`.
+- Whitespace between the delimiter and first non-whitespace character shall be
+  ignored when parsing.
+- Neither keys nor values are allowed to contain whitespace characters.
+- Numerical key value pairs can end with an optional unit string.
+- Data type of the value is fixed as defined in the specification.
+
+Key types
+---------
+
+1. Mandatory, fully standardised.
+2. Optional, fully standardised.
+3. Driver specific.
+
+Data types
+----------
+
+- <uint> - Unsigned integer without defining the maximum value.
+- <str> - String excluding any above defined reserved characters or whitespace.
+
+Mandatory fully standardised keys
+---------------------------------
+
+- drm-driver: <str>
+
+String shall contain the name this driver registered as via the respective
+`struct drm_driver` data structure.
+
+Optional fully standardised keys
+--------------------------------
+
+- drm-pdev: <aaaa:bb:cc.d>
+
+For PCI devices this should contain the PCI slot address of the device in
+question.
+
+- drm-client-id: <uint>
+
+Unique value relating to the open DRM file descriptor used to distinguish
+duplicated and shared file descriptors. Conceptually the value should map 1:1
+to the in kernel representation of `struct drm_file` instances.
+
+Uniqueness of the value shall be either globally unique, or unique within the
+scope of each device, in which case `drm-pdev` shall be present as well.
+
+Userspace should make sure to not double account any usage statistics by using
+the above described criteria in order to associate data to individual clients.
+
+- drm-engine-<str>: <uint> ns
+
+GPUs usually contain multiple execution engines. Each shall be given a stable
+and unique name (str), with possible values documented in the driver specific
+documentation.
+
+Value shall be in specified time units which the respective GPU engine spent
+busy executing workloads belonging to this client.
+
+Values are not required to be constantly monotonic if it makes the driver
+implementation easier, but are required to catch up with the previously 
reported
+larger value within a reasonable period. Upon observing a value lower than what
+was previously read, userspace is expected to stay with that larger previous
+value until a monotonic update is seen.
+
+- drm-memory-<str>: <uint> [KiB|MiB]
+
+Each possible memory type which can be used to store buffer objects by the
+GPU in question shall be given a stable and unique name to be returned as the
+string here.
+
+Value shall reflect the amount of storage currently consumed by the buffer
+objects belonging to this client, in the respective memory region.
+
+Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
+indicating kibi- or mebi-bytes.

[PATCH 1/7] drm/i915: Explicitly track DRM clients

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Tracking DRM clients more explicitly will allow later patches to
accumulate past and current GPU usage in a centralised place and also
consolidate access to owning task pid/name.

Unique client id is also assigned for the purpose of distinguishing/
consolidating between multiple file descriptors owned by the same process.

v2:
 Chris Wilson:
 * Enclose new members into dedicated structs.
 * Protect against failed sysfs registration.

v3:
 * sysfs_attr_init.

v4:
 * Fix for internal clients.

v5:
 * Use cyclic ida for client id. (Chris)
 * Do not leak pid reference. (Chris)
 * Tidy code with some locals.

v6:
 * Use xa_alloc_cyclic to simplify locking. (Chris)
 * No need to unregister individual sysfs files. (Chris)
 * Rebase on top of fpriv kref.
 * Track client closed status and reflect in sysfs.

v7:
 * Make drm_client more standalone concept.

v8:
 * Simplify sysfs show. (Chris)
 * Always track name and pid.

v9:
 * Fix cyclic id assignment.

v10:
 * No need for a mutex around xa_alloc_cyclic.
 * Refactor sysfs into own function.
 * Unregister sysfs before freeing pid and name.
 * Move clients setup into own function.

v11:
 * Call clients init directly from driver init. (Chris)

v12:
 * Do not fail client add on id wrap. (Maciej)

v13 (Lucas): Rebase.

v14:
 * Dropped sysfs bits.

v15:
 * Dropped tracking of pid/ and name.
 * Dropped RCU freeing of the client object.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Chris Wilson  # v11
Reviewed-by: Aravind Iddamsetty  # v11
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile  |  5 +-
 drivers/gpu/drm/i915/i915_drm_client.c | 68 ++
 drivers/gpu/drm/i915/i915_drm_client.h | 50 +++
 drivers/gpu/drm/i915/i915_drv.c|  6 +++
 drivers/gpu/drm/i915/i915_drv.h|  5 ++
 drivers/gpu/drm/i915/i915_gem.c| 21 ++--
 6 files changed, 150 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 335a8c668848..8187c9e52a79 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -32,8 +32,9 @@ subdir-ccflags-y += -I$(srctree)/$(src)
 # Please keep these build lists sorted!
 
 # core driver code
-i915-y += i915_drv.o \
- i915_config.o \
+i915-y += i915_config.o \
+ i915_drm_client.o \
+ i915_drv.o \
  i915_irq.o \
  i915_getparam.o \
  i915_mitigations.o \
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c 
b/drivers/gpu/drm/i915/i915_drm_client.c
new file mode 100644
index ..e61e9ba15256
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include "i915_drm_client.h"
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+void i915_drm_clients_init(struct i915_drm_clients *clients,
+  struct drm_i915_private *i915)
+{
+   clients->i915 = i915;
+   clients->next_id = 0;
+
+   xa_init_flags(&clients->xarray, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
+}
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
+{
+   struct i915_drm_client *client;
+   struct xarray *xa = &clients->xarray;
+   int ret;
+
+   client = kzalloc(sizeof(*client), GFP_KERNEL);
+   if (!client)
+   return ERR_PTR(-ENOMEM);
+
+   xa_lock_irq(xa);
+   ret = __xa_alloc_cyclic(xa, &client->id, client, xa_limit_32b,
+   &clients->next_id, GFP_KERNEL);
+   xa_unlock_irq(xa);
+   if (ret < 0)
+   goto err;
+
+   kref_init(&client->kref);
+   client->clients = clients;
+
+   return client;
+
+err:
+   kfree(client);
+
+   return ERR_PTR(ret);
+}
+
+void __i915_drm_client_free(struct kref *kref)
+{
+   struct i915_drm_client *client =
+   container_of(kref, typeof(*client), kref);
+   struct xarray *xa = &client->clients->xarray;
+   unsigned long flags;
+
+   xa_lock_irqsave(xa, flags);
+   __xa_erase(xa, client->id);
+   xa_unlock_irqrestore(xa, flags);
+   kfree(client);
+}
+
+void i915_drm_clients_fini(struct i915_drm_clients *clients)
+{
+   GEM_BUG_ON(!xa_empty(&clients->xarray));
+   xa_destroy(&clients->xarray);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h 
b/drivers/gpu/drm/i915/i915_drm_client.h
new file mode 100644
index ..e8986ad51176
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __I915_DRM_CLIENT_H__
+#define __I915_DRM_CLIENT_H__
+
+#include 
+#include 
+
+struct drm_i915_private;
+
+struct i915_drm_clients {
+   struct drm_i915_private *i915;
+
+   struct xarray xarray;
+   u32 next_id;
+};
+
+struct 

[PATCH 4/7] drm/i915: Track all user contexts per client

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

We soon want to start answering questions like how much GPU time a
context belonging to a client which has exited is still using.

To enable this, start tracking all contexts belonging to a client on a
separate list.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Aravind Iddamsetty 
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 12 
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h |  3 +++
 drivers/gpu/drm/i915/i915_drm_client.c|  2 ++
 drivers/gpu/drm/i915/i915_drm_client.h|  5 +
 4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9b37723b70a9..a1ef6be28899 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1190,6 +1190,7 @@ static void set_closed_name(struct i915_gem_context *ctx)
 
 static void context_close(struct i915_gem_context *ctx)
 {
+   struct i915_drm_client *client;
struct i915_address_space *vm;
 
/* Flush any concurrent set_engines() */
@@ -1226,6 +1227,13 @@ static void context_close(struct i915_gem_context *ctx)
list_del(&ctx->link);
spin_unlock(&ctx->i915->gem.contexts.lock);
 
+   client = ctx->client;
+   if (client) {
+   spin_lock(&client->ctx_lock);
+   list_del_rcu(&ctx->client_link);
+   spin_unlock(&client->ctx_lock);
+   }
+
mutex_unlock(&ctx->mutex);
 
/*
@@ -1406,6 +1414,10 @@ static void gem_context_register(struct i915_gem_context 
*ctx,
old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
WARN_ON(old);
 
+   spin_lock(&ctx->client->ctx_lock);
+   list_add_tail_rcu(&ctx->client_link, &ctx->client->ctx_list);
+   spin_unlock(&ctx->client->ctx_lock);
+
spin_lock(&i915->gem.contexts.lock);
list_add_tail(&ctx->link, &i915->gem.contexts.list);
spin_unlock(&i915->gem.contexts.lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 598c57ac5cdf..b878e1b13b38 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -280,6 +280,9 @@ struct i915_gem_context {
/** @client: struct i915_drm_client */
struct i915_drm_client *client;
 
+   /** link: &drm_client.context_list */
+   struct list_head client_link;
+
/**
 * @ref: reference count
 *
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c 
b/drivers/gpu/drm/i915/i915_drm_client.c
index e61e9ba15256..91a8559bebf7 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -38,6 +38,8 @@ struct i915_drm_client *i915_drm_client_add(struct 
i915_drm_clients *clients)
goto err;
 
kref_init(&client->kref);
+   spin_lock_init(&client->ctx_lock);
+   INIT_LIST_HEAD(&client->ctx_list);
client->clients = clients;
 
return client;
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h 
b/drivers/gpu/drm/i915/i915_drm_client.h
index 9d80d9f715ee..7416e18aa33c 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -7,6 +7,8 @@
 #define __I915_DRM_CLIENT_H__
 
 #include 
+#include 
+#include 
 #include 
 
 #include "gt/intel_engine_types.h"
@@ -25,6 +27,9 @@ struct i915_drm_client {
 
unsigned int id;
 
+   spinlock_t ctx_lock; /* For add/remove from ctx_list. */
+   struct list_head ctx_list; /* List of contexts belonging to client. */
+
struct i915_drm_clients *clients;
 
/**
-- 
2.30.2



[PATCH 0/7] Per client GPU stats

2021-09-22 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Same old work, but now rebased, with the series ending in some DRM docs
proposing the common specification which should enable nice common
userspace tools to be written.

For the moment I only have intel_gpu_top converted to use this and that seems to
work okay.

v2:
 * Added prototype of possible amdgpu changes and spec updates to align with the
   common spec.

v3:
 * Documented that 'drm-driver' tag shall correspond with
   struct drm_driver.name.

v4:
 * Dropped amdgpu conversion from the series for now until AMD folks can find
   some time to finish that patch.

Tvrtko Ursulin (7):
  drm/i915: Explicitly track DRM clients
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track runtime spent in closed and unreachable GEM contexts
  drm/i915: Track all user contexts per client
  drm/i915: Track context current active time
  drm: Document fdinfo format specification
  drm/i915: Expose client engine utilisation via fdinfo

 Documentation/gpu/drm-usage-stats.rst | 103 +
 Documentation/gpu/i915.rst|  27 
 Documentation/gpu/index.rst   |   1 +
 drivers/gpu/drm/i915/Makefile |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  42 -
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   6 +
 drivers/gpu/drm/i915/gt/intel_context.c   |  27 +++-
 drivers/gpu/drm/i915/gt/intel_context.h   |  15 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  24 ++-
 .../drm/i915/gt/intel_execlists_submission.c  |  23 ++-
 .../gpu/drm/i915/gt/intel_gt_clock_utils.c|   4 +
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  27 ++--
 drivers/gpu/drm/i915/gt/intel_lrc.h   |  24 +++
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  10 +-
 drivers/gpu/drm/i915/i915_drm_client.c| 143 ++
 drivers/gpu/drm/i915/i915_drm_client.h|  66 
 drivers/gpu/drm/i915/i915_drv.c   |   9 ++
 drivers/gpu/drm/i915/i915_drv.h   |   5 +
 drivers/gpu/drm/i915/i915_gem.c   |  21 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.h |   2 +-
 21 files changed, 537 insertions(+), 56 deletions(-)
 create mode 100644 Documentation/gpu/drm-usage-stats.rst
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.30.2



Re: [PATCH v3 1/6] drm/vc4: select PM (openrisc)

2021-09-22 Thread Nathan Chancellor
On Wed, Sep 22, 2021 at 10:41:56AM +0200, Maxime Ripard wrote:
> Hi Randy,
> 
> On Sun, Sep 19, 2021 at 09:40:44AM -0700, Randy Dunlap wrote:
> > On 8/19/21 6:59 AM, Maxime Ripard wrote:
> > > We already depend on runtime PM to get the power domains and clocks for
> > > most of the devices supported by the vc4 driver, so let's just select it
> > > to make sure it's there, and remove the ifdef.
> > > 
> > > Signed-off-by: Maxime Ripard 
> > > ---
> > >   drivers/gpu/drm/vc4/Kconfig| 1 +
> > >   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 --
> > >   2 files changed, 1 insertion(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vc4/Kconfig b/drivers/gpu/drm/vc4/Kconfig
> > > index 118e8a426b1a..f774ab340863 100644
> > > --- a/drivers/gpu/drm/vc4/Kconfig
> > > +++ b/drivers/gpu/drm/vc4/Kconfig
> > > @@ -9,6 +9,7 @@ config DRM_VC4
> > >   select DRM_KMS_CMA_HELPER
> > >   select DRM_GEM_CMA_HELPER
> > >   select DRM_PANEL_BRIDGE
> > > + select PM
> > >   select SND_PCM
> > >   select SND_PCM_ELD
> > >   select SND_SOC_GENERIC_DMAENGINE_PCM
> > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c 
> > > b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > index c2876731ee2d..602203b2d8e1 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > @@ -2107,7 +2107,6 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi 
> > > *vc4_hdmi)
> > >   return 0;
> > >   }
> > > -#ifdef CONFIG_PM
> > >   static int vc4_hdmi_runtime_suspend(struct device *dev)
> > >   {
> > >   struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
> > > @@ -2128,7 +2127,6 @@ static int vc4_hdmi_runtime_resume(struct device 
> > > *dev)
> > >   return 0;
> > >   }
> > > -#endif
> > >   static int vc4_hdmi_bind(struct device *dev, struct device *master, 
> > > void *data)
> > >   {
> > > 
> > 
> > Hi,
> > 
> > FYI.
> > 
> > This still causes a build error on arch/openrisc/ since it does not support
> > CONFIG_PM (it does not source "kernel/power/Kconfig" like some other arches 
> > do):
> > 
> > ./arch/riscv/Kconfig:source "kernel/power/Kconfig"
> > ./arch/x86/Kconfig:source "kernel/power/Kconfig"
> > ./arch/nds32/Kconfig:source "kernel/power/Kconfig"
> > ./arch/sh/Kconfig:source "kernel/power/Kconfig"
> > ./arch/arc/Kconfig:source "kernel/power/Kconfig"
> > ./arch/arm64/Kconfig:source "kernel/power/Kconfig"
> > ./arch/xtensa/Kconfig:source "kernel/power/Kconfig"
> > ./arch/sparc/Kconfig:source "kernel/power/Kconfig"
> > ./arch/arm/Kconfig:source "kernel/power/Kconfig"
> > ./arch/mips/Kconfig:source "kernel/power/Kconfig"
> > ./arch/powerpc/Kconfig:source "kernel/power/Kconfig"
> > ./arch/um/Kconfig:source "kernel/power/Kconfig"
> > ./arch/ia64/Kconfig:source "kernel/power/Kconfig"
> > 
> > so with
> > CONFIG_DRM_VC4=y
> > # CONFIG_DRM_VC4_HDMI_CEC is not set
> > 
> > I still see
> > ../drivers/gpu/drm/vc4/vc4_hdmi.c:2139:12: warning: 
> > 'vc4_hdmi_runtime_suspend' defined but not used [-Wunused-function]
> >  2139 | static int vc4_hdmi_runtime_suspend(struct device *dev)
> >   |^~~~
> 
> With what version did you get that build error? -rc2 shouldn't have it
> anymore since the runtime_pm hooks introduction got reverted.

-next still contains these patches as Stephen effectively reverted the
changes in Linus' tree when merging in the drm-misc-fixes tree:

https://lore.kernel.org/r/20210920090729.19458...@canb.auug.org.au/

Cheers,
Nathan


Re: [Intel-gfx] [PATCH] drm/i915: Drop stealing of bits from i915_sw_fence function pointer

2021-09-22 Thread Matthew Brost
On Wed, Sep 22, 2021 at 04:25:04PM +0100, Tvrtko Ursulin wrote:
> 
> On 22/09/2021 16:21, Tvrtko Ursulin wrote:
> > 
> > On 22/09/2021 15:57, Matthew Brost wrote:
> > > Rather than stealing bits from the i915_sw_fence function pointer,
> > > use separate fields for the function pointer and flags. With two
> > > separate fields, the 4 byte alignment for the i915_sw_fence function
> > > pointer can also be dropped.
> > > 
> > > v2:
> > >   (CI)
> > >    - Set new function field rather than flags in __i915_sw_fence_init
> > > 
> > > Signed-off-by: Matthew Brost 
> > > ---
> > >   drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
> > >   drivers/gpu/drm/i915/i915_request.c   |  4 ++--
> > >   drivers/gpu/drm/i915/i915_sw_fence.c  | 12 +--
> > >   drivers/gpu/drm/i915/i915_sw_fence.h  | 21 +--
> > >   drivers/gpu/drm/i915/i915_sw_fence_work.c |  2 +-
> > >   .../gpu/drm/i915/selftests/i915_sw_fence.c    |  2 +-
> > >   drivers/gpu/drm/i915/selftests/lib_sw_fence.c |  4 ++--
> > >   8 files changed, 23 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > index a7ca38613f89..6d5bb55ffc82 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > @@ -10323,7 +10323,7 @@ static void intel_atomic_commit_work(struct
> > > work_struct *work)
> > >   intel_atomic_commit_tail(state);
> > >   }
> > > -static int __i915_sw_fence_call
> > > +static int
> > >   intel_atomic_commit_ready(struct i915_sw_fence *fence,
> > >     enum i915_sw_fence_notify notify)
> > >   {
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index c2ab0e22db0a..df5fec5c3da8 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -800,7 +800,7 @@ static void free_engines_rcu(struct rcu_head *rcu)
> > >   free_engines(engines);
> > >   }
> > > -static int __i915_sw_fence_call
> > > +static int
> > >   engines_notify(struct i915_sw_fence *fence, enum
> > > i915_sw_fence_notify state)
> > >   {
> > >   struct i915_gem_engines *engines =
> > > diff --git a/drivers/gpu/drm/i915/i915_request.c
> > > b/drivers/gpu/drm/i915/i915_request.c
> > > index ce446716d092..945d3025a0b6 100644
> > > --- a/drivers/gpu/drm/i915/i915_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_request.c
> > > @@ -719,7 +719,7 @@ void i915_request_cancel(struct i915_request
> > > *rq, int error)
> > >   intel_context_cancel_request(rq->context, rq);
> > >   }
> > > -static int __i915_sw_fence_call
> > > +static int
> > >   submit_notify(struct i915_sw_fence *fence, enum
> > > i915_sw_fence_notify state)
> > >   {
> > >   struct i915_request *request =
> > > @@ -755,7 +755,7 @@ submit_notify(struct i915_sw_fence *fence, enum
> > > i915_sw_fence_notify state)
> > >   return NOTIFY_DONE;
> > >   }
> > > -static int __i915_sw_fence_call
> > > +static int
> > >   semaphore_notify(struct i915_sw_fence *fence, enum
> > > i915_sw_fence_notify state)
> > >   {
> > >   struct i915_request *rq = container_of(fence, typeof(*rq),
> > > semaphore);
> > > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c
> > > b/drivers/gpu/drm/i915/i915_sw_fence.c
> > > index c589a681da77..1c080dd1f718 100644
> > > --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> > > +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> > > @@ -34,7 +34,7 @@ enum {
> > >   static void *i915_sw_fence_debug_hint(void *addr)
> > >   {
> > > -    return (void *)(((struct i915_sw_fence *)addr)->flags &
> > > I915_SW_FENCE_MASK);
> > > +    return (void *)(((struct i915_sw_fence *)addr)->fn);
> > >   }
> > >   #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
> > > @@ -126,10 +126,7 @@ static inline void debug_fence_assert(struct
> > > i915_sw_fence *fence)
> > >   static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
> > >     enum i915_sw_fence_notify state)
> > >   {
> > > -    i915_sw_fence_notify_t fn;
> > > -
> > > -    fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
> > > -    return fn(fence, state);
> > > +    return fence->fn(fence, state);
> > >   }
> > >   #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
> > > @@ -242,10 +239,11 @@ void __i915_sw_fence_init(struct i915_sw_fence
> > > *fence,
> > >     const char *name,
> > >     struct lock_class_key *key)
> > >   {
> > > -    BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
> > > +    BUG_ON(!fn);
> > >   __init_waitqueue_head(&fence->wait, name, key);
> > > -    fence->flags = (unsigned long)fn;
> > > +    fence->fn = fn;
> > > +    fence->flags = 0;
> > >   i915_sw_fence_reinit(fence);
> > >   }
> > > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h
> 

Re: Multiple DRI card detection in compositor systemd units

2021-09-22 Thread Simon Ser
Maybe try creating multiple physical seats with logind, and start each
compositor on its own seat? A physical seat is a collection of devices like
DRM nodes and evdev device files.

Also udev creates files in /dev/dri/by-path/, these should be stable across
reboots. `udevadm settle` before a compositor start-up can wait for udev to
finish its job.

Out of curiosity, can you explain your use-case? Why do you need to start
multiple compositors, each on its own GPU?


Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC

2021-09-22 Thread Harry Wentland



On 2021-09-22 04:31, Pekka Paalanen wrote:
> On Tue, 21 Sep 2021 14:05:05 -0400
> Harry Wentland  wrote:
> 
>> On 2021-09-21 09:31, Pekka Paalanen wrote:
>>> On Mon, 20 Sep 2021 20:14:50 -0400
>>> Harry Wentland  wrote:
>>>   

...

> 
>> Did anybody start any CM doc patches in Weston or Wayland yet?
> 
> There is the
> https://gitlab.freedesktop.org/swick/wayland-protocols/-/blob/color/unstable/color-management/color.rst
> we started a long time ago, and have not really touched it for a while.
> Since we last touched it, at least my understanding has developed
> somewhat.
> 
> It is linked from the overview in
> https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/14
> and if you want to propose changes, the way to do it is file a MR in
> https://gitlab.freedesktop.org/swick/wayland-protocols/-/merge_requests
> against the 'color' branch. Patches very much welcome, that doc does
> not need to limit itself to Wayland. :-)
> 

Right, I've read all that a while back.

It might be a good place to consolidate most of the Linux CM/HDR discussion,
since gitlab is good with allowing discussions, we can track changes, and
it's more formatting and diagram friendly than text-only email.

> We also have issues tracked at
> https://gitlab.freedesktop.org/swick/wayland-protocols/-/issues?scope=all&utf8=%E2%9C%93&state=opened
> 
>>> Pre-curve for instance could be a combination of decoding to linear
>>> light and a shaper for the 3D LUT coming next. That's why we don't call
>>> them gamma or EOTF, that would be too limiting.
>>>
>>> (Using a shaper may help to keep the 3D LUT size reasonable - I suppose
>>> very much like those multi-segmented LUTs.)
>>>   
>>
>> AFAIU 3D LUTs will need a shaper as they don't have enough precision.
>> But that's going deeper into color theory than I understand. Vitaly would
>> know all the details around 3D LUT usage better.
> 
> There is a very practical problem: the sheer number of elements in a 3D
> LUT grows to the power of three. So you can't have very many taps per
> channel without storage requirements blowing up. Each element needs to
> be a 3-channel value, too. And then 8 bits is not enough.
> 

And those storage requirements would have a direct impact on silicon real
estate and therefore the price and power usage of the HW.

Harry

> I'm really happy that Vitaly is working with us on Weston and Wayland. :-)
> He's a huge help, and I feel like I'm currently the one slowing things
> down by being backlogged in reviews.
> 
> 
> Thanks,
> pq
> 



Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Linus Torvalds
On Wed, Sep 22, 2021 at 3:11 AM Sudip Mukherjee
 wrote:
>
> That test script is triggering the openqa job, but it's running only
> after lava is able to login. The trace is appearing before the login
> prompt even, so test_mainline.sh should not matter here.

Side note: the traces might be more legible if you have debug info in
the kernel, and run the dmesg through the script in

  scripts/decode_stacktrace.sh

which should give line numbers and inlining information.

That often makes it much easier to see which access it is that hits a
NULL pointer dereference.

On x86-64, I generally just decode the instruction stream and look at
the instruction patterns to try to figure out where an oops is coming
from, but that's much less useful on arm64 (partly because I'm not as
used to it, but also because the arm64 oopses don't print out much of
the instructions, so there's often little to go by).

 Linus


Re: [Intel-gfx] [PATCH] drm/i915: Drop stealing of bits from i915_sw_fence function pointer

2021-09-22 Thread Tvrtko Ursulin



On 22/09/2021 16:21, Tvrtko Ursulin wrote:


On 22/09/2021 15:57, Matthew Brost wrote:

Rather than stealing bits from the i915_sw_fence function pointer, use
separate fields for the function pointer and flags. With two separate
fields, the 4 byte alignment for the i915_sw_fence function pointer can
also be dropped.

v2:
  (CI)
   - Set new function field rather than flags in __i915_sw_fence_init

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
  drivers/gpu/drm/i915/i915_request.c   |  4 ++--
  drivers/gpu/drm/i915/i915_sw_fence.c  | 12 +--
  drivers/gpu/drm/i915/i915_sw_fence.h  | 21 +--
  drivers/gpu/drm/i915/i915_sw_fence_work.c |  2 +-
  .../gpu/drm/i915/selftests/i915_sw_fence.c    |  2 +-
  drivers/gpu/drm/i915/selftests/lib_sw_fence.c |  4 ++--
  8 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c

index a7ca38613f89..6d5bb55ffc82 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -10323,7 +10323,7 @@ static void intel_atomic_commit_work(struct 
work_struct *work)

  intel_atomic_commit_tail(state);
  }
-static int __i915_sw_fence_call
+static int
  intel_atomic_commit_ready(struct i915_sw_fence *fence,
    enum i915_sw_fence_notify notify)
  {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c

index c2ab0e22db0a..df5fec5c3da8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -800,7 +800,7 @@ static void free_engines_rcu(struct rcu_head *rcu)
  free_engines(engines);
  }
-static int __i915_sw_fence_call
+static int
  engines_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)

  {
  struct i915_gem_engines *engines =
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c

index ce446716d092..945d3025a0b6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -719,7 +719,7 @@ void i915_request_cancel(struct i915_request *rq, 
int error)

  intel_context_cancel_request(rq->context, rq);
  }
-static int __i915_sw_fence_call
+static int
  submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify 
state)

  {
  struct i915_request *request =
@@ -755,7 +755,7 @@ submit_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)

  return NOTIFY_DONE;
  }
-static int __i915_sw_fence_call
+static int
  semaphore_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)

  {
  struct i915_request *rq = container_of(fence, typeof(*rq), 
semaphore);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c

index c589a681da77..1c080dd1f718 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -34,7 +34,7 @@ enum {
  static void *i915_sw_fence_debug_hint(void *addr)
  {
-    return (void *)(((struct i915_sw_fence *)addr)->flags & 
I915_SW_FENCE_MASK);

+    return (void *)(((struct i915_sw_fence *)addr)->fn);
  }
  #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -126,10 +126,7 @@ static inline void debug_fence_assert(struct 
i915_sw_fence *fence)

  static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
    enum i915_sw_fence_notify state)
  {
-    i915_sw_fence_notify_t fn;
-
-    fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
-    return fn(fence, state);
+    return fence->fn(fence, state);
  }
  #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -242,10 +239,11 @@ void __i915_sw_fence_init(struct i915_sw_fence 
*fence,

    const char *name,
    struct lock_class_key *key)
  {
-    BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
+    BUG_ON(!fn);
  __init_waitqueue_head(&fence->wait, name, key);
-    fence->flags = (unsigned long)fn;
+    fence->fn = fn;
+    fence->flags = 0;
  i915_sw_fence_reinit(fence);
  }
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h

index 30a863353ee6..70ba1789aa89 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -17,26 +17,25 @@
  struct completion;
  struct dma_resv;
+struct i915_sw_fence;
+
+enum i915_sw_fence_notify {
+    FENCE_COMPLETE,
+    FENCE_FREE
+};
+
+typedef int (*i915_sw_fence_notify_t)(struct i915_sw_fence *,
+  enum i915_sw_fence_notify state);
  struct i915_sw_fence {
  wait_queue_head_t wait;
+    i915_sw_fence_notify_t fn;
  unsigned long flags;


Looks good to me. I'd just make the flags narrower now that they can be, 
and put them down..



  atomic_t pending;


.. here as unsigned int and so we save 4 bytes, 

Re: [Intel-gfx] [PATCH] drm/i915: Drop stealing of bits from i915_sw_fence function pointer

2021-09-22 Thread Tvrtko Ursulin



On 22/09/2021 15:57, Matthew Brost wrote:

Rather than stealing bits from the i915_sw_fence function pointer, use
separate fields for the function pointer and flags. With two separate
fields, the 4 byte alignment for the i915_sw_fence function pointer can
also be dropped.

v2:
  (CI)
   - Set new function field rather than flags in __i915_sw_fence_init

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
  drivers/gpu/drm/i915/i915_request.c   |  4 ++--
  drivers/gpu/drm/i915/i915_sw_fence.c  | 12 +--
  drivers/gpu/drm/i915/i915_sw_fence.h  | 21 +--
  drivers/gpu/drm/i915/i915_sw_fence_work.c |  2 +-
  .../gpu/drm/i915/selftests/i915_sw_fence.c|  2 +-
  drivers/gpu/drm/i915/selftests/lib_sw_fence.c |  4 ++--
  8 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index a7ca38613f89..6d5bb55ffc82 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -10323,7 +10323,7 @@ static void intel_atomic_commit_work(struct work_struct 
*work)
intel_atomic_commit_tail(state);
  }
  
-static int __i915_sw_fence_call

+static int
  intel_atomic_commit_ready(struct i915_sw_fence *fence,
  enum i915_sw_fence_notify notify)
  {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..df5fec5c3da8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -800,7 +800,7 @@ static void free_engines_rcu(struct rcu_head *rcu)
free_engines(engines);
  }
  
-static int __i915_sw_fence_call

+static int
  engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
  {
struct i915_gem_engines *engines =
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index ce446716d092..945d3025a0b6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -719,7 +719,7 @@ void i915_request_cancel(struct i915_request *rq, int error)
intel_context_cancel_request(rq->context, rq);
  }
  
-static int __i915_sw_fence_call

+static int
  submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
  {
struct i915_request *request =
@@ -755,7 +755,7 @@ submit_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
return NOTIFY_DONE;
  }
  
-static int __i915_sw_fence_call

+static int
  semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
  {
struct i915_request *rq = container_of(fence, typeof(*rq), semaphore);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..1c080dd1f718 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -34,7 +34,7 @@ enum {
  
  static void *i915_sw_fence_debug_hint(void *addr)

  {
-   return (void *)(((struct i915_sw_fence *)addr)->flags & 
I915_SW_FENCE_MASK);
+   return (void *)(((struct i915_sw_fence *)addr)->fn);
  }
  
  #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS

@@ -126,10 +126,7 @@ static inline void debug_fence_assert(struct i915_sw_fence 
*fence)
  static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
  enum i915_sw_fence_notify state)
  {
-   i915_sw_fence_notify_t fn;
-
-   fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
-   return fn(fence, state);
+   return fence->fn(fence, state);
  }
  
  #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS

@@ -242,10 +239,11 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
  const char *name,
  struct lock_class_key *key)
  {
-   BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
+   BUG_ON(!fn);
  
  	__init_waitqueue_head(&fence->wait, name, key);

-   fence->flags = (unsigned long)fn;
+   fence->fn = fn;
+   fence->flags = 0;
  
  	i915_sw_fence_reinit(fence);

  }
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 30a863353ee6..70ba1789aa89 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -17,26 +17,25 @@
  
  struct completion;

  struct dma_resv;
+struct i915_sw_fence;
+
+enum i915_sw_fence_notify {
+   FENCE_COMPLETE,
+   FENCE_FREE
+};
+
+typedef int (*i915_sw_fence_notify_t)(struct i915_sw_fence *,
+ enum i915_sw_fence_notify state);
  
  struct i915_sw_fence {

wait_queue_head_t wait;
+   i915_sw_fence_notify_t fn;
unsigned long flags;


Looks good to me. I'd just make the flags narrower now that they can be, 
and 

[PATCH] backlight: hx8357: Add SPI device ID table

2021-09-22 Thread Mark Brown
Currently autoloading for SPI devices does not use the DT ID table; it uses
SPI modaliases. Supporting OF modaliases is going to be difficult, if not
impractical: an attempt was made but has been reverted. So ensure that
module autoloading works for this driver by adding an SPI device ID table.

Fixes: 96c8395e2166 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown 
---
 drivers/video/backlight/hx8357.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/video/backlight/hx8357.c b/drivers/video/backlight/hx8357.c
index 9b50bc96e00f..c64b1fbe027f 100644
--- a/drivers/video/backlight/hx8357.c
+++ b/drivers/video/backlight/hx8357.c
@@ -565,6 +565,19 @@ static struct lcd_ops hx8357_ops = {
.get_power  = hx8357_get_power,
 };
 
+static const struct spi_device_id hx8357_spi_ids[] = {
+   {
+   .name = "hx8357",
+   .driver_data = (kernel_ulong_t)hx8357_lcd_init,
+   },
+   {
+   .name = "hx8369",
+   .driver_data = (kernel_ulong_t)hx8369_lcd_init,
+   },
+   {},
+};
+MODULE_DEVICE_TABLE(spi, hx8357_spi_ids);
+
 static const struct of_device_id hx8357_dt_ids[] = {
{
.compatible = "himax,hx8357",
@@ -672,6 +685,7 @@ static struct spi_driver hx8357_driver = {
.name = "hx8357",
.of_match_table = hx8357_dt_ids,
},
+   .id_table = hx8357_spi_ids,
 };
 
 module_spi_driver(hx8357_driver);
-- 
2.20.1



Re: [PATCH 01/26] dma-buf: add dma_resv_for_each_fence_unlocked v4

2021-09-22 Thread Tvrtko Ursulin



On 22/09/2021 15:50, Christian König wrote:

Am 22.09.21 um 16:36 schrieb Tvrtko Ursulin:

+
+/**
+ * dma_resv_iter_first_unlocked - first fence in an unlocked 
dma_resv obj.

+ * @cursor: the cursor with the current position
+ *
+ * Returns the first fence from an unlocked dma_resv obj.
+ */
+struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter 
*cursor)

+{
+    rcu_read_lock();
+    do {
+    dma_resv_iter_restart_unlocked(cursor);
+    dma_resv_iter_walk_unlocked(cursor);
+    } while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+    rcu_read_unlock();
+
+    return cursor->fence;
+}
+EXPORT_SYMBOL(dma_resv_iter_first_unlocked);


Why is this one split from dma_resv_iter_begin and even exported?


I've split it to be able to use dma_resv_iter_begin in both the unlocked 
and locked iterator.


Ok.




I couldn't find any users in the series.


This is used in the dma_resv_for_each_fence_unlocked() macro to return 
the first fence.


Doh!


+
+/**
+ * dma_resv_iter_next_unlocked - next fence in an unlocked dma_resv 
obj.

+ * @cursor: the cursor with the current position
+ *
+ * Returns the next fence from an unlocked dma_resv obj.
+ */
+struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter 
*cursor)

+{
+    bool restart;
+
+    rcu_read_lock();
+    cursor->is_restarted = false;
+    restart = read_seqcount_retry(&cursor->obj->seq, cursor->seq);
+    do {
+    if (restart)
+    dma_resv_iter_restart_unlocked(cursor);
+    dma_resv_iter_walk_unlocked(cursor);
+    restart = true;
+    } while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+    rcu_read_unlock();
+
+    return cursor->fence;
+}
+EXPORT_SYMBOL(dma_resv_iter_next_unlocked);


Couldn't dma_resv_iter_first_unlocked and dma_resv_iter_next_unlocked 
share the same implementation? Especially if you are able to replace 
cursor->is_restarted with cursor->index == -1.


That's what I had initially, but Daniel disliked it for some reason. You 
then need a centralized walk function instead of first/next.


I had some ideas to only consolidate "first" and "next" helpers but never mind, 
yours is fine as well.

Regards,

Tvrtko



Thanks,
Christian.


Regards,

Tvrtko




Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC

2021-09-22 Thread Harry Wentland



On 2021-09-20 20:14, Harry Wentland wrote:
> On 2021-09-15 10:01, Pekka Paalanen wrote:
>> On Fri, 30 Jul 2021 16:41:29 -0400
>> Harry Wentland  wrote:
>>



>>> +If a display's maximum HDR white level is correctly reported it is trivial
>>> +to convert between all of the above representations of SDR white level. If
>>> +it is not, defining SDR luminance as a nits value, or a ratio vs a fixed
>>> +nits value is preferred, assuming we are blending in linear space.
>>> +
>>> +It is our experience that many HDR displays do not report maximum white
>>> +level correctly
>>
>> Which value do you refer to as "maximum white", and how did you measure
>> it?
>>
> Good question. I haven't played with those displays myself but I'll try to
> find out a bit more background behind this statement.
> 


Some TVs report the EOTF but not the luminance values.
For example, an edid-decode capture of my eDP HDR panel:

  HDR Static Metadata Data Block:
Electro optical transfer functions:
  Traditional gamma - SDR luminance range
  SMPTE ST2084
Supported static metadata descriptors:
  Static metadata type 1
Desired content max luminance: 115 (603.666 cd/m^2)
Desired content max frame-average luminance: 109 (530.095 cd/m^2)
Desired content min luminance: 7 (0.005 cd/m^2)

I suspect on those TVs it looks like this:

  HDR Static Metadata Data Block:
Electro optical transfer functions:
  Traditional gamma - SDR luminance range
  SMPTE ST2084
Supported static metadata descriptors:
  Static metadata type 1

Windows has some defaults in this case and our Windows driver also has
some defaults.

Using defaults in the 1000-2000 nits range would yield much better
tone-mapping results than assuming the monitor can support a full
10k nits.

As an aside, recently we've come across displays where the max
average luminance is higher than the max peak luminance. This is
not a mistake but due to how the display's dimming zones work.

Not sure what impact this might have on tone-mapping, other than
to keep in mind that we can't assume that max_avg < max_peak.

Harry



[PATCH] drm/i915: Drop stealing of bits from i915_sw_fence function pointer

2021-09-22 Thread Matthew Brost
Rather than stealing bits from the i915_sw_fence function pointer, use
separate fields for the function pointer and flags. With two different
fields, the 4 byte alignment for the i915_sw_fence function pointer can
also be dropped.

v2:
 (CI)
  - Set new function field rather than flags in __i915_sw_fence_init

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
 drivers/gpu/drm/i915/i915_request.c   |  4 ++--
 drivers/gpu/drm/i915/i915_sw_fence.c  | 12 +--
 drivers/gpu/drm/i915/i915_sw_fence.h  | 21 +--
 drivers/gpu/drm/i915/i915_sw_fence_work.c |  2 +-
 .../gpu/drm/i915/selftests/i915_sw_fence.c|  2 +-
 drivers/gpu/drm/i915/selftests/lib_sw_fence.c |  4 ++--
 8 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index a7ca38613f89..6d5bb55ffc82 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -10323,7 +10323,7 @@ static void intel_atomic_commit_work(struct work_struct 
*work)
intel_atomic_commit_tail(state);
 }
 
-static int __i915_sw_fence_call
+static int
 intel_atomic_commit_ready(struct i915_sw_fence *fence,
  enum i915_sw_fence_notify notify)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..df5fec5c3da8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -800,7 +800,7 @@ static void free_engines_rcu(struct rcu_head *rcu)
free_engines(engines);
 }
 
-static int __i915_sw_fence_call
+static int
 engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_gem_engines *engines =
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index ce446716d092..945d3025a0b6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -719,7 +719,7 @@ void i915_request_cancel(struct i915_request *rq, int error)
intel_context_cancel_request(rq->context, rq);
 }
 
-static int __i915_sw_fence_call
+static int
 submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_request *request =
@@ -755,7 +755,7 @@ submit_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
return NOTIFY_DONE;
 }
 
-static int __i915_sw_fence_call
+static int
 semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_request *rq = container_of(fence, typeof(*rq), semaphore);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..1c080dd1f718 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -34,7 +34,7 @@ enum {
 
 static void *i915_sw_fence_debug_hint(void *addr)
 {
-   return (void *)(((struct i915_sw_fence *)addr)->flags & 
I915_SW_FENCE_MASK);
+   return (void *)(((struct i915_sw_fence *)addr)->fn);
 }
 
 #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -126,10 +126,7 @@ static inline void debug_fence_assert(struct i915_sw_fence 
*fence)
 static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
  enum i915_sw_fence_notify state)
 {
-   i915_sw_fence_notify_t fn;
-
-   fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
-   return fn(fence, state);
+   return fence->fn(fence, state);
 }
 
 #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -242,10 +239,11 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
  const char *name,
  struct lock_class_key *key)
 {
-   BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
+   BUG_ON(!fn);
 
	__init_waitqueue_head(&fence->wait, name, key);
-   fence->flags = (unsigned long)fn;
+   fence->fn = fn;
+   fence->flags = 0;
 
i915_sw_fence_reinit(fence);
 }
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 30a863353ee6..70ba1789aa89 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -17,26 +17,25 @@
 
 struct completion;
 struct dma_resv;
+struct i915_sw_fence;
+
+enum i915_sw_fence_notify {
+   FENCE_COMPLETE,
+   FENCE_FREE
+};
+
+typedef int (*i915_sw_fence_notify_t)(struct i915_sw_fence *,
+ enum i915_sw_fence_notify state);
 
 struct i915_sw_fence {
wait_queue_head_t wait;
+   i915_sw_fence_notify_t fn;
unsigned long flags;
atomic_t pending;
int error;
 };
 
 #define I915_SW_FENCE_CHECKED_BIT  0 /* used internally for DAG checking */
-#define I915_SW_FENCE_PRIVATE_BIT  1 

Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

2021-09-22 Thread Rob Clark
On Wed, Sep 22, 2021 at 7:31 AM Andrey Grodzovsky
 wrote:
>
>
> On 2021-09-21 11:32 p.m., Rob Clark wrote:
> > On Tue, Sep 21, 2021 at 7:18 PM Andrey Grodzovsky
> >  wrote:
> >>
> >> On 2021-09-21 4:47 p.m., Rob Clark wrote:
> >>> On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
> >>>  wrote:
>  On 2021-09-03 2:47 p.m., Rob Clark wrote:
> 
> > From: Rob Clark 
> >
> > As the finished fence is the one that is exposed to userspace, and
> > therefore the one that other operations, like atomic update, would
> > block on, we need to propagate the deadline from from the finished
> > fence to the actual hw fence.
> >
> > v2: Split into drm_sched_fence_set_parent() (ckoenig)
> >
> > Signed-off-by: Rob Clark 
> > ---
> > drivers/gpu/drm/scheduler/sched_fence.c | 34 
> > +
> > drivers/gpu/drm/scheduler/sched_main.c  |  2 +-
> > include/drm/gpu_scheduler.h |  8 ++
> > 3 files changed, 43 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
> > b/drivers/gpu/drm/scheduler/sched_fence.c
> > index bcea035cf4c6..4fc41a71d1c7 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -128,6 +128,30 @@ static void 
> > drm_sched_fence_release_finished(struct dma_fence *f)
> > dma_fence_put(&fence->scheduled);
> > }
> >
> > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> > +   ktime_t deadline)
> > +{
> > + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&fence->lock, flags);
> > +
> > + /* If we already have an earlier deadline, keep it: */
> > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > + ktime_before(fence->deadline, deadline)) {
> > + spin_unlock_irqrestore(&fence->lock, flags);
> > + return;
> > + }
> > +
> > + fence->deadline = deadline;
> > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > +
> > + spin_unlock_irqrestore(&fence->lock, flags);
> > +
> > + if (fence->parent)
> > + dma_fence_set_deadline(fence->parent, deadline);
> > +}
> > +
> > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > @@ -138,6 +162,7 @@ static const struct dma_fence_ops 
> > drm_sched_fence_ops_finished = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > .release = drm_sched_fence_release_finished,
> > + .set_deadline = drm_sched_fence_set_deadline_finished,
> > };
> >
> > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct 
> > dma_fence *f)
> > }
> > EXPORT_SYMBOL(to_drm_sched_fence);
> >
> > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > + struct dma_fence *fence)
> > +{
> > + s_fence->parent = dma_fence_get(fence);
> > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> > +  &s_fence->finished.flags))
> > + dma_fence_set_deadline(fence, s_fence->deadline);
>  I believe above you should pass s_fence->finished to
>  dma_fence_set_deadline
>  instead of fence, which is the HW fence itself.
> >>> Hmm, unless this has changed recently with some patches I don't have,
> >>> s_fence->parent is the one signalled by hw, so it is the one we want
> >>> to set the deadline on
> >>>
> >>> BR,
> >>> -R
> >>
> >> No it didn't change. But then when exactly will
> >> drm_sched_fence_set_deadline_finished
> >> execute such that fence->parent != NULL ? In other words, I am not clear
> >> how propagation
> >> happens otherwise - if dma_fence_set_deadline is called with the HW
> >> fence then the assumption
> >> here is that driver provided driver specific
> >> dma_fence_ops.dma_fence_set_deadline callback executes
> >> but I was under impression that drm_sched_fence_set_deadline_finished is
> >> the one that propagates
> >> the deadline to the HW fence's callback and for it to execute
> >> dma_fence_set_deadline needs to be called
> >> with s_fence->finished.
> > Assuming I didn't screw up drm/msm conversion to scheduler,
> > s_fence->finished is the one that will be returned to userspace.. and
> > later passed back to kernel for atomic commit (or to the compositor).
> > So it is the one that fence->set_deadline() will be called on.  But

Re: [PATCH 01/26] dma-buf: add dma_resv_for_each_fence_unlocked v4

2021-09-22 Thread Christian König

Am 22.09.21 um 16:36 schrieb Tvrtko Ursulin:

+
+/**
+ * dma_resv_iter_first_unlocked - first fence in an unlocked 
dma_resv obj.

+ * @cursor: the cursor with the current position
+ *
+ * Returns the first fence from an unlocked dma_resv obj.
+ */
+struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter 
*cursor)

+{
+    rcu_read_lock();
+    do {
+    dma_resv_iter_restart_unlocked(cursor);
+    dma_resv_iter_walk_unlocked(cursor);
+    } while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+    rcu_read_unlock();
+
+    return cursor->fence;
+}
+EXPORT_SYMBOL(dma_resv_iter_first_unlocked);


Why is this one split from dma_resv_iter_begin and even exported?


I've split it to be able to use dma_resv_iter_begin in both the unlocked 
and locked iterator.



I couldn't find any users in the series.


This is used in the dma_resv_for_each_fence_unlocked() macro to return 
the first fence.





+
+/**
+ * dma_resv_iter_next_unlocked - next fence in an unlocked dma_resv 
obj.

+ * @cursor: the cursor with the current position
+ *
+ * Returns the next fence from an unlocked dma_resv obj.
+ */
+struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter 
*cursor)

+{
+    bool restart;
+
+    rcu_read_lock();
+    cursor->is_restarted = false;
+    restart = read_seqcount_retry(&cursor->obj->seq, cursor->seq);
+    do {
+    if (restart)
+    dma_resv_iter_restart_unlocked(cursor);
+    dma_resv_iter_walk_unlocked(cursor);
+    restart = true;
+    } while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+    rcu_read_unlock();
+
+    return cursor->fence;
+}
+EXPORT_SYMBOL(dma_resv_iter_next_unlocked);


Couldn't dma_resv_iter_first_unlocked and dma_resv_iter_next_unlocked 
share the same implementation? Especially if you are able to replace 
cursor->is_restarted with cursor->index == -1.


That's what I had initially, but Daniel disliked it for some reason. You 
then need a centralized walk function instead of first/next.


Thanks,
Christian.


Regards,

Tvrtko




Re: [PATCH v9 2/4] dt-bindings: mfd: logicvc: Add patternProperties for the display

2021-09-22 Thread Lee Jones
On Tue, 14 Sep 2021, Paul Kocialkowski wrote:

> The LogiCVC multi-function device has a display part which is now
> described in its binding. Add a patternProperties match for it.
> 
> Signed-off-by: Paul Kocialkowski 
> ---
>  Documentation/devicetree/bindings/mfd/xylon,logicvc.yaml | 3 +++
>  1 file changed, 3 insertions(+)

Applied, thanks.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH 13/26] drm/i915: use the new iterator in i915_gem_busy_ioctl

2021-09-22 Thread Tvrtko Ursulin



On 22/09/2021 15:31, Christian König wrote:

Am 22.09.21 um 12:21 schrieb Tvrtko Ursulin:


On 22/09/2021 10:10, Christian König wrote:

This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++--
  1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c

index 6234e17259c1..313afb4a11c7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void 
*data,

  {
  struct drm_i915_gem_busy *args = data;
  struct drm_i915_gem_object *obj;
-    struct dma_resv_list *list;
-    unsigned int seq;
+    struct dma_resv_iter cursor;
+    struct dma_fence *fence;
  int err;
    err = -ENOENT;
@@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, 
void *data,
   * to report the overall busyness. This is what the wait-ioctl 
does.

   *
   */
-retry:
-    seq = raw_read_seqcount(&obj->base.resv->seq);
-
-    /* Translate the exclusive fence to the READ *and* WRITE engine */
-    args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv));

-
-    /* Translate shared fences to READ set of engines */
-    list = dma_resv_shared_list(obj->base.resv);
-    if (list) {
-    unsigned int shared_count = list->shared_count, i;
-
-    for (i = 0; i < shared_count; ++i) {
-    struct dma_fence *fence =
-    rcu_dereference(list->shared[i]);
-
+    args->busy = false;


You can drop this line, especially since it is not a boolean. With that:


I just realized that this won't work. We still need to initialize the 
return value when there is no fence at all in the resv object.




Reviewed-by: Tvrtko Ursulin 


Does that still count if I set args->busy to zero?


Ah yes, my bad, apologies. You can keep the r-b.

Regards,

Tvrtko



Thanks,
Christian.



Regards,

Tvrtko


+    dma_resv_iter_begin(&cursor, obj->base.resv, true);
+    dma_resv_for_each_fence_unlocked(&cursor, fence) {
+    if (dma_resv_iter_is_restarted(&cursor))
+    args->busy = 0;
+
+    if (dma_resv_iter_is_exclusive(&cursor))
+    /* Translate the exclusive fence to the READ *and* WRITE engine */
+    args->busy |= busy_check_writer(fence);
+    else
+    /* Translate shared fences to READ set of engines */
  args->busy |= busy_check_reader(fence);
-    }
  }
-
-    if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
-    goto retry;
+    dma_resv_iter_end(&cursor);
    err = 0;
  out:





Re: [PATCH 01/26] dma-buf: add dma_resv_for_each_fence_unlocked v4

2021-09-22 Thread Tvrtko Ursulin



On 22/09/2021 10:10, Christian König wrote:

Abstract the complexity of iterating over all the fences
in a dma_resv object.

The new loop handles the whole RCU and retry dance and
returns only fences where we can be sure we grabbed the
right one.

v2: fix accessing the shared fences while they might be freed,
 improve kerneldoc, rename _cursor to _iter, add
 dma_resv_iter_is_exclusive, add dma_resv_iter_begin/end

v3: restructor the code, move rcu_read_lock()/unlock() into the
 iterator, add dma_resv_iter_is_restarted()

v4: fix NULL deref when no explicit fence exists, drop superflous
 rcu_read_lock()/unlock() calls.

Signed-off-by: Christian König 
---
  drivers/dma-buf/dma-resv.c | 95 ++
  include/linux/dma-resv.h   | 95 ++
  2 files changed, 190 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 84fbe60629e3..7768140ab36d 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -323,6 +323,101 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
  }
  EXPORT_SYMBOL(dma_resv_add_excl_fence);
  
+/**

+ * dma_resv_iter_restart_unlocked - restart the unlocked iterator
+ * @cursor: The dma_resv_iter object to restart
+ *
+ * Restart the unlocked iteration by initializing the cursor object.
+ */
+static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
+{
+   cursor->seq = read_seqcount_begin(&cursor->obj->seq);
+   cursor->index = -1;
+   if (cursor->all_fences)
+   cursor->fences = dma_resv_shared_list(cursor->obj);
+   else
+   cursor->fences = NULL;
+   cursor->is_restarted = true;
+}
+
+/**
+ * dma_resv_iter_walk_unlocked - walk over fences in a dma_resv obj
+ * @cursor: cursor to record the current position
+ *
+ * Return all the fences in the dma_resv object which are not yet signaled.
+ * The returned fence has an extra local reference so will stay alive.
+ * If a concurrent modify is detected the whole iterration is started over 
again.


iteration


+ */
+static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
+{
+   struct dma_resv *obj = cursor->obj;
+
+   do {
+   /* Drop the reference from the previous round */
+   dma_fence_put(cursor->fence);
+
+   if (cursor->index++ == -1) {
+   cursor->fence = dma_resv_excl_fence(obj);
+
+   } else if (!cursor->fences ||
+  cursor->index >= cursor->fences->shared_count) {
+   cursor->fence = NULL;
+
+   } else {
+   struct dma_resv_list *fences = cursor->fences;
+   unsigned int idx = cursor->index;
+
+   cursor->fence = rcu_dereference(fences->shared[idx]);
+   }
+   if (cursor->fence)
+   cursor->fence = dma_fence_get_rcu(cursor->fence);
+   } while (cursor->fence && dma_fence_is_signaled(cursor->fence));
+}
+
+/**
+ * dma_resv_iter_first_unlocked - first fence in an unlocked dma_resv obj.
+ * @cursor: the cursor with the current position
+ *
+ * Returns the first fence from an unlocked dma_resv obj.
+ */
+struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter *cursor)
+{
+   rcu_read_lock();
+   do {
+   dma_resv_iter_restart_unlocked(cursor);
+   dma_resv_iter_walk_unlocked(cursor);
+   } while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+   rcu_read_unlock();
+
+   return cursor->fence;
+}
+EXPORT_SYMBOL(dma_resv_iter_first_unlocked);


Why is this one split from dma_resv_iter_begin and even exported? I 
couldn't find any users in the series.



+
+/**
+ * dma_resv_iter_next_unlocked - next fence in an unlocked dma_resv obj.
+ * @cursor: the cursor with the current position
+ *
+ * Returns the next fence from an unlocked dma_resv obj.
+ */
+struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter *cursor)
+{
+   bool restart;
+
+   rcu_read_lock();
+   cursor->is_restarted = false;
+   restart = read_seqcount_retry(&cursor->obj->seq, cursor->seq);
+   do {
+   if (restart)
+   dma_resv_iter_restart_unlocked(cursor);
+   dma_resv_iter_walk_unlocked(cursor);
+   restart = true;
+   } while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+   rcu_read_unlock();
+
+   return cursor->fence;
+}
+EXPORT_SYMBOL(dma_resv_iter_next_unlocked);


Couldn't dma_resv_iter_first_unlocked and dma_resv_iter_next_unlocked 
share the same implementation? Especially if you are able to replace 
cursor->is_restarted with cursor->index == -1.



+
  /**
   * dma_resv_copy_fences - Copy all fences from src to dst.
   * @dst: the destination reservation object
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 9100dd3dc21f..baf77a542392 

Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-22 Thread Kirill A. Shutemov
On Wed, Sep 22, 2021 at 08:40:43AM -0500, Tom Lendacky wrote:
> On 9/21/21 4:58 PM, Kirill A. Shutemov wrote:
> > On Tue, Sep 21, 2021 at 04:43:59PM -0500, Tom Lendacky wrote:
> > > On 9/21/21 4:34 PM, Kirill A. Shutemov wrote:
> > > > On Tue, Sep 21, 2021 at 11:27:17PM +0200, Borislav Petkov wrote:
> > > > > On Wed, Sep 22, 2021 at 12:20:59AM +0300, Kirill A. Shutemov wrote:
> > > > > > I still believe calling cc_platform_has() from __startup_64() is 
> > > > > > totally
> > > > > > broken as it lacks proper wrapping while accessing global variables.
> > > > > 
> > > > > Well, one of the issues on the AMD side was using boot_cpu_data too
> > > > > early and the Intel side uses it too. Can you replace those checks 
> > > > > with
> > > > > is_tdx_guest() or whatever was the helper's name which would check
> > > > > whether the the kernel is running as a TDX guest, and see if that 
> > > > > helps?
> > > > 
> > > > There's no need in Intel check this early. Only AMD need it. Maybe just
> > > > opencode them?
> > > 
> > > Any way you can put a gzipped/bzipped copy of your vmlinux file somewhere 
> > > I
> > > can grab it from and take a look at it?
> > 
> > You can find broken vmlinux and bzImage here:
> > 
> > https://drive.google.com/drive/folders/1n74vUQHOGebnF70Im32qLFY8iS3wvjIs?usp=sharing
> > 
> > Let me know when I can remove it.
> 
> Looking at everything, it is all RIP relative addressing, so those
> accesses should be fine.

Not fine, but waiting to blow up with a random build environment change.

> Your image has the intel_cc_platform_has()
> function, does it work if you remove that call? Because I think it may be
> the early call into that function which looks like it has instrumentation
> that uses %gs in __sanitizer_cov_trace_pc and %gs is not setup properly
> yet. And since boot_cpu_data.x86_vendor will likely be zero this early it
> will match X86_VENDOR_INTEL and call into that function.

Right removing call to intel_cc_platform_has() or moving it to
cc_platform.c fixes the issue.

-- 
 Kirill A. Shutemov


Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

2021-09-22 Thread Andrey Grodzovsky



On 2021-09-21 11:32 p.m., Rob Clark wrote:

On Tue, Sep 21, 2021 at 7:18 PM Andrey Grodzovsky
 wrote:


On 2021-09-21 4:47 p.m., Rob Clark wrote:

On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
 wrote:

On 2021-09-03 2:47 p.m., Rob Clark wrote:


From: Rob Clark 

As the finished fence is the one that is exposed to userspace, and
therefore the one that other operations, like atomic update, would
block on, we need to propagate the deadline from from the finished
fence to the actual hw fence.

v2: Split into drm_sched_fence_set_parent() (ckoenig)

Signed-off-by: Rob Clark 
---
drivers/gpu/drm/scheduler/sched_fence.c | 34 +
drivers/gpu/drm/scheduler/sched_main.c  |  2 +-
include/drm/gpu_scheduler.h |  8 ++
3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
b/drivers/gpu/drm/scheduler/sched_fence.c
index bcea035cf4c6..4fc41a71d1c7 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct 
dma_fence *f)
dma_fence_put(&fence->scheduled);
}

+static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
+   ktime_t deadline)
+{
+ struct drm_sched_fence *fence = to_drm_sched_fence(f);
+ unsigned long flags;
+
+ spin_lock_irqsave(&fence->lock, flags);
+
+ /* If we already have an earlier deadline, keep it: */
+ if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
+ ktime_before(fence->deadline, deadline)) {
+ spin_unlock_irqrestore(&fence->lock, flags);
+ return;
+ }
+
+ fence->deadline = deadline;
+ set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
+
+ spin_unlock_irqrestore(&fence->lock, flags);
+
+ if (fence->parent)
+ dma_fence_set_deadline(fence->parent, deadline);
+}
+
static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
@@ -138,6 +162,7 @@ static const struct dma_fence_ops 
drm_sched_fence_ops_finished = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
.release = drm_sched_fence_release_finished,
+ .set_deadline = drm_sched_fence_set_deadline_finished,
};

struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
@@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct 
dma_fence *f)
}
EXPORT_SYMBOL(to_drm_sched_fence);

+void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
+ struct dma_fence *fence)
+{
+ s_fence->parent = dma_fence_get(fence);
+ if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
+  &s_fence->finished.flags))
+ dma_fence_set_deadline(fence, s_fence->deadline);

I believe above you should pass s_fence->finished to
dma_fence_set_deadline
instead of fence, which is the HW fence itself.

Hmm, unless this has changed recently with some patches I don't have,
s_fence->parent is the one signalled by hw, so it is the one we want
to set the deadline on

BR,
-R


No it didn't change. But then when exactly will
drm_sched_fence_set_deadline_finished
execute such that fence->parent != NULL ? In other words, I am not clear
how propagation
happens otherwise - if dma_fence_set_deadline is called with the HW
fence then the assumption
here is that driver provided driver specific
dma_fence_ops.dma_fence_set_deadline callback executes
but I was under impression that drm_sched_fence_set_deadline_finished is
the one that propagates
the deadline to the HW fence's callback and for it to execute
dma_fence_set_deadline needs to be called
with s_fence->finished.

Assuming I didn't screw up drm/msm conversion to scheduler,
s_fence->finished is the one that will be returned to userspace.. and
later passed back to kernel for atomic commit (or to the compositor).
So it is the one that fence->set_deadline() will be called on.  But
s_fence->parent is the actual hw fence that needs to know about the
deadline.  Depending on whether or not the job has been written into
hw ringbuffer or not, there are two cases:

1) not scheduled yet, s_fence will store the deadline and propagate it
later once s_fence->parent is known



And by later you mean the call to drm_sched_fence_set_parent
after HW fence is returned  ? If yes I think i get it now.

Andrey



2) already scheduled, in which case s_fence->finished.set_deadline
will propagate it directly to the real fence

BR,
-R


Andrey




Andrey



+}
+
struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity 
*entity,
  void *owner)
{
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 

Re: [PATCH 13/26] drm/i915: use the new iterator in i915_gem_busy_ioctl

2021-09-22 Thread Christian König

Am 22.09.21 um 12:21 schrieb Tvrtko Ursulin:


On 22/09/2021 10:10, Christian König wrote:

This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++--
  1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c

index 6234e17259c1..313afb4a11c7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void 
*data,

  {
  struct drm_i915_gem_busy *args = data;
  struct drm_i915_gem_object *obj;
-    struct dma_resv_list *list;
-    unsigned int seq;
+    struct dma_resv_iter cursor;
+    struct dma_fence *fence;
  int err;
    err = -ENOENT;
@@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, 
void *data,
   * to report the overall busyness. This is what the wait-ioctl 
does.

   *
   */
-retry:
-    seq = raw_read_seqcount(&obj->base.resv->seq);
-
-    /* Translate the exclusive fence to the READ *and* WRITE engine */
-    args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv));

-
-    /* Translate shared fences to READ set of engines */
-    list = dma_resv_shared_list(obj->base.resv);
-    if (list) {
-    unsigned int shared_count = list->shared_count, i;
-
-    for (i = 0; i < shared_count; ++i) {
-    struct dma_fence *fence =
-    rcu_dereference(list->shared[i]);
-
+    args->busy = false;


You can drop this line, especially since it is not a boolean. With that:


I just realized that this won't work. We still need to initialize the 
return value when there is no fence at all in the resv object.




Reviewed-by: Tvrtko Ursulin 


Does that still count if I set args->busy to zero?

Thanks,
Christian.



Regards,

Tvrtko


+    dma_resv_iter_begin(&cursor, obj->base.resv, true);
+    dma_resv_for_each_fence_unlocked(&cursor, fence) {
+    if (dma_resv_iter_is_restarted(&cursor))
+    args->busy = 0;
+
+    if (dma_resv_iter_is_exclusive(&cursor))
+    /* Translate the exclusive fence to the READ *and* WRITE engine */
+    args->busy |= busy_check_writer(fence);
+    else
+    /* Translate shared fences to READ set of engines */
  args->busy |= busy_check_reader(fence);
-    }
  }
-
-    if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
-    goto retry;
+    dma_resv_iter_end(&cursor);
    err = 0;
  out:
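
The restart handling the iterator makes explicit can be modeled in plain userspace C: whenever the walk is forced to start over, every result accumulated so far must be thrown away, which is why args->busy is reset inside the loop rather than only before it. Below is a minimal sketch of that pattern; mock_iter, mock_fence and the restart trigger are invented stand-ins for illustration, not the kernel's dma_resv API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock fence: a flag word standing in for busy_check_*() results. */
struct mock_fence { unsigned int flags; };

struct mock_iter {
    const struct mock_fence *fences;
    size_t count, pos;
    bool restarted;     /* set when the walk had to start over */
    int restarts_left;  /* how many concurrent modifications to simulate */
};

/* Returns the next fence, or NULL when the walk is done.  Once per
 * configured restart, rewind to the beginning and flag the restart,
 * mimicking an unlocked walk racing with concurrent writers. */
static const struct mock_fence *mock_iter_next(struct mock_iter *it)
{
    it->restarted = false;
    if (it->restarts_left > 0 && it->pos == it->count / 2 + 1) {
        it->restarts_left--;
        it->pos = 0;
        it->restarted = true;
    }
    if (it->pos >= it->count)
        return NULL;
    return &it->fences[it->pos++];
}

/* The pattern from the patch: reset the accumulator on restart so stale
 * results from the abandoned pass cannot leak into the final answer. */
static unsigned int collect_busy(const struct mock_fence *fences, size_t n,
                                 int restarts)
{
    struct mock_iter it = { fences, n, 0, false, restarts };
    unsigned int busy = 0;   /* initialized even when there are no fences */
    const struct mock_fence *f;

    while ((f = mock_iter_next(&it)) != NULL) {
        if (it.restarted)
            busy = 0;
        busy |= f->flags;
    }
    return busy;
}
```

Note how initializing the accumulator before the loop also covers Tvrtko's point above: with no fences at all, the loop body never runs and the result must already be zero.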





Re: [PATCH 0/5] drm/gma500: Managed cleanup

2021-09-22 Thread Patrik Jakobsson
On Mon, Sep 20, 2021 at 4:10 PM Thomas Zimmermann  wrote:
>
> Switch gma500 to managed cleanup and remove the manual cleanup
> code from the driver's PCI callbacks.
>
> Managed cleanup involves embedding the DRM device structure in the
> driver's structure. In preparation, patch 1 replaces all references
> to dev_private with a helper function.
>
> Patch 2 adds managed cleanup for pci_enable_device().
>
> Patches 3 and 4 embed struct drm_device in struct drm_psb_private. The
> structure's memory is being automatically released.
>
> Patch 5 adds managed cleanup for the device resources. Instead of
> calling the large, monolithic function psb_driver_unload(), the release
> code could be split into smaller helpers and reuse existing
> functionality from devres.
>
> Future work: for a number of drivers, the PCI remove callback contains
> only a single call to drm_device_unregister(). In a later patchset,
> this could be implemented as another shared helper within DRM.
>
> Tested on Atom N2800 hardware.

Thanks for the patches!

For the entire series:
Reviewed-by: Patrik Jakobsson 

I'll let you apply this to drm-misc-next yourself

Cheers
Patrik

>
> Thomas Zimmermann (5):
>   drm/gma500: Replace references to dev_private with helper function
>   drm/gma500: Disable PCI device during shutdown
>   drm/gma500: Embed struct drm_device in struct drm_psb_private
>   drm/gma500: Remove dev_priv branch from unload function
>   drm/gma500: Managed device release
>
>  drivers/gpu/drm/gma500/backlight.c |  12 +-
>  drivers/gpu/drm/gma500/cdv_device.c|  24 ++--
>  drivers/gpu/drm/gma500/cdv_intel_display.c |  10 +-
>  drivers/gpu/drm/gma500/cdv_intel_dp.c  |  12 +-
>  drivers/gpu/drm/gma500/cdv_intel_lvds.c|  22 +--
>  drivers/gpu/drm/gma500/framebuffer.c   |  16 +--
>  drivers/gpu/drm/gma500/gem.c   |   2 +-
>  drivers/gpu/drm/gma500/gma_device.c|   2 +-
>  drivers/gpu/drm/gma500/gma_display.c   |  14 +-
>  drivers/gpu/drm/gma500/gtt.c   |  18 +--
>  drivers/gpu/drm/gma500/intel_bios.c|  10 +-
>  drivers/gpu/drm/gma500/intel_gmbus.c   |  12 +-
>  drivers/gpu/drm/gma500/mid_bios.c  |  11 +-
>  drivers/gpu/drm/gma500/mmu.c   |  12 +-
>  drivers/gpu/drm/gma500/oaktrail_crtc.c |   8 +-
>  drivers/gpu/drm/gma500/oaktrail_device.c   |  20 +--
>  drivers/gpu/drm/gma500/oaktrail_hdmi.c |  18 +--
>  drivers/gpu/drm/gma500/oaktrail_lvds.c |  14 +-
>  drivers/gpu/drm/gma500/oaktrail_lvds_i2c.c |   2 +-
>  drivers/gpu/drm/gma500/opregion.c  |  14 +-
>  drivers/gpu/drm/gma500/power.c |  20 +--
>  drivers/gpu/drm/gma500/psb_device.c|  16 +--
>  drivers/gpu/drm/gma500/psb_drv.c   | 147 ++---
>  drivers/gpu/drm/gma500/psb_drv.h   |  24 ++--
>  drivers/gpu/drm/gma500/psb_intel_display.c |  10 +-
>  drivers/gpu/drm/gma500/psb_intel_lvds.c|  31 ++---
>  drivers/gpu/drm/gma500/psb_intel_sdvo.c|  10 +-
>  drivers/gpu/drm/gma500/psb_irq.c   |  26 ++--
>  drivers/gpu/drm/gma500/psb_lid.c   |   2 +-
>  29 files changed, 261 insertions(+), 278 deletions(-)
>
> --
> 2.33.0
>
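
The embedding in patch 3 works because a pointer to the inner struct drm_device can be mapped back to the surrounding driver structure with container_of(), so a single allocation (and a single managed release) covers both. A userspace sketch of that relationship follows; the structures and field names here are illustrative mocks, not the actual gma500 layout.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for struct drm_device / struct drm_psb_private. */
struct mock_drm_device {
    int registered;
};

struct mock_psb_private {
    struct mock_drm_device dev;   /* embedded, not a pointer */
    int power_state;
};

/* Equivalent of the kernel's container_of(): recover the wrapping
 * driver structure from a pointer to the embedded member. */
#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static struct mock_psb_private *to_psb(struct mock_drm_device *d)
{
    return mock_container_of(d, struct mock_psb_private, dev);
}
```

Because the DRM core only ever hands the driver a struct drm_device pointer, this back-mapping is what lets the driver reach its private state without a separate dev_private allocation to free.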


Re: [PATCH v4 10/14] drm/i915/ttm: hide shmem objects from TTM LRU

2021-09-22 Thread Christian König

Am 22.09.21 um 15:34 schrieb Matthew Auld:

On 21/09/2021 12:48, Christian König wrote:

Am 21.09.21 um 13:01 schrieb Matthew Auld:

This is probably a NAK. But ideally we need to somehow prevent TTM from
seeing shmem objects when doing its LRU swap walk. Since these are
EXTERNAL they are ignored anyway, but keeping them in the LRU seems
pretty wasteful.  Trying to use bo_pin() for this is all kinds of nasty
since we need to be able to do the bo_unpin() from the unpopulate hook,
but since that can be called from the BO destroy path we will likely go
down in flames.

An alternative is to maybe just add EXTERNAL objects to some
bdev->external LRU in TTM, or just don't add them at all?


Yeah, that goes in the same direction; it's why I've wanted to push the LRU
into the resource for some time.


The problem is that the LRU is needed for multiple things, e.g.
swapping, GART management, resource constraints, IOMMU teardown etc.


So for now I think that everything should be on the LRU even if it 
isn't valid to be there for some use case.


Ok. Is it a no-go to keep TT_FLAG_EXTERNAL on say bdev->external?


We could add that as a workaround, but I would rather aim for cleaning 
that up more thoughtfully.


Regards,
Christian.





Regards,
Christian.



Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Christian König 
---
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c

index 174aebe11264..b438ddb52764 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -800,6 +800,22 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,

  return ((base + sg_dma_address(sg)) >> PAGE_SHIFT) + ofs;
  }
+static void i915_ttm_del_from_lru_notify(struct ttm_buffer_object *bo)
+{
+    struct i915_ttm_tt *i915_tt =
+    container_of(bo->ttm, typeof(*i915_tt), ttm);
+
+    /* Ideally we need to prevent TTM from seeing shmem objects when doing
+ * its LRU swap walk. Since these are EXTERNAL they are ignored anyway,
+ * but keeping them in the LRU is pretty wasteful. Trying to use bo_pin()
+ * for this is very nasty since we need to be able to do the bo_unpin()
+ * from the unpopulate hook, but since that can be called from the BO
+ * destroy path we will go down in flames.
+ */
+    if (bo->ttm && ttm_tt_is_populated(bo->ttm) && i915_tt->is_shmem)
+    list_del_init(&bo->lru);
+}
+
  static struct ttm_device_funcs i915_ttm_bo_driver = {
  .ttm_tt_create = i915_ttm_tt_create,
  .ttm_tt_populate = i915_ttm_tt_populate,
@@ -810,6 +826,7 @@ static struct ttm_device_funcs i915_ttm_bo_driver = {

  .move = i915_ttm_move,
  .swap_notify = i915_ttm_swap_notify,
  .delete_mem_notify = i915_ttm_delete_mem_notify,
+    .del_from_lru_notify = i915_ttm_del_from_lru_notify,
  .io_mem_reserve = i915_ttm_io_mem_reserve,
  .io_mem_pfn = i915_ttm_io_mem_pfn,
  };
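
One detail worth noting in the hook above: it uses list_del_init() rather than a plain unlink, so the LRU node ends up pointing at itself and remains safe for a later deletion or emptiness check. A userspace model of that behavior, using a minimal circular doubly linked list patterned on (but not identical to) the kernel's list.h:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list. */
struct mock_list_head {
    struct mock_list_head *next, *prev;
};

static void mock_init(struct mock_list_head *h)
{
    h->next = h->prev = h;
}

static void mock_add(struct mock_list_head *entry, struct mock_list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* list_del_init() analogue: unlink and re-point the node at itself, so
 * a second deletion or an emptiness check on the node stays well defined. */
static void mock_del_init(struct mock_list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    mock_init(entry);
}

static bool mock_empty(const struct mock_list_head *h)
{
    return h->next == h;
}
```

That self-linking is what makes the notify callback robust even when TTM later touches the node again on the destroy path.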





