Re: [PATCH v2 0/2] video: A couple of fixes for the vga16fb driver

2022-01-11 Thread Geert Uytterhoeven
Hi Kris,

On Wed, Jan 12, 2022 at 3:19 AM Kris Karas (Bug reporting)
 wrote:
> Javier Martinez Canillas wrote:
> > Changes in v2:
> > - Make the change only for x86 (Geert Uytterhoeven)
> > - Only check the supported video mode for x86 (Geert Uytterhoeven).
>
> I just updated Bug 215001 to reflect that I have tested this new, V2
> patch against 4 systems, one more than last time - 2 BIOS/VGAC and 2
> UEFI - and it works perfectly on all four.
>
> Thanks, Javier, for the excellent work!
> I didn't test with non-X86, but the code appears to bypass the patch on
> non-X86, so should work fine for Geert.

Note that I can no longer test the PPC use case, as the hardware
died a long time ago.

> Tested-By: Kris Karas 

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v8 1/6] drm: move the buddy allocator from i915 into common drm

2022-01-11 Thread Christian König
If nobody has any more objections/ideas I'm going to push this one here 
to drm-misc-next in the afternoon.


Christian.

On 11.01.22 at 21:14, Arunpravin wrote:

Move the base i915 buddy allocator code into drm
- Move i915_buddy.h to include/drm
- Move i915_buddy.c to drm root folder
- Rename "i915" string with "drm" string wherever applicable
- Rename "I915" string with "DRM" string wherever applicable
- Fix header file dependencies
- Fix alignment issues
- add Makefile support for drm buddy
- export functions and write kerneldoc description
- Remove i915 selftest config check condition as buddy selftest
   will be moved to drm selftest folder

cleanup i915 buddy references in i915 driver module
and replace with drm buddy

v2:
   - include header file in alphabetical order(Thomas)
   - merged changes listed in the body section into a single patch
 to keep the build intact(Christian, Jani)

v3:
   - make drm buddy a separate module(Thomas, Christian)

v4:
   - Fix build error reported by kernel test robot 
   - removed i915 buddy selftest from i915_mock_selftests.h to
 avoid build error
   - removed selftests/i915_buddy.c file as we create a new set of
 buddy test cases in drm/selftests folder

v5:
   - Fix merge conflict issue

v6:
   - replace drm_buddy_mm structure name as drm_buddy(Thomas, Christian)
   - replace drm_buddy_alloc() function name as drm_buddy_alloc_blocks()
 (Thomas)
   - replace drm_buddy_free() function name as drm_buddy_free_block()
 (Thomas)
   - export drm_buddy_free_block() function
   - fix multiple instances of KMEM_CACHE() entry

v7:
   - fix warnings reported by kernel test robot 
   - modify the license(Christian)

Signed-off-by: Arunpravin 
---
  drivers/gpu/drm/Kconfig   |   6 +
  drivers/gpu/drm/Makefile  |   2 +
  drivers/gpu/drm/drm_buddy.c   | 535 
  drivers/gpu/drm/i915/Kconfig  |   1 +
  drivers/gpu/drm/i915/Makefile |   1 -
  drivers/gpu/drm/i915/i915_buddy.c | 466 ---
  drivers/gpu/drm/i915/i915_buddy.h | 143 
  drivers/gpu/drm/i915/i915_module.c|   3 -
  drivers/gpu/drm/i915/i915_scatterlist.c   |  11 +-
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  33 +-
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   4 +-
  drivers/gpu/drm/i915/selftests/i915_buddy.c   | 787 --
  .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
  .../drm/i915/selftests/intel_memory_region.c  |  13 +-
  include/drm/drm_buddy.h   | 150 
  15 files changed, 725 insertions(+), 1431 deletions(-)
  create mode 100644 drivers/gpu/drm/drm_buddy.c
  delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
  delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
  delete mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
  create mode 100644 include/drm/drm_buddy.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index b1f22e457fd0..b85f7ffae621 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -198,6 +198,12 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
  
+config DRM_BUDDY
+   tristate
+   depends on DRM
+   help
+ A page based buddy allocator
+
  config DRM_VRAM_HELPER
tristate
depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 301a44dc18e3..ff0286eca254 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -42,6 +42,8 @@ obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
  drm_shmem_helper-y := drm_gem_shmem_helper.o
  obj-$(CONFIG_DRM_GEM_SHMEM_HELPER) += drm_shmem_helper.o
  
+obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
+
  drm_vram_helper-y := drm_gem_vram_helper.o
  obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
  
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
new file mode 100644
index ..9f4d929995b2
--- /dev/null
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -0,0 +1,535 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+static struct kmem_cache *slab_blocks;
+
+static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm,
+  struct drm_buddy_block *parent,
+  unsigned int order,
+  u64 offset)
+{
+   struct drm_buddy_block *block;
+
+   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
+
+   block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
+   if (!block)
+   return NULL;
+
+   block->header = offset;
+   block->header |= order;
+   block->parent = parent;
+
+   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
+   return block;
+}
+
+static void drm_block_free(struct drm_buddy *mm,
+

Re: [PATCH 6/6] drm/meson: add support for MIPI-DSI transceiver

2022-01-11 Thread Jagan Teki
Hi Neil,

On Mon, Sep 7, 2020 at 1:48 PM Neil Armstrong  wrote:
>
> The Amlogic AXg SoCs embed a Synopsys DW-MIPI-DSI transceiver (ver 1.21a),
> with a custom glue managing the IP resets, clock and data input, similar to
> the DW-HDMI Glue on other Amlogic SoCs.
>
> This adds support for the Glue managing the transceiver, mimicking the init
> flow provided by Amlogic to set up the ENCL encoder, the glue, the
> transceiver, the digital D-PHY and the Analog PHY in the proper way.
>
> The DW-MIPI-DSI transceiver + D-PHY are directly clocked by the VCLK2 clock,
> from which the pixel clock is derived and fed to the ENCL encoder and the
> VIU pixel reader.
>
> An optional "MEAS" clock can be enabled to measure the delay between each
> vsync feeding the DW-MIPI-DSI transceiver.
>
> Signed-off-by: Neil Armstrong 
> ---
>  drivers/gpu/drm/meson/Kconfig |   7 +
>  drivers/gpu/drm/meson/Makefile|   1 +
>  drivers/gpu/drm/meson/meson_dw_mipi_dsi.c | 562 ++
>  3 files changed, 570 insertions(+)
>  create mode 100644 drivers/gpu/drm/meson/meson_dw_mipi_dsi.c
>
> diff --git a/drivers/gpu/drm/meson/Kconfig b/drivers/gpu/drm/meson/Kconfig
> index 9f9281dd49f8..385f6f23839b 100644
> --- a/drivers/gpu/drm/meson/Kconfig
> +++ b/drivers/gpu/drm/meson/Kconfig
> @@ -16,3 +16,10 @@ config DRM_MESON_DW_HDMI
> default y if DRM_MESON
> select DRM_DW_HDMI
> imply DRM_DW_HDMI_I2S_AUDIO
> +
> +config DRM_MESON_DW_MIPI_DSI
> +   tristate "MIPI DSI Synopsys Controller support for Amlogic Meson Display"
> +   depends on DRM_MESON
> +   default y if DRM_MESON
> +   select DRM_DW_MIPI_DSI
> +   select GENERIC_PHY_MIPI_DPHY
> diff --git a/drivers/gpu/drm/meson/Makefile b/drivers/gpu/drm/meson/Makefile
> index 28a519cdf66b..2cc870e91182 100644
> --- a/drivers/gpu/drm/meson/Makefile
> +++ b/drivers/gpu/drm/meson/Makefile
> @@ -5,3 +5,4 @@ meson-drm-y += meson_rdma.o meson_osd_afbcd.o
>
>  obj-$(CONFIG_DRM_MESON) += meson-drm.o
>  obj-$(CONFIG_DRM_MESON_DW_HDMI) += meson_dw_hdmi.o
> +obj-$(CONFIG_DRM_MESON_DW_MIPI_DSI) += meson_dw_mipi_dsi.o
> diff --git a/drivers/gpu/drm/meson/meson_dw_mipi_dsi.c b/drivers/gpu/drm/meson/meson_dw_mipi_dsi.c
> new file mode 100644
> index ..bbe1294fce7c
> --- /dev/null
> +++ b/drivers/gpu/drm/meson/meson_dw_mipi_dsi.c
> @@ -0,0 +1,562 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright (C) 2016 BayLibre, SAS
> + * Author: Neil Armstrong 
> + * Copyright (C) 2015 Amlogic, Inc. All rights reserved.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "meson_drv.h"
> +#include "meson_dw_mipi_dsi.h"
> +#include "meson_registers.h"
> +#include "meson_venc.h"
> +
> +#define DRIVER_NAME "meson-dw-mipi-dsi"
> +#define DRIVER_DESC "Amlogic Meson MIPI-DSI DRM driver"
> +
> +/*  MIPI DSI/VENC Color Format Definitions */
> +#define MIPI_DSI_VENC_COLOR_30B   0x0
> +#define MIPI_DSI_VENC_COLOR_24B   0x1
> +#define MIPI_DSI_VENC_COLOR_18B   0x2
> +#define MIPI_DSI_VENC_COLOR_16B   0x3
> +
> +#define COLOR_16BIT_CFG_1 0x0
> +#define COLOR_16BIT_CFG_2 0x1
> +#define COLOR_16BIT_CFG_3 0x2
> +#define COLOR_18BIT_CFG_1 0x3
> +#define COLOR_18BIT_CFG_2 0x4
> +#define COLOR_24BIT   0x5
> +#define COLOR_20BIT_LOOSE 0x6
> +#define COLOR_24_BIT_YCBCR 0x7
> +#define COLOR_16BIT_YCBCR 0x8
> +#define COLOR_30BIT   0x9
> +#define COLOR_36BIT   0xa
> +#define COLOR_12BIT   0xb
> +#define COLOR_RGB_111 0xc
> +#define COLOR_RGB_332 0xd
> +#define COLOR_RGB_444 0xe
> +
> +/*  MIPI DSI Relative REGISTERs Definitions */
> +/* For MIPI_DSI_TOP_CNTL */
> +#define BIT_DPI_COLOR_MODE 20
> +#define BIT_IN_COLOR_MODE 16
> +#define BIT_CHROMA_SUBSAMPLE  14
> +#define BIT_COMP2_SEL 12
> +#define BIT_COMP1_SEL 10
> +#define BIT_COMP0_SEL  8
> +#define BIT_DE_POL 6
> +#define BIT_HSYNC_POL  5
> +#define BIT_VSYNC_POL  4
> +#define BIT_DPICOLORM  3
> +#define BIT_DPISHUTDN  2
> +#define BIT_EDPITE_INTR_PULSE  1
> +#define BIT_ERR_INTR_PULSE 0
> +
> +/* HHI Registers */
> +#define HHI_VIID_CLK_DIV   0x128 /* 0x4a offset in data sheet */
> +#define VCLK2_DIV_MASK 0xff
> +#define VCLK2_DIV_EN   BIT(16)
> +#define VCLK2_DIV_RESET BIT(17)
> +#define CTS_ENCL_SEL_MASK  (0xf << 12)
> +#define CTS_ENCL_SEL_SHIFT 12
> +#define HHI_VIID_CLK_CNTL  0x12c /* 0x4b offset in data sheet */
> +#define VCLK2_EN   BIT(19)
> +#define VCLK2_SEL_MASK (0x7 << 16)
> +#define VCLK2_SEL_SHIFT 16
> +#define VCLK2_SOFT_RESET

Re: [PATCH v2] drm/mediatek: mtk_dsi: Avoid EPROBE_DEFER loop with external bridge

2022-01-11 Thread Jagan Teki
On Tue, Jan 4, 2022 at 3:30 PM AngeloGioacchino Del Regno
 wrote:
>
> DRM bridge drivers are now attaching their DSI device at probe time,
> which requires us to register our DSI host in order to let the bridge
> probe: this recently started producing an endless -EPROBE_DEFER
> loop on some machines that are using external bridges, like the
> parade-ps8640, found on the ACER Chromebook R13.
>
> Now that the DSI hosts/devices probe sequence is documented, we can
> do adjustments to the mtk_dsi driver as to both fix now and make sure
> to avoid this situation in the future: for this, following what is
> documented in drm_bridge.c, move the mtk_dsi component_add() to the
> mtk_dsi_ops.attach callback and delete it in the detach callback;
> keeping in mind that we are registering a drm_bridge for our DSI,
> which is only used/attached if the DSI Host is bound, it wouldn't
> make sense to keep adding our bridge at probe time (as it would
> be useless to have it if mtk_dsi_ops.attach() fails!), so also move
> that one to the dsi host attach function (and remove it in detach).
>
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> Reviewed-by: Andrzej Hajda 
> ---

I've observed a similar issue on other component-based DSI
controllers as well, hence

Reviewed-by: Jagan Teki 


Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

2022-01-11 Thread Christian König

Yeah, that should probably be the right one.

Christian.

On 12.01.22 at 03:19, Chen, Guchun wrote:

[Public]

Hi Christian,

My bad, I checked the discussion history just now. So if I read it correctly,
the check at a different place that skips eviction is "drm/ttm: Double check
mem_type of BO while eviction"? It is in the 5.16 kernel.

Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 11, 2022 7:27 PM
To: Chen, Guchun ; Pan, Xinhui ; Koenig, 
Christian ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

IIRC we have completely dropped this patch in favor of a check at a different 
place.

Regards,
Christian.

On 11.01.22 at 09:47, Chen, Guchun wrote:

[Public]

Hi Christian,

It looks like this patch is still missing from the 5.16 kernel. Is that intentional?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/ttm/ttm_bo.c?h=v5.16

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of
Pan, Xinhui
Sent: Tuesday, November 9, 2021 9:16 PM
To: Koenig, Christian ;
amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

[AMD Official Use Only]

Actually, this patch does not totally fix the mismatch between the lru list and
mem_type, as mem_type is changed in ->move() and the lru list is changed after that.

During this small window, another eviction could still happen and evict this
mismatched BO from sMem (say, its lru list is on the vram domain) to sMem.

From: Pan, Xinhui
Sent: November 9, 2021 21:05
To: Koenig, Christian; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Yes, a stable tag is needed. vulkan guys say 5.14 hit this issue too.

I think that amdgpu_bo_move() does not support copying from sysMem to sysMem correctly.
Maybe something like the below is needed.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c83ef42ca702..aa63ae7ddf1e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -485,7 +485,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
  }
  if (old_mem->mem_type == TTM_PL_SYSTEM &&
  (new_mem->mem_type == TTM_PL_TT ||
-new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
+new_mem->mem_type == AMDGPU_PL_PREEMPT ||
+new_mem->mem_type == TTM_PL_SYSTEM)) {
  ttm_bo_move_null(bo, new_mem);
  goto out;
  }

Otherwise, amdgpu_move_blit() is called to do the system memory copy, which uses
a wrong address.
   206 /* Map only what can't be accessed directly */
   207 if (!tmz && mem->start != AMDGPU_BO_INVALID_OFFSET) {
   208 *addr = amdgpu_ttm_domain_start(adev, mem->mem_type) +
   209 mm_cur->start;
   210 return 0;
   211 }

At line 208, *addr is zero. So when amdgpu_copy_buffer submits a job with such an
address, a page fault happens.
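
The condition in the proposed diff above can be read as a small predicate: a move out of system memory needs no data copy when the destination is GTT, the preempt domain or, with the change, system memory itself. A minimal stand-alone sketch, assuming hypothetical enum values that merely stand in for the TTM_PL_* and AMDGPU_PL_* placement constants:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TTM_PL_SYSTEM, TTM_PL_TT, TTM_PL_VRAM and
 * AMDGPU_PL_PREEMPT; the numeric values are not the kernel's. */
enum pl { PL_SYSTEM, PL_TT, PL_VRAM, PL_PREEMPT };

/* True when the move only needs a bookkeeping update (the case the diff
 * routes to ttm_bo_move_null()), so no blit is submitted. */
static bool move_is_null(enum pl old_type, enum pl new_type)
{
	if (old_type != PL_SYSTEM)
		return false;
	return new_type == PL_TT ||
	       new_type == PL_PREEMPT ||
	       new_type == PL_SYSTEM;
}

/* Usage: the system-to-system case is the one the diff adds. */
static void move_is_null_examples(void)
{
	assert(move_is_null(PL_SYSTEM, PL_SYSTEM));
	assert(move_is_null(PL_SYSTEM, PL_TT));
	assert(!move_is_null(PL_SYSTEM, PL_VRAM));
}
```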



From: Koenig, Christian
Sent: November 9, 2021 20:35
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Mhm, I'm not sure what the rationale behind that is.

Not moving the BO would make things less efficient, but should never cause a 
crash.

Maybe we should add a CC: stable tag and push it to -fixes instead?

Christian.

On 09.11.21 at 13:28, Pan, Xinhui wrote:

[AMD Official Use Only]

I hit a Vulkan CTS test hang with navi23.

dmesg reports GMC page faults at addresses 0x0, 0x1000, 0x2000.
Some debug logs also show amdgpu copying one BO from the system domain to the
system domain, which is really weird.

From: Koenig, Christian
Sent: November 9, 2021 20:20
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

On 09.11.21 at 12:19, xinhui pan wrote:

After we move a BO to a new memory region, we should put it on the new
memory manager's lru list regardless of whether we unlock the resv or not.

Signed-off-by: xinhui pan 

Interesting find, did you trigger that somehow or did you just
stumble over it by reading the code?

Patch is Reviewed-by: Christian König , I
will pick that up f

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-11 Thread JingWen Chen
Hi Andrey,

Please go ahead and push your change. I will prepare the RFC later.

On 2022/1/8 12:02 AM, Andrey Grodzovsky wrote:
>
> On 2022-01-07 12:46 a.m., JingWen Chen wrote:
>> On 2022/1/7 11:57 AM, JingWen Chen wrote:
>>> On 2022/1/7 3:13 AM, Andrey Grodzovsky wrote:
 On 2022-01-06 12:18 a.m., JingWen Chen wrote:
> On 2022/1/6 12:59 PM, JingWen Chen wrote:
>> On 2022/1/6 2:24 AM, Andrey Grodzovsky wrote:
>>> On 2022-01-05 2:59 a.m., Christian König wrote:
 Am 05.01.22 um 08:34 schrieb JingWen Chen:
> On 2022/1/5 12:56 AM, Andrey Grodzovsky wrote:
>> On 2022-01-04 6:36 a.m., Christian König wrote:
>>> Am 04.01.22 um 11:49 schrieb Liu, Monk:
 [AMD Official Use Only]

>> See the FLR request from the hypervisor is just another source 
>> of signaling the need for a reset, similar to each job timeout 
>> on each queue. Otherwise you have a race condition between the 
>> hypervisor and the scheduler.
 No it's not, FLR from hypervisor is just to notify guest the hw VF 
 FLR is about to start or was already executed, but host will do 
 FLR anyway without waiting for guest too long

>>> Then we have a major design issue in the SRIOV protocol and really 
>>> need to question this.
>>>
>>> How do you want to prevent a race between the hypervisor resetting 
>>> the hardware and the client trying the same because of a timeout?
>>>
>>> As far as I can see the procedure should be:
>>> 1. We detect that a reset is necessary, either because of a fault, a
>>> timeout or a signal from the hypervisor.
>>> 2. For each of those potential reset sources a work item is sent to
>>> the single workqueue.
>>> 3. One of those work items executes first and prepares the reset.
>>> 4. We either do the reset ourselves or notify the hypervisor that we
>>> are ready for the reset.
>>> 5. Clean up after the reset, eventually resubmit jobs etc.
>>> 6. Cancel work items which might have been scheduled from other
>>> reset sources.
>>>
>>> It does make sense that the hypervisor resets the hardware without
>>> waiting for the clients for too long, but if we don't follow these
>>> general steps we will always have a race between the different
>>> components.
>> Monk, just to add to this - if indeed as you say that 'FLR from 
>> hypervisor is just to notify guest the hw VF FLR is about to start 
>> or was already executed, but host will do FLR anyway without waiting 
>> for guest too long'
>> and there is no strict waiting from the hypervisor for
>> IDH_READY_TO_RESET to be received from the guest before starting the
>> reset, then setting in_gpu_reset and locking reset_sem from the guest
>> side is not really foolproof
>> protection from MMIO accesses by the guest - it only truly helps if
>> the hypervisor waits for that message before initiating the HW reset.
>>
> Hi Andrey, this cannot be done. If somehow the guest kernel hangs and
> never has the chance to send the response back, then other VFs will
> have to wait for it to reset. All the VFs will hang in this case. Or
> sometimes the mailbox has some delay and other VFs will also wait.
> The users of other VFs will be affected in this case.
 Yeah, agree completely with JingWen. The hypervisor is the one in 
 charge here, not the guest.

 What the hypervisor should do (and it already seems to be designed
 that way) is to send the guest a message that a reset is about to
 happen and give it some time to respond appropriately.

 The guest on the other hand then tells the hypervisor that all
 processing has stopped and it is ready to restart. If that doesn't
 happen in time the hypervisor should eliminate the guest and probably
 trigger even more severe consequences, e.g. restart the whole VM etc...

 Christian.
>>> So what's the end conclusion here regarding dropping this particular 
>>> patch ? Seems to me we still need to drop it to prevent driver's MMIO 
>>> access
>>> to the GPU during reset from various places in the code.
>>>
>>> Andrey
>>>
>> Hi Andrey & Christian,
>>
>> I have ported your patch (dropping the reset_sem and in_gpu_reset in the FLR
>> work) and run some tests. If an engine hangs during an OCL benchmark (using
>> kfd), we can see the logs below:
 Did you port the entire patchset or just 'drm/amd/virt: Drop concurrent 
 GPU reset protection for SRIOV' ?


>>> I ported the entire patchset
>> [  397.190727] amdgpu :00:07.0: amdgpu: wait for kiq fence error: 0.
>> [  3

Re: [PATCH v2 1/3] clk: Introduce a clock request API

2022-01-11 Thread Stephen Boyd
Sorry for being super delayed on response here. I'm buried in other
work. +Jerome for exclusive clk API.

Quoting Maxime Ripard (2021-09-14 02:35:13)
> It's not unusual to find clocks being shared across multiple devices
> that need to change the rate depending on what the device is doing at a
> given time.
> 
> The SoC found on the RaspberryPi4 (BCM2711) is in such a situation
> between its two HDMI controllers that share a clock that needs to be
> raised depending on the output resolution of each controller.
> 
> The current clk_set_rate API doesn't really allow to support that case
> since there's really no synchronisation between multiple users, it's
> essentially a fire-and-forget solution.

I'd also say a "last caller wins"

> 
> clk_set_min_rate does allow for such a synchronisation, but has another
> drawback: it doesn't allow to reduce the clock rate once the work is
> over.

What does "work over" mean specifically? Does it mean one of the clk
consumers has decided to stop using the clk?

Why doesn't clk_set_rate_range() work? Or clk_set_rate_range() combined
with clk_set_rate_exclusive()?

> 
> In our previous example, this means that if we were to raise the
> resolution of one HDMI controller to the largest resolution and then
> changing for a smaller one, we would still have the clock running at the
> largest resolution rate resulting in a poor power-efficiency.

Does this example have two HDMI controllers where they share one clk and
want to use the most efficient frequency for both of the HDMI devices? I
think I'm following along but it's hard. It would be clearer if there
was some pseudo-code explaining how it is both non-workable with current
APIs and workable with the new APIs.

> 
> In order to address both issues, let's create an API that allows user to
> create temporary requests to increase the rate to a minimum, before
> going back to the initial rate once the request is done.
> 
> This introduces mainly two side-effects:
> 
>   * There's an interaction between clk_set_rate and requests. This has
> been addressed by having clk_set_rate increasing the rate if it's
> greater than what the requests asked for, and in any case changing
> the rate the clock will return to once all the requests are done.
> 
>   * Similarly, clk_round_rate has been adjusted to take the requests
> into account and return a rate that will be greater or equal to the
> requested rates.
> 

I believe clk_set_rate_range() is broken but it can be fixed. I'm
forgetting the details though. If the intended user of this new API
can't use that range API then it would be good to understand why it
can't be used. I imagine it would be something like

struct clk *clk_hdmi1, *clk_hdmi2;

clk_set_rate_range(clk_hdmi1, HDMI1_MIN, HDMI1_MAX);
clk_set_rate_range(clk_hdmi2, HDMI2_MIN, HDMI2_MAX);
clk_set_rate_range(clk_hdmi2, 0, UINT_MAX);

and then the goal would be for HDMI1_MIN to be used, or at the least for
the last call to clk_set_rate_range() to drop the rate constraint and
re-evaluate the frequency of the clk again based on hdmi1's rate range.
We could have a macro for range requests to drop their frequency
constraint like clk_drop_rate_range() that's a simple wrapper around 0,
UINT_MAX if that makes it easier to read.
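
The behaviour described above, where each consumer holds a range and dropping one range makes the clk re-evaluate against the remaining constraints, can be sketched in user space. This is a toy model and not the clk framework: `toy_clk`, `toy_set_rate_range()` and `toy_drop_rate_range()` are hypothetical names, and running at the lowest rate that satisfies every consumer is a policy assumed purely for illustration. It uses ULONG_MAX as the open upper bound because the toy rates are unsigned long.

```c
#include <assert.h>
#include <limits.h>

#define TOY_MAX_USERS 4

struct toy_range {
	unsigned long min, max;
};

struct toy_clk {
	struct toy_range req[TOY_MAX_USERS]; /* one range per consumer */
	unsigned long rate;                  /* currently selected rate */
};

static void toy_clk_init(struct toy_clk *clk)
{
	for (int i = 0; i < TOY_MAX_USERS; i++)
		clk->req[i] = (struct toy_range){ 0, ULONG_MAX };
	clk->rate = 0;
}

/* Intersect all ranges; pick the lowest rate allowed by every consumer. */
static int toy_reevaluate(struct toy_clk *clk)
{
	unsigned long min = 0, max = ULONG_MAX;

	for (int i = 0; i < TOY_MAX_USERS; i++) {
		if (clk->req[i].min > min)
			min = clk->req[i].min;
		if (clk->req[i].max < max)
			max = clk->req[i].max;
	}
	if (min > max)
		return -1; /* ranges do not overlap */
	clk->rate = min;
	return 0;
}

static int toy_set_rate_range(struct toy_clk *clk, int user,
			      unsigned long min, unsigned long max)
{
	struct toy_range old;

	assert(user >= 0 && user < TOY_MAX_USERS);
	old = clk->req[user];
	clk->req[user] = (struct toy_range){ min, max };
	if (toy_reevaluate(clk) < 0) {
		clk->req[user] = old; /* reject and keep previous state */
		return -1;
	}
	return 0;
}

/* The "simple wrapper" idea: dropping a constraint is setting 0..MAX. */
static int toy_drop_rate_range(struct toy_clk *clk, int user)
{
	return toy_set_rate_range(clk, user, 0, ULONG_MAX);
}
```

In this model, once the second consumer drops its range the rate falls back to the first consumer's minimum, which is the outcome the hdmi1/hdmi2 example is asking for.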


RE: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

2022-01-11 Thread Chen, Guchun
[Public]

Hi Christian,

My bad, I checked the discussion history just now. So if I read it
correctly, the check at a different place that skips eviction is "drm/ttm:
Double check mem_type of BO while eviction"? It is in the 5.16 kernel.

Regards,
Guchun

-Original Message-
From: Christian König  
Sent: Tuesday, January 11, 2022 7:27 PM
To: Chen, Guchun ; Pan, Xinhui ; 
Koenig, Christian ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

IIRC we have completely dropped this patch in favor of a check at a different 
place.

Regards,
Christian.

On 11.01.22 at 09:47, Chen, Guchun wrote:
> [Public]
>
> Hi Christian,
>
> It looks like this patch is still missing from the 5.16 kernel. Is that intentional?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/ttm/ttm_bo.c?h=v5.16
>
> Regards,
> Guchun
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Pan, Xinhui
> Sent: Tuesday, November 9, 2021 9:16 PM
> To: Koenig, Christian ; 
> amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> [AMD Official Use Only]
>
> Actually, this patch does not totally fix the mismatch between the lru list
> and mem_type, as mem_type is changed in ->move() and the lru list is changed
> after that.
>
> During this small window, another eviction could still happen and evict this
> mismatched BO from sMem (say, its lru list is on the vram domain) to sMem.
> 
> From: Pan, Xinhui
> Sent: November 9, 2021 21:05
> To: Koenig, Christian; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> Yes, a stable tag is needed. vulkan guys say 5.14 hit this issue too.
>
> I think that amdgpu_bo_move() does not support copying from sysMem to sysMem
> correctly.
> Maybe something like the below is needed.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index c83ef42ca702..aa63ae7ddf1e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -485,7 +485,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
>  }
>  if (old_mem->mem_type == TTM_PL_SYSTEM &&
>  (new_mem->mem_type == TTM_PL_TT ||
> -new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
> +new_mem->mem_type == AMDGPU_PL_PREEMPT ||
> +new_mem->mem_type == TTM_PL_SYSTEM)) {
>  ttm_bo_move_null(bo, new_mem);
>  goto out;
>  }
>
> Otherwise, amdgpu_move_blit() is called to do the system memory copy, which
> uses a wrong address.
>   206 /* Map only what can't be accessed directly */
>   207 if (!tmz && mem->start != AMDGPU_BO_INVALID_OFFSET) {
>   208 *addr = amdgpu_ttm_domain_start(adev, mem->mem_type) +
>   209 mm_cur->start;
>   210 return 0;
>   211 }
>
> At line 208, *addr is zero. So when amdgpu_copy_buffer submits a job with such
> an address, a page fault happens.
>
>
> 
> From: Koenig, Christian
> Sent: November 9, 2021 20:35
> To: Pan, Xinhui; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> Mhm, I'm not sure what the rationale behind that is.
>
> Not moving the BO would make things less efficient, but should never cause a 
> crash.
>
> Maybe we should add a CC: stable tag and push it to -fixes instead?
>
> Christian.
>
> On 09.11.21 at 13:28, Pan, Xinhui wrote:
>> [AMD Official Use Only]
>>
>> I hit a Vulkan CTS test hang with navi23.
>>
>> dmesg reports GMC page faults at addresses 0x0, 0x1000, 0x2000.
>> Some debug logs also show amdgpu copying one BO from the system domain to the
>> system domain, which is really weird.
>> 
>> From: Koenig, Christian
>> Sent: November 9, 2021 20:20
>> To: Pan, Xinhui; amd-...@lists.freedesktop.org
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>>
>> On 09.11.21 at 12:19, xinhui pan wrote:
>>> After we move a BO to a new memory region, we should put it on the new
>>> memory manager's lru list regardless of whether we unlock the resv or not.
>>>
>>> Signed-off-by: xinhui pan 
>> Interesting find, did you trigger that so

[Bug 215001] Regression in 5.15, Firmware-initialized graphics console selects FB_VGA16, screen corruption

2022-01-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215001

--- Comment #13 from Kris Karas (bugs-...@moonlit-rail.com) ---
Hi Javier, et al,

I have just tested version two of the patch (from email, I don't see it listed
in the attachments), on the original two BIOS/VGAC servers, one new UEFI
server, and my original UEFI desktop.  Once again, I'm happy to report flawless
operation on all four.

Tested-By: Kris Karas 

Thanks again, Javier!  I hope this one also makes Geert happy, too.

(I'd still be happier if non-X86 would be patched to use orig_video_isVGA as an
integer; but for expediency, this seems fine.)

Kris


Re: [RFC PATCH 0/3] Add support modifiers for drivers whose planes only support linear layout

2022-01-11 Thread Esaki Tomohito

Hi, Simon

On 2022/01/06 8:57, Simon Ser wrote:

> Thanks for working on this! I've pushed a patch [1] to drm-misc-next which
> touches the same function, can you rebase your patches on top of it?
>
> [1]: https://patchwork.freedesktop.org/patch/467940/?series=98255&rev=3


I understand. I will rebase the patches and send them.

Thanks
Tomohito Esaki


Re: [PATCH 2/6] dt-bindings: display: meson-vpu: add third DPI output port

2022-01-11 Thread Rob Herring
On Fri, 07 Jan 2022 15:55:11 +0100, Neil Armstrong wrote:
> Add third port corresponding to the ENCL DPI encoder used to connect
> to DSI or LVDS transceivers.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  .../devicetree/bindings/display/amlogic,meson-vpu.yaml   | 5 +
>  1 file changed, 5 insertions(+)
> 

Reviewed-by: Rob Herring 


Re: [PATCH 1/6] dt-bindings: display: add Amlogic MIPI DSI Host Controller bindings

2022-01-11 Thread Rob Herring
On Fri, Jan 07, 2022 at 03:55:10PM +0100, Neil Armstrong wrote:
> The Amlogic G12A, G12B & SM1 SoCs embed a Synopsys DW-MIPI-DSI transceiver
> (ver 1.21a), with a custom glue managing the IP resets, clock and data input
> similar to the DW-HDMI Glue on the same Amlogic SoCs.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  .../display/amlogic,meson-dw-mipi-dsi.yaml| 118 ++
>  1 file changed, 118 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/display/amlogic,meson-dw-mipi-dsi.yaml
> 
> diff --git a/Documentation/devicetree/bindings/display/amlogic,meson-dw-mipi-dsi.yaml b/Documentation/devicetree/bindings/display/amlogic,meson-dw-mipi-dsi.yaml
> new file mode 100644
> index ..f3070783d606
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/amlogic,meson-dw-mipi-dsi.yaml
> @@ -0,0 +1,118 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +# Copyright 2020 BayLibre, SAS
> +%YAML 1.2
> +---
> +$id: "http://devicetree.org/schemas/display/amlogic,meson-dw-mipi-dsi.yaml#";
> +$schema: "http://devicetree.org/meta-schemas/core.yaml#";
> +
> +title: Amlogic specific extensions to the Synopsys Designware MIPI DSI Host 
> Controller
> +
> +maintainers:
> +  - Neil Armstrong 
> +
> +description: |
> +  The Amlogic Meson Synopsys Designware Integration is composed of
> +  - A Synopsys DesignWare MIPI DSI Host Controller IP
> +  - A TOP control block controlling the Clocks & Resets of the IP
> +
> +allOf:
> +  - $ref: dsi-controller.yaml#
> +
> +properties:
> +  compatible:
> +enum:
> +  - amlogic,meson-g12a-dw-mipi-dsi
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +minItems: 2
> +
> +  clock-names:
> +minItems: 2
> +items:
> +  - const: pclk
> +  - const: px_clk
> +  - const: meas_clk
> +
> +  resets:
> +minItems: 1
> +
> +  reset-names:
> +items:
> +  - const: top
> +
> +  phys:
> +minItems: 1
> +
> +  phy-names:
> +items:
> +  - const: dphy
> +
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +
> +properties:
> +  port@0:
> +$ref: /schemas/graph.yaml#/$defs/port-base

/schemas/graph.yaml#/properties/port

> +unevaluatedProperties: false

And this can be dropped.

> +description: Input node to receive pixel data.
> +
> +  port@1:
> +$ref: /schemas/graph.yaml#/$defs/port-base
> +unevaluatedProperties: false

Same here.

With that,

Reviewed-by: Rob Herring 

> +description: DSI output node to panel.
> +
> +required:
> +  - port@0
> +  - port@1
> +
> +required:
> +  - compatible
> +  - reg
> +  - clocks
> +  - clock-names
> +  - resets
> +  - reset-names
> +  - phys
> +  - phy-names
> +  - ports
> +
> +unevaluatedProperties: false
> +
> +examples:
> +  - |
> +dsi@7000 {
> +  compatible = "amlogic,meson-g12a-dw-mipi-dsi";
> +  reg = <0x6000 0x400>;
> +  resets = <&reset_top>;
> +  reset-names = "top";
> +  clocks = <&clk_pclk>, <&clk_px>;
> +  clock-names = "pclk", "px_clk";
> +  phys = <&mipi_dphy>;
> +  phy-names = "dphy";
> +
> +  ports {
> +  #address-cells = <1>;
> +  #size-cells = <0>;
> +
> +  /* VPU VENC Input */
> +  mipi_dsi_venc_port: port@0 {
> +  reg = <0>;
> +
> +  mipi_dsi_in: endpoint {
> +   remote-endpoint = <&dpi_out>;
> +  };
> +  };
> +
> +  /* DSI Output */
> +  mipi_dsi_panel_port: port@1 {
> +  reg = <1>;
> +
> +  mipi_out_panel: endpoint {
> +  remote-endpoint = <&mipi_in_panel>;
> +  };
> +  };
> +  };
> +};
> -- 
> 2.25.1
> 
> 


Re: [PATCH v2 2/2] dt-bindings: display: Add STARRY 2081101QFH032011-53G

2022-01-11 Thread Rob Herring
On Fri, 07 Jan 2022 20:22:08 +0800, xiazhengqiao wrote:
> Add dt-bindings for 10.1" TFT LCD module called STARRY 2081101
> QFH032011-53G.
> 
> Signed-off-by: xiazhengqiao 
> ---
>  .../display/panel/innolux,himax8279d.yaml | 72 +++
>  1 file changed, 72 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/innolux,himax8279d.yaml
> 

Reviewed-by: Rob Herring 


Re: [PATCH 1/4] dt-bindings: display: panel: feiyang,fy07024di26a30d: make reset gpio optional

2022-01-11 Thread Rob Herring
On Fri, 07 Jan 2022 00:13:32 -0500, Peter Geis wrote:
> Some implementations do not use the reset signal, instead tying it to dvdd.
> Make the reset gpio optional to permit this.
> 
> Signed-off-by: Peter Geis 
> ---
>  .../bindings/display/panel/feiyang,fy07024di26a30d.yaml  | 1 -
>  1 file changed, 1 deletion(-)
> 

Acked-by: Rob Herring 


Re: [PATCH v5 25/32] iommu/mtk: Migrate to aggregate driver

2022-01-11 Thread Stephen Boyd
Quoting Yong Wu (2022-01-11 04:22:23)
> Hi Stephen,
>
> Thanks for helping update here.
>
> On Thu, 2022-01-06 at 13:45 -0800, Stephen Boyd wrote:
> > Use an aggregate driver instead of component ops so that we can get
> > proper driver probe ordering of the aggregate device with respect to
> > all
> > the component devices that make up the aggregate device.
> >
> > Cc: Yong Wu 
> > Cc: Joerg Roedel 
> > Cc: Will Deacon 
> > Cc: Daniel Vetter 
> > Cc: "Rafael J. Wysocki" 
> > Cc: Rob Clark 
> > Cc: Russell King 
> > Cc: Saravana Kannan 
> > Signed-off-by: Stephen Boyd 
>
> When I test this on mt8195, which has two IOMMU HWs (calling
> component_aggregate_register twice), it will abort like this. Then what
> should we do if we have two instances?
>

Thanks for testing it out. We can't register the struct driver more than
once but this driver is calling the component_aggregate_register()
function from the driver probe and there are two devices bound to the
mtk-iommu driver so we try to register it more than once. Sigh!

I see a couple options. One is to do a deep copy of the driver structure
and change the driver name. Then it's a one to one relationship between
device and driver. That's not very great because it leaves around junk
so it should probably be avoided.

Another option is to reference count the driver registration calls when
component_aggregate_register() is called multiple times. Then we would
only register the driver once and keep it pinned until the last
unregister call is made, but still remove devices that are created for
the match table.
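
To make the second option concrete, here is a minimal userspace sketch of
"register once, unregister on last put". It is only a model under stated
assumptions: struct demo_driver and the demo_* functions are hypothetical
names, the "registered" flag stands in for driver_register() and
driver_unregister(), and locking is left out for brevity.

```c
#include <assert.h>

/* Hypothetical stand-in for a driver object that several probing devices
 * may try to register; none of these names are kernel API. */
struct demo_driver {
	const char *name;
	unsigned int count;	/* outstanding register calls */
	int registered;		/* 1 while the underlying driver is live */
};

/* Register the underlying driver only on the first call. */
static int demo_driver_register(struct demo_driver *drv)
{
	if (drv->count == 0)
		drv->registered = 1;	/* stands in for driver_register() */
	drv->count++;
	return 0;
}

/* Unregister the underlying driver only when the last user goes away. */
static void demo_driver_unregister(struct demo_driver *drv)
{
	assert(drv->count > 0);
	if (--drv->count == 0)
		drv->registered = 0;	/* stands in for driver_unregister() */
}
```

With two devices bound to the same driver, the second register call only
bumps the count, and the driver stays registered until both have called
unregister.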

Can you try the attached patch? It is based on the next version of this
patch series so the include part of the patch may not apply cleanly.

---8<---
diff --git a/drivers/base/component.c b/drivers/base/component.c
index 64ad7478c67a..97f253a41bdf 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -492,15 +492,30 @@ static struct aggregate_device *__aggregate_find(struct device *parent)
return dev ? to_aggregate_device(dev) : NULL;
 }

+static DEFINE_MUTEX(aggregate_mutex);
+
 static int aggregate_driver_register(struct aggregate_driver *adrv)
 {
-   adrv->driver.bus = &aggregate_bus_type;
-   return driver_register(&adrv->driver);
+   int ret = 0;
+
+   mutex_lock(&aggregate_mutex);
+   if (!refcount_inc_not_zero(&adrv->count)) {
+   adrv->driver.bus = &aggregate_bus_type;
+   ret = driver_register(&adrv->driver);
+   if (!ret)
+   refcount_inc(&adrv->count);
+   }
+   mutex_unlock(&aggregate_mutex);
+
+   return ret;
 }

 static void aggregate_driver_unregister(struct aggregate_driver *adrv)
 {
-   driver_unregister(&adrv->driver);
+   if (refcount_dec_and_mutex_lock(&adrv->count, &aggregate_mutex)) {
+   driver_unregister(&adrv->driver);
+   mutex_unlock(&aggregate_mutex);
+   }
 }

 static struct aggregate_device *aggregate_device_add(struct device *parent,
diff --git a/include/linux/component.h b/include/linux/component.h
index 53d81203c095..b061341938aa 100644
--- a/include/linux/component.h
+++ b/include/linux/component.h
@@ -4,6 +4,7 @@

 #include 
 #include 
+#include 

 struct aggregate_device;

@@ -66,6 +67,7 @@ struct device *aggregate_device_parent(const struct aggregate_device *adev);

 /**
  * struct aggregate_driver - Aggregate driver (made up of other drivers)
+ * @count: driver registration refcount
  * @driver: device driver
  */
 struct aggregate_driver {
@@ -101,6 +103,7 @@ struct aggregate_driver {
 */
void (*shutdown)(struct aggregate_device *adev);

+   refcount_t  count;
struct device_driverdriver;
 };


Re: [RFC PATCH 2/7] drm/msm/dp: support attaching bridges to the DP encoder

2022-01-11 Thread Dmitry Baryshkov
On Wed, 12 Jan 2022 at 02:12, Kuogee Hsieh  wrote:
>
>
> On 1/6/2022 9:26 PM, Dmitry Baryshkov wrote:
> > On 07/01/2022 06:42, Stephen Boyd wrote:
> >> Quoting Dmitry Baryshkov (2022-01-06 18:01:27)
> >>> Currently DP driver will allocate panel bridge for eDP panels.
> >>> Simplify this code to just check if there is any next bridge in the
> >>> chain (be it a panel bridge or regular bridge). Rename panel_bridge
> >>> field to next_bridge accordingly.
> >>>
> >>> Signed-off-by: Dmitry Baryshkov 
> >>> ---
> >>>   drivers/gpu/drm/msm/dp/dp_display.c |  2 +-
> >>>   drivers/gpu/drm/msm/dp/dp_display.h |  2 +-
> >>>   drivers/gpu/drm/msm/dp/dp_drm.c |  4 ++--
> >>>   drivers/gpu/drm/msm/dp/dp_parser.c  | 26 --
> >>>   drivers/gpu/drm/msm/dp/dp_parser.h  |  2 +-
> >>>   5 files changed, 13 insertions(+), 23 deletions(-)
> >>
> >> I like this one, it certainly makes it easier to understand.
> >>
> >>> diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c
> >>> b/drivers/gpu/drm/msm/dp/dp_parser.c
> >>> index a7acc23f742b..5de21f3d0812 100644
> >>> --- a/drivers/gpu/drm/msm/dp/dp_parser.c
> >>> +++ b/drivers/gpu/drm/msm/dp/dp_parser.c
> >>> @@ -307,11 +299,9 @@ static int dp_parser_parse(struct dp_parser
> >>> *parser, int connector_type)
> >>>  if (rc)
> >>>  return rc;
> >>>
> >>> -   if (connector_type == DRM_MODE_CONNECTOR_eDP) {
> >>
> >> It feels like this is on purpose, but I don't see any comment so I have
> >> no idea. I think qcom folks are concerned about changing how not eDP
> >> works. I'll have to test it out locally.
> >
> > Ah, another thing that should go into the commit message.
> >
> > Current situation:
> > - DP: no external bridges supported.
> > - eDP: only a drm_panel wrapped into the panel bridge
> >
> > After this patch:
> > - both DP and eDP support any chain of bridges attached.
> >
> >
> > While the change means nothing for the DP (IIUC, it will not have any
> > bridges), it simplifies the code path, lowering the amount of checks.
> >
> > And for eDP this means that we can attach any eDP-to-something bridges
> > (e.g. NXP PTN3460).
> >
> >
> > Well... After re-checking the source code for
> > devm_drm_of_get_bridge/drm_of_find_panel_or_bridge I should probably
> > revert removal of the check. The function will return -ENODEV if
> > neither bridge nor panel are specified.
> >
> I am new to drm and confused about bridges here.
>
> Isn't a bridge used to bridge two different kinds of interface together?
>
> For example, dsi <--> bridge <--> dp.
>
> Why does eDP need a bridge here?
>
> Can you give me more info regarding what the bridge is trying to do here?

First, there are bridges converting the eDP interface to another
interface. The mentioned NXP PTN3460 converts (embedded) DisplayPort
to LVDS.

Second (and this is the case here) drm_bridge can be used to wrap
drm_panel (panel-bridge), so that the driver doesn't have to care
about the drm_panel interface.

Last, but not least, external display connectors can also be
abstracted as bridges (see display-connector.c).

This becomes even more appealing as the driver can then switch to
drm_bridge_connector, supporting any kinds of pipelines attached to
the encoder, supporting any kind of converters, panel or external
connector pipelines. Think about the following (sometimes crazy, but
possible) examples. With
drm_bridge/panel-bridge/display-connector/drm_bridge_connector there
is no difference for the driver at all:
- DP encoder ⇒ DP port ⇒ DP monitor
- DP encoder ⇒ DP connector supporting generic GPIO as HPD ⇒ DP port ⇒
DP monitor
- eDP encoder ⇒ eDP panel
- eDP encoder ⇒ ptn3460 ⇒ fixed LVDS panel
- eDP encoder ⇒ ptn3460 ⇒ LVDS connector with EDID lines for panel autodetect
- eDP encoder ⇒ ptn3460 ⇒ THC63LVD1024 ⇒ DPI panel.
- eDP encoder ⇒ LT8912 ⇒ DSI panel
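
A toy model of why the driver does not care what is attached: it only ever
holds a pointer to the first element and walks the chain. This is a
userspace sketch, not the drm_bridge API; struct demo_bridge and the
demo_chain_* helpers are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one pipeline element; not the drm_bridge API. */
struct demo_bridge {
	const char *name;
	struct demo_bridge *next;	/* next element towards the sink */
};

/* Walk the chain from the encoder's first bridge and return the final sink. */
static const struct demo_bridge *demo_chain_sink(const struct demo_bridge *first)
{
	assert(first != NULL);
	while (first->next)
		first = first->next;
	return first;
}

/* Count the elements the encoder drives; their kind does not matter here. */
static size_t demo_chain_length(const struct demo_bridge *first)
{
	size_t n = 0;

	for (; first; first = first->next)
		n++;
	return n;
}
```

The "eDP encoder ⇒ ptn3460 ⇒ THC63LVD1024 ⇒ DPI panel" case is then just a
three-element chain, and the plain "eDP encoder ⇒ eDP panel" case is a
one-element chain handled by the same code.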

> >>
> >>> -   rc = dp_parser_find_panel(parser);
> >>> -   if (rc)
> >>> -   return rc;
> >>> -   }
> >>> +   rc = dp_parser_find_next_bridge(parser);
> >>> +   if (rc)
> >>> +   return rc;
> >>>
> >>>  /* Map the corresponding regulator information according to
> >>>   * version. Currently, since we only have one supported
> >>> platform,
> >
> >



-- 
With best wishes
Dmitry


[PATCH 1/2] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2022-01-11 Thread Matthew Brost
Add a cancel request selftest that results in an engine reset to cancel
the request as it is non-preemptable. Also insert a NOP request after
the cancelled request and confirm that it completes successfully.

v2:
 (Tvrtko)
  - Skip test if preemption timeout compiled out
  - Skip test if engine reset isn't supported
  - Update debug prints to be more descriptive
v3:
  - Add comment explaining test

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/selftests/i915_request.c | 117 ++
 1 file changed, 117 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 7f66f6d299b26..f78de99d5ae1e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -782,6 +782,115 @@ static int __cancel_completed(struct intel_engine_cs *engine)
return err;
 }
 
+/*
+ * Test to prove a non-preemptable request can be cancelled and a subsequent
+ * request on the same context can successfully complete after cancellation.
+ *
+ * Testing methodology is to create non-preemptable request and submit it,
+ * wait for spinner to start, create a NOP request and submit it, cancel the
+ * spinner, wait for spinner to complete and verify it failed with an error,
+ * finally wait for NOP request to complete verify it succeeded without an
+ * error. Preemption timeout also reduced / restored so test runs in a timely
+ * manner.
+ */
+static int __cancel_reset(struct drm_i915_private *i915,
+ struct intel_engine_cs *engine)
+{
+   struct intel_context *ce;
+   struct igt_spinner spin;
+   struct i915_request *rq, *nop;
+   unsigned long preempt_timeout_ms;
+   int err = 0;
+
+   if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT ||
+   !intel_has_reset_engine(engine->gt))
+   return 0;
+
+   preempt_timeout_ms = engine->props.preempt_timeout_ms;
+   engine->props.preempt_timeout_ms = 100;
+
+   if (igt_spinner_init(&spin, engine->gt))
+   goto out_restore;
+
+   ce = intel_context_create(engine);
+   if (IS_ERR(ce)) {
+   err = PTR_ERR(ce);
+   goto out_spin;
+   }
+
+   rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+   if (IS_ERR(rq)) {
+   err = PTR_ERR(rq);
+   goto out_ce;
+   }
+
+   pr_debug("%s: Cancelling active non-preemptable request\n",
+engine->name);
+   i915_request_get(rq);
+   i915_request_add(rq);
+   if (!igt_wait_for_spinner(&spin, rq)) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("Failed to start spinner on %s\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_rq;
+   }
+
+   nop = intel_context_create_request(ce);
+   if (IS_ERR(nop))
+   goto out_rq;
+   i915_request_get(nop);
+   i915_request_add(nop);
+
+   i915_request_cancel(rq, -EINTR);
+
+   if (i915_request_wait(rq, 0, HZ) < 0) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("%s: Failed to cancel hung request\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_nop;
+   }
+
+   if (rq->fence.error != -EINTR) {
+   pr_err("%s: fence not cancelled (%u)\n",
+  engine->name, rq->fence.error);
+   err = -EINVAL;
+   goto out_nop;
+   }
+
+   if (i915_request_wait(nop, 0, HZ) < 0) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("%s: Failed to complete nop request\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_nop;
+   }
+
+   if (nop->fence.error != 0) {
+   pr_err("%s: Nop request errored (%u)\n",
+  engine->name, nop->fence.error);
+   err = -EINVAL;
+   }
+
+out_nop:
+   i915_request_put(nop);
+out_rq:
+   i915_request_put(rq);
+out_ce:
+   intel_context_put(ce);
+out_spin:
+   igt_spinner_fini(&spin);
+out_restore:
+   engine->props.preempt_timeout_ms = preempt_timeout_ms;
+   if (err)
+   pr_err("%s: %s error %d\n", __func__, engine->name, err);
+   return err;
+}
+
 static int live_cancel_request(void *arg)
 {
struct drm_i915_private *i915 = arg;
@@ -814,6 +923,14 @@ static int live_cancel_request(void *arg)
return err;
if (err2)
return err2;
+
+   /* Expects reset so call outside of igt_live_test_* */
+   err = __cancel_reset(i915, engine);
+   if (err)
+  

[PATCH 2/2] drm/i915/guc: Remove hacks for reset and schedule disable G2H being received out of order

2022-01-11 Thread Matthew Brost
In the i915 there are several hacks in place to make request cancelation
work with an old version of the GuC which delivered the G2H indicating
schedule disable is done before G2H indicating a context reset. Version
69 fixes this, so we can remove these hacks.

Signed-off-by: Matthew Brost 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 30 ++-
 1 file changed, 2 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 23a40f10d376d..3918f1be114fa 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1533,7 +1533,6 @@ static void __guc_reset_context(struct intel_context *ce, bool stalled)
unsigned long flags;
u32 head;
int i, number_children = ce->parallel.number_children;
-   bool skip = false;
struct intel_context *parent = ce;
 
GEM_BUG_ON(intel_context_is_child(ce));
@@ -1544,23 +1543,10 @@ static void __guc_reset_context(struct intel_context *ce, bool stalled)
 * GuC will implicitly mark the context as non-schedulable when it sends
 * the reset notification. Make sure our state reflects this change. The
 * context will be marked enabled on resubmission.
-*
-* XXX: If the context is reset as a result of the request cancellation
-* this G2H is received after the schedule disable complete G2H which is
-* wrong as this creates a race between the request cancellation code
-* re-submitting the context and this G2H handler. This is a bug in the
-* GuC but can be worked around in the meantime but converting this to a
-* NOP if a pending enable is in flight as this indicates that a request
-* cancellation has occurred.
 */
spin_lock_irqsave(&ce->guc_state.lock, flags);
-   if (likely(!context_pending_enable(ce)))
-   clr_context_enabled(ce);
-   else
-   skip = true;
+   clr_context_enabled(ce);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
-   if (unlikely(skip))
-   goto out_put;
 
/*
 * For each context in the relationship find the hanging request
@@ -1592,7 +1578,6 @@ static void __guc_reset_context(struct intel_context *ce, bool stalled)
}
 
__unwind_incomplete_requests(parent);
-out_put:
intel_context_put(parent);
 }
 
@@ -2531,12 +2516,6 @@ static void guc_context_cancel_request(struct intel_context *ce,
true);
}
 
-   /*
-* XXX: Racey if context is reset, see comment in
-* __guc_reset_context().
-*/
-   flush_work(&ce_to_guc(ce)->ct.requests.worker);
-
guc_context_unblock(block_context);
intel_context_put(ce);
}
@@ -3971,12 +3950,7 @@ static void guc_handle_context_reset(struct intel_guc *guc,
 {
trace_intel_context_reset(ce);
 
-   /*
-* XXX: Racey if request cancellation has occurred, see comment in
-* __guc_reset_context().
-*/
-   if (likely(!intel_context_is_banned(ce) &&
-  !context_blocked(ce))) {
+   if (likely(!intel_context_is_banned(ce))) {
capture_error_state(guc, ce);
guc_context_replay(ce);
} else {
-- 
2.34.1



[PATCH 0/2] Remove some hacks required for GuC 62.0.0

2022-01-11 Thread Matthew Brost
Remove a hack required because schedule disable done G2H was received
before context reset G2H in GuC firmware 62.0.0. Since we have upgraded
to 69.0.3, this is no longer required.

Also revive selftest which proves this works before / after change.

Signed-off-by: Matthew Brost 

Matthew Brost (2):
  drm/i915/selftests: Add a cancel request selftest that triggers a
reset
  drm/i915/guc: Remove hacks for reset and schedule disable G2H being
received out of order

 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  30 +
 drivers/gpu/drm/i915/selftests/i915_request.c | 117 ++
 2 files changed, 119 insertions(+), 28 deletions(-)

-- 
2.34.1



Re: [RFC PATCH 2/7] drm/msm/dp: support attaching bridges to the DP encoder

2022-01-11 Thread Kuogee Hsieh



On 1/6/2022 9:26 PM, Dmitry Baryshkov wrote:

On 07/01/2022 06:42, Stephen Boyd wrote:

Quoting Dmitry Baryshkov (2022-01-06 18:01:27)

Currently DP driver will allocate panel bridge for eDP panels.
Simplify this code to just check if there is any next bridge in the
chain (be it a panel bridge or regular bridge). Rename panel_bridge
field to next_bridge accordingly.

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/dp/dp_display.c |  2 +-
  drivers/gpu/drm/msm/dp/dp_display.h |  2 +-
  drivers/gpu/drm/msm/dp/dp_drm.c |  4 ++--
  drivers/gpu/drm/msm/dp/dp_parser.c  | 26 --
  drivers/gpu/drm/msm/dp/dp_parser.h  |  2 +-
  5 files changed, 13 insertions(+), 23 deletions(-)


I like this one, it certainly makes it easier to understand.

diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c 
b/drivers/gpu/drm/msm/dp/dp_parser.c

index a7acc23f742b..5de21f3d0812 100644
--- a/drivers/gpu/drm/msm/dp/dp_parser.c
+++ b/drivers/gpu/drm/msm/dp/dp_parser.c
@@ -307,11 +299,9 @@ static int dp_parser_parse(struct dp_parser 
*parser, int connector_type)

 if (rc)
 return rc;

-   if (connector_type == DRM_MODE_CONNECTOR_eDP) {


It feels like this is on purpose, but I don't see any comment so I have
no idea. I think qcom folks are concerned about changing how not eDP
works. I'll have to test it out locally.


Ah, another thing that should go into the commit message.

Current situation:
- DP: no external bridges supported.
- eDP: only a drm_panel wrapped into the panel bridge

After this patch:
- both DP and eDP support any chain of bridges attached.


While the change means nothing for the DP (IIUC, it will not have any 
bridges), it simplifies the code path, lowering the amount of checks.


And for eDP this means that we can attach any eDP-to-something bridges 
(e.g. NXP PTN3460).



Well... After re-checking the source code for 
devm_drm_of_get_bridge/drm_of_find_panel_or_bridge I should probably 
revert removal of the check. The function will return -ENODEV if 
neither bridge nor panel are specified.



I am new to drm and confused about bridges here.

Isn't a bridge used to bridge two different kinds of interface together?

For example, dsi <--> bridge <--> dp.

Why does eDP need a bridge here?

Can you give me more info regarding what the bridge is trying to do here?






-   rc = dp_parser_find_panel(parser);
-   if (rc)
-   return rc;
-   }
+   rc = dp_parser_find_next_bridge(parser);
+   if (rc)
+   return rc;

 /* Map the corresponding regulator information according to
  * version. Currently, since we only have one supported 
platform,





Re: Phyr Starter

2022-01-11 Thread Logan Gunthorpe



On 2022-01-11 4:02 p.m., Jason Gunthorpe wrote:
> On Tue, Jan 11, 2022 at 03:57:07PM -0700, Logan Gunthorpe wrote:
>>
>>
>> On 2022-01-11 3:53 p.m., Jason Gunthorpe wrote:
>>> I just want to share the whole API that will have to exist to
>>> reasonably support this flexible array of intervals data structure..
>>
>> Is that really worth it? I feel like type safety justifies replicating a
>> bit of iteration and allocation infrastructure. Then there's no silly
>> mistakes of thinking one array is one thing when it is not.
> 
> If it is 'a bit' then sure, but I suspect doing a good job here will
> be a lot of code.
> 
> Look at how big scatterlist is, for instance.

Yeah, but scatterlist has a ton of cruft; numerous ways to allocate,
multiple iterators, developers using it in different ways, etc, etc.
It's a big mess. bvec.h is much smaller (though includes stuff that
wouldn't necessarily be appropriate here).

Also some things apply to one but not the other. eg: a memcpy to/from
function might make sense for a phy_range but makes no sense for a
dma_range.

> Maybe we could have a generic 64 bit interval array and then two type
> wrappers that do dma and physaddr casting? IDK.
> 
> Not sure type safety of DMA vs CPU address is critical?

I would argue it is. A DMA address is not a CPU address and should not
be treated the same.
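
As a rough illustration of how the two positions could meet, here is a
sketch of one shared 64-bit interval type with two thin typed wrappers, so
the storage and iteration code is common but the compiler still refuses to
mix the two kinds. All names (interval64, phys_range, dma_range and the
helpers) are hypothetical and not an existing kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/* Generic storage: one 64-bit address/length interval. Hypothetical names. */
struct interval64 {
	uint64_t addr;
	uint64_t len;
};

/* Distinct wrapper types, so passing one where the other is expected
 * is a compile error rather than a silent mistake. */
struct phys_range { struct interval64 iv; };
struct dma_range  { struct interval64 iv; };

static struct phys_range phys_range_make(uint64_t paddr, uint64_t len)
{
	struct phys_range r = { { paddr, len } };
	return r;
}

static struct dma_range dma_range_make(uint64_t daddr, uint64_t len)
{
	struct dma_range r = { { daddr, len } };
	return r;
}

/* First address past the end of a CPU physical range. */
static uint64_t phys_range_end(struct phys_range r)
{
	return r.iv.addr + r.iv.len;
}

/* First address past the end of a DMA range. */
static uint64_t dma_range_end(struct dma_range r)
{
	return r.iv.addr + r.iv.len;
}
```

A memcpy-style helper would then take a struct phys_range only, and calling
it with a struct dma_range would fail to build.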

Logan



Re: Phyr Starter

2022-01-11 Thread Logan Gunthorpe



On 2022-01-11 3:57 p.m., Jason Gunthorpe wrote:
> On Tue, Jan 11, 2022 at 03:09:13PM -0700, Logan Gunthorpe wrote:
> 
>> Either that, or we need a wrapper that allocates an appropriately
>> sized SGL to pass to any dma_map implementation that doesn't support
>> the new structures.
> 
> This is what I think we should do. If we start with RDMA then we can
> motivate the 4 main server IOMMU drivers to get updated ASAP, then it
> can acceptably start to spread to other users.

I suspect the preferred path forward is for the IOMMU drivers that don't
use dma-iommu to be converted to use it. Then anything we do to
dma-iommu will be applicable to the IOMMU drivers. Better than expecting
them to implement a bunch of new functionality themselves.

Logan



Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 03:57:07PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 2022-01-11 3:53 p.m., Jason Gunthorpe wrote:
> > I just want to share the whole API that will have to exist to
> > reasonably support this flexible array of intervals data structure..
> 
> Is that really worth it? I feel like type safety justifies replicating a
> bit of iteration and allocation infrastructure. Then there's no silly
> mistakes of thinking one array is one thing when it is not.

If it is 'a bit' then sure, but I suspect doing a good job here will
be a lot of code.

Look at how big scatterlist is, for instance.

Maybe we could have a generic 64 bit interval array and then two type
wrappers that do dma and physaddr casting? IDK.

Not sure type safety of DMA vs CPU address is critical?

Jason


Re: [PATCH 1/4] drm/msm/adreno: Add support for Adreno 8c Gen 3

2022-01-11 Thread Rob Clark
On Tue, Jan 11, 2022 at 1:31 PM Akhil P Oommen  wrote:
>
> Add support for "Adreno 8c Gen 3" gpu along with the necessary speedbin
> support.
>
> Signed-off-by: Akhil P Oommen 
> ---
>
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 21 +
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 29 ++---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h| 10 --
>  3 files changed, 51 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 51b8377..9268ce3 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -10,7 +10,6 @@
>
>  #include 
>  #include 
> -#include 
>  #include 
>
>  #define GPU_PAS_ID 13
> @@ -1734,6 +1733,18 @@ static u32 a618_get_speed_bin(u32 fuse)
> return UINT_MAX;
>  }
>
> +static u32 adreno_7c3_get_speed_bin(u32 fuse)
> +{
> +   if (fuse == 0)
> +   return 0;
> +   else if (fuse == 117)
> +   return 0;
> +   else if (fuse == 190)
> +   return 1;
> +
> +   return UINT_MAX;
> +}
> +
>  static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 
> fuse)
>  {
> u32 val = UINT_MAX;
> @@ -1741,6 +1752,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_rev rev, u32 fuse)
> if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
> val = a618_get_speed_bin(fuse);
>
> +   if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
> +   val = adreno_7c3_get_speed_bin(fuse);
> +
> if (val == UINT_MAX) {
> DRM_DEV_ERROR(dev,
> "missing support for speed-bin: %u. Some OPPs may not 
> be supported by hardware",
> @@ -1753,11 +1767,10 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_rev rev, u32 fuse)
>
>  static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
>  {
> -   u32 supp_hw = UINT_MAX;
> -   u32 speedbin;
> +   u32 speedbin, supp_hw = UINT_MAX;
> int ret;
>
> -   ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", &speedbin);
> +   ret = adreno_read_speedbin(dev, &speedbin);
> /*
>  * -ENOENT means that the platform doesn't support speedbin which is
>  * fine
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 9300583..f35c631 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -6,6 +6,7 @@
>   * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved.
>   */
>
> +#include 
>  #include "adreno_gpu.h"
>
>  bool hang_debug = false;
> @@ -317,6 +318,17 @@ static const struct adreno_info gpulist[] = {
> .zapfw = "a660_zap.mdt",
> .hwcg = a660_hwcg,
> }, {
> +   .rev = ADRENO_REV_SKU(6, 3, 5, ANY_ID, 190),
> +   .name = "Adreno 8c Gen 3",
> +   .fw = {
> +   [ADRENO_FW_SQE] = "a660_sqe.fw",
> +   [ADRENO_FW_GMU] = "a660_gmu.bin",
> +   },
> +   .gmem = SZ_512K,
> +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> +   .init = a6xx_gpu_init,
> +   .hwcg = a660_hwcg,
> +   }, {
> .rev = ADRENO_REV(6, 3, 5, ANY_ID),
> .name = "Adreno 7c Gen 3",
> .fw = {
> @@ -371,7 +383,8 @@ bool adreno_cmp_rev(struct adreno_rev rev1, struct 
> adreno_rev rev2)
> return _rev_match(rev1.core, rev2.core) &&
> _rev_match(rev1.major, rev2.major) &&
> _rev_match(rev1.minor, rev2.minor) &&
> -   _rev_match(rev1.patchid, rev2.patchid);
> +   _rev_match(rev1.patchid, rev2.patchid) &&
> +   _rev_match(rev1.sku, rev2.sku);
>  }
>
>  const struct adreno_info *adreno_info(struct adreno_rev rev)
> @@ -445,12 +458,17 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
> return gpu;
>  }
>
> +int adreno_read_speedbin(struct device *dev, u32 *speedbin)
> +{
> +   return nvmem_cell_read_variable_le_u32(dev, "speed_bin", speedbin);
> +}

If you are going to add a helper for this, you should probably use it
in a6xx_set_supported_hw() as well..

BR,
-R

> +
>  static int find_chipid(struct device *dev, struct adreno_rev *rev)
>  {
> struct device_node *node = dev->of_node;
> const char *compat;
> int ret;
> -   u32 chipid;
> +   u32 chipid, speedbin;
>
> /* first search the compat strings for qcom,adreno-XYZ.W: */
> ret = of_property_read_string_index(node, "compatible", 0, &compat);
> @@ -466,7 +484,7 @@ static int find_chipid(struct device *dev, struct 
> adreno_rev *rev)
> rev->minor = r;
> rev->patchid = patch;
>
> -   return 0;
> +   

Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 03:09:13PM -0700, Logan Gunthorpe wrote:

> Either that, or we need a wrapper that allocates an appropriately
> sized SGL to pass to any dma_map implementation that doesn't support
> the new structures.

This is what I think we should do. If we start with RDMA then we can
motivate the 4 main server IOMMU drivers to get updated ASAP, then it
can acceptably start to spread to other users.

The passthrough path would have to be optimized from the start to
avoid the SGL.

Jason



Re: Phyr Starter

2022-01-11 Thread Logan Gunthorpe



On 2022-01-11 3:53 p.m., Jason Gunthorpe wrote:
> I just want to share the whole API that will have to exist to
> reasonably support this flexible array of intervals data structure..

Is that really worth it? I feel like type safety justifies replicating a
bit of iteration and allocation infrastructure. Then there's no silly
mistakes of thinking one array is one thing when it is not.

Logan



Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 09:25:40PM +0000, Matthew Wilcox wrote:
> > I don't need the sgt at all. I just need another list of physical
> > addresses for DMA. I see no issue with a phsr_list storing either CPU
> > Physical Address or DMA Physical Addresses, same data structure.
> 
> There's a difference between a phys_addr_t and a dma_addr_t.  They
> can even be different sizes; some architectures use a 32-bit dma_addr_t
> and a 64-bit phys_addr_t or vice-versa.  phyr cannot store DMA addresses.

I know, but I'm not sure optimizing for 32 bit phys_addr_t is
worthwhile. So I imagine phyrs forced to be 64 bits so it can always
hold a dma_addr_t and we can re-use all the machinery that supports it
for the DMA list as well.

Even on 32 bit physaddr platforms scatterlist is still 24 bytes,
forcing 8 bytes for the phyr CPU list is still a net space win.

> > Mode 01 (Up to 2^48 bytes of memory on a 4k alignment)
> >   31:0 - # of order pages
> > 
> > Mode 10 (Up to 2^25 bytes of memory on a 1 byte alignment)
> >   11:0 - starting byte offset in the 4k
> >   31:12 - 20 bits, plus the 5 bit order from the first 8 bytes:
> >   length in bytes
> 
> Honestly, this looks awful to operate on.  Mandatory 8-bytes per entry
> with an optional 4 byte extension?

I expect it is. If we don't value memory efficiency then make it
simpler. A fixed 12 bytes means that the worst case is still only 24
bytes so it isn't a degradation from scatterlist.

Unfortunately 16 bytes is a degradation.

My point is the structure can hold what scatterlist holds and we can
trade some CPU power to achieve memory compression. I don't know what
the right balance is, but it suggests to me that the idea of a general
flexible array to hold 64 bit addr/length intervals is a useful
generic data structure for this problem.

> > Well, I'm not comfortable with the idea above where RDMA would have to
> > take a memory penalty to use the new interface. To avoid that memory
> > penalty we need to get rid of scatterlist entirely.
> > 
> > If we do the 16 byte struct from the first email then a umem for MRs
> > will increase in memory consumption by 160% compared today's 24
> > bytes/page. I think the HPC workloads will veto this.
> 
> Huh?  We do 16 bytes per physically contiguous range.  Then, if your
> HPC workloads use an IOMMU that can map a virtually contiguous range
> into a single sg entry, it uses 24 bytes for the entire mapping.  It
> should shrink.

An IOMMU is not common in those cases; it is slow.

So you end up with 16 bytes per entry then another 24 bytes in the
entirely redundant scatter list. That is now 40 bytes/page for typical
HPC case, and I can't see that being OK.
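
As a rough worked example of the arithmetic above (a sketch only, assuming
fully fragmented memory with one entry per 4k page and no IOMMU coalescing;
tracking_bytes() is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4k pages */

/*
 * Bytes of bookkeeping needed to track pinned_bytes of memory when every
 * 4k page needs its own entry costing bytes_per_page bytes.
 */
static uint64_t tracking_bytes(uint64_t pinned_bytes,
			       unsigned int bytes_per_page)
{
	assert(pinned_bytes % SKETCH_PAGE_SIZE == 0);
	return pinned_bytes / SKETCH_PAGE_SIZE * bytes_per_page;
}
```

Under those assumptions, 1 GiB of pinned memory costs 6 MiB of tracking at
24 bytes/page, 10 MiB at 16 + 24 = 40 bytes/page, and 2 MiB at 8 bytes/page.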

> > > I just want to delete page_link, offset and length from struct
> > > scatterlist.  Given the above sequence of calls, we're going to get
> > > sg lists that aren't chained.  They may have to be vmalloced, but
> > > they should be contiguous.
> > 
> > I don't understand that? Why would the SGL out of the iommu suddenly
> > not be chained?
> 
> Because it's being given a single set of ranges to map, instead of
> being given 512 pages at a time.

I still don't understand what you are describing here? I don't know of
any case where a struct scatterlist will be vmalloc'd not page chained
- we don't even support that??

> It would only be slow for degenerate cases where the pinned memory
> is fragmented and not contiguous.

Degenerate? This is the normal case today isn't it? I think it is for
RDMA the last time I looked. Even small allocations like < 64k were
fragmented...

> > IMHO, the scatterlist has to go away. The interface should be physr
> > list in, physr list out.
> 
> That's reproducing the bad decision of the scatterlist, only with
> a different encoding.  You end up with something like:
> 
> struct neoscat {
>   dma_addr_t dma_addr;
>   phys_addr_t phys_addr;
>   size_t dma_len;
>   size_t phys_len;
> };

This isn't what I mean at all!

I imagine a generic data structure that can hold an array of 64 bit
intervals.

The DMA map operation takes in this array that holds CPU addresses,
allocates a new array and fills it with DMA addresses and returns
that. The caller ends up with two arrays in two memory allocations.

No scatterlist required.

It is undoing the bad design of scatterlist by forcing the CPU and DMA
to be in different memory. 
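
The two-array flow described above could be sketched roughly like this in
userspace C. All names here (struct interval, struct interval_list,
toy_dma_map()) are hypothetical, and the fixed bus offset is only a
stand-in for whatever a real DMA mapping would do:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One 64 bit address/length interval (hypothetical layout). */
struct interval {
	uint64_t addr;
	uint64_t len;
};

/* A flexible array of intervals, one allocation per list. */
struct interval_list {
	size_t nr;
	struct interval iv[];
};

static struct interval_list *interval_list_alloc(size_t nr)
{
	struct interval_list *l;

	l = malloc(sizeof(*l) + nr * sizeof(l->iv[0]));
	if (l)
		l->nr = nr;
	return l;
}

/*
 * Toy "DMA map": takes the CPU list, allocates a separate list and fills
 * it with bus addresses. The CPU list is left untouched, so the caller
 * ends up with two arrays in two memory allocations.
 */
static struct interval_list *toy_dma_map(const struct interval_list *cpu,
					 uint64_t bus_offset)
{
	struct interval_list *dma = interval_list_alloc(cpu->nr);
	size_t i;

	if (!dma)
		return NULL;
	for (i = 0; i < cpu->nr; i++) {
		dma->iv[i].addr = cpu->iv[i].addr + bus_offset;
		dma->iv[i].len = cpu->iv[i].len;
	}
	return dma;
}
```

A real implementation would presumably return far fewer DMA entries than
CPU entries when an IOMMU can coalesce them; the one-to-one copy here is
just the simplest case.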

I just want to share the whole API that will have to exist to
reasonably support this flexible array of intervals data structure..

Jason


Re: [PATCH 1/2] drm/i915: Prepare for multiple GTs

2022-01-11 Thread Stimson, Dale B
Hi Andi,

On 2022-01-11 14:15:51, Andi Shyti wrote:
> 
> From: Tvrtko Ursulin 
> 
> On a multi-tile platform, each tile has its own registers + GGTT
> space, and BAR 0 is extended to cover all of them.
> 
> Up to four gts are supported in i915->gt[], with slot zero
> shadowing the existing i915->gt0 to enable source compatibility
> with legacy driver paths. A for_each_gt macro is added to iterate
> over the GTs and will be used by upcoming patches that convert
> various parts of the driver to be multi-gt aware.
> 
> Only the primary/root tile is initialized for now; the other
> tiles will be detected and plugged in by future patches once the
> necessary infrastructure is in place to handle them.
> 
> Signed-off-by: Abdiel Janulgue 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Tvrtko Ursulin 
> Signed-off-by: Matt Roper 
> Signed-off-by: Andi Shyti 
> Cc: Daniele Ceraolo Spurio 
> Cc: Joonas Lahtinen 
> Cc: Matthew Auld 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c| 139 --
>  drivers/gpu/drm/i915/gt/intel_gt.h|  14 +-
>  drivers/gpu/drm/i915/gt/intel_gt_pm.c |   9 +-
>  drivers/gpu/drm/i915/gt/intel_gt_types.h  |   7 +
>  drivers/gpu/drm/i915/i915_driver.c|  29 ++--
>  drivers/gpu/drm/i915/i915_drv.h   |   6 +
>  drivers/gpu/drm/i915/intel_memory_region.h|   3 +
>  drivers/gpu/drm/i915/intel_uncore.c   |  12 +-
>  drivers/gpu/drm/i915/intel_uncore.h   |   3 +-
>  .../gpu/drm/i915/selftests/mock_gem_device.c  |   5 +-
>  10 files changed, 185 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 298ff32c8d0c..5e062c9525f8 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -26,7 +26,8 @@
>  #include "shmem_utils.h"
>  #include "pxp/intel_pxp.h"
>  
> -void __intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private 
> *i915)
> +static void
> +__intel_gt_init_early(struct intel_gt *gt)
>  {
>   spin_lock_init(&gt->irq_lock);
>  
> @@ -46,19 +47,27 @@ void __intel_gt_init_early(struct intel_gt *gt, struct 
> drm_i915_private *i915)
>   intel_rps_init_early(&gt->rps);
>  }
>  
> +/* Preliminary initialization of Tile 0 */
>  void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
>  {
>   gt->i915 = i915;
>   gt->uncore = &i915->uncore;
> +
> + __intel_gt_init_early(gt);
>  }
>  
> -int intel_gt_probe_lmem(struct intel_gt *gt)
> +static int intel_gt_probe_lmem(struct intel_gt *gt)
>  {
>   struct drm_i915_private *i915 = gt->i915;
> + unsigned int instance = gt->info.id;
>   struct intel_memory_region *mem;
>   int id;
>   int err;
>  
> + id = INTEL_REGION_LMEM + instance;
> + if (drm_WARN_ON(&i915->drm, id >= INTEL_REGION_STOLEN_SMEM))
> + return -ENODEV;
> +
>   mem = intel_gt_setup_lmem(gt);
>   if (mem == ERR_PTR(-ENODEV))
>   mem = intel_gt_setup_fake_lmem(gt);
> @@ -73,9 +82,8 @@ int intel_gt_probe_lmem(struct intel_gt *gt)
>   return err;
>   }
>  
> - id = INTEL_REGION_LMEM;
> -
>   mem->id = id;
> + mem->instance = instance;
>  
>   intel_memory_region_set_name(mem, "local%u", mem->instance);
>  
> @@ -790,16 +798,21 @@ void intel_gt_driver_release(struct intel_gt *gt)
>   intel_gt_fini_buffer_pool(gt);
>  }
>  
> -void intel_gt_driver_late_release(struct intel_gt *gt)
> +void intel_gt_driver_late_release(struct drm_i915_private *i915)
>  {
> + struct intel_gt *gt;
> + unsigned int id;
> +
>   /* We need to wait for inflight RCU frees to release their grip */
>   rcu_barrier();
>  
> - intel_uc_driver_late_release(&gt->uc);
> - intel_gt_fini_requests(gt);
> - intel_gt_fini_reset(gt);
> - intel_gt_fini_timelines(gt);
> - intel_engines_free(gt);
> + for_each_gt(gt, i915, id) {
> + intel_uc_driver_late_release(&gt->uc);
> + intel_gt_fini_requests(gt);
> + intel_gt_fini_reset(gt);
> + intel_gt_fini_timelines(gt);
> + intel_engines_free(gt);
> + }
>  }
>  
>  /**
> @@ -908,6 +921,112 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, 
> i915_reg_t reg)
>   return intel_uncore_read_fw(gt->uncore, reg);
>  }
>  
> +static int
> +intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
> +{
> + struct drm_i915_private *i915 = gt->i915;
> + unsigned int id = gt->info.id;
> + int ret;
> +
> + if (id) {
> + struct intel_uncore_mmio_debug *mmio_debug;
> + struct intel_uncore *uncore;
> +
> + /* For multi-tile platforms BAR0 must have at least 16MB per 
> tile */
> + if (GEM_WARN_ON(pci_resource_len(to_pci_dev(i915->drm.dev), 0) <
> + (id + 1) * SZ_16M))
> + return -EINVAL;
> +
> + uncore = kzalloc(sizeof(*uncore), 

Re: Phyr Starter

2022-01-11 Thread Logan Gunthorpe



On 2022-01-11 2:25 p.m., Matthew Wilcox wrote:
> That's reproducing the bad decision of the scatterlist, only with
> a different encoding.  You end up with something like:
> 
> struct neoscat {
>   dma_addr_t dma_addr;
>   phys_addr_t phys_addr;
>   size_t dma_len;
>   size_t phys_len;
> };
> 
> and the dma_addr and dma_len are unused by all-but-the-first entry when
> you have a competent IOMMU.  We want a different data structure in and
> out, and we may as well keep using the scatterlist for the dma-map-out.

With my P2PDMA patchset, even with a competent IOMMU, we need to support
multiple dma_addr/dma_len pairs (plus the flag bit). This is required to
program IOVAs and multiple bus addresses into a single DMA transaction.

I think using the scatter list for the DMA-out side is not ideal, seeing as
we don't need the page pointers or multiple length fields and we won't
be able to change the sgl substantially given the massive amount of
existing use cases that won't go away overnight.

My hope would be along these lines:

struct phy_range {
phys_addr_t phyr_addr;
u32 phyr_len;
u32 phyr_flags;
};

struct dma_range {
dma_addr_t dmar_addr;
u32 dmar_len;
u32 dmar_flags;
};

A new GUP helper would somehow return a list of phy_range structs and
the new dma_map function would take that list and return a list of
dma_range structs. Each element in the list could represent a segment up
to 4GB, so any range longer than that would need multiple items in the
list. (Alternatively, perhaps the length could be a 64bit value and we
steal some of the top bits for flags or some such). The flags would not
only be needed by some of the use cases mentioned (FOLL_PIN or
DMA_BUS_ADDRESS) but could also support chaining these lists like SGLs
so continuous vmallocs would not be necessary for longer lists.
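
To illustrate the "multiple items for anything longer than 4GB" point, here
is a minimal userspace sketch. phy_range_split() and PHYR_MAX_SEG are
hypothetical names, phys_addr_t is a stand-in typedef, and capping each
segment at the largest 4k-aligned value that fits in a u32 is just one
possible choice:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* userspace stand-in for the kernel type */

struct phy_range {
	phys_addr_t phyr_addr;
	uint32_t phyr_len;
	uint32_t phyr_flags;
};

/* Largest 4k-aligned length that still fits in 32 bits (assumption). */
#define PHYR_MAX_SEG (UINT32_MAX & ~UINT32_C(4095))

/*
 * Split one physically contiguous range into phy_range entries.
 * Writes at most max entries to out and returns how many entries the
 * range needs in total, so a caller can size the array and retry.
 */
static size_t phy_range_split(phys_addr_t addr, uint64_t len, uint32_t flags,
			      struct phy_range *out, size_t max)
{
	size_t n = 0;

	while (len) {
		uint32_t seg = len > PHYR_MAX_SEG ? PHYR_MAX_SEG : (uint32_t)len;

		if (n < max) {
			out[n].phyr_addr = addr;
			out[n].phyr_len = seg;
			out[n].phyr_flags = flags;
		}
		n++;
		addr += seg;
		len -= seg;
	}
	return n;
}
```

With this choice a 10 GiB contiguous range needs three 16 byte entries.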

If there's a [phy|dma]_range_list struct (or some such) which contains
these range structs (per some details of Jason's suggestions) that'd be
fine by me too and would just depend on implementation details.

However, the main problem I see is a chicken and egg problem. The new
dma_map function would need to be implemented by every dma_map provider
or any driver trying to use it would need a messy fallback. Either that,
or we need a wrapper that allocates an appropriately sized SGL to pass
to any dma_map implementation that doesn't support the new structures.

Logan



Re: Phyr Starter

2022-01-11 Thread Matthew Wilcox
On Tue, Jan 11, 2022 at 04:21:59PM -0400, Jason Gunthorpe wrote:
> On Tue, Jan 11, 2022 at 06:33:57PM +, Matthew Wilcox wrote:
> 
> > > Then we are we using get_user_phyr() at all if we are just storing it
> > > in a sg?
> > 
> > I did consider just implementing get_user_sg() (actually 4 years ago),
> > but that cements the use of sg as both an input and output data structure
> > for DMA mapping, which I am under the impression we're trying to get
> > away from.
> 
> I know every time I talked about a get_user_sg() Christoph is against
> it and we need to stop using scatter list...
> 
> > > Also 16 entries is way to small, it should be at least a whole PMD
> > > worth so we don't have to relock the PMD level each iteration.
> > > 
> > > I would like to see a flow more like:
> > > 
> > >   cpu_phyr_list = get_user_phyr(uptr, 1G);
> > >   dma_phyr_list = dma_map_phyr(device, cpu_phyr_list);
> > >   [..]
> > >   dma_unmap_phyr(device, dma_phyr_list);
> > >   unpin_drity_free(cpu_phy_list);
> > > 
> > > Where dma_map_phyr() can build a temporary SGL for old iommu drivers
> > > compatability. iommu drivers would want to implement natively, of
> > > course.
> > > 
> > > ie no loops in drivers.
> > 
> > Let me just rewrite that for you ...
> > 
> > umem->phyrs = get_user_phyrs(addr, size, &umem->phyr_len);
> > umem->sgt = dma_map_phyrs(device, umem->phyrs, umem->phyr_len,
> > DMA_BIDIRECTIONAL, dma_attr);
> > ...
> > dma_unmap_phyr(device, umem->phyrs, umem->phyr_len, umem->sgt->sgl,
> > umem->sgt->nents, DMA_BIDIRECTIONAL, dma_attr);
> > sg_free_table(umem->sgt);
> > free_user_phyrs(umem->phyrs, umem->phyr_len);
> 
> Why? As above we want to get rid of the sgl, so you are telling me to
> adopt phyrs I need to increase the memory consumption by a hefty
> amount to store the phyrs and still keep the sgt now? Why?
> 
> I don't need the sgt at all. I just need another list of physical
> addresses for DMA. I see no issue with a phsr_list storing either CPU
> Physical Address or DMA Physical Addresses, same data structure.

There's a difference between a phys_addr_t and a dma_addr_t.  They
can even be different sizes; some architectures use a 32-bit dma_addr_t
and a 64-bit phys_addr_t or vice-versa.  phyr cannot store DMA addresses.

> In the fairly important passthrough DMA case the CPU list and DMA list
> are identical, so we don't even need to do anything.
> 
> In the typical iommu case my dma map's phyrs is only one entry.

That becomes a very simple sg table then.

> As an example coding - Use the first 8 bytes to encode this:
> 
>  51:0 - Physical address / 4k (ie pfn)
>  56:52 - Order (simple, your order encoding can do better)
>  61:57 - Unused
>  63:62 - Mode, one of:
>  00 = natural order pfn (8 bytes)
>  01 = order aligned with length (12 bytes)
>  10 = arbitary (12 bytes)
> 
> Then the optional 4 bytes are used as:
> 
> Mode 01 (Up to 2^48 bytes of memory on a 4k alignment)
>   31:0 - # of order pages
> 
> Mode 10 (Up to 2^25 bytes of memory on a 1 byte alignment)
>   11:0 - starting byte offset in the 4k
>   31:12 - 20 bits, plus the 5 bit order from the first 8 bytes:
>   length in bytes

Honestly, this looks awful to operate on.  Mandatory 8-bytes per entry
with an optional 4 byte extension?

> > > The last case is, perhaps, a possible route to completely replace
> > > scatterlist. Few places need true byte granularity for interior pages,
> > > so we can invent some coding to say 'this is 8 byte aligned, and n
> > > bytes long' that only fits < 4k or something. Exceptional cases can
> > > then still work. I'm not sure what block needs here - is it just 512?
> > 
> > Replacing scatterlist is not my goal.  That seems like a lot more work
> > for little gain.  
> 
> Well, I'm not comfortable with the idea above where RDMA would have to
> take a memory penalty to use the new interface. To avoid that memory
> penalty we need to get rid of scatterlist entirely.
> 
> If we do the 16 byte struct from the first email then a umem for MRs
> will increase in memory consumption by 160% compared today's 24
> bytes/page. I think the HPC workloads will veto this.

Huh?  We do 16 bytes per physically contiguous range.  Then, if your HPC
workloads use an IOMMU that can map a virtually contiguous range
into a single sg entry, it uses 24 bytes for the entire mapping.
It should shrink.

> > I just want to delete page_link, offset and length from struct
> > scatterlist.  Given the above sequence of calls, we're going to get
> > sg lists that aren't chained.  They may have to be vmalloced, but
> > they should be contiguous.
> 
> I don't understand that? Why would the SGL out of the iommu suddenly
> not be chained?

Because it's being given a single set of ranges to map, instead of
being given 512 pages at a time.

> >From what I've heard I'm also not keen on a physr list using vmalloc
> either, that is said to be quite slow?

I

Re: [PATCH] drm/doc: overview before functions for drm_writeback.c

2022-01-11 Thread Daniel Vetter
On Tue, Jan 11, 2022 at 10:35:22PM +0200, Laurent Pinchart wrote:
> Hi Dan,
> 
> Thank you for the patch.
> 
> On Tue, Jan 11, 2022 at 09:27:14PM +0100, Daniel Vetter wrote:
> > Otherwise it's really hard to link to that, which I realized when I
> > wanted to link to the property definitions for a question on irc.
> > 
> > Fix it.
> > 
> > Fixes: e2d7fc20b3e2 ("drm/writeback: wire drm_writeback.h to kernel-doc")
> > Cc: Sam Ravnborg 
> > Cc: Daniel Vetter 
> > Cc: Laurent Pinchart 
> > Cc: Brian Starkey 
> > Cc: Liviu Dudau 
> > Signed-off-by: Daniel Vetter 
> 
> Reviewed-by: Laurent Pinchart 

Thanks for the quick rb, patch pushed.
-Daniel

> 
> > ---
> >  Documentation/gpu/drm-kms.rst | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> > index d14bf1c35d7e..6f9c064fd323 100644
> > --- a/Documentation/gpu/drm-kms.rst
> > +++ b/Documentation/gpu/drm-kms.rst
> > @@ -423,12 +423,12 @@ Connector Functions Reference
> >  Writeback Connectors
> >  
> >  
> > -.. kernel-doc:: include/drm/drm_writeback.h
> > -  :internal:
> > -
> >  .. kernel-doc:: drivers/gpu/drm/drm_writeback.c
> >:doc: overview
> >  
> > +.. kernel-doc:: include/drm/drm_writeback.h
> > +  :internal:
> > +
> >  .. kernel-doc:: drivers/gpu/drm/drm_writeback.c
> >:export:
> >  
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Unable to unselect VGA_ARB (VGA Arbitration)

2022-01-11 Thread Randy Dunlap
Hi Paul,

On 1/11/22 12:28, Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> I am using Linux 5.16, and I am unable to unset `VGA_ARB` in Kconfig (`make 
> menuconfig`). I have an Asus F2A85-M PRO with an AMD A6-6400K APU (integrated 
> Radeon graphics device), so no legacy stuff.
> 
> From `drivers/gpu/vga/Kconfig`:
> 
> ```
> config VGA_ARB
>     bool "VGA Arbitration" if EXPERT

You can modify VGA_ARB if you set ^^ "EXPERT".

>     default y
>     depends on (PCI && !S390)
>     help
>   […]
> 
> config VGA_ARB_MAX_GPUS
>     int "Maximum number of GPUs"
>     default 16
>     depends on VGA_ARB
>     help
>   […]
> 
> config VGA_SWITCHEROO
>     bool "Laptop Hybrid Graphics - GPU switching support"
>     depends on X86
>     depends on ACPI
>     depends on PCI
>     depends on (FRAMEBUFFER_CONSOLE=n || FB=y)
>     select VGA_ARB
>     help
>   […]
> ```
> 
> But in `make menuconfig` I am unable to unselect it.
> 
>     -*- VGA Arbitration
> 
> and the help says:
> 
>     Symbol: VGA_ARB [=y]
>     Type  : bool
>   Depends on: HAS_IOMEM [=y] && PCI [=y] && !S390
>   Visible if: HAS_IOMEM [=y] && PCI [=y] && !S390 && EXPERT [=n]
>   Location:
>     Main menu
>  -> Device Drivers
>    -> Graphics support
>     Selected by [n]:
>   - VGA_SWITCHEROO [=n] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y] && PCI 
> [=y] && (!FRAMEBUFFER_CONSOLE [=y] || FB [=y]=y)
> 
> So, VGA_SWITCHEROO is not set, and, therefore, as `Selected by [n]:` 
> suggests, I thought I’d be able to deselect it.
> 
> It’d be great if you could help me out.
> 
> 
> Kind regards,
> 
> Paul

-- 
~Randy


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-11 Thread Linus Torvalds
On Tue, Jan 11, 2022 at 7:38 AM Harry Wentland  wrote:
>
> Attached is a v2 of the buggy patch that should get this right.
> If you have a chance to try it out let us know

I can confirm that I do not see the horribly flickering behavior with
this patch.

I didn't look at what the actual differences were from the one I
reverted, but at least on my machine this version works.

Linus


Unable to unselect VGA_ARB (VGA Arbitration)

2022-01-11 Thread Paul Menzel

Dear Linux folks,


I am using Linux 5.16, and I am unable to unset `VGA_ARB` in Kconfig 
(`make menuconfig`). I have an Asus F2A85-M PRO with an AMD A6-6400K APU 
(integrated Radeon graphics device), so no legacy stuff.


From `drivers/gpu/vga/Kconfig`:

```
config VGA_ARB
bool "VGA Arbitration" if EXPERT
default y
depends on (PCI && !S390)
help
  […]

config VGA_ARB_MAX_GPUS
int "Maximum number of GPUs"
default 16
depends on VGA_ARB
help
  […]

config VGA_SWITCHEROO
bool "Laptop Hybrid Graphics - GPU switching support"
depends on X86
depends on ACPI
depends on PCI
depends on (FRAMEBUFFER_CONSOLE=n || FB=y)
select VGA_ARB
help
  […]
```

But in `make menuconfig` I am unable to unselect it.

-*- VGA Arbitration

and the help says:

Symbol: VGA_ARB [=y]
Type  : bool
  Depends on: HAS_IOMEM [=y] && PCI [=y] && !S390
  Visible if: HAS_IOMEM [=y] && PCI [=y] && !S390 && EXPERT [=n]
  Location:
Main menu
 -> Device Drivers
   -> Graphics support
Selected by [n]:
  - VGA_SWITCHEROO [=n] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y] 
&& PCI [=y] && (!FRAMEBUFFER_CONSOLE [=y] || FB [=y]=y)


So, VGA_SWITCHEROO is not set, and, therefore, as `Selected by [n]:` 
suggests, I thought I’d be able to deselect it.


It’d be great if you could help me out.


Kind regards,

Paul


Re: [PATCH] drm/doc: overview before functions for drm_writeback.c

2022-01-11 Thread Laurent Pinchart
Hi Dan,

Thank you for the patch.

On Tue, Jan 11, 2022 at 09:27:14PM +0100, Daniel Vetter wrote:
> Otherwise it's really hard to link to that, which I realized when I
> wanted to link to the property definitions for a question on irc.
> 
> Fix it.
> 
> Fixes: e2d7fc20b3e2 ("drm/writeback: wire drm_writeback.h to kernel-doc")
> Cc: Sam Ravnborg 
> Cc: Daniel Vetter 
> Cc: Laurent Pinchart 
> Cc: Brian Starkey 
> Cc: Liviu Dudau 
> Signed-off-by: Daniel Vetter 

Reviewed-by: Laurent Pinchart 

> ---
>  Documentation/gpu/drm-kms.rst | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> index d14bf1c35d7e..6f9c064fd323 100644
> --- a/Documentation/gpu/drm-kms.rst
> +++ b/Documentation/gpu/drm-kms.rst
> @@ -423,12 +423,12 @@ Connector Functions Reference
>  Writeback Connectors
>  
>  
> -.. kernel-doc:: include/drm/drm_writeback.h
> -  :internal:
> -
>  .. kernel-doc:: drivers/gpu/drm/drm_writeback.c
>:doc: overview
>  
> +.. kernel-doc:: include/drm/drm_writeback.h
> +  :internal:
> +
>  .. kernel-doc:: drivers/gpu/drm/drm_writeback.c
>:export:
>  

-- 
Regards,

Laurent Pinchart


[PATCH] drm/doc: overview before functions for drm_writeback.c

2022-01-11 Thread Daniel Vetter
Otherwise it's really hard to link to that, which I realized when I
wanted to link to the property definitions for a question on irc.

Fix it.

Fixes: e2d7fc20b3e2 ("drm/writeback: wire drm_writeback.h to kernel-doc")
Cc: Sam Ravnborg 
Cc: Daniel Vetter 
Cc: Laurent Pinchart 
Cc: Brian Starkey 
Cc: Liviu Dudau 
Signed-off-by: Daniel Vetter 
---
 Documentation/gpu/drm-kms.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
index d14bf1c35d7e..6f9c064fd323 100644
--- a/Documentation/gpu/drm-kms.rst
+++ b/Documentation/gpu/drm-kms.rst
@@ -423,12 +423,12 @@ Connector Functions Reference
 Writeback Connectors
 
 
-.. kernel-doc:: include/drm/drm_writeback.h
-  :internal:
-
 .. kernel-doc:: drivers/gpu/drm/drm_writeback.c
   :doc: overview
 
+.. kernel-doc:: include/drm/drm_writeback.h
+  :internal:
+
 .. kernel-doc:: drivers/gpu/drm/drm_writeback.c
   :export:
 
-- 
2.33.0



Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 10:05:40AM +0100, Daniel Vetter wrote:

> If we go with page size I think hardcoding a PHYS_PAGE_SIZE KB(4)
> would make sense, because thanks to x86 that's pretty much the lowest
> common denominator that all hw (I know of at least) supports. Not
> having to fiddle with "which page size do we have" in driver code
> would be neat. It makes writing portable gup code in drivers just
> needlessly silly.

What I did in RDMA was make an iterator rdma_umem_for_each_dma_block()

The driver passes in the page size it wants and the iterator breaks up
the SGL into that size.

So, e.g., on a 16k page size system the SGL would be full of 16K stuff,
but the driver only supports 4k and so the iterator hands out 4 pages
for each SGL entry.

All the drivers use this to build their DMA lists and tables, it works
really well.
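
A simplified userspace sketch of that idea follows. It is not the actual
rdma_umem_for_each_dma_block() implementation; struct dma_seg, struct
block_iter and the two helpers are hypothetical, and every segment is
assumed to be aligned to, and a multiple of, the requested page size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One mapped segment, standing in for an SGL entry. */
struct dma_seg {
	uint64_t addr;
	uint64_t len;
};

struct block_iter {
	const struct dma_seg *segs;
	size_t nsegs;
	size_t idx;	/* current segment */
	uint64_t off;	/* byte offset into the current segment */
	uint64_t pgsz;	/* block size requested by the driver */
};

static void block_iter_start(struct block_iter *it, const struct dma_seg *segs,
			     size_t nsegs, uint64_t pgsz)
{
	assert(pgsz != 0);
	it->segs = segs;
	it->nsegs = nsegs;
	it->idx = 0;
	it->off = 0;
	it->pgsz = pgsz;
}

/* Stores the next block address in *addr and returns 1, or 0 when done. */
static int block_iter_next(struct block_iter *it, uint64_t *addr)
{
	/* Skip over segments that have been fully handed out. */
	while (it->idx < it->nsegs && it->off >= it->segs[it->idx].len) {
		it->idx++;
		it->off = 0;
	}
	if (it->idx == it->nsegs)
		return 0;

	*addr = it->segs[it->idx].addr + it->off;
	it->off += it->pgsz;
	return 1;
}
```

Two 16K segments walked with pgsz = 4096 yield eight block addresses, four
per segment, with no page size logic in the caller.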

The other part is that most RDMA drivers support many page sizes, so
there is another API to inspect the SGL and take in the device's page
size support and compute what page size the driver should use.

> - I think minimally an sg list form of dma-mapped stuff which does not
> have a struct page, iirc last time we discussed that we agreed that
> this really needs to be part of such a rework or it's not really
> improving things much

Yes, this seems important..

> - a few per-entry driver bits would be nice in both the phys/dma
> chains, if we can have them. gpus have funny gpu interconnects, this
> would allow us to put all the gpu addresses into dma_addr_t if we can
> have some bits indicating whether it's on the pci bus, gpu local
> memory or the gpu<->gpu interconnect.

It seems useful, see my other email for a suggested coding..

Jason 


Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 06:33:57PM +, Matthew Wilcox wrote:

> > Then we are we using get_user_phyr() at all if we are just storing it
> > in a sg?
> 
> I did consider just implementing get_user_sg() (actually 4 years ago),
> but that cements the use of sg as both an input and output data structure
> for DMA mapping, which I am under the impression we're trying to get
> away from.

I know that every time I have talked about a get_user_sg(), Christoph has
been against it, and we need to stop using scatterlist...

> > Also 16 entries is way to small, it should be at least a whole PMD
> > worth so we don't have to relock the PMD level each iteration.
> > 
> > I would like to see a flow more like:
> > 
> >   cpu_phyr_list = get_user_phyr(uptr, 1G);
> >   dma_phyr_list = dma_map_phyr(device, cpu_phyr_list);
> >   [..]
> >   dma_unmap_phyr(device, dma_phyr_list);
> >   unpin_drity_free(cpu_phy_list);
> > 
> > Where dma_map_phyr() can build a temporary SGL for old iommu drivers
> > compatability. iommu drivers would want to implement natively, of
> > course.
> > 
> > ie no loops in drivers.
> 
> Let me just rewrite that for you ...
> 
>   umem->phyrs = get_user_phyrs(addr, size, &umem->phyr_len);
>   umem->sgt = dma_map_phyrs(device, umem->phyrs, umem->phyr_len,
>   DMA_BIDIRECTIONAL, dma_attr);
>   ...
>   dma_unmap_phyr(device, umem->phyrs, umem->phyr_len, umem->sgt->sgl,
>   umem->sgt->nents, DMA_BIDIRECTIONAL, dma_attr);
>   sg_free_table(umem->sgt);
>   free_user_phyrs(umem->phyrs, umem->phyr_len);

Why? As above we want to get rid of the sgl, so you are telling me that to
adopt phyrs I need to increase the memory consumption by a hefty
amount to store the phyrs and still keep the sgt now? Why?

I don't need the sgt at all. I just need another list of physical
addresses for DMA. I see no issue with a phsr_list storing either CPU
Physical Address or DMA Physical Addresses, same data structure.

In the fairly important passthrough DMA case the CPU list and DMA list
are identical, so we don't even need to do anything.

In the typical iommu case my dma map's phyrs is only one entry.

Other cases require a larger allocation. This is the advantage against
today's scatterlist - it forces 24 bytes/page for *everyone* to
support niche architectures even if 8 bytes would have been fine for a
server platform.

> > > The question is whether this is the right kind of optimisation to be
> > > doing.  I hear you that we want a dense format, but it's questionable
> > > whether the kind of thing you're suggesting is actually denser than this
> > > scheme.  For example, if we have 1GB pages and userspace happens to have
> > > allocated pages (3, 4, 5, 6, 7, 8, 9, 10) then this can be represented
> > > as a single phyr.  A power-of-two scheme would have us use four entries
> > > (3, 4-7, 8-9, 10).
> > 
> > That is not quite what I had in mind..
> > 
> > struct phyr_list {
> >unsigned int first_page_offset_bytes;
> >size_t total_length_bytes;
> >phys_addr_t min_alignment;
> >struct packed_phyr *list_of_pages;
> > };
> > 
> > Where each 'packed_phyr' is an aligned page of some kind. The packing
> > has to be able to represent any number of pfns, so we have four major
> > cases:
> >  - 4k pfns (use 8 bytes)
> >  - Natural order pfn (use 8 bytes)
> >  - 4k aligned pfns, arbitary number (use 12 bytes)
> >  - <4k aligned, arbitary length (use 16 bytes?)
> > 
> > In all cases the interior pages are fully used, only the first and
> > last page is sliced based on the two parameters in the phyr_list.
> 
> This kind of representation works for a virtually contiguous range.
> Unfortunately, that's not sufficient for some bio users (as I discovered
> after getting a representation like this enshrined in the NVMe spec as
> the PRP List).

This is what I was trying to convey with the 4th bullet, I'm not
suggesting a PRP list.

As an example coding - Use the first 8 bytes to encode this:

 51:0 - Physical address / 4k (ie pfn)
 56:52 - Order (simple, your order encoding can do better)
 61:57 - Unused
 63:62 - Mode, one of:
 00 = natural order pfn (8 bytes)
 01 = order aligned with length (12 bytes)
 10 = arbitrary (12 bytes)

Then the optional 4 bytes are used as:

Mode 01 (Up to 2^48 bytes of memory on a 4k alignment)
  31:0 - # of order pages

Mode 10 (Up to 2^25 bytes of memory on a 1 byte alignment)
  11:0 - starting byte offset in the 4k
  31:12 - 20 bits, plus the 5 bit order from the first 8 bytes:
  length in bytes

I think this covers everything? I assume BIO cannot be doing
non-aligned contiguous transfers beyond 2M? The above can represent
33M of arbitrary contiguous memory at 12 bytes/page.

If BIO really needs > 33M then we can use the extra mode to define a
16 byte entry that will cover everything.
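
A minimal sketch of how the example coding above might be packed and
unpacked. All names are hypothetical. The bit positions follow the layout
given above, but treating the 5 bit order field as the upper bits of the
25 bit length in mode 10 is an assumption, since the text does not say
which end those bits occupy:

```c
#include <assert.h>
#include <stdint.h>

#define PHYR_PFN_MASK		((UINT64_C(1) << 52) - 1)	/* bits 51:0 */
#define PHYR_ORDER_SHIFT	52				/* bits 56:52 */
#define PHYR_ORDER_MASK		UINT64_C(0x1f)
#define PHYR_MODE_SHIFT		62				/* bits 63:62 */

enum phyr_mode {
	PHYR_MODE_NATURAL = 0,		/* 00: natural order pfn, 8 bytes */
	PHYR_MODE_ORDER_LEN = 1,	/* 01: order aligned with length, 12 bytes */
	PHYR_MODE_ARBITRARY = 2,	/* 10: arbitrary, 12 bytes */
};

static inline uint64_t phyr_pack(uint64_t pfn, unsigned int order,
				 enum phyr_mode mode)
{
	assert(pfn <= PHYR_PFN_MASK);
	assert(order <= PHYR_ORDER_MASK);
	return pfn | ((uint64_t)order << PHYR_ORDER_SHIFT) |
	       ((uint64_t)mode << PHYR_MODE_SHIFT);
}

static inline uint64_t phyr_pfn(uint64_t e)
{
	return e & PHYR_PFN_MASK;
}

static inline unsigned int phyr_order(uint64_t e)
{
	return (unsigned int)((e >> PHYR_ORDER_SHIFT) & PHYR_ORDER_MASK);
}

static inline enum phyr_mode phyr_get_mode(uint64_t e)
{
	return (enum phyr_mode)(e >> PHYR_MODE_SHIFT);
}

/* Mode 00: the entry covers 4k << order bytes. */
static inline uint64_t phyr_natural_len(uint64_t e)
{
	return UINT64_C(4096) << phyr_order(e);
}

/* Mode 10: 12 bit offset plus 25 bit length split across both words. */
static inline void phyr_pack_arbitrary(uint64_t pfn, uint32_t offset,
				       uint32_t len, uint64_t *entry,
				       uint32_t *ext)
{
	assert(offset < 4096 && len < (UINT32_C(1) << 25));
	*entry = phyr_pack(pfn, len >> 20, PHYR_MODE_ARBITRARY);
	*ext = ((len & 0xfffff) << 12) | offset;
}

static inline uint32_t phyr_arbitrary_len(uint64_t entry, uint32_t ext)
{
	return ((uint32_t)phyr_order(entry) << 20) | (ext >> 12);
}

static inline uint32_t phyr_arbitrary_offset(uint32_t ext)
{
	return ext & 0xfff;
}
```

This shows the CPU cost side of the trade: every access goes through a
mode check plus a few shifts and masks instead of a plain struct load.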

> > The last case is, perhaps, a possible route to completely replace
> > scatterlist. Few places need true byte granularity for interior pag

[PATCH v8 6/6] drm/amdgpu: add drm buddy support to amdgpu

2022-01-11 Thread Arunpravin
- Remove drm_mm references and replace with drm buddy functionalities
- Add res cursor support for drm buddy

v2(Matthew Auld):
  - replace spinlock with mutex as we call kmem_cache_zalloc
(..., GFP_KERNEL) in drm_buddy_alloc() function

  - lock drm_buddy_block_trim() function as it calls
mark_free/mark_split are all globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Kconfig   |   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 259 ++
 4 files changed, 230 insertions(+), 133 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index b85f7ffae621..572fcc4adedd 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -270,6 +270,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..da12b4ff2e45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= amdgpu_node_size(block)) {
+   start -= amdgpu_node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct drm_buddy_block, link);
+   }
+
+   cur->start = amdgpu_node_start(block) + start;
+   cur->size = min(amdgpu_node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+   struct drm_mm_node *node;
+   struct list_head *next;
 
BUG_ON(size > 

[PATCH v8 5/6] drm/amdgpu: move vram inline functions into a header

2022-01-11 Thread Arunpravin
Move shared vram inline functions and structs
into a header file

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 51 
 1 file changed, 51 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..59983464cce5
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+static inline u64 amdgpu_node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline u64 amdgpu_node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+#endif
-- 
2.25.1
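
The helpers in this header reduce to two small idioms: an order-n buddy block spans 2^n pages, and a pointer to an embedded base object can be converted back to its container. The following user-space sketch illustrates both. It is not the kernel code: the 4 KiB page size and all sketch_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical stand-in for struct ttm_resource. */
struct sketch_resource {
	uint32_t mem_type;
};

/* Hypothetical stand-in for struct amdgpu_vram_mgr_node. */
struct sketch_vram_node {
	struct sketch_resource base;	/* embedded base object */
	unsigned long flags;
};

/* Same idea as amdgpu_node_size(): an order-n block covers 2^n pages. */
static uint64_t sketch_node_size(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}

/*
 * Same idea as to_amdgpu_vram_mgr_node(): step back from the embedded
 * member to the structure that contains it.
 */
static struct sketch_vram_node *sketch_to_vram_node(struct sketch_resource *res)
{
	return (struct sketch_vram_node *)((char *)res -
			offsetof(struct sketch_vram_node, base));
}
```

With these assumptions an order-0 block is 4096 bytes and an order-3 block is 32768 bytes.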



[PATCH v8 4/6] drm: implement a method to free unused pages

2022-01-11 Thread Arunpravin
On contiguous allocation, we round up the size
to the *next* power of 2. Implement a function
to free the unused pages after the newly allocated block.
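
To make the rounding and trimming concrete, here is a small user-space sketch in C. It is not the kernel implementation: it assumes sizes are counted in pages and that a block is halved until each piece lies wholly inside or wholly outside the range to keep. sketch_roundup_pow2() and sketch_trim() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n, for n >= 1 (sizes in pages). */
static uint64_t sketch_roundup_pow2(uint64_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Trim the power-of-two block [start, start + size) so that only the
 * pages below keep_end stay allocated. Pieces that straddle keep_end
 * are split into two buddies and examined again. The counters receive
 * the number of pages kept and the number of pages freed.
 */
static void sketch_trim(uint64_t start, uint64_t size, uint64_t keep_end,
			uint64_t *kept, uint64_t *freed)
{
	uint64_t end = start + size;

	if (end <= keep_end) {		/* fully inside the kept range */
		*kept += size;
		return;
	}
	if (start >= keep_end) {	/* fully outside: give it back */
		*freed += size;
		return;
	}

	/* Straddles the boundary: split into two halves and recurse. */
	sketch_trim(start, size / 2, keep_end, kept, freed);
	sketch_trim(start + size / 2, size / 2, keep_end, kept, freed);
}
```

For a 5-page request the size is rounded up to 8 pages. Trimming back to 5 pages keeps a 4-page and a 1-page block and frees a 1-page and a 2-page block.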

v2(Matthew Auld):
  - replace function name 'drm_buddy_free_unused_pages' with
drm_buddy_block_trim
  - replace input argument name 'actual_size' with 'new_size'
  - add more validation checks for input arguments
  - add overlaps check to avoid needless searching and splitting
  - merged the below patch to see the feature in action
 - add free unused pages support to i915 driver
  - lock drm_buddy_block_trim() function as it calls mark_free/mark_split,
    which are all globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

v4:
  - in case of trim, at __alloc_range() split_block failure path
marks the block as free and removes it from the original list,
potentially also freeing it, to overcome this problem, we turn
the drm_buddy_block_trim() input node into a temporary node to
prevent recursively freeing itself, but still retain the
un-splitting/freeing of the other nodes(Matthew Auld)

  - modify the drm_buddy_block_trim() function return type

v5(Matthew Auld):
  - revert drm_buddy_block_trim() function return type changes in v4
  - change the argument passed to drm_buddy_block_trim() from n_pages to
    original_size, as n_pages has already been rounded up to the next
    power-of-two and passing it results in a no-op

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 65 +++
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +++
 include/drm/drm_buddy.h   |  4 ++
 3 files changed, 79 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 4cc6e88d8e0d..7ba7123c8aec 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -540,6 +540,71 @@ static int __drm_buddy_alloc_range(struct drm_buddy *mm,
return __alloc_range(mm, &dfs, start, size, blocks);
 }
 
+/**
+ * drm_buddy_block_trim - free unused pages
+ *
+ * @mm: DRM buddy manager
+ * @new_size: original size requested
+ * @blocks: output list head to add allocated blocks
+ *
+ * For contiguous allocation, we round up the size to the nearest
+ * power of two value. Drivers consume the *actual* size, so the
+ * remaining portion is unused and can be freed.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_block_trim(struct drm_buddy *mm,
+u64 new_size,
+struct list_head *blocks)
+{
+   struct drm_buddy_block *parent;
+   struct drm_buddy_block *block;
+   LIST_HEAD(dfs);
+   u64 new_start;
+   int err;
+
+   if (!list_is_singular(blocks))
+   return -EINVAL;
+
+   block = list_first_entry(blocks,
+struct drm_buddy_block,
+link);
+
+   if (!drm_buddy_block_is_allocated(block))
+   return -EINVAL;
+
+   if (new_size > drm_buddy_block_size(mm, block))
+   return -EINVAL;
+
+   if (!new_size || !IS_ALIGNED(new_size, mm->chunk_size))
+   return -EINVAL;
+
+   if (new_size == drm_buddy_block_size(mm, block))
+   return 0;
+
+   list_del(&block->link);
+   mark_free(mm, block);
+   mm->avail += drm_buddy_block_size(mm, block);
+
+   /* Prevent recursively freeing this node */
+   parent = block->parent;
+   block->parent = NULL;
+
+   new_start = drm_buddy_block_offset(block);
+   list_add(&block->tmp_link, &dfs);
+   err =  __alloc_range(mm, &dfs, new_start, new_size, blocks);
+   if (err) {
+   mark_allocated(block);
+   mm->avail -= drm_buddy_block_size(mm, block);
+   list_add(&block->link, blocks);
+   }
+
+   block->parent = parent;
+   return err;
+}
+EXPORT_SYMBOL(drm_buddy_block_trim);
+
 /**
  * drm_buddy_alloc_blocks - allocate power-of-two blocks
  *
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index ae9201246bb5..626108fb9725 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -97,6 +97,16 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man,
if (unlikely(err))
goto err_free_blocks;
 
+   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+   u64 original_size = (u64)bman_res->base.num_pages << PAGE_SHIFT;
+
+   mutex_lock(&bman->lock);
+   drm_buddy_block_trim(mm,
+   original_size,
+   &bman_res->blocks);
+   mutex_unlock(&bman->lock);
+   }
+
*res = &bman_res->base;
return 0;
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 424fc443115e..17c

[PATCH v8 3/6] drm: implement top-down allocation method

2022-01-11 Thread Arunpravin
Implement a function which walks through the order list,
compares the offsets and returns the block with the maximum
offset. This method is unpredictable in obtaining the high
range address blocks, as the result depends on the allocation
and deallocation history. For instance, if a driver requests an
address in a specific low range, the allocator traverses from
the root block and splits the larger blocks until it reaches
the specific block. In the process of splitting, the lower
orders in the freelist are populated with low range address
blocks, and for a subsequent TOPDOWN memory request we may
return the low range blocks. To overcome this issue, we may go
with the below approach.

The other approach is to sort the entries of each order list
in ascending order, compare the last entry of each order
list in the freelist and return the max block.
This creates sorting overhead on every drm_buddy_free()
request and splits up larger blocks for a single page
request.
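
The first approach described above, scanning one unsorted free list for the block with the highest offset, can be sketched in user-space C as follows. This is a simplified illustration rather than the kernel code: a singly linked list stands in for the kernel's list_head, and struct sketch_block and sketch_get_maxblock() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified free-list entry. */
struct sketch_block {
	uint64_t offset;		/* start address of the block */
	struct sketch_block *next;	/* next entry in the same order list */
};

/*
 * Walk one free list and return the entry with the largest offset,
 * or NULL if the list is empty. The cost is linear in the list length
 * because the list is not kept sorted.
 */
static struct sketch_block *sketch_get_maxblock(struct sketch_block *head)
{
	struct sketch_block *max_block = head;
	struct sketch_block *node;

	if (!head)
		return NULL;

	for (node = head->next; node; node = node->next) {
		if (node->offset > max_block->offset)
			max_block = node;
	}

	return max_block;
}
```

The linear scan avoids the sorting cost on every free that the second approach would incur, at the price of a walk on every top-down allocation.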

v2:
  - Fix alignment issues(Matthew Auld)
  - Remove unnecessary list_empty check(Matthew Auld)
  - merged the below patch to see the feature in action
 - add top-down alloc support to i915 driver

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 36 ---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  3 ++
 include/drm/drm_buddy.h   |  1 +
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 717c37ae8a32..4cc6e88d8e0d 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -369,6 +369,26 @@ alloc_range_bias(struct drm_buddy *mm,
return ERR_PTR(err);
 }
 
+static struct drm_buddy_block *
+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }
+
+   return max_block;
+}
+
 static struct drm_buddy_block *
 alloc_from_freelist(struct drm_buddy *mm,
unsigned int order,
@@ -379,11 +399,17 @@ alloc_from_freelist(struct drm_buddy *mm,
int err;
 
for (i = order; i <= mm->max_order; ++i) {
-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   block = get_maxblock(&mm->free_list[i]);
+   if (block)
+   break;
+   } else {
+   block = list_first_entry_or_null(&mm->free_list[i],
+struct drm_buddy_block,
+link);
+   if (block)
+   break;
+   }
}
 
if (!block)
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 143657a706ae..ae9201246bb5 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man,
INIT_LIST_HEAD(&bman_res->blocks);
bman_res->mm = mm;
 
+   if (place->flags & TTM_PL_FLAG_TOPDOWN)
+   bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
if (place->fpfn || lpfn != man->size)
bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 865664b90a8a..424fc443115e 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
 })
 
 #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0)
+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-- 
2.25.1



[PATCH v8 1/6] drm: move the buddy allocator from i915 into common drm

2022-01-11 Thread Arunpravin
Move the base i915 buddy allocator code into drm
- Move i915_buddy.h to include/drm
- Move i915_buddy.c to drm root folder
- Rename "i915" string with "drm" string wherever applicable
- Rename "I915" string with "DRM" string wherever applicable
- Fix header file dependencies
- Fix alignment issues
- add Makefile support for drm buddy
- export functions and write kerneldoc description
- Remove i915 selftest config check condition as buddy selftest
  will be moved to drm selftest folder

cleanup i915 buddy references in i915 driver module
and replace with drm buddy

v2:
  - include header file in alphabetical order(Thomas)
  - merged changes listed in the body section into a single patch
to keep the build intact(Christian, Jani)

v3:
  - make drm buddy a separate module(Thomas, Christian)

v4:
  - Fix build error reported by kernel test robot 
  - removed i915 buddy selftest from i915_mock_selftests.h to
avoid build error
  - removed selftests/i915_buddy.c file as we create a new set of
buddy test cases in drm/selftests folder

v5:
  - Fix merge conflict issue

v6:
  - replace drm_buddy_mm structure name as drm_buddy(Thomas, Christian)
  - replace drm_buddy_alloc() function name as drm_buddy_alloc_blocks()
(Thomas)
  - replace drm_buddy_free() function name as drm_buddy_free_block()
(Thomas)
  - export drm_buddy_free_block() function
  - fix multiple instances of KMEM_CACHE() entry

v7:
  - fix warnings reported by kernel test robot 
  - modify the license(Christian)

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Kconfig   |   6 +
 drivers/gpu/drm/Makefile  |   2 +
 drivers/gpu/drm/drm_buddy.c   | 535 
 drivers/gpu/drm/i915/Kconfig  |   1 +
 drivers/gpu/drm/i915/Makefile |   1 -
 drivers/gpu/drm/i915/i915_buddy.c | 466 ---
 drivers/gpu/drm/i915/i915_buddy.h | 143 
 drivers/gpu/drm/i915/i915_module.c|   3 -
 drivers/gpu/drm/i915/i915_scatterlist.c   |  11 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  33 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   4 +-
 drivers/gpu/drm/i915/selftests/i915_buddy.c   | 787 --
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
 .../drm/i915/selftests/intel_memory_region.c  |  13 +-
 include/drm/drm_buddy.h   | 150 
 15 files changed, 725 insertions(+), 1431 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 delete mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
 create mode 100644 include/drm/drm_buddy.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index b1f22e457fd0..b85f7ffae621 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -198,6 +198,12 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
 
+config DRM_BUDDY
+   tristate
+   depends on DRM
+   help
+ A page based buddy allocator
+
 config DRM_VRAM_HELPER
tristate
depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 301a44dc18e3..ff0286eca254 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -42,6 +42,8 @@ obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
 drm_shmem_helper-y := drm_gem_shmem_helper.o
 obj-$(CONFIG_DRM_GEM_SHMEM_HELPER) += drm_shmem_helper.o
 
+obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
+
 drm_vram_helper-y := drm_gem_vram_helper.o
 obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
 
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
new file mode 100644
index ..9f4d929995b2
--- /dev/null
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -0,0 +1,535 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+static struct kmem_cache *slab_blocks;
+
+static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm,
+  struct drm_buddy_block *parent,
+  unsigned int order,
+  u64 offset)
+{
+   struct drm_buddy_block *block;
+
+   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
+
+   block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
+   if (!block)
+   return NULL;
+
+   block->header = offset;
+   block->header |= order;
+   block->parent = parent;
+
+   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
+   return block;
+}
+
+static void drm_block_free(struct drm_buddy *mm,
+  struct drm_buddy_block *block)
+{
+   kmem_cache_free(slab_blocks, block);
+}
+
+static void mark_allocated(struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   bl

[PATCH v8 2/6] drm: improve drm_buddy_alloc function

2022-01-11 Thread Arunpravin
- Make drm_buddy_alloc a single function to handle
  range allocation and non-range allocation demands

- Implemented a new function alloc_range() which allocates
  the requested power-of-two block in compliance with range limitations

- Moved order computation and memory alignment logic from
  i915 driver to drm buddy
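
The range-limited allocation in the list above depends on two interval tests over inclusive [start, end] bounds. The sketch below mirrors the overlaps() and contains() helpers from the diff in plain user-space C; the sketch_ prefix and the example addresses are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the inclusive ranges [s1, e1] and [s2, e2] share any address. */
static bool sketch_overlaps(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= e2 && e1 >= s2;
}

/* True if the inclusive range [s1, e1] fully covers [s2, e2]. */
static bool sketch_contains(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= s2 && e1 >= e2;
}
```

Because both ends are inclusive, the caller converts an exclusive end into an inclusive one first, which is why the diff computes end - 1 and block_start + size - 1 before calling these helpers. Two adjacent 4 KiB blocks, [0, 4095] and [4096, 8191], therefore do not overlap.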

v2:
  merged below changes to keep the build unbroken
   - drm_buddy_alloc_range() becomes obsolete and may be removed
   - enable ttm range allocation (fpfn / lpfn) support in i915 driver
   - apply enhanced drm_buddy_alloc() function to i915 driver

v3(Matthew Auld):
  - Fix alignment issues and remove unnecessary list_empty check
  - add more validation checks for input arguments
  - make alloc_range() block allocations as bottom-up
  - optimize order computation logic
  - replace uint64_t with u64, which is preferred in the kernel

v4(Matthew Auld):
  - keep drm_buddy_alloc_range() function implementation for generic
actual range allocations
  - keep alloc_range() implementation for end bias allocations

v5(Matthew Auld):
  - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
as place->lpfn will currently always be zero for i915

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 316 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
 include/drm/drm_buddy.h   |  22 +-
 4 files changed, 285 insertions(+), 122 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 9f4d929995b2..717c37ae8a32 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -282,23 +282,97 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct list_head *objects)
 }
 EXPORT_SYMBOL(drm_buddy_free_list);
 
-/**
- * drm_buddy_alloc_blocks - allocate power-of-two blocks
- *
- * @mm: DRM buddy manager to allocate from
- * @order: size of the allocation
- *
- * The order value here translates to:
- *
- * 0 = 2^0 * mm->chunk_size
- * 1 = 2^1 * mm->chunk_size
- * 2 = 2^2 * mm->chunk_size
- *
- * Returns:
- * allocated ptr to the &drm_buddy_block on success
- */
-struct drm_buddy_block *
-drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= s2 && e1 >= e2;
+}
+
+static struct drm_buddy_block *
+alloc_range_bias(struct drm_buddy *mm,
+u64 start, u64 end,
+unsigned int order)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   LIST_HEAD(dfs);
+   int err;
+   int i;
+
+   end = end - 1;
+
+   for (i = 0; i < mm->n_roots; ++i)
+   list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+   do {
+   u64 block_start;
+   u64 block_end;
+
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (drm_buddy_block_order(block) < order)
+   continue;
+
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
+
+   if (!overlaps(start, end, block_start, block_end))
+   continue;
+
+   if (drm_buddy_block_is_allocated(block))
+   continue;
+
+   if (contains(start, end, block_start, block_end) &&
+   order == drm_buddy_block_order(block)) {
+   /*
+* Find the free block within the range.
+*/
+   if (drm_buddy_block_is_free(block))
+   return block;
+
+   continue;
+   }
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+   if (unlikely(err))
+   goto err_undo;
+   }
+
+   list_add(&block->right->tmp_link, &dfs);
+   list_add(&block->left->tmp_link, &dfs);
+   } while (1);
+
+   return ERR_PTR(-ENOSPC);
+
+err_undo:
+   /*
+* We really don't want to leave around a bunch of split blocks, since
+* bigger is better, so make sure we merge everything back before we
+* free the allocated blocks.
+*/
+   buddy = get_buddy(block);
+   if (buddy &&
+   (drm_buddy_block_is_free(block) &&
+drm_buddy_block_is_free(buddy)))
+   __drm_buddy_free(mm, block);
+   return ERR_PTR(err);
+}
+
+static struct drm_buddy_block *
+

Re: [PATCH v7 1/6] drm: move the buddy allocator from i915 into common drm

2022-01-11 Thread Arunpravin
yes, I will use Dual MIT/GPL

Regards,
Arun

On 10/01/22 1:33 pm, Christian König wrote:
> On 09.01.22 at 15:19, Arunpravin wrote:
>> +// SPDX-License-Identifier: MIT
> 
>> +MODULE_DESCRIPTION("DRM Buddy Allocator");
>> +MODULE_LICENSE("GPL");
> 
> I'm not an expert on this, but maybe we should use "Dual MIT/GPL" here?
> 
> The code is certainly MIT licensed.
> 
> Regards,
> Christian.
> 


Re: [PATCH v4, 00/15] media: mtk-vcodec: support for MT8192 decoder

2022-01-11 Thread Nicolas Dufresne
Hello Yunfei,

On Monday, 10 January 2022 at 16:34 +0800, Yunfei Dong wrote:
> This series adds support for mt8192 h264/vp8/vp9 decoder drivers. Firstly,
> refactor the power/clock/interrupt interfaces, as mt8192 uses a lat and core
> architecture.
> 
> Secondly, add new functions to get frame buffer size and resolution according
> to decoder capability from scp side. Then add callback function to get/put
> capture buffer in order to enable lat and core decoder in parallel. 
> 
> Then add support for the MT21C compressed mode and fix the v4l2-compliance failure.

Perhaps you wanted to append the referred v4l2-compliance output (fixed)?

As we started doing with other codec driver submissions (just did last month for
NXP), can you state which software this driver was tested with? I have started
receiving feedback from third parties that MTK driver support is not reproducible;
I would like to work with you to fix the situation.

regards,
Nicolas

> 
> Next, extract H264 request api driver to let mt8183 and mt8192 use the same
> code, and adds mt8192 frame based h264 driver for stateless decoder.
> 
> Lastly, add vp8 and vp9 stateless decoder drivers.
> 
> Patch 1 refactors the power/clock/interrupt interface.
> Patches 2~4 get frame buffer size and resolution according to decoder capability.
> Patches 5~6 enable lat and core decode in parallel.
> Patches 7~10 add support for the MT21C compressed mode and fix the v4l2-compliance failure.
> Patch 11 records the capture queue format type.
> Patches 12~13 extract the h264 driver and add the mt8192 frame based driver for the h264 decoder.
> Patches 14~15 add vp8 and vp9 stateless decoder drivers.
> 
> Depends on "Support multi hardware decode using of_platform_populate"[1].
> 
> These patches are the second part used to add the mt8192 h264 decoder; the
> base part is [1].
> 
> [1]https://patchwork.linuxtv.org/project/linux-media/cover/20211215061552.8523-1-yunfei.d...@mediatek.com/
> ---
> changes compared with v3:
> - remove enum mtk_chip for patch 2.
> - add vp8 stateless decoder drivers for patch 14.
> - add vp9 stateless decoder drivers for patch 15.
> changes compared with v2:
> - add new patch 11 to record capture queue format type.
> - separate patch 4 according to tzung-bi's suggestion.
> - re-write commit message for patch 5 according to tzung-bi's suggestion.
> changes compared with v1:
> - rewrite commit message for patch 12.
> - rewrite cover-letter message.
> ---
> Yunfei Dong (15):
>   media: mtk-vcodec: Add vdec enable/disable hardware helpers
>   media: mtk-vcodec: Using firmware type to separate different firmware
> architecture
>   media: mtk-vcodec: get capture queue buffer size from scp
>   media: mtk-vcodec: Read max resolution from dec_capability
>   media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer
> buffered
>   media: mtk-vcodec: Refactor get and put capture buffer flow
>   media: mtk-vcodec: Refactor supported vdec formats and framesizes
>   media: mtk-vcodec: Add format to support MT21C
>   media: mtk-vcodec: disable vp8 4K capability
>   media: mtk-vcodec: Fix v4l2-compliance fail
>   media: mtk-vcodec: record capture queue format type
>   media: mtk-vcodec: Extract H264 common code
>   media: mtk-vcodec: Add h264 decoder driver for mt8192
>   media: mtk-vcodec: Add vp8 decoder driver for mt8192
>   media: mtk-vcodec: Add vp9 decoder driver for mt8192
> 
>  drivers/media/platform/mtk-vcodec/Makefile|4 +
>  .../platform/mtk-vcodec/mtk_vcodec_dec.c  |   49 +-
>  .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |5 -
>  .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   |  168 +-
>  .../platform/mtk-vcodec/mtk_vcodec_dec_pm.h   |6 +-
>  .../mtk-vcodec/mtk_vcodec_dec_stateful.c  |   14 +-
>  .../mtk-vcodec/mtk_vcodec_dec_stateless.c |  284 ++-
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h  |   40 +-
>  .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |5 -
>  .../media/platform/mtk-vcodec/mtk_vcodec_fw.c |6 +
>  .../media/platform/mtk-vcodec/mtk_vcodec_fw.h |1 +
>  .../mtk-vcodec/vdec/vdec_h264_req_common.c|  311 +++
>  .../mtk-vcodec/vdec/vdec_h264_req_common.h|  254 ++
>  .../mtk-vcodec/vdec/vdec_h264_req_if.c|  416 +---
>  .../mtk-vcodec/vdec/vdec_h264_req_multi_if.c  |  605 +
>  .../mtk-vcodec/vdec/vdec_vp8_req_if.c |  445 
>  .../mtk-vcodec/vdec/vdec_vp9_req_lat_if.c | 2066 +
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   36 +-
>  .../media/platform/mtk-vcodec/vdec_drv_if.h   |3 +
>  .../media/platform/mtk-vcodec/vdec_ipi_msg.h  |   37 +
>  .../platform/mtk-vcodec/vdec_msg_queue.c  |2 +
>  .../media/platform/mtk-vcodec/vdec_vpu_if.c   |   54 +-
>  .../media/platform/mtk-vcodec/vdec_vpu_if.h   |   15 +
>  .../media/platform/mtk-vcodec/venc_vpu_if.c   |2 +-
>  include/linux/remoteproc/mtk_scp.h|2 +
>  25 files changed, 4248 insertions(+), 582 deletions(-)
>  create mode 100644 
> drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_

Re: [PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX

2022-01-11 Thread Brian Norris
Hi Andrzej,

On Tue, Jan 11, 2022 at 5:26 AM Andrzej Hajda  wrote:
> I am not DP specialist so CC-ed people working with DP

Thanks for the review regardless! I'll also not claim to be a DP
specialist -- although I've had to learn my fair share to debug a good
handful of issues on an SoC using this driver.

> On 01.10.2021 23:42, Brian Norris wrote:
> > If the display is not enable()d, then we aren't holding a runtime PM
> > reference here. Thus, it's easy to accidentally cause a hang, if user
> > space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
> >
> > Let's get the panel and PM state right before trying to talk AUX.
> >
> > Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
> > Cc: 
> > Cc: Tomeu Vizoso 
> > Signed-off-by: Brian Norris 
>
>
> Few questions/issues here:
>
> 1. If it is just to avoid accidental 'hangs' it would be better to just
> check if the panel is working before transfer, if not, return error
> code. If there is better reason for this pm dance, please provide it  in
> description.

I'm not that familiar with DP-AUX, but I believe it can potentially
provide a variety of useful information (e.g., EDID?) to users without
the display and primary video link being active. So it doesn't sound
like a good idea to me to purposely leave this interface uninitialized
(and emitting errors) even when the user is asking for communication
(via /dev/drm_dp_aux). Do you want me to document what
/dev/drm_dp_aux does, and why someone would use it, in the commit
message?

> 2. Again I see an assumption that panel-prepare enables power for
> something different than video transmission, accidentally it is true for
> most devices, but devices having more fine grained power management will
> break, or at least will be used inefficiently - but maybe in case of dp
> it is OK ???

For this part, I'm less sure -- I wasn't sure what the general needs
are for AUX communication, and whether we need the panel enabled or
not. It seems logical that we need something powered, and I don't know
of anything besides "prepare()" that ensures that for DP panels.

(NB: the key to _my_ problem is the PM runtime reference. It's
absolutely essential that we don't try to utilize the DP hardware
without powering it up. The panel power state is less critical.)
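
As a rough illustration of the pattern under discussion, the user-space mock below takes a power reference before the hardware access and drops it afterwards. It is a sketch under stated assumptions, not the driver code: every sketch_* name is hypothetical, the counter merely imitates what the kernel's runtime PM usage count does, and the mock returns an error for unpowered hardware where the real device would hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state: a usage counter that controls power. */
struct sketch_dev {
	int usage_count;
	bool powered;
};

/* Take a reference; the first reference powers the device up. */
static void sketch_pm_get(struct sketch_dev *dev)
{
	if (dev->usage_count++ == 0)
		dev->powered = true;
}

/* Drop a reference; the last reference powers the device down. */
static void sketch_pm_put(struct sketch_dev *dev)
{
	if (--dev->usage_count == 0)
		dev->powered = false;
}

/* Raw hardware access: only valid while the device is powered. */
static int sketch_hw_transfer(struct sketch_dev *dev, int len)
{
	return dev->powered ? len : -1;
}

/* AUX transfer entry point: hold a reference for the whole transaction. */
static int sketch_aux_transfer(struct sketch_dev *dev, int len)
{
	int ret;

	sketch_pm_get(dev);
	ret = sketch_hw_transfer(dev, len);
	sketch_pm_put(dev);

	return ret;
}
```

Because the reference is taken inside the transfer path, the call succeeds whether or not the display has been enabled, and the device returns to its previous power state afterwards.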

> 3. More general issue - I am not sure if this should not be handled
> uniformly for all drm_dp devices.

I'm not sure what precisely you mean by #3. But FWIW, this is at least
partially documented ("make sure it's been properly enabled"):

/**
 * @transfer: transfers a message representing a single AUX
 * transaction.
 *
 * This is a hardware-specific implementation of how
 * transactions are executed that the drivers must provide.
...
 * Also note that this callback can be called no matter the
 * state @dev is in. Drivers that need that device to be powered
 * to perform this operation will first need to make sure it's
 * been properly enabled.
 */
ssize_t (*transfer)(struct drm_dp_aux *aux,
struct drm_dp_aux_msg *msg);

But maybe the definition of "properly enabled" is what you're unsure
about? (I'm also a little unsure.)

Regards,
Brian


Re: [Nouveau] [PATCH v2 3/5] drm/dp: Move DisplayPort helpers into separate helper module

2022-01-11 Thread Lyude Paul
Acked-by: Lyude Paul 

On Wed, 2021-12-15 at 11:43 +0100, Thomas Zimmermann wrote:
> Move DisplayPort functions into a separate module to reduce the size
> of the KMS helpers. Select DRM_DP_HELPER for all users of the code. To
> avoid naming conflicts, rename drm_dp_helper.c to drm_dp.c
> 
> This change can help to reduce the size of the kernel binary. Some
> numbers from a x86-64 test build:
> 
> Before:
> drm_kms_helper.ko:  447480 bytes
> 
> After:
> drm_dp_helper.ko:   216632 bytes
> drm_kms_helper.ko:  239424 bytes
> 
> For early-boot graphics, generic DRM drivers, such as simpledrm,
> require DRM KMS helpers to be built into the kernel. Generic helper
> functions for DisplayPort take up a significant portion of DRM KMS
> helper library. These functions are not used by generic drivers and
> can be loaded as a module.
> 
> v2:
> * move DP helper code into dp/ (Jani)
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/Kconfig   |  8 +++
>  drivers/gpu/drm/Makefile  | 10 -
>  drivers/gpu/drm/bridge/Kconfig    |  4 
>  drivers/gpu/drm/bridge/analogix/Kconfig   |  2 ++
>  drivers/gpu/drm/bridge/cadence/Kconfig    |  1 +
>  drivers/gpu/drm/dp/Makefile   |  7 ++
>  .../gpu/drm/{drm_dp_helper.c => dp/drm_dp.c}  |  0
>  drivers/gpu/drm/{ => dp}/drm_dp_aux_dev.c |  0
>  drivers/gpu/drm/{ => dp}/drm_dp_cec.c |  0
>  .../drm/{ => dp}/drm_dp_dual_mode_helper.c    |  0
>  .../gpu/drm/{ => dp}/drm_dp_helper_internal.h |  0
>  drivers/gpu/drm/dp/drm_dp_helper_mod.c    | 22 +++
>  .../gpu/drm/{ => dp}/drm_dp_mst_topology.c    |  0
>  .../{ => dp}/drm_dp_mst_topology_internal.h   |  0
>  drivers/gpu/drm/drm_kms_helper_common.c   | 15 -
>  drivers/gpu/drm/i915/Kconfig  |  1 +
>  drivers/gpu/drm/msm/Kconfig   |  1 +
>  drivers/gpu/drm/nouveau/Kconfig   |  1 +
>  drivers/gpu/drm/rockchip/Kconfig  |  1 +
>  drivers/gpu/drm/tegra/Kconfig |  1 +
>  drivers/gpu/drm/xlnx/Kconfig  |  1 +
>  21 files changed, 54 insertions(+), 21 deletions(-)
>  create mode 100644 drivers/gpu/drm/dp/Makefile
>  rename drivers/gpu/drm/{drm_dp_helper.c => dp/drm_dp.c} (100%)
>  rename drivers/gpu/drm/{ => dp}/drm_dp_aux_dev.c (100%)
>  rename drivers/gpu/drm/{ => dp}/drm_dp_cec.c (100%)
>  rename drivers/gpu/drm/{ => dp}/drm_dp_dual_mode_helper.c (100%)
>  rename drivers/gpu/drm/{ => dp}/drm_dp_helper_internal.h (100%)
>  create mode 100644 drivers/gpu/drm/dp/drm_dp_helper_mod.c
>  rename drivers/gpu/drm/{ => dp}/drm_dp_mst_topology.c (100%)
>  rename drivers/gpu/drm/{ => dp}/drm_dp_mst_topology_internal.h (100%)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index b1f22e457fd0..91f54aeb0b7c 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -80,6 +80,12 @@ config DRM_DEBUG_SELFTEST
>  
>   If in doubt, say "N".
>  
> +config DRM_DP_HELPER
> +   tristate
> +   depends on DRM
> +   help
> + DRM helpers for DisplayPort.
> +
>  config DRM_KMS_HELPER
> tristate
> depends on DRM
> @@ -236,6 +242,7 @@ config DRM_RADEON
> depends on DRM && PCI && MMU
> depends on AGP || !AGP
> select FW_LOADER
> +   select DRM_DP_HELPER
>  select DRM_KMS_HELPER
>  select DRM_TTM
> select DRM_TTM_HELPER
> @@ -256,6 +263,7 @@ config DRM_AMDGPU
> tristate "AMD GPU"
> depends on DRM && PCI && MMU
> select FW_LOADER
> +   select DRM_DP_HELPER
> select DRM_KMS_HELPER
> select DRM_SCHED
> select DRM_TTM
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 301a44dc18e3..69be80ef1d31 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -48,21 +48,18 @@ obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
>  drm_ttm_helper-y := drm_gem_ttm_helper.o
>  obj-$(CONFIG_DRM_TTM_HELPER) += drm_ttm_helper.o
>  
> -drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
> +drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o \
> drm_dsc.o drm_encoder_slave.o drm_flip_work.o drm_hdcp.o \
> drm_probe_helper.o \
> -   drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \
> -   drm_kms_helper_common.o drm_dp_dual_mode_helper.o \
> +   drm_plane_helper.o drm_atomic_helper.o \
> +   drm_kms_helper_common.o \
> drm_simple_kms_helper.o drm_modeset_helper.o \
> drm_scdc_helper.o drm_gem_atomic_helper.o \
> drm_gem_framebuffer_helper.o \
> drm_atomic_state_helper.o drm_damage_helper.o \
> drm_format_helper.o drm_self_refresh_helper.o drm_rect.o
> -
>  drm_kms_helper-$(CONFIG_DRM_P

Re: [PATCH 1/2] drm: exynos: dsi: Convert to bridge driver

2022-01-11 Thread Jagan Teki
On Mon, Jan 10, 2022 at 9:10 PM Robert Foss  wrote:
>
> On Mon, 10 Jan 2022 at 16:35, Jagan Teki  wrote:
> >
> > Hi Robert,
> >
> > On Mon, Jan 10, 2022 at 9:02 PM Robert Foss  wrote:
> > >
> > > Hey Jagan,
> > >
> > > This is a mistake on my end, I must have been looking at reviewing
> > > this series and then accidentally included it with another batch of
> > > patches. Thank you for catching this.
> >
> > Thanks for the response.
> >
> > >
> > > I would suggest reverting these two patches[1][2]. Is that ok with you?
> >
> > > Maybe I will revert 1/2, but 2/2 is valid. Please let me know if you
> > > have any concerns about reverting 1/2.
>
> Please go ahead!

Sent.

Thanks,
Jagan.


Re: [PATCH] drm/i915/guc: Don't error on reset of banned context

2022-01-11 Thread Matthew Brost
On Thu, Jan 06, 2022 at 04:31:43PM -0800, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> There is a race (already documented in the code) whereby a context can
> be (re-)queued for submission at the same time as it is being banned
> due to a hang and reset. That leads to a hang/reset report from GuC
> for a context which i915 thinks is already banned.
> 

I think there are 2 issues here.

1. Banning of a context (e.g. user closes a non-persistent context)
results in a context reset. In this case we will receive a G2H
indicating a context reset and we want to convert the context reset to a
nop.

2. A GT reset racing with a context reset results in the context getting
banned before the G2H is processed. Again we want to convert the context
reset to a nop. This race should be sealed when we can flush the G2H
handler in the reset path. Flushing the G2H handler depends on the error
capture not allocating memory in non-sleeping contexts. Thomas H had a
patch for this.

In both cases we shouldn't print an error.

> While the race is intended to be fixed in a future GuC update, there
> is no actual harm beyond the wasted execution time of that new hang
> detection period. The context has already been banned for bad
> behaviour so a fresh hang is hardly surprising and certainly isn't
> going to be losing any work that wouldn't already have been lost if
> there was no race.
>

See above, I think you are confusing the issues here. This won't be
fixed by an updated GuC firmware.

> So don't treat this situation as an error. The error message is seen
> by the CI system as something fatal and causes test failures. Instead,
> just print an informational so the user at least knows a context reset
> occurred (given that the error capture is being skipped).
> 
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 9989d121127d..e8a32a7e7daf 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3978,6 +3978,10 @@ static void guc_handle_context_reset(struct intel_guc *guc,
>  !context_blocked(ce))) {
>   capture_error_state(guc, ce);
>   guc_context_replay(ce);
> + } else if (intel_context_is_banned(ce)) {
> + drm_info(&guc_to_gt(guc)->i915->drm,
> +  "Reset notification for banned context 0x%04X on %s",
> +  ce->guc_id.id, ce->engine->name);

The context being blocked isn't an error either. I think the real fix is
changing the below drm_err to drm_info and calling it a day.

Matt

>   } else {
>   drm_err(&guc_to_gt(guc)->i915->drm,
>   "Invalid GuC engine reset notificaion for 0x%04X on %s: banned = %d, blocked = %d",
> -- 
> 2.25.1
> 
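
To make the difference between the posted hunk and the suggestion above
concrete, here is a minimal, self-contained sketch in plain C. It is not
i915 code: struct ctx_state and both classify functions are hypothetical
stand-ins for intel_context_is_banned() and context_blocked(), and it
assumes (as the visible fragment of the hunk suggests) that the first
branch is taken only when the context is neither banned nor blocked.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-context state the handler inspects. */
struct ctx_state {
	bool banned;
	bool blocked;
};

enum reset_action {
	RESET_CAPTURE_AND_REPLAY, /* capture error state, replay the context */
	RESET_LOG_INFO,           /* informational message only */
	RESET_LOG_ERROR,          /* error message, which CI treats as fatal */
};

/* As posted: only the banned case is downgraded to an info message. */
static enum reset_action classify_as_posted(struct ctx_state s)
{
	if (!s.banned && !s.blocked)
		return RESET_CAPTURE_AND_REPLAY;
	if (s.banned)
		return RESET_LOG_INFO;
	return RESET_LOG_ERROR;
}

/* As suggested in the reply: the fallback branch itself becomes info. */
static enum reset_action classify_as_suggested(struct ctx_state s)
{
	if (!s.banned && !s.blocked)
		return RESET_CAPTURE_AND_REPLAY;
	return RESET_LOG_INFO;
}
```

The two variants only differ for a context that is blocked but not
banned, which still logs an error with the hunk as posted.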


[PATCH] Revert "drm: exynos: dsi: Convert to bridge driver"

2022-01-11 Thread Jagan Teki
This reverts commit 92e794fab87af0793403d5e4a547f0be94a0e656.

It was merged by accident; the actual patch series for this bridge
conversion is still under review.

Revert this as it breaks the exynos DSI.

Signed-off-by: Jagan Teki 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 93 +
 1 file changed, 32 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index 3d4713346949..bce5331ed1e6 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -257,7 +257,6 @@ struct exynos_dsi {
struct drm_connector connector;
struct drm_panel *panel;
struct list_head bridge_chain;
-   struct drm_bridge bridge;
struct drm_bridge *out_bridge;
struct device *dev;
struct drm_display_mode mode;
@@ -289,9 +288,9 @@ struct exynos_dsi {
 #define host_to_dsi(host) container_of(host, struct exynos_dsi, dsi_host)
 #define connector_to_dsi(c) container_of(c, struct exynos_dsi, connector)
 
-static inline struct exynos_dsi *bridge_to_dsi(struct drm_bridge *b)
+static inline struct exynos_dsi *encoder_to_dsi(struct drm_encoder *e)
 {
-   return container_of(b, struct exynos_dsi, bridge);
+   return container_of(e, struct exynos_dsi, encoder);
 }
 
 enum reg_idx {
@@ -1376,10 +1375,9 @@ static void exynos_dsi_unregister_te_irq(struct exynos_dsi *dsi)
}
 }
 
-static void exynos_dsi_atomic_enable(struct drm_bridge *bridge,
-struct drm_bridge_state *old_bridge_state)
+static void exynos_dsi_enable(struct drm_encoder *encoder)
 {
-   struct exynos_dsi *dsi = bridge_to_dsi(bridge);
+   struct exynos_dsi *dsi = encoder_to_dsi(encoder);
struct drm_bridge *iter;
int ret;
 
@@ -1402,8 +1400,7 @@ static void exynos_dsi_atomic_enable(struct drm_bridge *bridge,
list_for_each_entry_reverse(iter, &dsi->bridge_chain,
chain_node) {
if (iter->funcs->pre_enable)
-   iter->funcs->atomic_pre_enable(iter,
-  old_bridge_state);
+   iter->funcs->pre_enable(iter);
}
}
 
@@ -1417,7 +1414,7 @@ static void exynos_dsi_atomic_enable(struct drm_bridge *bridge,
} else {
list_for_each_entry(iter, &dsi->bridge_chain, chain_node) {
if (iter->funcs->enable)
-   iter->funcs->atomic_enable(iter, old_bridge_state);
+   iter->funcs->enable(iter);
}
}
 
@@ -1433,10 +1430,9 @@ static void exynos_dsi_atomic_enable(struct drm_bridge *bridge,
pm_runtime_put(dsi->dev);
 }
 
-static void exynos_dsi_atomic_disable(struct drm_bridge *bridge,
- struct drm_bridge_state *old_bridge_state)
+static void exynos_dsi_disable(struct drm_encoder *encoder)
 {
-   struct exynos_dsi *dsi = bridge_to_dsi(bridge);
+   struct exynos_dsi *dsi = encoder_to_dsi(encoder);
struct drm_bridge *iter;
 
if (!(dsi->state & DSIM_STATE_ENABLED))
@@ -1448,7 +1444,7 @@ static void exynos_dsi_atomic_disable(struct drm_bridge *bridge,
 
list_for_each_entry_reverse(iter, &dsi->bridge_chain, chain_node) {
if (iter->funcs->disable)
-   iter->funcs->atomic_disable(iter, old_bridge_state);
+   iter->funcs->disable(iter);
}
 
exynos_dsi_set_display_enable(dsi, false);
@@ -1456,13 +1452,22 @@ static void exynos_dsi_atomic_disable(struct drm_bridge *bridge,
 
list_for_each_entry(iter, &dsi->bridge_chain, chain_node) {
if (iter->funcs->post_disable)
-   iter->funcs->atomic_post_disable(iter, old_bridge_state);
+   iter->funcs->post_disable(iter);
}
 
dsi->state &= ~DSIM_STATE_ENABLED;
pm_runtime_put_sync(dsi->dev);
 }
 
+static void exynos_dsi_mode_set(struct drm_encoder *encoder,
+   struct drm_display_mode *mode,
+   struct drm_display_mode *adjusted_mode)
+{
+   struct exynos_dsi *dsi = encoder_to_dsi(encoder);
+
+   drm_mode_copy(&dsi->mode, adjusted_mode);
+}
+
 static enum drm_connector_status
 exynos_dsi_detect(struct drm_connector *connector, bool force)
 {
@@ -1499,9 +1504,9 @@ static const struct drm_connector_helper_funcs exynos_dsi_connector_helper_funcs
.get_modes = exynos_dsi_get_modes,
 };
 
-static int exynos_dsi_create_connector(struct exynos_dsi *dsi)
+static int exynos_dsi_create_connector(struct drm_encoder *encoder)
 {
-   struct drm_encoder *encoder = &dsi->encoder;
+   struct exynos_dsi *dsi = encoder_to_dsi(encoder);
struct drm_connector *connector = &dsi->conn

[PATCH v11 4/4] drm/msm/dp: stop link training after link training 2 failed

2022-01-11 Thread Kuogee Hsieh
Each DP link training consists of link training 1 followed by link
training 2. DP link training is retried at most 5 times before it is
declared failed. Link training must be stopped at the end of a failed
link training 2 so that the next link training 1 can start afresh. This
patch fixes link compliance test case 4.3.1.13 (Source Device Link
Training EQ Fallback Test).

Changes in v10:
--  group into one series

Changes in v11:
-- drop drm/msm/dp: dp_link_parse_sink_count() return immediately if aux read

Fixes: 2e0adc765d88 ("drm/msm/dp: do not end dp link training until video is ready")
Signed-off-by: Kuogee Hsieh 
Reviewed-by: Stephen Boyd 
---
 drivers/gpu/drm/msm/dp/dp_ctrl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index f98df93..245e1b9 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1755,6 +1755,9 @@ int dp_ctrl_on_link(struct dp_ctrl *dp_ctrl)
/* end with failure */
break; /* lane == 1 already */
}
+
+   /* stop link training before starting retraining */
+   dp_ctrl_clear_training_pattern(ctrl);
}
}
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
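
The retry behaviour described in the commit text can be modelled with a
minimal, self-contained sketch in plain C. It is not the msm driver code:
struct fake_link and all function names below are hypothetical, and the
rate/lane fallback the real driver performs between attempts is left out.
The only point shown is that a failed training 2 is stopped before the
next training 1 begins.

```c
#include <stdbool.h>

#define MAX_LINK_TRAINING_TRIES 5

/* Hypothetical model of a link; not the msm dp_ctrl structures. */
struct fake_link {
	int lt2_failures_left; /* how many times training 2 will still fail */
	int pattern_clears;    /* how often the training pattern was cleared */
	bool pattern_active;   /* a training pattern is currently driven */
};

static bool train_1(struct fake_link *l)
{
	/* In this model, training 1 only works from a clean state. */
	if (l->pattern_active)
		return false;
	l->pattern_active = true;
	return true;
}

static bool train_2(struct fake_link *l)
{
	if (l->lt2_failures_left > 0) {
		l->lt2_failures_left--;
		return false;
	}
	return true;
}

static void clear_training_pattern(struct fake_link *l)
{
	l->pattern_active = false;
	l->pattern_clears++;
}

/* Returns the number of attempts used, or -1 if every attempt failed. */
static int train_link(struct fake_link *l)
{
	for (int attempt = 1; attempt <= MAX_LINK_TRAINING_TRIES; attempt++) {
		if (train_1(l) && train_2(l))
			return attempt;
		/* Stop the failed training so the next attempt starts fresh. */
		clear_training_pattern(l);
	}
	return -1;
}
```

Without the clear_training_pattern() call, every attempt after the first
would fail at train_1() in this model, which is the situation the patch
is meant to avoid.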



[PATCH v11 3/4] drm/msm/dp: add support of tps4 (training pattern 4) for HBR3

2022-01-11 Thread Kuogee Hsieh
From: Kuogee Hsieh 

Some DP sinks prefer to use tps4 instead of tps3 during training #2.
This patch uses tps4 to perform link training #2 if the sink's DPCD
supports it.

Changes in V2:
-- replace dp_catalog_ctrl_set_pattern() with dp_catalog_ctrl_set_pattern_state_bit()

Changes in V3:
-- change state_ctrl_bits type to u32 and pattern type to u8

Changes in V4:
-- align } else if { and } else {

Changes in v10:
--  group into one series

Changes in v11:
-- drop drm/msm/dp: dp_link_parse_sink_count() return immediately if aux read

Signed-off-by: Kuogee Hsieh 

Reviewed-by: Stephen Boyd 
---
 drivers/gpu/drm/msm/dp/dp_catalog.c | 12 ++--
 drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
 drivers/gpu/drm/msm/dp/dp_ctrl.c| 17 -
 3 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c b/drivers/gpu/drm/msm/dp/dp_catalog.c
index 6ae9b29..64f0b26 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -456,19 +456,19 @@ void dp_catalog_ctrl_config_msa(struct dp_catalog *dp_catalog,
dp_write_p0(catalog, MMSS_DP_DSC_DTO, 0x0);
 }
 
-int dp_catalog_ctrl_set_pattern(struct dp_catalog *dp_catalog,
-   u32 pattern)
+int dp_catalog_ctrl_set_pattern_state_bit(struct dp_catalog *dp_catalog,
+   u32 state_bit)
 {
int bit, ret;
u32 data;
struct dp_catalog_private *catalog = container_of(dp_catalog,
struct dp_catalog_private, dp_catalog);
 
-   bit = BIT(pattern - 1);
-   DRM_DEBUG_DP("hw: bit=%d train=%d\n", bit, pattern);
+   bit = BIT(state_bit - 1);
+   DRM_DEBUG_DP("hw: bit=%d train=%d\n", bit, state_bit);
dp_catalog_ctrl_state_ctrl(dp_catalog, bit);
 
-   bit = BIT(pattern - 1) << DP_MAINLINK_READY_LINK_TRAINING_SHIFT;
+   bit = BIT(state_bit - 1) << DP_MAINLINK_READY_LINK_TRAINING_SHIFT;
 
/* Poll for mainlink ready status */
ret = readx_poll_timeout(readl, catalog->io->dp_controller.link.base +
@@ -476,7 +476,7 @@ int dp_catalog_ctrl_set_pattern(struct dp_catalog *dp_catalog,
data, data & bit,
POLLING_SLEEP_US, POLLING_TIMEOUT_US);
if (ret < 0) {
-   DRM_ERROR("set pattern for link_train=%d failed\n", pattern);
+   DRM_ERROR("set state_bit for link_train=%d failed\n", state_bit);
return ret;
}
return 0;
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h b/drivers/gpu/drm/msm/dp/dp_catalog.h
index 6965afa..7dea101 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.h
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.h
@@ -94,7 +94,7 @@ void dp_catalog_ctrl_mainlink_ctrl(struct dp_catalog *dp_catalog, bool enable);
 void dp_catalog_ctrl_config_misc(struct dp_catalog *dp_catalog, u32 cc, u32 tb);
 void dp_catalog_ctrl_config_msa(struct dp_catalog *dp_catalog, u32 rate,
u32 stream_rate_khz, bool fixed_nvid);
-int dp_catalog_ctrl_set_pattern(struct dp_catalog *dp_catalog, u32 pattern);
+int dp_catalog_ctrl_set_pattern_state_bit(struct dp_catalog *dp_catalog, u32 pattern);
 void dp_catalog_ctrl_reset(struct dp_catalog *dp_catalog);
 bool dp_catalog_ctrl_mainlink_ready(struct dp_catalog *dp_catalog);
 void dp_catalog_ctrl_enable_irq(struct dp_catalog *dp_catalog, bool enable);
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index 9c80b49..f98df93 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1083,7 +1083,7 @@ static int dp_ctrl_link_train_1(struct dp_ctrl_private *ctrl,
 
*training_step = DP_TRAINING_1;
 
-   ret = dp_catalog_ctrl_set_pattern(ctrl->catalog, DP_TRAINING_PATTERN_1);
+   ret = dp_catalog_ctrl_set_pattern_state_bit(ctrl->catalog, 1);
if (ret)
return ret;
dp_ctrl_train_pattern_set(ctrl, DP_TRAINING_PATTERN_1 |
@@ -1181,7 +1181,8 @@ static int dp_ctrl_link_train_2(struct dp_ctrl_private *ctrl,
int *training_step)
 {
int tries = 0, ret = 0;
-   char pattern;
+   u8 pattern;
+   u32 state_ctrl_bit;
int const maximum_retries = 5;
u8 link_status[DP_LINK_STATUS_SIZE];
 
@@ -1189,12 +1190,18 @@ static int dp_ctrl_link_train_2(struct dp_ctrl_private *ctrl,
 
*training_step = DP_TRAINING_2;
 
-   if (drm_dp_tps3_supported(ctrl->panel->dpcd))
+   if (drm_dp_tps4_supported(ctrl->panel->dpcd)) {
+   pattern = DP_TRAINING_PATTERN_4;
+   state_ctrl_bit = 4;
+   } else if (drm_dp_tps3_supported(ctrl->panel->dpcd)) {
pattern = DP_TRAINING_PATTERN_3;
-   else
+   state_ctrl_bit = 3;
+   } else {
pattern = DP_TRAINING_PATTERN_2;
+   state_ctrl_bit = 2;
+   }
 
-   ret = dp_c
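
The selection logic in the hunk above can be illustrated with a minimal,
self-contained sketch in plain C. It is not the driver code: struct
sink_caps, struct lt2_choice and both functions are hypothetical, and the
SKETCH_TPS* values are recalled DPCD encodings that should be treated as
assumptions. In particular, TPS4 being encoded as 7 rather than 4 would
explain why the patch tracks the pattern and the state control bit
separately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed DPCD encodings of the training patterns (not taken from the patch). */
#define SKETCH_TPS2 2
#define SKETCH_TPS3 3
#define SKETCH_TPS4 7

/* Hypothetical summary of what the sink advertises in its DPCD. */
struct sink_caps {
	bool tps3_supported;
	bool tps4_supported;
};

struct lt2_choice {
	uint8_t pattern;         /* value written to the sink */
	uint32_t state_ctrl_bit; /* value handed to the state control helper */
};

/* Prefer TPS4, then TPS3, and fall back to TPS2. */
static struct lt2_choice choose_lt2_pattern(struct sink_caps caps)
{
	struct lt2_choice c;

	if (caps.tps4_supported) {
		c.pattern = SKETCH_TPS4;
		c.state_ctrl_bit = 4;
	} else if (caps.tps3_supported) {
		c.pattern = SKETCH_TPS3;
		c.state_ctrl_bit = 3;
	} else {
		c.pattern = SKETCH_TPS2;
		c.state_ctrl_bit = 2;
	}
	return c;
}

/* Mirrors the BIT(state_bit - 1) expression in the catalog helper. */
static uint32_t state_ctrl_mask(uint32_t state_ctrl_bit)
{
	return UINT32_C(1) << (state_ctrl_bit - 1);
}
```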

[PATCH v11 2/4] drm/msm/dp: populate connector of struct dp_panel

2022-01-11 Thread Kuogee Hsieh
DP CTS test case 4.2.2.6 deliberately supplies a valid edid with a bad
checksum and expects the DP source to return the correct checksum. During
the drm edid read, the correct edid checksum is calculated and stored in
connector::real_edid_checksum.

The problem is that struct dp_panel::connector is never assigned; instead
the connector is stored in struct msm_dp::connector. When we run compliance
testing test case 4.2.2.6, dp_panel_handle_sink_request() won't have a valid
edid set in struct dp_panel::edid, so we'll try to use the connector's
real_edid_checksum and hit a NULL pointer dereference error because the
connector pointer is never assigned.

Changes in V2:
-- populate panel connector at msm_dp_modeset_init() instead of at dp_panel_read_sink_caps()

Changes in V3:
-- remove unhelpful kernel crash trace commit text
-- remove renaming dp_display parameter to dp

Changes in V4:
-- add more details to commit text

Changes in v10:
--  group into one series

Changes in v11:
-- drop drm/msm/dp: dp_link_parse_sink_count() return immediately if aux read

Fixes: 7948fe12d47 ("drm/msm/dp: return correct edid checksum after corrupted edid checksum read")
Signed-off-by: Kuogee Hsieh 

Reviewed-by: Bjorn Andersson 
Reviewed-by: Stephen Boyd 
---
 drivers/gpu/drm/msm/dp/dp_display.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index f6bb4bc..1d06b6a 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1489,6 +1489,7 @@ int msm_dp_modeset_init(struct msm_dp *dp_display, struct drm_device *dev,
struct drm_encoder *encoder)
 {
struct msm_drm_private *priv;
+   struct dp_display_private *dp_priv;
int ret;
 
if (WARN_ON(!encoder) || WARN_ON(!dp_display) || WARN_ON(!dev))
@@ -1497,6 +1498,8 @@ int msm_dp_modeset_init(struct msm_dp *dp_display, struct drm_device *dev,
priv = dev->dev_private;
dp_display->drm_dev = dev;
 
+   dp_priv = container_of(dp_display, struct dp_display_private, dp_display);
+
ret = dp_display_request_irq(dp_display);
if (ret) {
DRM_ERROR("request_irq failed, ret=%d\n", ret);
@@ -1514,6 +1517,8 @@ int msm_dp_modeset_init(struct msm_dp *dp_display, struct drm_device *dev,
return ret;
}
 
+   dp_priv->panel->connector = dp_display->connector;
+
priv->connectors[priv->num_connectors++] = dp_display->connector;
 
dp_display->bridge = msm_dp_bridge_init(dp_display, dev, encoder);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v11 1/4] drm/msm/dp: do not initialize phy until plugin interrupt received

2022-01-11 Thread Kuogee Hsieh
The current DP driver groups regulators, clocks, irq and phy together
within one function and does not handle them in a symmetric manner.
This increases the difficulty of code maintenance and limits code
scalability. This patch divides the driver life cycle into four states:
resume (including booting up), dongle plugin, dongle unplugged and suspend.
Regulators, core clocks and irq are grouped together and enabled at resume
(or booting up) so that the DP controller is armed and ready to receive HPD
plugin interrupts. An HPD plugin interrupt is generated when a dongle is
plugged into the DUT (device under test). Once the HPD plugin interrupt is
received, the DP controller initializes the phy so that dpcd read/write
will function and the following link training can proceed successfully.
The DP phy is disabled after the main link is torn down at the end of the
unplugged HPD interrupt handler, which is triggered when the dongle is
unplugged from the DUT. Finally, regulators, core clocks and irq are
disabled at the corresponding suspend.

Changes in V2:
-- removed unnecessary dp_ctrl NULL check
-- removed unnecessary phy init_count and power_count DRM_DEBUG_DP logs
-- remove flip parameter out of dp_ctrl_irq_enable()
-- add fixes tag

Changes in V3:
-- call dp_display_host_phy_init() instead of dp_ctrl_phy_init() at
dp_display_host_init() for eDP

Changes in V4:
-- rewording commit text to match this commit changes

Changes in V5:
-- rebase on top of msm-next branch

Changes in V6:
-- delete flip variable

Changes in V7:
-- dp_ctrl_irq_enable/disable() merged into dp_ctrl_reset_irq_ctrl()

Changes in V8:
-- add more detailed comment regarding dp phy at dp_display_host_init()

Changes in V9:
-- remove set phy_initialized to false when -ECONNRESET detected

Changes in v10:
--  group into one series

Changes in v11:
-- drop drm/msm/dp: dp_link_parse_sink_count() return immediately if aux read

Fixes: 8ede2ecc3e5e ("drm/msm/dp: Add DP compliance tests on Snapdragon Chipsets")
Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/dp/dp_ctrl.c| 80 +
 drivers/gpu/drm/msm/dp/dp_ctrl.h|  8 ++--
 drivers/gpu/drm/msm/dp/dp_display.c | 89 -
 3 files changed, 94 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index c724cb0..9c80b49 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1365,60 +1365,44 @@ static int dp_ctrl_enable_stream_clocks(struct dp_ctrl_private *ctrl)
return ret;
 }
 
-int dp_ctrl_host_init(struct dp_ctrl *dp_ctrl, bool flip, bool reset)
+void dp_ctrl_reset_irq_ctrl(struct dp_ctrl *dp_ctrl, bool enable)
+{
+   struct dp_ctrl_private *ctrl;
+
+   ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
+
+   dp_catalog_ctrl_reset(ctrl->catalog);
+
+   if (enable)
+   dp_catalog_ctrl_enable_irq(ctrl->catalog, enable);
+}
+
+void dp_ctrl_phy_init(struct dp_ctrl *dp_ctrl)
 {
struct dp_ctrl_private *ctrl;
struct dp_io *dp_io;
struct phy *phy;
 
-   if (!dp_ctrl) {
-   DRM_ERROR("Invalid input data\n");
-   return -EINVAL;
-   }
-
ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
dp_io = &ctrl->parser->io;
phy = dp_io->phy;
 
-   ctrl->dp_ctrl.orientation = flip;
-
-   if (reset)
-   dp_catalog_ctrl_reset(ctrl->catalog);
-
-   DRM_DEBUG_DP("flip=%d\n", flip);
dp_catalog_ctrl_phy_reset(ctrl->catalog);
phy_init(phy);
-   dp_catalog_ctrl_enable_irq(ctrl->catalog, true);
-
-   return 0;
 }
 
-/**
- * dp_ctrl_host_deinit() - Uninitialize DP controller
- * @dp_ctrl: Display Port Driver data
- *
- * Perform required steps to uninitialize DP controller
- * and its resources.
- */
-void dp_ctrl_host_deinit(struct dp_ctrl *dp_ctrl)
+void dp_ctrl_phy_exit(struct dp_ctrl *dp_ctrl)
 {
struct dp_ctrl_private *ctrl;
struct dp_io *dp_io;
struct phy *phy;
 
-   if (!dp_ctrl) {
-   DRM_ERROR("Invalid input data\n");
-   return;
-   }
-
ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
dp_io = &ctrl->parser->io;
phy = dp_io->phy;
 
-   dp_catalog_ctrl_enable_irq(ctrl->catalog, false);
+   dp_catalog_ctrl_phy_reset(ctrl->catalog);
phy_exit(phy);
-
-   DRM_DEBUG_DP("Host deinitialized successfully\n");
 }
 
 static bool dp_ctrl_use_fixed_nvid(struct dp_ctrl_private *ctrl)
@@ -1488,7 +1472,10 @@ static int dp_ctrl_deinitialize_mainlink(struct dp_ctrl_private *ctrl)
}
 
phy_power_off(phy);
+
+   /* aux channel down, reinit phy */
phy_exit(phy);
+   phy_init(phy);
 
return 0;
 }
@@ -1893,8 +1880,14 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl)
return ret;
}
 
+   DRM_DEBUG_DP("Before, phy=%x init_count=%d power_on=%d\n",
+

[PATCH v11 0/4] group dp driver related patches into one series

2022-01-11 Thread Kuogee Hsieh
Group the 4 dp driver related patches below into one series.

Kuogee Hsieh (4):
  drm/msm/dp: do not initialize phy until plugin interrupt received
  drm/msm/dp: populate connector of struct dp_panel
  drm/msm/dp: add support of tps4 (training pattern 4) for HBR3
  drm/msm/dp: stop link training after link training 2 failed

 drivers/gpu/drm/msm/dp/dp_catalog.c |  12 ++---
 drivers/gpu/drm/msm/dp/dp_catalog.h |   2 +-
 drivers/gpu/drm/msm/dp/dp_ctrl.c| 100 
 drivers/gpu/drm/msm/dp/dp_ctrl.h|   8 +--
 drivers/gpu/drm/msm/dp/dp_display.c |  94 ++---
 5 files changed, 121 insertions(+), 95 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: Phyr Starter

2022-01-11 Thread Matthew Wilcox
On Tue, Jan 11, 2022 at 11:01:42AM -0400, Jason Gunthorpe wrote:
> On Tue, Jan 11, 2022 at 04:32:56AM +, Matthew Wilcox wrote:
> > On Mon, Jan 10, 2022 at 08:41:26PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Jan 10, 2022 at 07:34:49PM +, Matthew Wilcox wrote:
> > > 
> > > > Finally, it may be possible to stop using scatterlist to describe the
> > > > input to the DMA-mapping operation.  We may be able to get struct
> > > > scatterlist down to just dma_address and dma_length, with chaining
> > > > handled through an enclosing struct.
> > > 
> > > Can you talk about this some more? IMHO one of the key properties of
> > > the scatterlist is that it can hold huge amounts of pages without
> > > having to do any kind of special allocation due to the chaining.
> > > 
> > > The same will be true of the phyr idea right?
> > 
> > My thinking is that we'd pass a relatively small array of phyr (maybe 16
> > entries) to get_user_phyr().  If that turned out not to be big enough,
> > then we have two options; one is to map those 16 ranges with sg and use
> > the sg chaining functionality before throwing away the phyr and calling
> > get_user_phyr() again. 
> 
> Then why are we using get_user_phyr() at all if we are just storing it
> in a sg?

I did consider just implementing get_user_sg() (actually 4 years ago),
but that cements the use of sg as both an input and output data structure
for DMA mapping, which I am under the impression we're trying to get
away from.

> Also 16 entries is way too small, it should be at least a whole PMD
> worth so we don't have to relock the PMD level each iteration.
> 
> I would like to see a flow more like:
> 
>   cpu_phyr_list = get_user_phyr(uptr, 1G);
>   dma_phyr_list = dma_map_phyr(device, cpu_phyr_list);
>   [..]
>   dma_unmap_phyr(device, dma_phyr_list);
>   unpin_dirty_free(cpu_phyr_list);
> 
> Where dma_map_phyr() can build a temporary SGL for old iommu drivers
> compatibility. iommu drivers would want to implement natively, of
> course.
> 
> ie no loops in drivers.

Let me just rewrite that for you ...

umem->phyrs = get_user_phyrs(addr, size, &umem->phyr_len);
umem->sgt = dma_map_phyrs(device, umem->phyrs, umem->phyr_len,
DMA_BIDIRECTIONAL, dma_attr);
...
dma_unmap_phyr(device, umem->phyrs, umem->phyr_len, umem->sgt->sgl,
umem->sgt->nents, DMA_BIDIRECTIONAL, dma_attr);
sg_free_table(umem->sgt);
free_user_phyrs(umem->phyrs, umem->phyr_len);

> > The question is whether this is the right kind of optimisation to be
> > doing.  I hear you that we want a dense format, but it's questionable
> > whether the kind of thing you're suggesting is actually denser than this
> > scheme.  For example, if we have 1GB pages and userspace happens to have
> > allocated pages (3, 4, 5, 6, 7, 8, 9, 10) then this can be represented
> > as a single phyr.  A power-of-two scheme would have us use four entries
> > (3, 4-7, 8-9, 10).
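
The four-entry figure quoted above can be checked with a minimal,
self-contained sketch in plain C. It is only an illustration of the
comparison, not a proposed kernel interface: struct chunk and
split_pow2() are hypothetical names. The function splits a run of
consecutive page numbers into naturally aligned power-of-two chunks,
whereas a single (addr, len) phyr covers the same run in one entry.

```c
#include <stddef.h>
#include <stdint.h>

/* One naturally aligned power-of-two run of pages (hypothetical type). */
struct chunk {
	uint64_t first; /* first page number in the chunk */
	uint64_t count; /* number of pages, always a power of two */
};

/*
 * Split the pages [first, first + count) into naturally aligned
 * power-of-two chunks. Writes at most max_out chunks and returns how
 * many were written.
 */
static size_t split_pow2(uint64_t first, uint64_t count,
			 struct chunk *out, size_t max_out)
{
	size_t n = 0;

	while (count > 0 && n < max_out) {
		/* Largest power of two dividing 'first' (any size if it is 0). */
		uint64_t size = first ? (first & (~first + 1))
				      : (UINT64_C(1) << 63);

		/* Shrink until the chunk fits in what is left. */
		while (size > count)
			size >>= 1;

		out[n].first = first;
		out[n].count = size;
		n++;

		first += size;
		count -= size;
	}
	return n;
}
```

For pages 3 through 10 this yields the chunks 3, 4-7, 8-9 and 10.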
> 
> That is not quite what I had in mind..
> 
> struct phyr_list {
>unsigned int first_page_offset_bytes;
>size_t total_length_bytes;
>phys_addr_t min_alignment;
>struct packed_phyr *list_of_pages;
> };
> 
> Where each 'packed_phyr' is an aligned page of some kind. The packing
> has to be able to represent any number of pfns, so we have four major
> cases:
>  - 4k pfns (use 8 bytes)
>  - Natural order pfn (use 8 bytes)
>  - 4k aligned pfns, arbitrary number (use 12 bytes)
>  - <4k aligned, arbitrary length (use 16 bytes?)
> 
> In all cases the interior pages are fully used, only the first and
> last page is sliced based on the two parameters in the phyr_list.

This kind of representation works for a virtually contiguous range.
Unfortunately, that's not sufficient for some bio users (as I discovered
after getting a representation like this enshrined in the NVMe spec as
the PRP List).

> The last case is, perhaps, a possible route to completely replace
> scatterlist. Few places need true byte granularity for interior pages,
> so we can invent some coding to say 'this is 8 byte aligned, and n
> bytes long' that only fits < 4k or something. Exceptional cases can
> then still work. I'm not sure what block needs here - is it just 512?

Replacing scatterlist is not my goal.  That seems like a lot more work
for little gain.  I just want to delete page_link, offset and length
from struct scatterlist.  Given the above sequence of calls, we're going
to get sg lists that aren't chained.  They may have to be vmalloced,
but they should be contiguous.

> I would imagine a few steps to this process:
>  1) 'phyr_list' datastructure, with chaining, pre-allocation, etc
>  2) Wrapper around existing gup to get a phyr_list for user VA
>  3) Compat 'dma_map_phyr()' that converts a phyr_list to a sgl and back
> (However, with full performance for iommu passthrough)
>  4) Patches changing RDMA/VFIO/DRM to this API
>  5) Patches optimizing get_user_phyr()
>  6) Patches implementing 

Re: Phyr Starter

2022-01-11 Thread Logan Gunthorpe



On 2022-01-11 1:17 a.m., John Hubbard wrote:
> On 1/10/22 11:34, Matthew Wilcox wrote:
>> TLDR: I want to introduce a new data type:
>>
>> struct phyr {
>>  phys_addr_t addr;
>>  size_t len;
>> };
>>
>> and use it to replace bio_vec as well as using it to replace the array
>> of struct pages used by get_user_pages() and friends.
>>
>> ---
> 
> This would certainly solve quite a few problems at once. Very compelling.

I agree.

> Zooming in on the pinning aspect for a moment: last time I attempted to
> convert O_DIRECT callers from gup to pup, I recall wanting very much to
> record, in each bio_vec, whether these pages were acquired via FOLL_PIN,
> or some non-FOLL_PIN method. Because at the end of the IO, it is not
> easy to disentangle which pages require put_page() and which require
> unpin_user_page*().
> 
> And changing the bio_vec for *that* purpose was not really acceptable.
> 
> But now that you're looking to change it in a big way (and with some
> spare bits available...oohh!), maybe I can go that direction after all.
> 
> Or, are you looking at a design in which any phyr is implicitly FOLL_PIN'd
> if it exists at all?

I'd also second being able to store a handful of flags in each phyr. My
userspace P2PDMA patchset needs to add a flag to each sgl to indicate
whether it was mapped as a bus address or not (which would be necessary
for the DMA mapped side dma_map_phyr).

Though, it's not immediately obvious where to put the flags without
increasing the size of the structure :(

Logan
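
To illustrate the size concern, here is a minimal, self-contained sketch
in plain C of one hypothetical way to carry a couple of flags without
growing the two-word structure. Nothing here was proposed in this thread:
the struct, the flag names and the choice to steal the top two bits of
the length are all assumptions made for illustration, and the sketch
assumes no range needs a length of 2^62 bytes or more.

```c
#include <stdint.h>

#define PHYR_FLAG_BITS 2
#define PHYR_LEN_BITS  (64 - PHYR_FLAG_BITS)
#define PHYR_LEN_MASK  ((UINT64_C(1) << PHYR_LEN_BITS) - 1)

/* Hypothetical flags, named after the two use cases in this thread. */
#define PHYR_FLAG_PINNED   UINT64_C(1) /* acquired via FOLL_PIN */
#define PHYR_FLAG_BUS_ADDR UINT64_C(2) /* mapped as a bus address */

/* Same size as an (addr, len) pair of 64-bit words. */
struct phyr_sketch {
	uint64_t addr;
	uint64_t len_and_flags;
};

static struct phyr_sketch phyr_make(uint64_t addr, uint64_t len,
				    uint64_t flags)
{
	struct phyr_sketch p = {
		.addr = addr,
		/* Length in the low bits, flags in the top two bits. */
		.len_and_flags = (len & PHYR_LEN_MASK) |
				 (flags << PHYR_LEN_BITS),
	};
	return p;
}

static uint64_t phyr_len(struct phyr_sketch p)
{
	return p.len_and_flags & PHYR_LEN_MASK;
}

static uint64_t phyr_flags(struct phyr_sketch p)
{
	return p.len_and_flags >> PHYR_LEN_BITS;
}
```

The cost of this kind of packing is that every reader of the length has
to go through an accessor, which is part of why the placement is not
obvious.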



Re: [PATCH] drm/nouveau/core/object: Fix the uninitialized use of "type"

2022-01-11 Thread Lyude Paul
Reviewed-by: Lyude Paul 

On Fri, 2021-12-17 at 18:56 -0800, Yizhuo Zhai wrote:
> In function nvkm_ioctl_map(), the variable "type" could be
> uninitialized if "nvkm_object_map()" returns an error code. However,
> the function does not check the return value and directly uses "type" in
> the if statement, which is potentially unsafe.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 01326050391c ("drm/nouveau/core/object: allow arguments to be passed to map function")
> Signed-off-by: Yizhuo Zhai 
> ---
>  drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> index 735cb6816f10..4264d9d79783 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> @@ -266,6 +266,8 @@ nvkm_ioctl_map(struct nvkm_client *client,
> ret = nvkm_object_map(object, data, size, &type,
>   &args->v0.handle,
>   &args->v0.length);
> +   if (ret)
> +   return ret;
> if (type == NVKM_OBJECT_MAP_IO)
> args->v0.type = NVIF_IOCTL_MAP_V0_IO;
> else

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat
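
The pattern the patch enforces can be shown outside the kernel. This is a
minimal sketch only: sketch_object_map() and sketch_ioctl_map() are
hypothetical stand-ins, not nouveau functions. The callee writes its
out-parameter only on success, so the caller has to check the return
value before reading it.

```c
#include <assert.h>
#include <errno.h>

enum sketch_map_type { SKETCH_MAP_IO, SKETCH_MAP_VA };

/* Hypothetical callee: *type is written only when 0 is returned. */
static int sketch_object_map(int fail, enum sketch_map_type *type)
{
	assert(type);
	if (fail)
		return -EINVAL;
	*type = SKETCH_MAP_IO;
	return 0;
}

/* Hypothetical caller: bail out before touching the out-parameter. */
static int sketch_ioctl_map(int fail, int *out_type)
{
	enum sketch_map_type type;
	int ret;

	ret = sketch_object_map(fail, &type);
	if (ret)
		return ret;	/* "type" is still uninitialized here */

	*out_type = (type == SKETCH_MAP_IO) ? 1 : 2;
	return 0;
}
```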



[PATCH 4/4] drm/i915/uapi: document behaviour for DG2 64K support

2022-01-11 Thread Robert Beckett
From: Matthew Auld 

On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.

v2: Fixed suggestions on formatting [Daniel]

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
Signed-off-by: Robert Beckett 
cc: Simon Ser 
cc: Pekka Paalanen 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: mesa-...@lists.freedesktop.org
Cc: Tony Ye 
Cc: Slawomir Milczarek 
---
 include/uapi/drm/i915_drm.h | 44 -
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 5e678917da70..486b7b96291e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
/**
 * When the EXEC_OBJECT_PINNED flag is specified this is populated by
 * the user with the GTT offset at which this object will be pinned.
+*
 * When the I915_EXEC_NO_RELOC flag is specified this must contain the
 * presumed_offset of the object.
+*
 * During execbuffer2 the kernel populates it with the value of the
 * current GTT offset of the object, for future presumed_offset writes.
+*
+* See struct drm_i915_gem_create_ext for the rules when dealing with
+* alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
+* minimum page sizes, like DG2.
 */
__u64 offset;
 
@@ -3145,11 +3151,39 @@ struct drm_i915_gem_create_ext {
 *
 * The (page-aligned) allocated size for the object will be returned.
 *
-* Note that for some devices we have might have further minimum
-* page-size restrictions(larger than 4K), like for device local-memory.
-* However in general the final size here should always reflect any
-* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
-* extension to place the object in device local-memory.
+*
+* **DG2 64K min page size implications:**
+*
+* On discrete platforms, starting from DG2, we have to contend with GTT
+* page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
+* objects.  Specifically the hardware only supports 64K or larger GTT
+* page sizes for such memory. The kernel will already ensure that all
+* I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
+* sizes underneath.
+*
+* Note that the returned size here will always reflect any required
+* rounding up done by the kernel, i.e. 4K will now become 64K on devices
+* such as DG2.
+*
+* **Special DG2 GTT address alignment requirement:**
+*
+* The GTT alignment will also need to be at least 2M for such objects.
+*
+* Note that due to how the hardware implements 64K GTT page support, we
+* have some further complications:
+*
+*   1) The entire PDE (which covers a 2MB virtual address range) must
+*   contain only 64K PTEs, i.e. mixing 4K and 64K PTEs in the same
+*   PDE is forbidden by the hardware.
+*
+*   2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
+*   objects.
+*
+* To keep things simple for userland, we mandate that any GTT mappings
+* must be aligned to and rounded up to 2MB. As this only wastes virtual
+* address space, spares userland from having to copy any needlessly
+* complicated PDE sharing scheme (coloring), and only affects DG2, this
+* is deemed to be a good compromise.
 */
__u64 size;
/**
-- 
2.25.1
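
For readers on the userspace side, the rules documented above reduce to
a few lines of arithmetic. This is a sketch under the assumptions stated
in the kerneldoc text (64K minimum page size, 2M GTT alignment and
padding for device-local objects on DG2); the helper names are
hypothetical and are not part of the uapi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_4K	(4ull << 10)
#define SKETCH_SZ_64K	(64ull << 10)
#define SKETCH_SZ_2M	(2ull << 20)

/* Round x up to a power-of-two alignment. */
static uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Object size as reported back after rounding to the 64K minimum. */
static uint64_t sketch_lmem_object_size(uint64_t requested)
{
	return sketch_round_up(requested, SKETCH_SZ_64K);
}

/* Virtual address range to reserve: rounded up to 2M. */
static uint64_t sketch_lmem_va_size(uint64_t requested)
{
	return sketch_round_up(requested, SKETCH_SZ_2M);
}

/* A GTT offset for a device-local object must be 2M-aligned. */
static bool sketch_lmem_offset_ok(uint64_t gtt_offset)
{
	return (gtt_offset & (SKETCH_SZ_2M - 1)) == 0;
}
```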



[PATCH 3/4] drm/i915: add gtt misalignment test

2022-01-11 Thread Robert Beckett
Add a test to check the handling of misaligned offsets and sizes.

Signed-off-by: Robert Beckett 
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 130 ++
 1 file changed, 130 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index fea031b4ec4f..28de0b333835 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -22,10 +22,12 @@
  *
  */
 
+#include "gt/intel_gtt.h"
 #include 
 #include 
 
 #include "gem/i915_gem_context.h"
+#include "gem/i915_gem_region.h"
 #include "gem/selftests/mock_context.h"
 #include "gt/intel_context.h"
 #include "gt/intel_gpu_commands.h"
@@ -1066,6 +1068,120 @@ static int shrink_boom(struct i915_address_space *vm,
return err;
 }
 
+static int misaligned_case(struct i915_address_space *vm, struct intel_memory_region *mr,
+  u64 addr, u64 size, unsigned long flags)
+{
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+   int err = 0;
+   u64 expected_vma_size, expected_node_size;
+
+   obj = i915_gem_object_create_region(mr, size, 0, 0);
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
+
+   vma = i915_vma_instance(obj, vm, NULL);
+   if (IS_ERR(vma)) {
+   err = PTR_ERR(vma);
+   goto err_put;
+   }
+
+   err = i915_vma_pin(vma, 0, 0, addr | flags);
+   if (err)
+   goto err_put;
+   i915_vma_unpin(vma);
+
+   if (!drm_mm_node_allocated(&vma->node)) {
+   err = -EINVAL;
+   goto err_put;
+   }
+
+   if (i915_vma_misplaced(vma, 0, 0, addr | flags)) {
+   err = -EINVAL;
+   goto err_put;
+   }
+
+   expected_vma_size = round_up(size, 1 << (ffs(vma->page_sizes.gtt) - 1));
+   expected_node_size = expected_vma_size;
+
+   if (IS_DG2(vm->i915) && i915_gem_object_is_lmem(obj)) {
+   /* dg2 should expand lmem node to 2MB */
+   expected_vma_size = round_up(size, I915_GTT_PAGE_SIZE_64K);
+   expected_node_size = round_up(size, I915_GTT_PAGE_SIZE_2M);
+   }
+
+   if (vma->size != expected_vma_size || vma->node.size != expected_node_size) {
+   err = i915_vma_unbind(vma);
+   err = -EBADSLT;
+   goto err_put;
+   }
+
+   err = i915_vma_unbind(vma);
+   if (err)
+   goto err_put;
+
+   GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
+
+err_put:
+   i915_gem_object_put(obj);
+   cleanup_freed_objects(vm->i915);
+   return err;
+}
+
+static int misaligned_pin(struct i915_address_space *vm,
+ u64 hole_start, u64 hole_end,
+ unsigned long end_time)
+{
+   struct intel_memory_region *mr;
+   enum intel_region_id id;
+   unsigned long flags = PIN_OFFSET_FIXED | PIN_USER;
+   int err = 0;
+   u64 hole_size = hole_end - hole_start;
+
+   if (i915_is_ggtt(vm))
+   flags |= PIN_GLOBAL;
+
+   for_each_memory_region(mr, vm->i915, id) {
+   u64 min_alignment = i915_vm_min_alignment(vm, id);
+   u64 size = min_alignment;
+   u64 addr = round_up(hole_start + (hole_size / 2), min_alignment);
+
+   /* we can't test < 4k alignment due to flags being encoded in lower bits */
+   if (min_alignment != I915_GTT_PAGE_SIZE_4K) {
+   err = misaligned_case(vm, mr, addr + (min_alignment / 2), size, flags);
+   /* misaligned should error with -EINVAL */
+   if (!err)
+   err = -EBADSLT;
+   if (err != -EINVAL)
+   return err;
+   }
+
+   /* test for vma->size expansion to min page size */
+   err = misaligned_case(vm, mr, addr, PAGE_SIZE, flags);
+   if (min_alignment > hole_size) {
+   if (!err)
+   err = -EBADSLT;
+   else if (err == -ENOSPC)
+   err = 0;
+   }
+   if (err)
+   return err;
+
+   /* test for intermediate size not expanding vma->size for large alignments */
+   err = misaligned_case(vm, mr, addr, size / 2, flags);
+   if (min_alignment > hole_size) {
+   if (!err)
+   err = -EBADSLT;
+   else if (err == -ENOSPC)
+   err = 0;
+   }
+   if (err)
+   return err;
+   }
+
+   return 0;
+}
+
 static int exercise_ppgtt(struct drm_i915_private *dev_priv,
  int (*func)(struct i915_address_space *vm,
  u64 hole_start, u64 hole_end,
@@ -11

[PATCH 2/4] drm/i915: support 64K GTT pages for discrete cards

2022-01-11 Thread Robert Beckett
From: Matthew Auld 

discrete cards optimise 64K GTT pages for local-memory, since everything
should be allocated at 64K granularity. We say goodbye to sparse
entries, and instead get a compact 256B page-table for 64K pages,
which should be more cache friendly. 4K pages for local-memory
are no longer supported by the HW.
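
The 256B figure follows from the numbers used elsewhere in this series:
one page table covers a 2MB range, so with 64K pages it needs 32 entries.
The sketch below works that out and mirrors the divide-by-16 index
conversion done for compact page tables. The 8-byte PTE size is inferred
from 256B / 32 entries, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PT_SPAN		(2ull << 20)	/* VA range covered by one PT */
#define SKETCH_PTE_BYTES	8ull		/* inferred: 256B / 32 entries */
#define SKETCH_SZ_4K		(4ull << 10)
#define SKETCH_SZ_64K		(64ull << 10)

/* Number of PTEs needed to cover one page table's span. */
static uint64_t sketch_pt_entries(uint64_t page_size)
{
	assert(page_size && SKETCH_PT_SPAN % page_size == 0);
	return SKETCH_PT_SPAN / page_size;
}

/* Bytes of page table actually used for that page size. */
static uint64_t sketch_pt_bytes(uint64_t page_size)
{
	return sketch_pt_entries(page_size) * SKETCH_PTE_BYTES;
}

/* Convert a 4K-granular PTE index into a compact (64K) PTE index. */
static unsigned int sketch_compact_index(unsigned int pte_4k)
{
	assert(pte_4k % 16 == 0);	/* must start on a 64K boundary */
	return pte_4k / 16;
}
```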

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  60 ++
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 109 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   3 +
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |   1 +
 4 files changed, 170 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 11f0aa65f8a3..ef3439b290ca 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1483,6 +1483,65 @@ static int igt_ppgtt_sanity_check(void *arg)
return err;
 }
 
+static int igt_ppgtt_compact(void *arg)
+{
+   struct drm_i915_private *i915 = arg;
+   struct drm_i915_gem_object *obj;
+   int err;
+
+   /*
+* Simple test to catch issues with compact 64K pages -- since the pt is
+* compacted to 256B that gives us 32 entries per pt, however since the
+* backing page for the pt is 4K, any extra entries we might incorrectly
+* write out should be ignored by the HW. If we ever hit such a case, this
+* test should catch it since some of our writes would land in scratch.
+*/
+
+   if (!HAS_64K_PAGES(i915)) {
+   pr_info("device lacks compact 64K page support, skipping\n");
+   return 0;
+   }
+
+   if (!HAS_LMEM(i915)) {
+   pr_info("device lacks LMEM support, skipping\n");
+   return 0;
+   }
+
+   /* We want the range to cover multiple page-table boundaries. */
+   obj = i915_gem_object_create_lmem(i915, SZ_4M, 0);
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
+
+   err = i915_gem_object_pin_pages_unlocked(obj);
+   if (err)
+   goto out_put;
+
+   if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_64K) {
+   pr_info("LMEM compact unable to allocate huge-page(s)\n");
+   goto out_unpin;
+   }
+
+   /*
+* Disable 2M GTT pages by forcing the page-size to 64K for the GTT
+* insertion.
+*/
+   obj->mm.page_sizes.sg = I915_GTT_PAGE_SIZE_64K;
+
+   err = igt_write_huge(i915, obj);
+   if (err)
+   pr_err("LMEM compact write-huge failed\n");
+
+out_unpin:
+   i915_gem_object_unpin_pages(obj);
+out_put:
+   i915_gem_object_put(obj);
+
+   if (err == -ENOMEM)
+   err = 0;
+
+   return err;
+}
+
 static int igt_tmpfs_fallback(void *arg)
 {
struct drm_i915_private *i915 = arg;
@@ -1740,6 +1799,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *i915)
SUBTEST(igt_tmpfs_fallback),
SUBTEST(igt_ppgtt_smoke_huge),
SUBTEST(igt_ppgtt_sanity_check),
+   SUBTEST(igt_ppgtt_compact),
};
 
if (!HAS_PPGTT(i915)) {
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index b012c50f7ce7..8d081497e87e 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -233,6 +233,8 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
   start, end, lvl);
} else {
unsigned int count;
+   unsigned int pte = gen8_pd_index(start, 0);
+   unsigned int num_ptes;
u64 *vaddr;
 
count = gen8_pt_count(start, end);
@@ -242,10 +244,18 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
atomic_read(&pt->used));
GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
+   num_ptes = count;
+   if (pt->is_compact) {
+   GEM_BUG_ON(num_ptes % 16);
+   GEM_BUG_ON(pte % 16);
+   num_ptes /= 16;
+   pte /= 16;
+   }
+
vaddr = px_vaddr(pt);
-   memset64(vaddr + gen8_pd_index(start, 0),
+   memset64(vaddr + pte,
 vm->scratch[0]->encode,
-count);
+num_ptes);
 
atomic_sub(count, &pt->used);
start += count;
@@ -453,6 +463,96 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,

[PATCH 1/4] drm/i915: enforce min GTT alignment for discrete cards

2022-01-11 Thread Robert Beckett
From: Matthew Auld 

For local-memory objects we need to align the GTT addresses
to 64K, both for the ppgtt and ggtt.

We need to support vm->min_alignment > 4K, depending
on the vm itself and the type of object we are inserting.
With this in mind update the GTT selftests to take this
into account.

For DG2 we further align and pad lmem object GTT addresses
to 2MB to ensure PDEs contain consistent page sizes as
required by the HW.

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
Signed-off-by: Robert Beckett 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 .../i915/gem/selftests/i915_gem_client_blt.c  | 23 +++--
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 14 +++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  9 ++
 drivers/gpu/drm/i915/i915_vma.c   | 14 +++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 96 ---
 5 files changed, 115 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
index c08f766e6e15..7fee95a65414 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
@@ -39,6 +39,7 @@ struct tiled_blits {
struct blit_buffer scratch;
struct i915_vma *batch;
u64 hole;
+   u64 align;
u32 width;
u32 height;
 };
@@ -410,14 +411,21 @@ tiled_blits_create(struct intel_engine_cs *engine, struct rnd_state *prng)
goto err_free;
}
 
-   hole_size = 2 * PAGE_ALIGN(WIDTH * HEIGHT * 4);
+   t->align = I915_GTT_PAGE_SIZE_2M; /* XXX worst case, derive from vm! */
+   t->align = max(t->align,
+  i915_vm_min_alignment(t->ce->vm, INTEL_MEMORY_LOCAL));
+   t->align = max(t->align,
+  i915_vm_min_alignment(t->ce->vm, INTEL_MEMORY_SYSTEM));
+
+   hole_size = 2 * round_up(WIDTH * HEIGHT * 4, t->align);
hole_size *= 2; /* room to maneuver */
-   hole_size += 2 * I915_GTT_MIN_ALIGNMENT;
+   hole_size += 2 * t->align; /* padding on either side */
 
mutex_lock(&t->ce->vm->mutex);
memset(&hole, 0, sizeof(hole));
err = drm_mm_insert_node_in_range(&t->ce->vm->mm, &hole,
- hole_size, 0, I915_COLOR_UNEVICTABLE,
+ hole_size, t->align,
+ I915_COLOR_UNEVICTABLE,
  0, U64_MAX,
  DRM_MM_INSERT_BEST);
if (!err)
@@ -428,7 +436,7 @@ tiled_blits_create(struct intel_engine_cs *engine, struct rnd_state *prng)
goto err_put;
}
 
-   t->hole = hole.start + I915_GTT_MIN_ALIGNMENT;
+   t->hole = hole.start + t->align;
pr_info("Using hole at %llx\n", t->hole);
 
err = tiled_blits_create_buffers(t, WIDTH, HEIGHT, prng);
@@ -455,7 +463,7 @@ static void tiled_blits_destroy(struct tiled_blits *t)
 static int tiled_blits_prepare(struct tiled_blits *t,
   struct rnd_state *prng)
 {
-   u64 offset = PAGE_ALIGN(t->width * t->height * 4);
+   u64 offset = round_up(t->width * t->height * 4, t->align);
u32 *map;
int err;
int i;
@@ -486,8 +494,7 @@ static int tiled_blits_prepare(struct tiled_blits *t,
 
 static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
 {
-   u64 offset =
-   round_up(t->width * t->height * 4, 2 * I915_GTT_MIN_ALIGNMENT);
+   u64 offset = round_up(t->width * t->height * 4, 2 * t->align);
int err;
 
/* We want to check position invariant tiling across GTT eviction */
@@ -500,7 +507,7 @@ static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
 
/* Reposition so that we overlap the old addresses, and slightly off */
err = tiled_blit(t,
-&t->buffers[2], t->hole + I915_GTT_MIN_ALIGNMENT,
+&t->buffers[2], t->hole + t->align,
 &t->buffers[1], t->hole + 3 * offset / 2);
if (err)
return err;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index a94be0306464..156852dcf33a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -219,6 +219,20 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 
GEM_BUG_ON(!vm->total);
drm_mm_init(&vm->mm, 0, vm->total);
+
+   memset64(vm->min_alignment, I915_GTT_MIN_ALIGNMENT,
+ARRAY_SIZE(vm->min_alignment));
+
+   if (HAS_64K_PAGES(vm->i915)) {
+   if (IS_DG2(vm->i915)) {
+   vm->min_alignment[INTEL_MEMORY_LOCAL] = I915_GTT_PAGE_SIZE_2M;
+   vm->min_alignment[INTEL_MEMORY_STOLEN_LOCAL] = I915_GTT_PAGE_SIZE_2M;
+   } else

[PATCH 0/4] discrete card 64K page support

2022-01-11 Thread Robert Beckett
This series continues support for 64K pages for discrete cards.
It supersedes the 64K patches from 
https://patchwork.freedesktop.org/series/95686/#rev4
Changes since that series:

- set min alignment for DG2 to 2MB in i915_address_space_init
- replace coloring with simpler 2MB VA alignment for lmem buffers
- enforce alignment to 2MB for lmem objects on DG2 in i915_vma_insert
- expand vma reservation to round up to 2MB on DG2 in i915_vma_insert
- add alignment test


Matthew Auld (3):
  drm/i915: enforce min GTT alignment for discrete cards
  drm/i915: support 64K GTT pages for discrete cards
  drm/i915/uapi: document behaviour for DG2 64K support

Robert Beckett (1):
  drm/i915: add gtt misalignment test

 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  60 +
 .../i915/gem/selftests/i915_gem_client_blt.c  |  23 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 109 -
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  14 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  12 +
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |   1 +
 drivers/gpu/drm/i915/i915_vma.c   |  14 ++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 226 +++---
 include/uapi/drm/i915_drm.h   |  44 +++-
 9 files changed, 454 insertions(+), 49 deletions(-)

-- 
2.25.1



[PATCH] drm/i915: Lock timeline mutex directly in error path of eb_pin_timeline

2022-01-11 Thread Matthew Brost
Don't use the interruptible version of the timeline mutex lock in the
error path of eb_pin_timeline as the cleanup must always happen.

v2:
 (John Harrison)
  - Don't check for interrupt during mutex lock
v3:
 (Tvrtko)
  - A comment explaining why lock helper isn't used

Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9e221ce427075..4a611d62e991a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2518,9 +2518,14 @@ static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context *ce,
  timeout) < 0) {
i915_request_put(rq);
 
-   tl = intel_context_timeline_lock(ce);
+   /*
+* Error path, cannot use intel_context_timeline_lock as
+* that is user interruptible and this clean up step
+* must be done.
+*/
+   mutex_lock(&ce->timeline->mutex);
intel_context_exit(ce);
-   intel_context_timeline_unlock(tl);
+   mutex_unlock(&ce->timeline->mutex);
 
if (nonblock)
return -EWOULDBLOCK;
-- 
2.34.1



[PATCH] drm/i915: Flip guc_id allocation partition

2022-01-11 Thread Matthew Brost
Move the multi-lrc guc_id from the lower allocation partition (0 to
number of multi-lrc guc_ids) to upper allocation partition (number of
single-lrc to max guc_ids).
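
The resulting ID layout can be sketched with plain arithmetic. This is an
illustration only, assuming the 1/16 split from NUMBER_MULTI_LRC_GUC_ID
in the diff below; the helper names are hypothetical. Single-lrc IDs
occupy [0, single) and multi-lrc IDs occupy [single, num_guc_ids), so a
bitmap index is turned into a guc_id by adding the single-lrc count.

```c
#include <assert.h>
#include <stdbool.h>

/* 1/16 of the ID space is reserved for multi-lrc, as in the patch. */
static unsigned int sketch_num_multi_lrc(unsigned int num_guc_ids)
{
	return num_guc_ids / 16;
}

static unsigned int sketch_num_single_lrc(unsigned int num_guc_ids)
{
	return num_guc_ids - sketch_num_multi_lrc(num_guc_ids);
}

/*
 * Translate a bitmap index (or a negative error code from the bitmap
 * search) into a guc_id in the upper partition.
 */
static int sketch_mlrc_guc_id(unsigned int num_guc_ids, int bitmap_index)
{
	if (bitmap_index < 0)
		return bitmap_index;	/* propagate the error unchanged */
	assert((unsigned int)bitmap_index < sketch_num_multi_lrc(num_guc_ids));
	return bitmap_index + (int)sketch_num_single_lrc(num_guc_ids);
}

static bool sketch_is_multi_lrc_id(unsigned int num_guc_ids, unsigned int id)
{
	return id >= sketch_num_single_lrc(num_guc_ids) && id < num_guc_ids;
}
```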

Signed-off-by: Matthew Brost 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 57 ++-
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 9989d121127df..1bacc9621cea8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -147,6 +147,8 @@ guc_create_parallel(struct intel_engine_cs **engines,
  */
 #define NUMBER_MULTI_LRC_GUC_ID(guc)   \
((guc)->submission_state.num_guc_ids / 16)
+#define NUMBER_SINGLE_LRC_GUC_ID(guc)  \
+   ((guc)->submission_state.num_guc_ids - NUMBER_MULTI_LRC_GUC_ID(guc))
 
 /*
  * Below is a set of functions which control the GuC scheduling state which
@@ -1776,11 +1778,6 @@ int intel_guc_submission_init(struct intel_guc *guc)
INIT_WORK(&guc->submission_state.destroyed_worker,
  destroyed_worker_func);
 
-   guc->submission_state.guc_ids_bitmap =
-   bitmap_zalloc(NUMBER_MULTI_LRC_GUC_ID(guc), GFP_KERNEL);
-   if (!guc->submission_state.guc_ids_bitmap)
-   return -ENOMEM;
-
spin_lock_init(&guc->timestamp.lock);
INIT_DELAYED_WORK(&guc->timestamp.work, guc_timestamp_ping);
guc->timestamp.ping_delay = (POLL_TIME_CLKS / gt->clock_frequency + 1) * HZ;
@@ -1796,7 +1793,8 @@ void intel_guc_submission_fini(struct intel_guc *guc)
guc_flush_destroyed_contexts(guc);
guc_lrc_desc_pool_destroy(guc);
i915_sched_engine_put(guc->sched_engine);
-   bitmap_free(guc->submission_state.guc_ids_bitmap);
+   if (guc->submission_state.guc_ids_bitmap)
+   bitmap_free(guc->submission_state.guc_ids_bitmap);
 }
 
 static inline void queue_request(struct i915_sched_engine *sched_engine,
@@ -1863,6 +1861,33 @@ static void guc_submit_request(struct i915_request *rq)
spin_unlock_irqrestore(&sched_engine->lock, flags);
 }
 
+static int new_mlrc_guc_id(struct intel_guc *guc, struct intel_context *ce)
+{
+   int ret;
+
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+   GEM_BUG_ON(!guc->submission_state.guc_ids_bitmap);
+
+   ret =  bitmap_find_free_region(guc->submission_state.guc_ids_bitmap,
+  NUMBER_MULTI_LRC_GUC_ID(guc),
+  order_base_2(ce->parallel.number_children
+   + 1));
+   if (likely(!(ret < 0)))
+   ret += NUMBER_SINGLE_LRC_GUC_ID(guc);
+
+   return ret;
+}
+
+static int new_slrc_guc_id(struct intel_guc *guc, struct intel_context *ce)
+{
+   GEM_BUG_ON(intel_context_is_parent(ce));
+
+   return ida_simple_get(&guc->submission_state.guc_ids,
+ 0, NUMBER_SINGLE_LRC_GUC_ID(guc),
+ GFP_KERNEL | __GFP_RETRY_MAYFAIL |
+ __GFP_NOWARN);
+}
+
 static int new_guc_id(struct intel_guc *guc, struct intel_context *ce)
 {
int ret;
@@ -1870,16 +1895,10 @@ static int new_guc_id(struct intel_guc *guc, struct intel_context *ce)
GEM_BUG_ON(intel_context_is_child(ce));
 
if (intel_context_is_parent(ce))
-   ret = bitmap_find_free_region(guc->submission_state.guc_ids_bitmap,
- NUMBER_MULTI_LRC_GUC_ID(guc),
- order_base_2(ce->parallel.number_children
-  + 1));
+   ret = new_mlrc_guc_id(guc, ce);
else
-   ret = ida_simple_get(&guc->submission_state.guc_ids,
-NUMBER_MULTI_LRC_GUC_ID(guc),
-guc->submission_state.num_guc_ids,
-GFP_KERNEL | __GFP_RETRY_MAYFAIL |
-__GFP_NOWARN);
+   ret = new_slrc_guc_id(guc, ce);
+
if (unlikely(ret < 0))
return ret;
 
@@ -1989,6 +2008,14 @@ static int pin_guc_id(struct intel_guc *guc, struct intel_context *ce)
 
GEM_BUG_ON(atomic_read(&ce->guc_id.ref));
 
+   if (unlikely(intel_context_is_parent(ce) &&
+!guc->submission_state.guc_ids_bitmap)) {
+   guc->submission_state.guc_ids_bitmap =
+   bitmap_zalloc(NUMBER_MULTI_LRC_GUC_ID(guc), GFP_KERNEL);
+   if (!guc->submission_state.guc_ids_bitmap)
+   return -ENOMEM;
+   }
+
 try_again:
spin_lock_irqsave(&guc->submission_state.lock, flags);
 
-- 
2.34.1



Re: [Patch v4 23/24] drm/amdkfd: CRIU prepare for svm resume

2022-01-11 Thread philip yang

  


On 2022-01-10 6:58 p.m., Felix Kuehling wrote:

On 2022-01-05 9:43 a.m., philip yang wrote:

On 2021-12-22 7:37 p.m., Rajneesh Bhardwaj wrote:

During CRIU restore phase, the VMAs for the virtual address ranges are
not at their final location yet so in this stage, only cache the data
required to successfully resume the svm ranges during an imminent CRIU
resume phase.

Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  5 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 99
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     | 12 +++
 4 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 916b8d000317..f7aa15b18f95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2638,8 +2638,8 @@ static int criu_restore_objects(struct file *filep,
 goto exit;
 break;
 case KFD_CRIU_OBJECT_TYPE_SVM_RANGE:
-    /* TODO: Implement SVM range */
-    *priv_offset += sizeof(struct kfd_criu_svm_range_priv_data);
+    ret = kfd_criu_restore_svm(p, (uint8_t __user *)args->priv_data,
+ priv_offset, max_priv_data_size);
 if (ret)
 goto exit;
 break;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 87eb6739a78e..92191c541c29 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -790,6 +790,7 @@ struct svm_range_list {
 struct list_head    list;
 struct work_struct    deferred_list_work;
 struct list_head    deferred_range_list;
+    struct list_head    criu_svm_metadata_list;
 spinlock_t    deferred_list_lock;
 atomic_t    evicted_ranges;
 bool    drain_pagefaults;
@@ -1148,6 +1149,10 @@ int kfd_criu_restore_event(struct file *devkfd,
 uint8_t __user *user_priv_data,
 uint64_t *priv_data_offset,
 uint64_t max_priv_data_size);
+int kfd_criu_restore_svm(struct kfd_process *p,
+ uint8_t __user *user_priv_data,
+ uint64_t *priv_data_offset,
+ uint64_t max_priv_data_size);
 /* CRIU - End */

 /* Queue Context Management */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 6d59f1bedcf2..e9f6c63c2a26 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -45,6 +45,14 @@
  */
 #define AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING    2000

+struct criu_svm_metadata {
+    struct list_head list;
+    __u64 start_addr;
+    __u64 size;
+    /* Variable length array of attributes */
+    struct kfd_ioctl_svm_attribute attrs[0];
+};

This data structure is struct kfd_criu_svm_range_priv_data plus
list_head, maybe you can add list_head to struct
kfd_criu_svm_range_priv_data and remove this new data structure,
then you can remove extra kzalloc, kfree for each svm object
resume and function svm_criu_prepare_for_resume could be
removed. 
  
  Adding list_head to the private structure is a bad idea, because
  that structure is copied to/from user mode. Kernel mode po

Re: [PATCH v2 01/14] drm/edid: Don't clear YUV422 if using deep color

2022-01-11 Thread Maxime Ripard
Hi Ville,

Thanks for your review

On Wed, Dec 15, 2021 at 03:48:39PM +0200, Ville Syrjälä wrote:
> On Wed, Dec 15, 2021 at 01:43:53PM +0100, Maxime Ripard wrote:
> > The current code, when parsing the EDID Deep Color depths, assumes that
> > YUV422 cannot be used, referring to the HDMI 1.3 Specification.
> > 
> > This specification, in its section 6.2.4, indeed states:
> > 
> >   For each supported Deep Color mode, RGB 4:4:4 shall be supported and
> >   optionally YCBCR 4:4:4 may be supported.
> > 
> >   YCBCR 4:2:2 is not permitted for any Deep Color mode.
> > 
> > This indeed can be interpreted like the code does, but the HDMI 1.4
> > specification further clarifies that statement in its section 6.2.4:
> > 
> >   For each supported Deep Color mode, RGB 4:4:4 shall be supported and
> >   optionally YCBCR 4:4:4 may be supported.
> > 
> >   YCBCR 4:2:2 is also 36-bit mode but does not require the further use
> >   of the Deep Color modes described in section 6.5.2 and 6.5.3.
> > 
> > This means that, even though YUV422 can be used with 12 bit per color,
> > it shouldn't be treated as a deep color mode.
> > 
> > This deviates from the interpretation of the code and comment, so let's
> > fix those.
> > 
> > Fixes: d0c94692e0a3 ("drm/edid: Parse and handle HDMI deep color modes.")
> > Signed-off-by: Maxime Ripard 
> > ---
> >  drivers/gpu/drm/drm_edid.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 12893e7be89b..e57d1b8cdaaa 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -5106,10 +5106,9 @@ static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
> >  
> > /*
> >  * Deep color support mandates RGB444 support for all video
> > -* modes and forbids YCRCB422 support for all video modes per
> > -* HDMI 1.3 spec.
> > +* modes.
> >  */
> > -   info->color_formats = DRM_COLOR_FORMAT_RGB444;
> > +   info->color_formats |= DRM_COLOR_FORMAT_RGB444;
> >  
> > /* YCRCB444 is optional according to spec. */
> > if (hdmi[6] & DRM_EDID_HDMI_DC_Y444) {
> 
> This whole code seems pretty much wrong. What it tries to do (I think)
> is make sure we don't use deep color with YCbCr 4:4:4 unless supported.
> But what it actually does is also disable YCbCr 4:4:4 8bpc when deep
> color is not supported for YCbCr 4:4:4.
> 
> I think what we want is to just get rid of this color_formats stuff here
> entirely and instead have some kind of separate tracking of RGB 4:4:4 vs.
> YCbCr 4:4:4 deep color capabilities.

I'm sorry, I'm not entirely sure I understand what you're suggesting
here. Do you want to get rid of info->color_formats entirely?
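
For reference, the behavioural difference in the hunk above is just
assignment versus OR on a bitmask. A minimal sketch, with hypothetical
flag values standing in for the DRM_COLOR_FORMAT_* bits:

```c
#include <assert.h>

/* Hypothetical stand-ins for the DRM_COLOR_FORMAT_* bits. */
#define SKETCH_FMT_RGB444	(1u << 0)
#define SKETCH_FMT_YCBCR444	(1u << 1)
#define SKETCH_FMT_YCBCR422	(1u << 2)

/* Old behaviour: plain assignment drops every format parsed earlier. */
static unsigned int sketch_deep_color_old(unsigned int formats)
{
	assert(formats <= (SKETCH_FMT_RGB444 | SKETCH_FMT_YCBCR444 |
			   SKETCH_FMT_YCBCR422));
	formats = SKETCH_FMT_RGB444;
	return formats;
}

/* Patched behaviour: OR keeps earlier formats, including YCbCr 4:2:2. */
static unsigned int sketch_deep_color_new(unsigned int formats)
{
	assert(formats <= (SKETCH_FMT_RGB444 | SKETCH_FMT_YCBCR444 |
			   SKETCH_FMT_YCBCR422));
	return formats | SKETCH_FMT_RGB444;
}
```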

Maxime


signature.asc
Description: PGP signature


Re: [Intel-gfx] [PATCH 1/2] drm/dp: note that DPCD 0x2002-0x2003 match 0x200-0x201

2022-01-11 Thread Jani Nikula
On Mon, 10 Jan 2022, Ville Syrjälä  wrote:
> On Tue, Jan 04, 2022 at 08:48:56PM +0200, Jani Nikula wrote:
>> DP_SINK_COUNT_ESI and DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0 have the same
>> contents as DP_SINK_COUNT and DP_DEVICE_SERVICE_IRQ_VECTOR,
>> respectively.
>
> IIRC there was an oversight in the earlier spec revisions that
> showed bit 7 as reserved for one of the locations. But looks like
> that got fixed.

Yeah. Thanks for the review, pushed both to drm-misc-next.

BR,
Jani.


>
> Reviewed-by: Ville Syrjälä 
>
>> 
>> Signed-off-by: Jani Nikula 
>> ---
>>  include/drm/drm_dp_helper.h | 7 ++-
>>  1 file changed, 2 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
>> index 30359e434c3f..98d020835b49 100644
>> --- a/include/drm/drm_dp_helper.h
>> +++ b/include/drm/drm_dp_helper.h
>> @@ -1038,11 +1038,8 @@ struct drm_panel;
>>  #define DP_SIDEBAND_MSG_UP_REQ_BASE 0x1600   /* 1.2 MST */
>>  
>>  /* DPRX Event Status Indicator */
>> -#define DP_SINK_COUNT_ESI   0x2002   /* 1.2 */
>> -/* 0-5 sink count */
>> -# define DP_SINK_COUNT_CP_READY (1 << 6)
>> -
>> -#define DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0   0x2003   /* 1.2 */
>> +#define DP_SINK_COUNT_ESI   0x2002   /* same as 0x200 */
>> +#define DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0   0x2003   /* same as 0x201 */
>>  
>>  #define DP_DEVICE_SERVICE_IRQ_VECTOR_ESI1   0x2004   /* 1.2 */
>>  # define DP_RX_GTC_MSTR_REQ_STATUS_CHANGE(1 << 0)
>> -- 
>> 2.30.2

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Patch v4 18/24] drm/amdkfd: CRIU checkpoint and restore xnack mode

2022-01-11 Thread philip yang

  


On 2022-01-10 7:10 p.m., Felix Kuehling wrote:

On 2022-01-05 10:22 a.m., philip yang wrote:

On 2021-12-22 7:37 p.m., Rajneesh Bhardwaj wrote:

Recoverable page faults are represented by the xnack mode setting inside
a kfd process and are used to represent the device page faults. For CR,
we don't consider negative values which are typically used for querying
the current xnack mode without modifying it.

Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 15 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  1 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 178b0ccfb286..446eb9310915 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1845,6 +1845,11 @@ static int criu_checkpoint_process(struct kfd_process *p,
 memset(&process_priv, 0, sizeof(process_priv));

 process_priv.version = KFD_CRIU_PRIV_VERSION;
+    /* For CR, we don't consider negative xnack mode which is used for
+ * querying without changing it, here 0 simply means disabled and 1
+ * means enabled so retry for finding a valid PTE.
+ */
  

The negative value used to query the xnack mode belongs to the
kfd_ioctl_set_xnack_mode user space ioctl interface, which is
not used by CRIU, so I think this comment is misleading.

+    process_priv.xnack_mode =
  p->xnack_enabled ? 1 : 0;
  

change to process_priv.xnack_enabled

    ret =
  copy_to_user(user_priv_data + *priv_offset,
  
    &process_priv, sizeof(process_priv));
  
  @@ -2231,6 +2236,16 @@ static int criu_restore_process(struct
  kfd_process *p,
  
    return -EINVAL;
  
    }
  
    +    pr_debug("Setting XNACK mode\n");
  
  +    if (process_priv.xnack_mode &&
  !kfd_process_xnack_mode(p, true)) {
  
  +    pr_err("xnack mode cannot be set\n");
  
  +    ret = -EPERM;
  
  +    goto exit;
  
  +    } else {
  


On GFXv9 GPUs except Aldebaran, this means a process checkpointed
with xnack off can restore and resume on a GPU with xnack on. The
shader will then continue running successfully, but the driver is
not guaranteed to keep svm ranges mapped on the GPU all the time,
so if a retry fault happens the shader will not recover. Maybe
change to:


if (KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 2)) {


  
  The code here was correct. The xnack mode applies to the whole
  process, not just one GPU. The logic for checking the capabilities
  of all GPUs is already in kfd_process_xnack_mode. If XNACK cannot
  be supported by all GPUs, restoring a non-0 XNACK mode will fail.
  
  
  Any GPU can run in XNACK-disabled mode. So we don't need any
  limitations for process_priv.xnack_enabled == 0.
  

Yes, the code was correct. When all GPUs have dev->noretry=0
  (xnack on) and process->xnack_enabled=0, we unmap the queues while
  migrating, which guarantees the svm ranges are mapped on the GPUs
  before the queues resume. If a retry fault still happens, we don't
  recover it; we report the fault to user space. That is all correct.
Regards,
Philip
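
The rule Felix describes can be sketched as a small user-space model in C. This is not the amdkfd code: `struct gpu_model`, `struct process_model`, `model_xnack_supported()` and `model_restore_xnack()` are hypothetical names, and the per-GPU capability check is assumed to reduce to a single boolean.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real amdkfd types. */
struct gpu_model {
	bool supports_xnack;	/* assumption: one flag summarises the per-GPU check */
};

struct process_model {
	const struct gpu_model *gpus;
	size_t n_gpus;
	bool xnack_enabled;
};

/* XNACK is a per-process setting: usable only if every GPU supports it. */
bool model_xnack_supported(const struct process_model *p)
{
	for (size_t i = 0; i < p->n_gpus; i++)
		if (!p->gpus[i].supports_xnack)
			return false;
	return true;
}

/*
 * Restore the checkpointed mode. Mode 0 (disabled) always succeeds because
 * any GPU can run with XNACK off; a non-zero mode fails with -EPERM unless
 * all GPUs support it.
 */
int model_restore_xnack(struct process_model *p, int saved_mode)
{
	if (saved_mode && !model_xnack_supported(p))
		return -EPERM;
	p->xnack_enabled = saved_mode != 0;
	return 0;
}
```

In this model no extra per-GPU condition is needed for the disabled case, which is the point of the reply above.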


  
  Regards,
  
    Felix
  
  
  
      if (process_priv.xnack_enabled !=
kfd_process_xnack_mode(p, true)) {


 pr_err("xnack mode cannot be set\n");


 ret = -EPERM;


 goto exit;


    }


}


pr_debug("set xnack mode: %d\n", process_priv.xnack_enabled);


p->xnack_enabled = process_priv.xnack_enabled;



+    pr_debug("set xnack mode:
  %d\n", process_priv.xnack_mode);
  
  +    p->xnack_enabled = process_priv.xnack_mode;
  
  +    }
  
  +
  
    exit:
  
 

Re: [PATCH] drm/atomic: Check new_crtc_state->active to determine if CRTC needs disable in self refresh mode

2022-01-11 Thread Alex Deucher
Pushed out to drm-misc-next-fixes.

Alex

On Fri, Jan 7, 2022 at 9:07 PM Liu Ying  wrote:
>
> On Fri, 2022-01-07 at 14:53 -0500, Alex Deucher wrote:
> > On Wed, Dec 29, 2021 at 11:07 PM Liu Ying  wrote:
> > >
> > > Actual hardware state of CRTC is controlled by the member 'active'
> > > in
> > > struct drm_crtc_state instead of the member 'enable', according to
> > > the
> > > kernel doc of the member 'enable'.  In fact, the drm client modeset
> > > and atomic helpers are using the member 'active' to do the control.
> > >
> > > Referencing the member 'enable' of new_crtc_state, the function
> > > crtc_needs_disable() may fail to reflect if CRTC needs disable in
> > > self refresh mode, e.g., when the framebuffer emulation will be
> > > blanked
> > > through the client modeset helper with the next commit, the member
> > > 'enable' of new_crtc_state is still true while the member 'active'
> > > is
> > > false, hence the relevant potential encoder and bridges won't be
> > > disabled.
> > >
> > > So, let's check new_crtc_state->active to determine if CRTC needs
> > > disable
> > > in self refresh mode instead of new_crtc_state->enable.
> > >
> > > Fixes: 1452c25b0e60 ("drm: Add helpers to kick off self refresh
> > > mode in drivers")
> > > Cc: Sean Paul 
> > > Cc: Rob Clark 
> > > Cc: Maarten Lankhorst 
> > > Cc: Maxime Ripard 
> > > Cc: Thomas Zimmermann 
> > > Cc: David Airlie 
> > > Cc: Daniel Vetter 
> > > Signed-off-by: Liu Ying 
> >
> > Reviewed-by: Alex Deucher 
> >
> > Do you need someone to push this for you?
>
> Yes, please.  Thanks.
>
> Liu Ying
>
> >
> > Alex
> >
> > > ---
> > >  drivers/gpu/drm/drm_atomic_helper.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c
> > > b/drivers/gpu/drm/drm_atomic_helper.c
> > > index a7a05e1e26bb..9603193d2fa1 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -1016,7 +1016,7 @@ crtc_needs_disable(struct drm_crtc_state
> > > *old_state,
> > >  * it's in self refresh mode and needs to be fully
> > > disabled.
> > >  */
> > > return old_state->active ||
> > > -  (old_state->self_refresh_active && !new_state-
> > > >enable) ||
> > > +  (old_state->self_refresh_active && !new_state-
> > > >active) ||
> > >new_state->self_refresh_active;
> > >  }
> > >
> > > --
> > > 2.25.1
> > >
>
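
The one-line change quoted above can be compared side by side in a reduced model. The sketch below is plain user-space C, assuming a hypothetical `struct crtc_state_model` that keeps only the three fields the check reads; it is not the DRM helper itself.

```c
#include <stdbool.h>

/* Reduced model of struct drm_crtc_state: only the fields the check reads. */
struct crtc_state_model {
	bool enable;
	bool active;
	bool self_refresh_active;
};

/* The check before the patch: tests new_state->enable. */
bool needs_disable_old(const struct crtc_state_model *old_state,
		       const struct crtc_state_model *new_state)
{
	return old_state->active ||
	       (old_state->self_refresh_active && !new_state->enable) ||
	       new_state->self_refresh_active;
}

/* The check after the patch: tests new_state->active. */
bool needs_disable_new(const struct crtc_state_model *old_state,
		       const struct crtc_state_model *new_state)
{
	return old_state->active ||
	       (old_state->self_refresh_active && !new_state->active) ||
	       new_state->self_refresh_active;
}
```

With the old state in self refresh (`active` false, `self_refresh_active` true) and a new state that keeps `enable` true but sets `active` false, as in the blanking case the commit message describes, the old check returns false and the new one returns true.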


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-11 Thread Harry Wentland


On 2022-01-11 10:08, Alex Deucher wrote:
> On Mon, Jan 10, 2022 at 9:53 PM Linus Torvalds
>  wrote:
>>
>> On Mon, Jan 10, 2022 at 6:44 PM Linus Torvalds
>>  wrote:
>>>
>>> I'll double-check to see if a revert fixes it at the top of my tree.
>>
>> Yup. It reverts cleanly, and the end result builds and works fine, and
>> doesn't show the horrendous flickering.
>>
>> I have done that revert, and will continue the merge window work.
>> Somebody else gets to figure out what the actual bug is, but that
>> commit was horribly broken on my machine (Sapphire Pulse RX 580 8GB,
>> fwiw).
> 
> Thanks for tracking this down.  We are investigating the issue.
> 

Thanks for tracking this down. It was the result of a bad merge
from an internal branch.

Attached is a v2 of the buggy patch that should get this right.
If you have a chance to try it out let us know, if not we'll get
someone to repro and test the fix.

Harry

> Alex
From 5ed98e330781615434711a5fc31a6a7473f9344f Mon Sep 17 00:00:00 2001
From: Meenakshikumar Somasundaram 
Date: Mon, 15 Nov 2021 01:51:37 -0500
Subject: [PATCH] drm/amd/display: Fix for otg synchronization logic

[Why]
During otg sync trigger, plane states are used to decide whether the otg
is already synchronized or not. There are scenarios when otgs are
disabled without plane state getting disabled and in such cases the otg is
excluded from synchronization.

[How]
Introduced pipe_idx_syncd in pipe_ctx that tracks each otg's master pipe.
When an otg is disabled/enabled, pipe_idx_syncd is reset to itself.
On sync trigger, pipe_idx_syncd is checked to decide whether an otg is
already synchronized and the otg is further included or excluded from
synchronization.

v2:
  Don't drop is_blanked logic

Reviewed-by: Jun Lei 
Reviewed-by: Mustapha Ghaddar 
Acked-by: Bhawanpreet Lakha 
Signed-off-by: meenakshikumar somasundaram 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 40 +-
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 54 +++
 drivers/gpu/drm/amd/display/dc/dc.h   |  1 +
 .../display/dc/dce110/dce110_hw_sequencer.c   |  8 +++
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  3 ++
 .../gpu/drm/amd/display/dc/inc/core_types.h   |  1 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h | 11 
 7 files changed, 105 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 01c8849b9db2..6f5528d34093 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1404,20 +1404,34 @@ static void program_timing_sync(
 status->timing_sync_info.master = false;
 
 		}
-		/* remove any other unblanked pipes as they have already been synced */
-		for (j = j + 1; j < group_size; j++) {
-			bool is_blanked;
 
-			if (pipe_set[j]->stream_res.opp->funcs->dpg_is_blanked)
-is_blanked =
-	pipe_set[j]->stream_res.opp->funcs->dpg_is_blanked(pipe_set[j]->stream_res.opp);
-			else
-is_blanked =
-	pipe_set[j]->stream_res.tg->funcs->is_blanked(pipe_set[j]->stream_res.tg);
-			if (!is_blanked) {
-group_size--;
-pipe_set[j] = pipe_set[group_size];
-j--;
+		/* remove any other pipes that are already been synced */
+		if (dc->config.use_pipe_ctx_sync_logic) {
+			/* check pipe's syncd to decide which pipe to be removed */
+			for (j = 1; j < group_size; j++) {
+if (pipe_set[j]->pipe_idx_syncd == pipe_set[0]->pipe_idx_syncd) {
+	group_size--;
+	pipe_set[j] = pipe_set[group_size];
+	j--;
+} else
+	/* link slave pipe's syncd with master pipe */
+	pipe_set[j]->pipe_idx_syncd = pipe_set[0]->pipe_idx_syncd;
+			}
+		} else {
+			for (j = j + 1; j < group_size; j++) {
+bool is_blanked;
+
+if (pipe_set[j]->stream_res.opp->funcs->dpg_is_blanked)
+	is_blanked =
+		pipe_set[j]->stream_res.opp->funcs->dpg_is_blanked(pipe_set[j]->stream_res.opp);
+else
+	is_blanked =
+		pipe_set[j]->stream_res.tg->funcs->is_blanked(pipe_set[j]->stream_res.tg);
+if (!is_blanked) {
+	group_size--;
+	pipe_set[j] = pipe_set[group_size];
+	j--;
+}
 			}
 		}
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index de5c7d1e0267..eaeef72773f6 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -3216,3 +3216,57 @@ struct hpo_dp_link_encoder *resource_get_hpo_dp_link_enc_for_det_lt(
 	return hpo_dp_link_enc;
 }
 #endif
+
+void reset_syncd_pipes_from_disabled_pipes(struct dc *dc,
+		struct dc_state *context)
+{
+	int i, j;
+	struct pipe_ctx *pipe_ctx_old, *pipe_ctx, *pipe_ctx_syncd;
+
+	/* If pipe backend is reset, need to reset pipe syncd status */
+	for (i = 0; i < dc->res_pool->pipe_count; i++) {
+		pipe_ctx_old =	&dc->current_state->res_ctx.pipe_ctx[i];
+		pipe_ctx = &context->res_ct

Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-11 Thread Alex Deucher
On Mon, Jan 10, 2022 at 9:53 PM Linus Torvalds
 wrote:
>
> On Mon, Jan 10, 2022 at 6:44 PM Linus Torvalds
>  wrote:
> >
> > I'll double-check to see if a revert fixes it at the top of my tree.
>
> Yup. It reverts cleanly, and the end result builds and works fine, and
> doesn't show the horrendous flickering.
>
> I have done that revert, and will continue the merge window work.
> Somebody else gets to figure out what the actual bug is, but that
> commit was horribly broken on my machine (Sapphire Pulse RX 580 8GB,
> fwiw).

Thanks for tracking this down.  We are investigating the issue.

Alex


Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 02:01:17PM +, Matthew Wilcox wrote:
> On Tue, Jan 11, 2022 at 12:17:18AM -0800, John Hubbard wrote:
> > Zooming in on the pinning aspect for a moment: last time I attempted to
> > convert O_DIRECT callers from gup to pup, I recall wanting very much to
> > record, in each bio_vec, whether these pages were acquired via FOLL_PIN,
> > or some non-FOLL_PIN method. Because at the end of the IO, it is not
> > easy to disentangle which pages require put_page() and which require
> > unpin_user_page*().
> > 
> > And changing the bio_vec for *that* purpose was not really acceptable.
> > 
> > But now that you're looking to change it in a big way (and with some
> > spare bits available...oohh!), maybe I can go that direction after all.
> > 
> > Or, are you looking at a design in which any phyr is implicitly FOLL_PIN'd
> > if it exists at all?
> 
> That.  I think there's still good reasons to keep a single-page (or
> maybe dual-page) GUP around, but no reason to mix it with ranges.
> 
> > Or any other thoughts in this area are very welcome.
> 
> That there's no support for unpinning part of a range.  You pin it,
> do the IO, unpin it.  That simplifies the accounting.

VFIO wouldn't like this :(

Jason
 


Re: Phyr Starter

2022-01-11 Thread Jason Gunthorpe
On Tue, Jan 11, 2022 at 04:32:56AM +, Matthew Wilcox wrote:
> On Mon, Jan 10, 2022 at 08:41:26PM -0400, Jason Gunthorpe wrote:
> > On Mon, Jan 10, 2022 at 07:34:49PM +, Matthew Wilcox wrote:
> > 
> > > Finally, it may be possible to stop using scatterlist to describe the
> > > input to the DMA-mapping operation.  We may be able to get struct
> > > scatterlist down to just dma_address and dma_length, with chaining
> > > handled through an enclosing struct.
> > 
> > Can you talk about this some more? IMHO one of the key properties of
> > the scatterlist is that it can hold huge amounts of pages without
> > having to do any kind of special allocation due to the chaining.
> > 
> > The same will be true of the phyr idea right?
> 
> My thinking is that we'd pass a relatively small array of phyr (maybe 16
> entries) to get_user_phyr().  If that turned out not to be big enough,
> then we have two options; one is to map those 16 ranges with sg and use
> the sg chaining functionality before throwing away the phyr and calling
> get_user_phyr() again. 

Then why are we using get_user_phyr() at all if we are just storing it
in a sg?

Also 16 entries is way too small, it should be at least a whole PMD
worth so we don't have to relock the PMD level each iteration.

I would like to see a flow more like:

  cpu_phyr_list = get_user_phyr(uptr, 1G);
  dma_phyr_list = dma_map_phyr(device, cpu_phyr_list);
  [..]
  dma_unmap_phyr(device, dma_phyr_list);
  unpin_dirty_free(cpu_phyr_list);

Where dma_map_phyr() can build a temporary SGL for compatibility with
old iommu drivers. iommu drivers would want to implement it natively,
of course.

i.e. no loops in drivers.

> The question is whether this is the right kind of optimisation to be
> doing.  I hear you that we want a dense format, but it's questionable
> whether the kind of thing you're suggesting is actually denser than this
> scheme.  For example, if we have 1GB pages and userspace happens to have
> allocated pages (3, 4, 5, 6, 7, 8, 9, 10) then this can be represented
> as a single phyr.  A power-of-two scheme would have us use four entries
> (3, 4-7, 8-9, 10).
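
The comparison in the quoted paragraph can be checked with a short sketch. It is user-space C under stated assumptions: `struct phyr_model` mirrors the proposed (addr, len) pair with fixed-width types, and `phyr_append()` and `pow2_entries()` are hypothetical helpers written for this illustration, not proposed kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* The (addr, len) range from the proposal, with fixed-width stand-in types. */
struct phyr_model {
	uint64_t addr;	/* physical address of the first byte */
	uint64_t len;	/* length in bytes */
};

/*
 * Append one chunk, merging it into the last entry when it starts exactly
 * where that entry ends. Assumes the caller left room for one more entry.
 * Returns the new entry count.
 */
size_t phyr_append(struct phyr_model *list, size_t n,
		   uint64_t addr, uint64_t len)
{
	if (n > 0 && list[n - 1].addr + list[n - 1].len == addr) {
		list[n - 1].len += len;
		return n;
	}
	list[n].addr = addr;
	list[n].len = len;
	return n + 1;
}

/* Count naturally aligned power-of-two blocks covering pages [first, last]. */
size_t pow2_entries(uint64_t first, uint64_t last)
{
	uint64_t next = first;
	uint64_t remaining = last - first + 1;
	size_t count = 0;

	while (remaining > 0) {
		uint64_t size = 1;

		/* Largest power of two not exceeding the remaining count. */
		while (size * 2 <= remaining)
			size *= 2;
		/* Shrink until the block is naturally aligned at 'next'. */
		while (next % size != 0)
			size /= 2;
		next += size;
		remaining -= size;
		count++;
	}
	return count;
}
```

Appending the 1GB pages 3 through 10 one at a time leaves a single entry of 8GB, while `pow2_entries(3, 10)` needs four blocks, matching the (3, 4-7, 8-9, 10) split in the quote.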

That is not quite what I had in mind..

struct phyr_list {
   unsigned int first_page_offset_bytes;
   size_t total_length_bytes;
   phys_addr_t min_alignment;
   struct packed_phyr *list_of_pages;
};

Where each 'packed_phyr' is an aligned page of some kind. The packing
has to be able to represent any number of pfns, so we have four major
cases:
 - 4k pfns (use 8 bytes)
 - Natural order pfn (use 8 bytes)
 - 4k aligned pfns, arbitrary number (use 12 bytes)
 - <4k aligned, arbitrary length (use 16 bytes?)

In all cases the interior pages are fully used, only the first and
last page is sliced based on the two parameters in the phyr_list.

The first_page_offset_bytes/total_length_bytes mean we don't need to
use the inefficient coding for many common cases, just stick to the 4k
coding and slice the first/last page down.

The last case is, perhaps, a possible route to completely replace
scatterlist. Few places need true byte granularity for interior pages,
so we can invent some coding to say 'this is 8 byte aligned, and n
bytes long' that only fits < 4k or something. Exceptional cases can
then still work. I'm not sure what block needs here - is it just 512?

Basically think of list_of_pages as showing a contiguous list of at
least min_aligned pages and first_page_offset_bytes/total_length_bytes
taking a byte granular slice out of that logical range.
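
A minimal sketch of that slicing, in user-space C: it assumes every entry in list_of_pages has the same size (the real proposal allows mixed encodings), and `struct phyr_list_model` and `slice_bytes_in_page()` are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified phyr_list: all pages share one size (an assumption). */
struct phyr_list_model {
	uint64_t first_page_offset_bytes;
	uint64_t total_length_bytes;
	uint64_t page_size;
	size_t n_pages;
};

/*
 * Number of bytes of page 'idx' that fall inside the byte-granular slice.
 * Interior pages are fully used; only the first and last can be partial.
 */
uint64_t slice_bytes_in_page(const struct phyr_list_model *l, size_t idx)
{
	uint64_t page_start, page_end, slice_start, slice_end, lo, hi;

	if (idx >= l->n_pages)
		return 0;

	page_start = (uint64_t)idx * l->page_size;
	page_end = page_start + l->page_size;
	slice_start = l->first_page_offset_bytes;
	slice_end = slice_start + l->total_length_bytes;

	lo = page_start > slice_start ? page_start : slice_start;
	hi = page_end < slice_end ? page_end : slice_end;
	return hi > lo ? hi - lo : 0;
}
```

For three 4k pages with an offset of 100 bytes and a total length of 8192 bytes, the pages contribute 3996, 4096 and 100 bytes respectively.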

From a HW perspective I see two basic modalities:

 - Streaming HW, which read/writes in a single pass (think
   NVMe/storage/network). Usually takes a list of dma_addr_t and
   length that HW just walks over. Rarely cares about things like page
   boundaries. Optimization goal is to minimize list length. In this
   case we map each packed_phyr into a HW SGL

 - Random Access HW, which is randomly touching memory (think RDMA,
   VFIO, DRM, IOMMU). Usually stores either a linear list of same-size
   dma_addr_t pages, or a radix tree page table of dma_addr_t.
   Needs to have a minimum alignment of each chunk (usually 4k) to
   represent it. Optimization goal is to have maximum page size. In
   this case we use min_alignment to size the HW array and decode the
   packed_phyrs into individual pages.

> Using a (dma_addr, size_t) tuple makes coalescing adjacent pages very
> cheap.

With the above this still works, the very last entry in list_of_pages
would be the 12 byte pfn type and when we start a new page the logic
would then optimize it down to 8 bytes, if possible. At that point we
know we are not going to change it:
 
 - An interior page that is perfectly aligned is represented as a
   natural order
 - A starting page that ends on perfect alignment is widened to
   natural order and first_page_offset_bytes is corrected
 - An ending page that starts on perfect alignment is widened to
   natural order and total_length_bytes is set
   (though no harm in keeping the 12 byte repr

Re: Phyr Starter

2022-01-11 Thread Thomas Zimmermann

Hi

On 11.01.22 at 14:56, Matthew Wilcox wrote:
> On Tue, Jan 11, 2022 at 12:40:10PM +0100, Thomas Zimmermann wrote:
> > Hi
> >
> > On 10.01.22 at 20:34, Matthew Wilcox wrote:
> > > TLDR: I want to introduce a new data type:
> > >
> > > struct phyr {
> > >  phys_addr_t addr;
> > >  size_t len;
> > > };
> >
> > Did you look at struct dma_buf_map? [1]
>
> Thanks.  I wasn't aware of that.  It doesn't seem to actually solve the
> problem, in that it doesn't carry any length information.  Did you mean
> to point me at a different structure?



It's the structure I meant. It refers to a buffer, so the length could 
be added. For something more sophisticated, dma_buf_map could be changed 
to distinguish between the buffer and an iterator pointing into the buffer.


But if it's really different, then so be it.

Best regards
Thomas

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: Phyr Starter

2022-01-11 Thread Matthew Wilcox
On Tue, Jan 11, 2022 at 12:17:18AM -0800, John Hubbard wrote:
> Zooming in on the pinning aspect for a moment: last time I attempted to
> convert O_DIRECT callers from gup to pup, I recall wanting very much to
> record, in each bio_vec, whether these pages were acquired via FOLL_PIN,
> or some non-FOLL_PIN method. Because at the end of the IO, it is not
> easy to disentangle which pages require put_page() and which require
> unpin_user_page*().
> 
> And changing the bio_vec for *that* purpose was not really acceptable.
> 
> But now that you're looking to change it in a big way (and with some
> spare bits available...oohh!), maybe I can go that direction after all.
> 
> Or, are you looking at a design in which any phyr is implicitly FOLL_PIN'd
> if it exists at all?

That.  I think there's still good reasons to keep a single-page (or
maybe dual-page) GUP around, but no reason to mix it with ranges.

> Or any other thoughts in this area are very welcome.

That there's no support for unpinning part of a range.  You pin it,
do the IO, unpin it.  That simplifies the accounting.



Re: Phyr Starter

2022-01-11 Thread Matthew Wilcox
On Tue, Jan 11, 2022 at 12:40:10PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> On 10.01.22 at 20:34, Matthew Wilcox wrote:
> > TLDR: I want to introduce a new data type:
> > 
> > struct phyr {
> >  phys_addr_t addr;
> >  size_t len;
> > };
> 
> Did you look at struct dma_buf_map? [1]

Thanks.  I wasn't aware of that.  It doesn't seem to actually solve the
problem, in that it doesn't carry any length information.  Did you mean
to point me at a different structure?




Re: [PATCH v4 0/6] drm: exynos: dsi: Convert drm bridge

2022-01-11 Thread Jagan Teki
On Tue, Jan 11, 2022 at 5:20 PM Andrzej Hajda  wrote:
>
> Hi Jagan,
>
> On 11.01.2022 10:32, Jagan Teki wrote:
> > Hi Andrzej,
> >
> > On Tue, Dec 28, 2021 at 4:18 PM Andrzej Hajda  
> > wrote:
> >> Hi Marek,
> >>
> >> On 23.12.2021 10:15, Marek Szyprowski wrote:
> >>> Hi Jagan,
> >>>
> >>> On 18.12.2021 00:16, Marek Szyprowski wrote:
>  On 15.12.2021 15:56, Jagan Teki wrote:
> > On Wed, Dec 15, 2021 at 7:49 PM Marek Szyprowski
> >  wrote:
> >> On 15.12.2021 13:57, Jagan Teki wrote:
> >>> On Wed, Dec 15, 2021 at 5:31 PM Marek Szyprowski
> >>>  wrote:
>  On 15.12.2021 11:15, Jagan Teki wrote:
> > Updated series about drm bridge conversion of exynos dsi.
> > Previous version can be accessible, here [1].
> >
> > Patch 1: connector reset
> >
> > Patch 2: panel_bridge API
> >
> > Patch 3: Bridge conversion
> >
> > Patch 4: Atomic functions
> >
> > Patch 5: atomic_set
> >
> > Patch 6: DSI init in enable
>  There is a little progress! :)
> 
>  Devices with a simple display pipeline (only a DSI panel, like
>  Trats/Trats2) works till the last patch. Then, after applying
>  ("[PATCH
>  v4 6/6] drm: exynos: dsi: Move DSI init in bridge enable"), I get no
>  display at all.
> 
>  A TM2e board with in-bridge (Exynos MIC) stops displaying anything
>  after
>  applying patch "[PATCH v4 2/6] drm: exynos: dsi: Use drm
>  panel_bridge API".
> 
>  In case of the Arndale board with tc358764 bridge, no much
>  progress. The
>  display is broken just after applying the "[PATCH v2] drm: bridge:
>  tc358764: Use drm panel_bridge API" patch on top of linux-next.
> 
>  In all cases the I had "drm: of: Lookup if child node has panel or
>  bridge" patch applied.
> >>> Just skip the 6/6 for now.
> >>>
> >>> Apply
> >>> -
> >>> https://protect2.fireeye.com/v1/url?k=a24f3f76-fdd40659-a24eb439-0cc47a31cdf8-97ea12b4c5258d11&q=1&e=37a169bf-7ca5-4362-aad7-486018c7a708&u=https%3A%2F%2Fpatchwork.amarulasolutions.com%2Fpatch%2F1825%2F
> >>> -
> >>> https://protect2.fireeye.com/v1/url?k=a226360f-fdbd0f20-a227bd40-0cc47a31cdf8-ebd66aebee1058d7&q=1&e=37a169bf-7ca5-4362-aad7-486018c7a708&u=https%3A%2F%2Fpatchwork.amarulasolutions.com%2Fpatch%2F1823%2F
> >>>
> >>> Then apply 1/6 to 5/6.  and update the status?
> >> Okay, my fault, I didn't check that case on Arndale.
> >>
> >> I've checked and indeed, Trats/Trats2 and Arndale works after the above
> >> 2 patches AND patches 1-5.
> >>
> >> The only problem is now on TM2e, which uses Exynos MIC as in-bridge for
> >> Exynos DSI:
> >>
> >> [4.068866] [drm] Exynos DRM: using 1380.decon device for DMA
> >> mapping operations
> >> [4.069183] exynos-drm exynos-drm: bound 1380.decon (ops
> >> decon_component_ops)
> >> [4.128983] exynos-drm exynos-drm: bound 1388.decon (ops
> >> decon_component_ops)
> >> [4.129261] exynos-drm exynos-drm: bound 1393.mic (ops
> >> exynos_mic_component_ops)
> >> [4.133508] exynos-dsi 1390.dsi: [drm:exynos_dsi_host_attach]
> >> *ERROR* failed to find the bridge: -19
> >> [4.136392] exynos-drm exynos-drm: bound 1390.dsi (ops
> >> exynos_dsi_component_ops)
> >> [4.145499] rc_core: Couldn't load IR keymap rc-cec
> >> [4.145666] Registered IR keymap rc-empty
> >> [4.148402] rc rc0: sii8620 as /devices/virtual/rc/rc0
> >> [4.156051] input: sii8620 as /devices/virtual/rc/rc0/input1
> >> [4.160647] exynos-drm exynos-drm: bound 1397.hdmi (ops
> >> hdmi_component_ops)
> >> [4.169923] exynos-drm exynos-drm: [drm] Cannot find any crtc or
> >> sizes
> >> [4.173958] exynos-drm exynos-drm: [drm] Cannot find any crtc or
> >> sizes
> >> [4.182304] [drm] Initialized exynos 1.1.0 20180330 for
> >> exynos-drm on
> >> minor 0
> >>
> >> The display pipeline for TM2e is:
> >>
> >> Exynos5433 Decon -> Exynos MIC -> Exynos DSI -> s6e3ha2 DSI panel
> > If Trats/Trats2 is working then it has to work. I don't see any
> > difference in output pipeline. Can you please share the full log, I
> > cannot see host_attach print saying "Attached.."
>  Well, there is a failure message about the panel:
> 
>  exynos-dsi 1390.dsi: [drm:exynos_dsi_host_attach] *ERROR* failed
>  to find the bridge: -19
> 
>  however it looks that something might be broken in dts. The in-bridge
>  (Exynos MIC) is on port 0 and the panel is @0, what imho might cause
>  the issue.
> 
>  I've tried to change in in-bridge ('mic_to_dsi') port to 1 in
>  exynos5433.dtsi. Then the panel has been attached:
> 
>  exyn

Re: [PATCH] drm/mipi-dbi: Fix source-buffer address in mipi_dbi_buf_copy

2022-01-11 Thread Noralf Trønnes



On 11.01.2022 at 14.26, Thomas Zimmermann wrote:
> Set the source-buffer address after mapping the buffer into the
> kernel's address space. Makes MIPI DBI helpers work again.
> 
> Signed-off-by: Thomas Zimmermann 
> Fixes: c47160d8edcd ("drm/mipi-dbi: Remove dependency on GEM CMA helper library")
> Reported-by: Noralf Trønnes 
> Cc: Thomas Zimmermann 
> Cc: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> ---

Reviewed-by: Noralf Trønnes 


Re: [PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX

2022-01-11 Thread Andrzej Hajda

Hi Brian,


I am not a DP specialist, so I have CC-ed people working with DP.

On 01.10.2021 23:42, Brian Norris wrote:

If the display is not enable()d, then we aren't holding a runtime PM
reference here. Thus, it's easy to accidentally cause a hang, if user
space is poking around at /dev/drm_dp_aux0 at the "wrong" time.

Let's get the panel and PM state right before trying to talk AUX.

Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
Cc: 
Cc: Tomeu Vizoso 
Signed-off-by: Brian Norris 



A few questions/issues here:

1. If this is just to avoid accidental 'hangs', it would be better to
simply check whether the panel is working before the transfer and, if
not, return an error code. If there is a better reason for this PM
dance, please provide it in the description.


2. Again I see an assumption that panel-prepare enables power for
something other than video transmission. That happens to be true for
most devices, but devices with more fine-grained power management will
break, or at least will be used inefficiently. But maybe in the case of
DP it is OK?


3. A more general issue: I wonder whether this should instead be
handled uniformly for all drm_dp devices.



Regards

Andrzej



---

Changes in v2:
- Fix spelling in Subject
- DRM_DEV_ERROR() -> drm_err()
- Propagate errors from un-analogix_dp_prepare_panel()

  .../drm/bridge/analogix/analogix_dp_core.c| 21 ++-
  1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c 
b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index b7d2e4449cfa..6fc46ac93ef8 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux 
*aux,
   struct drm_dp_aux_msg *msg)
  {
struct analogix_dp_device *dp = to_dp(aux);
+   int ret, ret2;
  
-	return analogix_dp_transfer(dp, msg);

+   ret = analogix_dp_prepare_panel(dp, true, false);
+   if (ret) {
+   drm_err(dp->drm_dev, "Failed to prepare panel (%d)\n", ret);
+   return ret;
+   }
+
+   pm_runtime_get_sync(dp->dev);
+   ret = analogix_dp_transfer(dp, msg);
+   pm_runtime_put(dp->dev);
+
+   ret2 = analogix_dp_prepare_panel(dp, false, false);
+   if (ret2) {
+   drm_err(dp->drm_dev, "Failed to unprepare panel (%d)\n", ret2);
+   /* Prefer the analogix_dp_transfer() error, if it exists. */
+   if (!ret)
+   ret = ret2;
+   }
+
+   return ret;
  }
  
  struct analogix_dp_device *
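
The error-handling order in the quoted hunk can be reduced to a pure function for illustration. This is a sketch in user-space C, not driver code: `aux_transfer_result()` is a hypothetical helper that takes the three step results as plain integers instead of calling into hardware.

```c
#include <errno.h>

/*
 * Combine the results of prepare, transfer and unprepare the way the patch
 * does: a prepare failure returns immediately, and an unprepare failure is
 * reported only when the transfer itself returned 0.
 */
int aux_transfer_result(int prepare_ret, int transfer_ret, int unprepare_ret)
{
	if (prepare_ret)
		return prepare_ret;	/* transfer and unprepare never run */
	if (unprepare_ret && !transfer_ret)
		return unprepare_ret;
	return transfer_ret;
}
```

In this model a positive byte count from the transfer step also takes precedence over an unprepare error, because only a transfer result of exactly 0 is replaced.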


[PATCH] drm/mipi-dbi: Fix source-buffer address in mipi_dbi_buf_copy

2022-01-11 Thread Thomas Zimmermann
Set the source-buffer address after mapping the buffer into the
kernel's address space. Makes MIPI DBI helpers work again.

Signed-off-by: Thomas Zimmermann 
Fixes: c47160d8edcd ("drm/mipi-dbi: Remove dependency on GEM CMA helper library")
Reported-by: Noralf Trønnes 
Cc: Thomas Zimmermann 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
---
 drivers/gpu/drm/drm_mipi_dbi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
index ded8968b3e8a..0327d595e028 100644
--- a/drivers/gpu/drm/drm_mipi_dbi.c
+++ b/drivers/gpu/drm/drm_mipi_dbi.c
@@ -209,11 +209,11 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer 
*fb,
ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
if (ret)
return ret;
-   src = data[0].vaddr; /* TODO: Use mapping abstraction properly */
 
ret = drm_gem_fb_vmap(fb, map, data);
if (ret)
goto out_drm_gem_fb_end_cpu_access;
+   src = data[0].vaddr; /* TODO: Use mapping abstraction properly */
 
switch (fb->format->format) {
case DRM_FORMAT_RGB565:
-- 
2.34.1
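
Why the order of the two statements matters can be shown with a small user-space model. The sketch assumes a mapping structure whose address is NULL until it has been mapped; `struct map_model`, `vmap_model()`, `src_before_map()` and `src_after_map()` are hypothetical names and do not correspond to DRM functions.

```c
#include <stddef.h>

/* Stand-in for a mapping descriptor: vaddr is only valid after mapping. */
struct map_model {
	void *vaddr;
};

/* Stand-in for the vmap step: fills in the mapping address. */
int vmap_model(struct map_model *map, void *backing)
{
	map->vaddr = backing;
	return 0;
}

/* Pre-fix order: reads vaddr before mapping, so it captures a stale value. */
void *src_before_map(struct map_model *map, void *backing)
{
	void *src = map->vaddr;

	if (vmap_model(map, backing))
		return NULL;
	return src;
}

/* Post-fix order: maps first, then reads vaddr. */
void *src_after_map(struct map_model *map, void *backing)
{
	if (vmap_model(map, backing))
		return NULL;
	return map->vaddr;
}
```

Starting from an unmapped descriptor, the first function hands back NULL and the second hands back the buffer address.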



Re: [PATCH v2 0/2] video: A couple of fixes for the vga16fb driver

2022-01-11 Thread Maxime Ripard
Hi,

On Mon, Jan 10, 2022 at 10:56:23AM +0100, Javier Martinez Canillas wrote:
> This patch series contains two fixes for the vga16fb driver. I looked at
> the driver due a regression reported [0], caused by commit d391c5827107
> ("drivers/firmware: move x86 Generic System Framebuffers support").
> 
> The mentioned commit didn't change any logic but just moved the platform
> device registration that matches the vesafb and efifb drivers to happen
> later. And this caused the vga16fb driver to be probed even in machines
> that don't have an EGA or VGA video adapter.
> 
> This is a v2 of the patch series that addresses issues pointed out by
> Geert Uytterhoeven.
> 
> Patch #1 is fixing the wrong check to determine if either EGA or VGA is
> used and patch #2 adds a check to the driver to only be loaded for EGA
> and VGA 16 color graphic cards.

For both patches,

Acked-by: Maxime Ripard 

Maxime




Re: [PATCH v3 2/2] dt-bindings: panel: Introduce a panel-lvds binding

2022-01-11 Thread Laurent Pinchart
Hi Maxime,

Thank you for the patch.

On Tue, Jan 11, 2022 at 12:06:35PM +0100, Maxime Ripard wrote:
> Following the previous patch, let's introduce a generic panel-lvds
> binding that documents the panels that don't have any particular
> constraint documented.
> 
> Reviewed-by: Rob Herring 
> Signed-off-by: Maxime Ripard 
> 
> ---
> 
> Changes from v2:
>   - Added a MAINTAINERS entry
> 
> Changes from v1:
>   - Added missing compatible
>   - Fixed lint
> ---
>  .../bindings/display/panel/panel-lvds.yaml| 57 +++
>  MAINTAINERS   |  1 +
>  2 files changed, 58 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/panel-lvds.yaml
> 
> diff --git a/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml 
> b/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml
> new file mode 100644
> index ..fcc50db6a812
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml
> @@ -0,0 +1,57 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/panel/panel-lvds.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Generic LVDS Display Panel Device Tree Bindings
> +
> +maintainers:
> +  - Lad Prabhakar 
> +  - Thierry Reding 
> +
> +allOf:
> +  - $ref: panel-common.yaml#
> +  - $ref: /schemas/display/lvds.yaml/#
> +
> +select:
> +  properties:
> +compatible:
> +  contains:
> +const: panel-lvds
> +
> +  not:
> +properties:
> +  compatible:
> +contains:
> +  enum:
> +- advantech,idk-1110wr
> +- advantech,idk-2121wr
> +- innolux,ee101ia-01d
> +- mitsubishi,aa104xd12
> +- mitsubishi,aa121td01
> +- sgd,gktw70sdae4se

I still don't like this :-( Couldn't we instead do

select:
  properties:
compatible:
  contains:
enum:
  - auo,b101ew05
  - tbs,a711-panel

?

> +
> +  required:
> +- compatible
> +
> +properties:
> +  compatible:
> +items:
> +  - enum:
> +  - auo,b101ew05
> +  - tbs,a711-panel
> +
> +  - const: panel-lvds
> +
> +unevaluatedProperties: false
> +
> +required:
> +  - compatible
> +  - data-mapping
> +  - width-mm
> +  - height-mm
> +  - panel-timing
> +  - port
> +
> +...
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 368072da0a05..02001455949e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6080,6 +6080,7 @@ T:  git git://anongit.freedesktop.org/drm/drm-misc
>  S:   Maintained
>  F:   drivers/gpu/drm/panel/panel-lvds.c
>  F:   Documentation/devicetree/bindings/display/lvds.yaml
> +F:   Documentation/devicetree/bindings/display/panel/panel-lvds.yaml
>  
>  DRM DRIVER FOR MANTIX MLAF057WE51 PANELS
>  M:   Guido Günther 

-- 
Regards,

Laurent Pinchart


Re: [PATCH v3 1/2] dt-bindings: display: Turn lvds.yaml into a generic schema

2022-01-11 Thread Laurent Pinchart
Hi Maxime,

Thank you for the patch.

On Tue, Jan 11, 2022 at 12:06:34PM +0100, Maxime Ripard wrote:
> The lvds.yaml file so far was both defining the generic LVDS properties
> (such as data-mapping) that could be used for any LVDS sink, but also
> the panel-lvds binding.
> 
> That last binding was to describe LVDS panels simple enough, and had a
> number of other bindings using it as a base to specialise it further.
> 
> However, this situation makes it fairly hard to extend and reuse both
> the generic parts, and the panel-lvds itself.
> 
> Let's remove the panel-lvds parts and leave only the generic LVDS
> properties.
> 
> Reviewed-by: Rob Herring 
> Signed-off-by: Maxime Ripard 
> 
> ---
> 
> Changes from v2:
>   - Fix references to that file
> 
> Changes from v1:
>   - Moved the schema out of panel
> ---
>  .../bindings/display/bridge/lvds-codec.yaml   |  2 +-
>  .../bindings/display/{panel => }/lvds.yaml| 31 ++-
>  .../display/panel/advantech,idk-1110wr.yaml   | 19 ++--
>  .../display/panel/innolux,ee101ia-01d.yaml| 23 --
>  .../display/panel/mitsubishi,aa104xd12.yaml   | 19 ++--
>  .../display/panel/mitsubishi,aa121td01.yaml   | 19 ++--
>  .../display/panel/sgd,gktw70sdae4se.yaml  | 19 ++--
>  MAINTAINERS   |  2 +-
>  8 files changed, 93 insertions(+), 41 deletions(-)
>  rename Documentation/devicetree/bindings/display/{panel => }/lvds.yaml (86%)
> 
> diff --git a/Documentation/devicetree/bindings/display/bridge/lvds-codec.yaml 
> b/Documentation/devicetree/bindings/display/bridge/lvds-codec.yaml
> index 5079c1cc337b..27b905b81b12 100644
> --- a/Documentation/devicetree/bindings/display/bridge/lvds-codec.yaml
> +++ b/Documentation/devicetree/bindings/display/bridge/lvds-codec.yaml
> @@ -67,7 +67,7 @@ properties:
>- vesa-24
>  description: |
>The color signals mapping order. See details in
> -  Documentation/devicetree/bindings/display/panel/lvds.yaml
> +  Documentation/devicetree/bindings/display/lvds.yaml
>  
>port@1:
>  $ref: /schemas/graph.yaml#/properties/port
> diff --git a/Documentation/devicetree/bindings/display/panel/lvds.yaml 
> b/Documentation/devicetree/bindings/display/lvds.yaml
> similarity index 86%
> rename from Documentation/devicetree/bindings/display/panel/lvds.yaml
> rename to Documentation/devicetree/bindings/display/lvds.yaml
> index 49460c9dceea..55751402fb13 100644
> --- a/Documentation/devicetree/bindings/display/panel/lvds.yaml
> +++ b/Documentation/devicetree/bindings/display/lvds.yaml
> @@ -1,10 +1,10 @@
>  # SPDX-License-Identifier: GPL-2.0
>  %YAML 1.2
>  ---
> -$id: http://devicetree.org/schemas/display/panel/lvds.yaml#
> +$id: http://devicetree.org/schemas/display/lvds.yaml#
>  $schema: http://devicetree.org/meta-schemas/core.yaml#
>  
> -title: LVDS Display Panel
> +title: LVDS Display Common Properties
>  
>  maintainers:
>- Laurent Pinchart 
> @@ -26,18 +26,7 @@ description: |+

The description mentions "This bindings supports display panels
compatible with the following specifications". This needs a small update
to avoid referring to panels.

With this updated,

Reviewed-by: Laurent Pinchart 

>Device compatible with those specifications have been marketed under the
>FPD-Link and FlatLink brands.
>  
> -allOf:
> -  - $ref: panel-common.yaml#
> -
>  properties:
> -  compatible:
> -contains:
> -  const: panel-lvds
> -description:
> -  Shall contain "panel-lvds" in addition to a mandatory panel-specific
> -  compatible string defined in individual panel bindings. The 
> "panel-lvds"
> -  value shall never be used on its own.
> -
>data-mapping:
>  enum:
>- jeida-18
> @@ -96,22 +85,6 @@ properties:
>If set, reverse the bit order described in the data mappings below on 
> all
>data lanes, transmitting bits for slots 6 to 0 instead of 0 to 6.
>  
> -  port: true
> -  ports: true
> -
> -required:
> -  - compatible
> -  - data-mapping
> -  - width-mm
> -  - height-mm
> -  - panel-timing
> -
> -oneOf:
> -  - required:
> -  - port
> -  - required:
> -  - ports
> -
>  additionalProperties: true
>  
>  ...
> diff --git 
> a/Documentation/devicetree/bindings/display/panel/advantech,idk-1110wr.yaml 
> b/Documentation/devicetree/bindings/display/panel/advantech,idk-1110wr.yaml
> index 93878c2cd370..3a8c2c11f9bd 100644
> --- 
> a/Documentation/devicetree/bindings/display/panel/advantech,idk-1110wr.yaml
> +++ 
> b/Documentation/devicetree/bindings/display/panel/advantech,idk-1110wr.yaml
> @@ -11,13 +11,23 @@ maintainers:
>- Thierry Reding 
>  
>  allOf:
> -  - $ref: lvds.yaml#
> +  - $ref: panel-common.yaml#
> +  - $ref: /schemas/display/lvds.yaml/#
> +
> +select:
> +  properties:
> +compatible:
> +  contains:
> +const: advantech,idk-1110wr
> +
> +  required:
> +- compatible
>  

Re: [PATCH RESEND v4 v5 4/4] drm/vc4: Notify the firmware when DRM is in charge

2022-01-11 Thread Maxime Ripard
Hi Thomas,

On Tue, Jan 11, 2022 at 10:38:36AM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 15.12.21 um 10:51 schrieb Maxime Ripard:
> > Once the call to drm_fb_helper_remove_conflicting_framebuffers() has
> > been made, simplefb has been unregistered and the KMS driver is entirely
> > in charge of the display.
> > 
> > Thus, we can notify the firmware it can free whatever resource it was
> > using to maintain simplefb functional.
> > 
> > Signed-off-by: Maxime Ripard 
> > ---
> >   drivers/gpu/drm/vc4/vc4_drv.c | 22 ++
> >   1 file changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
> > index 86c61ee120b7..a03053c8e22c 100644
> > --- a/drivers/gpu/drm/vc4/vc4_drv.c
> > +++ b/drivers/gpu/drm/vc4/vc4_drv.c
> > @@ -37,6 +37,8 @@
> >   #include 
> >   #include 
> > +#include 
> > +
> >   #include "uapi/drm/vc4_drm.h"
> >   #include "vc4_drv.h"
> > @@ -215,6 +217,7 @@ static void vc4_match_add_drivers(struct device *dev,
> >   static int vc4_drm_bind(struct device *dev)
> >   {
> > struct platform_device *pdev = to_platform_device(dev);
> > +   struct rpi_firmware *firmware = NULL;
> > struct drm_device *drm;
> > struct vc4_dev *vc4;
> > struct device_node *node;
> > @@ -251,10 +254,29 @@ static int vc4_drm_bind(struct device *dev)
> > if (ret)
> > return ret;
> > +   node = of_find_compatible_node(NULL, NULL, 
> > "raspberrypi,bcm2835-firmware");
> > +   if (node) {
> > +   firmware = rpi_firmware_get(node);
> > +   of_node_put(node);
> > +
> > +   if (!firmware)
> > +   return -EPROBE_DEFER;
> > +   }
> > +
> 
> The code is
> 
> Acked-by: Thomas Zimmermann 

Thanks for your review

> Just for my understanding:
> 
> You retrieve the firmware before killing simpledrm simply to keep the
> display on if it fails, right?

Exactly

> What's the possible error that would justify a retry (via EPROBE_DEFER)?

The firmware there is backed by a driver that might not have probed yet,
in which case we just want to retry later on
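
As a rough illustration of that retry, here is a userspace sketch (not the
vc4 code). fake_firmware_get(), fake_bind() and the firmware_ready flag are
made up for the example; only the value 517 for EPROBE_DEFER matches the
kernel.

```c
#include <stddef.h>

/* Same numeric value as the kernel-internal EPROBE_DEFER. */
#define EPROBE_DEFER 517

struct fake_firmware { int dummy; };

static struct fake_firmware fw_instance;
static int firmware_ready; /* set once the (hypothetical) provider has probed */

/* Stand-in for a lookup that returns NULL until the provider is bound. */
static struct fake_firmware *fake_firmware_get(void)
{
	return firmware_ready ? &fw_instance : NULL;
}

/*
 * Consumer bind: look the provider up first, before doing anything that
 * cannot be undone, and ask to be retried if it is not there yet.
 */
static int fake_bind(void)
{
	struct fake_firmware *fw = fake_firmware_get();

	if (!fw)
		return -EPROBE_DEFER;

	/* ... only now take over the display ... */
	return 0;
}
```

Calling fake_bind() before firmware_ready is set returns -EPROBE_DEFER, and
the same call succeeds once the provider is there, which is the behaviour we
rely on in the bind path.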

Maxime



Re: [PATCH RESEND v4 v5 0/4] drm/vc4: Use the firmware to stop the display pipeline

2022-01-11 Thread Maxime Ripard
On Wed, 15 Dec 2021 10:51:13 +0100, Maxime Ripard wrote:
> The VC4 driver has had limited support to disable the HDMI controllers and
> pixelvalves at boot if the firmware has enabled them.
> 
> However, this proved to be limited, and a bit unreliable so a new firmware
> command has been introduced some time ago to make it free all its resources 
> and
> disable any display output it might have enabled.
> 
> [...]

Applied to drm/drm-misc (drm-misc-next).

Thanks!
Maxime


Re: [PATCH v5 25/32] iommu/mtk: Migrate to aggregate driver

2022-01-11 Thread Yong Wu
Hi Stephen,

Thanks for helping update here.

On Thu, 2022-01-06 at 13:45 -0800, Stephen Boyd wrote:
> Use an aggregate driver instead of component ops so that we can get
> proper driver probe ordering of the aggregate device with respect to
> all
> the component devices that make up the aggregate device.
> 
> Cc: Yong Wu 
> Cc: Joerg Roedel 
> Cc: Will Deacon 
> Cc: Daniel Vetter 
> Cc: "Rafael J. Wysocki" 
> Cc: Rob Clark 
> Cc: Russell King 
> Cc: Saravana Kannan 
> Signed-off-by: Stephen Boyd 

When I test this on mt8195, which has two IOMMU HWs (so
component_aggregate_register is called twice), it aborts like this. What
should we do when there are two instances?
Thanks.

[2.652424] Error: Driver 'mtk_iommu_agg' is already registered,
aborting...
[2.654033] mtk-iommu: probe of 1c01f000.iommu failed with error -16
[2.662034] Unable to handle kernel NULL pointer dereference at
virtual address 0020
...
[2.672413] pc : aggregate_device_match+0xa8/0x1c8
[2.673027] lr : aggregate_device_match+0x68/0x1c8
...
[2.683091] Call trace:
[2.683403]  aggregate_device_match+0xa8/0x1c8
[2.683970]  __device_attach_driver+0x38/0xd0
[2.684526]  bus_for_each_drv+0x68/0xd0
[2.685015]  __device_attach+0xec/0x148
[2.685503]  device_attach+0x14/0x20
[2.685960]  bus_rescan_devices_helper+0x50/0x90
[2.686545]  bus_for_each_dev+0x7c/0xd8
[2.687033]  bus_rescan_devices+0x20/0x30
[2.687542]  __component_add+0x7c/0xa0
[2.688022]  component_add+0x14/0x20
[2.688479]  mtk_smi_larb_probe+0xe0/0x120


> ---
>  drivers/iommu/mtk_iommu.c| 14 +-
>  drivers/iommu/mtk_iommu.h|  6 --
>  drivers/iommu/mtk_iommu_v1.c | 14 +-
>  3 files changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 25b834104790..8e722898cbe2 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -752,9 +752,13 @@ static int mtk_iommu_hw_init(const struct
> mtk_iommu_data *data)
>   return 0;
>  }
>  
> -static const struct component_master_ops mtk_iommu_com_ops = {
> - .bind   = mtk_iommu_bind,
> - .unbind = mtk_iommu_unbind,
> +static struct aggregate_driver mtk_iommu_aggregate_driver = {
> + .probe  = mtk_iommu_bind,
> + .remove = mtk_iommu_unbind,
> + .driver = {
> + .name   = "mtk_iommu_agg",
> + .owner  = THIS_MODULE,
> + },
>  };
>  
>  static int mtk_iommu_probe(struct platform_device *pdev)
> @@ -895,7 +899,7 @@ static int mtk_iommu_probe(struct platform_device
> *pdev)
>   goto out_list_del;
>   }
>  
> - ret = component_master_add_with_match(dev, &mtk_iommu_com_ops,
> match);
> + ret = component_aggregate_register(dev,
> &mtk_iommu_aggregate_driver, match);
>   if (ret)
>   goto out_bus_set_null;
>   return ret;
> @@ -928,7 +932,7 @@ static int mtk_iommu_remove(struct
> platform_device *pdev)
>   device_link_remove(data->smicomm_dev, &pdev->dev);
>   pm_runtime_disable(&pdev->dev);
>   devm_free_irq(&pdev->dev, data->irq, data);
> - component_master_del(&pdev->dev, &mtk_iommu_com_ops);
> + component_aggregate_unregister(&pdev->dev,
> &mtk_iommu_aggregate_driver);
>   return 0;
>  }
>  
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index f81fa8862ed0..064fd4f4eade 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -94,15 +94,17 @@ static inline void release_of(struct device *dev,
> void *data)
>   of_node_put(data);
>  }
>  
> -static inline int mtk_iommu_bind(struct device *dev)
> +static inline int mtk_iommu_bind(struct aggregate_device *adev)
>  {
> + struct device *dev = adev->parent;
>   struct mtk_iommu_data *data = dev_get_drvdata(dev);
>  
>   return component_bind_all(dev, &data->larb_imu);
>  }
>  
> -static inline void mtk_iommu_unbind(struct device *dev)
> +static inline void mtk_iommu_unbind(struct aggregate_device *adev)
>  {
> + struct device *dev = adev->parent;
>   struct mtk_iommu_data *data = dev_get_drvdata(dev);
>  
>   component_unbind_all(dev, &data->larb_imu);
> diff --git a/drivers/iommu/mtk_iommu_v1.c
> b/drivers/iommu/mtk_iommu_v1.c
> index be22fcf988ce..5fb29058a165 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -534,9 +534,13 @@ static const struct of_device_id
> mtk_iommu_of_ids[] = {
>   {}
>  };
>  
> -static const struct component_master_ops mtk_iommu_com_ops = {
> - .bind   = mtk_iommu_bind,
> - .unbind = mtk_iommu_unbind,
> +static struct aggregate_driver mtk_iommu_aggregate_driver = {
> + .probe  = mtk_iommu_bind,
> + .remove = mtk_iommu_unbind,
> + .driver = {
> + .name   = "mtk_iommu_agg",
> + .owner  = THIS_MODULE,
> + },
>  };
>  
>  static int 

[PATCH 2/2] drm/i915/gt: make a gt sysfs group and move power management files

2022-01-11 Thread Andi Shyti
The GT has its own properties and in sysfs they should be grouped
in the 'gt/' directory.

Create a 'gt/' directory in sysfs which will contain gt0...gtN
directories related to each tile configured in the GPU. Move the
power management files inside those directories.

The previous power management files are kept in their original
root directory to avoid breaking the ABI. They point to tile
'0' and a warning message is printed whenever they are accessed.
The deprecated interface is only generated when the
CONFIG_SYSFS_DEPRECATED_V2 flag is set.

The new sysfs structure will have a similar layout for the 4 tile
case:

/sys/.../card0
 ├── gt
 │   ├── gt0
 │   │   ├── id
 │   │   ├── rc6_enable
 │   │   ├── rc6_residency_ms
 │   │   ├── rps_act_freq_mhz
 │   │   ├── rps_boost_freq_mhz
 │   │   ├── rps_cur_freq_mhz
 │   │   ├── rps_max_freq_mhz
 │   │   ├── rps_min_freq_mhz
 │   │   ├── rps_RP0_freq_mhz
 │   │   ├── rps_RP1_freq_mhz
 │   │   └── rps_RPn_freq_mhz
 .   .
 .   .
 .   .
 │   └── gt3
 │       ├── id
 │       ├── rc6_enable
 │       ├── rc6_residency_ms
 │       ├── rps_act_freq_mhz
 │       ├── rps_boost_freq_mhz
 │       ├── rps_cur_freq_mhz
 │       ├── rps_max_freq_mhz
 │       ├── rps_min_freq_mhz
 │       ├── rps_RP0_freq_mhz
 │       ├── rps_RP1_freq_mhz
 │       └── rps_RPn_freq_mhz
 ├── gt_act_freq_mhz   -+
 ├── gt_boost_freq_mhz  |
 ├── gt_cur_freq_mhz    |    Original interface
 ├── gt_max_freq_mhz    +──> kept as existing ABI;
 ├── gt_min_freq_mhz    |    it points to gt0/
 ├── gt_RP0_freq_mhz    |
 ├── gt_RP1_freq_mhz    |
 └── gt_RPn_freq_mhz   -+
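
For userspace that wants to move from the legacy files to the per-GT ones,
a minimal sketch of the name translation could look like this. It assumes
the only difference is the "gt_" -> "rps_" prefix seen in the listing above;
legacy_to_gt_path() is a hypothetical helper, not part of this patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the per-GT path for a legacy "gt_*" attribute, assuming the
 * only change is the "gt_" -> "rps_" prefix.
 * Returns 0 on success, -1 if the name is not a legacy one or the
 * buffer is too small.
 */
static int legacy_to_gt_path(const char *legacy, unsigned int gt,
			     char *buf, size_t len)
{
	int n;

	if (strncmp(legacy, "gt_", 3) != 0)
		return -1;

	n = snprintf(buf, len, "gt/gt%u/rps_%s", gt, legacy + 3);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With gt set to 0 this gives the file the legacy attribute points to, relative
to the card directory.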

Signed-off-by: Andi Shyti 
Signed-off-by: Lucas De Marchi 
Cc: Matt Roper 
Cc: Sujaritha Sundaresan 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/Makefile |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   2 +
 drivers/gpu/drm/i915/gt/sysfs_gt.c| 126 
 drivers/gpu/drm/i915/gt/sysfs_gt.h|  44 +++
 drivers/gpu/drm/i915/gt/sysfs_gt_pm.c | 394 ++
 drivers/gpu/drm/i915/gt/sysfs_gt_pm.h |  16 ++
 drivers/gpu/drm/i915/i915_drv.h   |   2 +
 drivers/gpu/drm/i915/i915_reg.h   |   1 +
 drivers/gpu/drm/i915/i915_sysfs.c | 315 +---
 drivers/gpu/drm/i915/i915_sysfs.h |   3 +
 10 files changed, 601 insertions(+), 306 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt.c
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt.h
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt_pm.c
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1b62b9f65196..0170fdd6f454 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -121,7 +121,9 @@ gt-y += \
gt/intel_timeline.o \
gt/intel_workarounds.o \
gt/shmem_utils.o \
-   gt/sysfs_engines.o
+   gt/sysfs_engines.o \
+   gt/sysfs_gt.o \
+   gt/sysfs_gt_pm.o
 # autogenerated null render state
 gt-y += \
gt/gen6_renderstate.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 5e062c9525f8..cfc0fc127522 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -24,6 +24,7 @@
 #include "intel_rps.h"
 #include "intel_uncore.h"
 #include "shmem_utils.h"
+#include "sysfs_gt.h"
 #include "pxp/intel_pxp.h"
 
 static void
@@ -452,6 +453,7 @@ void intel_gt_driver_register(struct intel_gt *gt)
intel_rps_driver_register(&gt->rps);
 
intel_gt_debugfs_register(gt);
+   intel_gt_sysfs_register(gt);
 }
 
 static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)
diff --git a/drivers/gpu/drm/i915/gt/sysfs_gt.c 
b/drivers/gpu/drm/i915/gt/sysfs_gt.c
new file mode 100644
index ..46cf033a53ec
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/sysfs_gt.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i915_drv.h"
+#include "i915_sysfs.h"
+#include "intel_gt.h"
+#include "intel_gt_types.h"
+#include "intel_rc6.h"
+
+#include "sysfs_gt.h"
+#include "sysfs_gt_pm.h"
+
+struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
+   const char *name)
+{
+   struct kobject *kobj = &dev->kobj;
+
+   /*
+* We are interested at knowing from where the interface
+* has been called, whether it's called from gt/ or from
+* the parent directory.
+* From the interface position it depends also the value of
+* the private data.
+* If the interface is called from gt/ then private data is
+  

[PATCH 1/2] drm/i915: Prepare for multiple GTs

2022-01-11 Thread Andi Shyti
From: Tvrtko Ursulin 

On a multi-tile platform, each tile has its own registers + GGTT
space, and BAR 0 is extended to cover all of them.

Up to four gts are supported in i915->gt[], with slot zero
shadowing the existing i915->gt0 to enable source compatibility
with legacy driver paths. A for_each_gt macro is added to iterate
over the GTs and will be used by upcoming patches that convert
various parts of the driver to be multi-gt aware.

Only the primary/root tile is initialized for now; the other
tiles will be detected and plugged in by future patches once the
necessary infrastructure is in place to handle them.
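
The iteration idea can be sketched in plain C as follows. This is a
userspace approximation with made-up names (toy_i915, toy_gt,
toy_for_each_gt), not the macro added by this patch; the only detail taken
from the text above is the limit of four slots.

```c
#include <stddef.h>

#define MAX_GT 4	/* "up to four gts" per the text above */

struct toy_gt { unsigned int id; };

struct toy_i915 {
	struct toy_gt *gt[MAX_GT];	/* NULL for tiles not set up */
};

/* Walk every populated slot; empty slots are skipped. */
#define toy_for_each_gt(gt__, i915__, id__) \
	for ((id__) = 0; (id__) < MAX_GT; (id__)++) \
		if (((gt__) = (i915__)->gt[(id__)]) == NULL) \
			continue; \
		else

/* Return a bitmask with one bit set per tile that is present. */
static unsigned int toy_gt_mask(struct toy_i915 *i915)
{
	struct toy_gt *gt;
	unsigned int id, mask = 0;

	toy_for_each_gt(gt, i915, id)
		mask |= 1u << gt->id;

	return mask;
}
```

With only the root tile populated the loop body runs once, which matches
the state after this patch.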

Signed-off-by: Abdiel Janulgue 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Matt Roper 
Signed-off-by: Andi Shyti 
Cc: Daniele Ceraolo Spurio 
Cc: Joonas Lahtinen 
Cc: Matthew Auld 
---
 drivers/gpu/drm/i915/gt/intel_gt.c| 139 --
 drivers/gpu/drm/i915/gt/intel_gt.h|  14 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   9 +-
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   7 +
 drivers/gpu/drm/i915/i915_driver.c|  29 ++--
 drivers/gpu/drm/i915/i915_drv.h   |   6 +
 drivers/gpu/drm/i915/intel_memory_region.h|   3 +
 drivers/gpu/drm/i915/intel_uncore.c   |  12 +-
 drivers/gpu/drm/i915/intel_uncore.h   |   3 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   5 +-
 10 files changed, 185 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 298ff32c8d0c..5e062c9525f8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -26,7 +26,8 @@
 #include "shmem_utils.h"
 #include "pxp/intel_pxp.h"
 
-void __intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
+static void
+__intel_gt_init_early(struct intel_gt *gt)
 {
spin_lock_init(&gt->irq_lock);
 
@@ -46,19 +47,27 @@ void __intel_gt_init_early(struct intel_gt *gt, struct 
drm_i915_private *i915)
intel_rps_init_early(&gt->rps);
 }
 
+/* Preliminary initialization of Tile 0 */
 void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
 {
gt->i915 = i915;
gt->uncore = &i915->uncore;
+
+   __intel_gt_init_early(gt);
 }
 
-int intel_gt_probe_lmem(struct intel_gt *gt)
+static int intel_gt_probe_lmem(struct intel_gt *gt)
 {
struct drm_i915_private *i915 = gt->i915;
+   unsigned int instance = gt->info.id;
struct intel_memory_region *mem;
int id;
int err;
 
+   id = INTEL_REGION_LMEM + instance;
+   if (drm_WARN_ON(&i915->drm, id >= INTEL_REGION_STOLEN_SMEM))
+   return -ENODEV;
+
mem = intel_gt_setup_lmem(gt);
if (mem == ERR_PTR(-ENODEV))
mem = intel_gt_setup_fake_lmem(gt);
@@ -73,9 +82,8 @@ int intel_gt_probe_lmem(struct intel_gt *gt)
return err;
}
 
-   id = INTEL_REGION_LMEM;
-
mem->id = id;
+   mem->instance = instance;
 
intel_memory_region_set_name(mem, "local%u", mem->instance);
 
@@ -790,16 +798,21 @@ void intel_gt_driver_release(struct intel_gt *gt)
intel_gt_fini_buffer_pool(gt);
 }
 
-void intel_gt_driver_late_release(struct intel_gt *gt)
+void intel_gt_driver_late_release(struct drm_i915_private *i915)
 {
+   struct intel_gt *gt;
+   unsigned int id;
+
/* We need to wait for inflight RCU frees to release their grip */
rcu_barrier();
 
-   intel_uc_driver_late_release(&gt->uc);
-   intel_gt_fini_requests(gt);
-   intel_gt_fini_reset(gt);
-   intel_gt_fini_timelines(gt);
-   intel_engines_free(gt);
+   for_each_gt(gt, i915, id) {
+   intel_uc_driver_late_release(&gt->uc);
+   intel_gt_fini_requests(gt);
+   intel_gt_fini_reset(gt);
+   intel_gt_fini_timelines(gt);
+   intel_engines_free(gt);
+   }
 }
 
 /**
@@ -908,6 +921,112 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, 
i915_reg_t reg)
return intel_uncore_read_fw(gt->uncore, reg);
 }
 
+static int
+intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
+{
+   struct drm_i915_private *i915 = gt->i915;
+   unsigned int id = gt->info.id;
+   int ret;
+
+   if (id) {
+   struct intel_uncore_mmio_debug *mmio_debug;
+   struct intel_uncore *uncore;
+
+   /* For multi-tile platforms BAR0 must have at least 16MB per 
tile */
+   if (GEM_WARN_ON(pci_resource_len(to_pci_dev(i915->drm.dev), 0) <
+   (id + 1) * SZ_16M))
+   return -EINVAL;
+
+   uncore = kzalloc(sizeof(*uncore), GFP_KERNEL);
+   if (!uncore)
+   return -ENOMEM;
+
+   mmio_debug = kzalloc(sizeof(*mmio_debug), GFP_KERNEL);
+   if (!mmio_debug) {
+   kfree(uncore);
+

[PATCH 0/2] Introduce multitile support

2022-01-11 Thread Andi Shyti
Hi,

This is the second series that prepares i915 to host multitile
platforms. It introduces the for_each_gt() macro that loops over
the tiles to perform per gt actions.

The first patch is a combination of two patches: one developed
originally by Abdiel, who introduced some refactoring during probe,
and one by Tvrtko, who added the necessary tools to start using the
various tiles.

The second patch re-organises the sysfs interface to expose the
API for each of the GTs. I decided to prioritise this patch
over others to unblock Sujaritha for further development.

A third series will still follow this.

Andi Shyti (1):
  drm/i915/gt: make a gt sysfs group and move power management files

Tvrtko Ursulin (1):
  drm/i915: Prepare for multiple GTs

 drivers/gpu/drm/i915/Makefile |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c| 141 ++-
 drivers/gpu/drm/i915/gt/intel_gt.h|  14 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   9 +-
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   7 +
 drivers/gpu/drm/i915/gt/sysfs_gt.c| 126 ++
 drivers/gpu/drm/i915/gt/sysfs_gt.h|  44 ++
 drivers/gpu/drm/i915/gt/sysfs_gt_pm.c | 394 ++
 drivers/gpu/drm/i915/gt/sysfs_gt_pm.h |  16 +
 drivers/gpu/drm/i915/i915_driver.c|  29 +-
 drivers/gpu/drm/i915/i915_drv.h   |   8 +
 drivers/gpu/drm/i915/i915_reg.h   |   1 +
 drivers/gpu/drm/i915/i915_sysfs.c | 315 +-
 drivers/gpu/drm/i915/i915_sysfs.h |   3 +
 drivers/gpu/drm/i915/intel_memory_region.h|   3 +
 drivers/gpu/drm/i915/intel_uncore.c   |  12 +-
 drivers/gpu/drm/i915/intel_uncore.h   |   3 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   5 +-
 18 files changed, 786 insertions(+), 348 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt.c
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt.h
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt_pm.c
 create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt_pm.h

-- 
2.34.1



Re: [PATCH v2 1/2] drm/mipi-dbi: Remove dependency on GEM CMA helper library

2022-01-11 Thread Noralf Trønnes
> The MIPI DBI helpers access struct drm_gem_cma_object.vaddr in a
> few places. Replace all instances with the correct generic GEM
> functions. Use drm_gem_fb_vmap() for mapping a framebuffer's GEM
> objects and drm_gem_fb_vunmap() for unmapping them. This removes
> the dependency on CMA helpers within MIPI DBI.
>
> Signed-off-by: Thomas Zimmermann 
> Reviewed-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/drm_mipi_dbi.c | 34 +-
>  1 file changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
> index 71b646c4131f..f80fd6c0ccf8 100644
> --- a/drivers/gpu/drm/drm_mipi_dbi.c
> +++ b/drivers/gpu/drm/drm_mipi_dbi.c
> @@ -15,9 +15,10 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -200,13 +201,19 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
> struct drm_rect *clip, bool swap)
>  {
>   struct drm_gem_object *gem = drm_gem_fb_get_obj(fb, 0);
> - struct drm_gem_cma_object *cma_obj = to_drm_gem_cma_obj(gem);
> - void *src = cma_obj->vaddr;
> + struct dma_buf_map map[DRM_FORMAT_MAX_PLANES];
> + struct dma_buf_map data[DRM_FORMAT_MAX_PLANES];
> + void *src;
>   int ret;
>
>   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>   if (ret)
>   return ret;
> + src = data[0].vaddr; /* TODO: Use mapping abstraction properly */

This assignment should be after the _vmap() call. The MIPI DBI drivers
are currently broken because of this.
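
To make the ordering problem concrete, here is a small userspace sketch
with stand-in names (toy_map, toy_vmap); it is not the DRM API. The map is
zero-initialised here so the example stays well defined, whereas in the
patch the array is uninitialised when it is read.

```c
#include <stddef.h>

struct toy_map { void *vaddr; };

static char backing[16];

/* Stand-in for a vmap call: only after it returns is vaddr valid. */
static int toy_vmap(struct toy_map *map)
{
	map->vaddr = backing;
	return 0;
}

/* Broken order: reads vaddr before the mapping has been set up. */
static void *src_before_vmap(void)
{
	struct toy_map map = { NULL };
	void *src = map.vaddr;	/* still NULL here */

	toy_vmap(&map);
	return src;
}

/* Fixed order: map first, then take the address. */
static void *src_after_vmap(void)
{
	struct toy_map map = { NULL };

	if (toy_vmap(&map))
		return NULL;
	return map.vaddr;
}
```

Only the second variant hands back a usable pointer.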

Noralf.

> +
> + ret = drm_gem_fb_vmap(fb, map, data);
> + if (ret)
> + goto out_drm_gem_fb_end_cpu_access;
>
>   switch (fb->format->format) {
>   case DRM_FORMAT_RGB565:
> @@ -221,9 +228,11 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
>   default:
>   drm_err_once(fb->dev, "Format is not supported: %p4cc\n",
>&fb->format->format);
> - return -EINVAL;
> + ret = -EINVAL;
>   }
>
> + drm_gem_fb_vunmap(fb, map);
> +out_drm_gem_fb_end_cpu_access:
>   drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
>
>   return ret;
>


[PATCH 8/8] drm/ast: Move SIL164-based connector code into separate helpers

2022-01-11 Thread Thomas Zimmermann
Add helpers for initializing SIL164-based connectors. These used to be
handled by the VGA connector code. But SIL164 provides output via DVI-I,
so set the encoder and connector types accordingly.

If a SIL164 chip has been detected, ast will now create a DVI-I
connector instead of a VGA connector.
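
The resulting choice of connector can be summarised with a small sketch.
The enum and function names are invented for illustration and do not exist
in ast; the DP501 case refers to the DisplayPort connector added in patch
7/8 of this series.

```c
/* Hypothetical stand-ins for the detected transmitter chip. */
enum toy_tx_chip { TOY_TX_NONE, TOY_TX_SIL164, TOY_TX_DP501 };

/* Hypothetical stand-ins for the connector that gets created. */
enum toy_connector { TOY_CONN_VGA, TOY_CONN_DVII, TOY_CONN_DP };

/* Pick the connector type from the transmitter chip, VGA by default. */
static enum toy_connector toy_pick_connector(enum toy_tx_chip chip)
{
	switch (chip) {
	case TOY_TX_SIL164:
		return TOY_CONN_DVII;	/* SIL164 outputs DVI-I */
	case TOY_TX_DP501:
		return TOY_CONN_DP;	/* DP501 outputs DisplayPort */
	default:
		return TOY_CONN_VGA;
	}
}
```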

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_drv.h  | 15 ++
 drivers/gpu/drm/ast/ast_mode.c | 99 +-
 2 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 420d19d8459e..c3a582372649 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -140,6 +140,17 @@ to_ast_vga_connector(struct drm_connector *connector)
return container_of(connector, struct ast_vga_connector, base);
 }
 
+struct ast_sil164_connector {
+   struct drm_connector base;
+   struct ast_i2c_chan *i2c;
+};
+
+static inline struct ast_sil164_connector *
+to_ast_sil164_connector(struct drm_connector *connector)
+{
+   return container_of(connector, struct ast_sil164_connector, base);
+}
+
 /*
  * Device
  */
@@ -165,6 +176,10 @@ struct ast_private {
struct drm_encoder encoder;
struct ast_vga_connector vga_connector;
} vga;
+   struct {
+   struct drm_encoder encoder;
+   struct ast_sil164_connector sil164_connector;
+   } sil164;
struct {
struct drm_encoder encoder;
struct drm_connector connector;
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index a0f4f042141e..f9daeb8d801a 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1344,6 +1344,100 @@ static int ast_vga_output_init(struct ast_private *ast)
return 0;
 }
 
+/*
+ * SIL164 Connector
+ */
+
+static int ast_sil164_connector_helper_get_modes(struct drm_connector 
*connector)
+{
+   struct ast_sil164_connector *ast_sil164_connector = 
to_ast_sil164_connector(connector);
+   struct edid *edid;
+   int count;
+
+   if (!ast_sil164_connector->i2c)
+   goto err_drm_connector_update_edid_property;
+
+   edid = drm_get_edid(connector, &ast_sil164_connector->i2c->adapter);
+   if (!edid)
+   goto err_drm_connector_update_edid_property;
+
+   count = drm_add_edid_modes(connector, edid);
+   kfree(edid);
+
+   return count;
+
+err_drm_connector_update_edid_property:
+   drm_connector_update_edid_property(connector, NULL);
+   return 0;
+}
+
+static const struct drm_connector_helper_funcs 
ast_sil164_connector_helper_funcs = {
+   .get_modes = ast_sil164_connector_helper_get_modes,
+};
+
+static const struct drm_connector_funcs ast_sil164_connector_funcs = {
+   .reset = drm_atomic_helper_connector_reset,
+   .fill_modes = drm_helper_probe_single_connector_modes,
+   .destroy = drm_connector_cleanup,
+   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+};
+
+static int ast_sil164_connector_init(struct drm_device *dev,
+struct ast_sil164_connector 
*ast_sil164_connector)
+{
+   struct drm_connector *connector = &ast_sil164_connector->base;
+   int ret;
+
+   ast_sil164_connector->i2c = ast_i2c_create(dev);
+   if (!ast_sil164_connector->i2c)
+   drm_err(dev, "failed to add ddc bus for connector\n");
+
+   if (ast_sil164_connector->i2c)
+   ret = drm_connector_init_with_ddc(dev, connector, 
&ast_sil164_connector_funcs,
+ DRM_MODE_CONNECTOR_DVII,
+ 
&ast_sil164_connector->i2c->adapter);
+   else
+   ret = drm_connector_init(dev, connector, 
&ast_sil164_connector_funcs,
+DRM_MODE_CONNECTOR_DVII);
+   if (ret)
+   return ret;
+
+   drm_connector_helper_add(connector, &ast_sil164_connector_helper_funcs);
+
+   connector->interlace_allowed = 0;
+   connector->doublescan_allowed = 0;
+
+   connector->polled = DRM_CONNECTOR_POLL_CONNECT;
+
+   return 0;
+}
+
+static int ast_sil164_output_init(struct ast_private *ast)
+{
+   struct drm_device *dev = &ast->base;
+   struct drm_crtc *crtc = &ast->crtc;
+   struct drm_encoder *encoder = &ast->output.sil164.encoder;
+   struct ast_sil164_connector *ast_sil164_connector = 
&ast->output.sil164.sil164_connector;
+   struct drm_connector *connector = &ast_sil164_connector->base;
+   int ret;
+
+   ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_TMDS);
+   if (ret)
+   return ret;
+   encoder->possible_crtcs = drm_crtc_mask(crtc);
+
+  

[PATCH 7/8] drm/ast: Move DP501-based connector code into separate helpers

2022-01-11 Thread Thomas Zimmermann
Add helpers for DP501-based connectors. DP501 provides output via
DisplayPort. This used to be handled by the VGA connector code.

If a DP501 chip has been detected, ast will now create a DisplayPort
connector instead of a VGA connector.

Remove the DP501 code from ast_vga_connector_helper_get_modes(). Also
remove the call to drm_connector_update_edid_property(), which is
performed by drm_get_edid().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_drv.h  |   4 ++
 drivers/gpu/drm/ast/ast_mode.c | 128 +++--
 2 files changed, 109 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index cda50fb887ed..420d19d8459e 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -165,6 +165,10 @@ struct ast_private {
struct drm_encoder encoder;
struct ast_vga_connector vga_connector;
} vga;
+   struct {
+   struct drm_encoder encoder;
+   struct drm_connector connector;
+   } dp501;
} output;
 
bool support_wide_screen;
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 17e4e038a3ed..a0f4f042141e 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1256,30 +1257,22 @@ static int ast_crtc_init(struct drm_device *dev)
 static int ast_vga_connector_helper_get_modes(struct drm_connector *connector)
 {
struct ast_vga_connector *ast_vga_connector = 
to_ast_vga_connector(connector);
-   struct ast_private *ast = to_ast_private(connector->dev);
-   struct edid *edid = NULL;
-   bool flags = false;
-   int ret;
+   struct edid *edid;
+   int count;
 
-   if (ast->tx_chip_type == AST_TX_DP501) {
-   edid = kmalloc(128, GFP_KERNEL);
-   if (!edid)
-   return -ENOMEM;
+   if (!ast_vga_connector->i2c)
+   goto err_drm_connector_update_edid_property;
 
-   flags = ast_dp501_read_edid(connector->dev, (u8 *)edid);
-   if (!flags) {
-   kfree(edid);
-   edid = NULL;
-   }
-   }
-   if (!flags && ast_vga_connector->i2c)
-   edid = drm_get_edid(connector, &ast_vga_connector->i2c->adapter);
-   if (edid) {
-   drm_connector_update_edid_property(connector, edid);
-   ret = drm_add_edid_modes(connector, edid);
-   kfree(edid);
-   return ret;
-   }
+   edid = drm_get_edid(connector, &ast_vga_connector->i2c->adapter);
+   if (!edid)
+   goto err_drm_connector_update_edid_property;
+
+   count = drm_add_edid_modes(connector, edid);
+   kfree(edid);
+
+   return count;
+
+err_drm_connector_update_edid_property:
drm_connector_update_edid_property(connector, NULL);
return 0;
 }
@@ -1351,6 +1344,92 @@ static int ast_vga_output_init(struct ast_private *ast)
return 0;
 }
 
+/*
+ * DP501 Connector
+ */
+
+static int ast_dp501_connector_helper_get_modes(struct drm_connector *connector)
+{
+   void *edid;
+   bool succ;
+   int count;
+
+   edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
+   if (!edid)
+   goto err_drm_connector_update_edid_property;
+
+   succ = ast_dp501_read_edid(connector->dev, edid);
+   if (!succ)
+   goto err_kfree;
+
+   drm_connector_update_edid_property(connector, edid);
+   count = drm_add_edid_modes(connector, edid);
+   kfree(edid);
+
+   return count;
+
+err_kfree:
+   kfree(edid);
+err_drm_connector_update_edid_property:
+   drm_connector_update_edid_property(connector, NULL);
+   return 0;
+}
+
+static const struct drm_connector_helper_funcs ast_dp501_connector_helper_funcs = {
+   .get_modes = ast_dp501_connector_helper_get_modes,
+};
+
+static const struct drm_connector_funcs ast_dp501_connector_funcs = {
+   .reset = drm_atomic_helper_connector_reset,
+   .fill_modes = drm_helper_probe_single_connector_modes,
+   .destroy = drm_connector_cleanup,
+   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+};
+
+static int ast_dp501_connector_init(struct drm_device *dev, struct drm_connector *connector)
+{
+   int ret;
+
+   ret = drm_connector_init(dev, connector, &ast_dp501_connector_funcs,
+DRM_MODE_CONNECTOR_DisplayPort);
+   if (ret)
+   return ret;
+
+   drm_connector_helper_add(connector, &ast_dp501_connector_helper_funcs);
+
+   connector->interlace_allowed = 0;
+   connector->doublescan_allowed = 0;
+
+   connector->polled = DRM_CONNEC
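
The error handling in the DP501 get_modes helper above follows the
kernel's goto-unwind style. A minimal user-space sketch of that control
flow, assuming hypothetical stand-ins (sketch_read_edid(),
sketch_add_modes()) in place of ast_dp501_read_edid() and
drm_add_edid_modes(); it is not the driver's code:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_EDID_LENGTH 128 /* size of one EDID block in bytes */

/* Hypothetical stand-in for ast_dp501_read_edid(): fills buf or fails. */
static bool sketch_read_edid(bool device_ok, uint8_t *buf)
{
	if (!device_ok)
		return false;
	memset(buf, 0, SKETCH_EDID_LENGTH);
	buf[1] = 0xff; /* mark the buffer as holding data */
	return true;
}

/* Hypothetical stand-in for drm_add_edid_modes(): reports one mode. */
static int sketch_add_modes(const uint8_t *edid)
{
	return edid[1] == 0xff ? 1 : 0;
}

/* Returns the number of modes found, or 0 on any failure. */
static int sketch_get_modes(bool device_ok)
{
	uint8_t *edid;
	int count;

	edid = malloc(SKETCH_EDID_LENGTH);
	if (!edid)
		goto err_no_edid;

	if (!sketch_read_edid(device_ok, edid))
		goto err_free;

	count = sketch_add_modes(edid);
	free(edid);
	return count;

err_free:
	free(edid);
err_no_edid:
	/* The driver clears the connector's EDID property at this point. */
	return 0;
}
```

Each label releases only what was acquired before the jump, so the
success path and both failure paths free the buffer exactly once.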

[PATCH 3/8] drm/ast: Remove AST_TX_ITE66121 constant

2022-01-11 Thread Thomas Zimmermann
The ITE66121 is an HDMI transmitter chip. There's no code for
detecting or programming the chip within ast. Remove the enum
constant.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_drv.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 00bfa41ff7cb..6e77be1d06d3 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -69,7 +69,6 @@ enum ast_chip {
 enum ast_tx_chip {
AST_TX_NONE,
AST_TX_SIL164,
-   AST_TX_ITE66121,
AST_TX_DP501,
 };
 
-- 
2.34.1



[PATCH 2/8] drm/ast: Move connector mode_valid function to CRTC

2022-01-11 Thread Thomas Zimmermann
The tests in ast_mode_valid() verify the correct resolution for the
supplied mode. This is a limitation of the CRTC, so move the function
to the CRTC helpers. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_mode.c | 129 +
 1 file changed, 66 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 20626c78a693..c555960a488a 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1002,6 +1002,71 @@ static void ast_crtc_dpms(struct drm_crtc *crtc, int mode)
}
 }
 
+static enum drm_mode_status
+ast_crtc_helper_mode_valid(struct drm_crtc *crtc, const struct drm_display_mode *mode)
+{
+   struct ast_private *ast = to_ast_private(crtc->dev);
+   enum drm_mode_status status;
+   uint32_t jtemp;
+
+   if (ast->support_wide_screen) {
+   if ((mode->hdisplay == 1680) && (mode->vdisplay == 1050))
+   return MODE_OK;
+   if ((mode->hdisplay == 1280) && (mode->vdisplay == 800))
+   return MODE_OK;
+   if ((mode->hdisplay == 1440) && (mode->vdisplay == 900))
+   return MODE_OK;
+   if ((mode->hdisplay == 1360) && (mode->vdisplay == 768))
+   return MODE_OK;
+   if ((mode->hdisplay == 1600) && (mode->vdisplay == 900))
+   return MODE_OK;
+
+   if ((ast->chip == AST2100) || (ast->chip == AST2200) ||
+   (ast->chip == AST2300) || (ast->chip == AST2400) ||
+   (ast->chip == AST2500)) {
+   if ((mode->hdisplay == 1920) && (mode->vdisplay == 1080))
+   return MODE_OK;
+
+   if ((mode->hdisplay == 1920) && (mode->vdisplay == 1200)) {
+   jtemp = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xd1, 0xff);
+   if (jtemp & 0x01)
+   return MODE_NOMODE;
+   else
+   return MODE_OK;
+   }
+   }
+   }
+
+   status = MODE_NOMODE;
+
+   switch (mode->hdisplay) {
+   case 640:
+   if (mode->vdisplay == 480)
+   status = MODE_OK;
+   break;
+   case 800:
+   if (mode->vdisplay == 600)
+   status = MODE_OK;
+   break;
+   case 1024:
+   if (mode->vdisplay == 768)
+   status = MODE_OK;
+   break;
+   case 1280:
+   if (mode->vdisplay == 1024)
+   status = MODE_OK;
+   break;
+   case 1600:
+   if (mode->vdisplay == 1200)
+   status = MODE_OK;
+   break;
+   default:
+   break;
+   }
+
+   return status;
+}
+
 static int ast_crtc_helper_atomic_check(struct drm_crtc *crtc,
struct drm_atomic_state *state)
 {
@@ -1104,6 +1169,7 @@ ast_crtc_helper_atomic_disable(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs ast_crtc_helper_funcs = {
+   .mode_valid = ast_crtc_helper_mode_valid,
.atomic_check = ast_crtc_helper_atomic_check,
.atomic_flush = ast_crtc_helper_atomic_flush,
.atomic_enable = ast_crtc_helper_atomic_enable,
@@ -1238,71 +1304,8 @@ static int ast_get_modes(struct drm_connector *connector)
return 0;
 }
 
-static enum drm_mode_status ast_mode_valid(struct drm_connector *connector,
- struct drm_display_mode *mode)
-{
-   struct ast_private *ast = to_ast_private(connector->dev);
-   int flags = MODE_NOMODE;
-   uint32_t jtemp;
-
-   if (ast->support_wide_screen) {
-   if ((mode->hdisplay == 1680) && (mode->vdisplay == 1050))
-   return MODE_OK;
-   if ((mode->hdisplay == 1280) && (mode->vdisplay == 800))
-   return MODE_OK;
-   if ((mode->hdisplay == 1440) && (mode->vdisplay == 900))
-   return MODE_OK;
-   if ((mode->hdisplay == 1360) && (mode->vdisplay == 768))
-   return MODE_OK;
-   if ((mode->hdisplay == 1600) && (mode->vdisplay == 900))
-   return MODE_OK;
-
-   if ((ast->chip == AST2100) || (ast->chip == AST2200) ||
-   (ast->chip == AST2300) || (ast->chip == AST2400) ||
-   (ast->chip == AST2500)) {
-   if ((mode->hdisplay == 1920) && (mode->vdisplay == 1080))
-   return MODE_OK;
-
-   if ((mode->hdisplay == 1920) && (mode->vdisplay == 
1200)) {
-   jtemp = ast_get_index_reg_mask(ast, 
AST_IO_CRT
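
The resolution checks moved by this patch can be illustrated outside the
kernel. The sketch below is a table-driven rewrite, not the driver's
if/switch chain, and it omits the chip-specific 1920x1080 and 1920x1200
cases because they depend on hardware state. The struct and function
names are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_mode {
	int hdisplay;
	int vdisplay;
};

/* Resolutions accepted on every chip (the switch statement in the patch). */
static const struct sketch_mode sketch_base_modes[] = {
	{ 640, 480 }, { 800, 600 }, { 1024, 768 }, { 1280, 1024 }, { 1600, 1200 },
};

/* Extra resolutions accepted when support_wide_screen is set. */
static const struct sketch_mode sketch_wide_modes[] = {
	{ 1680, 1050 }, { 1280, 800 }, { 1440, 900 }, { 1360, 768 }, { 1600, 900 },
};

static bool sketch_in_table(const struct sketch_mode *table, size_t n,
			    int hdisplay, int vdisplay)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].hdisplay == hdisplay && table[i].vdisplay == vdisplay)
			return true;
	}
	return false;
}

/* Returns true where the patch returns MODE_OK, false for MODE_NOMODE. */
static bool sketch_mode_valid(bool support_wide_screen, int hdisplay, int vdisplay)
{
	if (support_wide_screen &&
	    sketch_in_table(sketch_wide_modes,
			    sizeof(sketch_wide_modes) / sizeof(sketch_wide_modes[0]),
			    hdisplay, vdisplay))
		return true;

	return sketch_in_table(sketch_base_modes,
			       sizeof(sketch_base_modes) / sizeof(sketch_base_modes[0]),
			       hdisplay, vdisplay);
}
```

As in the patch, a wide-screen-capable device still falls through to the
base list when the wide-screen list does not match.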

[PATCH 4/8] drm/ast: Remove unused value dp501_maxclk

2022-01-11 Thread Thomas Zimmermann
Remove the code that reads the DP501 link rate. The connector code
maintains the resulting value, dp501_maxclk, but never uses it.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_dp501.c | 58 -
 drivers/gpu/drm/ast/ast_drv.h   |  1 -
 drivers/gpu/drm/ast/ast_mode.c  |  7 ++--
 3 files changed, 3 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
index cd93c44f2662..204c926a18ea 100644
--- a/drivers/gpu/drm/ast/ast_dp501.c
+++ b/drivers/gpu/drm/ast/ast_dp501.c
@@ -272,64 +272,6 @@ static bool ast_launch_m68k(struct drm_device *dev)
return true;
 }
 
-u8 ast_get_dp501_max_clk(struct drm_device *dev)
-{
-   struct ast_private *ast = to_ast_private(dev);
-   u32 boot_address, offset, data;
-   u8 linkcap[4], linkrate, linklanes, maxclk = 0xff;
-   u32 *plinkcap;
-
-   if (ast->config_mode == ast_use_p2a) {
-   boot_address = get_fw_base(ast);
-
-   /* validate FW version */
-   offset = AST_DP501_GBL_VERSION;
-   data = ast_mindwm(ast, boot_address + offset);
-   if ((data & AST_DP501_FW_VERSION_MASK) != AST_DP501_FW_VERSION_1) /* version: 1x */
-   return maxclk;
-
-   /* Read Link Capability */
-   offset  = AST_DP501_LINKRATE;
-   plinkcap = (u32 *)linkcap;
-   *plinkcap  = ast_mindwm(ast, boot_address + offset);
-   if (linkcap[2] == 0) {
-   linkrate = linkcap[0];
-   linklanes = linkcap[1];
-   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * linklanes);
-   if (data > 0xff)
-   data = 0xff;
-   maxclk = (u8)data;
-   }
-   } else {
-   if (!ast->dp501_fw_buf)
-   return AST_DP501_DEFAULT_DCLK;  /* 1024x768 as default */
-
-   /* dummy read */
-   offset = 0x;
-   data = readl(ast->dp501_fw_buf + offset);
-
-   /* validate FW version */
-   offset = AST_DP501_GBL_VERSION;
-   data = readl(ast->dp501_fw_buf + offset);
-   if ((data & AST_DP501_FW_VERSION_MASK) != AST_DP501_FW_VERSION_1) /* version: 1x */
-   return maxclk;
-
-   /* Read Link Capability */
-   offset = AST_DP501_LINKRATE;
-   plinkcap = (u32 *)linkcap;
-   *plinkcap = readl(ast->dp501_fw_buf + offset);
-   if (linkcap[2] == 0) {
-   linkrate = linkcap[0];
-   linklanes = linkcap[1];
-   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * linklanes);
-   if (data > 0xff)
-   data = 0xff;
-   maxclk = (u8)data;
-   }
-   }
-   return maxclk;
-}
-
 bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata)
 {
struct ast_private *ast = to_ast_private(dev);
diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 6e77be1d06d3..479bb120dd05 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -171,7 +171,6 @@ struct ast_private {
} config_mode;
 
enum ast_tx_chip tx_chip_type;
-   u8 dp501_maxclk;
u8 *dp501_fw_addr;
const struct firmware *dp501_fw;/* dp501 fw */
 };
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index c555960a488a..0a8aa6e3aa38 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1281,16 +1281,15 @@ static int ast_get_modes(struct drm_connector *connector)
int ret;
 
if (ast->tx_chip_type == AST_TX_DP501) {
-   ast->dp501_maxclk = 0xff;
edid = kmalloc(128, GFP_KERNEL);
if (!edid)
return -ENOMEM;
 
flags = ast_dp501_read_edid(connector->dev, (u8 *)edid);
-   if (flags)
-   ast->dp501_maxclk = ast_get_dp501_max_clk(connector->dev);
-   else
+   if (!flags) {
kfree(edid);
+   edid = NULL;
+   }
}
if (!flags && ast_connector->i2c)
edid = drm_get_edid(connector, &ast_connector->i2c->adapter);
-- 
2.34.1
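
For reference, the arithmetic in the removed ast_get_dp501_max_clk() can
be reproduced in isolation. This is a sketch of the clamp only, assuming
the link rate and lane count have already been extracted from the
firmware's link-capability word; the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Mirrors the removed computation: 90 per lane when linkrate is 0x0a,
 * otherwise 54 per lane, saturated to the range of a u8.
 */
static uint8_t sketch_dp501_max_clk(uint8_t linkrate, uint8_t linklanes)
{
	uint32_t data = (linkrate == 0x0a) ? (90u * linklanes) : (54u * linklanes);

	if (data > 0xff)
		data = 0xff;
	return (uint8_t)data;
}
```

With four lanes at link rate 0x0a the product is 360, so the result
saturates at 0xff.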



[PATCH 5/8] drm/ast: Rename struct ast_connector to struct ast_vga_connector

2022-01-11 Thread Thomas Zimmermann
Prepare for introducing other connectors besides VGA. No functional
changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_drv.h  | 10 
 drivers/gpu/drm/ast/ast_mode.c | 45 +-
 2 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 479bb120dd05..e1cb31acdaac 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -129,15 +129,15 @@ struct ast_i2c_chan {
struct i2c_algo_bit_data bit;
 };
 
-struct ast_connector {
+struct ast_vga_connector {
struct drm_connector base;
struct ast_i2c_chan *i2c;
 };
 
-static inline struct ast_connector *
-to_ast_connector(struct drm_connector *connector)
+static inline struct ast_vga_connector *
+to_ast_vga_connector(struct drm_connector *connector)
 {
-   return container_of(connector, struct ast_connector, base);
+   return container_of(connector, struct ast_vga_connector, base);
 }
 
 /*
@@ -161,7 +161,7 @@ struct ast_private {
struct ast_cursor_plane cursor_plane;
struct drm_crtc crtc;
struct drm_encoder encoder;
-   struct ast_connector connector;
+   struct ast_vga_connector connector;
 
bool support_wide_screen;
enum {
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 0a8aa6e3aa38..f7f4034cc91e 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1269,12 +1269,12 @@ static int ast_encoder_init(struct drm_device *dev)
 }
 
 /*
- * Connector
+ * VGA Connector
  */
 
-static int ast_get_modes(struct drm_connector *connector)
+static int ast_vga_connector_helper_get_modes(struct drm_connector *connector)
 {
-   struct ast_connector *ast_connector = to_ast_connector(connector);
+   struct ast_vga_connector *ast_vga_connector = to_ast_vga_connector(connector);
struct ast_private *ast = to_ast_private(connector->dev);
struct edid *edid = NULL;
bool flags = false;
@@ -1291,23 +1291,23 @@ static int ast_get_modes(struct drm_connector *connector)
edid = NULL;
}
}
-   if (!flags && ast_connector->i2c)
-   edid = drm_get_edid(connector, &ast_connector->i2c->adapter);
+   if (!flags && ast_vga_connector->i2c)
+   edid = drm_get_edid(connector, &ast_vga_connector->i2c->adapter);
if (edid) {
-   drm_connector_update_edid_property(&ast_connector->base, edid);
+   drm_connector_update_edid_property(connector, edid);
ret = drm_add_edid_modes(connector, edid);
kfree(edid);
return ret;
}
-   drm_connector_update_edid_property(&ast_connector->base, NULL);
+   drm_connector_update_edid_property(connector, NULL);
return 0;
 }
 
-static const struct drm_connector_helper_funcs ast_connector_helper_funcs = {
-   .get_modes = ast_get_modes,
+static const struct drm_connector_helper_funcs ast_vga_connector_helper_funcs = {
+   .get_modes = ast_vga_connector_helper_get_modes,
 };
 
-static const struct drm_connector_funcs ast_connector_funcs = {
+static const struct drm_connector_funcs ast_vga_connector_funcs = {
.reset = drm_atomic_helper_connector_reset,
.fill_modes = drm_helper_probe_single_connector_modes,
.destroy = drm_connector_cleanup,
@@ -1315,29 +1315,29 @@ static const struct drm_connector_funcs ast_connector_funcs = {
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
-static int ast_connector_init(struct drm_device *dev)
+static int ast_vga_connector_init(struct drm_device *dev)
 {
struct ast_private *ast = to_ast_private(dev);
-   struct ast_connector *ast_connector = &ast->connector;
-   struct drm_connector *connector = &ast_connector->base;
+   struct ast_vga_connector *ast_vga_connector = &ast->connector;
+   struct drm_connector *connector = &ast_vga_connector->base;
struct drm_encoder *encoder = &ast->encoder;
int ret;
 
-   ast_connector->i2c = ast_i2c_create(dev);
-   if (!ast_connector->i2c)
+   ast_vga_connector->i2c = ast_i2c_create(dev);
+   if (!ast_vga_connector->i2c)
drm_err(dev, "failed to add ddc bus for connector\n");
 
-   if (ast_connector->i2c)
-   ret = drm_connector_init_with_ddc(dev, connector, &ast_connector_funcs,
+   if (ast_vga_connector->i2c)
+   ret = drm_connector_init_with_ddc(dev, connector, &ast_vga_connector_funcs,
  DRM_MODE_CONNECTOR_VGA,
- &ast_connector->i2c->adapter);
+ &ast_vga_connector->i2c->adapter);
else
-   ret = drm_connector_init(dev, connector, &ast_connector_funcs,
+   ret = drm

[PATCH 6/8] drm/ast: Initialize encoder and connector for VGA in helper function

2022-01-11 Thread Thomas Zimmermann
Move encoder and connector initialization into a single helper and
put all related mode-setting structures into a single place. This is
done in preparation for moving the transmitter code into separate
helpers. No functional changes.
Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_drv.h  |  8 +++--
 drivers/gpu/drm/ast/ast_mode.c | 62 --
 2 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index e1cb31acdaac..cda50fb887ed 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -160,8 +160,12 @@ struct ast_private {
struct drm_plane primary_plane;
struct ast_cursor_plane cursor_plane;
struct drm_crtc crtc;
-   struct drm_encoder encoder;
-   struct ast_vga_connector connector;
+   union {
+   struct {
+   struct drm_encoder encoder;
+   struct ast_vga_connector vga_connector;
+   } vga;
+   } output;
 
bool support_wide_screen;
enum {
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index f7f4034cc91e..17e4e038a3ed 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1249,25 +1249,6 @@ static int ast_crtc_init(struct drm_device *dev)
return 0;
 }
 
-/*
- * Encoder
- */
-
-static int ast_encoder_init(struct drm_device *dev)
-{
-   struct ast_private *ast = to_ast_private(dev);
-   struct drm_encoder *encoder = &ast->encoder;
-   int ret;
-
-   ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_DAC);
-   if (ret)
-   return ret;
-
-   encoder->possible_crtcs = 1;
-
-   return 0;
-}
-
 /*
  * VGA Connector
  */
@@ -1315,12 +1296,10 @@ static const struct drm_connector_funcs ast_vga_connector_funcs = {
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
-static int ast_vga_connector_init(struct drm_device *dev)
+static int ast_vga_connector_init(struct drm_device *dev,
+ struct ast_vga_connector *ast_vga_connector)
 {
-   struct ast_private *ast = to_ast_private(dev);
-   struct ast_vga_connector *ast_vga_connector = &ast->connector;
struct drm_connector *connector = &ast_vga_connector->base;
-   struct drm_encoder *encoder = &ast->encoder;
int ret;
 
ast_vga_connector->i2c = ast_i2c_create(dev);
@@ -1344,7 +1323,30 @@ static int ast_vga_connector_init(struct drm_device *dev)
 
connector->polled = DRM_CONNECTOR_POLL_CONNECT;
 
-   drm_connector_attach_encoder(connector, encoder);
+   return 0;
+}
+
+static int ast_vga_output_init(struct ast_private *ast)
+{
+   struct drm_device *dev = &ast->base;
+   struct drm_crtc *crtc = &ast->crtc;
+   struct drm_encoder *encoder = &ast->output.vga.encoder;
+   struct ast_vga_connector *ast_vga_connector = &ast->output.vga.vga_connector;
+   struct drm_connector *connector = &ast_vga_connector->base;
+   int ret;
+
+   ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_DAC);
+   if (ret)
+   return ret;
+   encoder->possible_crtcs = drm_crtc_mask(crtc);
+
+   ret = ast_vga_connector_init(dev, ast_vga_connector);
+   if (ret)
+   return ret;
+
+   ret = drm_connector_attach_encoder(connector, encoder);
+   if (ret)
+   return ret;
 
return 0;
 }
@@ -1405,8 +1407,16 @@ int ast_mode_config_init(struct ast_private *ast)
return ret;
 
ast_crtc_init(dev);
-   ast_encoder_init(dev);
-   ast_vga_connector_init(dev);
+
+   switch (ast->tx_chip_type) {
+   case AST_TX_NONE:
+   case AST_TX_SIL164:
+   case AST_TX_DP501:
+   ret = ast_vga_output_init(ast);
+   break;
+   }
+   if (ret)
+   return ret;
 
drm_mode_config_reset(dev);
 
-- 
2.34.1
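
The patch replaces the hard-coded possible_crtcs = 1 with
drm_crtc_mask(crtc). In the kernel that helper returns a bitmask with
the bit for the CRTC's index set. A user-space sketch of the same idea,
with a hypothetical struct standing in for struct drm_crtc:

```c
#include <stdint.h>

/* Hypothetical stand-in for struct drm_crtc; only the index matters here. */
struct sketch_crtc {
	unsigned int index;
};

/* Same idea as drm_crtc_mask(): one bit per CRTC, selected by its index. */
static uint32_t sketch_crtc_mask(const struct sketch_crtc *crtc)
{
	return 1u << crtc->index;
}
```

Assuming the driver's only CRTC has index 0, the mask evaluates to 1,
which is consistent with the "no functional changes" note in the commit
message.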



[PATCH 1/8] drm/ast: Fail if connector initialization fails

2022-01-11 Thread Thomas Zimmermann
Update the connector code to fail if the connector could not be
initialized. The current code ignores the error and only fails
later, when the connector is supposed to be used.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_mode.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 956c8982192b..20626c78a693 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1319,18 +1319,21 @@ static int ast_connector_init(struct drm_device *dev)
struct ast_connector *ast_connector = &ast->connector;
struct drm_connector *connector = &ast_connector->base;
struct drm_encoder *encoder = &ast->encoder;
+   int ret;
 
ast_connector->i2c = ast_i2c_create(dev);
if (!ast_connector->i2c)
drm_err(dev, "failed to add ddc bus for connector\n");
 
if (ast_connector->i2c)
-   drm_connector_init_with_ddc(dev, connector, &ast_connector_funcs,
-   DRM_MODE_CONNECTOR_VGA,
-   &ast_connector->i2c->adapter);
+   ret = drm_connector_init_with_ddc(dev, connector, &ast_connector_funcs,
+ DRM_MODE_CONNECTOR_VGA,
+ &ast_connector->i2c->adapter);
else
-   drm_connector_init(dev, connector, &ast_connector_funcs,
-  DRM_MODE_CONNECTOR_VGA);
+   ret = drm_connector_init(dev, connector, &ast_connector_funcs,
+DRM_MODE_CONNECTOR_VGA);
+   if (ret)
+   return ret;
 
drm_connector_helper_add(connector, &ast_connector_helper_funcs);
 
-- 
2.34.1
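
The change boils down to checking the return value of the init call and
bailing out before any further setup. A minimal sketch of that pattern,
with hypothetical stubs in place of drm_connector_init() and
drm_connector_helper_add():

```c
#include <errno.h>

/* Hypothetical stub for drm_connector_init(): 0 on success, -errno on failure. */
static int sketch_connector_core_init(int simulate_failure)
{
	return simulate_failure ? -ENOMEM : 0;
}

/*
 * Propagates the init error to the caller. helpers_added records whether
 * the step after initialization (drm_connector_helper_add() in the driver)
 * was reached.
 */
static int sketch_connector_init(int simulate_failure, int *helpers_added)
{
	int ret;

	ret = sketch_connector_core_init(simulate_failure);
	if (ret)
		return ret;

	*helpers_added = 1;
	return 0;
}
```

Without the early return, the later steps would run against a connector
that was never set up, which is the delayed failure the commit message
describes.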


