date:20210521

Re: [PATCH 2/2] drivers/firmware: consolidate EFI framebuffer setup for all arches

2021-05-21 Thread Thomas Zimmermann


Hi

Am 21.05.21 um 21:37 schrieb Javier Martinez Canillas:

The register_gop_device() function registers an "efi-framebuffer" platform
device to match against the efifb driver, to have an early framebuffer for
EFI platforms.

But the Generic System Framebuffers (sysfb) already has support for this.

Instead of having duplicated logic for x86 and other architectures using
EFI, consolidate the two in sysfb and remove it from the EFI init logic.

Signed-off-by: Javier Martinez Canillas 
---

  arch/arm/Kconfig  |  1 +
  arch/arm/include/asm/efi.h|  5 +-
  arch/arm64/Kconfig|  1 +
  arch/arm64/include/asm/efi.h  |  5 +-
  arch/riscv/Kconfig|  1 +
  arch/riscv/include/asm/efi.h  |  5 +-
  drivers/firmware/Kconfig  |  7 ++-
  drivers/firmware/Makefile |  2 +-
  drivers/firmware/efi/efi-init.c   | 90 ---
  drivers/firmware/efi/sysfb_efi.c  | 77 +-
  drivers/firmware/sysfb.c  | 40 +-
  drivers/firmware/sysfb_simplefb.c | 29 ++
  include/linux/sysfb.h | 28 +-
  13 files changed, 145 insertions(+), 146 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 24804f11302..30ba195ca72 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -127,6 +127,7 @@ config ARM
select PERF_USE_VMALLOC
select RTC_LIB
select SET_FS
+   select SYSFB


Don't select this as part of the architecture. Rather make an option for 
SYSFB that depends in the architectures that supports it.


Best regards
Thomas


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature

[PATCH] drm/panel: panel-dsi-cm: convert sysfs snprintf to sysfs_emit

2021-05-21 Thread Xuezhi Zhang

From: Xuezhi Zhang 

Fix the following coccicheck warnings:
drivers/gpu/drm//panel/panel-dsi-cm.c:271:8-16:
WARNING: use scnprintf or sprintf
drivers/gpu/drm//panel/panel-dsi-cm.c:251:8-16:
WARNING: use scnprintf or sprintf

Signed-off-by: Xuezhi Zhang 
---
 drivers/gpu/drm/panel/panel-dsi-cm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-dsi-cm.c 
b/drivers/gpu/drm/panel/panel-dsi-cm.c
index 5fbfb71ca3d9..a8efb06cca64 100644
--- a/drivers/gpu/drm/panel/panel-dsi-cm.c
+++ b/drivers/gpu/drm/panel/panel-dsi-cm.c
@@ -248,7 +248,7 @@ static ssize_t num_dsi_errors_show(struct device *dev,
if (r)
return r;
 
-   return snprintf(buf, PAGE_SIZE, "%d\n", errors);
+   return sysfs_emit(buf, "%d\n", errors);
 }
 
 static ssize_t hw_revision_show(struct device *dev,
@@ -268,7 +268,7 @@ static ssize_t hw_revision_show(struct device *dev,
if (r)
return r;
 
-   return snprintf(buf, PAGE_SIZE, "%02x.%02x.%02x\n", id1, id2, id3);
+   return sysfs_emit(buf, "%02x.%02x.%02x\n", id1, id2, id3);
 }
 
 static DEVICE_ATTR_RO(num_dsi_errors);
-- 
2.25.1

Re: [PATCH 1/1] drm/mcde: Remove redundant error printing in mcde_dsi_probe()

2021-05-21 Thread Linus Walleij

On Tue, May 11, 2021 at 11:14 AM Zhen Lei  wrote:

> When devm_ioremap_resource() fails, a clear enough error message will be
> printed by its subfunction __devm_ioremap_resource(). The error
> information contains the device name, failure cause, and possibly resource
> information.
>
> Therefore, remove the error printing here to simplify code and reduce the
> binary size.
>
> Reported-by: Hulk Robot 
> Signed-off-by: Zhen Lei 

Patch applied to the DRM misc tree.

Yours,
Linus Walleij

Re: [PATCH v2 3/3] ARM: dts: gemini: remove xxx-cells from display

2021-05-21 Thread Linus Walleij

On Wed, May 19, 2021 at 10:35 PM Corentin Labbe  wrote:

> dtb_check complains about #address-cells and #size-cells, so lets
> remove them.
>
> Signed-off-by: Corentin Labbe 

Patch applied to the Gemini tree.

Yours,
Linus Walleij

Re: [PATCH v2 2/3] ARM: dts: gemini-dlink-dir-685: Remove address from display port

2021-05-21 Thread Linus Walleij

On Wed, May 19, 2021 at 10:35 PM Corentin Labbe  wrote:

> The address and reg adds no value to the port node, remove them.
>
> Signed-off-by: Corentin Labbe 

Patch applied to the Gemini tree.

Yours,
Linus Walleij

Re: [PATCH v2 1/3] dt-bindings: display: convert faraday,tve200

2021-05-21 Thread Linus Walleij

On Wed, May 19, 2021 at 10:35 PM Corentin Labbe  wrote:

> Converts display/faraday,tve200.txt to yaml.
>
> Signed-off-by: Corentin Labbe 

Patch applied to the DRM misc tree.

Yours,
Linus Walleij

Re: [PATCH 0/3] drm/msm/dp: Simplify aux code

2021-05-21 Thread khsieh


On 2021-05-21 14:57, Stephen Boyd wrote:

Quoting Stephen Boyd (2021-05-07 14:25:02)

Here's a few patches that simplify the aux handling code and bubble up
timeouts and nacks to the upper DRM layers. The goal is to get DRM to
know that the other side isn't there or that there's been a timeout,
instead of saying that everything is fine and putting some error 
message

into the logs.

Cc: Dmitry Baryshkov 
Cc: Abhinav Kumar 
Cc: Kuogee Hsieh 
Cc: aravi...@codeaurora.org
Cc: Sean Paul 



Kuogee, have you had a change to review this series?


Sorry  missed this one.
Will review it now.

Stephen Boyd (3):
  drm/msm/dp: Simplify aux irq handling code
  drm/msm/dp: Shrink locking area of dp_aux_transfer()
  drm/msm/dp: Handle aux timeouts, nacks, defers

 drivers/gpu/drm/msm/dp/dp_aux.c | 181 


 drivers/gpu/drm/msm/dp/dp_aux.h |   8 --
 drivers/gpu/drm/msm/dp/dp_catalog.c |   2 +-
 drivers/gpu/drm/msm/dp/dp_catalog.h |   2 +-
 4 files changed, 80 insertions(+), 113 deletions(-)


base-commit: 51595e3b4943b0079638b2657f603cf5c8ea3a66
--
https://chromeos.dev

Re: [PATCH 06/11] drm/: drm_gem_plane_helper_prepare_fb is now the default

2021-05-21 Thread Chun-Kuang Hu

Hi, Daniel:

Daniel Vetter  於 2021年5月21日 週五 下午5:10寫道：
>
> No need to set it explicitly.
>
> Signed-off-by: Daniel Vetter 
> Cc: Laurentiu Palcu 
> Cc: Lucas Stach 
> Cc: Shawn Guo 
> Cc: Sascha Hauer 
> Cc: Pengutronix Kernel Team 
> Cc: Fabio Estevam 
> Cc: NXP Linux Team 
> Cc: Philipp Zabel 
> Cc: Paul Cercueil 
> Cc: Chun-Kuang Hu 
> Cc: Matthias Brugger 
> Cc: Neil Armstrong 
> Cc: Kevin Hilman 
> Cc: Jerome Brunet 
> Cc: Martin Blumenstingl 
> Cc: Marek Vasut 
> Cc: Stefan Agner 
> Cc: Sandy Huang 
> Cc: "Heiko Stübner" 
> Cc: Yannick Fertre 
> Cc: Philippe Cornu 
> Cc: Benjamin Gaignard 
> Cc: Maxime Coquelin 
> Cc: Alexandre Torgue 
> Cc: Maxime Ripard 
> Cc: Chen-Yu Tsai 
> Cc: Jernej Skrabec 
> Cc: Jyri Sarha 
> Cc: Tomi Valkeinen 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-m...@vger.kernel.org
> Cc: linux-media...@lists.infradead.org
> Cc: linux-amlo...@lists.infradead.org
> Cc: linux-rockc...@lists.infradead.org
> Cc: linux-st...@st-md-mailman.stormreply.com
> Cc: linux-su...@lists.linux.dev

For Mediatek,
Acked-by: Chun-Kuang Hu

Re: [PATCH v7 00/10] drm: Fix EDID reading on ti-sn65dsi86 by introducing the DP AUX bus

2021-05-21 Thread Lyude Paul

For patches 5, and 6:

Reviewed-by: Lyude Paul 

This week got really busy so I wasn't able to look at the rest of them, but next
week is going to be a lot less busy so I should be able to look at them then

On Mon, 2021-05-17 at 13:08 -0700, Douglas Anderson wrote:
> The primary goal of this series is to try to properly fix EDID reading
> for eDP panels using the ti-sn65dsi86 bridge.
> 
> Previously we had a patch that added EDID reading but it turned out
> not to work at bootup. This caused some extra churn at bootup as we
> tried (and failed) to read the EDID several times and also ended up
> forcing us to use the hardcoded mode at boot. With this patch series I
> believe EDID reading is reliable at boot now and we never use the
> hardcoded mode.
> 
> High level note: in this series the EDID reading is driven by the
> panel driver, not by the bridge chip driver. I believe this makes a
> reasonable amount of sense since the panel driver already _could_
> drive reading the EDID if provided with the DDC bus and in future
> planned work we'll want to give the panel driver the DDC bus (to make
> decisions based on EDID) and the AUX bus (to control the
> backlight). There are also planned patches from Laurent to make
> ti-sn65dsi86 able to drive full DP monitors. In that case the bridge
> chip will still be in charge of reading the EDID, but it's not hard to
> make this dynamic.
> 
> This series is the logical successor to the 3-part series containing
> the patch ("drm/bridge: ti-sn65dsi86: Properly get the EDID, but only
> if refclk") [1].
> 
> This patch was tested against drm-misc-next commit 60a6b73dd821
> ("drm/ingenic: Fix pixclock rate for 24-bit serial panels") on a
> sc7180-trogdor-lazor device.
> 
> At v7 now, this patch series grew a bit from v6 because it introduces
> the DP AUX bus.
> 
> Between v2 and v3, high-level view of changes:
> - stop doing the EDID caching in the core.
> 
> Between v3 and v4, high-level view of changes:
> - EDID reading is actually driven by the panel driver now. See above.
> - Lots of chicken-and-egg problems solved w/ sub-devices.
> 
> Between v4 and v5, high-level view of changes.
> - Some of the early patches landed, so dropped from series.
> - New pm_runtime_disable() fix (fixed a patch that already landed).
> - Added Bjorn's tags to most patches
> - Fixed problems when building as a module.
> - Reordered debugfs patch and fixed error handling there.
> - Dropped last patch. I'm not convinced it's safe w/out more work.
> 
> Between v5 and v6, high-level view of changes:
> - Added the patch ("drm/dp: Allow an early call to register DDC i2c
>   bus")
> - Many patches had been landed, so only a few "controversial" ones
>   left.
> 
> Between v6 and v7, high-level view of changes:
> - New AUX DP bus!
> 
> [1]
> https://lore.kernel.org/r/20210304155144.3.I60a7fb23ce4589006bc95c64ab8d15c74b876e68@changeid/
> 
> Changes in v7:
> - pm_runtime_dont_use_autosuspend() fix new for v7.
> - List hpd properties bindings patch new for v7.
> - ti-sn65dsi86: Add aux-bus child patch new for v7.
> - Patch introducing the DP AUX bus is new for v7.
> - Patch to allow panel-simple to be DP AUX EP new for v7.
> - Patch using the DP AUX for DDC new for v7.
> - Remove use of now-dropped drm_dp_aux_register_ddc() call.
> - Beefed up commit message in context of the DP AUX bus.
> - Set the proper sub-device "dev" pointer in the AUX structure.
> - Patch to support for DP AUX bus on ti-sn65dsi86 new for v7.
> - Adjusted commit message to talk about DP AUX bus.
> - Panel now under bridge chip instead of getting a link to ddc-i2c
> 
> Changes in v6:
> - Use new drm_dp_aux_register_ddc() calls.
> 
> Douglas Anderson (10):
>   drm/panel: panel-simple: Add missing pm_runtime_dont_use_autosuspend()
>     calls
>   dt-bindings: display: simple: List hpd properties in panel-simple
>   dt-bindings: drm/bridge: ti-sn65dsi86: Add aux-bus child
>   drm: Introduce the DP AUX bus
>   drm/panel: panel-simple: Allow panel-simple be a DP AUX endpoint
>     device
>   drm/panel: panel-simple: Stash DP AUX bus; allow using it for DDC
>   drm/bridge: ti-sn65dsi86: Promote the AUX channel to its own sub-dev
>   drm/bridge: ti-sn65dsi86: Add support for the DP AUX bus
>   drm/bridge: ti-sn65dsi86: Don't read EDID blob over DDC
>   arm64: dts: qcom: sc7180-trogdor: Move panel under the bridge chip
> 
>  .../bindings/display/bridge/ti,sn65dsi86.yaml |  22 +-
>  .../bindings/display/panel/panel-simple.yaml  |   2 +
>  arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi  |  30 +-
>  drivers/gpu/drm/Kconfig   |   5 +
>  drivers/gpu/drm/Makefile  |   2 +
>  drivers/gpu/drm/bridge/Kconfig    |   1 +
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 111 --
>  drivers/gpu/drm/drm_dp_aux_bus.c  | 322 ++
>  drivers/gpu/drm/panel/Kconfig |   1 +
>  drivers/gpu/drm/panel/panel-simple.c  |  66 +++-
>  include/drm/drm_dp_aux_bus.h

Re: [PATCH 27/38] drm/msm/dp/dp_catalog: Correctly document param 'dp_catalog'

2021-05-21 Thread Stephen Boyd

Quoting Lee Jones (2021-05-20 05:02:37)
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/msm/dp/dp_catalog.c:206: warning: Function parameter or 
> member 'dp_catalog' not described in 'dp_catalog_aux_reset'
>  drivers/gpu/drm/msm/dp/dp_catalog.c:206: warning: Excess function parameter 
> 'aux' description in 'dp_catalog_aux_reset'
>
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Chandan Uddaraju 
> Cc: Stephen Boyd 
> Cc: linux-arm-...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: freedr...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---

Reviewed-by: Stephen Boyd

Re: [PATCH v2] drm/msm: Use nvmem_cell_read_variable_le_u32() to read speed bin

2021-05-21 Thread Stephen Boyd

Quoting Doug Anderson (2021-05-21 15:35:33)
> Hi,
>
> On Fri, May 21, 2021 at 3:02 PM Stephen Boyd  wrote:
> >
> > Quoting Douglas Anderson (2021-05-21 13:45:50)
> > > Let's use the newly-added nvmem_cell_read_variable_le_u32() to future
> > > proof ourselves a little bit.
> > >
> > > Signed-off-by: Douglas Anderson 
> > > ---
> > > The patch that this depends on is now in mainline so it can be merged
> > > at will. I'm just sending this as a singleton patch to make it obvious
> > > that there are no dependencies now.
> > >
> > > Changes in v2:
> > > - Rebased
> > >
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 ++---
> > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index b4d8e1b01ee4..a07214157ad3 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -1403,10 +1403,10 @@ static int a6xx_set_supported_hw(struct device 
> > > *dev, struct a6xx_gpu *a6xx_gpu,
> > >  {
> > > struct opp_table *opp_table;
> > > u32 supp_hw = UINT_MAX;
> > > -   u16 speedbin;
> > > +   u32 speedbin;
> > > int ret;
> > >
> > > -   ret = nvmem_cell_read_u16(dev, "speed_bin", );
> > > +   ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", 
> > > );
> >
> > I missed the review of this API, sorry.
>
> You commented on the patch that added it, though? Oddly I can't find
> your commit on lore.kernel.org (?), but it's in my inbox...

Must be brain fog on my end!

>
>
> > I wonder why it doesn't return
> > the value into an __le32 pointer. Then the caller could use
> > le32_to_cpu() like other places in the kernel and we know that code is
> > properly converting the little endian value to CPU native order. Right
> > now the API doesn't express the endianess of the bits in the return
> > value because it uses u32, so from a static checker perspective (sparse)
> > those bits are CPU native order, not little endian.
>
> I think it's backwards of what you're saying? This function is for
> when the value is stored in nvram in little endian but returned to the
> caller in CPU native order. It would be really awkward _not_ to
> convert this value from LE to native order in the
> nvmem_cell_read_variable_le_u32() function because that functions
> handles the fact that the cell could be specified as several different
> sizes (as long as it's less than 32-bits).
>

Ah ok. I was looking at the name of the API and thinking it was an le32;
happily glossing over that _u between le and 32. So it's "nvmem cell read
variable little endian to cpu u32"?

Reviewed-by: Stephen Boyd

Re: [PATCH v2] drm/msm: Use nvmem_cell_read_variable_le_u32() to read speed bin

2021-05-21 Thread Doug Anderson

Hi,

On Fri, May 21, 2021 at 3:02 PM Stephen Boyd  wrote:
>
> Quoting Douglas Anderson (2021-05-21 13:45:50)
> > Let's use the newly-added nvmem_cell_read_variable_le_u32() to future
> > proof ourselves a little bit.
> >
> > Signed-off-by: Douglas Anderson 
> > ---
> > The patch that this depends on is now in mainline so it can be merged
> > at will. I'm just sending this as a singleton patch to make it obvious
> > that there are no dependencies now.
> >
> > Changes in v2:
> > - Rebased
> >
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index b4d8e1b01ee4..a07214157ad3 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -1403,10 +1403,10 @@ static int a6xx_set_supported_hw(struct device 
> > *dev, struct a6xx_gpu *a6xx_gpu,
> >  {
> > struct opp_table *opp_table;
> > u32 supp_hw = UINT_MAX;
> > -   u16 speedbin;
> > +   u32 speedbin;
> > int ret;
> >
> > -   ret = nvmem_cell_read_u16(dev, "speed_bin", );
> > +   ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", );
>
> I missed the review of this API, sorry.

You commented on the patch that added it, though? Oddly I can't find
your commit on lore.kernel.org (?), but it's in my inbox...


> I wonder why it doesn't return
> the value into an __le32 pointer. Then the caller could use
> le32_to_cpu() like other places in the kernel and we know that code is
> properly converting the little endian value to CPU native order. Right
> now the API doesn't express the endianess of the bits in the return
> value because it uses u32, so from a static checker perspective (sparse)
> those bits are CPU native order, not little endian.

I think it's backwards of what you're saying? This function is for
when the value is stored in nvram in little endian but returned to the
caller in CPU native order. It would be really awkward _not_ to
convert this value from LE to native order in the
nvmem_cell_read_variable_le_u32() function because that functions
handles the fact that the cell could be specified as several different
sizes (as long as it's less than 32-bits).

-Doug


-Doug

Re: [PATCH] drm/amd/amdkfd: Drop unnecessary NULL check after container_of

2021-05-21 Thread Felix Kuehling

Am 2021-05-21 um 11:02 a.m. schrieb Guenter Roeck:
> The first parameter passed to container_of() is the pointer to the work
> structure passed to the worker and never NULL. The NULL check on the
> result of container_of() is therefore unnecessary and misleading.
> Remove it.
>
> This change was made automatically with the following Coccinelle script.
>
> @@
> type t;
> identifier v;
> statement s;
> @@
>
> <+...
> (
>   t v = container_of(...);
> |
>   v = container_of(...);
> )
>   ...
>   when != v
> - if (\( !v \| v == NULL \) ) s
> ...+>
>
> Signed-off-by: Guenter Roeck 

Thank you. The patch looks good to me. I caught a few such pointless
checks in code review but must have missed this one. I'll apply your
patch to amd-staging-drm-next. The patch is

Reviewed-by: Felix Kuehling 


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 5b6c5669c03d..2f8d352e0069 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -110,8 +110,6 @@ static void kfd_sdma_activity_worker(struct work_struct 
> *work)
>  
>   workarea = container_of(work, struct kfd_sdma_activity_handler_workarea,
>   sdma_activity_work);
> - if (!workarea)
> - return;
>  
>   pdd = workarea->pdd;
>   if (!pdd)

Re: [PATCH v2] drm/msm: Use nvmem_cell_read_variable_le_u32() to read speed bin

2021-05-21 Thread Stephen Boyd

Quoting Douglas Anderson (2021-05-21 13:45:50)
> Let's use the newly-added nvmem_cell_read_variable_le_u32() to future
> proof ourselves a little bit.
>
> Signed-off-by: Douglas Anderson 
> ---
> The patch that this depends on is now in mainline so it can be merged
> at will. I'm just sending this as a singleton patch to make it obvious
> that there are no dependencies now.
>
> Changes in v2:
> - Rebased
>
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index b4d8e1b01ee4..a07214157ad3 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1403,10 +1403,10 @@ static int a6xx_set_supported_hw(struct device *dev, 
> struct a6xx_gpu *a6xx_gpu,
>  {
> struct opp_table *opp_table;
> u32 supp_hw = UINT_MAX;
> -   u16 speedbin;
> +   u32 speedbin;
> int ret;
>
> -   ret = nvmem_cell_read_u16(dev, "speed_bin", );
> +   ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", );

I missed the review of this API, sorry. I wonder why it doesn't return
the value into an __le32 pointer. Then the caller could use
le32_to_cpu() like other places in the kernel and we know that code is
properly converting the little endian value to CPU native order. Right
now the API doesn't express the endianess of the bits in the return
value because it uses u32, so from a static checker perspective (sparse)
those bits are CPU native order, not little endian.

> /*
>  * -ENOENT means that the platform doesn't support speedbin which is
>  * fine

Re: [PATCH 0/3] drm/msm/dp: Simplify aux code

2021-05-21 Thread Stephen Boyd

Quoting Stephen Boyd (2021-05-07 14:25:02)
> Here's a few patches that simplify the aux handling code and bubble up
> timeouts and nacks to the upper DRM layers. The goal is to get DRM to
> know that the other side isn't there or that there's been a timeout,
> instead of saying that everything is fine and putting some error message
> into the logs.
>
> Cc: Dmitry Baryshkov 
> Cc: Abhinav Kumar 
> Cc: Kuogee Hsieh 
> Cc: aravi...@codeaurora.org
> Cc: Sean Paul 
>

Kuogee, have you had a change to review this series?

> Stephen Boyd (3):
>   drm/msm/dp: Simplify aux irq handling code
>   drm/msm/dp: Shrink locking area of dp_aux_transfer()
>   drm/msm/dp: Handle aux timeouts, nacks, defers
>
>  drivers/gpu/drm/msm/dp/dp_aux.c | 181 
>  drivers/gpu/drm/msm/dp/dp_aux.h |   8 --
>  drivers/gpu/drm/msm/dp/dp_catalog.c |   2 +-
>  drivers/gpu/drm/msm/dp/dp_catalog.h |   2 +-
>  4 files changed, 80 insertions(+), 113 deletions(-)
>
>
> base-commit: 51595e3b4943b0079638b2657f603cf5c8ea3a66
> --
> https://chromeos.dev
>

Re: [PATCH 0/4] dma-buf: Add an API for exporting sync files (v8)

2021-05-21 Thread Jason Ekstrand

IGT: https://patchwork.freedesktop.org/series/90433/
Mesa (Vulkan WSI):
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4037

Ideally, it'd be nice to see a RADV MR and maybe even radeonsi before
we really land on it.  However, I think it's proved out well enough in
ANV to at least land the first three.

--Jason

On Thu, May 20, 2021 at 2:00 PM Jason Ekstrand  wrote:
>
> This is mostly a re-send of v8 only with a fourth patch which contains the
> sync file import ioctl that I had in the original series.  I've not updated
> the IGT tests yet for sync file import.  This resend is mostly intended to
> aid in discussions around implicit sync in general.  I'll write up some IGT
> tests if there is serious interest in patch 4.  I can also update the Mesa
> MR to use it for Vulkan.
>
> ---
>
> Modern userspace APIs like Vulkan are built on an explicit
> synchronization model.  This doesn't always play nicely with the
> implicit synchronization used in the kernel and assumed by X11 and
> Wayland.  The client -> compositor half of the synchronization isn't too
> bad, at least on intel, because we can control whether or not i915
> synchronizes on the buffer and whether or not it's considered written.
>
> The harder part is the compositor -> client synchronization when we get
> the buffer back from the compositor.  We're required to be able to
> provide the client with a VkSemaphore and VkFence representing the point
> in time where the window system (compositor and/or display) finished
> using the buffer.  With current APIs, it's very hard to do this in such
> a way that we don't get confused by the Vulkan driver's access of the
> buffer.  In particular, once we tell the kernel that we're rendering to
> the buffer again, any CPU waits on the buffer or GPU dependencies will
> wait on some of the client rendering and not just the compositor.
>
> This new IOCTL solves this problem by allowing us to get a snapshot of
> the implicit synchronization state of a given dma-buf in the form of a
> sync file.  It's effectively the same as a poll() or I915_GEM_WAIT only,
> instead of CPU waiting directly, it encapsulates the wait operation, at
> the current moment in time, in a sync_file so we can check/wait on it
> later.  As long as the Vulkan driver does the sync_file export from the
> dma-buf before we re-introduce it for rendering, it will only contain
> fences from the compositor or display.  This allows to accurately turn
> it into a VkFence or VkSemaphore without any over- synchronization.
>
> Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4037
> IGT tests: 
> https://lists.freedesktop.org/archives/igt-dev/2021-March/029825.html
>
> Cc: Christian König 
> Cc: Michel Dänzer 
> Cc: Dave Airlie 
> Cc: Bas Nieuwenhuizen 
> Cc: Daniel Stone 
>
> Christian König (1):
>   dma-buf: add dma_fence_array_for_each (v2)
>
> Jason Ekstrand (3):
>   dma-buf: add dma_resv_get_singleton_rcu (v4)
>   dma-buf: Add an API for exporting sync files (v9)
>   RFC: dma-buf: Add an API for importing sync files (v6)
>
>  drivers/dma-buf/dma-buf.c |  94 +++
>  drivers/dma-buf/dma-fence-array.c |  27 +++
>  drivers/dma-buf/dma-resv.c| 122 ++
>  include/linux/dma-fence-array.h   |  17 +
>  include/linux/dma-resv.h  |   3 +
>  include/uapi/linux/dma-buf.h  |  25 ++
>  6 files changed, 288 insertions(+)
>
> --
> 2.31.1
>

Re: [PATCH v17 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 15:51 CDT 2021, Stephen Boyd wrote:

> Quoting Bjorn Andersson (2021-05-21 09:00:29)
> > On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:
> > > diff --git 
> > > a/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml 
> > > b/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml
> > [..]
> > > +  ports:
> > > +$ref: /schemas/graph.yaml#/properties/ports
> > > +description: |
> > > +  Contains the list of output ports from DPU device. These ports
> > > +  connect to interfaces that are external to the DPU hardware,
> > > +  such as DSI, DP etc. Each output port contains an endpoint that
> > > +  describes how it is connected to an external interface.
> > > +
> > > +properties:
> > > +  port@0:
> > > +$ref: /schemas/graph.yaml#/properties/port
> > > +description: DPU_INTF1 (DSI1)
> > > +
> > > +  port@2:
> > > +$ref: /schemas/graph.yaml#/properties/port
> > > +description: DPU_INTF0 (DP)
> >
> > Why is port@0 INTF1 and why is port@2 INTF0? In the binding you're
> > translating the two ports that are described are 0 and 1, representing
> > INTF1 and INTF2, or DSI1 and DSI2, respectively.
> >
> > Further more, I have a need for somehow describing the pairing of 4 DP
> > INTFs (INTF 0, 3, 4 and 5) and how they are connected to the 3+1 DP+eDP
> > controllers.
> >
> > Downstream this seems to be handled by adding cell-index to the DP
> > controllers and then matching that against the numbering in the driver's
> > INTF array. But rather than adding cell-index to map this, can't we
> > define that the port index is the INTF-number here?
> >
> >
> > This would obviously break compatibility with existing DTBs, but we
> > could start by doing it selectively for the new compatibles, fix up the
> > existing dts files and then drop the selective application after 1 or 2
> > LTS releases.
> 
> I requested that the existing DT not change a while ago when the DP
> interface was being added to this binding. Is it possible to figure out
> what interface it is that the port is for from the binding? It feels
> like the problem is that the driver wants to look through the graph and
> make connectors for each one, but it doesn't know what type of connector
> to make.

Today there's a single priv->dp pointer which is initialized as the one
and only displayport controller component is bound.
_dpu_kms_set_encoder_mode() has no knowledge about which interface this
single controller is attached to, so dpu_encoder_setup_display() will
always just pick INTF_DP index 0.

So in its current form if your single DP port isn't sitting on the
platform's first DP INTF you need to hack dpu_hw_catalog and remove the
previous ones.

But with my desire to reuse the DP controller code for eDP, and the fact
that I have 3 DP controllers in my laptop we need something more.

But after considering my proposal further I realized that it is too
static for my use case anyway. SC8180x has INTF 0 and 4 are wired to
the DP controllers associated with the two primary USB ports and then
INTF 3 is dynamically switched between them for MST purposes.

So using port indices would prevent me from doing this dynamic dance.

Regards,
Bjorn

Re: [PATCH] drm/amd/amdkfd: Drop unnecessary NULL check after container_of

2021-05-21 Thread Alex Deucher

Applied.  Thanks!

Alex

On Fri, May 21, 2021 at 11:02 AM Guenter Roeck  wrote:
>
> The first parameter passed to container_of() is the pointer to the work
> structure passed to the worker and never NULL. The NULL check on the
> result of container_of() is therefore unnecessary and misleading.
> Remove it.
>
> This change was made automatically with the following Coccinelle script.
>
> @@
> type t;
> identifier v;
> statement s;
> @@
>
> <+...
> (
>   t v = container_of(...);
> |
>   v = container_of(...);
> )
>   ...
>   when != v
> - if (\( !v \| v == NULL \) ) s
> ...+>
>
> Signed-off-by: Guenter Roeck 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 5b6c5669c03d..2f8d352e0069 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -110,8 +110,6 @@ static void kfd_sdma_activity_worker(struct work_struct 
> *work)
>
> workarea = container_of(work, struct 
> kfd_sdma_activity_handler_workarea,
> sdma_activity_work);
> -   if (!workarea)
> -   return;
>
> pdd = workarea->pdd;
> if (!pdd)
> --
> 2.25.1
>

Re: [PATCH] drm/amdgpu: Fix inconsistent indenting

2021-05-21 Thread Alex Deucher

Applied.  Thanks!

Alex

On Fri, May 21, 2021 at 9:35 AM Christian König
 wrote:
>
> Am 21.05.21 um 11:50 schrieb Jiapeng Chong:
> > Eliminate the follow smatch warning:
> >
> > drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:449
> > sdma_v5_0_ring_emit_mem_sync() warn: inconsistent indenting.
> >
> > Reported-by: Abaci Robot 
> > Signed-off-by: Jiapeng Chong 
>
> Reviewed-by: Christian König 
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 13 ++---
> >   1 file changed, 6 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> > index 75d7310..c45e1b0 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> > @@ -440,20 +440,19 @@ static void sdma_v5_0_ring_emit_ib(struct amdgpu_ring 
> > *ring,
> >*/
> >   static void sdma_v5_0_ring_emit_mem_sync(struct amdgpu_ring *ring)
> >   {
> > -uint32_t gcr_cntl =
> > - SDMA_GCR_GL2_INV | SDMA_GCR_GL2_WB | SDMA_GCR_GLM_INV |
> > - SDMA_GCR_GL1_INV | SDMA_GCR_GLV_INV | 
> > SDMA_GCR_GLK_INV |
> > - SDMA_GCR_GLI_INV(1);
> > + uint32_t gcr_cntl = SDMA_GCR_GL2_INV | SDMA_GCR_GL2_WB | 
> > SDMA_GCR_GLM_INV |
> > + SDMA_GCR_GL1_INV | SDMA_GCR_GLV_INV | 
> > SDMA_GCR_GLK_INV |
> > + SDMA_GCR_GLI_INV(1);
> >
> >   /* flush entire cache L0/L1/L2, this can be optimized by performance 
> > requirement */
> >   amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_GCR_REQ));
> >   amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD1_BASE_VA_31_7(0));
> >   amdgpu_ring_write(ring, 
> > SDMA_PKT_GCR_REQ_PAYLOAD2_GCR_CONTROL_15_0(gcr_cntl) |
> > - SDMA_PKT_GCR_REQ_PAYLOAD2_BASE_VA_47_32(0));
> > +   SDMA_PKT_GCR_REQ_PAYLOAD2_BASE_VA_47_32(0));
> >   amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD3_LIMIT_VA_31_7(0) |
> > - SDMA_PKT_GCR_REQ_PAYLOAD3_GCR_CONTROL_18_16(gcr_cntl 
> > >> 16));
> > +   
> > SDMA_PKT_GCR_REQ_PAYLOAD3_GCR_CONTROL_18_16(gcr_cntl >> 16));
> >   amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD4_LIMIT_VA_47_32(0) |
> > - SDMA_PKT_GCR_REQ_PAYLOAD4_VMID(0));
> > +   SDMA_PKT_GCR_REQ_PAYLOAD4_VMID(0));
> >   }
> >
> >   /**
>
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v2 2/2] drm/amdgpu: Fix crash when hot unplug in BACO.

2021-05-21 Thread Alex Deucher

On Fri, May 21, 2021 at 4:41 PM Andrey Grodzovsky
 wrote:
>
> Problem:
> When device goes into sleep state due to prolonged

s/sleep state/runtime suspend/

> innactivity (e.g. BACO sleep) and then hot unplugged,

inactivity

> PCI core will try to wake up the device as part of
> unplug process. Since the device is gone all HW
> programming during rpm resume fails leading
> to a bad SW state later during pci remove handling.
>
> Fix:
> Use a flag we use for PCIe error recovery to avoid
> accessing registres. This allows to succefully complete

successfully

> rpm resume sequence and finish pci remove.
>
> v2: Renamed HW access block flag
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> Signed-off-by: Andrey Grodzovsky 

With the above comments fixed, the series is:
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index d8db5929cdd9..b9d221fcb66d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1555,6 +1555,11 @@ static int amdgpu_pmops_runtime_resume(struct device 
> *dev)
> if (!adev->runpm)
> return -EINVAL;
>
> +   /* Avoids registers access if device is physically gone */
> +   if (!pci_device_is_present(adev->pdev))
> +   adev->no_hw_access = true;
> +
> +
> if (amdgpu_device_supports_px(drm_dev)) {
> drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
>
> --
> 2.25.1
>
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v17 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-21 Thread Stephen Boyd

Quoting Bjorn Andersson (2021-05-21 09:00:29)
> On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:
> > diff --git a/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml 
> > b/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml
> [..]
> > +  ports:
> > +$ref: /schemas/graph.yaml#/properties/ports
> > +description: |
> > +  Contains the list of output ports from DPU device. These ports
> > +  connect to interfaces that are external to the DPU hardware,
> > +  such as DSI, DP etc. Each output port contains an endpoint that
> > +  describes how it is connected to an external interface.
> > +
> > +properties:
> > +  port@0:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description: DPU_INTF1 (DSI1)
> > +
> > +  port@2:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description: DPU_INTF0 (DP)
>
> Why is port@0 INTF1 and why is port@2 INTF0? In the binding you're
> translating the two ports that are described are 0 and 1, representing
> INTF1 and INTF2, or DSI1 and DSI2, respectively.
>
> Further more, I have a need for somehow describing the pairing of 4 DP
> INTFs (INTF 0, 3, 4 and 5) and how they are connected to the 3+1 DP+eDP
> controllers.
>
> Downstream this seems to be handled by adding cell-index to the DP
> controllers and then matching that against the numbering in the driver's
> INTF array. But rather than adding cell-index to map this, can't we
> define that the port index is the INTF-number here?
>
>
> This would obviously break compatibility with existing DTBs, but we
> could start by doing it selectively for the new compatibles, fix up the
> existing dts files and then drop the selective application after 1 or 2
> LTS releases.

I requested that the existing DT not change a while ago when the DP
interface was being added to this binding. Is it possible to figure out
what interface it is that the port is for from the binding? It feels
like the problem is that the driver wants to look through the graph and
make connectors for each one, but it doesn't know what type of connector
to make.

Re: [PATCH v17 3/4] dt-bindings: msm: dsi: add yaml schemas for DSI PHY bindings

2021-05-21 Thread Stephen Boyd

Quoting Krishna Manikandan (2021-05-21 03:27:23)
> Add YAML schema for the device tree bindings for DSI PHY.
>
> Signed-off-by: Krishna Manikandan 
>
> Changes in v1:
>- Merge dsi-phy.yaml and dsi-phy-10nm.yaml (Stephen Boyd)
>- Remove qcom,dsi-phy-regulator-ldo-mode (Stephen Boyd)
>- Add clock cells properly (Stephen Boyd)
>- Remove unnecessary decription from clock names (Stephen Boyd)
>- Add pin names for the supply entries for 10nm phy which is
>  used in sc7180 and sdm845 (Stephen Boyd)
>- Remove unused header files from examples (Stephen Boyd)
>- Drop labels for display nodes and correct node name (Stephen Boyd)
>
> Changes in v2:
>- Drop maxItems for clock (Stephen Boyd)
>- Add vdds supply pin information for sdm845 (Stephen Boyd)
>- Add examples for 14nm, 20nm and 28nm phy yaml files (Stephen Boyd)
>- Keep child nodes directly under soc node (Stephen Boyd)
>
> Changes in v3:
>- Use a separate yaml file to describe the common properties
>  for all the dsi phy versions (Stephen Boyd)
>- Remove soc from examples (Stephen Boyd)
>- Add description for register property
>
> Changes in v4:
>- Modify the title for all the phy versions (Stephen Boyd)
>- Drop description for all the phy versions (Stephen Boyd)
>- Modify the description for register property (Stephen Boyd)
>
> Changes in v5:
>- Remove unused properties from common dsi phy file
>- Add clock-cells and phy-cells to required property
>  list (Stephen Boyd)
>
> Changes in v6:
>- Add proper compatible string in example
> ---

Reviewed-by: Stephen Boyd

Re: [PATCH v17 2/4] dt-bindings: msm: dsi: add yaml schemas for DSI bindings

2021-05-21 Thread Stephen Boyd

Quoting Krishna Manikandan (2021-05-21 03:27:22)
> Add YAML schema for the device tree bindings for DSI
>
> Signed-off-by: Krishna Manikandan 
>
> Changes in v1:
> - Separate dsi controller bindings to a separate patch (Stephen Boyd)
> - Merge dsi-common-controller.yaml and dsi-controller-main.yaml to
>   a single file (Stephen Boyd)
> - Drop supply entries and definitions from properties (Stephen Boyd)
> - Modify phy-names property for dsi controller (Stephen Boyd)
> - Remove boolean from description (Stephen Boyd)
> - Drop pinctrl properties as they are standard entries (Stephen Boyd)
> - Modify the description for ports property and keep the reference
>   to the generic binding where this is defined (Stephen Boyd)
> - Add description to clock names (Stephen Boyd)
> - Correct the indendation (Stephen Boyd)
> - Drop the label for display dt nodes and correct the node
>   name (Stephen Boyd)
>
> Changes in v2:
> - Drop maxItems for clock (Stephen Boyd)
> - Drop qcom,mdss-mdp-transfer-time-us as it is not used in upstream
>   dt file (Stephen Boyd)
> - Keep child node directly under soc node (Stephen Boyd)
> - Drop qcom,sync-dual-dsi as it is not used in upstream dt
>
> Changes in v3:
> - Add description for register property (Stephen Boyd)
>
> Changes in v4:
> - Add maxItems for phys property (Stephen Boyd)
> - Add maxItems for reg property (Stephen Boyd)
> - Add reference for data-lanes property (Stephen Boyd)
> - Remove soc from example (Stephen Boyd)
>
> Changes in v5:
> - Modify title and description (Stephen Boyd)
> - Add required properties for ports node (Stephen Boyd)
> - Add data-lanes in the example (Stephen Boyd)
> - Drop qcom,master-dsi property (Stephen Boyd)
>
> Changes in v6:
> - Add required properties for port@0, port@1 and corresponding
>   endpoints (Stephen Boyd)
> - Add address-cells and size-cells for ports (Stephen Boyd)
> - Use additionalProperties instead of unevaluatedProperties (Stephen Boyd)
>
> Changes in v7:
> - Add reference for ports and data-lanes (Rob Herring)
> - Add maxItems and minItems for data-lanes (Rob Herring)
>
> Changes in v8:
> - Drop common properties and description from ports (Rob Herring)
> - Add reference for endpoint (Rob Herring)
> - Add correct reference for data-lanes (Rob Herring)
> - Drop common properties from required list for ports (Rob Herring)
> ---

Reviewed-by: Stephen Boyd

Re: [PATCH v17 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-21 Thread Stephen Boyd

Quoting Krishna Manikandan (2021-05-21 03:27:21)
> MSM Mobile Display Subsystem (MDSS) encapsulates sub-blocks
> like DPU display controller, DSI etc. Add YAML schema
> for DPU device tree bindings.
>
> Signed-off-by: Krishna Manikandan 
>
> Changes in v2:
> - Changed dpu to DPU (Sam Ravnborg)
> - Fixed indentation issues (Sam Ravnborg)
> - Added empty line between different properties (Sam Ravnborg)
> - Replaced reference txt files with  their corresponding
>   yaml files (Sam Ravnborg)
> - Modified the file to use "|" only when it is
>   necessary (Sam Ravnborg)
>
> Changes in v3:
> - Corrected the license used (Rob Herring)
> - Added maxItems for properties (Rob Herring)
> - Dropped generic descriptions (Rob Herring)
> - Added ranges property (Rob Herring)
> - Corrected the indendation (Rob Herring)
> - Added additionalProperties (Rob Herring)
> - Split dsi file into two, one for dsi controller
>   and another one for dsi phy per target (Rob Herring)
> - Corrected description for pinctrl-names (Rob Herring)
> - Corrected the examples used in yaml file (Rob Herring)
> - Delete dsi.txt and dpu.txt (Rob Herring)
>
> Changes in v4:
> - Move schema up by one level (Rob Herring)
> - Add patternProperties for mdp node (Rob Herring)
> - Corrected description of some properties (Rob Herring)
>
> Changes in v5:
> - Correct the indentation (Rob Herring)
> - Remove unnecessary description from properties (Rob Herring)
> - Correct the number of interconnect entries (Rob Herring)
> - Add interconnect names for sc7180 (Rob Herring)
> - Add description for ports (Rob Herring)
> - Remove common properties (Rob Herring)
> - Add unevalutatedProperties (Rob Herring)
> - Reference existing dsi controller yaml in the common
>   dsi controller file (Rob Herring)
> - Correct the description of clock names to include only the
>   clocks that are required (Rob Herring)
> - Remove properties which are already covered under the common
>   binding (Rob Herring)
> - Add dsi phy supply nodes which are required for sc7180 and
>   sdm845 targets (Rob Herring)
> - Add type ref for syscon-sfpb (Rob Herring)
>
> Changes in v6:
> - Fixed errors during dt_binding_check (Rob Herring)
> - Add maxItems for phys and phys-names (Rob Herring)
> - Use unevaluatedProperties wherever required (Rob Herring)
> - Removed interrupt controller from required properties for
>   dsi controller (Rob Herring)
> - Add constraints for dsi-phy reg-names based on the compatible
>   phy version (Rob Herring)
> - Add constraints for dsi-phy supply nodes based on the
>   compatible phy version (Rob Herring)
>
> Changes in v7:
> - Add default value for qcom,mdss-mdp-transfer-time-us (Rob Herring)
> - Modify the schema for data-lanes (Rob Herring)
> - Split the phy schema into separate schemas based on
>   the phy version (Rob Herring)
>
> Changes in v8:
> - Resolve merge conflicts with latest dsi.txt file
> - Include dp yaml change also in the same series
>
> Changes in v9:
> - Combine target specific dsi controller yaml files
>   to a single yaml file (Rob Herring)
> - Combine target specific dsi phy yaml files into a
>   single yaml file (Rob Herring)
> - Use unevaluatedProperties and additionalProperties
>   wherever required
> - Remove duplicate properties from common yaml files
>
> Changes in v10:
> - Split the patch into separate patches for DPU, DSI and
>   PHY (Stephen Boyd)
> - Drop unnecessary fullstop (Stephen Boyd)
> - Add newline whereever required (Stephen Boyd)
> - Add description for clock used (Stephen Boyd)
> - Modify the description for interconnect entries  (Stephen Boyd)
> - Drop assigned clock entries as it a generic property (Stephen Boyd)
> - Correct the definition for interrupts (Stephen Boyd)
> - Drop clock names from required properties (Stephen Boyd)
> - Drop labels for display nodes from example (Stephen Boyd)
> - Drop flags from interrupts entries (Stephen Boyd)
>
> Changes in v11:
> - Drop maxItems for clocks (Stephen Boyd)
>
> Changes in v12:
> - Add description for register property (Stephen Boyd)
> - Add maxItems for interrupts (Stephen Boyd)
> - Add description for iommus property (Stephen Boyd)
> - Add description for interconnects (Stephen Boyd)
> - Change display node name to display_controller (Stephen Boyd)
>
> Changes in v13:
> - Add maxItems for reg property (Stephen Boyd)
> - Add ranges property in example (Stephen Boyd)
> - Modify description for iommus property (Stephen Boyd)
> - Add Dp bindings for ports in the same patch (Stephen Boyd)
> - Remove soc from examples and change address and size cells
>   accordingly (Stephen Boyd)
> - Add reference for ports
>
> Changes in v14:
> - Modify

Re: [PATCH v17 4/4] dt-bindings: msm/dp: Add bindings of MSM DisplayPort controller

2021-05-21 Thread Stephen Boyd

Quoting Krishna Manikandan (2021-05-21 03:27:24)
> Add bindings for Snapdragon DisplayPort controller driver.
>
> Signed-off-by: Chandan Uddaraju 
> Signed-off-by: Vara Reddy 
> Signed-off-by: Tanmay Shah 
> Signed-off-by: Kuogee Hsieh 
> Signed-off-by: Krishna Manikandan 
>
> Changes in V2:
> -Provide details about sel-gpio
>
> Changes in V4:
> -Provide details about max dp lanes
> -Change the commit text
>
> Changes in V5:
> -moved dp.txt to yaml file
>
> Changes in v6:
> - Squash all AUX LUT properties into one pattern Property
> - Make aux-cfg[0-9]-settings properties optional
> - Remove PLL/PHY bindings from DP controller dts
> - Add DP clocks description
> - Remove _clk suffix from clock names
> - Rename pixel clock to stream_pixel
> - Remove redundant bindings (GPIO, PHY, HDCP clock, etc..)
> - Fix indentation
> - Add Display Port as interface of DPU in DPU bindings
>   and add port mapping accordingly.
>
> Chages in v7:
> - Add dp-controller.yaml file common between multiple SOC
> - Rename dp-sc7180.yaml to dp-controller-sc7180.yaml
> - change compatible string and add SOC name to it.
> - Remove Root clock generator for pixel clock
> - Add assigned-clocks and assigned-clock-parents bindings
> - Remove redundant properties, descriptions and blank lines
> - Add DP port in DPU bindings
> - Update depends-on tag in commit message and rebase change accordingly
>
> Changes in v8:
> - Add MDSS AHB clock in bindings
>
> Changes in v9:
> - Remove redundant reg-name property
> - Change assigned-clocks and assigned-clocks-parents counts to 2
> - Use IRQ flags in example dts
>
> Changes in v10:
> - Change title of this patch as it does not contain PLL bindings anymore
> - Remove redundant properties
> - Remove use of IRQ flag
> - Fix ports property
>
> Changes in v11:
> - add ports required of both #address-cells and  #size-cells
> - add required operating-points-v2
> - add required #sound-dai-cells
> - add required power-domains
> - update maintainer list
>
> Changes in v12:
> - remove soc node from examples (Stephen Boyd)
> - split dpu-sc7180.yaml changes to separate patch (Stephen Boyd)
>
> Changes in v13:
> - add assigned-clocks
> - add assigned-clock-parents
>
> Changes in v14:
> - add reference for ports (Rob Herring)
>
> Changes in v15:
> - drop common properties from ports (Rob Herring)
> ---

Reviewed-by: Stephen Boyd

[PATCH v2] drm/msm: Use nvmem_cell_read_variable_le_u32() to read speed bin

2021-05-21 Thread Douglas Anderson

Let's use the newly-added nvmem_cell_read_variable_le_u32() to future
proof ourselves a little bit.

Signed-off-by: Douglas Anderson 
---
The patch that this depends on is now in mainline so it can be merged
at will. I'm just sending this as a singleton patch to make it obvious
that there are no dependencies now.

Changes in v2:
- Rebased

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index b4d8e1b01ee4..a07214157ad3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1403,10 +1403,10 @@ static int a6xx_set_supported_hw(struct device *dev, 
struct a6xx_gpu *a6xx_gpu,
 {
struct opp_table *opp_table;
u32 supp_hw = UINT_MAX;
-   u16 speedbin;
+   u32 speedbin;
int ret;
 
-   ret = nvmem_cell_read_u16(dev, "speed_bin", );
+   ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", );
/*
 * -ENOENT means that the platform doesn't support speedbin which is
 * fine
@@ -1419,7 +1419,6 @@ static int a6xx_set_supported_hw(struct device *dev, 
struct a6xx_gpu *a6xx_gpu,
  ret);
goto done;
}
-   speedbin = le16_to_cpu(speedbin);
 
supp_hw = fuse_to_supp_hw(dev, revn, speedbin);
 
-- 
2.31.1.818.g46aad6cb9e-goog

[PATCH v2 2/2] drm/amdgpu: Fix crash when hot unplug in BACO.

2021-05-21 Thread Andrey Grodzovsky

Problem:
When device goes into sleep state due to prolonged
innactivity (e.g. BACO sleep) and then hot unplugged,
PCI core will try to wake up the device as part of
unplug process. Since the device is gone all HW
programming during rpm resume fails leading
to a bad SW state later during pci remove handling.

Fix:
Use a flag we use for PCIe error recovery to avoid
accessing registres. This allows to succefully complete
rpm resume sequence and finish pci remove.

v2: Renamed HW access block flag

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1081
Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index d8db5929cdd9..b9d221fcb66d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1555,6 +1555,11 @@ static int amdgpu_pmops_runtime_resume(struct device 
*dev)
if (!adev->runpm)
return -EINVAL;
 
+   /* Avoids registers access if device is physically gone */
+   if (!pci_device_is_present(adev->pdev))
+   adev->no_hw_access = true;
+
+
if (amdgpu_device_supports_px(drm_dev)) {
drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
 
-- 
2.25.1

[PATCH v2 1/2] drm/amdgpu: Rename flag which prevents HW access

2021-05-21 Thread Andrey Grodzovsky

Make it's name not feature but function descriptive.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d830a541ba89..d0e557cb5f1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1076,7 +1076,7 @@ struct amdgpu_device {
struct ratelimit_state  throttling_logging_rs;
uint32_tras_features;
 
-   boolin_pci_err_recovery;
+   boolno_hw_access;
struct pci_saved_state  *pci_state;
 
struct amdgpu_reset_control *reset_cntl;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bf5055642b82..60e945471a54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -340,7 +340,7 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, 
loff_t pos,
 /* Check if hw access should be skipped because of hotplug or device error */
 bool amdgpu_device_skip_hw_access(struct amdgpu_device *adev)
 {
-   if (adev->in_pci_err_recovery)
+   if (adev->no_hw_access)
return true;
 
 #ifdef CONFIG_LOCKDEP
@@ -5335,9 +5335,9 @@ pci_ers_result_t amdgpu_pci_slot_reset(struct pci_dev 
*pdev)
set_bit(AMDGPU_NEED_FULL_RESET, _context.flags);
set_bit(AMDGPU_SKIP_HW_RESET, _context.flags);
 
-   adev->in_pci_err_recovery = true;
+   adev->no_hw_access = true;
r = amdgpu_device_pre_asic_reset(adev, _context);
-   adev->in_pci_err_recovery = false;
+   adev->no_hw_access = false;
if (r)
goto out;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index baa7d9778583..ce1577687ac2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -232,7 +232,7 @@ int psp_wait_for(struct psp_context *psp, uint32_t 
reg_index,
int i;
struct amdgpu_device *adev = psp->adev;
 
-   if (psp->adev->in_pci_err_recovery)
+   if (psp->adev->no_hw_access)
return 0;
 
for (i = 0; i < adev->usec_timeout; i++) {
@@ -261,7 +261,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
bool ras_intr = false;
bool skip_unsupport = false;
 
-   if (psp->adev->in_pci_err_recovery)
+   if (psp->adev->no_hw_access)
return 0;
 
if (!drm_dev_enter(>adev->ddev, ))
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 2408ed4c7d84..540fedf787c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7332,7 +7332,7 @@ static int gfx_v10_0_hw_fini(void *handle)
amdgpu_irq_put(adev, >gfx.priv_reg_irq, 0);
amdgpu_irq_put(adev, >gfx.priv_inst_irq, 0);
 
-   if (!adev->in_pci_err_recovery) {
+   if (!adev->no_hw_access) {
 #ifndef BRING_UP_DEBUG
if (amdgpu_async_gfx_ring) {
r = gfx_v10_0_kiq_disable_kgq(adev);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index dc7d2e71aa6f..9526b46582c8 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -126,7 +126,7 @@ int smu_cmn_send_smc_msg_with_param(struct smu_context *smu,
struct amdgpu_device *adev = smu->adev;
int ret = 0, index = 0;
 
-   if (smu->adev->in_pci_err_recovery)
+   if (smu->adev->no_hw_access)
return 0;
 
index = smu_cmn_to_asic_specific_index(smu,
-- 
2.25.1

Re: [PATCH] drm/amdgpu: Fix crash when hot unplug in BACO.

2021-05-21 Thread Andrey Grodzovsky


Will do.

Andrey

On 2021-05-21 4:18 p.m., Alex Deucher wrote:

On Fri, May 21, 2021 at 4:14 PM Andrey Grodzovsky
 wrote:

Problem:
When device goes into sleep state due to prolonged
innactivity (e.g. BACO sleep) and then hot unplugged,
PCI core will try to wake up the device as part of
unplug process. Since the device is gone all HW
programming during rpm resume fails leading
to a bad SW state later during pci remove handling.

Fix:
Use a flag we use for PCIe error recovery to avoid
accessing registres. This allows to succefully complete
rpm resume sequence and finish pci remove.

Might make sense to create a preliminary patch to change the name of
this flag to something like no_hw_access since it's not specific to
pci error handling.

Alex


P.S Must use pci_device_is_present and not drm_dev_enter/exit
here since rpm resume happens before PCI remove and so the
unplug flag is not set yet.

Link: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1081data=04%7C01%7Candrey.grodzovsky%40amd.com%7C2a0ec02245b64de0139808d91c959987%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637572251118922092%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=egcYBxBU%2BkIqbEdypVueXQcWb%2Bqe%2BKCC30Mw%2FjgR6ag%3Dreserved=0
Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index d8db5929cdd9..ab95ebf56636 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1555,6 +1555,11 @@ static int amdgpu_pmops_runtime_resume(struct device 
*dev)
 if (!adev->runpm)
 return -EINVAL;

+   /* Avoids registers access if device is physically gone */
+   if (!pci_device_is_present(adev->pdev))
+   adev->in_pci_err_recovery = true;
+
+
 if (amdgpu_device_supports_px(drm_dev)) {
 drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;

--
2.25.1

Re: [PATCH] drm/amdgpu: Fix crash when hot unplug in BACO.

2021-05-21 Thread Alex Deucher

On Fri, May 21, 2021 at 4:14 PM Andrey Grodzovsky
 wrote:
>
> Problem:
> When device goes into sleep state due to prolonged
> innactivity (e.g. BACO sleep) and then hot unplugged,
> PCI core will try to wake up the device as part of
> unplug process. Since the device is gone all HW
> programming during rpm resume fails leading
> to a bad SW state later during pci remove handling.
>
> Fix:
> Use a flag we use for PCIe error recovery to avoid
> accessing registres. This allows to succefully complete
> rpm resume sequence and finish pci remove.

Might make sense to create a preliminary patch to change the name of
this flag to something like no_hw_access since it's not specific to
pci error handling.

Alex

>
> P.S Must use pci_device_is_present and not drm_dev_enter/exit
> here since rpm resume happens before PCI remove and so the
> unplug flag is not set yet.
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> Signed-off-by: Andrey Grodzovsky 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index d8db5929cdd9..ab95ebf56636 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1555,6 +1555,11 @@ static int amdgpu_pmops_runtime_resume(struct device 
> *dev)
> if (!adev->runpm)
> return -EINVAL;
>
> +   /* Avoids registers access if device is physically gone */
> +   if (!pci_device_is_present(adev->pdev))
> +   adev->in_pci_err_recovery = true;
> +
> +
> if (amdgpu_device_supports_px(drm_dev)) {
> drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
>
> --
> 2.25.1
>

[PATCH] drm/amdgpu: Fix crash when hot unplug in BACO.

2021-05-21 Thread Andrey Grodzovsky

Problem:
When device goes into sleep state due to prolonged
innactivity (e.g. BACO sleep) and then hot unplugged,
PCI core will try to wake up the device as part of
unplug process. Since the device is gone all HW
programming during rpm resume fails leading
to a bad SW state later during pci remove handling.

Fix:
Use a flag we use for PCIe error recovery to avoid
accessing registres. This allows to succefully complete
rpm resume sequence and finish pci remove.

P.S Must use pci_device_is_present and not drm_dev_enter/exit
here since rpm resume happens before PCI remove and so the 
unplug flag is not set yet.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1081
Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index d8db5929cdd9..ab95ebf56636 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1555,6 +1555,11 @@ static int amdgpu_pmops_runtime_resume(struct device 
*dev)
if (!adev->runpm)
return -EINVAL;
 
+   /* Avoids registers access if device is physically gone */
+   if (!pci_device_is_present(adev->pdev))
+   adev->in_pci_err_recovery = true;
+
+
if (amdgpu_device_supports_px(drm_dev)) {
drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
 
-- 
2.25.1

Re: [PATCH v5 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation

2021-05-21 Thread John Stultz

On Fri, May 21, 2021 at 2:40 AM Lee Jones  wrote:
> On Tue, 10 Nov 2020 at 03:49, John Stultz  wrote:
>> This series reworks the system heap to use sgtables, and then
>> consolidates the pagelist method from the heap-helpers into the
>> CMA heap. After which the heap-helpers logic is removed (as it
>> is unused). I'd still like to find a better way to avoid some of
>> the logic duplication in implementing the entire dma_buf_ops
>> handlers per heap. But unfortunately that code is tied somewhat
>> to how the buffer's memory is tracked. As more heaps show up I
>> think we'll have a better idea how to best share code, so for
>> now I think this is ok.
>>
>> After this, the series introduces an optimization that
>> Ørjan Eide implemented for ION that avoids calling sync on
>> attachments that don't have a mapping.
>>
>> Next, an optimization to use larger order pages for the system
>> heap. This change brings us closer to the current performance
>> of the ION allocation code (though there still is a gap due
>> to ION using a mix of deferred-freeing and page pools, I'll be
>> looking at integrating those eventually).
>>
>> Finally, a reworked version of my uncached system heap
>> implementation I was submitting a few weeks back. Since it
>> duplicated a lot of the now reworked system heap code, I
>> realized it would be much simpler to add the functionality to
>> the system_heap implementation itself.
>>
>> While not improving the core allocation performance, the
>> uncached heap allocations do result in *much* improved
>> performance on HiKey960 as it avoids a lot of flushing and
>> invalidating buffers that the cpu doesn't touch often.
>>
>
>
> John, did this ever make it past v5?  I don't see a follow-up.

So most of these have landed upstream already.

The one exception is the system-uncached heap implementation, as
DanielV wanted a usecase where it was beneficial to a device with an
open driver.
Unfortunately this hasn't been trivial to show with the open gpu
devices I have, but taking Nicolas Dufresne's note, we're looking to
enable v4l2 integration in AOSP on db845c, so we can hopefully show
some benefit there. The HAL integration work has been taking some time
to get working though.

So it's a bit blocked on that for now.

thanks
-john

Re: [PATCH v6 1/9] drm/i915/dpcd_bl: Remove redundant AUX backlight frequency calculations

2021-05-21 Thread Rodrigo Vivi

On Fri, May 14, 2021 at 02:14:55PM -0400, Lyude Paul wrote:
> Noticed this while moving all of the VESA backlight code in i915 over to
> DRM helpers: it would appear that we calculate the frequency value we want
> to write to DP_EDP_BACKLIGHT_FREQ_SET twice even though this value never
> actually changes during runtime. So, let's simplify things by just caching
> this value in intel_panel.backlight, and re-writing it as-needed.
> 
> Changes since v1:
> * Wrap panel->backlight.edp.vesa.pwm_freq_pre_divider in
>   DP_EDP_BACKLIGHT_FREQ_AUX_SET_CAP check - Jani

This looks okay to me now... Jani, agree?

> 
> Signed-off-by: Lyude Paul 
> Cc: Jani Nikula 
> Cc: Dave Airlie 
> Cc: greg.depo...@gmail.com
> ---
>  .../drm/i915/display/intel_display_types.h|  1 +
>  .../drm/i915/display/intel_dp_aux_backlight.c | 65 ++-
>  2 files changed, 20 insertions(+), 46 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
> b/drivers/gpu/drm/i915/display/intel_display_types.h
> index 9c0adfc60c6f..7054a37363fb 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_types.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_types.h
> @@ -311,6 +311,7 @@ struct intel_panel {
>   union {
>   struct {
>   u8 pwmgen_bit_count;
> + u8 pwm_freq_pre_divider;
>   } vesa;
>   struct {
>   bool sdr_uses_aux;
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c 
> b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> index 8e9ac9ba1d38..68bfe50ada59 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> @@ -373,50 +373,6 @@ intel_dp_aux_vesa_set_backlight(const struct 
> drm_connector_state *conn_state,
>   }
>  }
>  
> -/*
> - * Set PWM Frequency divider to match desired frequency in vbt.
> - * The PWM Frequency is calculated as 27Mhz / (F x P).
> - * - Where F = PWM Frequency Pre-Divider value programmed by field 7:0 of the
> - * EDP_BACKLIGHT_FREQ_SET register (DPCD Address 00728h)
> - * - Where P = 2^Pn, where Pn is the value programmed by field 4:0 of the
> - * EDP_PWMGEN_BIT_COUNT register (DPCD Address 00724h)
> - */
> -static bool intel_dp_aux_vesa_set_pwm_freq(struct intel_connector *connector)
> -{
> - struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
> - struct intel_dp *intel_dp = intel_attached_dp(connector);
> - const u8 pn = connector->panel.backlight.edp.vesa.pwmgen_bit_count;
> - int freq, fxp, f, fxp_actual, fxp_min, fxp_max;
> -
> - freq = dev_priv->vbt.backlight.pwm_freq_hz;
> - if (!freq) {
> - drm_dbg_kms(_priv->drm,
> - "Use panel default backlight frequency\n");
> - return false;
> - }
> -
> - fxp = DIV_ROUND_CLOSEST(KHz(DP_EDP_BACKLIGHT_FREQ_BASE_KHZ), freq);
> - f = clamp(DIV_ROUND_CLOSEST(fxp, 1 << pn), 1, 255);
> - fxp_actual = f << pn;
> -
> - /* Ensure frequency is within 25% of desired value */
> - fxp_min = DIV_ROUND_CLOSEST(fxp * 3, 4);
> - fxp_max = DIV_ROUND_CLOSEST(fxp * 5, 4);
> -
> - if (fxp_min > fxp_actual || fxp_actual > fxp_max) {
> - drm_dbg_kms(_priv->drm, "Actual frequency out of range\n");
> - return false;
> - }
> -
> - if (drm_dp_dpcd_writeb(_dp->aux,
> -DP_EDP_BACKLIGHT_FREQ_SET, (u8) f) < 0) {
> - drm_dbg_kms(_priv->drm,
> - "Failed to write aux backlight freq\n");
> - return false;
> - }
> - return true;
> -}
> -
>  static void
>  intel_dp_aux_vesa_enable_backlight(const struct intel_crtc_state *crtc_state,
>  const struct drm_connector_state 
> *conn_state, u32 level)
> @@ -459,9 +415,13 @@ intel_dp_aux_vesa_enable_backlight(const struct 
> intel_crtc_state *crtc_state,
>   break;
>   }
>  
> - if (intel_dp->edp_dpcd[2] & DP_EDP_BACKLIGHT_FREQ_AUX_SET_CAP)
> - if (intel_dp_aux_vesa_set_pwm_freq(connector))
> + if (panel->backlight.edp.vesa.pwm_freq_pre_divider) {
> + if (drm_dp_dpcd_writeb(_dp->aux, 
> DP_EDP_BACKLIGHT_FREQ_SET,
> +
> panel->backlight.edp.vesa.pwm_freq_pre_divider) == 1)
>   new_dpcd_buf |= DP_EDP_BACKLIGHT_FREQ_AUX_SET_ENABLE;
> + else
> + drm_dbg_kms(>drm, "Failed to write aux backlight 
> frequency\n");
> + }
>  
>   if (new_dpcd_buf != dpcd_buf) {
>   if (drm_dp_dpcd_writeb(_dp->aux,
> @@ -482,6 +442,14 @@ static void intel_dp_aux_vesa_disable_backlight(const 
> struct drm_connector_state
> false);
>  }
>  
> +/*
> + * Compute PWM frequency divider value based off the frequency provided

Re: [PATCH v6 8/9] drm/dp: Extract i915's eDP backlight code into DRM helpers

2021-05-21 Thread Rodrigo Vivi

On Fri, May 14, 2021 at 02:15:02PM -0400, Lyude Paul wrote:
> Since we're about to implement eDP backlight support in nouveau using the
> standard protocol from VESA, we might as well just take the code that's
> already written for this and move it into a set of shared DRM helpers.
> 
> Note that these helpers are intended to handle DPCD related backlight
> control bits such as setting the brightness level over AUX, probing the
> backlight's TCON, enabling/disabling the backlight over AUX if supported,
> etc. Any PWM-related portions of backlight control are explicitly left up
> to the driver, as these will vary from platform to platform.
> 
> The only exception to this is the calculation of the PWM frequency
> pre-divider value. This is because the only platform-specific information
> required for this is the PWM frequency of the panel, which the driver is
> expected to provide if available. The actual algorithm for calculating this
> value is standard and is defined in the eDP specification from VESA.
> 
> Note that these helpers do not yet implement the full range of features
> the VESA backlight interface provides, and only provide the following
> functionality (all of which was already present in i915's DPCD backlight
> support):
> 
> * Basic control of brightness levels
> * Basic probing of backlight capabilities
> * Helpers for enabling and disabling the backlight
> 
> v3:
> * Split out changes to i915's backlight code to separate patches to make it
>   easier to review
> v4:
> * Style/spelling changes from Thomas Zimmermann
> v5:
> * Start using new drm_dbg_*() functions

This version was indeed much easier to review. Thanks for addressing all the
comments.

Reviewed-by: Rodrigo Vivi 

> 
> Signed-off-by: Lyude Paul 
> Cc: Jani Nikula 
> Cc: Dave Airlie 
> Cc: greg.depo...@gmail.com
> ---
>  drivers/gpu/drm/drm_dp_helper.c   | 347 ++
>  .../drm/i915/display/intel_display_types.h|   5 +-
>  .../drm/i915/display/intel_dp_aux_backlight.c | 285 ++
>  include/drm/drm_dp_helper.h   |  48 +++
>  4 files changed, 427 insertions(+), 258 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 55b53df6ce34..24bbc710c825 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -3115,3 +3115,350 @@ int drm_dp_pcon_convert_rgb_to_ycbcr(struct 
> drm_dp_aux *aux, u8 color_spc)
>   return 0;
>  }
>  EXPORT_SYMBOL(drm_dp_pcon_convert_rgb_to_ycbcr);
> +
> +/**
> + * drm_edp_backlight_set_level() - Set the backlight level of an eDP panel 
> via AUX
> + * @aux: The DP AUX channel to use
> + * @bl: Backlight capability info from drm_edp_backlight_init()
> + * @level: The brightness level to set
> + *
> + * Sets the brightness level of an eDP panel's backlight. Note that the 
> panel's backlight must
> + * already have been enabled by the driver by calling 
> drm_edp_backlight_enable().
> + *
> + * Returns: %0 on success, negative error code on failure
> + */
> +int drm_edp_backlight_set_level(struct drm_dp_aux *aux, const struct 
> drm_edp_backlight_info *bl,
> + u16 level)
> +{
> + int ret;
> + u8 buf[2] = { 0 };
> +
> + if (bl->lsb_reg_used) {
> + buf[0] = (level & 0xff00) >> 8;
> + buf[1] = (level & 0x00ff);
> + } else {
> + buf[0] = level;
> + }
> +
> + ret = drm_dp_dpcd_write(aux, DP_EDP_BACKLIGHT_BRIGHTNESS_MSB, buf, 
> sizeof(buf));
> + if (ret != sizeof(buf)) {
> + drm_err(aux->drm_dev,
> + "%s: Failed to write aux backlight level: %d\n",
> + aux->name, ret);
> + return ret < 0 ? ret : -EIO;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_edp_backlight_set_level);
> +
> +static int
> +drm_edp_backlight_set_enable(struct drm_dp_aux *aux, const struct 
> drm_edp_backlight_info *bl,
> +  bool enable)
> +{
> + int ret;
> + u8 buf;
> +
> + /* The panel uses something other then DPCD for enabling its backlight 
> */
> + if (!bl->aux_enable)
> + return 0;
> +
> + ret = drm_dp_dpcd_readb(aux, DP_EDP_DISPLAY_CONTROL_REGISTER, );
> + if (ret != 1) {
> + drm_err(aux->drm_dev, "%s: Failed to read eDP display control 
> register: %d\n",
> + aux->name, ret);
> + return ret < 0 ? ret : -EIO;
> + }
> + if (enable)
> + buf |= DP_EDP_BACKLIGHT_ENABLE;
> + else
> + buf &= ~DP_EDP_BACKLIGHT_ENABLE;
> +
> + ret = drm_dp_dpcd_writeb(aux, DP_EDP_DISPLAY_CONTROL_REGISTER, buf);
> + if (ret != 1) {
> + drm_err(aux->drm_dev, "%s: Failed to write eDP display control 
> register: %d\n",
> + aux->name, ret);
> + return ret < 0 ? ret : -EIO;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * drm_edp_backlight_enable() - Enable an eDP

[PATCH 2/2] drivers/firmware: consolidate EFI framebuffer setup for all arches

2021-05-21 Thread Javier Martinez Canillas

The register_gop_device() function registers an "efi-framebuffer" platform
device to match against the efifb driver, to have an early framebuffer for
EFI platforms.

But the Generic System Framebuffers (sysfb) already has support for this.

Instead of having duplicated logic for x86 and other architectures using
EFI, consolidate the two in sysfb and remove it from the EFI init logic.

Signed-off-by: Javier Martinez Canillas 
---

 arch/arm/Kconfig  |  1 +
 arch/arm/include/asm/efi.h|  5 +-
 arch/arm64/Kconfig|  1 +
 arch/arm64/include/asm/efi.h  |  5 +-
 arch/riscv/Kconfig|  1 +
 arch/riscv/include/asm/efi.h  |  5 +-
 drivers/firmware/Kconfig  |  7 ++-
 drivers/firmware/Makefile |  2 +-
 drivers/firmware/efi/efi-init.c   | 90 ---
 drivers/firmware/efi/sysfb_efi.c  | 77 +-
 drivers/firmware/sysfb.c  | 40 +-
 drivers/firmware/sysfb_simplefb.c | 29 ++
 include/linux/sysfb.h | 28 +-
 13 files changed, 145 insertions(+), 146 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 24804f11302..30ba195ca72 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -127,6 +127,7 @@ config ARM
select PERF_USE_VMALLOC
select RTC_LIB
select SET_FS
+   select SYSFB
select SYS_SUPPORTS_APM_EMULATION
# Above selects are sorted alphabetically; please add new ones
# according to that.  Thanks.
diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index 9de7ab2ce05..a6f3b179e8a 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -17,6 +17,7 @@
 
 #ifdef CONFIG_EFI
 void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
 
 int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
 int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
@@ -52,10 +53,6 @@ void efi_virtmap_unload(void);
 struct screen_info *alloc_screen_info(void);
 void free_screen_info(struct screen_info *si);
 
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char 
*opt)
-{
-}
-
 /*
  * A reasonable upper bound for the uncompressed kernel size is 32 MBytes,
  * so we will reserve that amount of memory. We have no easy way to tell what
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9f1d8566bbf..20886eb48ab 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1884,6 +1884,7 @@ config EFI
select EFI_RUNTIME_WRAPPERS
select EFI_STUB
select EFI_GENERIC_STUB
+   select SYSFB
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
default y
help
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 3578aba9c60..42d673a011c 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,6 +14,7 @@
 
 #ifdef CONFIG_EFI
 extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
 #else
 #define efi_init()
 #endif
@@ -85,10 +86,6 @@ static inline void free_screen_info(struct screen_info *si)
 {
 }
 
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char 
*opt)
-{
-}
-
 #define EFI_ALLOC_ALIGNSZ_64K
 
 /*
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a8ad8eb7612..f4bc6736c4e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -474,6 +474,7 @@ config EFI
select EFI_GENERIC_STUB
select EFI_RUNTIME_WRAPPERS
select RISCV_ISA_C
+   select SYSFB
depends on MMU
default y
help
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
index 6d98cd99968..7a8f0d45b13 100644
--- a/arch/riscv/include/asm/efi.h
+++ b/arch/riscv/include/asm/efi.h
@@ -13,6 +13,7 @@
 
 #ifdef CONFIG_EFI
 extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
 #else
 #define efi_init()
 #endif
@@ -39,10 +40,6 @@ static inline void free_screen_info(struct screen_info *si)
 {
 }
 
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char 
*opt)
-{
-}
-
 void efi_virtmap_load(void);
 void efi_virtmap_unload(void);
 
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 396bd1d5cbf..cc1a61f6d57 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -253,9 +253,8 @@ config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
 
 config SYSFB
bool
-   depends on X86
 
-config X86_SYSFB
+config SYSFB_SIMPLEFB
bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
depends on SYSFB
help
@@ -263,10 +262,10 @@ config X86_SYSFB
  bootloader or kernel can show basic video-output during boot for
  user-guidance and debugging. Historically, x86 used the VESA BIOS
  Extensions and EFI-framebuffers for this, which are mostly limited
- to

[PATCH 0/2] allow the sysfb support to be used in non-x86 arches

2021-05-21 Thread Javier Martinez Canillas

The x86 architecture platform has a Generic System Framebuffers (sysfb)
support, that register a system frambuffer platform devices. It either
registers a "simple-framebuffer" for the simple{fb,drm} drivers or legacy
VGA/EFI FB devices for the vgafb/efifb drivers.

Besides this, the EFI initialization code used by other architectures such
as aarch64 and riscv, has similar logic but only register an EFI FB device.

The sysfb is generic enough to be reused by other architectures and can be
moved out of the arch/x86 directory to drivers/firmware, allowing the EFI
logic used by non-x86 architectures to be folded into sysfb as well.

Patch #1 in this series do the former while patch #2 the latter. This has
been tested on x86_64 and aarch64 machines using the efifb, simplefb and
simpledrm drivers. But more testing will be highly appreciated, to make
sure that no regressions are being introduced by these changes.

Since this touches both arch/{x86,arm,arm64,riscv} and drivers/firmware, I
don't know how it should be merged. But I didn't find a way to split these.

Best regards,
Javier


Javier Martinez Canillas (2):
  drivers/firmware: move x86 Generic System Framebuffers support
  drivers/firmware: consolidate EFI framebuffer setup for all arches

 arch/arm/Kconfig  |  1 +
 arch/arm/include/asm/efi.h|  5 +-
 arch/arm64/Kconfig|  1 +
 arch/arm64/include/asm/efi.h  |  5 +-
 arch/riscv/Kconfig|  1 +
 arch/riscv/include/asm/efi.h  |  5 +-
 arch/x86/Kconfig  | 27 +-
 arch/x86/kernel/Makefile  |  3 -
 drivers/firmware/Kconfig  | 30 +++
 drivers/firmware/Makefile |  2 +
 drivers/firmware/efi/Makefile |  2 +
 drivers/firmware/efi/efi-init.c   | 90 ---
 .../firmware/efi}/sysfb_efi.c | 79 +++-
 {arch/x86/kernel => drivers/firmware}/sysfb.c | 42 +
 .../firmware}/sysfb_simplefb.c| 31 ---
 .../x86/include/asm => include/linux}/sysfb.h | 34 +++
 16 files changed, 182 insertions(+), 176 deletions(-)
 rename {arch/x86/kernel => drivers/firmware/efi}/sysfb_efi.c (84%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb.c (70%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb_simplefb.c (82%)
 rename {arch/x86/include/asm => include/linux}/sysfb.h (68%)

-- 
2.31.1

[PATCH 1/2] drivers/firmware: move x86 Generic System Framebuffers support

2021-05-21 Thread Javier Martinez Canillas

The x86 architecture has generic support to register a system framebuffer
platform device. It either registers a "simple-framebuffer" if the config
option CONFIG_X86_SYSFB is enabled, or a legacy VGA/VBE/EFI FB device.

But the code is generic enough to be reused by other architectures and can
be moved out of the arch/x86 directory.

Signed-off-by: Javier Martinez Canillas 
---

 arch/x86/Kconfig  | 27 +---
 arch/x86/kernel/Makefile  |  3 --
 drivers/firmware/Kconfig  | 31 +++
 drivers/firmware/Makefile |  2 ++
 drivers/firmware/efi/Makefile |  2 ++
 .../firmware/efi}/sysfb_efi.c |  2 +-
 {arch/x86/kernel => drivers/firmware}/sysfb.c |  2 +-
 .../firmware}/sysfb_simplefb.c|  2 +-
 .../x86/include/asm => include/linux}/sysfb.h |  6 ++--
 9 files changed, 42 insertions(+), 35 deletions(-)
 rename {arch/x86/kernel => drivers/firmware/efi}/sysfb_efi.c (99%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb.c (98%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb_simplefb.c (99%)
 rename {arch/x86/include/asm => include/linux}/sysfb.h (95%)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0045e1b4419..40816b5be27 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -257,6 +257,7 @@ config X86
select SRCU
select STACK_VALIDATION if HAVE_STACK_VALIDATION && 
(HAVE_STATIC_CALL_INLINE || RETPOLINE)
select SYSCTL_EXCEPTION_TRACE
+   select SYSFB
select THREAD_INFO_IN_TASK
select USER_STACKTRACE_SUPPORT
select VIRT_TO_BUS
@@ -2806,32 +2807,6 @@ config AMD_NB
def_bool y
depends on CPU_SUP_AMD && PCI
 
-config X86_SYSFB
-   bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
-   help
- Firmwares often provide initial graphics framebuffers so the BIOS,
- bootloader or kernel can show basic video-output during boot for
- user-guidance and debugging. Historically, x86 used the VESA BIOS
- Extensions and EFI-framebuffers for this, which are mostly limited
- to x86.
- This option, if enabled, marks VGA/VBE/EFI framebuffers as generic
- framebuffers so the new generic system-framebuffer drivers can be
- used on x86. If the framebuffer is not compatible with the generic
- modes, it is advertised as fallback platform framebuffer so legacy
- drivers like efifb, vesafb and uvesafb can pick it up.
- If this option is not selected, all system framebuffers are always
- marked as fallback platform framebuffers as usual.
-
- Note: Legacy fbdev drivers, including vesafb, efifb, uvesafb, will
- not be able to pick up generic system framebuffers if this option
- is selected. You are highly encouraged to enable simplefb as
- replacement if you select this option. simplefb can correctly deal
- with generic system framebuffers. But you should still keep vesafb
- and others enabled as fallback if a system framebuffer is
- incompatible with simplefb.
-
- If unsure, say Y.
-
 endmenu
 
 
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0704c2a9427..149a0f8e89d 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -135,9 +135,6 @@ obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o
 obj-$(CONFIG_SWIOTLB)  += pci-swiotlb.o
 obj-$(CONFIG_OF)   += devicetree.o
 obj-$(CONFIG_UPROBES)  += uprobes.o
-obj-y  += sysfb.o
-obj-$(CONFIG_X86_SYSFB)+= sysfb_simplefb.o
-obj-$(CONFIG_EFI)  += sysfb_efi.o
 
 obj-$(CONFIG_PERF_EVENTS)  += perf_regs.o
 obj-$(CONFIG_TRACING)  += tracepoint.o
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index db0ea2d2d75..396bd1d5cbf 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -251,6 +251,37 @@ config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
 
  Say Y here to enable "download mode" by default.
 
+config SYSFB
+   bool
+   depends on X86
+
+config X86_SYSFB
+   bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
+   depends on SYSFB
+   help
+ Firmwares often provide initial graphics framebuffers so the BIOS,
+ bootloader or kernel can show basic video-output during boot for
+ user-guidance and debugging. Historically, x86 used the VESA BIOS
+ Extensions and EFI-framebuffers for this, which are mostly limited
+ to x86.
+ This option, if enabled, marks VGA/VBE/EFI framebuffers as generic
+ framebuffers so the new generic system-framebuffer drivers can be
+ used on x86. If the framebuffer is not compatible with the generic
+ modes, it is advertised as fallback platform

Re: [PATCH] drm/fb-helper: improve DRM fbdev emulation device names

2021-05-21 Thread Thomas Zimmermann


Hi

Am 21.05.21 um 19:18 schrieb Javier Martinez Canillas:

On 5/21/21 6:53 PM, Thomas Zimmermann wrote:

[snip]



So what with all the drivers which do _not_ have drm in their name? Also
I'm never sure how much these are uapi or not ...




That someone could threat as an uapi is a fair point indeed.
  

Why do we need a suffix anyway?



Yes, I thought the same and was torn about posting a patch to just remove
the suffix. I don't think users care that much if is a fb device from a
fbdev driver or a DRM driver using the fbdev emulation.


Yup. I don't see how anything in userspace would depend on the exact 
name; especially since fbdev emulation only provides basic features. 
(I'd welcome a counter examples that proves me wrong.)


IMHO we can risk it to remove the suffix entirely. But that needs an ack 
from Daniel or Dave.


Best regards
Thomas




-Daniel



Best regards,



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH] drm/i915/gvt: remove local storage of debugfs file

2021-05-21 Thread Greg Kroah-Hartman

On Wed, May 19, 2021 at 04:21:23PM +0800, Zhenyu Wang wrote:
> On 2021.05.19 10:31:18 +0200, Greg Kroah-Hartman wrote:
> > On Wed, May 19, 2021 at 04:03:13PM +0800, Zhenyu Wang wrote:
> > > On 2021.05.18 18:28:53 +0200, Greg Kroah-Hartman wrote:
> > > > On Tue, May 18, 2021 at 06:17:05PM +0200, Greg Kroah-Hartman wrote:
> > > > > There is no need to keep the dentry around for the debugfs kvmgt cache
> > > > > file, as we can just look it up when we want to remove it later on.
> > > > > Simplify the structure by removing the dentry and relying on debugfs
> > > > > to find the dentry to remove when we want to.
> > > > > 
> > > > > By doing this change, we remove the last in-kernel user that was 
> > > > > storing
> > > > > the result of debugfs_create_long(), so that api can be cleaned up.
> > > > > 
> > > > > Cc: Zhenyu Wang 
> > > > > Cc: Zhi Wang 
> > > > > Cc: Jani Nikula 
> > > > > Cc: Joonas Lahtinen 
> > > > > Cc: Rodrigo Vivi 
> > > > > Cc: David Airlie 
> > > > > Cc: Daniel Vetter 
> > > > > Cc: intel-gvt-...@lists.freedesktop.org
> > > > > Cc: intel-...@lists.freedesktop.org
> > > > > Cc: dri-devel@lists.freedesktop.org
> > > > > Signed-off-by: Greg Kroah-Hartman 
> > > > > ---
> > > > >  drivers/gpu/drm/i915/gvt/kvmgt.c | 11 +--
> > > > >  1 file changed, 5 insertions(+), 6 deletions(-)
> > > > 
> > > > Note, I can take this through my debugfs tree if wanted, that way I can
> > > > clean up the debugfs_create_long() api at the same time.  Otherwise it's
> > > > fine, I can wait until next -rc1 for that to happen.
> > > > 
> > > 
> > > It's fine with me to go through debugfs tree. Just double check that 
> > > recent
> > > kvmgt change would not cause conflict with this as well.
> > 
> > How can I check that?  I'll be glad to take this through my tree, we can
> > handle the merge issues later for 5.14-rc1 :)
> > 
> 
> Current kvmgt change in merge queue is just 
> https://patchwork.freedesktop.org/patch/433536/?series=89995=2
> It applies fine with debugfs change.

Thanks, I'll go take this through my tree now.

greg k-h

Re: [PATCH v17 4/4] dt-bindings: msm/dp: Add bindings of MSM DisplayPort controller

2021-05-21 Thread Rob Herring

On Fri, 21 May 2021 15:57:24 +0530, Krishna Manikandan wrote:
> Add bindings for Snapdragon DisplayPort controller driver.
> 
> Signed-off-by: Chandan Uddaraju 
> Signed-off-by: Vara Reddy 
> Signed-off-by: Tanmay Shah 
> Signed-off-by: Kuogee Hsieh 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in V2:
> -Provide details about sel-gpio
> 
> Changes in V4:
> -Provide details about max dp lanes
> -Change the commit text
> 
> Changes in V5:
> -moved dp.txt to yaml file
> 
> Changes in v6:
> - Squash all AUX LUT properties into one pattern Property
> - Make aux-cfg[0-9]-settings properties optional
> - Remove PLL/PHY bindings from DP controller dts
> - Add DP clocks description
> - Remove _clk suffix from clock names
> - Rename pixel clock to stream_pixel
> - Remove redundant bindings (GPIO, PHY, HDCP clock, etc..)
> - Fix indentation
> - Add Display Port as interface of DPU in DPU bindings
>   and add port mapping accordingly.
> 
> Chages in v7:
> - Add dp-controller.yaml file common between multiple SOC
> - Rename dp-sc7180.yaml to dp-controller-sc7180.yaml
> - change compatible string and add SOC name to it.
> - Remove Root clock generator for pixel clock
> - Add assigned-clocks and assigned-clock-parents bindings
> - Remove redundant properties, descriptions and blank lines
> - Add DP port in DPU bindings
> - Update depends-on tag in commit message and rebase change accordingly
> 
> Changes in v8:
> - Add MDSS AHB clock in bindings
> 
> Changes in v9:
> - Remove redundant reg-name property
> - Change assigned-clocks and assigned-clocks-parents counts to 2
> - Use IRQ flags in example dts
> 
> Changes in v10:
> - Change title of this patch as it does not contain PLL bindings anymore
> - Remove redundant properties
> - Remove use of IRQ flag
> - Fix ports property
> 
> Changes in v11:
> - add ports required of both #address-cells and  #size-cells
> - add required operating-points-v2
> - add required #sound-dai-cells
> - add required power-domains
> - update maintainer list
> 
> Changes in v12:
> - remove soc node from examples (Stephen Boyd)
> - split dpu-sc7180.yaml changes to separate patch (Stephen Boyd)
> 
> Changes in v13:
> - add assigned-clocks
> - add assigned-clock-parents
> 
> Changes in v14:
> - add reference for ports (Rob Herring)
> 
> Changes in v15:
> - drop common properties from ports (Rob Herring)
> ---
>  .../bindings/display/msm/dp-controller.yaml| 146 
> +
>  1 file changed, 146 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> 

Reviewed-by: Rob Herring

Re: [PATCH v17 2/4] dt-bindings: msm: dsi: add yaml schemas for DSI bindings

2021-05-21 Thread Rob Herring

On Fri, May 21, 2021 at 03:57:22PM +0530, Krishna Manikandan wrote:
> Add YAML schema for the device tree bindings for DSI
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v1:
> - Separate dsi controller bindings to a separate patch (Stephen Boyd)
> - Merge dsi-common-controller.yaml and dsi-controller-main.yaml to
>   a single file (Stephen Boyd)
> - Drop supply entries and definitions from properties (Stephen Boyd)
> - Modify phy-names property for dsi controller (Stephen Boyd)
> - Remove boolean from description (Stephen Boyd)
> - Drop pinctrl properties as they are standard entries (Stephen Boyd)
> - Modify the description for ports property and keep the reference
>   to the generic binding where this is defined (Stephen Boyd)
> - Add description to clock names (Stephen Boyd)
> - Correct the indendation (Stephen Boyd)
> - Drop the label for display dt nodes and correct the node
>   name (Stephen Boyd)
> 
> Changes in v2:
> - Drop maxItems for clock (Stephen Boyd)
> - Drop qcom,mdss-mdp-transfer-time-us as it is not used in upstream
>   dt file (Stephen Boyd)
> - Keep child node directly under soc node (Stephen Boyd)
> - Drop qcom,sync-dual-dsi as it is not used in upstream dt
> 
> Changes in v3:
> - Add description for register property (Stephen Boyd)
> 
> Changes in v4:
> - Add maxItems for phys property (Stephen Boyd)
> - Add maxItems for reg property (Stephen Boyd)
> - Add reference for data-lanes property (Stephen Boyd)
> - Remove soc from example (Stephen Boyd)
> 
> Changes in v5:
> - Modify title and description (Stephen Boyd)
> - Add required properties for ports node (Stephen Boyd)
> - Add data-lanes in the example (Stephen Boyd)
> - Drop qcom,master-dsi property (Stephen Boyd)
> 
> Changes in v6:
> - Add required properties for port@0, port@1 and corresponding
>   endpoints (Stephen Boyd)
> - Add address-cells and size-cells for ports (Stephen Boyd)
> - Use additionalProperties instead of unevaluatedProperties (Stephen Boyd)
> 
> Changes in v7:
> - Add reference for ports and data-lanes (Rob Herring)
> - Add maxItems and minItems for data-lanes (Rob Herring)
> 
> Changes in v8:
> - Drop common properties and description from ports (Rob Herring)
> - Add reference for endpoint (Rob Herring)
> - Add correct reference for data-lanes (Rob Herring)
> - Drop common properties from required list for ports (Rob Herring)

I think I said it before, but move the changelogs in this series after 
the '---' so they are dropped when committed. That's not the usual DRM 
way, but this case has particularly useless revision history to capture.

> ---
>  .../bindings/display/msm/dsi-controller-main.yaml  | 185 +++
>  .../devicetree/bindings/display/msm/dsi.txt| 249 
> -
>  2 files changed, 185 insertions(+), 249 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/msm/dsi.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml 
> b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
> new file mode 100644
> index 000..df6a750
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
> @@ -0,0 +1,185 @@
> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dsi-controller-main.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm Display DSI controller
> +
> +maintainers:
> +  - Krishna Manikandan 
> +
> +allOf:
> +  - $ref: "../dsi-controller.yaml#"
> +
> +properties:
> +  compatible:
> +items:
> +  - const: qcom,mdss-dsi-ctrl
> +
> +  reg:
> +maxItems: 1
> +
> +  reg-names:
> +const: dsi_ctrl
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: Display byte clock
> +  - description: Display byte interface clock
> +  - description: Display pixel clock
> +  - description: Display escape clock
> +  - description: Display AHB clock
> +  - description: Display AXI clock
> +
> +  clock-names:
> +items:
> +  - const: byte
> +  - const: byte_intf
> +  - const: pixel
> +  - const: core
> +  - const: iface
> +  - const: bus
> +
> +  phys:
> +maxItems: 1
> +
> +  phy-names:
> +const: dsi
> +
> +  "#address-cells": true
> +
> +  "#size-cells": true
> +
> +  syscon-sfpb:
> +description: A phandle to mmss_sfpb syscon node (only for DSIv2).
> +$ref: "/schemas/types.yaml#/definitions/phandle"
> +
> +  qcom,dual-dsi-mode:
> +type: boolean
> +description: |
> +  Indicates if the DSI controller is driving a panel which needs
> +  2 DSI links.
> +
> +  power-domains:
> +maxItems:

Re: [Mesa-dev] [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 8:08 PM Christian König
 wrote:
>
> Am 21.05.21 um 17:16 schrieb Daniel Vetter:
> > On Fri, May 21, 2021 at 05:00:46PM +0200, Bas Nieuwenhuizen wrote:
> >> On Fri, May 21, 2021 at 4:37 PM Daniel Vetter  wrote:
> >>> On Fri, May 21, 2021 at 11:46:23AM +0200, Bas Nieuwenhuizen wrote:
>  On Fri, May 21, 2021 at 11:10 AM Daniel Vetter  
>  wrote:
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 88a24a0b5691..cc8426e1e8a8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -617,8 +617,8 @@ static int amdgpu_cs_parser_bos(struct 
> > amdgpu_cs_parser *p,
> >  amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> >  struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
> >
> > -   /* Make sure we use the exclusive slot for shared BOs */
> > -   if (bo->prime_shared_count)
> > +   /* Make sure we use the exclusive slot for all 
> > potentially shared BOs */
> > +   if (!(bo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
> >  e->tv.num_shared = 0;
>  I think it also makes sense to skip this with
>  AMDGPU_GEM_CREATE_EXPLICIT_SYNC? It can be shared but I don't think
>  anyone expects implicit sync to happen with those.
> >>> Ah yes, I missed this entirely. So the "no implicit flag" is already
> >>> there, and the _only_ thing that's missing really is a way to fish out the
> >>> implicit fences, and set them.
> >>>
> >>> https://lore.kernel.org/dri-devel/20210520190007.534046-1-ja...@jlekstrand.net/
> >>>
> >>> So I think all that's really needed in radv is not setting
> >>> RADEON_FLAG_IMPLICIT_SYNC for winsys buffers when Jason's dma-buf ioctl
> >>> are present (means you need to do some import/export and keep the fd
> >>> around for winsys buffers, but shouldn't be too bad), and then control the
> >>> implicit fences entirely explicitly like vk expects.
> >> That is the part I'm less sure about. This is a BO wide flag so we are
> >> also disabling implicit sync in the compositor. If the compositor does
> >> only do read stuff that is ok, as the inserted exclusive fence will
> >> work for that. But as I learned recently the app provided buffer may
> >> end up being written to by the X server which open a whole can of
> >> potential problems if implicit sync gets disabled between Xserver
> >> operations on the app provided buffer. Hence setting that on the WSI
> >> buffer is a whole new can of potential problems and hence I've said a
> >> submission based flag would be preferred.
> >>
> >> I can certainly try it out though.
> > Hm yeah that's the wrong flag. We need a flag on the drm_file which the
> > explicit userspace sets. And which is valid only for itself.
> >
> > There's a nice flags field when creating a ctx, but it's not validated and
> > there's already a comment that we have to filter out garbage priority, so
> > that's not use. I'll whip up something entirely untested just as a draft.
>
> We could provide an IOCTL for the BO to change the flag.

That's not the semantics we need.

> But could we first figure out the semantics we want to use here?
>
> Cause I'm pretty sure we don't actually need those changes at all and as
> said before I'm certainly NAKing things which break existing use cases.

Please read how other drivers do this and at least _try_ to understand
it. I'm really loosing my patience here with you NAKing patches you're
not even understanding (or did you actually read and fully understand
the entire story I typed up here, and your NAK is on the entire
thing?). There's not much useful conversation to be had with that
approach. And with drivers I mean kernel + userspace here.

That's the other frustration part: You're trying to fix this purely in
the kernel. This is exactly one of these issues why we require open
source userspace, so that we can fix the issues correctly across the
entire stack. And meanwhile you're steadfastily refusing to even look
at that the userspace side of the picture.

Also I thought through your tlb issue, why are you even putting these
tlb flush fences into the shard dma_resv slots? If you store them
somewhere else in the amdgpu private part, the oversync issues goes
away
- in your ttm bo move callback, you can just make your bo copy job
depend on them too (you have to anyway)
- even for p2p there's not an issue here, because you have the
->move_notify callback, and can then lift the tlb flush fences from
your private place to the shared slots so the exporter can see them.

The kernel move fences otoh are a bit more nasty to wring through the
p2p dma-buf interface. That one probably needs something new.
-Daniel

>
>

Re: [PATCH v17 3/4] dt-bindings: msm: dsi: add yaml schemas for DSI PHY bindings

2021-05-21 Thread Rob Herring

On Fri, 21 May 2021 15:57:23 +0530, Krishna Manikandan wrote:
> Add YAML schema for the device tree bindings for DSI PHY.
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v1:
>- Merge dsi-phy.yaml and dsi-phy-10nm.yaml (Stephen Boyd)
>- Remove qcom,dsi-phy-regulator-ldo-mode (Stephen Boyd)
>- Add clock cells properly (Stephen Boyd)
>- Remove unnecessary decription from clock names (Stephen Boyd)
>- Add pin names for the supply entries for 10nm phy which is
>  used in sc7180 and sdm845 (Stephen Boyd)
>- Remove unused header files from examples (Stephen Boyd)
>- Drop labels for display nodes and correct node name (Stephen Boyd)
> 
> Changes in v2:
>- Drop maxItems for clock (Stephen Boyd)
>- Add vdds supply pin information for sdm845 (Stephen Boyd)
>- Add examples for 14nm, 20nm and 28nm phy yaml files (Stephen Boyd)
>- Keep child nodes directly under soc node (Stephen Boyd)
> 
> Changes in v3:
>- Use a separate yaml file to describe the common properties
>  for all the dsi phy versions (Stephen Boyd)
>- Remove soc from examples (Stephen Boyd)
>- Add description for register property
> 
> Changes in v4:
>- Modify the title for all the phy versions (Stephen Boyd)
>- Drop description for all the phy versions (Stephen Boyd)
>- Modify the description for register property (Stephen Boyd)
> 
> Changes in v5:
>- Remove unused properties from common dsi phy file
>- Add clock-cells and phy-cells to required property
>  list (Stephen Boyd)
> 
> Changes in v6:
>- Add proper compatible string in example
> ---
>  .../bindings/display/msm/dsi-phy-10nm.yaml | 68 +
>  .../bindings/display/msm/dsi-phy-14nm.yaml | 66 
>  .../bindings/display/msm/dsi-phy-20nm.yaml | 71 
> ++
>  .../bindings/display/msm/dsi-phy-28nm.yaml | 68 +
>  .../bindings/display/msm/dsi-phy-common.yaml   | 40 
>  5 files changed, 313 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-20nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-common.yaml
> 


Please add Acked-by/Reviewed-by tags when posting new versions. However,
there's no need to repost patches *only* to add the tags. The upstream
maintainer will do that for acks received on the version they apply.

If a tag was not added on purpose, please state why and what changed.

Re: [PATCH v17 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-21 Thread Rob Herring

On Fri, 21 May 2021 15:57:21 +0530, Krishna Manikandan wrote:
> MSM Mobile Display Subsystem (MDSS) encapsulates sub-blocks
> like DPU display controller, DSI etc. Add YAML schema
> for DPU device tree bindings.
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v2:
> - Changed dpu to DPU (Sam Ravnborg)
> - Fixed indentation issues (Sam Ravnborg)
> - Added empty line between different properties (Sam Ravnborg)
> - Replaced reference txt files with  their corresponding
>   yaml files (Sam Ravnborg)
> - Modified the file to use "|" only when it is
>   necessary (Sam Ravnborg)
> 
> Changes in v3:
> - Corrected the license used (Rob Herring)
> - Added maxItems for properties (Rob Herring)
> - Dropped generic descriptions (Rob Herring)
> - Added ranges property (Rob Herring)
> - Corrected the indendation (Rob Herring)
> - Added additionalProperties (Rob Herring)
> - Split dsi file into two, one for dsi controller
>   and another one for dsi phy per target (Rob Herring)
> - Corrected description for pinctrl-names (Rob Herring)
> - Corrected the examples used in yaml file (Rob Herring)
> - Delete dsi.txt and dpu.txt (Rob Herring)
> 
> Changes in v4:
> - Move schema up by one level (Rob Herring)
> - Add patternProperties for mdp node (Rob Herring)
> - Corrected description of some properties (Rob Herring)
> 
> Changes in v5:
> - Correct the indentation (Rob Herring)
> - Remove unnecessary description from properties (Rob Herring)
> - Correct the number of interconnect entries (Rob Herring)
> - Add interconnect names for sc7180 (Rob Herring)
> - Add description for ports (Rob Herring)
> - Remove common properties (Rob Herring)
> - Add unevalutatedProperties (Rob Herring)
> - Reference existing dsi controller yaml in the common
>   dsi controller file (Rob Herring)
> - Correct the description of clock names to include only the
>   clocks that are required (Rob Herring)
> - Remove properties which are already covered under the common
>   binding (Rob Herring)
> - Add dsi phy supply nodes which are required for sc7180 and
>   sdm845 targets (Rob Herring)
> - Add type ref for syscon-sfpb (Rob Herring)
> 
> Changes in v6:
> - Fixed errors during dt_binding_check (Rob Herring)
> - Add maxItems for phys and phys-names (Rob Herring)
> - Use unevaluatedProperties wherever required (Rob Herring)
> - Removed interrupt controller from required properties for
>   dsi controller (Rob Herring)
> - Add constraints for dsi-phy reg-names based on the compatible
>   phy version (Rob Herring)
> - Add constraints for dsi-phy supply nodes based on the
>   compatible phy version (Rob Herring)
> 
> Changes in v7:
> - Add default value for qcom,mdss-mdp-transfer-time-us (Rob Herring)
> - Modify the schema for data-lanes (Rob Herring)
> - Split the phy schema into separate schemas based on
>   the phy version (Rob Herring)
> 
> Changes in v8:
> - Resolve merge conflicts with latest dsi.txt file
> - Include dp yaml change also in the same series
> 
> Changes in v9:
> - Combine target specific dsi controller yaml files
>   to a single yaml file (Rob Herring)
> - Combine target specific dsi phy yaml files into a
>   single yaml file (Rob Herring)
> - Use unevaluatedProperties and additionalProperties
>   wherever required
> - Remove duplicate properties from common yaml files
> 
> Changes in v10:
> - Split the patch into separate patches for DPU, DSI and
>   PHY (Stephen Boyd)
> - Drop unnecessary fullstop (Stephen Boyd)
> - Add newline whereever required (Stephen Boyd)
> - Add description for clock used (Stephen Boyd)
> - Modify the description for interconnect entries  (Stephen Boyd)
> - Drop assigned clock entries as it a generic property (Stephen Boyd)
> - Correct the definition for interrupts (Stephen Boyd)
> - Drop clock names from required properties (Stephen Boyd)
> - Drop labels for display nodes from example (Stephen Boyd)
> - Drop flags from interrupts entries (Stephen Boyd)
> 
> Changes in v11:
> - Drop maxItems for clocks (Stephen Boyd)
> 
> Changes in v12:
> - Add description for register property (Stephen Boyd)
> - Add maxItems for interrupts (Stephen Boyd)
> - Add description for iommus property (Stephen Boyd)
> - Add description for interconnects (Stephen Boyd)
> - Change display node name to display_controller (Stephen Boyd)
> 
> Changes in v13:
> - Add maxItems for reg property (Stephen Boyd)
> - Add ranges property in example (Stephen Boyd)
> - Modify description for iommus property (Stephen Boyd)
> - Add Dp bindings for ports in the same patch (Stephen Boyd)
> - Remove soc from examples and change address and size cells
>   accordingly (Stephen Boyd)
> - Add reference for ports
> 
> Changes

ttm_resource_manager::use_tt

2021-05-21 Thread Zack Rusin


Hi, Christian.

I was just going over some old patches and I was just looking at your series 
introducing use_tt:
https://patchwork.freedesktop.org/series/80078/ and the change 
https://patchwork.freedesktop.org/patch/382079/?series=80078=1

Do you happen to remember what was the worry behind the /* TODO: This is most 
likely not correct */ in vmwgfx_ttm_buffer.c? I'm trying to figure out if we 
need to address it.

z

[PATCH 1/3] drm/i915/gt: Move engine setup out of set_default_submission

2021-05-21 Thread Matthew Brost

From: Chris Wilson 

Now that we no longer switch back and forth between guc and execlists,
we no longer need to restore the backend's vfunc and can leave them set
after initialisation. The only catch is that we lose the submission on
wedging and still need to reset the submit_request vfunc on unwedging.

Signed-off-by: Matthew Brost 
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Matthew Brost 
---
 .../drm/i915/gt/intel_execlists_submission.c  | 46 -
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  4 --
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 50 ---
 3 files changed, 44 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index de124870af44..1108c193ab65 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3076,29 +3076,6 @@ static void execlists_set_default_submission(struct 
intel_engine_cs *engine)
engine->submit_request = execlists_submit_request;
engine->schedule = i915_schedule;
engine->execlists.tasklet.callback = execlists_submission_tasklet;
-
-   engine->reset.prepare = execlists_reset_prepare;
-   engine->reset.rewind = execlists_reset_rewind;
-   engine->reset.cancel = execlists_reset_cancel;
-   engine->reset.finish = execlists_reset_finish;
-
-   engine->park = execlists_park;
-   engine->unpark = NULL;
-
-   engine->flags |= I915_ENGINE_SUPPORTS_STATS;
-   if (!intel_vgpu_active(engine->i915)) {
-   engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
-   if (can_preempt(engine)) {
-   engine->flags |= I915_ENGINE_HAS_PREEMPTION;
-   if (IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
-   engine->flags |= I915_ENGINE_HAS_TIMESLICES;
-   }
-   }
-
-   if (intel_engine_has_preemption(engine))
-   engine->emit_bb_start = gen8_emit_bb_start;
-   else
-   engine->emit_bb_start = gen8_emit_bb_start_noarb;
 }
 
 static void execlists_shutdown(struct intel_engine_cs *engine)
@@ -3129,6 +3106,14 @@ logical_ring_default_vfuncs(struct intel_engine_cs 
*engine)
engine->cops = _context_ops;
engine->request_alloc = execlists_request_alloc;
 
+   engine->reset.prepare = execlists_reset_prepare;
+   engine->reset.rewind = execlists_reset_rewind;
+   engine->reset.cancel = execlists_reset_cancel;
+   engine->reset.finish = execlists_reset_finish;
+
+   engine->park = execlists_park;
+   engine->unpark = NULL;
+
engine->emit_flush = gen8_emit_flush_xcs;
engine->emit_init_breadcrumb = gen8_emit_init_breadcrumb;
engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_xcs;
@@ -3149,6 +3134,21 @@ logical_ring_default_vfuncs(struct intel_engine_cs 
*engine)
 * until a more refined solution exists.
 */
}
+
+   engine->flags |= I915_ENGINE_SUPPORTS_STATS;
+   if (!intel_vgpu_active(engine->i915)) {
+   engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
+   if (can_preempt(engine)) {
+   engine->flags |= I915_ENGINE_HAS_PREEMPTION;
+   if (IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+   engine->flags |= I915_ENGINE_HAS_TIMESLICES;
+   }
+   }
+
+   if (intel_engine_has_preemption(engine))
+   engine->emit_bb_start = gen8_emit_bb_start;
+   else
+   engine->emit_bb_start = gen8_emit_bb_start_noarb;
 }
 
 static void logical_ring_default_irqs(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 9585546556ee..5f4f7f1df48f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -989,14 +989,10 @@ static void gen6_bsd_submit_request(struct i915_request 
*request)
 static void i9xx_set_default_submission(struct intel_engine_cs *engine)
 {
engine->submit_request = i9xx_submit_request;
-
-   engine->park = NULL;
-   engine->unpark = NULL;
 }
 
 static void gen6_bsd_set_default_submission(struct intel_engine_cs *engine)
 {
-   i9xx_set_default_submission(engine);
engine->submit_request = gen6_bsd_submit_request;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 92688a9b6717..f72faa0b8339 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -608,35 +608,6 @@ static int guc_resume(struct intel_engine_cs *engine)
 static void guc_set_default_submission(struct intel_engine_cs *engine)
 {
engine->submit_request = guc_submit_request;
-

[PATCH 2/3] drm/i915/gt: Move submission_method into intel_gt

2021-05-21 Thread Matthew Brost

From: Chris Wilson 

Since we setup the submission method for the engines once, it is easy to
assign an enum and use that instead of probing into the backends.

Signed-off-by: Matthew Brost 
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_engine.h   |  8 +++-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 12 
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c |  8 
 drivers/gpu/drm/i915/gt/intel_execlists_submission.h |  3 ---
 drivers/gpu/drm/i915/gt/intel_gt_types.h |  7 +++
 drivers/gpu/drm/i915/gt/intel_reset.c|  7 +++
 drivers/gpu/drm/i915/gt/selftest_execlists.c |  2 +-
 drivers/gpu/drm/i915/gt/selftest_ring_submission.c   |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c|  5 -
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h|  1 -
 drivers/gpu/drm/i915/i915_perf.c | 10 +-
 11 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 47ee8578e511..8d9184920c51 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -13,8 +13,9 @@
 #include "i915_reg.h"
 #include "i915_request.h"
 #include "i915_selftest.h"
-#include "gt/intel_timeline.h"
 #include "intel_engine_types.h"
+#include "intel_gt_types.h"
+#include "intel_timeline.h"
 #include "intel_workarounds.h"
 
 struct drm_printer;
@@ -262,6 +263,11 @@ void intel_engine_init_active(struct intel_engine_cs 
*engine,
 #define ENGINE_MOCK1
 #define ENGINE_VIRTUAL 2
 
+static inline bool intel_engine_uses_guc(const struct intel_engine_cs *engine)
+{
+   return engine->gt->submission_method >= INTEL_SUBMISSION_GUC;
+}
+
 static inline bool
 intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index eba2da9679a5..e54a2a4df87c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -909,12 +909,16 @@ int intel_engines_init(struct intel_gt *gt)
enum intel_engine_id id;
int err;
 
-   if (intel_uc_uses_guc_submission(>uc))
+   if (intel_uc_uses_guc_submission(>uc)) {
+   gt->submission_method = INTEL_SUBMISSION_GUC;
setup = intel_guc_submission_setup;
-   else if (HAS_EXECLISTS(gt->i915))
+   } else if (HAS_EXECLISTS(gt->i915)) {
+   gt->submission_method = INTEL_SUBMISSION_ELSP;
setup = intel_execlists_submission_setup;
-   else
+   } else {
+   gt->submission_method = INTEL_SUBMISSION_RING;
setup = intel_ring_submission_setup;
+   }
 
for_each_engine(engine, gt, id) {
err = engine_setup_common(engine);
@@ -1479,7 +1483,7 @@ static void intel_engine_print_registers(struct 
intel_engine_cs *engine,
drm_printf(m, "\tIPEHR: 0x%08x\n", ENGINE_READ(engine, IPEHR));
}
 
-   if (intel_engine_in_guc_submission_mode(engine)) {
+   if (intel_engine_uses_guc(engine)) {
/* nothing to print yet */
} else if (HAS_EXECLISTS(dev_priv)) {
struct i915_request * const *port, *rq;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1108c193ab65..9d2da5ccaef6 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1768,7 +1768,6 @@ process_csb(struct intel_engine_cs *engine, struct 
i915_request **inactive)
 */
GEM_BUG_ON(!tasklet_is_locked(>tasklet) &&
   !reset_in_progress(execlists));
-   GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
 
/*
 * Note that csb_write, csb_status may be either in HWSP or mmio.
@@ -3884,13 +3883,6 @@ void intel_execlists_show_requests(struct 
intel_engine_cs *engine,
spin_unlock_irqrestore(>active.lock, flags);
 }
 
-bool
-intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
-{
-   return engine->set_default_submission ==
-  execlists_set_default_submission;
-}
-
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_execlists.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
index fd61dae820e9..4ca9b475e252 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -43,7 +43,4 @@ int intel_virtual_engine_attach_bond(struct intel_engine_cs 
*engine,
 const struct intel_engine_cs *master,
 const struct intel_engine_cs *sibling);

[PATCH 0/3] Clean a few backend interfaces in the i915

2021-05-21 Thread Matthew Brost

As discussed in [1] start merging some support patches as a precursor to
GuC submission the i915. This is step #1 mentioned in [1].

[1] https://patchwork.freedesktop.org/series/89844/

Signed-off-by: Matthew Brost 

Chris Wilson (3):
  drm/i915/gt: Move engine setup out of set_default_submission
  drm/i915/gt: Move submission_method into intel_gt
  drm/i915/gt: Move CS interrupt handler to the backend

 drivers/gpu/drm/i915/gt/intel_engine.h|  8 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 19 +++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 14 +--
 .../drm/i915/gt/intel_execlists_submission.c  | 95 +--
 .../drm/i915/gt/intel_execlists_submission.h  |  3 -
 drivers/gpu/drm/i915/gt/intel_gt_irq.c| 82 +---
 drivers/gpu/drm/i915/gt/intel_gt_irq.h| 23 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |  7 ++
 drivers/gpu/drm/i915/gt/intel_reset.c |  7 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 12 ++-
 drivers/gpu/drm/i915/gt/intel_rps.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  2 +-
 .../drm/i915/gt/selftest_ring_submission.c|  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 64 ++---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  1 -
 drivers/gpu/drm/i915/i915_irq.c   | 10 +-
 drivers/gpu/drm/i915/i915_perf.c  | 10 +-
 17 files changed, 199 insertions(+), 162 deletions(-)

-- 
2.28.0

[PATCH 3/3] drm/i915/gt: Move CS interrupt handler to the backend

2021-05-21 Thread Matthew Brost

From: Chris Wilson 

The different submission backends each have their own preferred
behaviour and interrupt setup. Let each handle their own interrupts.

This becomes more useful later as we to extract the use of auxiliary
state in the interrupt handler that is backend specific.

Signed-off-by: Matthew Brost 
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  7 ++
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 14 +---
 .../drm/i915/gt/intel_execlists_submission.c  | 41 ++
 drivers/gpu/drm/i915/gt/intel_gt_irq.c| 82 ++-
 drivers/gpu/drm/i915/gt/intel_gt_irq.h| 23 ++
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  8 ++
 drivers/gpu/drm/i915/gt/intel_rps.c   |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 ++-
 drivers/gpu/drm/i915/i915_irq.c   | 10 ++-
 9 files changed, 124 insertions(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index e54a2a4df87c..3f9a811eb02b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -255,6 +255,11 @@ static void intel_engine_sanitize_mmio(struct 
intel_engine_cs *engine)
intel_engine_set_hwsp_writemask(engine, ~0u);
 }
 
+static void nop_irq_handler(struct intel_engine_cs *engine, u16 iir)
+{
+   GEM_DEBUG_WARN_ON(iir);
+}
+
 static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 {
const struct engine_info *info = _engines[id];
@@ -292,6 +297,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
engine->hw_id = info->hw_id;
engine->guc_id = MAKE_GUC_ID(info->class, info->instance);
 
+   engine->irq_handler = nop_irq_handler;
+
engine->class = info->class;
engine->instance = info->instance;
__sprint_engine_name(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 883bafc44902..9ef349cd5cea 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -402,6 +402,7 @@ struct intel_engine_cs {
u32 irq_enable_mask; /* bitmask to enable ring interrupt */
void(*irq_enable)(struct intel_engine_cs *engine);
void(*irq_disable)(struct intel_engine_cs *engine);
+   void(*irq_handler)(struct intel_engine_cs *engine, u16 iir);
 
void(*sanitize)(struct intel_engine_cs *engine);
int (*resume)(struct intel_engine_cs *engine);
@@ -481,10 +482,9 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 #define I915_ENGINE_HAS_TIMESLICES   BIT(4)
-#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(5)
-#define I915_ENGINE_IS_VIRTUAL   BIT(6)
-#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(7)
-#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(8)
+#define I915_ENGINE_IS_VIRTUAL   BIT(5)
+#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6)
+#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
unsigned int flags;
 
/*
@@ -593,12 +593,6 @@ intel_engine_has_timeslices(const struct intel_engine_cs 
*engine)
return engine->flags & I915_ENGINE_HAS_TIMESLICES;
 }
 
-static inline bool
-intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
-{
-   return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
-}
-
 static inline bool
 intel_engine_is_virtual(const struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 9d2da5ccaef6..8db200422950 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -118,6 +118,7 @@
 #include "intel_engine_stats.h"
 #include "intel_execlists_submission.h"
 #include "intel_gt.h"
+#include "intel_gt_irq.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
 #include "intel_lrc.h"
@@ -2384,6 +2385,45 @@ static void execlists_submission_tasklet(struct 
tasklet_struct *t)
rcu_read_unlock();
 }
 
+static void execlists_irq_handler(struct intel_engine_cs *engine, u16 iir)
+{
+   bool tasklet = false;
+
+   if (unlikely(iir & GT_CS_MASTER_ERROR_INTERRUPT)) {
+   u32 eir;
+
+   /* Upper 16b are the enabling mask, rsvd for internal errors */
+   eir = ENGINE_READ(engine, RING_EIR) & GENMASK(15, 0);
+   ENGINE_TRACE(engine, "CS error: %x\n", eir);
+
+   /* Disable the error interrupt until after the reset */
+   if (likely(eir)) {
+   ENGINE_WRITE(engine, RING_EMR, ~0u);
+   ENGINE_WRITE(engine, RING_EIR, eir);
+

Re: [PATCH 1/4] dma-buf: add dma_fence_array_for_each (v2)

2021-05-21 Thread Christian König


Am 21.05.21 um 18:27 schrieb Jason Ekstrand:

On Fri, May 21, 2021 at 2:51 AM Christian König
 wrote:

Am 20.05.21 um 21:00 schrieb Jason Ekstrand:

From: Christian König 

Add a helper to iterate over all fences in a dma_fence_array object.

v2 (Jason Ekstrand)
   - Return NULL from dma_fence_array_first if head == NULL.  This matches
 the iterator behavior of dma_fence_chain_for_each in that it iterates
 zero times if head == NULL.
   - Return NULL from dma_fence_array_next if index > array->num_fences.

Signed-off-by: Jason Ekstrand 

Reviewed-by: Christian König 

BTW: I'm only seeing this patch from the series. Looks like somehow the
rest didn't made it into my inbox.

https://lists.freedesktop.org/archives/dri-devel/2021-May/307561.html

Not sure why it didn't make your mail.  This one was CC'd to you
because you're the author from a year ago or something.


Yeah, feel free to add an Acked-by on exporting the fences to the 
sync_file part.


But I think we really need to untangle the resource management from the 
implicit sync handling before the importing fence part can land.


Regards,
Christian.



--Jason



---
   drivers/dma-buf/dma-fence-array.c | 27 +++
   include/linux/dma-fence-array.h   | 17 +
   2 files changed, 44 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-array.c 
b/drivers/dma-buf/dma-fence-array.c
index d3fbd950be944..2ac1afc697d0f 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -201,3 +201,30 @@ bool dma_fence_match_context(struct dma_fence *fence, u64 
context)
   return true;
   }
   EXPORT_SYMBOL(dma_fence_match_context);
+
+struct dma_fence *dma_fence_array_first(struct dma_fence *head)
+{
+ struct dma_fence_array *array;
+
+ if (!head)
+ return NULL;
+
+ array = to_dma_fence_array(head);
+ if (!array)
+ return head;
+
+ return array->fences[0];
+}
+EXPORT_SYMBOL(dma_fence_array_first);
+
+struct dma_fence *dma_fence_array_next(struct dma_fence *head,
+unsigned int index)
+{
+ struct dma_fence_array *array = to_dma_fence_array(head);
+
+ if (!array || index >= array->num_fences)
+ return NULL;
+
+ return array->fences[index];
+}
+EXPORT_SYMBOL(dma_fence_array_next);
diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-fence-array.h
index 303dd712220fd..588ac8089dd61 100644
--- a/include/linux/dma-fence-array.h
+++ b/include/linux/dma-fence-array.h
@@ -74,6 +74,19 @@ to_dma_fence_array(struct dma_fence *fence)
   return container_of(fence, struct dma_fence_array, base);
   }

+/**
+ * dma_fence_array_for_each - iterate over all fences in array
+ * @fence: current fence
+ * @index: index into the array
+ * @head: potential dma_fence_array object
+ *
+ * Test if @array is a dma_fence_array object and if yes iterate over all 
fences
+ * in the array. If not just iterate over the fence in @array itself.
+ */
+#define dma_fence_array_for_each(fence, index, head) \
+ for (index = 0, fence = dma_fence_array_first(head); fence; \
+  ++(index), fence = dma_fence_array_next(head, index))
+
   struct dma_fence_array *dma_fence_array_create(int num_fences,
  struct dma_fence **fences,
  u64 context, unsigned seqno,
@@ -81,4 +94,8 @@ struct dma_fence_array *dma_fence_array_create(int num_fences,

   bool dma_fence_match_context(struct dma_fence *fence, u64 context);

+struct dma_fence *dma_fence_array_first(struct dma_fence *head);
+struct dma_fence *dma_fence_array_next(struct dma_fence *head,
+unsigned int index);
+
   #endif /* __LINUX_DMA_FENCE_ARRAY_H */

Re: [Mesa-dev] [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Christian König


Am 21.05.21 um 17:16 schrieb Daniel Vetter:

On Fri, May 21, 2021 at 05:00:46PM +0200, Bas Nieuwenhuizen wrote:

On Fri, May 21, 2021 at 4:37 PM Daniel Vetter  wrote:

On Fri, May 21, 2021 at 11:46:23AM +0200, Bas Nieuwenhuizen wrote:

On Fri, May 21, 2021 at 11:10 AM Daniel Vetter  wrote:

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 88a24a0b5691..cc8426e1e8a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -617,8 +617,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 amdgpu_bo_list_for_each_entry(e, p->bo_list) {
 struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);

-   /* Make sure we use the exclusive slot for shared BOs */
-   if (bo->prime_shared_count)
+   /* Make sure we use the exclusive slot for all potentially 
shared BOs */
+   if (!(bo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
 e->tv.num_shared = 0;

I think it also makes sense to skip this with
AMDGPU_GEM_CREATE_EXPLICIT_SYNC? It can be shared but I don't think
anyone expects implicit sync to happen with those.

Ah yes, I missed this entirely. So the "no implicit flag" is already
there, and the _only_ thing that's missing really is a way to fish out the
implicit fences, and set them.

https://lore.kernel.org/dri-devel/20210520190007.534046-1-ja...@jlekstrand.net/

So I think all that's really needed in radv is not setting
RADEON_FLAG_IMPLICIT_SYNC for winsys buffers when Jason's dma-buf ioctl
are present (means you need to do some import/export and keep the fd
around for winsys buffers, but shouldn't be too bad), and then control the
implicit fences entirely explicitly like vk expects.

That is the part I'm less sure about. This is a BO wide flag so we are
also disabling implicit sync in the compositor. If the compositor does
only do read stuff that is ok, as the inserted exclusive fence will
work for that. But as I learned recently the app provided buffer may
end up being written to by the X server which open a whole can of
potential problems if implicit sync gets disabled between Xserver
operations on the app provided buffer. Hence setting that on the WSI
buffer is a whole new can of potential problems and hence I've said a
submission based flag would be preferred.

I can certainly try it out though.

Hm yeah that's the wrong flag. We need a flag on the drm_file which the
explicit userspace sets. And which is valid only for itself.

There's a nice flags field when creating a ctx, but it's not validated and
there's already a comment that we have to filter out garbage priority, so
that's not use. I'll whip up something entirely untested just as a draft.


We could provide an IOCTL for the BO to change the flag.

But could we first figure out the semantics we want to use here?

Cause I'm pretty sure we don't actually need those changes at all and as 
said before I'm certainly NAKing things which break existing use cases.


Regards,
Christian.


-Daniel




Are you bored enough to type this up for radv? I'll give Jason's kernel
stuff another review meanwhile.
-Daniel


 e->bo_va = amdgpu_vm_bo_find(vm, bo);
 }
--
2.31.0


--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v4 06/12] drm/vc4: hdmi: Prevent clock unbalance

2021-05-21 Thread Dave Stevenson

On Fri, 7 May 2021 at 16:05, Maxime Ripard  wrote:
>
> Since we fixed the hooks to disable the encoder at boot, we now have an
> unbalanced clk_disable call at boot since we never enabled them in the
> first place.
>
> Let's mimic the state of the hardware and enable the clocks at boot if
> the controller is enabled to get the use-count right.
>
> Cc:  # v5.10+
> Fixes: 09c438139b8f ("drm/vc4: hdmi: Implement finer-grained hooks")
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index 1fda574579af..9c919472ae84 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -1995,6 +1995,14 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> device *master, void *data)
> if (vc4_hdmi->variant->reset)
> vc4_hdmi->variant->reset(vc4_hdmi);
>
> +   if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
> +of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) &&
> +   HDMI_READ(HDMI_VID_CTL) & VC4_HD_VID_CTL_ENABLE) {
> +   clk_prepare_enable(vc4_hdmi->pixel_clock);
> +   clk_prepare_enable(vc4_hdmi->hsm_clock);
> +   clk_prepare_enable(vc4_hdmi->pixel_bvb_clock);
> +   }
> +
> pm_runtime_enable(dev);
>
> drm_simple_encoder_init(drm, encoder, DRM_MODE_ENCODER_TMDS);
> --
> 2.31.1
>

Re: [PATCH v4 05/12] drm/vc4: crtc: Lookup the encoder from the register at boot

2021-05-21 Thread Dave Stevenson

Hi Maxime

On Fri, 7 May 2021 at 16:05, Maxime Ripard  wrote:
>
> At boot, we can't rely on the vc4_get_crtc_encoder since we don't have a
> state yet and thus will not be able to figure out which connector is
> attached to our CRTC.
>
> However, we have a muxing bit in the CRTC register we can use to get the
> encoder currently connected to the pixelvalve. We can thus read that
> register, lookup the associated register through the vc4_pv_data
> structure, and then pass it to vc4_crtc_disable so that we can perform
> the proper operations.
>
> Fixes: 875a4d536842 ("drm/vc4: drv: Disable the CRTC at boot time")
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_crtc.c | 38 ++
>  1 file changed, 34 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> index 36ea684a349b..f715648f89dd 100644
> --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> @@ -413,11 +413,10 @@ static void require_hvs_enabled(struct drm_device *dev)
>  }
>
>  static int vc4_crtc_disable(struct drm_crtc *crtc,
> +   struct drm_encoder *encoder,
> struct drm_atomic_state *state,
> unsigned int channel)
>  {
> -   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc, state,
> -  
> drm_atomic_get_old_connector_state);
> struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
> struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
> struct drm_device *dev = crtc->dev;
> @@ -458,10 +457,29 @@ static int vc4_crtc_disable(struct drm_crtc *crtc,
> return 0;
>  }
>
> +static struct drm_encoder *vc4_crtc_get_encoder_by_type(struct drm_crtc 
> *crtc,
> +   enum vc4_encoder_type 
> type)
> +{
> +   struct drm_encoder *encoder;
> +
> +   drm_for_each_encoder(encoder, crtc->dev) {
> +   struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
> +
> +   if (vc4_encoder->type == type)
> +   return encoder;
> +   }
> +
> +   return NULL;
> +}
> +
>  int vc4_crtc_disable_at_boot(struct drm_crtc *crtc)
>  {
> struct drm_device *drm = crtc->dev;
> struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
> +   enum vc4_encoder_type encoder_type;
> +   const struct vc4_pv_data *pv_data;
> +   struct drm_encoder *encoder;
> +   unsigned encoder_sel;
> int channel;
>
> if (!(of_device_is_compatible(vc4_crtc->pdev->dev.of_node,
> @@ -480,7 +498,17 @@ int vc4_crtc_disable_at_boot(struct drm_crtc *crtc)
> if (channel < 0)
> return 0;
>
> -   return vc4_crtc_disable(crtc, NULL, channel);
> +   encoder_sel = VC4_GET_FIELD(CRTC_READ(PV_CONTROL), 
> PV_CONTROL_CLK_SELECT);
> +   if (WARN_ON(encoder_sel != 0))
> +   return 0;
> +
> +   pv_data = vc4_crtc_to_vc4_pv_data(vc4_crtc);
> +   encoder_type = pv_data->encoder_types[encoder_sel];
> +   encoder = vc4_crtc_get_encoder_by_type(crtc, encoder_type);
> +   if (WARN_ON(!encoder))
> +   return 0;
> +
> +   return vc4_crtc_disable(crtc, encoder, NULL, channel);
>  }
>
>  static void vc4_crtc_atomic_disable(struct drm_crtc *crtc,
> @@ -489,6 +517,8 @@ static void vc4_crtc_atomic_disable(struct drm_crtc *crtc,
> struct drm_crtc_state *old_state = 
> drm_atomic_get_old_crtc_state(state,
>  
> crtc);
> struct vc4_crtc_state *old_vc4_state = to_vc4_crtc_state(old_state);
> +   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc, state,
> +  
> drm_atomic_get_old_connector_state);
> struct drm_device *dev = crtc->dev;
>
> require_hvs_enabled(dev);
> @@ -496,7 +526,7 @@ static void vc4_crtc_atomic_disable(struct drm_crtc *crtc,
> /* Disable vblank irq handling before crtc is disabled. */
> drm_crtc_vblank_off(crtc);
>
> -   vc4_crtc_disable(crtc, state, old_vc4_state->assigned_channel);
> +   vc4_crtc_disable(crtc, encoder, state, 
> old_vc4_state->assigned_channel);
>
> /*
>  * Make sure we issue a vblank event after disabling the CRTC if
> --
> 2.31.1
>

Re: [PATCH v4 04/12] drm/vc4: crtc: Fix vc4_get_crtc_encoder logic

2021-05-21 Thread Dave Stevenson

Hi Maxime

On Fri, 7 May 2021 at 16:05, Maxime Ripard  wrote:
>
> The vc4_get_crtc_encoder function currently only works when the
> connector->state->crtc pointer is set, which is only true when the
> connector is currently enabled.
>
> However, we use it as part of the disable path as well, and our lookup
> will fail in that case, resulting in it returning a null pointer we
> can't act on.
>
> We can access the connector that used to be connected to that crtc
> though using the old connector state in the disable path.
>
> Since we want to support both the enable and disable path, we can
> support it by passing the state accessor variant as a function pointer,
> together with the atomic state.
>
> Fixes: 792c3132bc1b ("drm/vc4: encoder: Add finer-grained encoder callbacks")
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_crtc.c | 21 -
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> index 8a19d6c76605..36ea684a349b 100644
> --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> @@ -262,14 +262,22 @@ static u32 vc4_crtc_get_fifo_full_level_bits(struct 
> vc4_crtc *vc4_crtc,
>   * allows drivers to push pixels to more than one encoder from the
>   * same CRTC.
>   */
> -static struct drm_encoder *vc4_get_crtc_encoder(struct drm_crtc *crtc)
> +static struct drm_encoder *vc4_get_crtc_encoder(struct drm_crtc *crtc,
> +   struct drm_atomic_state 
> *state,
> +   struct drm_connector_state 
> *(*get_state)(struct drm_atomic_state *state,
> + 
>struct drm_connector *connector))
>  {
> struct drm_connector *connector;
> struct drm_connector_list_iter conn_iter;
>
> drm_connector_list_iter_begin(crtc->dev, _iter);
> drm_for_each_connector_iter(connector, _iter) {
> -   if (connector->state->crtc == crtc) {
> +   struct drm_connector_state *conn_state = get_state(state, 
> connector);
> +
> +   if (!conn_state)
> +   continue;
> +
> +   if (conn_state->crtc == crtc) {
> drm_connector_list_iter_end(_iter);
> return connector->encoder;
> }
> @@ -292,7 +300,8 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, 
> struct drm_atomic_state *s
>  {
> struct drm_device *dev = crtc->dev;
> struct vc4_dev *vc4 = to_vc4_dev(dev);
> -   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc);
> +   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc, state,
> +  
> drm_atomic_get_new_connector_state);
> struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
> struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
> const struct vc4_pv_data *pv_data = vc4_crtc_to_vc4_pv_data(vc4_crtc);
> @@ -407,7 +416,8 @@ static int vc4_crtc_disable(struct drm_crtc *crtc,
> struct drm_atomic_state *state,
> unsigned int channel)
>  {
> -   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc);
> +   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc, state,
> +  
> drm_atomic_get_old_connector_state);
> struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
> struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
> struct drm_device *dev = crtc->dev;
> @@ -507,7 +517,8 @@ static void vc4_crtc_atomic_enable(struct drm_crtc *crtc,
>  {
> struct drm_device *dev = crtc->dev;
> struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
> -   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc);
> +   struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc, state,
> +  
> drm_atomic_get_new_connector_state);
> struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
>
> require_hvs_enabled(dev);
> --
> 2.31.1
>

Re: [PATCH 2/4] dma-buf: add dma_resv_get_singleton_rcu (v4)

2021-05-21 Thread Daniel Vetter

On Thu, May 20, 2021 at 02:00:05PM -0500, Jason Ekstrand wrote:
> Add a helper function to get a single fence representing
> all fences in a dma_resv object.
> 
> This fence is either the only one in the object or all not
> signaled fences of the object in a flatted out dma_fence_array.
> 
> v2 (Jason Ekstrand):
>  - Take reference of fences both for creating the dma_fence_array and in
>the case where we return one fence.
>  - Handle the case where dma_resv_get_list() returns NULL
> 
> v3 (Jason Ekstrand):
>  - Add an _rcu suffix because it is read-only
>  - Rewrite to use dma_resv_get_fences_rcu so it's RCU-safe
>  - Add an EXPORT_SYMBOL_GPL declaration
>  - Re-author the patch to Jason since very little is left of Christian
>König's original patch
>  - Remove the extra fence argument
> 
> v4 (Jason Ekstrand):
>  - Restore the extra fence argument
> 
> Signed-off-by: Jason Ekstrand 
> 
> get_singleton

Spurious thing here.

> ---
>  drivers/dma-buf/dma-resv.c | 122 +
>  include/linux/dma-resv.h   |   3 +
>  2 files changed, 125 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 6ddbeb5dfbf65..25995fc15c370 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -33,6 +33,8 @@
>   */
>  
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -49,6 +51,19 @@
>   * write-side updates.
>   */
>  
> +/**
> + * dma_fence_deep_dive_for_each - deep dive into the fence containers
> + * @fence: resulting fence
> + * @chain: variable for a dma_fence_chain
> + * @index: index into a dma_fence_array
> + * @head: starting point
> + *
> + * Helper to deep dive into the fence containers for flattening them.
> + */
> +#define dma_fence_deep_dive_for_each(fence, chain, index, head)  \
> + dma_fence_chain_for_each(chain, head)   \
> + dma_fence_array_for_each(fence, index, chain)

Since this is is just internal helper in the .c file we generally don't
document it. Maybe small comment if you feel it's worth it.

> +
>  DEFINE_WD_CLASS(reservation_ww_class);
>  EXPORT_SYMBOL(reservation_ww_class);
>  
> @@ -517,6 +532,113 @@ int dma_resv_get_fences_rcu(struct dma_resv *obj,
>  }
>  EXPORT_SYMBOL_GPL(dma_resv_get_fences_rcu);
>  
> +/**
> + * dma_resv_get_singleton - get a single fence for the dma_resv object

Name doesn't match here.

> + * @obj: the reservation object
> + * @extra: extra fence to add to the resulting array
> + * @result: resulting dma_fence
> + *
> + * Get a single fence representing all unsignaled fences in the dma_resv 
> object
> + * plus the given extra fence. If we got only one fence return a new
> + * reference to that, otherwise return a dma_fence_array object.
> + *
> + * RETURNS
> + * Returns -NOMEM if allocations fail, zero otherwise.

Kernel often encodes this in ERR_PTR so that you don't have to pass a
pointer to store the result. Would feel more kerenl-y I think that way. So
no result parameter, and on alloc failure you'd return

return ERR_PTR(-ENOMEM);

> + */
> +int dma_resv_get_singleton_rcu(struct dma_resv *obj, struct dma_fence *extra,

tbh the _rcu here is confusing. I think _unlocked is the better suffix,
maybe we should rename dma_resv_get_fences_rcu too for consistency. The
rcu-ness of the lookup isn't leaked to callers at all, so no point giving
them a panic.

> +struct dma_fence **result)
> +{
> + struct dma_fence **resv_fences, *fence, *chain, **fences;
> + struct dma_fence_array *array;
> + unsigned int num_resv_fences, num_fences;
> + unsigned int ret, i, j;
> +
> + ret = dma_resv_get_fences_rcu(obj, NULL, _resv_fences, 
> _fences);
> + if (ret)
> + return ret;
> +
> + num_fences = 0;
> + *result = NULL;
> +
> + if (num_resv_fences == 0 && !extra)
> + return 0;
> +
> + for (i = 0; i < num_resv_fences; ++i) {
> + dma_fence_deep_dive_for_each(fence, chain, j, resv_fences[i]) {
> + if (dma_fence_is_signaled(fence))
> + continue;
> +
> + *result = fence;
> + ++num_fences;
> + }
> + }
> +
> + if (extra) {
> + dma_fence_deep_dive_for_each(fence, chain, j, extra) {
> + if (dma_fence_is_signaled(fence))
> + continue;
> +
> + *result = fence;
> + ++num_fences;
> + }
> + }
> +
> + if (num_fences <= 1) {
> + *result = dma_fence_get(*result);
> + goto put_resv_fences;
> + }
> +
> + fences = kmalloc_array(num_fences, sizeof(struct dma_fence*),
> +GFP_KERNEL);
> + if (!fences) {
> + *result = NULL;
> + ret = -ENOMEM;
> + goto put_resv_fences;
> + }
> +
> + num_fences = 0;
> +

Re: [PATCH v4 11/12] drm/vc4: hdmi: Add a workqueue to set scrambling

2021-05-21 Thread Dave Stevenson

Hi Maxime

Thanks for the patch.

On Fri, 7 May 2021 at 16:06, Maxime Ripard  wrote:
>
> It looks like some displays (like the LG 27UL850-W) don't enable the
> scrambling when the HDMI driver enables it. However, if we set later the
> scrambler enable bit, the display will work as expected.
>
> Let's create delayed work queue to periodically look at the display
> scrambling status, and if it's not set yet try to enable it again.
>
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 25 +
>  drivers/gpu/drm/vc4/vc4_hdmi.h |  2 ++
>  2 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index bda12fea0dce..4fa7ea419594 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -482,6 +482,8 @@ static bool vc4_hdmi_supports_scrambling(struct 
> drm_encoder *encoder,
> return true;
>  }
>
> +#define SCRAMBLING_POLLING_DELAY_MS1000
> +
>  static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder)
>  {
> struct drm_display_mode *mode = >crtc->state->adjusted_mode;
> @@ -498,6 +500,9 @@ static void vc4_hdmi_enable_scrambling(struct drm_encoder 
> *encoder)
>
> HDMI_WRITE(HDMI_SCRAMBLER_CTL, HDMI_READ(HDMI_SCRAMBLER_CTL) |
>VC5_HDMI_SCRAMBLER_CTL_ENABLE);
> +
> +   queue_delayed_work(system_wq, _hdmi->scrambling_work,
> +  msecs_to_jiffies(SCRAMBLING_POLLING_DELAY_MS));
>  }
>
>  static void vc4_hdmi_disable_scrambling(struct drm_encoder *encoder)
> @@ -516,6 +521,9 @@ static void vc4_hdmi_disable_scrambling(struct 
> drm_encoder *encoder)
> if (crtc && !vc4_hdmi_mode_needs_scrambling(>mode))
> return;
>
> +   if (delayed_work_pending(_hdmi->scrambling_work))
> +   cancel_delayed_work_sync(_hdmi->scrambling_work);
> +
> HDMI_WRITE(HDMI_SCRAMBLER_CTL, HDMI_READ(HDMI_SCRAMBLER_CTL) &
>~VC5_HDMI_SCRAMBLER_CTL_ENABLE);
>
> @@ -523,6 +531,22 @@ static void vc4_hdmi_disable_scrambling(struct 
> drm_encoder *encoder)
> drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, false);
>  }
>
> +static void vc4_hdmi_scrambling_wq(struct work_struct *work)
> +{
> +   struct vc4_hdmi *vc4_hdmi = container_of(to_delayed_work(work),
> +struct vc4_hdmi,
> +scrambling_work);
> +
> +   if (drm_scdc_get_scrambling_status(vc4_hdmi->ddc))
> +   return;
> +
> +   drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, true);
> +   drm_scdc_set_scrambling(vc4_hdmi->ddc, true);
> +
> +   queue_delayed_work(system_wq, _hdmi->scrambling_work,
> +  msecs_to_jiffies(SCRAMBLING_POLLING_DELAY_MS));
> +}
> +
>  static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder,
>struct drm_atomic_state *state)
>  {
> @@ -2031,6 +2055,7 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> device *master, void *data)
> vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL);
> if (!vc4_hdmi)
> return -ENOMEM;
> +   INIT_DELAYED_WORK(_hdmi->scrambling_work, vc4_hdmi_scrambling_wq);
>
> dev_set_drvdata(dev, vc4_hdmi);
> encoder = _hdmi->encoder.base.base;
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
> index 3cd021136402..00efcf291c5a 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.h
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
> @@ -126,6 +126,8 @@ struct vc4_hdmi {
> struct vc4_hdmi_encoder encoder;
> struct drm_connector connector;
>
> +   struct delayed_work scrambling_work;
> +
> struct i2c_adapter *ddc;
> void __iomem *hdmicore_regs;
> void __iomem *hd_regs;
> --
> 2.31.1
>

Re: [PATCH v4 10/12] drm/vc4: hdmi: Enable the scrambler

2021-05-21 Thread Dave Stevenson

Hi Maxime

On Fri, 7 May 2021 at 16:06, Maxime Ripard  wrote:
>
> The HDMI controller on the BCM2711 includes a scrambler in order to
> reach the HDMI 2.0 modes that require it. Let's add the support for it.
>
> Acked-by: Thomas Zimmermann 
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_hdmi.c  | 64 +
>  drivers/gpu/drm/vc4/vc4_hdmi_regs.h |  3 ++
>  2 files changed, 67 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index 01d24ce8a795..bda12fea0dce 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -76,6 +77,8 @@
>  #define VC5_HDMI_VERTB_VSPO_SHIFT  16
>  #define VC5_HDMI_VERTB_VSPO_MASK   VC4_MASK(29, 16)
>
> +#define VC5_HDMI_SCRAMBLER_CTL_ENABLE  BIT(0)
> +
>  #define VC5_HDMI_DEEP_COLOR_CONFIG_1_INIT_PACK_PHASE_SHIFT 8
>  #define VC5_HDMI_DEEP_COLOR_CONFIG_1_INIT_PACK_PHASE_MASK  VC4_MASK(10, 
> 8)
>
> @@ -462,6 +465,64 @@ static void vc4_hdmi_set_infoframes(struct drm_encoder 
> *encoder)
> vc4_hdmi_set_audio_infoframe(encoder);
>  }
>
> +static bool vc4_hdmi_supports_scrambling(struct drm_encoder *encoder,
> +struct drm_display_mode *mode)
> +{
> +   struct vc4_hdmi_encoder *vc4_encoder = to_vc4_hdmi_encoder(encoder);
> +   struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
> +   struct drm_display_info *display = _hdmi->connector.display_info;
> +
> +   if (!vc4_encoder->hdmi_monitor)
> +   return false;
> +
> +   if (!display->hdmi.scdc.supported ||
> +   !display->hdmi.scdc.scrambling.supported)
> +   return false;
> +
> +   return true;
> +}
> +
> +static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder)
> +{
> +   struct drm_display_mode *mode = >crtc->state->adjusted_mode;
> +   struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
> +
> +   if (!vc4_hdmi_supports_scrambling(encoder, mode))
> +   return;
> +
> +   if (!vc4_hdmi_mode_needs_scrambling(mode))
> +   return;
> +
> +   drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, true);
> +   drm_scdc_set_scrambling(vc4_hdmi->ddc, true);
> +
> +   HDMI_WRITE(HDMI_SCRAMBLER_CTL, HDMI_READ(HDMI_SCRAMBLER_CTL) |
> +  VC5_HDMI_SCRAMBLER_CTL_ENABLE);
> +}
> +
> +static void vc4_hdmi_disable_scrambling(struct drm_encoder *encoder)
> +{
> +   struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
> +   struct drm_crtc *crtc = encoder->crtc;
> +
> +   /*
> +* At boot, encoder->crtc will be NULL. Since we don't know the
> +* state of the scrambler and in order to avoid any
> +* inconsistency, let's disable it all the time.
> +*/
> +   if (crtc && !vc4_hdmi_supports_scrambling(encoder, >mode))
> +   return;
> +
> +   if (crtc && !vc4_hdmi_mode_needs_scrambling(>mode))
> +   return;
> +
> +   HDMI_WRITE(HDMI_SCRAMBLER_CTL, HDMI_READ(HDMI_SCRAMBLER_CTL) &
> +  ~VC5_HDMI_SCRAMBLER_CTL_ENABLE);
> +
> +   drm_scdc_set_scrambling(vc4_hdmi->ddc, false);
> +   drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, false);
> +}
> +
>  static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder,
>struct drm_atomic_state *state)
>  {
> @@ -474,6 +535,8 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct 
> drm_encoder *encoder,
>
> HDMI_WRITE(HDMI_VID_CTL,
>HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX);
> +
> +   vc4_hdmi_disable_scrambling(encoder);
>  }
>
>  static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
> @@ -924,6 +987,7 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct 
> drm_encoder *encoder,
> }
>
> vc4_hdmi_recenter_fifo(vc4_hdmi);
> +   vc4_hdmi_enable_scrambling(encoder);
>  }
>
>  static void vc4_hdmi_encoder_enable(struct drm_encoder *encoder)
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi_regs.h 
> b/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
> index e1b58eac766f..19d2fdc446bc 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
> @@ -100,6 +100,7 @@ enum vc4_hdmi_field {
> HDMI_RM_FORMAT,
> HDMI_RM_OFFSET,
> HDMI_SCHEDULER_CONTROL,
> +   HDMI_SCRAMBLER_CTL,
> HDMI_SW_RESET_CONTROL,
> HDMI_TX_PHY_CHANNEL_SWAP,
> HDMI_TX_PHY_CLK_DIV,
> @@ -238,6 +239,7 @@ static const struct vc4_hdmi_register __maybe_unused 
> vc5_hdmi_hdmi0_fields[] = {
> VC4_HDMI_REG(HDMI_GCP_CONFIG, 0x178),
> VC4_HDMI_REG(HDMI_GCP_WORD_1, 0x17c),
> VC4_HDMI_REG(HDMI_HOTPLUG, 0x1a8),
> +

Re: [PATCH v4 03/12] drm/vc4: crtc: Pass the drm_atomic_state to config_pv

2021-05-21 Thread Dave Stevenson

Hi Maxime

Thanks for the patch

On Fri, 7 May 2021 at 16:05, Maxime Ripard  wrote:
>
> The vc4_crtc_config_pv will need to access the drm_atomic_state
> structure and its only parent function, vc4_crtc_atomic_enable already
> has access to it. Let's pass it as a parameter.
>
> Fixes: 792c3132bc1b ("drm/vc4: encoder: Add finer-grained encoder callbacks")
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_crtc.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> index f1f2e8cbce79..8a19d6c76605 100644
> --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> @@ -288,7 +288,7 @@ static void vc4_crtc_pixelvalve_reset(struct drm_crtc 
> *crtc)
> CRTC_WRITE(PV_CONTROL, CRTC_READ(PV_CONTROL) | PV_CONTROL_FIFO_CLR);
>  }
>
> -static void vc4_crtc_config_pv(struct drm_crtc *crtc)
> +static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct 
> drm_atomic_state *state)
>  {
> struct drm_device *dev = crtc->dev;
> struct vc4_dev *vc4 = to_vc4_dev(dev);
> @@ -296,8 +296,8 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc)
> struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
> struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
> const struct vc4_pv_data *pv_data = vc4_crtc_to_vc4_pv_data(vc4_crtc);
> -   struct drm_crtc_state *state = crtc->state;
> -   struct drm_display_mode *mode = >adjusted_mode;
> +   struct drm_crtc_state *crtc_state = crtc->state;
> +   struct drm_display_mode *mode = _state->adjusted_mode;
> bool interlace = mode->flags & DRM_MODE_FLAG_INTERLACE;
> u32 pixel_rep = (mode->flags & DRM_MODE_FLAG_DBLCLK) ? 2 : 1;
> bool is_dsi = (vc4_encoder->type == VC4_ENCODER_TYPE_DSI0 ||
> @@ -522,7 +522,7 @@ static void vc4_crtc_atomic_enable(struct drm_crtc *crtc,
> if (vc4_encoder->pre_crtc_configure)
> vc4_encoder->pre_crtc_configure(encoder, state);
>
> -   vc4_crtc_config_pv(crtc);
> +   vc4_crtc_config_pv(crtc, state);
>
> CRTC_WRITE(PV_CONTROL, CRTC_READ(PV_CONTROL) | PV_CONTROL_EN);
>
> --
> 2.31.1
>

Re: [PATCH v17 4/4] dt-bindings: msm/dp: Add bindings of MSM DisplayPort controller

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:

> Add bindings for Snapdragon DisplayPort controller driver.
> 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Chandan Uddaraju 
> Signed-off-by: Vara Reddy 
> Signed-off-by: Tanmay Shah 
> Signed-off-by: Kuogee Hsieh 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in V2:
> -Provide details about sel-gpio
> 
> Changes in V4:
> -Provide details about max dp lanes
> -Change the commit text
> 
> Changes in V5:
> -moved dp.txt to yaml file
> 
> Changes in v6:
> - Squash all AUX LUT properties into one pattern Property
> - Make aux-cfg[0-9]-settings properties optional
> - Remove PLL/PHY bindings from DP controller dts
> - Add DP clocks description
> - Remove _clk suffix from clock names
> - Rename pixel clock to stream_pixel
> - Remove redundant bindings (GPIO, PHY, HDCP clock, etc..)
> - Fix indentation
> - Add Display Port as interface of DPU in DPU bindings
>   and add port mapping accordingly.
> 
> Chages in v7:
> - Add dp-controller.yaml file common between multiple SOC
> - Rename dp-sc7180.yaml to dp-controller-sc7180.yaml
> - change compatible string and add SOC name to it.
> - Remove Root clock generator for pixel clock
> - Add assigned-clocks and assigned-clock-parents bindings
> - Remove redundant properties, descriptions and blank lines
> - Add DP port in DPU bindings
> - Update depends-on tag in commit message and rebase change accordingly
> 
> Changes in v8:
> - Add MDSS AHB clock in bindings
> 
> Changes in v9:
> - Remove redundant reg-name property
> - Change assigned-clocks and assigned-clocks-parents counts to 2
> - Use IRQ flags in example dts
> 
> Changes in v10:
> - Change title of this patch as it does not contain PLL bindings anymore
> - Remove redundant properties
> - Remove use of IRQ flag
> - Fix ports property
> 
> Changes in v11:
> - add ports required of both #address-cells and  #size-cells
> - add required operating-points-v2
> - add required #sound-dai-cells
> - add required power-domains
> - update maintainer list
> 
> Changes in v12:
> - remove soc node from examples (Stephen Boyd)
> - split dpu-sc7180.yaml changes to separate patch (Stephen Boyd)
> 
> Changes in v13:
> - add assigned-clocks
> - add assigned-clock-parents
> 
> Changes in v14:
> - add reference for ports (Rob Herring)
> 
> Changes in v15:
> - drop common properties from ports (Rob Herring)
> ---
>  .../bindings/display/msm/dp-controller.yaml| 146 
> +
>  1 file changed, 146 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> 
> diff --git a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml 
> b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> new file mode 100644
> index 000..64d8d9e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> @@ -0,0 +1,146 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dp-controller.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: MSM Display Port Controller
> +
> +maintainers:
> +  - Kuogee Hsieh 
> +
> +description: |
> +  Device tree bindings for DisplayPort host controller for MSM targets
> +  that are compatible with VESA DisplayPort interface specification.
> +
> +properties:
> +  compatible:
> +enum:
> +  - qcom,sc7180-dp
> +
> +  reg:
> +maxItems: 1
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: AHB clock to enable register access
> +  - description: Display Port AUX clock
> +  - description: Display Port Link clock
> +  - description: Link interface clock between DP and PHY
> +  - description: Display Port Pixel clock
> +
> +  clock-names:
> +items:
> +  - const: core_iface
> +  - const: core_aux
> +  - const: ctrl_link
> +  - const: ctrl_link_iface
> +  - const: stream_pixel
> +
> +  assigned-clocks:
> +items:
> +  - description: link clock source
> +  - description: pixel clock source
> +
> +  assigned-clock-parents:
> +items:
> +  - description: phy 0 parent
> +  - description: phy 1 parent
> +
> +  phys:
> +maxItems: 1
> +
> +  phy-names:
> +items:
> +  - const: dp
> +
> +  operating-points-v2:
> +maxItems: 1
> +
> +  power-domains:
> +maxItems: 1
> +
> +  "#sound-dai-cells":
> +const: 0
> +
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +properties:
> +  port@0:
> +$ref: /schemas/graph.yaml#/properties/port
> +description: Input endpoint of the controller
> +
> +  port@1:
> +$ref: /schemas/graph.yaml#/properties/port
> +description: Output endpoint of the controller
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupts
> +  - clocks
> +  - clock-names
> +  - phys
> +  - phy-names
> +  - "#sound-dai-cells"
> +

Re: [PATCH] drm/fb-helper: improve DRM fbdev emulation device names

2021-05-21 Thread Javier Martinez Canillas

On 5/21/21 6:53 PM, Thomas Zimmermann wrote:

[snip]

>>
>> So what with all the drivers which do _not_ have drm in their name? Also
>> I'm never sure how much these are uapi or not ...
>

That someone could threat as an uapi is a fair point indeed.

> Why do we need a suffix anyway?
> 

Yes, I thought the same and was torn about posting a patch to just remove
the suffix. I don't think users care that much if is a fb device from a
fbdev driver or a DRM driver using the fbdev emulation.

>> -Daniel
>>

Best regards,
-- 
Javier Martinez Canillas
Software Engineer
New Platform Technologies Enablement team
RHEL Engineering

Re: [PATCH v17 3/4] dt-bindings: msm: dsi: add yaml schemas for DSI PHY bindings

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:

> Add YAML schema for the device tree bindings for DSI PHY.
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v1:
>- Merge dsi-phy.yaml and dsi-phy-10nm.yaml (Stephen Boyd)
>- Remove qcom,dsi-phy-regulator-ldo-mode (Stephen Boyd)
>- Add clock cells properly (Stephen Boyd)
>- Remove unnecessary decription from clock names (Stephen Boyd)
>- Add pin names for the supply entries for 10nm phy which is
>  used in sc7180 and sdm845 (Stephen Boyd)
>- Remove unused header files from examples (Stephen Boyd)
>- Drop labels for display nodes and correct node name (Stephen Boyd)
> 
> Changes in v2:
>- Drop maxItems for clock (Stephen Boyd)
>- Add vdds supply pin information for sdm845 (Stephen Boyd)
>- Add examples for 14nm, 20nm and 28nm phy yaml files (Stephen Boyd)
>- Keep child nodes directly under soc node (Stephen Boyd)
> 
> Changes in v3:
>- Use a separate yaml file to describe the common properties
>  for all the dsi phy versions (Stephen Boyd)
>- Remove soc from examples (Stephen Boyd)
>- Add description for register property
> 
> Changes in v4:
>- Modify the title for all the phy versions (Stephen Boyd)
>- Drop description for all the phy versions (Stephen Boyd)
>- Modify the description for register property (Stephen Boyd)
> 
> Changes in v5:
>- Remove unused properties from common dsi phy file
>- Add clock-cells and phy-cells to required property
>  list (Stephen Boyd)
> 
> Changes in v6:
>- Add proper compatible string in example
> ---
>  .../bindings/display/msm/dsi-phy-10nm.yaml | 68 +
>  .../bindings/display/msm/dsi-phy-14nm.yaml | 66 
>  .../bindings/display/msm/dsi-phy-20nm.yaml | 71 
> ++
>  .../bindings/display/msm/dsi-phy-28nm.yaml | 68 +
>  .../bindings/display/msm/dsi-phy-common.yaml   | 40 
>  5 files changed, 313 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-20nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-common.yaml
> 
> diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml 
> b/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
> new file mode 100644
> index 000..4a26bef
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
> @@ -0,0 +1,68 @@
> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dsi-phy-10nm.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm Display DSI 10nm PHY
> +
> +maintainers:
> +  - Krishna Manikandan 
> +
> +allOf:
> +  - $ref: dsi-phy-common.yaml#
> +
> +properties:
> +  compatible:
> +oneOf:
> +  - const: qcom,dsi-phy-10nm
> +  - const: qcom,dsi-phy-10nm-8998
> +
> +  reg:
> +items:
> +  - description: dsi phy register set
> +  - description: dsi phy lane register set
> +  - description: dsi pll register set
> +
> +  reg-names:
> +items:
> +  - const: dsi_phy
> +  - const: dsi_phy_lane
> +  - const: dsi_pll
> +
> +  vdds-supply:
> +description: |
> +  Connected to DSI0_MIPI_DSI_PLL_VDDA0P9 pin for sc7180 target and
> +  connected to VDDA_MIPI_DSI_0_PLL_0P9 pin for sdm845 target

"Reference to the 0.9V supply for the PLL." would have been sufficient.

But overall I think the patch looks good.

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

Re: [PATCH v17 2/4] dt-bindings: msm: dsi: add yaml schemas for DSI bindings

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:

> Add YAML schema for the device tree bindings for DSI
> 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Krishna Manikandan 
> 
> Changes in v1:
> - Separate dsi controller bindings to a separate patch (Stephen Boyd)
> - Merge dsi-common-controller.yaml and dsi-controller-main.yaml to
>   a single file (Stephen Boyd)
> - Drop supply entries and definitions from properties (Stephen Boyd)
> - Modify phy-names property for dsi controller (Stephen Boyd)
> - Remove boolean from description (Stephen Boyd)
> - Drop pinctrl properties as they are standard entries (Stephen Boyd)
> - Modify the description for ports property and keep the reference
>   to the generic binding where this is defined (Stephen Boyd)
> - Add description to clock names (Stephen Boyd)
> - Correct the indendation (Stephen Boyd)
> - Drop the label for display dt nodes and correct the node
>   name (Stephen Boyd)
> 
> Changes in v2:
> - Drop maxItems for clock (Stephen Boyd)
> - Drop qcom,mdss-mdp-transfer-time-us as it is not used in upstream
>   dt file (Stephen Boyd)
> - Keep child node directly under soc node (Stephen Boyd)
> - Drop qcom,sync-dual-dsi as it is not used in upstream dt
> 
> Changes in v3:
> - Add description for register property (Stephen Boyd)
> 
> Changes in v4:
> - Add maxItems for phys property (Stephen Boyd)
> - Add maxItems for reg property (Stephen Boyd)
> - Add reference for data-lanes property (Stephen Boyd)
> - Remove soc from example (Stephen Boyd)
> 
> Changes in v5:
> - Modify title and description (Stephen Boyd)
> - Add required properties for ports node (Stephen Boyd)
> - Add data-lanes in the example (Stephen Boyd)
> - Drop qcom,master-dsi property (Stephen Boyd)
> 
> Changes in v6:
> - Add required properties for port@0, port@1 and corresponding
>   endpoints (Stephen Boyd)
> - Add address-cells and size-cells for ports (Stephen Boyd)
> - Use additionalProperties instead of unevaluatedProperties (Stephen Boyd)
> 
> Changes in v7:
> - Add reference for ports and data-lanes (Rob Herring)
> - Add maxItems and minItems for data-lanes (Rob Herring)
> 
> Changes in v8:
> - Drop common properties and description from ports (Rob Herring)
> - Add reference for endpoint (Rob Herring)
> - Add correct reference for data-lanes (Rob Herring)
> - Drop common properties from required list for ports (Rob Herring)
> ---
>  .../bindings/display/msm/dsi-controller-main.yaml  | 185 +++
>  .../devicetree/bindings/display/msm/dsi.txt| 249 
> -
>  2 files changed, 185 insertions(+), 249 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/msm/dsi.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml 
> b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
> new file mode 100644
> index 000..df6a750
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
> @@ -0,0 +1,185 @@
> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dsi-controller-main.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm Display DSI controller
> +
> +maintainers:
> +  - Krishna Manikandan 
> +
> +allOf:
> +  - $ref: "../dsi-controller.yaml#"
> +
> +properties:
> +  compatible:
> +items:
> +  - const: qcom,mdss-dsi-ctrl
> +
> +  reg:
> +maxItems: 1
> +
> +  reg-names:
> +const: dsi_ctrl
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: Display byte clock
> +  - description: Display byte interface clock
> +  - description: Display pixel clock
> +  - description: Display escape clock
> +  - description: Display AHB clock
> +  - description: Display AXI clock
> +
> +  clock-names:
> +items:
> +  - const: byte
> +  - const: byte_intf
> +  - const: pixel
> +  - const: core
> +  - const: iface
> +  - const: bus
> +
> +  phys:
> +maxItems: 1
> +
> +  phy-names:
> +const: dsi
> +
> +  "#address-cells": true
> +
> +  "#size-cells": true
> +
> +  syscon-sfpb:
> +description: A phandle to mmss_sfpb syscon node (only for DSIv2).
> +$ref: "/schemas/types.yaml#/definitions/phandle"
> +
> +  qcom,dual-dsi-mode:
> +type: boolean
> +description: |
> +  Indicates if the DSI controller is driving a panel which needs
> +  2 DSI links.
> +
> +  power-domains:
> +maxItems: 1
> +
> +  operating-points-v2: true
> +
> +  ports:
> +$ref: "/schemas/graph.yaml#/properties/ports"
> +description: |
> +  Contains DSI controller input and output

Re: [PATCH v17 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 11:00 CDT 2021, Bjorn Andersson wrote:

> On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:
> > diff --git a/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml 
> > b/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml
> [..]
> > +  ports:
> > +$ref: /schemas/graph.yaml#/properties/ports
> > +description: |
> > +  Contains the list of output ports from DPU device. These ports
> > +  connect to interfaces that are external to the DPU hardware,
> > +  such as DSI, DP etc. Each output port contains an endpoint that
> > +  describes how it is connected to an external interface.
> > +
> > +properties:
> > +  port@0:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description: DPU_INTF1 (DSI1)
> > +
> > +  port@2:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description: DPU_INTF0 (DP)
> 
> Why is port@0 INTF1 and why is port@2 INTF0? In the binding you're
> translating the two ports that are described are 0 and 1, representing
> INTF1 and INTF2, or DSI1 and DSI2, respectively.
> 
> Further more, I have a need for somehow describing the pairing of 4 DP
> INTFs (INTF 0, 3, 4 and 5) and how they are connected to the 3+1 DP+eDP
> controllers.
> 
> Downstream this seems to be handled by adding cell-index to the DP
> controllers and then matching that against the numbering in the driver's
> INTF array. But rather than adding cell-index to map this, can't we
> define that the port index is the INTF-number here?
> 
> 
> This would obviously break compatibility with existing DTBs, but we
> could start by doing it selectively for the new compatibles, fix up the
> existing dts files and then drop the selective application after 1 or 2
> LTS releases.
> 

In a chat with Rob I realized that my feedback here is unrelated to the
yaml conversion and any conclusions of this discussion should be a
separate patch anyways.

So with the two style issues below resolve you have my:

Reviewed-by: Bjorn Andersson 

[..]
> > +examples:
[..]
> > +   ports {
> > +   #address-cells = <1>;
> > +   #size-cells = <0>;
> > +
> > +   port@0 {
> > +   reg = <0>;
> > +   dpu_intf1_out: endpoint {
> > +  remote-endpoint = 
> > <_in>;
> > +   };
> > +   };
> > +
> > +port@2 {
> > +reg = <2>;
> > +dpu_intf0_out: endpoint {
> > +   remote-endpoint = 
> > <_in>;
> > +};
> > +};
> 
> The indentation is inconsistent among the ports.
> 
> > +   };
> > + };
> > +};
> > +...
> > diff --git a/Documentation/devicetree/bindings/display/msm/dpu-sdm845.yaml 
> > b/Documentation/devicetree/bindings/display/msm/dpu-sdm845.yaml
[..]
> > +  operating-points-v2: true
> 
> You have a blank line between all other properties, but not here.
> 
> > +  ports:

Regards,
Bjorn

Re: [Intel-gfx] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-21 Thread Matthew Brost

On Fri, May 21, 2021 at 01:00:54PM +0100, Tvrtko Ursulin wrote:
> 
> On 19/05/2021 00:58, Matthew Brost wrote:
> > Add entry fpr i915 new parallel submission uAPI plan.
> > 
> > v2:
> >   (Daniel Vetter):
> >- Expand logical order explaination
> >- Add dummy header
> >- Only allow N BBs in execbuf IOCTL
> >- Configure parallel submission per slot not per gem context
> > 
> > Cc: Tvrtko Ursulin 
> > Cc: Tony Ye 
> > CC: Carl Zhang 
> > Cc: Daniel Vetter 
> > Cc: Jason Ekstrand 
> > Signed-off-by: Matthew Brost 
> > ---
> >   Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 ++
> >   Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
> >   2 files changed, 196 insertions(+), 1 deletion(-)
> >   create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > 
> > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > new file mode 100644
> > index ..8c64b983ccad
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > @@ -0,0 +1,144 @@
> > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > i915_context_engines_parallel_submit */
> > +
> > +/*
> > + * i915_context_engines_parallel_submit:
> > + *
> > + * Setup a slot to allow multiple BBs to be submitted in a single execbuf 
> > IOCTL.
> > + * Those BBs will then be scheduled to run on the GPU in parallel. Multiple
> > + * hardware contexts are created internally in the i915 run these BBs. 
> > Once a
> > + * slot is configured for N BBs only N BBs can be submitted in each execbuf
> > + * IOCTL and this is implict behavior (e.g. the user doesn't tell the 
> > execbuf
> > + * IOCTL there are N BBs, the execbuf IOCTL know how many BBs there are 
> > based on
> > + * the slots configuration).
> 
> 1)
> Expand the term slot here with "slot in the context engine map" least once
> for clarity.
> 

Sure.

> 2)
> About where execbuf will implicitly be finding batches - suggest to also
> cover first/last flag here. I know you have it in the readme but I think it
> is good if uapi header is as self-contained as possible.
>

Yep, good idea.

> > + *
> > + * Their are two currently defined ways to control the placement of the
> > + * hardware contexts on physical engines: default behavior (no flags) and
> > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the
> > + * future as new hardware / use cases arise. Details of how to use this
> > + * interface below above the flags.
> > + *
> > + * Returns -EINVAL if hardware context placement configuration invalid or 
> > if the
> > + * placement configuration isn't supported on the platform / submission
> > + * interface.
> > + * Returns -ENODEV if extension isn't supported on the platform / 
> > submission
> > + * inteface.
> > + */
> > +struct i915_context_engines_parallel_submit {
> > +   struct i915_user_extension base;
> > +
> > +   __u16 engine_index; /* slot for parallel engine */
> > +   __u16 width;/* number of contexts per parallel engine */
> > +   __u16 num_siblings; /* number of siblings per context */
> > +   __u16 mbz16;
> > +/*
> > + * Default placement behvavior (currently unsupported):
> > + *
> > + * Rather than restricting parallel submission to a single class with a
> > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a 
> > mode that
> 
> What do you mean with logically contiguous here? It sounds ambiguous versus
> logical vs "normal" engine instance numbers.
>

This is a little backwards. I think when I wrote this originally the implicit
placement comments were first. I can reword this.
 
> > + * enables parallel submission across multiple engine classes. In this 
> > case each
> > + * context's logical engine mask indicates where that context can placed. 
> > It is
> > + * implied in this mode that all contexts have mutual exclusive placement 
> > (e.g.
> > + * if one context is running CS0 no other contexts can run on CS0).
> 
> I think talk about logical context and its mask is too implementation detail
> at the uapi level. Instead I would suggest more userspace programmer centric
> description.

Ok, can you give me suggestion? Writing DOC isn't really my strength.

> 
> > + *
> > + * Example 1 pseudo code:
> > + * CSX[Y] = engine class X, logical instance Y
> > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > + * set_engines(INVALID)
> > + * set_parallel(engine_index=0, width=2, num_siblings=2,
> > + * engines=CS0[0],CS0[1],CS1[0],CS1[1])
> > + *
> > + * Results in the following valid placements:
> > + * CS0[0], CS1[0]
> > + * CS0[0], CS1[1]
> > + * CS0[1], CS1[0]
> > + * CS0[1], CS1[1]
> > + *
> > + * This can also be though of as 2 virtual engines:
> > + * VE[0] = CS0[0], CS0[1]
> > + * VE[1] = CS1[0], CS1[1]
> 
> Ah okay so essentially similar to what I was proposing a year ago. But then
> it is no longer "set_parallel" really. It is one slot in

Re: [PATCH] drm/fb-helper: improve DRM fbdev emulation device names

2021-05-21 Thread Thomas Zimmermann


Hi

Am 21.05.21 um 17:33 schrieb Daniel Vetter:

On Fri, May 21, 2021 at 03:19:10PM +0200, Javier Martinez Canillas wrote:

Framebuffer devices that are registered by DRM drivers for fbdev emulation
have a "drmfb" suffix in their name. But makes them to be quite confusing
if a driver already has "drm" in its name:

$ cat /proc/fb
0 rockchipdrmdrmfb

$ cat /proc/fb
0 simpledrmdrmfb

Instead, let's just add a "-fb" suffix to denote that are DRM drivers FB:

$ cat /proc/fb
0 rockchipdrm-fb

$ cat /proc/fb
0 simpledrm-fb

Suggested-by: Peter Robinson 
Signed-off-by: Javier Martinez Canillas 


So what with all the drivers which do _not_ have drm in their name? Also
I'm never sure how much these are uapi or not ...


Why do we need a suffix anyway?


-Daniel


---

  drivers/gpu/drm/drm_fb_helper.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index f6baa204612..bbaff92c509 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1737,7 +1737,7 @@ void drm_fb_helper_fill_info(struct fb_info *info,
   sizes->fb_width, sizes->fb_height);
  
  	info->par = fb_helper;

-   snprintf(info->fix.id, sizeof(info->fix.id), "%sdrmfb",
+   snprintf(info->fix.id, sizeof(info->fix.id), "%s-fb",
 fb_helper->dev->driver->name);
  
  }

--
2.31.1





--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature

Re: [Intel-gfx] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-21 Thread Matthew Brost

On Thu, May 20, 2021 at 09:41:20PM +0200, Daniel Vetter wrote:
> On Thu, May 20, 2021 at 11:57:44AM +0100, Tvrtko Ursulin wrote:
> > 
> > On 20/05/2021 10:54, Daniel Vetter wrote:
> > > On Wed, May 19, 2021 at 7:19 PM Matthew Brost  
> > > wrote:
> > > > 
> > > > On Wed, May 19, 2021 at 01:10:04PM +0200, Daniel Vetter wrote:
> > > > > On Tue, May 18, 2021 at 04:58:30PM -0700, Matthew Brost wrote:
> > > > > > Add entry fpr i915 new parallel submission uAPI plan.
> > > > > > 
> > > > > > v2:
> > > > > >   (Daniel Vetter):
> > > > > >- Expand logical order explaination
> > > > > >- Add dummy header
> > > > > >- Only allow N BBs in execbuf IOCTL
> > > > > >- Configure parallel submission per slot not per gem context
> > > > > > 
> > > > > > Cc: Tvrtko Ursulin 
> > > > > > Cc: Tony Ye 
> > > > > > CC: Carl Zhang 
> > > > > > Cc: Daniel Vetter 
> > > > > > Cc: Jason Ekstrand 
> > > > > > Signed-off-by: Matthew Brost 
> > > > > > ---
> > > > > >   Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 
> > > > > > ++
> > > > > >   Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
> > > > > >   2 files changed, 196 insertions(+), 1 deletion(-)
> > > > > >   create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > > > 
> > > > > > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > > > > > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > > > new file mode 100644
> > > > > > index ..8c64b983ccad
> > > > > > --- /dev/null
> > > > > > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > > > @@ -0,0 +1,144 @@
> > > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > > > > i915_context_engines_parallel_submit */
> > > > > > +
> > > > > > +/*
> > > > > > + * i915_context_engines_parallel_submit:
> > > > > > + *
> > > > > > + * Setup a slot to allow multiple BBs to be submitted in a single 
> > > > > > execbuf IOCTL.
> > > > > > + * Those BBs will then be scheduled to run on the GPU in parallel. 
> > > > > > Multiple
> > > > > > + * hardware contexts are created internally in the i915 run these 
> > > > > > BBs. Once a
> > > > > > + * slot is configured for N BBs only N BBs can be submitted in 
> > > > > > each execbuf
> > > > > > + * IOCTL and this is implict behavior (e.g. the user doesn't tell 
> > > > > > the execbuf
> > > > > > + * IOCTL there are N BBs, the execbuf IOCTL know how many BBs 
> > > > > > there are based on
> > > > > > + * the slots configuration).
> > > > > > + *
> > > > > > + * Their are two currently defined ways to control the placement 
> > > > > > of the
> > > > > > + * hardware contexts on physical engines: default behavior (no 
> > > > > > flags) and
> > > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added 
> > > > > > the in the
> > > > > > + * future as new hardware / use cases arise. Details of how to use 
> > > > > > this
> > > > > > + * interface below above the flags.
> > > > > > + *
> > > > > > + * Returns -EINVAL if hardware context placement configuration 
> > > > > > invalid or if the
> > > > > > + * placement configuration isn't supported on the platform / 
> > > > > > submission
> > > > > > + * interface.
> > > > > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > > > > submission
> > > > > > + * inteface.
> > > > > > + */
> > > > > > +struct i915_context_engines_parallel_submit {
> > > > > > +   struct i915_user_extension base;
> > > > > > +
> > > > > > +   __u16 engine_index; /* slot for parallel engine */
> > > > > > +   __u16 width;/* number of contexts per parallel 
> > > > > > engine */
> > > > > > +   __u16 num_siblings; /* number of siblings per context */
> > > > > > +   __u16 mbz16;
> > > > > 
> > > > > Ok the big picture looks reasonable now, the flags still confuse me.
> > > > > 
> > > > 
> > > > Yea, it is a bit confusing.
> > > > 
> > > > > > +/*
> > > > > > + * Default placement behvavior (currently unsupported):
> > > > > > + *
> > > > > > + * Rather than restricting parallel submission to a single class 
> > > > > > with a
> > > > > > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), 
> > > > > > add a mode that
> > > > > > + * enables parallel submission across multiple engine classes. In 
> > > > > > this case each
> > > > > > + * context's logical engine mask indicates where that context can 
> > > > > > placed. It is
> > > > > > + * implied in this mode that all contexts have mutual exclusive 
> > > > > > placement (e.g.
> > > > > > + * if one context is running CS0 no other contexts can run on CS0).
> > > > > > + *
> > > > > > + * Example 1 pseudo code:
> > > > > > + * CSX[Y] = engine class X, logical instance Y
> > > > > > + * INVALID = I915_ENGINE_CLASS_INVALID, 
> > > > > > I915_ENGINE_CLASS_INVALID_NONE
> > > > > > + * set_engines(INVALID)
> > > > > > + * set_parallel(engine_index=0, width=2, num_siblings=2,
> > > > > > + *

Re: [PATCH 1/4] dma-buf: add dma_fence_array_for_each (v2)

2021-05-21 Thread Jason Ekstrand

On Fri, May 21, 2021 at 2:51 AM Christian König
 wrote:
>
> Am 20.05.21 um 21:00 schrieb Jason Ekstrand:
> > From: Christian König 
> >
> > Add a helper to iterate over all fences in a dma_fence_array object.
> >
> > v2 (Jason Ekstrand)
> >   - Return NULL from dma_fence_array_first if head == NULL.  This matches
> > the iterator behavior of dma_fence_chain_for_each in that it iterates
> > zero times if head == NULL.
> >   - Return NULL from dma_fence_array_next if index > array->num_fences.
> >
> > Signed-off-by: Jason Ekstrand 
>
> Reviewed-by: Christian König 
>
> BTW: I'm only seeing this patch from the series. Looks like somehow the
> rest didn't made it into my inbox.

https://lists.freedesktop.org/archives/dri-devel/2021-May/307561.html

Not sure why it didn't make your mail.  This one was CC'd to you
because you're the author from a year ago or something.

--Jason


> > ---
> >   drivers/dma-buf/dma-fence-array.c | 27 +++
> >   include/linux/dma-fence-array.h   | 17 +
> >   2 files changed, 44 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-fence-array.c 
> > b/drivers/dma-buf/dma-fence-array.c
> > index d3fbd950be944..2ac1afc697d0f 100644
> > --- a/drivers/dma-buf/dma-fence-array.c
> > +++ b/drivers/dma-buf/dma-fence-array.c
> > @@ -201,3 +201,30 @@ bool dma_fence_match_context(struct dma_fence *fence, 
> > u64 context)
> >   return true;
> >   }
> >   EXPORT_SYMBOL(dma_fence_match_context);
> > +
> > +struct dma_fence *dma_fence_array_first(struct dma_fence *head)
> > +{
> > + struct dma_fence_array *array;
> > +
> > + if (!head)
> > + return NULL;
> > +
> > + array = to_dma_fence_array(head);
> > + if (!array)
> > + return head;
> > +
> > + return array->fences[0];
> > +}
> > +EXPORT_SYMBOL(dma_fence_array_first);
> > +
> > +struct dma_fence *dma_fence_array_next(struct dma_fence *head,
> > +unsigned int index)
> > +{
> > + struct dma_fence_array *array = to_dma_fence_array(head);
> > +
> > + if (!array || index >= array->num_fences)
> > + return NULL;
> > +
> > + return array->fences[index];
> > +}
> > +EXPORT_SYMBOL(dma_fence_array_next);
> > diff --git a/include/linux/dma-fence-array.h 
> > b/include/linux/dma-fence-array.h
> > index 303dd712220fd..588ac8089dd61 100644
> > --- a/include/linux/dma-fence-array.h
> > +++ b/include/linux/dma-fence-array.h
> > @@ -74,6 +74,19 @@ to_dma_fence_array(struct dma_fence *fence)
> >   return container_of(fence, struct dma_fence_array, base);
> >   }
> >
> > +/**
> > + * dma_fence_array_for_each - iterate over all fences in array
> > + * @fence: current fence
> > + * @index: index into the array
> > + * @head: potential dma_fence_array object
> > + *
> > + * Test if @array is a dma_fence_array object and if yes iterate over all 
> > fences
> > + * in the array. If not just iterate over the fence in @array itself.
> > + */
> > +#define dma_fence_array_for_each(fence, index, head) \
> > + for (index = 0, fence = dma_fence_array_first(head); fence; \
> > +  ++(index), fence = dma_fence_array_next(head, index))
> > +
> >   struct dma_fence_array *dma_fence_array_create(int num_fences,
> >  struct dma_fence **fences,
> >  u64 context, unsigned seqno,
> > @@ -81,4 +94,8 @@ struct dma_fence_array *dma_fence_array_create(int 
> > num_fences,
> >
> >   bool dma_fence_match_context(struct dma_fence *fence, u64 context);
> >
> > +struct dma_fence *dma_fence_array_first(struct dma_fence *head);
> > +struct dma_fence *dma_fence_array_next(struct dma_fence *head,
> > +unsigned int index);
> > +
> >   #endif /* __LINUX_DMA_FENCE_ARRAY_H */
>

Re: [Intel-gfx] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-21 Thread Matthew Brost

On Thu, May 20, 2021 at 09:44:59PM +0200, Daniel Vetter wrote:
> On Thu, May 20, 2021 at 08:10:59AM -0700, Matthew Brost wrote:
> > On Thu, May 20, 2021 at 11:54:25AM +0200, Daniel Vetter wrote:
> > > On Wed, May 19, 2021 at 7:19 PM Matthew Brost  
> > > wrote:
> > > >
> > > > On Wed, May 19, 2021 at 01:10:04PM +0200, Daniel Vetter wrote:
> > > > > On Tue, May 18, 2021 at 04:58:30PM -0700, Matthew Brost wrote:
> > > > > > Add entry fpr i915 new parallel submission uAPI plan.
> > > > > >
> > > > > > v2:
> > > > > >  (Daniel Vetter):
> > > > > >   - Expand logical order explaination
> > > > > >   - Add dummy header
> > > > > >   - Only allow N BBs in execbuf IOCTL
> > > > > >   - Configure parallel submission per slot not per gem context
> > > > > >
> > > > > > Cc: Tvrtko Ursulin 
> > > > > > Cc: Tony Ye 
> > > > > > CC: Carl Zhang 
> > > > > > Cc: Daniel Vetter 
> > > > > > Cc: Jason Ekstrand 
> > > > > > Signed-off-by: Matthew Brost 
> > > > > > ---
> > > > > >  Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 
> > > > > > ++
> > > > > >  Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
> > > > > >  2 files changed, 196 insertions(+), 1 deletion(-)
> > > > > >  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > > >
> > > > > > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > > > > > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > > > new file mode 100644
> > > > > > index ..8c64b983ccad
> > > > > > --- /dev/null
> > > > > > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > > > @@ -0,0 +1,144 @@
> > > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > > > > i915_context_engines_parallel_submit */
> > > > > > +
> > > > > > +/*
> > > > > > + * i915_context_engines_parallel_submit:
> > > > > > + *
> > > > > > + * Setup a slot to allow multiple BBs to be submitted in a single 
> > > > > > execbuf IOCTL.
> > > > > > + * Those BBs will then be scheduled to run on the GPU in parallel. 
> > > > > > Multiple
> > > > > > + * hardware contexts are created internally in the i915 run these 
> > > > > > BBs. Once a
> > > > > > + * slot is configured for N BBs only N BBs can be submitted in 
> > > > > > each execbuf
> > > > > > + * IOCTL and this is implict behavior (e.g. the user doesn't tell 
> > > > > > the execbuf
> > > > > > + * IOCTL there are N BBs, the execbuf IOCTL know how many BBs 
> > > > > > there are based on
> > > > > > + * the slots configuration).
> > > > > > + *
> > > > > > + * Their are two currently defined ways to control the placement 
> > > > > > of the
> > > > > > + * hardware contexts on physical engines: default behavior (no 
> > > > > > flags) and
> > > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added 
> > > > > > the in the
> > > > > > + * future as new hardware / use cases arise. Details of how to use 
> > > > > > this
> > > > > > + * interface below above the flags.
> > > > > > + *
> > > > > > + * Returns -EINVAL if hardware context placement configuration 
> > > > > > invalid or if the
> > > > > > + * placement configuration isn't supported on the platform / 
> > > > > > submission
> > > > > > + * interface.
> > > > > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > > > > submission
> > > > > > + * inteface.
> > > > > > + */
> > > > > > +struct i915_context_engines_parallel_submit {
> > > > > > +   struct i915_user_extension base;
> > > > > > +
> > > > > > +   __u16 engine_index; /* slot for parallel engine */
> > > > > > +   __u16 width;/* number of contexts per parallel 
> > > > > > engine */
> > > > > > +   __u16 num_siblings; /* number of siblings per context */
> > > > > > +   __u16 mbz16;
> > > > >
> > > > > Ok the big picture looks reasonable now, the flags still confuse me.
> > > > >
> > > >
> > > > Yea, it is a bit confusing.
> > > >
> > > > > > +/*
> > > > > > + * Default placement behvavior (currently unsupported):
> > > > > > + *
> > > > > > + * Rather than restricting parallel submission to a single class 
> > > > > > with a
> > > > > > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), 
> > > > > > add a mode that
> > > > > > + * enables parallel submission across multiple engine classes. In 
> > > > > > this case each
> > > > > > + * context's logical engine mask indicates where that context can 
> > > > > > placed. It is
> > > > > > + * implied in this mode that all contexts have mutual exclusive 
> > > > > > placement (e.g.
> > > > > > + * if one context is running CS0 no other contexts can run on CS0).
> > > > > > + *
> > > > > > + * Example 1 pseudo code:
> > > > > > + * CSX[Y] = engine class X, logical instance Y
> > > > > > + * INVALID = I915_ENGINE_CLASS_INVALID, 
> > > > > > I915_ENGINE_CLASS_INVALID_NONE
> > > > > > + * set_engines(INVALID)
> > > > > > + * set_parallel(engine_index=0, width=2, num_siblings=2,
> > > > > > + *

Re: [Intel-gfx] [Mesa-dev] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-21 Thread Matthew Brost

On Fri, May 21, 2021 at 10:35:37AM +0200, Christian König wrote:
> Am 20.05.21 um 23:38 schrieb Jason Ekstrand:
> > On Thu, May 20, 2021 at 10:46 AM Matthew Brost  
> > wrote:
> > > On Thu, May 20, 2021 at 01:11:59PM +0200, Christian König wrote:
> > > > Am 19.05.21 um 18:51 schrieb Matthew Brost:
> > > > > On Wed, May 19, 2021 at 01:45:39PM +0200, Christian König wrote:
> > > > > > Oh, yeah we call that gang submit on the AMD side.
> > > > > > 
> > > > > > Had already some internal discussions how to implement this, but so 
> > > > > > far
> > > > > > couldn't figure out how to cleanly introduce that into the DRM 
> > > > > > scheduler.
> > > > > > 
> > > > > > Can you briefly describe in a few words how that is supposed to 
> > > > > > work on the
> > > > > > Intel side?
> > On Intel, we actually have two cases which don't fit the current
> > drm/scheduler model well: balanced and bonded.
> > 
> > In the balanced model, we want to submit a batch which can go to any
> > one of some set of engines and we don't care which.  It's up to the
> > kernel to pick an engine.  Imagine you had 64 identical HW compute
> > queues, for instance.  This could be done by making all the identical
> > engines share a single drm_gpu_scheduler and round-robin around the HW
> > queues or something.  I don't know that we strictly need drm/scheduler
> > to be aware of it but it might be nice if it grew support for this
> > mode so we could maintain a 1:1 relationship between HW queues and
> > drm_gpu_schedulers.  That said, I'm not sure how this would play with
> > GuC queues so maybe it doesn't help?
> 
> Oh, we do have support for load balancing like that.
> 
> When you call drm_sched_entity_init() you can give a list of
> drm_gpu_scheduler object to use round robing for scheduling.
> 
> New jobs are then scheduler to the drm_gpu_scheduler instance which is idle
> or rather the least busy one.
> 
> > The bonded model is like your ganged, I think.  We want to submit N
> > batches to run in parallel.  And they actually have to be executing on
> > the GPU simultaneously and not just sort-of at similar times.  We need
> > this for video.  There are also potential use-cases in Vulkan or even
> > GL that might be able to use this.  One difference with the balanced
> > mode is that bonds don't, strictly speaking, need to be on the same
> > type of engine.  Imagine, for instance, a 3D batch with a parallel
> > compute batch doing vertex pre-processing.
> > 
> > I'm pretty sure the bonded case is something that the mobile drivers
> > (panfrost, etc.) would like as well for doing Vulkan on tilers where
> > you often have to have two command buffers running in parallel.
> > They're currently doing it by submitting a giant pile of batches where
> > they split the batch and add sync primitives every time some GL call
> > requires them to sync between fragment and vertex pipes.
> 
> Yeah, we have exactly the same problem as well.
> 
> But so far every model we discussed has some drawbacks and it is rather hard
> for the scheduler to guarantee that stuff runs at the same time.
> 
> So if you got any ideas how to cleanly implement that then they would be
> rather welcomed.
> 

Everything Jason said about our submission modes is correct for execlists, we
have balanced + bonded models which is tightly coupled with that backend.

Fortunately with GuC submission most of this complexity goes away as the GuC
handles this for us. e.g. For balanced when we register a context we just give
it a mask of which physical engines a context is allowed to run on. For parallel
we register N contexts in a single command with the placement information +
submit to all the contexts with a command which moves the tails in LRCs for us.
We really don't need to bake any of this into the DRM scheduler for GuC
submission. 

Execlists is different story but I think our plan is to do the minimum possible
to plum that into the DRM scheduler (e.g. leave the balanced / bonded code in
the backend). Could we update the DRM scheduler to understand balanced (seems
like already does to some extent) and bonded? Yes we could but IMO the ROI on
that is low for Intel. The DRM scheduler is quite clean at the moment compared
to our near incomprehensible execlist backend. Execlists are basically a legacy
interface so pushing features which are only required for that backend to the
DRM scheduler doesn't make sense to me. That being said, if this is something
AMD needs in the DRM scheduler of course we can support you.

Matt

> Regards,
> Christian.
> 
> > 
> > So, to sum up, I think there's likely some good collaboration to be
> > had here for everyone. :-)
> > 
> > --Jason
> > 
> > > > > Sure, I've done a quick PoC internally and have been able to hook this
> > > > > into the DRM scheduler.
> > > > > 
> > > > > Basically each BB still maps to a single job as each job is somewhat
> > > > > unique (e.g. each job has its own ring, lrc, seqno, etc...). However 
> > > > > all
> > > > > the

Re: [PATCH v17 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 05:27 CDT 2021, Krishna Manikandan wrote:
> diff --git a/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml 
> b/Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml
[..]
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +description: |
> +  Contains the list of output ports from DPU device. These ports
> +  connect to interfaces that are external to the DPU hardware,
> +  such as DSI, DP etc. Each output port contains an endpoint that
> +  describes how it is connected to an external interface.
> +
> +properties:
> +  port@0:
> +$ref: /schemas/graph.yaml#/properties/port
> +description: DPU_INTF1 (DSI1)
> +
> +  port@2:
> +$ref: /schemas/graph.yaml#/properties/port
> +description: DPU_INTF0 (DP)

Why is port@0 INTF1 and why is port@2 INTF0? In the binding you're
translating the two ports that are described are 0 and 1, representing
INTF1 and INTF2, or DSI1 and DSI2, respectively.

Further more, I have a need for somehow describing the pairing of 4 DP
INTFs (INTF 0, 3, 4 and 5) and how they are connected to the 3+1 DP+eDP
controllers.

Downstream this seems to be handled by adding cell-index to the DP
controllers and then matching that against the numbering in the driver's
INTF array. But rather than adding cell-index to map this, can't we
define that the port index is the INTF-number here?


This would obviously break compatibility with existing DTBs, but we
could start by doing it selectively for the new compatibles, fix up the
existing dts files and then drop the selective application after 1 or 2
LTS releases.

> +
> +required:
> +  - port@0

Does this imply that I am not allowed to build a product on sc7180 that
only has DP output?

> +
> +required:
> +  - compatible
> +  - reg
> +  - reg-names
> +  - clocks
> +  - interrupts
> +  - power-domains
> +  - operating-points-v2
> +  - ports
> +
> +required:
> +  - compatible
> +  - reg
> +  - reg-names
> +  - power-domains
> +  - clocks
> +  - interrupts
> +  - interrupt-controller
> +  - iommus
> +  - ranges
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +display-subsystem@ae0 {
> + #address-cells = <1>;
> + #size-cells = <1>;
> + compatible = "qcom,sc7180-mdss";
> + reg = <0xae0 0x1000>;
> + reg-names = "mdss";
> + power-domains = < MDSS_GDSC>;
> + clocks = < GCC_DISP_AHB_CLK>,
> +  < DISP_CC_MDSS_AHB_CLK>,
> +  < DISP_CC_MDSS_MDP_CLK>;
> + clock-names = "iface", "ahb", "core";
> +
> + interrupts = ;
> + interrupt-controller;
> + #interrupt-cells = <1>;
> +
> + interconnects = <_noc MASTER_MDP0 _virt SLAVE_EBI1>;
> + interconnect-names = "mdp0-mem";
> +
> + iommus = <_smmu 0x800 0x2>;
> + ranges;
> +
> + display-controller@ae01000 {
> +   compatible = "qcom,sc7180-dpu";
> +   reg = <0x0ae01000 0x8f000>,
> + <0x0aeb 0x2008>;
> +
> +   reg-names = "mdp", "vbif";
> +
> +   clocks = < GCC_DISP_HF_AXI_CLK>,
> +< DISP_CC_MDSS_AHB_CLK>,
> +< DISP_CC_MDSS_ROT_CLK>,
> +< DISP_CC_MDSS_MDP_LUT_CLK>,
> +< DISP_CC_MDSS_MDP_CLK>,
> +< DISP_CC_MDSS_VSYNC_CLK>;
> +   clock-names = "bus", "iface", "rot", "lut", "core",
> + "vsync";
> +
> +   interrupt-parent = <>;
> +   interrupts = <0>;
> +   power-domains = < SC7180_CX>;
> +   operating-points-v2 = <_opp_table>;
> +
> +   ports {
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +
> +   port@0 {
> +   reg = <0>;
> +   dpu_intf1_out: endpoint {
> +  remote-endpoint = 
> <_in>;
> +   };
> +   };
> +
> +port@2 {
> +reg = <2>;
> +dpu_intf0_out: endpoint {
> +   remote-endpoint = 
> <_in>;
> +};
> +};

The indentation is inconsistent among the ports.

> +   };
> + };
> +};
> +...
> diff --git a/Documentation/devicetree/bindings/display/msm/dpu-sdm845.yaml 
>

Re: [PATCH] drm/fb-helper: improve DRM fbdev emulation device names

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 03:19:10PM +0200, Javier Martinez Canillas wrote:
> Framebuffer devices that are registered by DRM drivers for fbdev emulation
> have a "drmfb" suffix in their name. But makes them to be quite confusing
> if a driver already has "drm" in its name:
> 
> $ cat /proc/fb
> 0 rockchipdrmdrmfb
> 
> $ cat /proc/fb
> 0 simpledrmdrmfb
> 
> Instead, let's just add a "-fb" suffix to denote that are DRM drivers FB:
> 
> $ cat /proc/fb
> 0 rockchipdrm-fb
> 
> $ cat /proc/fb
> 0 simpledrm-fb
> 
> Suggested-by: Peter Robinson 
> Signed-off-by: Javier Martinez Canillas 

So what with all the drivers which do _not_ have drm in their name? Also
I'm never sure how much these are uapi or not ...
-Daniel

> ---
> 
>  drivers/gpu/drm/drm_fb_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index f6baa204612..bbaff92c509 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -1737,7 +1737,7 @@ void drm_fb_helper_fill_info(struct fb_info *info,
>  sizes->fb_width, sizes->fb_height);
>  
>   info->par = fb_helper;
> - snprintf(info->fix.id, sizeof(info->fix.id), "%sdrmfb",
> + snprintf(info->fix.id, sizeof(info->fix.id), "%s-fb",
>fb_helper->dev->driver->name);
>  
>  }
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH v3 12/12] drm/i915/lmem: Verify checks for lmem residency

2021-05-21 Thread Thomas Hellström

Since objects can be migrated or evicted when not pinned or locked,
update the checks for lmem residency or future residency so that
the value returned is not immediately stale.

Signed-off-by: Thomas Hellström 
Reviewed-by: Matthew Auld 
---
v2: Simplify i915_gem_object_migratable() (Reported by Mattew Auld)
---
 drivers/gpu/drm/i915/display/intel_display.c |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 42 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.c   | 18 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |  4 ++
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index de1f13d203b5..b95def2d5af3 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11615,7 +11615,7 @@ intel_user_framebuffer_create(struct drm_device *dev,
 
/* object is backed with LMEM for discrete */
i915 = to_i915(obj->base.dev);
-   if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
+   if (HAS_LMEM(i915) && !i915_gem_object_validates_to_lmem(obj)) {
/* object is "remote", not in local memory */
i915_gem_object_put(obj);
return ERR_PTR(-EREMOTE);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 2b8cd15de1d9..d539dffa1554 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -23,10 +23,50 @@ i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
return io_mapping_map_wc(>mm.region->iomap, offset, size);
 }
 
+/**
+ * i915_gem_object_validates_to_lmem - Whether the object is resident in
+ * lmem when pages are present.
+ * @obj: The object to check.
+ *
+ * Migratable objects residency may change from under us if the object is
+ * not pinned or locked. This function is intended to be used to check whether
+ * the object can only reside in lmem when pages are present.
+ *
+ * Return: Whether the object is always resident in lmem when pages are
+ * present.
+ */
+bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object *obj)
+{
+   struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
+
+   return !i915_gem_object_migratable(obj) &&
+   mr && (mr->type == INTEL_MEMORY_LOCAL ||
+  mr->type == INTEL_MEMORY_STOLEN_LOCAL);
+}
+
+/**
+ * i915_gem_object_is_lmem - Whether the object is resident in
+ * lmem
+ * @obj: The object to check.
+ *
+ * Even if an object is allowed to migrate and change memory region,
+ * this function checks whether it will always be present in lmem when
+ * valid *or* if that's not the case, whether it's currently resident in lmem.
+ * For migratable and evictable objects, the latter only makes sense when
+ * the object is locked.
+ *
+ * Return: Whether the object migratable but resident in lmem, or not
+ * migratable and will be present in lmem when valid.
+ */
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
-   struct intel_memory_region *mr = obj->mm.region;
+   struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
 
+#ifdef CONFIG_LOCKDEP
+   if (i915_gem_object_migratable(obj) &&
+   i915_gem_object_evictable(obj))
+   assert_object_held(obj);
+#endif
return mr && (mr->type == INTEL_MEMORY_LOCAL ||
  mr->type == INTEL_MEMORY_STOLEN_LOCAL);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index df2b4e6b9bcc..c8bb6fb1dba3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -458,6 +458,24 @@ bool i915_gem_object_evictable(struct drm_i915_gem_object 
*obj)
return pin_count == 0;
 }
 
+/**
+ * i915_gem_object_migratable - Whether the object is migratable out of the
+ * current region.
+ * @obj: Pointer to the object.
+ *
+ * Return: Whether the object is allowed to be resident in other
+ * regions than the current while pages are present.
+ */
+bool i915_gem_object_migratable(struct drm_i915_gem_object *obj)
+{
+   struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
+
+   if (!mr)
+   return false;
+
+   return obj->mm.n_placements > 1;
+}
+
 void i915_gem_init__objects(struct drm_i915_private *i915)
 {
INIT_WORK(>mm.free_work, __i915_gem_free_work);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index ae5930e307d5..a3ad8cf4eefd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -596,6 +596,10 @@ void __i915_gem_free_object(struct drm_i915_gem_object 
*obj);
 
 bool i915_gem_object_evictable(struct drm_i915_gem_object *obj);
 
+bool i915_gem_object_migratable(struct drm_i915_gem_object *obj);
+
+bool

[PATCH v3 11/12] drm/i915/ttm: Introduce a TTM i915 gem object backend

2021-05-21 Thread Thomas Hellström

Most logical place to introduce TTM buffer objects is as an i915
gem object backend. We need to add some ops to account for added
functionality like delayed delete and LRU list manipulation.

Initially we support only LMEM and SYSTEM memory, but SYSTEM
(which in this case means evicted LMEM objects) is not
visible to i915 GEM yet. The plan is to move the i915 gem system region
over to the TTM system memory type in upcoming patches.

We set up GPU bindings directly both from LMEM and from the system region,
as there is no need to use the legacy TTM_TT memory type. We reserve
that for future porting of GGTT bindings to TTM.

Remove the old lmem backend.

Signed-off-by: Thomas Hellström 
Reviewed-by: Matthew Auld 
---
v2:
- Break out needed TTM functionality to a separate patch (Reported by
Christian König).
- Fix an unhandled error (Reported by Matthew Auld and Maarten Lankhorst)
- Remove a stray leftover sg_table allocation (Reported by Matthew Auld)
- Use ttm_tt_unpopulate() rather than ttm_tt_destroy() in the purge path
  as some TTM functionality relies on having a ttm_tt present for !is_iomem.
v3:
- Use ttm_bo_type_device for userspace visible objects so that TTM can
  allocate an address space offset for mmap'ing.
- Fix up the destruction path (Reported by Matthew Auld)
- Use ttm_bo_validate() for purging (Reported by Christian König)
- Create ttm_tts write-combined as they are currently for eviction only and
  we want to maintain consistent write-combined caching for bos that are
  not in system only. (Suggested by Daniel Vetter)
- Make struct ttm_placements static.
- Add the ttm device funcs/ops to i915_gem_ttm.h for the region code.
- Rename new->dst and old->src. Check for swapin in the move callback.
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c|   9 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c  |  84 ---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h  |   5 -
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 125 +++--
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   9 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  27 +-
 drivers/gpu/drm/i915/gem/i915_gem_region.c|   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 531 ++
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h   |  50 ++
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |   3 +-
 drivers/gpu/drm/i915/i915_gem.c   |   5 +-
 drivers/gpu/drm/i915/intel_memory_region.c|   1 -
 drivers/gpu/drm/i915/intel_memory_region.h|   1 -
 drivers/gpu/drm/i915/intel_region_ttm.c   |   6 +-
 drivers/gpu/drm/i915/intel_region_ttm.h   |   8 +-
 16 files changed, 717 insertions(+), 154 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 998606b7f49f..2022e763f8f7 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -154,6 +154,7 @@ gem-y += \
gem/i915_gem_stolen.o \
gem/i915_gem_throttle.o \
gem/i915_gem_tiling.o \
+   gem/i915_gem_ttm.o \
gem/i915_gem_userptr.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 548ddf39d853..93bf63bbaff1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -85,13 +85,10 @@ i915_gem_setup(struct drm_i915_gem_object *obj, u64 size)
return -E2BIG;
 
/*
-* For now resort to CPU based clearing for device local-memory, in the
-* near future this will use the blitter engine for accelerated, GPU
-* based clearing.
+* I915_BO_ALLOC_USER will make sure the object is cleared before
+* any user access.
 */
-   flags = 0;
-   if (mr->type == INTEL_MEMORY_LOCAL)
-   flags = I915_BO_ALLOC_CPU_CLEAR;
+   flags = I915_BO_ALLOC_USER;
 
ret = mr->ops->init_object(mr, obj, size, flags);
if (ret)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 3b4aa28a076d..2b8cd15de1d9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -4,74 +4,10 @@
  */
 
 #include "intel_memory_region.h"
-#include "intel_region_ttm.h"
 #include "gem/i915_gem_region.h"
 #include "gem/i915_gem_lmem.h"
 #include "i915_drv.h"
 
-static void lmem_put_pages(struct drm_i915_gem_object *obj,
-  struct sg_table *pages)
-{
-   intel_region_ttm_node_free(obj->mm.region, obj->mm.st_mm_node);
-   obj->mm.dirty = false;
-   sg_free_table(pages);
-   kfree(pages);
-}
-
-static int lmem_get_pages(struct drm_i915_gem_object *obj)
-{
-   unsigned int flags;
-   struct sg_table *pages;
-
-   flags =

[PATCH v3 09/12] drm/ttm: Document and optimize ttm_bo_pipeline_gutting()

2021-05-21 Thread Thomas Hellström

If the bo is idle when calling ttm_bo_pipeline_gutting(), we unnecessarily
create a ghost object and push it out to delayed destroy.
Fix this by adding a path for idle, and document the function.

Also avoid having the bo end up in a bad state vulnerable to user-space
triggered kernel BUGs if the call to ttm_tt_create() fails.

Finally reuse ttm_bo_pipeline_gutting() in ttm_bo_evict().

Cc: Christian König 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/ttm/ttm_bo.c  | 20 ++--
 drivers/gpu/drm/ttm/ttm_bo_util.c | 52 ---
 drivers/gpu/drm/ttm/ttm_tt.c  |  5 +++
 include/drm/ttm/ttm_tt.h  | 10 ++
 4 files changed, 73 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index ca1b098b6a56..a8fa3375b8aa 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -501,10 +501,15 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
bdev->funcs->evict_flags(bo, );
 
if (!placement.num_placement && !placement.num_busy_placement) {
-   ttm_bo_wait(bo, false, false);
+   ret = ttm_bo_wait(bo, true, false);
+   if (ret)
+   return ret;
 
-   ttm_bo_cleanup_memtype_use(bo);
-   return ttm_tt_create(bo, false);
+   /*
+* Since we've already synced, this frees backing store
+* immediately.
+*/
+   return ttm_bo_pipeline_gutting(bo);
}
 
ret = ttm_bo_mem_space(bo, , _mem, ctx);
@@ -974,13 +979,8 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
/*
 * Remove the backing store if no placement is given.
 */
-   if (!placement->num_placement && !placement->num_busy_placement) {
-   ret = ttm_bo_pipeline_gutting(bo);
-   if (ret)
-   return ret;
-
-   return ttm_tt_create(bo, false);
-   }
+   if (!placement->num_placement && !placement->num_busy_placement)
+   return ttm_bo_pipeline_gutting(bo);
 
/*
 * Check whether we need to move buffer.
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 4a7d3d672f9a..7fa9b3a852eb 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -585,26 +585,70 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object 
*bo,
 }
 EXPORT_SYMBOL(ttm_bo_move_accel_cleanup);
 
+/**
+ * ttm_bo_pipeline_gutting - purge the contents of a bo
+ * @bo: The buffer object
+ *
+ * Purge the contents of a bo, async if the bo is not idle.
+ * After a successful call, the bo is left unpopulated in
+ * system placement. The function may wait uninterruptible
+ * for idle on OOM.
+ *
+ * Return: 0 if successful, negative error code on failure.
+ */
 int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 {
static const struct ttm_place sys_mem = { .mem_type = TTM_PL_SYSTEM };
struct ttm_buffer_object *ghost;
+   struct ttm_tt *ttm;
int ret;
 
-   ret = ttm_buffer_object_transfer(bo, );
+   /* If already idle, no need for ghost object dance. */
+   ret = ttm_bo_wait(bo, false, true);
+   if (ret != -EBUSY) {
+   if (!bo->ttm) {
+   ret = ttm_tt_create(bo, true);
+   if (ret)
+   return ret;
+   } else {
+   ttm_tt_unpopulate(bo->bdev, bo->ttm);
+   if (bo->type == ttm_bo_type_device)
+   ttm_tt_mark_for_clear(bo->ttm);
+   }
+   ttm_resource_free(bo, >mem);
+   ttm_resource_alloc(bo, _mem, >mem);
+
+   return 0;
+   }
+
+   /*
+* We need an unpopulated ttm_tt after giving our current one,
+* if any, to the ghost object. And we can't afford to fail
+* creating one *after* the operation.
+*/
+
+   ttm = bo->ttm;
+   bo->ttm = NULL;
+   ret = ttm_tt_create(bo, true);
+   swap(bo->ttm, ttm);
if (ret)
return ret;
 
+   ret = ttm_buffer_object_transfer(bo, );
+   if (ret) {
+   ttm_tt_destroy(bo->bdev, ttm);
+   return ret;
+   }
+
ret = dma_resv_copy_fences(>base._resv, bo->base.resv);
/* Last resort, wait for the BO to be idle when we are OOM */
if (ret)
ttm_bo_wait(bo, false, false);
 
-   ttm_resource_alloc(bo, _mem, >mem);
-   bo->ttm = NULL;
-
dma_resv_unlock(>base._resv);
ttm_bo_put(ghost);
+   bo->ttm = ttm;
+   ttm_resource_alloc(bo, _mem, >mem);
 
return 0;
 }
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 0e41227116b1..913b330a234b 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -134,6 +134,11 @@ void

[PATCH v3 06/12] drm/ttm: Add a generic TTM memcpy move for page-based iomem

2021-05-21 Thread Thomas Hellström

The internal ttm_bo_util memcpy uses ioremap functionality, and while it
probably might be possible to use it for copying in- and out of
sglist represented io memory, using io_mem_reserve() / io_mem_free()
callbacks, that would cause problems with fault().
Instead, implement a method mapping page-by-page using kmap_local()
semantics. As an additional benefit we then avoid the occasional global
TLB flushes of ioremap() and consuming ioremap space, elimination of a
critical point of failure and with a slight change of semantics we could
also push the memcpy out async for testing and async driver development
purposes.

A special linear iomem iterator is introduced internally to mimic the
old ioremap behaviour for code-paths that can't immediately be ported
over. This adds to the code size and should be considered a temporary
solution.

Looking at the code we have a lot of checks for iomap tagged pointers.
Ideally we should extend the core memremap functions to also accept
uncached memory and kmap_local functionality. Then we could strip a
lot of code.

Cc: Christian König 
Signed-off-by: Thomas Hellström 
---
v3:
- Split up in various TTM files and addressed review comments by
  Christian König. Tested and fixed legacy iomap memcpy path on i915.
---
 drivers/gpu/drm/ttm/ttm_bo_util.c  | 278 ++---
 drivers/gpu/drm/ttm/ttm_module.c   |  35 
 drivers/gpu/drm/ttm/ttm_resource.c | 166 +
 drivers/gpu/drm/ttm/ttm_tt.c   |  42 +
 include/drm/ttm/ttm_bo_driver.h|  28 +++
 include/drm/ttm/ttm_caching.h  |   2 +
 include/drm/ttm/ttm_kmap_iter.h|  61 +++
 include/drm/ttm/ttm_resource.h |  61 +++
 include/drm/ttm/ttm_tt.h   |  16 ++
 9 files changed, 508 insertions(+), 181 deletions(-)
 create mode 100644 include/drm/ttm/ttm_kmap_iter.h

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index ae8b61460724..912cbe8e60a2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -72,190 +72,126 @@ void ttm_mem_io_free(struct ttm_device *bdev,
mem->bus.addr = NULL;
 }
 
-static int ttm_resource_ioremap(struct ttm_device *bdev,
-  struct ttm_resource *mem,
-  void **virtual)
+/**
+ * ttm_move_memcpy - Helper to perform a memcpy ttm move operation.
+ * @bo: The struct ttm_buffer_object.
+ * @new_mem: The struct ttm_resource we're moving to (copy destination).
+ * @new_iter: A struct ttm_kmap_iter representing the destination resource.
+ * @src_iter: A struct ttm_kmap_iter representing the source resource.
+ *
+ * This function is intended to be able to move out async under a
+ * dma-fence if desired.
+ */
+void ttm_move_memcpy(struct ttm_buffer_object *bo,
+struct ttm_resource *dst_mem,
+struct ttm_kmap_iter *dst_iter,
+struct ttm_kmap_iter *src_iter)
 {
-   int ret;
-   void *addr;
-
-   *virtual = NULL;
-   ret = ttm_mem_io_reserve(bdev, mem);
-   if (ret || !mem->bus.is_iomem)
-   return ret;
+   const struct ttm_kmap_iter_ops *dst_ops = dst_iter->ops;
+   const struct ttm_kmap_iter_ops *src_ops = src_iter->ops;
+   struct ttm_tt *ttm = bo->ttm;
+   struct dma_buf_map src_map, dst_map;
+   pgoff_t i;
 
-   if (mem->bus.addr) {
-   addr = mem->bus.addr;
-   } else {
-   size_t bus_size = (size_t)mem->num_pages << PAGE_SHIFT;
+   /* Single TTM move. NOP */
+   if (dst_ops->maps_tt && src_ops->maps_tt)
+   return;
 
-   if (mem->bus.caching == ttm_write_combined)
-   addr = ioremap_wc(mem->bus.offset, bus_size);
-#ifdef CONFIG_X86
-   else if (mem->bus.caching == ttm_cached)
-   addr = ioremap_cache(mem->bus.offset, bus_size);
-#endif
-   else
-   addr = ioremap(mem->bus.offset, bus_size);
-   if (!addr) {
-   ttm_mem_io_free(bdev, mem);
-   return -ENOMEM;
+   /* Don't move nonexistent data. Clear destination instead. */
+   if (src_ops->maps_tt && (!ttm || !ttm_tt_is_populated(ttm))) {
+   if (ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))
+   return;
+
+   for (i = 0; i < dst_mem->num_pages; ++i) {
+   dst_ops->map_local(dst_iter, _map, i);
+   if (dst_map.is_iomem)
+   memset_io(dst_map.vaddr_iomem, 0, PAGE_SIZE);
+   else
+   memset(dst_map.vaddr, 0, PAGE_SIZE);
+   if (dst_ops->unmap_local)
+   dst_ops->unmap_local(dst_iter, _map);
}
+   return;
}
-   *virtual = addr;
-   return 0;
-}
-
-static void ttm_resource_iounmap(struct ttm_device

[PATCH v3 10/12] drm/ttm, drm/amdgpu: Allow the driver some control over swapping

2021-05-21 Thread Thomas Hellström

We are calling the eviction_valuable driver callback at eviction time to
determine whether we actually can evict a buffer object.
The upcoming i915 TTM backend needs the same functionality for swapout,
and that might actually be beneficial to other drivers as well.

Add an eviction_valuable call also in the swapout path. Try to keep the
current behaviour for all drivers by returning true if the buffer object
is already in the TTM_PL_SYSTEM placement. We change behaviour for the
case where a buffer object is in a TT backed placement when swapped out,
in which case the drivers normal eviction_valuable path is run.

Finally make sure we don't try to swapout a bo that was recently purged
and therefore unpopulated.

Reviewed-by: Maarten Lankhorst 
Cc: Christian König 
Signed-off-by: Thomas Hellström 
---
v3:
- Don't export ttm_tt_unpopulate
- Fix confusion reading the locked pointer instead of the value
  pointed to in ttm_bo_evict_swapout_allowable (Reported by
  Maarten Lankhorst)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  4 +++
 drivers/gpu/drm/ttm/ttm_bo.c| 43 -
 drivers/gpu/drm/ttm/ttm_tt.c|  3 ++
 3 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8c7ec09eb1a4..d5a9d7a88315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1399,6 +1399,10 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
ttm_buffer_object *bo,
struct dma_fence *f;
int i;
 
+   /* Swapout? */
+   if (bo->mem.mem_type == TTM_PL_SYSTEM)
+   return true;
+
if (bo->type == ttm_bo_type_kernel &&
!amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
return false;
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index a8fa3375b8aa..f45486837b44 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -536,6 +536,10 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
 bool ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
  const struct ttm_place *place)
 {
+   dma_resv_assert_held(bo->base.resv);
+   if (bo->mem.mem_type == TTM_PL_SYSTEM)
+   return true;
+
/* Don't evict this BO if it's outside of the
 * requested placement range
 */
@@ -558,7 +562,9 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  * b. Otherwise, trylock it.
  */
 static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
-   struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
+  struct ttm_operation_ctx *ctx,
+  const struct ttm_place *place,
+  bool *locked, bool *busy)
 {
bool ret = false;
 
@@ -576,6 +582,14 @@ static bool ttm_bo_evict_swapout_allowable(struct 
ttm_buffer_object *bo,
*busy = !ret;
}
 
+   if (ret && place && !bo->bdev->funcs->eviction_valuable(bo, place)) {
+   ret = false;
+   if (*locked) {
+   dma_resv_unlock(bo->base.resv);
+   *locked = false;
+   }
+   }
+
return ret;
 }
 
@@ -630,20 +644,14 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
list_for_each_entry(bo, >lru[i], lru) {
bool busy;
 
-   if (!ttm_bo_evict_swapout_allowable(bo, ctx, ,
-   )) {
+   if (!ttm_bo_evict_swapout_allowable(bo, ctx, place,
+   , )) {
if (busy && !busy_bo && ticket !=
dma_resv_locking_ctx(bo->base.resv))
busy_bo = bo;
continue;
}
 
-   if (place && !bdev->funcs->eviction_valuable(bo,
- place)) {
-   if (locked)
-   dma_resv_unlock(bo->base.resv);
-   continue;
-   }
if (!ttm_bo_get_unless_zero(bo)) {
if (locked)
dma_resv_unlock(bo->base.resv);
@@ -1138,10 +1146,18 @@ EXPORT_SYMBOL(ttm_bo_wait);
 int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
   gfp_t gfp_flags)
 {
+   struct ttm_place place = {};
bool locked;
int ret;
 
-   if (!ttm_bo_evict_swapout_allowable(bo, ctx, , NULL))
+   /*
+* While the bo may already reside in SYSTEM placement, set
+* SYSTEM as

[PATCH v3 07/12] drm, drm/i915: Move the memcpy_from_wc functionality to core drm

2021-05-21 Thread Thomas Hellström

Memcpy from wc will be used as well by TTM memcpy.
Move it to core drm, and make the interface do the right thing
even on !X86.

Cc: Christian König 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/Makefile  |  2 +-
 drivers/gpu/drm/drm_drv.c |  2 +
 .../drm/{i915/i915_memcpy.c => drm_memcpy.c}  | 63 ++-
 drivers/gpu/drm/i915/Makefile |  1 -
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  5 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  7 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 11 ++--
 drivers/gpu/drm/i915/i915_cmd_parser.c|  4 +-
 drivers/gpu/drm/i915/i915_drv.c   |  2 -
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +--
 drivers/gpu/drm/i915/i915_memcpy.h| 34 --
 .../drm/i915/selftests/intel_memory_region.c  |  7 ++-
 include/drm/drm_memcpy.h  | 47 ++
 14 files changed, 121 insertions(+), 76 deletions(-)
 rename drivers/gpu/drm/{i915/i915_memcpy.c => drm_memcpy.c} (70%)
 delete mode 100644 drivers/gpu/drm/i915/i915_memcpy.h
 create mode 100644 include/drm/drm_memcpy.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index a91cc7684904..f3ab8586c3d7 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -18,7 +18,7 @@ drm-y   :=drm_aperture.o drm_auth.o drm_cache.o \
drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
-   drm_managed.o drm_vblank_work.o
+   drm_managed.o drm_vblank_work.o drm_memcpy.o \
 
 drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
drm_dma.o \
drm_legacy_misc.o drm_lock.o drm_memory.o 
drm_scatter.o \
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 3d8d68a98b95..351cc2900cf1 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1041,6 +1042,7 @@ static int __init drm_core_init(void)
 
drm_connector_ida_init();
idr_init(_minors_idr);
+   drm_memcpy_init_early();
 
ret = drm_sysfs_init();
if (ret < 0) {
diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/drm_memcpy.c
similarity index 70%
rename from drivers/gpu/drm/i915/i915_memcpy.c
rename to drivers/gpu/drm/drm_memcpy.c
index 1b021a4902de..740377749caa 100644
--- a/drivers/gpu/drm/i915/i915_memcpy.c
+++ b/drivers/gpu/drm/drm_memcpy.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: MIT
 /*
  * Copyright © 2016 Intel Corporation
  *
@@ -22,16 +23,12 @@
  *
  */
 
+#ifdef CONFIG_X86
+#include 
 #include 
 #include 
 
-#include "i915_memcpy.h"
-
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
-#define CI_BUG_ON(expr) BUG_ON(expr)
-#else
-#define CI_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
-#endif
+#include "drm/drm_memcpy.h"
 
 static DEFINE_STATIC_KEY_FALSE(has_movntdqa);
 
@@ -94,23 +91,24 @@ static void __memcpy_ntdqu(void *dst, const void *src, 
unsigned long len)
 }
 
 /**
- * i915_memcpy_from_wc: perform an accelerated *aligned* read from WC
+ * drm_memcpy_from_wc: perform an accelerated *aligned* read from WC
  * @dst: destination pointer
  * @src: source pointer
  * @len: how many bytes to copy
  *
- * i915_memcpy_from_wc copies @len bytes from @src to @dst using
+ * drm_memcpy_from_wc copies @len bytes from @src to @dst using
  * non-temporal instructions where available. Note that all arguments
  * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
  * of 16.
  *
  * To test whether accelerated reads from WC are supported, use
- * i915_memcpy_from_wc(NULL, NULL, 0);
+ * drm_memcpy_from_wc(NULL, NULL, 0);
+ * This interface is intended for memremapped memory without the __iomem tag.
  *
  * Returns true if the copy was successful, false if the preconditions
  * are not met.
  */
-bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len)
+bool drm_memcpy_from_wc(void *dst, const void *src, unsigned long len)
 {
if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
return false;
@@ -123,24 +121,53 @@ bool i915_memcpy_from_wc(void *dst, const void *src, 
unsigned long len)
 
return false;
 }
+EXPORT_SYMBOL(drm_memcpy_from_wc);
 
 /**
- * i915_unaligned_memcpy_from_wc: perform a mostly accelerated read from WC
+ * drm_memcpy_from_wc_dbm: perform an accelerated *aligned* read from WC with
+ * struct dma_buf_map arguments.
+ * @dst: destination map
+ * @src: source map
+ * @len: how many bytes to copy
+ *
+ * This is identical to drm_memcpy_from_wc, except it's intended for
+ * potentially ioremapped memory rather than memremapped memory.
+ *
+ * Returns

[PATCH v3 08/12] drm/ttm: Use drm_memcpy_from_wc_dbm for TTM bo moves

2021-05-21 Thread Thomas Hellström

Use fast wc memcpy for reading out of wc memory for TTM bo moves.

Cc: Dave Airlie 
Cc: Christian König 
Cc: Daniel Vetter 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 912cbe8e60a2..4a7d3d672f9a 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -31,6 +31,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -91,6 +92,7 @@ void ttm_move_memcpy(struct ttm_buffer_object *bo,
const struct ttm_kmap_iter_ops *src_ops = src_iter->ops;
struct ttm_tt *ttm = bo->ttm;
struct dma_buf_map src_map, dst_map;
+   bool wc_memcpy;
pgoff_t i;
 
/* Single TTM move. NOP */
@@ -114,11 +116,16 @@ void ttm_move_memcpy(struct ttm_buffer_object *bo,
return;
}
 
+   wc_memcpy = ((!src_ops->maps_tt || ttm->caching != ttm_cached) &&
+drm_has_memcpy_from_wc());
+
for (i = 0; i < dst_mem->num_pages; ++i) {
dst_ops->map_local(dst_iter, _map, i);
src_ops->map_local(src_iter, _map, i);
 
-   if (!src_map.is_iomem && !dst_map.is_iomem) {
+   if (wc_memcpy) {
+   drm_memcpy_from_wc_dbm(_map, _map, PAGE_SIZE);
+   } else if (!src_map.is_iomem && !dst_map.is_iomem) {
memcpy(dst_map.vaddr, src_map.vaddr, PAGE_SIZE);
} else if (!src_map.is_iomem) {
dma_buf_map_memcpy_to(_map, src_map.vaddr,
-- 
2.31.1

[PATCH v3 05/12] drm/i915/ttm: Embed a ttm buffer object in the i915 gem object

2021-05-21 Thread Thomas Hellström

Embed a struct ttm_buffer_object into the i915 gem object, making sure
we alias the gem object part. It's a bit unfortunate that the
struct ttm_buffer_ojbect embeds a gem object since we otherwise could
make the TTM part private to the TTM backend, and use the usual
i915 gem object for the other backends.
To make this a bit more storage efficient for the other backends,
we'd have to use a pointer for the gem object which would require
a lot of changes in the driver. We postpone that for later.

Signed-off-by: Thomas Hellström 
Acked-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c   |  7 +++
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 12 +++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 2be6109d0093..5706d471692d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -62,6 +62,13 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
  const struct drm_i915_gem_object_ops *ops,
  struct lock_class_key *key, unsigned flags)
 {
+   /*
+* A gem object is embedded both in a struct ttm_buffer_object :/ and
+* in a drm_i915_gem_object. Make sure they are aliased.
+*/
+   BUILD_BUG_ON(offsetof(typeof(*obj), base) !=
+offsetof(typeof(*obj), __do_not_access.base));
+
spin_lock_init(>vma.lock);
INIT_LIST_HEAD(>vma.list);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index f5b46d11e6e6..d047ea126029 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -10,6 +10,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 #include "i915_active.h"
@@ -99,7 +100,16 @@ struct i915_gem_object_page_iter {
 };
 
 struct drm_i915_gem_object {
-   struct drm_gem_object base;
+   /*
+* We might have reason to revisit the below since it wastes
+* a lot of space for non-ttm gem objects.
+* In any case, always use the accessors for the ttm_buffer_object
+* when accessing it.
+*/
+   union {
+   struct drm_gem_object base;
+   struct ttm_buffer_object __do_not_access;
+   };
 
const struct drm_i915_gem_object_ops *ops;
 
-- 
2.31.1

[PATCH v3 04/12] drm/i915/ttm Initialize the ttm device and memory managers

2021-05-21 Thread Thomas Hellström

Temporarily remove the buddy allocator and related selftests
and hook up the TTM range manager for i915 regions.

Also modify the mock region selftests somewhat to account for a
fragmenting manager.

Signed-off-by: Thomas Hellström 
Reviewed-by: Matthew Auld  #v2
---
v2:
- Fix an error unwind in lmem_get_pages() (Reported by Matthew Auld)
- Break out and modify usage of i915_sg_dma_sizes() (Reported by Mattew Auld)
- Break out TTM changes to a separate patch (Reported by Christian König)
v3:
- Fix the same error unwind in mock_region_get_pages()
(Reported by Matthew Auld)
- Don't rely on new TTM functionality, but create a mock TTM device,
(Reported by Christian König)
---
 drivers/gpu/drm/i915/Kconfig  |   1 +
 drivers/gpu/drm/i915/Makefile |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c  |  59 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_region.c| 120 ---
 drivers/gpu/drm/i915/gem/i915_gem_region.h|   4 -
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h|   9 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   2 -
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  27 +-
 drivers/gpu/drm/i915/i915_buddy.c | 435 --
 drivers/gpu/drm/i915/i915_buddy.h | 131 ---
 drivers/gpu/drm/i915/i915_drv.c   |   8 +
 drivers/gpu/drm/i915/i915_drv.h   |   7 +-
 drivers/gpu/drm/i915/i915_gem.c   |   1 +
 drivers/gpu/drm/i915/i915_globals.c   |   1 -
 drivers/gpu/drm/i915/i915_globals.h   |   1 -
 drivers/gpu/drm/i915/i915_scatterlist.c   |  70 ++
 drivers/gpu/drm/i915/i915_scatterlist.h   |   4 +
 drivers/gpu/drm/i915/intel_memory_region.c| 180 ++--
 drivers/gpu/drm/i915/intel_memory_region.h|  44 +-
 drivers/gpu/drm/i915/intel_region_ttm.c   | 334 
 drivers/gpu/drm/i915/intel_region_ttm.h   |  38 +
 drivers/gpu/drm/i915/selftests/i915_buddy.c   | 789 --
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
 .../drm/i915/selftests/intel_memory_region.c  | 133 +--
 drivers/gpu/drm/i915/selftests/mock_region.c  |  52 +-
 29 files changed, 722 insertions(+), 1754 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 drivers/gpu/drm/i915/intel_region_ttm.c
 create mode 100644 drivers/gpu/drm/i915/intel_region_ttm.h
 delete mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 1e1cb245fca7..b63d374dff23 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -26,6 +26,7 @@ config DRM_I915
select SND_HDA_I915 if SND_HDA_CORE
select CEC_CORE if CEC_NOTIFIER
select VMAP_PFN
+   select DRM_TTM
help
  Choose this option if you have a system that has "Intel Graphics
  Media Accelerator" or "HD Graphics" integrated graphics,
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index d0d936d9137b..cb8823570996 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -50,6 +50,7 @@ i915-y += i915_drv.o \
  intel_memory_region.o \
  intel_pch.o \
  intel_pm.o \
+ intel_region_ttm.o \
  intel_runtime_pm.o \
  intel_sideband.o \
  intel_step.o \
@@ -160,7 +161,6 @@ gem-y += \
 i915-y += \
  $(gem-y) \
  i915_active.o \
- i915_buddy.o \
  i915_cmd_parser.o \
  i915_gem_evict.o \
  i915_gem_gtt.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index f44bdd08f7cb..3b4aa28a076d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -4,16 +4,71 @@
  */
 
 #include "intel_memory_region.h"
+#include "intel_region_ttm.h"
 #include "gem/i915_gem_region.h"
 #include "gem/i915_gem_lmem.h"
 #include "i915_drv.h"
 
+static void lmem_put_pages(struct drm_i915_gem_object *obj,
+  struct sg_table *pages)
+{
+   intel_region_ttm_node_free(obj->mm.region, obj->mm.st_mm_node);
+   obj->mm.dirty = false;
+   sg_free_table(pages);
+   kfree(pages);
+}
+
+static int lmem_get_pages(struct drm_i915_gem_object *obj)
+{
+   unsigned int flags;
+   struct sg_table *pages;
+
+   flags = I915_ALLOC_MIN_PAGE_SIZE;
+   if (obj->flags & I915_BO_ALLOC_CONTIGUOUS)
+   flags |= I915_ALLOC_CONTIGUOUS;
+
+   obj->mm.st_mm_node = intel_region_ttm_node_alloc(obj->mm.region,
+obj->base.size,
+flags);
+   if

[PATCH v3 02/12] drm/i915: Don't free shared locks while shared

2021-05-21 Thread Thomas Hellström

We are currently sharing the VM reservation locks across a number of
gem objects with page-table memory. Since TTM will individiualize the
reservation locks when freeing objects, including accessing the shared
locks, make sure that the shared locks are not freed until that is done.
For PPGTT we add an additional refcount, for GGTT we take additional
measures to make sure objects sharing the GGTT reservation lock are
freed at GGTT takedown

Signed-off-by: Thomas Hellström 
Reviewed-by: Maarten Lankhorst 
---
v2: Try harder to make sure objects sharing the GGTT reservation lock are
freed at GGTT takedown.
v3: Use a pointer to the vm to indicate that an object shares a reservation
object from that vm, rather than a pointer to the reservation object itself.
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  4 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 19 ++--
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 45 +++
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 28 +++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.c   |  5 +++
 7 files changed, 93 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 28144410df86..2be6109d0093 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -252,6 +252,9 @@ static void __i915_gem_free_objects(struct drm_i915_private 
*i915,
if (obj->mm.n_placements > 1)
kfree(obj->mm.placements);
 
+   if (obj->shares_resv_from)
+   i915_vm_resv_put(obj->shares_resv_from);
+
/* But keep the pointer alive for RCU-protected lookups */
call_rcu(>rcu, __i915_gem_free_object_rcu);
cond_resched();
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 0727d0c76aa0..0415f99b6b95 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -149,6 +149,10 @@ struct drm_i915_gem_object {
 * when i915_gem_ww_ctx_backoff() or i915_gem_ww_ctx_fini() are called.
 */
struct list_head obj_link;
+   /**
+* @shared_resv_from: The object shares the resv from this vm.
+*/
+   struct i915_address_space *shares_resv_from;
 
union {
struct rcu_head rcu;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 35069ca5d7de..10c23a749a95 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -746,7 +746,6 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
 
mutex_unlock(>vm.mutex);
i915_address_space_fini(>vm);
-   dma_resv_fini(>vm.resv);
 
arch_phys_wc_del(ggtt->mtrr);
 
@@ -768,6 +767,19 @@ void i915_ggtt_driver_release(struct drm_i915_private 
*i915)
ggtt_cleanup_hw(ggtt);
 }
 
+/**
+ * i915_ggtt_driver_late_release - Cleanup of GGTT that needs to be done after
+ * all free objects have been drained.
+ * @i915: i915 device
+ */
+void i915_ggtt_driver_late_release(struct drm_i915_private *i915)
+{
+   struct i915_ggtt *ggtt = >ggtt;
+
+   GEM_WARN_ON(kref_read(>vm.resv_ref) != 1);
+   dma_resv_fini(>vm._resv);
+}
+
 static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
 {
snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
@@ -829,6 +841,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 
size)
return -ENOMEM;
}
 
+   kref_init(>vm.resv_ref);
ret = setup_scratch_page(>vm);
if (ret) {
drm_err(>drm, "Scratch setup failed\n");
@@ -1135,7 +1148,7 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct 
intel_gt *gt)
ggtt->vm.gt = gt;
ggtt->vm.i915 = i915;
ggtt->vm.dma = i915->drm.dev;
-   dma_resv_init(>vm.resv);
+   dma_resv_init(>vm._resv);
 
if (INTEL_GEN(i915) <= 5)
ret = i915_gmch_probe(ggtt);
@@ -1144,7 +1157,7 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct 
intel_gt *gt)
else
ret = gen8_gmch_probe(ggtt);
if (ret) {
-   dma_resv_fini(>vm.resv);
+   dma_resv_fini(>vm._resv);
return ret;
}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 9b98f9d9faa3..94849567143d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -22,8 +22,11 @@ struct drm_i915_gem_object *alloc_pt_lmem(struct 
i915_address_space *vm, int sz)
 * object underneath, with the idea that one object_lock() will lock
 * them all at once.
 */
-   if (!IS_ERR(obj))
-

[PATCH v3 03/12] drm/i915: Fix i915_sg_page_sizes to record dma segments rather than physical pages

2021-05-21 Thread Thomas Hellström

All users of this function actually want the dma segment sizes, but that's
not what's calculated. Fix that and rename the function to
i915_sg_dma_sizes to reflect what's calculated.

Signed-off-by: Thomas Hellström 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_phys.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  2 +-
 drivers/gpu/drm/i915/i915_scatterlist.h | 16 
 4 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ccede73c6465..616c3a2f1baf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -209,7 +209,7 @@ static int i915_gem_object_get_pages_dmabuf(struct 
drm_i915_gem_object *obj)
if (IS_ERR(pages))
return PTR_ERR(pages);
 
-   sg_page_sizes = i915_sg_page_sizes(pages->sgl);
+   sg_page_sizes = i915_sg_dma_sizes(pages->sgl);
 
__i915_gem_object_set_pages(obj, pages, sg_page_sizes);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c 
b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 81dc2bf59bc3..36f373dc493c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -208,7 +208,7 @@ static int i915_gem_object_shmem_to_phys(struct 
drm_i915_gem_object *obj)
 
 err_xfer:
if (!IS_ERR_OR_NULL(pages)) {
-   unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl);
+   unsigned int sg_page_sizes = i915_sg_dma_sizes(pages->sgl);
 
__i915_gem_object_set_pages(obj, pages, sg_page_sizes);
}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index a657b99ec760..602f0ed983ec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -173,7 +173,7 @@ static int i915_gem_userptr_get_pages(struct 
drm_i915_gem_object *obj)
goto err;
}
 
-   sg_page_sizes = i915_sg_page_sizes(st->sgl);
+   sg_page_sizes = i915_sg_dma_sizes(st->sgl);
 
__i915_gem_object_set_pages(obj, st, sg_page_sizes);
 
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h 
b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9cb26a224034..b96baad66a3a 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -101,15 +101,23 @@ static inline struct scatterlist *__sg_next(struct 
scatterlist *sg)
 (((__iter).curr += PAGE_SIZE) >= (__iter).max) ?   \
 (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0)
 
-static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
+/**
+ * i915_sg_dma_sizes - Record the dma segment sizes of a scatterlist
+ * @sg: The scatterlist
+ *
+ * Return: An unsigned int with segment sizes logically or'ed together.
+ * A caller can use this information to determine what hardware page table
+ * entry sizes can be used to map the memory represented by the scatterlist.
+ */
+static inline unsigned int i915_sg_dma_sizes(struct scatterlist *sg)
 {
unsigned int page_sizes;
 
page_sizes = 0;
-   while (sg) {
+   while (sg && sg_dma_len(sg)) {
GEM_BUG_ON(sg->offset);
-   GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE));
-   page_sizes |= sg->length;
+   GEM_BUG_ON(!IS_ALIGNED(sg_dma_len(sg), PAGE_SIZE));
+   page_sizes |= sg_dma_len(sg);
sg = __sg_next(sg);
}
 
-- 
2.31.1

[PATCH v3 01/12] drm/i915: Untangle the vma pages_mutex

2021-05-21 Thread Thomas Hellström

Any sleeping dma_resv lock taken while the vma pages_mutex is held
will cause a lockdep splat.
Move the i915_gem_object_pin_pages() call out of the pages_mutex
critical section.

Signed-off-by: Thomas Hellström 
Reviewed-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_vma.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index a6cd0fa62847..f2b5912fc542 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -800,32 +800,37 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned 
int flags)
 static int vma_get_pages(struct i915_vma *vma)
 {
int err = 0;
+   bool pinned_pages = false;
 
if (atomic_add_unless(>pages_count, 1, 0))
return 0;
 
+   if (vma->obj) {
+   err = i915_gem_object_pin_pages(vma->obj);
+   if (err)
+   return err;
+   pinned_pages = true;
+   }
+
/* Allocations ahoy! */
-   if (mutex_lock_interruptible(>pages_mutex))
-   return -EINTR;
+   if (mutex_lock_interruptible(>pages_mutex)) {
+   err = -EINTR;
+   goto unpin;
+   }
 
if (!atomic_read(>pages_count)) {
-   if (vma->obj) {
-   err = i915_gem_object_pin_pages(vma->obj);
-   if (err)
-   goto unlock;
-   }
-
err = vma->ops->set_pages(vma);
-   if (err) {
-   if (vma->obj)
-   i915_gem_object_unpin_pages(vma->obj);
+   if (err)
goto unlock;
-   }
+   pinned_pages = false;
}
atomic_inc(>pages_count);
 
 unlock:
mutex_unlock(>pages_mutex);
+unpin:
+   if (pinned_pages)
+   __i915_gem_object_unpin_pages(vma->obj);
 
return err;
 }
-- 
2.31.1

[PATCH v3 00/12] drm/i915: Move LMEM (VRAM) management over to TTM

2021-05-21 Thread Thomas Hellström

This is an initial patch series to move discrete memory management over to
TTM. It will be followed up shortly with adding more functionality.

The buddy allocator is temporarily removed along with its selftests and
It is replaced with the TTM range manager and some selftests are adjusted
to account for introduced fragmentation. Work is ongoing to reintroduce the
buddy allocator as a TTM resource manager.

A new memcpy ttm move is introduced that uses kmap_local() functionality
rather than vmap(). Among other things stated in the patch commit message
it helps us deal with page-pased LMEM memory. It is generic enough to replace
the ttm memcpy move with some additional work if so desired. On x86 it also
enables prefetching reads from write-combined memory.

Finally the old i915 gem object LMEM backend is replaced with a
i915 gem object TTM backend and some additional i915 gem object ops are
introduced to support the added functionality.
Currently it is used only to support management and eviction of the LMEM
region, but work is underway to extend the support to system memory. In this
way we use TTM the way it was originally intended, having the GPU binding
taken care of by driver code.

Intention is to follow up with
- System memory support
- Pipelined accelerated moves / migration
- Re-added buddy allocator in the TTM framework

v2:
- Add patches to move pagefaulting over to TTM
- Break out TTM changes to separate patches
- Address various review comments as detailed in the affected patches

v3:
- Drop TTM pagefaulting patches for now due changing approach due to a NAK.
- Address feedback on TTM patches
- Move the new TTM memcpy functionality into TTM.
- Move fast WC memcpy to drm
- Various fixes all over the place as shown in patch commit messages.

Cc: Christian König 

Thomas Hellström (12):
  drm/i915: Untangle the vma pages_mutex
  drm/i915: Don't free shared locks while shared
  drm/i915: Fix i915_sg_page_sizes to record dma segments rather than
physical pages
  drm/i915/ttm Initialize the ttm device and memory managers
  drm/i915/ttm: Embed a ttm buffer object in the i915 gem object
  drm/ttm: Add a generic TTM memcpy move for page-based iomem
  drm, drm/i915: Move the memcpy_from_wc functionality to core drm
  drm/ttm: Use drm_memcpy_from_wc_dbm for TTM bo moves
  drm/ttm: Document and optimize ttm_bo_pipeline_gutting()
  drm/ttm, drm/amdgpu: Allow the driver some control over swapping
  drm/i915/ttm: Introduce a TTM i915 gem object backend
  drm/i915/lmem: Verify checks for lmem residency

 drivers/gpu/drm/Makefile  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   4 +
 drivers/gpu/drm/drm_drv.c |   2 +
 .../drm/{i915/i915_memcpy.c => drm_memcpy.c}  |  63 +-
 drivers/gpu/drm/i915/Kconfig  |   1 +
 drivers/gpu/drm/i915/Makefile |   4 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_create.c|   9 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c  |  71 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h  |   5 -
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 154 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  13 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  49 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_phys.c  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_region.c| 126 +--
 drivers/gpu/drm/i915/gem/i915_gem_region.h|   4 -
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h|   9 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 531 
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h   |  50 ++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  19 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   2 -
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  45 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  28 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  30 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |   7 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  11 +-
 drivers/gpu/drm/i915/i915_buddy.c | 435 --
 drivers/gpu/drm/i915/i915_buddy.h | 131 ---
 drivers/gpu/drm/i915/i915_cmd_parser.c|   4 +-
 drivers/gpu/drm/i915/i915_drv.c   |  15 +-
 drivers/gpu/drm/i915/i915_drv.h   |   7 +-
 drivers/gpu/drm/i915/i915_gem.c   |   6 +-
 drivers/gpu/drm/i915/i915_globals.c   |   1 -
 drivers/gpu/drm/i915/i915_globals.h   |   1 -
 drivers/gpu/drm/i915/i915_gpu_error.c |   8 +-
 drivers/gpu/drm/i915/i915_memcpy.h|  34 -

Re: [PATCH -next] drm: Fix PM reference leak

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 09:03:06PM +0800, Zou Wei wrote:
> pm_runtime_get_sync will increment pm usage counter even it failed.
> Forgetting to putting operation will result in reference leak here.
> Fix it by replacing it with pm_runtime_resume_and_get to keep usage
> counter balanced.
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Zou Wei 

Looks good, but can you pls split this up into a patch per driver (vc and
bridge/cdns-dsi here)?

Thanks, Daniel

> ---
>  drivers/gpu/drm/bridge/cdns-dsi.c | 2 +-
>  drivers/gpu/drm/vc4/vc4_hdmi.c| 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/cdns-dsi.c 
> b/drivers/gpu/drm/bridge/cdns-dsi.c
> index 76373e3..b31281f 100644
> --- a/drivers/gpu/drm/bridge/cdns-dsi.c
> +++ b/drivers/gpu/drm/bridge/cdns-dsi.c
> @@ -1028,7 +1028,7 @@ static ssize_t cdns_dsi_transfer(struct mipi_dsi_host 
> *host,
>   struct mipi_dsi_packet packet;
>   int ret, i, tx_len, rx_len;
>  
> - ret = pm_runtime_get_sync(host->dev);
> + ret = pm_runtime_resume_and_get(host->dev);
>   if (ret < 0)
>   return ret;
>  
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index c27b287..f20a65b 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -798,7 +798,7 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct 
> drm_encoder *encoder,
>   unsigned long pixel_rate, hsm_rate;
>   int ret;
>  
> - ret = pm_runtime_get_sync(_hdmi->pdev->dev);
> + ret = pm_runtime_resume_and_get(_hdmi->pdev->dev);
>   if (ret < 0) {
>   DRM_ERROR("Failed to retain power domain: %d\n", ret);
>   return;
> -- 
> 2.6.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v2] drm/exynos: Use pm_runtime_resume_and_get() to replace open coding

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 05:06:06PM +0800, Tian Tao wrote:
> use pm_runtime_resume_and_get() to replace pm_runtime_get_sync and
> pm_runtime_put_noidle.

It would be good to explain why: Apparently get_sync increments the
refcount even if it fails, which ususally leads to leaks.

With that or similar added to the commit message:

Reviewed-by: Daniel Vetter 

> 
> Signed-off-by: Tian Tao 
> ---
> 
> v2: drop unnecessary change about if condition.
> ---
>  drivers/gpu/drm/exynos/exynos_drm_mic.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_mic.c 
> b/drivers/gpu/drm/exynos/exynos_drm_mic.c
> index 3821ea7..32672bf 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_mic.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_mic.c
> @@ -268,11 +268,9 @@ static void mic_pre_enable(struct drm_bridge *bridge)
>   if (mic->enabled)
>   goto unlock;
>  
> - ret = pm_runtime_get_sync(mic->dev);
> - if (ret < 0) {
> - pm_runtime_put_noidle(mic->dev);
> + ret = pm_runtime_resume_and_get(mic->dev);
> + if (ret < 0)
>   goto unlock;
> - }
>  
>   mic_set_path(mic, 1);
>  
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [RFC PATCH 01/13] drm/dsc: Add dsc pps header init function

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 06:19:30PM +0530, Vinod Koul wrote:
> We required a helper to create and set the dsc_dce_header, so add the
> dsc_dce_header and API drm_dsc_dsi_pps_header_init
> 
> Signed-off-by: Vinod Koul 
> ---
>  drivers/gpu/drm/drm_dsc.c | 11 +++
>  include/drm/drm_dsc.h | 16 
>  2 files changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index ff602f7ec65b..0c1b745090e2 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -49,6 +49,17 @@ void drm_dsc_dp_pps_header_init(struct dp_sdp_header 
> *pps_header)
>  }
>  EXPORT_SYMBOL(drm_dsc_dp_pps_header_init);
>  
> +void drm_dsc_dsi_pps_header_init(struct dsc_dce_header *dsc_header)

Kerneldoc for anything exported to drivers please, also ideally for all
the structures.

Thanks, Daniel
> +{
> + memset(dsc_header, 0, sizeof(*dsc_header));
> +
> + dsc_header->bp0 = 0x0A;
> + dsc_header->bp1 = 1;
> + dsc_header->bp4 = 10;
> + dsc_header->bp6 = 128;
> +}
> +EXPORT_SYMBOL(drm_dsc_dsi_pps_header_init);
> +
>  /**
>   * drm_dsc_dp_rc_buffer_size - get rc buffer size in bytes
>   * @rc_buffer_block_size: block size code, according to DPCD offset 62h
> diff --git a/include/drm/drm_dsc.h b/include/drm/drm_dsc.h
> index bbe120f461e5..5a3bbeb3e12f 100644
> --- a/include/drm/drm_dsc.h
> +++ b/include/drm/drm_dsc.h
> @@ -602,8 +602,24 @@ struct drm_dsc_pps_infoframe {
>   struct drm_dsc_picture_parameter_set pps_payload;
>  } __packed;
>  
> +struct dsc_dce_header {
> + u8 bp0;
> + u8 bp1;
> + u8 bp2;
> + u8 bp3;
> + u8 bp4;
> + u8 bp5;
> + u8 bp6;
> +} __packed;
> +
> +struct drm_dsi_dsc_infoframe {
> + struct dsc_dce_header dsc_header;
> + struct drm_dsc_picture_parameter_set pps_payload;
> +} __packed;
> +
>  void drm_dsc_dp_pps_header_init(struct dp_sdp_header *pps_header);
>  int drm_dsc_dp_rc_buffer_size(u8 rc_buffer_block_size, u8 rc_buffer_size);
> +void drm_dsc_dsi_pps_header_init(struct dsc_dce_header *dsc_header);
>  void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set *pps_sdp,
>   const struct drm_dsc_config *dsc_cfg);
>  int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg);
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH] drm/doc: Includ fence chain api

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 10:26:28AM +0200, Christian König wrote:
> Am 21.05.21 um 10:24 schrieb Daniel Vetter:
> > We have this nice kerneldoc, but forgot to include it.
> 
> Well I didn't forgot it, I just didn't had time to double check that it is
> bug free :)

It does seem to generate decent looking output and no new warnings.

> > 
> > Signed-off-by: Daniel Vetter 
> > Cc: Sumit Semwal 
> > Cc: "Christian König" 
> > Cc: linux-me...@vger.kernel.org
> > Cc: linaro-mm-...@lists.linaro.org
> 
> Reviewed-by: Christian König 

Thanks for taking a look, applied to drm-misc-next.
-Daniel

> 
> > ---
> >   Documentation/driver-api/dma-buf.rst | 9 +
> >   1 file changed, 9 insertions(+)
> > 
> > diff --git a/Documentation/driver-api/dma-buf.rst 
> > b/Documentation/driver-api/dma-buf.rst
> > index 7f37ec30d9fd..7f21425d9435 100644
> > --- a/Documentation/driver-api/dma-buf.rst
> > +++ b/Documentation/driver-api/dma-buf.rst
> > @@ -178,6 +178,15 @@ DMA Fence Array
> >   .. kernel-doc:: include/linux/dma-fence-array.h
> >  :internal:
> > +DMA Fence Chain
> > +~~~
> > +
> > +.. kernel-doc:: drivers/dma-buf/dma-fence-chain.c
> > +   :export:
> > +
> > +.. kernel-doc:: include/linux/dma-fence-chain.h
> > +   :internal:
> > +
> >   DMA Fence uABI/Sync File
> >   
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 05:00:46PM +0200, Bas Nieuwenhuizen wrote:
> On Fri, May 21, 2021 at 4:37 PM Daniel Vetter  wrote:
> >
> > On Fri, May 21, 2021 at 11:46:23AM +0200, Bas Nieuwenhuizen wrote:
> > > On Fri, May 21, 2021 at 11:10 AM Daniel Vetter  
> > > wrote:
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > index 88a24a0b5691..cc8426e1e8a8 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > @@ -617,8 +617,8 @@ static int amdgpu_cs_parser_bos(struct 
> > > > amdgpu_cs_parser *p,
> > > > amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> > > > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
> > > >
> > > > -   /* Make sure we use the exclusive slot for shared BOs */
> > > > -   if (bo->prime_shared_count)
> > > > +   /* Make sure we use the exclusive slot for all 
> > > > potentially shared BOs */
> > > > +   if (!(bo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
> > > > e->tv.num_shared = 0;
> > >
> > > I think it also makes sense to skip this with
> > > AMDGPU_GEM_CREATE_EXPLICIT_SYNC? It can be shared but I don't think
> > > anyone expects implicit sync to happen with those.
> >
> > Ah yes, I missed this entirely. So the "no implicit flag" is already
> > there, and the _only_ thing that's missing really is a way to fish out the
> > implicit fences, and set them.
> >
> > https://lore.kernel.org/dri-devel/20210520190007.534046-1-ja...@jlekstrand.net/
> >
> > So I think all that's really needed in radv is not setting
> > RADEON_FLAG_IMPLICIT_SYNC for winsys buffers when Jason's dma-buf ioctl
> > are present (means you need to do some import/export and keep the fd
> > around for winsys buffers, but shouldn't be too bad), and then control the
> > implicit fences entirely explicitly like vk expects.
> 
> That is the part I'm less sure about. This is a BO wide flag so we are
> also disabling implicit sync in the compositor. If the compositor does
> only do read stuff that is ok, as the inserted exclusive fence will
> work for that. But as I learned recently the app provided buffer may
> end up being written to by the X server which open a whole can of
> potential problems if implicit sync gets disabled between Xserver
> operations on the app provided buffer. Hence setting that on the WSI
> buffer is a whole new can of potential problems and hence I've said a
> submission based flag would be preferred.
> 
> I can certainly try it out though.

Hm yeah that's the wrong flag. We need a flag on the drm_file which the
explicit userspace sets. And which is valid only for itself.

There's a nice flags field when creating a ctx, but it's not validated and
there's already a comment that we have to filter out garbage priority, so
that's not use. I'll whip up something entirely untested just as a draft.
-Daniel



> 
> >
> > Are you bored enough to type this up for radv? I'll give Jason's kernel
> > stuff another review meanwhile.
> > -Daniel
> >
> > > > e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > > > }
> > > > --
> > > > 2.31.0
> > > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 11/11] drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default

2021-05-21 Thread Oleksandr Andrushchenko

On 5/21/21 12:09 PM, Daniel Vetter wrote:
> Goes through all the drivers and deletes the default hook since it's
> the default now.
>
> Signed-off-by: Daniel Vetter
Acked-by: Oleksandr Andrushchenko

Re: [PATCH] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 11:10:57AM +0200, Daniel Vetter wrote:
> Spotted while trying to convert panfrost to these.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 

Cc: Emma Anholt 

As original author of these.
-Daniel

> ---
>  drivers/gpu/drm/drm_gem.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 9989425e9875..e5dc98425896 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1312,6 +1312,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>   * @fence_array: array of dma_fence * for the job to block on.
>   * @fence: the dma_fence to add to the list of dependencies.
>   *
> + * This functions consumes the reference for @fence both on success and error
> + * cases.
> + *
>   * Returns:
>   * 0 on success, or an error on failing to expand the array.
>   */
> -- 
> 2.31.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH] drm/amd/amdkfd: Drop unnecessary NULL check after container_of

2021-05-21 Thread Guenter Roeck

The first parameter passed to container_of() is the pointer to the work
structure passed to the worker and never NULL. The NULL check on the
result of container_of() is therefore unnecessary and misleading.
Remove it.

This change was made automatically with the following Coccinelle script.

@@
type t;
identifier v;
statement s;
@@

<+...
(
  t v = container_of(...);
|
  v = container_of(...);
)
  ...
  when != v
- if (\( !v \| v == NULL \) ) s
...+>

Signed-off-by: Guenter Roeck 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 5b6c5669c03d..2f8d352e0069 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -110,8 +110,6 @@ static void kfd_sdma_activity_worker(struct work_struct 
*work)
 
workarea = container_of(work, struct kfd_sdma_activity_handler_workarea,
sdma_activity_work);
-   if (!workarea)
-   return;
 
pdd = workarea->pdd;
if (!pdd)
-- 
2.25.1

Re: [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Bas Nieuwenhuizen

On Fri, May 21, 2021 at 4:37 PM Daniel Vetter  wrote:
>
> On Fri, May 21, 2021 at 11:46:23AM +0200, Bas Nieuwenhuizen wrote:
> > On Fri, May 21, 2021 at 11:10 AM Daniel Vetter  
> > wrote:
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index 88a24a0b5691..cc8426e1e8a8 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -617,8 +617,8 @@ static int amdgpu_cs_parser_bos(struct 
> > > amdgpu_cs_parser *p,
> > > amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> > > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
> > >
> > > -   /* Make sure we use the exclusive slot for shared BOs */
> > > -   if (bo->prime_shared_count)
> > > +   /* Make sure we use the exclusive slot for all 
> > > potentially shared BOs */
> > > +   if (!(bo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
> > > e->tv.num_shared = 0;
> >
> > I think it also makes sense to skip this with
> > AMDGPU_GEM_CREATE_EXPLICIT_SYNC? It can be shared but I don't think
> > anyone expects implicit sync to happen with those.
>
> Ah yes, I missed this entirely. So the "no implicit flag" is already
> there, and the _only_ thing that's missing really is a way to fish out the
> implicit fences, and set them.
>
> https://lore.kernel.org/dri-devel/20210520190007.534046-1-ja...@jlekstrand.net/
>
> So I think all that's really needed in radv is not setting
> RADEON_FLAG_IMPLICIT_SYNC for winsys buffers when Jason's dma-buf ioctl
> are present (means you need to do some import/export and keep the fd
> around for winsys buffers, but shouldn't be too bad), and then control the
> implicit fences entirely explicitly like vk expects.

That is the part I'm less sure about. This is a BO wide flag so we are
also disabling implicit sync in the compositor. If the compositor does
only do read stuff that is ok, as the inserted exclusive fence will
work for that. But as I learned recently the app provided buffer may
end up being written to by the X server which open a whole can of
potential problems if implicit sync gets disabled between Xserver
operations on the app provided buffer. Hence setting that on the WSI
buffer is a whole new can of potential problems and hence I've said a
submission based flag would be preferred.

I can certainly try it out though.

>
> Are you bored enough to type this up for radv? I'll give Jason's kernel
> stuff another review meanwhile.
> -Daniel
>
> > > e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > > }
> > > --
> > > 2.31.0
> > >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Re: [Mesa-dev] [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 07:58:57AM -0700, Rob Clark wrote:
> On Fri, May 21, 2021 at 2:10 AM Daniel Vetter  wrote:
> >
> > - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
> >   but because it doesn't use the drm/scheduler it handles fences from
> >   the wrong context with a synchronous dma_fence_wait. See
> >   submit_fence_sync() leading to msm_gem_sync_object(). Investing into
> >   a scheduler might be a good idea.
> 
> Yeah, drm/scheduler is (along with a lot of other things) on the TODO
> list.  But this isn't quite as bad as it sounds because userspace uses
> a u_queue thread to call the submit ioctl rather than blocking the
> driver.  (It also offloads some other work from the driver thread,
> like submit merging to reduce # of ioctls.  Coincidentally that
> arrangement was a step towards preparing userspace for some
> hypothetical non-ioctl uapi ;-))

You're also holding a pile of locks, which I expect latest with
multi-engine buffer sharing will be pain. If you push this to the
scheduler then the locks aren't held. Or maybe I've misread the flow, it's
become all a bit a blurr after all these drivers :-)

> OTOH it would be good to move blocking until the system can free
> enough pages to repin bo's out of the ioctl path to better handle some
> memory pressure corner cases without having to be interruptable over a
> lot more of the submit path..  Running chrome+android on devices
> without a lot of memory is fun..

Uh that one needs the userspace thread. Or entirely different semantics of
your ioctl, because you're not allowed to allocate memory once any
dma_fence is visible. So offloading the entire pinning to a submit thread
is no-go.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Mesa-dev] [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Rob Clark

On Fri, May 21, 2021 at 2:10 AM Daniel Vetter  wrote:
>
> - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
>   but because it doesn't use the drm/scheduler it handles fences from
>   the wrong context with a synchronous dma_fence_wait. See
>   submit_fence_sync() leading to msm_gem_sync_object(). Investing into
>   a scheduler might be a good idea.

Yeah, drm/scheduler is (along with a lot of other things) on the TODO
list.  But this isn't quite as bad as it sounds because userspace uses
a u_queue thread to call the submit ioctl rather than blocking the
driver.  (It also offloads some other work from the driver thread,
like submit merging to reduce # of ioctls.  Coincidentally that
arrangement was a step towards preparing userspace for some
hypothetical non-ioctl uapi ;-))

OTOH it would be good to move blocking until the system can free
enough pages to repin bo's out of the ioctl path to better handle some
memory pressure corner cases without having to be interruptable over a
lot more of the submit path..  Running chrome+android on devices
without a lot of memory is fun..

BR,
-R

Re: [PATCH 02/11] drm/panfrost: Remove sched_lock

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 11:32:48AM +0200, Lucas Stach wrote:
> Am Freitag, dem 21.05.2021 um 11:09 +0200 schrieb Daniel Vetter:
> > Scheduler takes care of its own locking, dont worry. For everything else
> > there's reservation locking on each bo.
> > 
> > So seems to be entirely redundnant and just a bit in the way.
> 
> I haven't read all the surrounding code, but this looks wrong from a
> glance. You must hold a lock across drm_sched_job_init ->
> drm_sched_entity_push_job as the scheduler fences are initialized in
> the job init, so if there's no exclusive section across those two
> function calls you might end up with jobs being queued with their fence
> seqnos not monotonically increasing, which breaks all kinds of other
> stuff.

Uh indeed. That's a bit a loaded gun since generally _init() shouldn't
have any such side effects.

> I don't see a reason to hold the lock across the reservation calls,
> though.

Ok I'll adjust the patch.
-Daniel

> 
> Regards,
> Lucas
> 
> > 
> > Signed-off-by: Daniel Vetter 
> > Cc: Rob Herring 
> > Cc: Tomeu Vizoso 
> > Cc: Steven Price 
> > Cc: Alyssa Rosenzweig 
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_device.c |  1 -
> >  drivers/gpu/drm/panfrost/panfrost_device.h |  2 --
> >  drivers/gpu/drm/panfrost/panfrost_job.c| 13 ++---
> >  3 files changed, 2 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
> > b/drivers/gpu/drm/panfrost/panfrost_device.c
> > index 125ed973feaa..23070c01c63f 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> > @@ -199,7 +199,6 @@ int panfrost_device_init(struct panfrost_device *pfdev)
> > int err;
> > struct resource *res;
> >  
> > -   mutex_init(>sched_lock);
> > INIT_LIST_HEAD(>scheduled_jobs);
> > INIT_LIST_HEAD(>as_lru_list);
> >  
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> > b/drivers/gpu/drm/panfrost/panfrost_device.h
> > index 597cf1459b0a..7519903bb031 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> > +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> > @@ -105,8 +105,6 @@ struct panfrost_device {
> >  
> > struct panfrost_perfcnt *perfcnt;
> >  
> > -   struct mutex sched_lock;
> > -
> > struct {
> > struct work_struct work;
> > atomic_t pending;
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> > b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 6003cfeb1322..f5d39ee14ab5 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -218,26 +218,19 @@ static void panfrost_attach_object_fences(struct 
> > drm_gem_object **bos,
> >  
> >  int panfrost_job_push(struct panfrost_job *job)
> >  {
> > -   struct panfrost_device *pfdev = job->pfdev;
> > int slot = panfrost_job_get_slot(job);
> > struct drm_sched_entity *entity = >file_priv->sched_entity[slot];
> > struct ww_acquire_ctx acquire_ctx;
> > int ret = 0;
> >  
> > -   mutex_lock(>sched_lock);
> > -
> > ret = drm_gem_lock_reservations(job->bos, job->bo_count,
> > _ctx);
> > -   if (ret) {
> > -   mutex_unlock(>sched_lock);
> > +   if (ret)
> > return ret;
> > -   }
> >  
> > ret = drm_sched_job_init(>base, entity, NULL);
> > -   if (ret) {
> > -   mutex_unlock(>sched_lock);
> > +   if (ret)
> > goto unlock;
> > -   }
> >  
> > job->render_done_fence = dma_fence_get(>base.s_fence->finished);
> >  
> > @@ -248,8 +241,6 @@ int panfrost_job_push(struct panfrost_job *job)
> >  
> > drm_sched_entity_push_job(>base, entity);
> >  
> > -   mutex_unlock(>sched_lock);
> > -
> > panfrost_attach_object_fences(job->bos, job->bo_count,
> >   job->render_done_fence);
> >  
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [RFC PATCH 02/13] dt-bindings: msm/dsi: Document Display Stream Compression (DSC) parameters

2021-05-21 Thread Bjorn Andersson

On Fri 21 May 07:49 CDT 2021, Vinod Koul wrote:

> DSC enables streams to be compressed before we send to panel. This
> requires DSC enabled encoder and a panel to be present. So we add this
> information in board DTS and find if DSC can be enabled and the
> parameters required to configure DSC are added to binding document along
> with example
> 
> Signed-off-by: Vinod Koul 
> ---
>  .../devicetree/bindings/display/msm/dsi.txt   | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/display/msm/dsi.txt 
> b/Documentation/devicetree/bindings/display/msm/dsi.txt
> index b9a64d3ff184..83d2fb92267e 100644
> --- a/Documentation/devicetree/bindings/display/msm/dsi.txt
> +++ b/Documentation/devicetree/bindings/display/msm/dsi.txt
> @@ -48,6 +48,13 @@ Optional properties:
>  - pinctrl-n: the "sleep" pinctrl state
>  - ports: contains DSI controller input and output ports as children, each
>containing one endpoint subnode.
> +- qcom,mdss-dsc-enabled: Display Stream Compression (DSC) is enabled
> +- qcom,mdss-slice-height: DSC slice height in pixels
> +- qcom,mdss-slice-width: DSC slice width in pixels
> +- qcom,mdss-slice-per-pkt: DSC slices per packet
> +- qcom,mdss-bit-per-component: DSC bits per component
> +- qcom,mdss-bit-per-pixel: DSC bits per pixel
> +- qcom,mdss-block-prediction-enable: Block prediction mode of DSC enabled
>  
>DSI Endpoint properties:
>- remote-endpoint: For port@0, set to phandle of the connected 
> panel/bridge's
> @@ -188,6 +195,14 @@ Example:
>   qcom,master-dsi;
>   qcom,sync-dual-dsi;
>  
> + qcom,mdss-dsc-enabled;

To me the activation of DSC seems to be a property of the panel.

> + qcom,mdss-slice-height = <16>;
> + qcom,mdss-slice-width = <540>;
> + qcom,mdss-slice-per-pkt = <1>;
> + qcom,mdss-bit-per-component = <8>;
> + qcom,mdss-bit-per-pixel = <8>;
> + qcom,mdss-block-prediction-enable;

Which of these properties relates to the DSC encoder and what needs to
be agreed with the sink? Can't we derive e.g. bpp from the information
we have from the attached panel already?

Regards,
Bjorn

> +
>   qcom,mdss-mdp-transfer-time-us = <12000>;
>  
>   pinctrl-names = "default", "sleep";
> -- 
> 2.26.3
>

Re: [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 11:46:23AM +0200, Bas Nieuwenhuizen wrote:
> On Fri, May 21, 2021 at 11:10 AM Daniel Vetter  wrote:
> >
> > Docs for struct dma_resv are fairly clear:
> >
> > "A reservation object can have attached one exclusive fence (normally
> > associated with write operations) or N shared fences (read
> > operations)."
> >
> > https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
> >
> > Furthermore a review across all of upstream.
> >
> > First of render drivers and how they set implicit fences:
> >
> > - nouveau follows this contract, see in validate_fini_no_ticket()
> >
> > nouveau_bo_fence(nvbo, fence, !!b->write_domains);
> >
> >   and that last boolean controls whether the exclusive or shared fence
> >   slot is used.
> >
> > - radeon follows this contract by setting
> >
> > p->relocs[i].tv.num_shared = !r->write_domain;
> >
> >   in radeon_cs_parser_relocs(), which ensures that the call to
> >   ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
> >   right thing.
> >
> > - vmwgfx seems to follow this contract with the shotgun approach of
> >   always setting ttm_val_buf->num_shared = 0, which means
> >   ttm_eu_fence_buffer_objects() will only use the exclusive slot.
> >
> > - etnaviv follows this contract, as can be trivially seen by looking
> >   at submit_attach_object_fences()
> >
> > - i915 is a bit a convoluted maze with multiple paths leading to
> >   i915_vma_move_to_active(). Which sets the exclusive flag if
> >   EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
> >   softpin mode, or through the write_domain when using relocations. It
> >   follows this contract.
> >
> > - lima follows this contract, see lima_gem_submit() which sets the
> >   exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
> >   bo
> >
> > - msm follows this contract, see msm_gpu_submit() which sets the
> >   exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
> >
> > - panfrost follows this contract with the shotgun approach of just
> >   always setting the exclusive fence, see
> >   panfrost_attach_object_fences(). Benefits of a single engine I guess
> >
> > - v3d follows this contract with the same shotgun approach in
> >   v3d_attach_fences_and_unlock_reservation(), but it has at least an
> >   XXX comment that maybe this should be improved
> >
> > - v4c uses the same shotgun approach of always setting an exclusive
> >   fence, see vc4_update_bo_seqnos()
> >
> > - vgem also follows this contract, see vgem_fence_attach_ioctl() and
> >   the VGEM_FENCE_WRITE. This is used in some igts to validate prime
> >   sharing with i915.ko without the need of a 2nd gpu
> >
> > - vritio follows this contract again with the shotgun approach of
> >   always setting an exclusive fence, see virtio_gpu_array_add_fence()
> >
> > This covers the setting of the exclusive fences when writing.
> >
> > Synchronizing against the exclusive fence is a lot more tricky, and I
> > only spot checked a few:
> >
> > - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
> >   implicit dependencies (which is used by vulkan)
> >
> > - etnaviv does this. Implicit dependencies are collected in
> >   submit_fence_sync(), again with an opt-out flag
> >   ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
> >   etnaviv_sched_dependency which is the
> >   drm_sched_backend_ops->dependency callback.
> >
> > - v4c seems to not do much here, maybe gets away with it by not having
> >   a scheduler and only a single engine. Since all newer broadcom chips than
> >   the OG vc4 use v3d for rendering, which follows this contract, the
> >   impact of this issue is fairly small.
> >
> > - v3d does this using the drm_gem_fence_array_add_implicit() helper,
> >   which then it's drm_sched_backend_ops->dependency callback
> >   v3d_job_dependency() picks up.
> >
> > - panfrost is nice here and tracks the implicit fences in
> >   panfrost_job->implicit_fences, which again the
> >   drm_sched_backend_ops->dependency callback panfrost_job_dependency()
> >   picks up. It is mildly questionable though since it only picks up
> >   exclusive fences in panfrost_acquire_object_fences(), but not buggy
> >   in practice because it also always sets the exclusive fence. It
> >   should pick up both sets of fences, just in case there's ever going
> >   to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
> >   pcie port and a real gpu, which might actually happen eventually. A
> >   bug, but easy to fix. Should probably use the
> >   drm_gem_fence_array_add_implicit() helper.
> >
> > - lima is nice an easy, uses drm_gem_fence_array_add_implicit() and
> >   the same schema as v3d.
> >
> > - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
> >   but because it doesn't use the drm/scheduler it handles fences from
> >   the wrong context with a synchronous dma_fence_wait. See
> >

RE: [igt-dev] [PATCH i-g-t] tests/kms_big_fb: Wait for vblank before collecting CRC

2021-05-21 Thread Srinivas, Vidya

Hello Juha-Pekka

We are seeing the CRC failures on Jasperlake systems. Sometimes the test passes 
and sometimes it fails throwing CRC error.

Regards
Vidya

-Original Message-
From: Juha-Pekka Heikkila  
Sent: Friday, May 21, 2021 5:00 PM
To: Srinivas, Vidya ; 
intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; 
igt-...@lists.freedesktop.org
Cc: Lin, Charlton 
Subject: Re: [igt-dev] [PATCH i-g-t] tests/kms_big_fb: Wait for vblank before 
collecting CRC

Hi Vidya,

on which machines this would help? I see there's many vblanks already being 
waited. There's igt_display_commit2 which probably will block and even if it 
didn't there's igt_pipe_crc_collect_crc(..) where crc calculation is started 
after flip and then get one crc before disabling crc counting again.

/Juha-Pekka

On 21.5.2021 7.08, Vidya Srinivas wrote:
> Without wait for vblank, CRC mismatch is seen between big and small 
> CRC on few systems
> 
> Change-Id: I3bec931aa901130997e693ac1cacf389e2a8100f
> Signed-off-by: Vidya Srinivas 
> ---
>   tests/kms_big_fb.c | 6 --
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/kms_big_fb.c b/tests/kms_big_fb.c index 
> b2027b6b9d1b..7d78ff829d41 100644
> --- a/tests/kms_big_fb.c
> +++ b/tests/kms_big_fb.c
> @@ -254,6 +254,7 @@ static void unset_lut(data_t *data)
>   static bool test_plane(data_t *data)
>   {
>   igt_plane_t *plane = data->plane;
> + igt_display_t *display = >display;
>   struct igt_fb *small_fb = >small_fb;
>   struct igt_fb *big_fb = >big_fb;
>   int w = data->big_fb_width - small_fb->width; @@ -337,16 +338,17 @@ 
> static bool test_plane(data_t *data)
>   igt_display_commit2(>display, data->display.is_atomic ?
>   COMMIT_ATOMIC : COMMIT_UNIVERSAL);
>   
> -
> + igt_wait_for_vblank(data->drm_fd, 
> +display->pipes[data->pipe].crtc_offset);
>   igt_pipe_crc_collect_crc(data->pipe_crc, _crc);
>   
>   igt_plane_set_fb(plane, big_fb);
>   igt_fb_set_position(big_fb, plane, x, y);
>   igt_fb_set_size(big_fb, plane, small_fb->width, 
> small_fb->height);
> +
>   igt_plane_set_size(plane, data->width, data->height);
>   igt_display_commit2(>display, data->display.is_atomic ?
>   COMMIT_ATOMIC : COMMIT_UNIVERSAL);
> -
> + igt_wait_for_vblank(data->drm_fd, 
> +display->pipes[data->pipe].crtc_offset);
>   igt_pipe_crc_collect_crc(data->pipe_crc, _crc);
>   
>   igt_plane_set_fb(plane, NULL);
>

Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op

2021-05-21 Thread Daniel Vetter

On Fri, May 21, 2021 at 09:43:59AM +0200, Christian König wrote:
> Am 20.05.21 um 19:08 schrieb Daniel Vetter:
> > [SNIP]
> > > AH! So we are basically telling the fence backend that we have just
> > > missed an event we waited for.
> > > 
> > > So what we want to know is how long the frontend wanted to wait instead
> > > of how long the backend took for rendering.
> > tbh I'm not sure the timestamp matters at all. What we do in i915 is
> > boost quite aggressively, and then let the usual clock tuning wittle
> > it down if we overshot. Plus soom cool-down to prevent
> > abuse/continuous boosting. I think we also differentiate between
> > display boost and userspace waits.
> 
> I was not thinking about time stamps here, but more like which information
> we need at which place.
> 
> > On the display side we also wait until the vblank has passed we aimed
> > for (atm always the next, we don't have target_frame support like
> > amdgpu), to avoid boosting when there's no point.
> > 
> > > > So boosting right when you've missed your frame (not what Rob implements
> > > > currently, but fixable) is the right semantics.
> > > > 
> > > > The other issue is that for cpu waits, we want to differentiate from 
> > > > fence
> > > > waits that userspace does intentially (e.g. wait ioctl) and waits that
> > > > random other things are doing within the kernel to keep track of 
> > > > progress.
> > > > 
> > > > For the former we know that userspace is stuck waiting for the gpu, and 
> > > > we
> > > > probably want to boost. For the latter we most definitely do _not_ want 
> > > > to
> > > > boost.
> > > > 
> > > > Otoh I do agree with you that the current api is a bit awkward, so 
> > > > perhaps
> > > > we do need a dma_fence_userspace_wait wrapper which boosts automatically
> > > > after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where 
> > > > you
> > > > give it a vblank target, and if the fence isn't signalled by then, we 
> > > > kick
> > > > it real hard.
> > > Yeah, something like an use case driven API would be nice to have.
> > > 
> > > For this particular case I suggest that we somehow extend the enable
> > > signaling callback.
> > > 
> > > > But otherwise yes this is absolutely a thing that matters a ton. If you
> > > > look at Matt Brost's scheduler rfc, there's also a line item in there
> > > > about adding this kind of boosting to drm/scheduler.
> > > BTW: I still can't see this in my inbox.
> > You've replied already:
> > 
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20210518235830.133834-1-matthew.brost%40intel.com%2Fdata=04%7C01%7Cchristian.koenig%40amd.com%7Ce4f3688b832842c4236e08d91bb1e148%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637571273080820910%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=uk3Gs%2FW42BDqMuMJtujcAH5GvN8mOlDnmywK8x1I%2F0k%3Dreserved=0
> 
> Yeah, but doesn't that also require some changes to the DRM scheduler?
> 
> I was expecting that this is a bit more than just two patches.

It's just the rfc document, per the new rfc process:

https://dri.freedesktop.org/docs/drm/gpu/rfc/

It's rather obviously not any piece of code in there, but just meant to
check rough direction before we go rewrite the entire i915 execbuf
frontend.
-Daniel

> 
> Christian.
> 
> > 
> > It's just the big picture plan of what areas we're all trying to
> > tackle with some why, so that everyone knows what's coming in the next
> > half year at least. Probably longer until this is all sorted. I think
> > Matt has some poc hacked-up pile, but nothing really to show.
> > -Daniel
> > 
> > > Do you have a link?
> > > 
> > > Christian.
> > > 
> > > > -Daniel
> > > > 
> > > > 
> > > > > Regards,
> > > > > Christian.
> > > > > 
> > > > > > BR,
> > > > > > -R
> > > > > > 
> > > > > > > Thanks,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > > BR,
> > > > > > > > -R
> > > > > > > > 
> > > > > > > > > Christian.
> > > > > > > > > 
> > > > > > > > > Am 19.05.21 um 20:38 schrieb Rob Clark:
> > > > > > > > > > From: Rob Clark 
> > > > > > > > > > 
> > > > > > > > > > Add a way to hint to the fence signaler that a fence waiter 
> > > > > > > > > > has missed a
> > > > > > > > > > deadline waiting on the fence.
> > > > > > > > > > 
> > > > > > > > > > In some cases, missing a vblank can result in lower gpu 
> > > > > > > > > > utilization,
> > > > > > > > > > when really we want to go in the opposite direction and 
> > > > > > > > > > boost gpu freq.
> > > > > > > > > > The boost callback gives some feedback to the fence 
> > > > > > > > > > signaler that we
> > > > > > > > > > are missing deadlines, so it can take this into account in 
> > > > > > > > > > it's freq/
> > > > > > > > > > utilization calculations.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Rob Clark 
> > > > > > > > > > ---
> > > > > > > > > >   include/linux/dma-fence.h | 26 
> > > > > > > > > >

1 2 >

1 - 100 of 190 matches

Mail list logo