Re: printk deadlock due to double lock attempt on current CPU's runqueue

2021-11-11 Thread Vincent Guittot
On Wed, 10 Nov 2021 at 20:50, Sultan Alsawaf  wrote:
>
> On Wed, Nov 10, 2021 at 10:00:35AM +0100, Vincent Guittot wrote:
> > Is it the same SCHED_WARN_ON(rq->tmp_alone_branch !=
> > &rq->leaf_cfs_rq_list); that generates the deadlock on v5.15 too ?
> >
> > one remaining tmp_alone_branch warning has been fixed in v5.15 with
> > 2630cde26711 ("sched/fair: Add ancestors of unthrottled undecayed cfs_rq")
>
> I should clarify that I didn't actually reproduce the issue on v5.15; I just
> saw that the call chain leading to the deadlock still existed in v5.15 after
> looking through the code.

Thanks for the clarification

>
> Failing the SCHED_WARN_ON(rq->tmp_alone_branch != &rq->leaf_cfs_rq_list);
> assert is extremely rare in my experience, and I don't have a reproducer. It
> has only happened once after months of heavy usage (with lots of reboots too,
> so not with crazy high uptime).
>
> Sultan


Re: [PATCH v3 2/5] dt-bindings: mfd: timers: Update maintainers for st,stm32-timers

2021-11-11 Thread Lee Jones
On Wed, 10 Nov 2021, Rob Herring wrote:

> On Wed, 10 Nov 2021 16:01:41 +0100, patrice.chot...@foss.st.com wrote:
> > From: Patrice Chotard 
> > 
> > Benjamin has left the company, remove his name from maintainers.
> > 
> > Signed-off-by: Patrice Chotard 
> > ---
> >  Documentation/devicetree/bindings/mfd/st,stm32-timers.yaml | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> 
> Lee indicated he was going to pick this one up, so:
> 
> Acked-by: Rob Herring 

Since you already merged the treewide patch, you may as well take
this too.  We'll work through any conflicts that may occur as a
result.

Acked-by: Lee Jones 

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs


[PATCH] drm/i915/guc/slpc: Check GuC status before freq boost

2021-11-11 Thread Vinay Belgaumkar
It's possible that i915 might get wedged between a boost
and un-boost. Validate the i915-GuC connection before trying
to send a H2G to change the min frequency.

Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/4464

Cc: Ashutosh Dixit 
Signed-off-by: Vinay Belgaumkar 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index 4e1d3cd29164..22c1c12369f2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -183,11 +183,15 @@ static int slpc_unset_param(struct intel_guc_slpc *slpc,
 static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
 {
 	struct drm_i915_private *i915 = slpc_to_i915(slpc);
+	struct intel_guc *guc = slpc_to_guc(slpc);
 	intel_wakeref_t wakeref;
 	int ret = 0;
 
 	lockdep_assert_held(&slpc->lock);
 
+	if (!intel_guc_is_ready(guc))
+		return -ENODEV;
+
 	/*
 	 * This function is a little different as compared to
 	 * intel_guc_slpc_set_min_freq(). Softlimit will not be updated
-- 
2.25.0



Re: [RFC PATCH v2 0/3] arm64: imx8mm: Add MIPI DSI support

2021-11-11 Thread Jagan Teki
On Fri, Nov 12, 2021 at 5:40 AM Tim Harvey  wrote:
>
> On Thu, Nov 11, 2021 at 2:15 AM Jagan Teki  wrote:
> >
> > This series supports MIPI DSI on i.MX8MM.
> >
> > The DSIM bridge still needs work to make it compatible with the
> > exynos drm dsi hardware block.
> >
> > This series works directly on top of linux-next with recent
> > dispmix-blk-ctrl changes.
> >
>
> Jagan,
>
> Thanks - I was able to get this series working using the set of
> exynos/drm patches from Michael submitted back in 2020-09-11:
> https://patchwork.kernel.org/project/dri-devel/list/?series=347439=both=*
>
> > Tested on i.Core MX8M Mini SoM with EDIMM2.2 and CTOUCH2
> > Carrier boards.
> >
> > Required changes:
> > 1. DSIM driver
> > https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210704090230.26489-1-ja...@amarulasolutions.com/
>
> This exynos/drm RFC series you posted back in July was where I
> recalled the discussion about if the exynos driver could be split up
> vs duplicating parts of it in a separate driver.

Not sure. Laurent and Inki had some discussion about this [1]; it looks
like they are still looking for a common driver.

>
> There were also some comments about this series. Can you address those
> comments, rebase and resend?
>
> I have not been able to get my hardware to work with this series yet
> and am still debugging that (currently crashing in
> samsung_dsim_host_attach)

I initially tried a separate driver instead of exynos [2].

>
> > 2. DPHY change
> > https://www.spinics.net/lists/devicetree/msg381691.html
>
> This was originally from Marek submitted on Oct 3 2020: [PATCH] phy:
> exynos-mipi-video: Add support for NXP i.MX8MM

I'm thinking this may not be required, as the dphy reset can now be
handled via blk-ctrl, like this [3]. I have tested the reset handling via
blk-ctrl and it works for me.

>
> This one seems to have been acked but never got picked up for some reason.
>
> Marek, can you add the tags and re-submit?
>
> > 3. Bus format fix
> > https://github.com/openedev/linux/commit/6ca9781ed53ea75e26341dd57250e63794638b20
> >
>
> Jagan, can you submit this?

This is indeed not required; drm handles the bridge state via the atomic
APIs. I did check that as well. I will link my latest series soon.

[1] 
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210704090230.26489-7-ja...@amarulasolutions.com/
[2] 
https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210621072424.111733-1-ja...@amarulasolutions.com/
[3] 
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20211106155427.753197-1-aford...@gmail.com/

Jagan.


[PATCH] drm/bridge: dw-mipi-dsi: Switch to atomic operations

2021-11-11 Thread Jagan Teki
Replace the enable/disable operations with their atomic versions to
continue the transition to the atomic API.

Also add the default drm atomic operations for the duplicate, destroy
and reset state APIs in order to have a smooth transition to the
atomic APIs.

Tested on Engicam i.Core STM32MP1 SoM.

Signed-off-by: Jagan Teki 
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index e44e18a0112a..ff0db96dfcd5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -871,7 +871,8 @@ static void dw_mipi_dsi_clear_err(struct dw_mipi_dsi *dsi)
 	dsi_write(dsi, DSI_INT_MSK1, 0);
 }
 
-static void dw_mipi_dsi_bridge_post_disable(struct drm_bridge *bridge)
+static void dw_mipi_dsi_bridge_post_atomic_disable(struct drm_bridge *bridge,
+						   struct drm_bridge_state *old_bridge_state)
 {
 	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
 	const struct dw_mipi_dsi_phy_ops *phy_ops = dsi->plat_data->phy_ops;
@@ -978,7 +979,8 @@ static void dw_mipi_dsi_bridge_mode_set(struct drm_bridge *bridge,
 	dw_mipi_dsi_mode_set(dsi->slave, adjusted_mode);
 }
 
-static void dw_mipi_dsi_bridge_enable(struct drm_bridge *bridge)
+static void dw_mipi_dsi_bridge_atomic_enable(struct drm_bridge *bridge,
+					     struct drm_bridge_state *old_bridge_state)
 {
 	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
 
@@ -1032,11 +1034,14 @@ static int dw_mipi_dsi_bridge_attach(struct drm_bridge *bridge,
 }
 
 static const struct drm_bridge_funcs dw_mipi_dsi_bridge_funcs = {
-	.mode_set     = dw_mipi_dsi_bridge_mode_set,
-	.enable       = dw_mipi_dsi_bridge_enable,
-	.post_disable = dw_mipi_dsi_bridge_post_disable,
-	.mode_valid   = dw_mipi_dsi_bridge_mode_valid,
-	.attach       = dw_mipi_dsi_bridge_attach,
+	.atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
+	.atomic_destroy_state   = drm_atomic_helper_bridge_destroy_state,
+	.atomic_reset           = drm_atomic_helper_bridge_reset,
+	.atomic_enable          = dw_mipi_dsi_bridge_atomic_enable,
+	.atomic_post_disable    = dw_mipi_dsi_bridge_post_atomic_disable,
+	.mode_set               = dw_mipi_dsi_bridge_mode_set,
+	.mode_valid             = dw_mipi_dsi_bridge_mode_valid,
+	.attach                 = dw_mipi_dsi_bridge_attach,
 };
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.25.1



[PATCH] drm/amd/display: fix cond_no_effect.cocci warnings

2021-11-11 Thread cgel . zte
From: Ye Guojin 

This was found by coccicheck:
./drivers/gpu/drm/amd/display/dc/core/dc_resource.c, 2516, 7-9, WARNING
possible condition with no effect (if == else)

hdmi_info.bits.YQ0_YQ1 is always YYC_QUANTIZATION_LIMITED_RANGE.

Reported-by: Zeal Robot 
Signed-off-by: Ye Guojin 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index fabe1b83bd4f..564163a85d2c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -2509,17 +2509,7 @@ static void set_avi_info_frame(
 
 	/* TODO : We should handle YCC quantization */
 	/* but we do not have matrix calculation */
-	if (stream->qy_bit == 1) {
-		if (color_space == COLOR_SPACE_SRGB ||
-			color_space == COLOR_SPACE_2020_RGB_FULLRANGE)
-			hdmi_info.bits.YQ0_YQ1 = YYC_QUANTIZATION_LIMITED_RANGE;
-		else if (color_space == COLOR_SPACE_SRGB_LIMITED ||
-					color_space == COLOR_SPACE_2020_RGB_LIMITEDRANGE)
-			hdmi_info.bits.YQ0_YQ1 = YYC_QUANTIZATION_LIMITED_RANGE;
-		else
-			hdmi_info.bits.YQ0_YQ1 = YYC_QUANTIZATION_LIMITED_RANGE;
-	} else
-		hdmi_info.bits.YQ0_YQ1 = YYC_QUANTIZATION_LIMITED_RANGE;
+	hdmi_info.bits.YQ0_YQ1 = YYC_QUANTIZATION_LIMITED_RANGE;
 
 	///VIC
 	format = stream->timing.timing_3d_format;
-- 
2.25.1
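
The coccicheck finding above relies on every branch of the removed conditional assigning the same value. A minimal userspace sketch of that reasoning follows; the enum, the MODEL_LIMITED_RANGE placeholder value and the function names are hypothetical stand-ins for illustration and are not taken from dc_resource.c.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's color space values. */
enum model_color_space {
	MODEL_CS_SRGB,
	MODEL_CS_2020_RGB_FULLRANGE,
	MODEL_CS_SRGB_LIMITED,
	MODEL_CS_2020_RGB_LIMITEDRANGE,
	MODEL_CS_OTHER,
};

/* Arbitrary placeholder for the single value every branch assigns. */
#define MODEL_LIMITED_RANGE 0

/* Shape of the logic before the patch: four branches, one result. */
static inline int yq_before(int qy_bit, enum model_color_space cs)
{
	if (qy_bit == 1) {
		if (cs == MODEL_CS_SRGB || cs == MODEL_CS_2020_RGB_FULLRANGE)
			return MODEL_LIMITED_RANGE;
		else if (cs == MODEL_CS_SRGB_LIMITED ||
			 cs == MODEL_CS_2020_RGB_LIMITEDRANGE)
			return MODEL_LIMITED_RANGE;
		else
			return MODEL_LIMITED_RANGE;
	}
	return MODEL_LIMITED_RANGE;
}

/* Shape of the logic after the patch: one unconditional assignment. */
static inline int yq_after(void)
{
	return MODEL_LIMITED_RANGE;
}

/* Exhaustively compare both forms over every input combination. */
static inline void check_collapse_is_equivalent(void)
{
	for (int qy = 0; qy <= 1; qy++)
		for (int cs = MODEL_CS_SRGB; cs <= MODEL_CS_OTHER; cs++)
			assert(yq_before(qy, (enum model_color_space)cs) ==
			       yq_after());
}
```

Calling check_collapse_is_equivalent() from a test program passes silently, which is the property that makes the one-line replacement behavior-preserving in this model.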



[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125

2021-11-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205089

--- Comment #23 from Joey Espinosa (jlouis.espin...@gmail.com) ---
... and I guess some of this info:

Mesa: 21.2.5
DE: Gnome 41.1
Vulkan: 1.2.189
Xorg: 1.20.11

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 214991] New: VC4 DRM waiting for flip down makes UI freeze a while with kernel 5.15

2021-11-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=214991

Bug ID: 214991
   Summary: VC4 DRM waiting for flip down makes UI freeze a while
with kernel 5.15
   Product: Drivers
   Version: 2.5
Kernel Version: 5.15
  Hardware: ARM
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: j...@endlessos.org
Regression: No

Created attachment 299547
  --> https://bugzilla.kernel.org/attachment.cgi?id=299547=edit
Full dmesg log

I tested Linux mainline kernel 5.15 (aarch64) with VC4 enabled on RPi 4B. I
notice that the UI sometimes freezes for a while (about 10 seconds).

The kernel shows these error messages during that time:

[drm:drm_crtc_commit_wait] *ERROR* flip_done timed out
[drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:68:crtc-3] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:68:crtc-3] commit wait timed out
[drm:drm_crtc_commit_wait] *ERROR* flip_done timed out
vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit

It is easy to reproduce this issue by running GL workloads, for example
es2gears.

After detailed testing, I found it is related to these commits:

f3c420fe19f8 ("drm/vc4: kms: Convert to atomic helpers")
82faa3276012 ("drm/vc4: kms: Remove async modeset semaphore")

This issue cannot be reproduced after I revert the commits.


[git pull] drm fixes + one missed next for 5.16-rc1

2021-11-11 Thread Dave Airlie
Hi Linus,

I missed a drm-misc-next pull for the main pull last week. It wasn't
that major and isn't the bulk of this at all. This has a bunch of
fixes all over, a lot for amdgpu and i915.

This contains a backmerge of 5.15 as we had a bunch of fixes queued up
in the past couple of days that were based on fixes that were in 5.15,
so I did a backmerge in so I could land them now instead of waiting
for post rc1. I think this also screwed up the diffstat.

Dave.


drm-next-2021-11-12:
drm next/fixes for 5.16-rc1

bridge:
- HPD improvements for lt9611uxc
- eDP aux-bus support for ps8640
- LVDS data-mapping selection support

ttm:
- remove huge page functionality (needs reworking)
- fix a race condition during BO eviction

panels:
- add some new panels

fbdev:
- fix double-free
- remove unused scrolling acceleration
- CONFIG_FB dep improvements

locking:
- improve contended locking logging
- naming collision fix

dma-buf:
- add dma_resv_for_each_fence iterator
- fix fence refcounting bug
- name locking fixes

prime:
- fix object references during mmap

nouveau:
- various code style changes
- refcount fix
- device removal fixes
- protect client list with a mutex
- fix CE0 address calculation

i915:
- DP rates related fixes
- Revert disabling dual eDP that was causing state readout problems
- put the cdclk vtables in const data
- Fix DVO port type for older platforms
- Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
- CCS FBs related fixes
- Fix recursive lock in GuC submission
- Revert guc_id from i915_request tracepoint
- Build fix around dmabuf

amdgpu:
- GPU reset fix
- Aldebaran fix
- Yellow Carp fixes
- DCN2.1 DMCUB fix
- IOMMU regression fix for Picasso
- DSC display fixes
- BPC display calculation fixes
- Other misc display fixes
- Don't allow partial copy from user for DC debugfs
- SRIOV fixes
- GFX9 CSB pin count fix
- Various IP version check fixes
- DP 2.0 fixes
- Limit DCN1 MPO fix to DCN1

amdkfd:
- SVM fixes
- Fix gfx version for renoir
- Reset fixes

udl:
- timeout fix

imx:
- circular locking fix

virtio:
- NULL ptr deref fix

The following changes since commit d9bd054177fbd2c4762546aec40fc3071bfe4cc0:

  Merge tag 'amd-drm-next-5.16-2021-10-29' of
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2021-11-02
12:40:58 +1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-next-2021-11-12

for you to fetch changes up to b6c24725249a6c1a889665d720cdff088f686f98:

  Merge tag 'drm-misc-fixes-2021-11-11' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next (2021-11-12
13:06:41 +1000)


Aaron Liu (1):
  drm/amdgpu: update RLC_PG_DELAY_3 Value to 200us for yellow carp

Alex Deucher (2):
  drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling
  drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()

Alex Sierra (2):
  drm/amdkfd: avoid recursive lock in migrations back to RAM
  drm/amdkfd: lower the VAs base offset to 8KB

Alex Xu (Hello71) (1):
  drm/plane-helper: fix uninitialized variable reference

Amos Kong (1):
  drm/ttm_bo_api: update the description for @placement and @sg

Anand K Mistry (1):
  drm/prime: Fix use after free in mmap with drm_gem_ttm_mmap


[PATCH] drivers:gpu: remove unneeded variable

2021-11-11 Thread cgel . zte
From: chiminghao 

Fix the following coccicheck REVIEW:
./drivers/gpu/drm/tegra/dpaux.c:282:13-16 REVIEW Unneeded variable

Reported-by: Zeal Robot 
Signed-off-by: chiminghao 
---
 drivers/gpu/drm/tegra/dpaux.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c
index 1f96e416fa08..b65b21f26d2b 100644
--- a/drivers/gpu/drm/tegra/dpaux.c
+++ b/drivers/gpu/drm/tegra/dpaux.c
@@ -279,7 +279,6 @@ static void tegra_dpaux_hotplug(struct work_struct *work)
 static irqreturn_t tegra_dpaux_irq(int irq, void *data)
 {
 	struct tegra_dpaux *dpaux = data;
-	irqreturn_t ret = IRQ_HANDLED;
 	u32 value;
 
 	/* clear interrupts */
@@ -296,7 +295,7 @@ static irqreturn_t tegra_dpaux_irq(int irq, void *data)
 	if (value & DPAUX_INTR_AUX_DONE)
 		complete(&dpaux->complete);
 
-	return ret;
+	return IRQ_HANDLED;
 }
 
 enum tegra_dpaux_functions {
-- 
2.25.1



[PATCH] drm/i915/pmu: Increase the live_engine_busy_stats sample period

2021-11-11 Thread Umesh Nerlige Ramappa
Irrespective of the backend for request submissions, busyness for an
engine with an active context is calculated using:

busyness = total + (current_time - context_switch_in_time)

In execlists mode of operation, the context switch events are handled
by the CPU. Context switch in/out time and current_time are captured
in CPU time domain using ktime_get().

In GuC mode of submission, context switch events are handled by GuC and
the times in the above formula are captured in GT clock domain. This
information is shared with the CPU through shared memory. This results
in 2 caveats:

1) The time taken between start of a batch and the time that CPU is able
to see the context_switch_in_time in shared memory is dependent on GuC
and memory bandwidth constraints.

2) Determining current_time requires an MMIO read that can take anywhere
between a few us to a couple ms. A reference CPU time is captured soon
after reading the MMIO so that the caller can compare the cpu delta
between 2 busyness samples. The issue here is that the CPU delta and the
busyness delta can be skewed because of the time taken to read the
register.

These 2 factors affect the accuracy of the selftest -
live_engine_busy_stats. For (1) the selftest waits until busyness stats
are visible to the CPU. The effects of (2) are more prominent for the
current busyness sample period of 100 us. Increase the busyness sample
period from 100 us to 10 ms to overcome (2).

Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
index 0bfd738dbf3a..96cc565afa78 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
@@ -316,7 +316,7 @@ static int live_engine_busy_stats(void *arg)
 		ENGINE_TRACE(engine, "measuring busy time\n");
 		preempt_disable();
 		de = intel_engine_get_busy_time(engine, &t[0]);
-		udelay(100);
+		udelay(10000);
 		de = ktime_sub(intel_engine_get_busy_time(engine, &t[1]), de);
 		preempt_enable();
 		dt = ktime_sub(t[1], t[0]);
-- 
2.20.1
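
The reasoning in the commit message above can be illustrated with a small userspace sketch. It restates the busyness formula and uses a deliberately simplified model in which the worst-case skew is the register read latency divided by the sample period. The function names and the 50 us latency used in the usage note are assumptions made for illustration, not measurements and not i915 code.

```c
#include <assert.h>
#include <stdint.h>

/* busyness = total + (current_time - context_switch_in_time) */
static inline uint64_t busyness_ns(uint64_t total_ns, uint64_t now_ns,
				   uint64_t switch_in_ns)
{
	assert(now_ns >= switch_in_ns);
	return total_ns + (now_ns - switch_in_ns);
}

/*
 * Simplified model (an assumption): the CPU delta and the busyness delta
 * can disagree by up to one register read latency, so the relative skew
 * shrinks as the sample period grows.
 */
static inline double sample_skew_percent(uint64_t read_latency_us,
					 uint64_t period_us)
{
	assert(period_us > 0);
	return 100.0 * (double)read_latency_us / (double)period_us;
}
```

Under this model, a hypothetical 50 us read latency gives sample_skew_percent(50, 100) = 50.0 for the old 100 us period and sample_skew_percent(50, 10000) = 0.5 for the new 10 ms period, which is the effect the patch is after.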



[PATCH] fs:btrfs: remove unneeded variable

2021-11-11 Thread cgel . zte
From: chiminghao 

Fix the following coccicheck REVIEW:
./fs/btrfs/extent_map.c:299:5-8 REVIEW Unneeded variable

Reported-by: Zeal Robot 
Signed-off-by: chiminghao 
---
 fs/btrfs/extent_map.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 5a36add21305..1dcb5486ccb6 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -296,7 +296,6 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
 int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len,
 		       u64 gen)
 {
-	int ret = 0;
 	struct extent_map *em;
 	bool prealloc = false;
 
@@ -328,7 +327,7 @@ int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len,
 	free_extent_map(em);
 out:
 	write_unlock(&tree->lock);
-	return ret;
+	return 0;
 
 }
 
-- 
2.25.1



RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on drm_dp_dpcd_access

2021-11-11 Thread Yuan, Perry
[AMD Official Use Only]

Hi Harry.

> -Original Message-
> From: Wentland, Harry 
> Sent: Wednesday, November 10, 2021 11:32 PM
> To: Yuan, Perry ; Jani Nikula
> ; Maarten Lankhorst
> ; Maxime Ripard ;
> Thomas Zimmermann ; David Airlie ;
> Daniel Vetter 
> Cc: Huang, Shimmer ; Huang, Ray
> ; linux-ker...@vger.kernel.org; dri-
> de...@lists.freedesktop.org; Limonciello, Mario 
> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on
> drm_dp_dpcd_access
> 
> On 2021-11-05 03:35, Yuan, Perry wrote:
> > [AMD Official Use Only]
> >
> > Hi Jani:
> >
> >
> >> -Original Message-
> >> From: Jani Nikula 
> >> Sent: Wednesday, November 3, 2021 7:31 PM
> >> To: Yuan, Perry ; Maarten Lankhorst
> >> ; Maxime Ripard
> >> ; Thomas Zimmermann ;
> David
> >> Airlie ; Daniel Vetter 
> >> Cc: dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org;
> >> Huang, Shimmer ; Huang, Ray
> 
> >> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> dereference on drm_dp_dpcd_access
> >>
> >> [CAUTION: External Email]
> >>
> >> On Wed, 03 Nov 2021, "Yuan, Perry"  wrote:
> >>> [AMD Official Use Only]
> >>>
> >>> Hi Jani:
> >>>
>  -Original Message-
>  From: Jani Nikula 
>  Sent: Tuesday, November 2, 2021 4:40 PM
>  To: Yuan, Perry ; Maarten Lankhorst
>  ; Maxime Ripard
>  ; Thomas Zimmermann ;
> >> David
>  Airlie ; Daniel Vetter 
>  Cc: dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org;
>  Huang, Shimmer ; Huang, Ray
> >> 
>  Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
>  dereference on drm_dp_dpcd_access
> 
>  [CAUTION: External Email]
> 
>  On Tue, 02 Nov 2021, "Yuan, Perry"  wrote:
> > [AMD Official Use Only]
> >
> > Hi Jani:
> > Thanks for your comments.
> >
> >> -Original Message-
> >> From: Jani Nikula 
> >> Sent: Monday, November 1, 2021 9:07 PM
> >> To: Yuan, Perry ; Maarten Lankhorst
> >> ; Maxime Ripard
> >> ; Thomas Zimmermann
> >> ;
>  David
> >> Airlie ; Daniel Vetter 
> >> Cc: Yuan, Perry ;
> >> dri-devel@lists.freedesktop.org; linux- ker...@vger.kernel.org;
> >> Huang, Shimmer ; Huang, Ray
>  
> >> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> dereference on drm_dp_dpcd_access
> >>
> >> [CAUTION: External Email]
> >>
> >> On Mon, 01 Nov 2021, Perry Yuan  wrote:
> >>> Fix below crash by adding a check in the drm_dp_dpcd_access
> >>> which ensures that aux->transfer was actually initialized earlier.
> >>
> >> Gut feeling says this is papering over a real usage issue
> >> somewhere else. Why is the aux being used for transfers before
> >> ->transfer has been set? Why should the dp helper be defensive
> >> against all kinds of
>  misprogramming?
> >>
> >>
> >> BR,
> >> Jani.
> >>
> >
> > The issue was found by the Intel IGT test suite, in the graphics bypass
> > test case:
> > https://gitlab.freedesktop.org/drm/igt-gpu-tools
> > A normal use case will not hit the issue.
> > To avoid hitting this issue again when we run the test case, it
> > would be good to add a check before the transfer is called.
> > As we can see, a check is really needed here to keep IGT happy.
> 
>  You're missing my point. What is the root cause? Why do you have
>  the aux device or connector registered before ->transfer function
>  is initialized. I don't think you should do that.
> 
>  BR,
>  Jani.
> 
> >>>
> >>> One potential IGT patch to resolve the test case failure is:
> >>>
> >>> tests/amdgpu/amd_bypass.c
> >>>   data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
> >>> -                                   AMDGPU_PIPE_CRC_SOURCE_DPRX);
> >>> +                                   INTEL_PIPE_CRC_SOURCE_AUTO);
> >>>
> >>> The kernel panic is gone after changing "dprx" to "auto" in the IGT test.
> >>>
> >>> In my view, the IGT amdgpu bypass test does some common setup work,
> >>> including the CRC pipe and source.
> >>> When IGT sets up a new CRC pipe capture source for the amdgpu bypass
> >>> test, the source is set to "dprx" instead of "auto".
> >>> This makes amdgpu_dm_crtc_set_crc_source() fail to set the correct AUX,
> >>> so its transfer function is invalid.
> >>> The system I tested uses an HDMI port connected to the monitor.
> >>>
> >>> amdgpu_dm_crtc_set_crc_source-> 

Re: [PATCH v5 7/7] drm/mediatek: Add mt8195 DisplayPort driver

2021-11-11 Thread kernel test robot
Hi Markus,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on robh/for-next]
[also build test ERROR on pza/reset/next linus/master v5.15 next-2021]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Markus-Schneider-Pargmann/drm-mediatek-Add-mt8195-DisplayPort-driver/20211021-172815
base:   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next
config: arm-allyesconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/cacf71bb7517b3ea11577d354daf7024551aa948
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Markus-Schneider-Pargmann/drm-mediatek-Add-mt8195-DisplayPort-driver/20211021-172815
git checkout cacf71bb7517b3ea11577d354daf7024551aa948
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/mediatek/mtk_dp.c:1031:6: error: no previous prototype for 'mtk_dp_initialize_settings' [-Werror=missing-prototypes]
1031 | void mtk_dp_initialize_settings(struct mtk_dp *mtk_dp)
 |  ^~
   In file included from include/linux/device.h:15,
from include/linux/acpi.h:15,
from include/linux/i2c.h:13,
from include/drm/drm_crtc.h:28,
from include/drm/drm_atomic_helper.h:31,
from drivers/gpu/drm/mediatek/mtk_dp.c:7:
   drivers/gpu/drm/mediatek/mtk_dp.c: In function 'mtk_dp_hpd_sink_event':
>> include/drm/drm_print.h:412:39: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'int'} [-Werror=format=]
 412 | dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
 |   ^~~~
   include/linux/dev_printk.h:110:30: note: in definition of macro 
'dev_printk_index_wrap'
 110 | _p_func(dev, fmt, ##__VA_ARGS__);
   \
 |  ^~~
   include/linux/dev_printk.h:150:58: note: in expansion of macro 'dev_fmt'
 150 | dev_printk_index_wrap(_dev_info, KERN_INFO, dev, 
dev_fmt(fmt), ##__VA_ARGS__)
 |  ^~~
   include/drm/drm_print.h:412:9: note: in expansion of macro 'dev_info'
 412 | dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
 | ^~~~
   include/drm/drm_print.h:416:9: note: in expansion of macro '__drm_printk'
 416 | __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
 | ^~~~
   drivers/gpu/drm/mediatek/mtk_dp.c:1445:17: note: in expansion of macro 
'drm_info'
1445 | drm_info(mtk_dp->drm_dev,
 | ^~~~
>> include/drm/drm_print.h:412:39: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'int'} [-Werror=format=]
 412 | dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
 |   ^~~~
   include/linux/dev_printk.h:110:30: note: in definition of macro 
'dev_printk_index_wrap'
 110 | _p_func(dev, fmt, ##__VA_ARGS__);
   \
 |  ^~~
   include/linux/dev_printk.h:150:58: note: in expansion of macro 'dev_fmt'
 150 | dev_printk_index_wrap(_dev_info, KERN_INFO, dev, 
dev_fmt(fmt), ##__VA_ARGS__)
 |  ^~~
   include/drm/drm_print.h:412:9: note: in expansion of macro 'dev_info'
 412 | dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
 | ^~~~
   include/drm/drm_print.h:416:9: note: in expansion of macro '__drm_printk'
 416 | __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
 | ^~~~
   drivers/gpu/drm/mediatek/mtk_dp.c:1452:17: note: in expansion of macro 
'drm_info'
1452 | drm_info(mtk_dp->drm_dev,
 | ^~~~
>> include/drm/drm_print.h:412:39: error: format '%ld' expects argument of type 
>> 'long int', but argument 3 has type 'ssi
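
The '%ld' errors above come from printing an ssize_t, which is an int on 32-bit ARM and a long on 64-bit targets. A portable approach is the 'z' length modifier. The userspace sketch below assumes a hypothetical helper name and message text; neither is taken from mtk_dp.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Hypothetical helper: formats a read result using %zd, which matches
 * ssize_t regardless of whether it is int or long on the target.
 */
static inline int format_read_result(char *buf, size_t size, ssize_t ret)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "read returned %zd", ret);
}
```

With a 64-byte buffer, format_read_result(buf, sizeof(buf), -5) fills buf with "read returned -5" on both 32-bit and 64-bit builds without a format warning. The separate missing-prototypes error is usually handled by marking the function static or declaring it in a header.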

Re: [PATCH] drm/i915: Use per device iommu check

2021-11-11 Thread Lu Baolu

On 11/11/21 11:18 PM, Tvrtko Ursulin wrote:


On 10/11/2021 14:37, Robin Murphy wrote:

On 2021-11-10 14:11, Tvrtko Ursulin wrote:


On 10/11/2021 12:35, Lu Baolu wrote:

On 2021/11/10 20:08, Tvrtko Ursulin wrote:


On 10/11/2021 12:04, Lu Baolu wrote:

On 2021/11/10 17:30, Tvrtko Ursulin wrote:


On 10/11/2021 07:12, Lu Baolu wrote:

Hi Tvrtko,

On 2021/11/9 20:17, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin

On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
and probe presence of iommu domain per device to accurately reflect its
status.

Signed-off-by: Tvrtko Ursulin
Cc: Lu Baolu
---
Baolu, is my understanding here correct? Maybe I am confused by both
intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
driver. But it certainly appears the setup can assign some iommu ops (and
assign the discrete i915 to iommu group) when those two are set to off.


diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e967cd08f23e..9fb38a54f1fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
 #define HAS_D12_PLANE_MINIMIZATION(dev_priv) (IS_ROCKETLAKE(dev_priv) || \
 					      IS_ALDERLAKE_S(dev_priv))

-static inline bool intel_vtd_active(void)
+static inline bool intel_vtd_active(struct drm_i915_private *i915)
 {
-#ifdef CONFIG_INTEL_IOMMU
-	if (intel_iommu_gfx_mapped)
+	if (iommu_get_domain_for_dev(i915->drm.dev))
 		return true;
-#endif

 	/* Running as a guest, we assume the host is enforcing VT'd */
 	return run_as_guest();
 }

Have you verified this change? I am afraid that
iommu_get_domain_for_dev() always gets a valid iommu domain even if
intel_iommu_gfx_mapped == 0.


Yes it seems to work as is:

default:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

intel_iommu=igfx_off:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

On my system dri device 0 is integrated graphics and 1 is discrete.


The drm device 0 has a dedicated iommu. When the user requests that igfx
not be mapped, the VT-d implementation will turn it off to save power.
But for a shared iommu, you will definitely get it enabled.


Sorry I am not following, what exactly do you mean? Is there a 
platform with integrated graphics without a dedicated iommu, in 
which case intel_iommu=igfx_off results in intel_iommu_gfx_mapped 
== 0 and iommu_get_domain_for_dev returning non-NULL?


Your code always works for an igfx with a dedicated iommu. This might
always be true on today's platforms. But from the driver's point of view,
we should not make such an assumption.

For example, if the iommu implementation decides not to turn off the
graphic iommu (perhaps due to some hw quirk or for graphic
virtualization), your code will be broken.


If I got it right, this would go back to your earlier recommendation 
to have the check look like this:


static bool intel_vtd_active(struct drm_i915_private *i915)
{
 struct iommu_domain *domain;

 domain = iommu_get_domain_for_dev(i915->drm.dev);
 if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
 return true;
 ...

This would be okay as a first step?

Elsewhere in the thread Robin suggested looking at the dev->dma_ops 
and comparing against iommu_dma_ops. Would these two solutions be 
effectively the same?


Effectively, yes. See iommu_setup_dma_ops() - the only way to end up 
with iommu_dma_ops is if a managed translation domain is present; if 
the IOMMU is present but the default domain type has been set to 
passthrough (either globally or forced for the given device) it will 
do nothing and leave you with dma-direct, while if the IOMMU has been 
ignored entirely then it should never even be called. Thus it neatly 
encapsulates what you're after here.


One concern I have is whether the pass-through mode truly does nothing 
or addresses perhaps still go through the dmar hardware just with no 
translation?


Pass-through mode means the latter.



If the latter, then the most like-for-like change is actually exactly 
what the first version of my patch did. That is, replace 
intel_iommu_gfx_mapped with a plain non-NULL check on 
iommu_get_domain_for_dev.


Depends on what you want here,

#1) the graphic device works in iommu pass-through mode
   - the device has an iommu
   - but iommu does no translation
   - the dma transactions go through iommu with the same destination
 memory address specified by the device;

#2) the graphic device works without a system iommu
   - the iommu is off
   - there's no iommu on the path of DMA transaction.
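
To make the difference between the two checks discussed above concrete, here is a minimal userspace sketch. It is not kernel code: the `sketch_*` names are hypothetical, and the flag values are hand-copied stand-ins for the `__IOMMU_DOMAIN_*` bits in include/linux/iommu.h, which I am assuming here rather than quoting. It only shows that a plain non-NULL check and a `__IOMMU_DOMAIN_PAGING` check disagree in exactly one situation, the pass-through case #1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hand-copied stand-ins for the flag bits in include/linux/iommu.h (assumed values). */
#define SKETCH_DOMAIN_PAGING  (1U << 0)  /* __IOMMU_DOMAIN_PAGING */
#define SKETCH_DOMAIN_DMA_API (1U << 1)  /* __IOMMU_DOMAIN_DMA_API */
#define SKETCH_DOMAIN_PT      (1U << 2)  /* __IOMMU_DOMAIN_PT */

#define SKETCH_TYPE_IDENTITY  (SKETCH_DOMAIN_PT)
#define SKETCH_TYPE_DMA       (SKETCH_DOMAIN_PAGING | SKETCH_DOMAIN_DMA_API)

/* Hypothetical miniature of struct iommu_domain. */
struct sketch_domain {
	unsigned int type;
};

/* First version of the patch: any attached domain counts as "VT-d active". */
bool sketch_active_non_null(const struct sketch_domain *domain)
{
	return domain != NULL;
}

/* Suggested refinement: only domains that actually translate count. */
bool sketch_active_paging(const struct sketch_domain *domain)
{
	return domain && (domain->type & SKETCH_DOMAIN_PAGING);
}

void sketch_vtd_demo(void)
{
	const struct sketch_domain translated = { .type = SKETCH_TYPE_DMA };
	const struct sketch_domain identity = { .type = SKETCH_TYPE_IDENTITY };

	/* Case #2: no iommu on the DMA path, both checks agree. */
	assert(!sketch_active_non_null(NULL));
	assert(!sketch_active_paging(NULL));

	/* Translated DMA: both checks agree. */
	assert(sketch_active_non_null(&translated));
	assert(sketch_active_paging(&translated));

	/* Case #1: pass-through, the two checks disagree. */
	assert(sketch_active_non_null(&identity));
	assert(!sketch_active_paging(&identity));
}
```

Which of the two answers is the right one for i915 depends on whether the driver cares about the hardware being on the DMA path at all or only about address translation, which is the question being asked in this thread.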

My suggestion works for 

Re: [PATCH] drm/i915: Use per device iommu check

2021-11-11 Thread Lu Baolu

On 11/11/21 11:06 PM, Tvrtko Ursulin wrote:


On 10/11/2021 12:35, Lu Baolu wrote:

On 2021/11/10 20:08, Tvrtko Ursulin wrote:


On 10/11/2021 12:04, Lu Baolu wrote:

On 2021/11/10 17:30, Tvrtko Ursulin wrote:


On 10/11/2021 07:12, Lu Baolu wrote:

Hi Tvrtko,

On 2021/11/9 20:17, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin

On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
and probe presence of iommu domain per device to accurately reflect its
status.

Signed-off-by: Tvrtko Ursulin
Cc: Lu Baolu
---
Baolu, is my understanding here correct? Maybe I am confused by both
intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
driver. But it certainly appears the setup can assign some iommu ops (and
assign the discrete i915 to iommu group) when those two are set to off.


diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e967cd08f23e..9fb38a54f1fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
  #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
(IS_ROCKETLAKE(dev_priv) || \

    IS_ALDERLAKE_S(dev_priv))

-static inline bool intel_vtd_active(void)
+static inline bool intel_vtd_active(struct drm_i915_private *i915)
  {
-#ifdef CONFIG_INTEL_IOMMU
-    if (intel_iommu_gfx_mapped)
+    if (iommu_get_domain_for_dev(i915->drm.dev))
  return true;
-#endif

  /* Running as a guest, we assume the host is enforcing VT'd */
  return run_as_guest();
  }

Have you verified this change? I am afraid that
iommu_get_domain_for_dev() always gets a valid iommu domain even when
intel_iommu_gfx_mapped == 0.


Yes it seems to work as is:

default:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

intel_iommu=igfx_off:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

On my system dri device 0 is integrated graphics and 1 is discrete.


The drm device 0 has a dedicated iommu. When the user requests that igfx
not be mapped, the VT-d implementation will turn it off to save power.
But for a shared iommu, you will definitely get it enabled.


Sorry I am not following, what exactly do you mean? Is there a 
platform with integrated graphics without a dedicated iommu, in which 
case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
iommu_get_domain_for_dev returning non-NULL?


Your code always works for an igfx with a dedicated iommu. This might
always be true on today's platforms. But from the driver's point of view,
we should not make such an assumption.

For example, if the iommu implementation decides not to turn off the
graphic iommu (perhaps due to some hw quirk or for graphic
virtualization), your code will be broken.


I tried your suggestion (checking for __IOMMU_DOMAIN_PAGING) and it 
works better, however I have observed one odd behaviour (for me at least).


In short - why does the DMAR mode for the discrete device change 
depending on igfx_off parameter?


Consider the laptop has these two graphics cards:

# cat /sys/kernel/debug/dri/0/name
i915 dev=:00:02.0 unique=:00:02.0 # integrated

# cat /sys/kernel/debug/dri/1/name
i915 dev=:03:00.0 unique=:03:00.0 # discrete

Booting with different options:
===

default / intel_iommu=on


# cat /sys/class/iommu/dmar0/devices/:00:02.0/iommu_group/type
DMA-FQ
# cat /sys/class/iommu/dmar2/devices/:03:00.0/iommu_group/type
DMA-FQ

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

All good.

intel_iommu=igfx_off


## no dmar0 in sysfs
# cat /sys/class/iommu/dmar2/devices/:03:00.0/iommu_group/type
identity

Unexpected!?

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled # At least the 
i915 patch detects it correctly.


intel_iommu=off
---

## no dmar0 in sysfs
## no dmar2 in sysfs

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled

All good.

The fact discrete graphics changes from translated to pass-through when 
igfx_off is set is surprising to me. Is this a bug?


The existing VT-d implementation doesn't distinguish igfx from dgfx. It
only checks whether the device is of a display class:

#define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY)
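
As an illustration of why that macro cannot tell integrated from discrete graphics, here is a small userspace sketch. The struct and the `sketch_*` names are hypothetical, and the class codes are illustrative values from the PCI class code table, not values read from the laptop discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PCI_BASE_CLASS_DISPLAY 0x03 /* mirrors PCI_BASE_CLASS_DISPLAY */

/* Hypothetical miniature of struct pci_dev: class is (base << 16 | sub << 8 | prog-if). */
struct sketch_pci_dev {
	uint32_t class;
};

/* Same test as IS_GFX_DEVICE(): only the base class byte is inspected. */
bool sketch_is_gfx_device(const struct sketch_pci_dev *pdev)
{
	return (pdev->class >> 16) == SKETCH_PCI_BASE_CLASS_DISPLAY;
}

void sketch_gfx_demo(void)
{
	/* Illustrative class codes only. */
	const struct sketch_pci_dev igfx = { .class = 0x030000 }; /* VGA controller */
	const struct sketch_pci_dev dgfx = { .class = 0x038000 }; /* other display controller */
	const struct sketch_pci_dev nic  = { .class = 0x020000 }; /* Ethernet controller */

	/* Both graphics devices match, so a gfx-wide policy applies to both. */
	assert(sketch_is_gfx_device(&igfx));
	assert(sketch_is_gfx_device(&dgfx));
	assert(!sketch_is_gfx_device(&nic));
}
```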

When igfx_off is specified, 

Re: [RFC PATCH v2 1/3] arm64: dts: imx8mm: Add eLCDIF node support

2021-11-11 Thread Tim Harvey
On Thu, Nov 11, 2021 at 2:15 AM Jagan Teki  wrote:
>
> Add eLCDIF controller node for i.MX8MM.
>

Jagan,

It doesn't look like you sent this to the Device Tree mailing list so
I added that to cc.

> Signed-off-by: Jagan Teki 
> ---
>  arch/arm64/boot/dts/freescale/imx8mm.dtsi | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi 
> b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> index c2f3f118f82e..caeb93313413 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> @@ -1068,6 +1068,25 @@ aips4: bus@32c0 {
> #size-cells = <1>;
> ranges = <0x32c0 0x32c0 0x40>;
>
> +   lcdif: lcdif@32e0 {
> +   compatible = "fsl,imx28-lcdif";
> +   reg = <0x32e0 0x1>;
> +   clocks = < IMX8MM_CLK_LCDIF_PIXEL>,
> +< IMX8MM_CLK_DISP_AXI_ROOT>,
> +< IMX8MM_CLK_DISP_APB_ROOT>;
> +   clock-names = "pix", "disp_axi", "axi";
> +   assigned-clocks = < 
> IMX8MM_CLK_LCDIF_PIXEL>,
> + < IMX8MM_CLK_DISP_AXI>,
> + < IMX8MM_CLK_DISP_APB>;
> +   assigned-clock-parents = < 
> IMX8MM_VIDEO_PLL1_OUT>,
> +< 
> IMX8MM_SYS_PLL2_1000M>,
> +< 
> IMX8MM_SYS_PLL1_800M>;
> +   assigned-clock-rate = <59400>, 
> <5>, <2>;
> +   interrupts = ;
> +   power-domains = <_blk_ctrl 
> IMX8MM_DISPBLK_PD_LCDIF>;
> +   status = "disabled";
> +   };
> +
> disp_blk_ctrl: blk-ctrl@32e28000 {
> compatible = "fsl,imx8mm-disp-blk-ctrl", 
> "syscon";
> reg = <0x32e28000 0x100>;
> --
> 2.25.1
>


Re: [RFC PATCH v2 2/3] arm64: dts: imx8mm: Add MIPI DSI pipeline

2021-11-11 Thread Tim Harvey
On Thu, Nov 11, 2021 at 2:15 AM Jagan Teki  wrote:
>
> Add MIPI DSI pipeline for i.MX8MM.
>
> Video pipeline start from eLCDIF to MIPI DSI and respective
> Panel or Bridge on the backend side.
>
> Add support for it.

Jagan,

Thanks for your continued work on IMX8MM DSI support!

It doesn't look like you sent this to the Device Tree mailing list so
I added that to cc.

>
> Signed-off-by: Jagan Teki 
> ---
>  arch/arm64/boot/dts/freescale/imx8mm.dtsi | 55 +++
>  1 file changed, 55 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi 
> b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> index caeb93313413..eddf3a467fd2 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> @@ -188,6 +188,12 @@ clk_ext4: clock-ext4 {
> clock-output-names = "clk_ext4";
> };
>
> +   mipi_phy: mipi-video-phy {
> +   compatible = "fsl,imx8mm-mipi-video-phy";
> +   syscon = <_blk_ctrl>;
> +   #phy-cells = <1>;
> +   };
> +
> psci {
> compatible = "arm,psci-1.0";
> method = "smc";
> @@ -1085,6 +1091,55 @@ lcdif: lcdif@32e0 {
> interrupts = ;
> power-domains = <_blk_ctrl 
> IMX8MM_DISPBLK_PD_LCDIF>;
> status = "disabled";
> +
> +   port {
> +   lcdif_out_dsi: endpoint {
> +   remote-endpoint = 
> <_in_lcdif>;
> +   };
> +   };
> +   };
> +
> +   dsi: dsi@32e1 {

I wonder if this should be 'mipi_dsi' like the CSI bindings
Adam's submitted here:
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20211106155427.753197-2-aford...@gmail.com/

> +   compatible = "fsl,imx8mm-mipi-dsim";
> +   reg = <0x32e1 0x400>;
> +   clocks = < IMX8MM_CLK_DSI_CORE>,
> +< IMX8MM_CLK_DSI_PHY_REF>;
> +   clock-names = "bus_clk", "sclk_mipi";
> +   assigned-clocks = < IMX8MM_CLK_DSI_CORE>,
> + < 
> IMX8MM_VIDEO_PLL1_OUT>,
> + < 
> IMX8MM_CLK_DSI_PHY_REF>;
> +   assigned-clock-parents = < 
> IMX8MM_SYS_PLL1_266M>,
> +< 
> IMX8MM_VIDEO_PLL1_BYPASS>,
> +< 
> IMX8MM_VIDEO_PLL1_OUT>;
> +   assigned-clock-rates = <26600>, 
> <59400>, <2700>;
> +   interrupts = ;
> +   phys = <_phy 0>;
> +   phy-names = "dsim";
> +   power-domains = <_blk_ctrl 
> IMX8MM_DISPBLK_PD_MIPI_DSI>;
> +   samsung,burst-clock-frequency = <89100>;
> +   samsung,esc-clock-frequency = <5400>;
> +   samsung,pll-clock-frequency = <2700>;
> +   status = "disabled";
> +
> +   ports {
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +
> +   port@0 {
> +   reg = <0>;
> +   #address-cells = <1>;
> +   #size-cells = <0>;

I don't think the '#address-cells' and '#size-cells' are needed here
but I defer to the dt experts!

> +
> +   dsi_in_lcdif: endpoint@0 {
> +   reg = <0>;

Per Adam's comment to my posting this should be just "port {" and we
can get rid of the @0 and the "reg=0"

Best regards,

Tim

> +   remote-endpoint = 
> <_out_dsi>;
> +   };
> +   };
> +
> +   port@1 {
> +   reg = <1>;
> +   };
> +   };
> };
>
> disp_blk_ctrl: blk-ctrl@32e28000 {
> --
> 2.25.1
>


[RESEND PATCH v2 13/13] arm64: dt: qcom: pm660l: Remove board-specific WLED configuration

2021-11-11 Thread Marijn Suijten
This string and electrical configuration depends on the board and panel,
and should hence not be defined generically for every user of pm660l.
SoMainline will pick this configuration again when enabling WLED on the
Sony Nile platform.

Fixes: 7b56a804e58b ("arm64: dts: qcom: pm660l: Add WLED support")
Signed-off-by: Marijn Suijten 
Reviewed-By: AngeloGioacchino Del Regno 

---
 arch/arm64/boot/dts/qcom/pm660l.dtsi | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/pm660l.dtsi 
b/arch/arm64/boot/dts/qcom/pm660l.dtsi
index 05086cbe573b..cfef42353611 100644
--- a/arch/arm64/boot/dts/qcom/pm660l.dtsi
+++ b/arch/arm64/boot/dts/qcom/pm660l.dtsi
@@ -72,13 +72,6 @@ pm660l_wled: leds@d800 {
interrupt-names = "ovp";
label = "backlight";

-   qcom,switching-freq = <800>;
-   qcom,ovp-millivolt = <29600>;
-   qcom,current-boost-limit = <970>;
-   qcom,current-limit-microamp = <2>;
-   qcom,num-strings = <2>;
-   qcom,enabled-strings = <0 1>;
-
status = "disabled";
};

--
2.33.0



[RESEND PATCH v2 10/13] arm64: dts: qcom: pmi8994: Fix "eternal"->"external" typo in WLED node

2021-11-11 Thread Marijn Suijten
The property is named "qcom,external-pfet", as found by
dt_binding_check:

'qcom,eternal-pfet' does not match any of the regexes

Fixes: 37aa540cbd30 ("arm64: dts: qcom: pmi8994: Add WLED node")
Signed-off-by: Marijn Suijten 
Reviewed-By: AngeloGioacchino Del Regno 

---
 arch/arm64/boot/dts/qcom/pmi8994.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/pmi8994.dtsi 
b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
index b4ac900ab115..a06ea9adae81 100644
--- a/arch/arm64/boot/dts/qcom/pmi8994.dtsi
+++ b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
@@ -42,7 +42,7 @@ pmi8994_wled: wled@d800 {
/* Yes, all four strings *have to* be defined or things won't work. */
qcom,enabled-strings = <0 1 2 3>;
qcom,cabc;
-   qcom,eternal-pfet;
+   qcom,external-pfet;
status = "disabled";
};
};
--
2.33.0



[RESEND PATCH v2 09/13] backlight: qcom-wled: Respect enabled-strings in set_brightness

2021-11-11 Thread Marijn Suijten
The hardware is capable of controlling any non-contiguous sequence of
LEDs specified in the DT using qcom,enabled-strings as u32
array, and this also follows from the DT-bindings documentation.  The
numbers specified in this array represent indices of the LED strings
that are to be enabled and disabled.

Its value is appropriately used to setup and enable string modules, but
completely disregarded in the set_brightness paths which only iterate
over the number of strings linearly.
Take an example where only string 2 is enabled with
qcom,enabled-strings = <2>: this string is appropriately enabled but
subsequent brightness changes would have only touched the zero'th
brightness register because num_strings is 1 here.  This is simply
addressed by looking up the string for this index in the enabled_strings
array just like the other codepaths that iterate over num_strings.

Likewise enabled_strings is now also used in the autodetection path for
consistent behaviour: when a list of strings is specified in DT only
those strings will be probed for autodetection, analogous to how the
number of strings that need to be probed is already bound by
qcom,num-strings.  After all autodetection uses the set_brightness
helpers to set an initial value, which could otherwise end up changing
brightness on a different set of strings.
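
The single-string example from the description can be replayed in a small userspace sketch. This is not the driver code: the `sketch_*` names are hypothetical and a plain array stands in for the per-string brightness registers. It only contrasts the old linear indexing with the lookup through enabled_strings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_STRINGS 4

/* Hypothetical stand-in for the driver state; bright_reg[] models the per-string registers. */
struct sketch_wled {
	uint32_t num_strings;
	uint32_t enabled_strings[SKETCH_MAX_STRINGS];
	uint16_t bright_reg[SKETCH_MAX_STRINGS];
};

/* Old behaviour: iterate 0..num_strings-1 and treat i as the string index. */
void sketch_set_brightness_linear(struct sketch_wled *w, uint16_t brightness)
{
	for (uint32_t i = 0; i < w->num_strings; ++i)
		w->bright_reg[i] = brightness;
}

/* New behaviour: look the string index up in enabled_strings[]. */
void sketch_set_brightness_mapped(struct sketch_wled *w, uint16_t brightness)
{
	for (uint32_t i = 0; i < w->num_strings; ++i) {
		/* Assumes indices were validated when the configuration was parsed. */
		assert(w->enabled_strings[i] < SKETCH_MAX_STRINGS);
		w->bright_reg[w->enabled_strings[i]] = brightness;
	}
}

void sketch_brightness_demo(void)
{
	/* Only string 2 is enabled, as in the example above. */
	struct sketch_wled w = { .num_strings = 1, .enabled_strings = { 2 } };

	sketch_set_brightness_linear(&w, 512);
	assert(w.bright_reg[0] == 512 && w.bright_reg[2] == 0); /* wrong register */

	memset(w.bright_reg, 0, sizeof(w.bright_reg));
	sketch_set_brightness_mapped(&w, 512);
	assert(w.bright_reg[0] == 0 && w.bright_reg[2] == 512); /* intended register */
}
```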

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Fixes: 03b2b5e86986 ("backlight: qcom-wled: Add support for WLED4 peripheral")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 4524e80591cd..bdda6b424113 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -237,7 +237,7 @@ static int wled3_set_brightness(struct wled *wled, u16 
brightness)

for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->ctrl_addr +
-  WLED3_SINK_REG_BRIGHT(i),
+  WLED3_SINK_REG_BRIGHT(wled->cfg.enabled_strings[i]),
   &v, sizeof(v));
if (rc < 0)
return rc;
@@ -259,7 +259,7 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)

for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->sink_addr +
-  WLED4_SINK_REG_BRIGHT(i),
+  WLED4_SINK_REG_BRIGHT(wled->cfg.enabled_strings[i]),
   &v, sizeof(v));
if (rc < 0)
return rc;
@@ -569,7 +569,7 @@ static irqreturn_t wled_short_irq_handler(int irq, void 
*_wled)

 static void wled_auto_string_detection(struct wled *wled)
 {
-   int rc = 0, i, delay_time_us;
+   int rc = 0, i, j, delay_time_us;
u32 sink_config = 0;
u8 sink_test = 0, sink_valid = 0, val;
bool fault_set;
@@ -616,14 +616,15 @@ static void wled_auto_string_detection(struct wled *wled)

/* Iterate through the strings one by one */
for (i = 0; i < wled->cfg.num_strings; i++) {
-   sink_test = BIT((WLED4_SINK_REG_CURR_SINK_SHFT + i));
+   j = wled->cfg.enabled_strings[i];
+   sink_test = BIT((WLED4_SINK_REG_CURR_SINK_SHFT + j));

/* Enable feedback control */
rc = regmap_write(wled->regmap, wled->ctrl_addr +
- WLED3_CTRL_REG_FEEDBACK_CONTROL, i + 1);
+ WLED3_CTRL_REG_FEEDBACK_CONTROL, j + 1);
if (rc < 0) {
dev_err(wled->dev, "Failed to enable feedback for SINK %d rc = %d\n",
-   i + 1, rc);
+   j + 1, rc);
goto failed_detect;
}

@@ -632,7 +633,7 @@ static void wled_auto_string_detection(struct wled *wled)
  WLED4_SINK_REG_CURR_SINK, sink_test);
if (rc < 0) {
dev_err(wled->dev, "Failed to configure SINK %d rc=%d\n",
-   i + 1, rc);
+   j + 1, rc);
goto failed_detect;
}

@@ -659,7 +660,7 @@ static void wled_auto_string_detection(struct wled *wled)

if (fault_set)
dev_dbg(wled->dev, "WLED OVP fault detected with SINK %d\n",
-   i + 1);
+   j + 1);
else
sink_valid |= sink_test;

@@ -699,15 +700,16 @@ static void wled_auto_string_detection(struct wled *wled)
/* Enable valid sinks */
if 

[RESEND PATCH v2 07/13] backlight: qcom-wled: Provide enabled_strings default for WLED 4 and 5

2021-11-11 Thread Marijn Suijten
Only WLED 3 sets a sensible default that allows operating this driver
with just qcom,num-strings in the DT; WLED 4 and 5 require
qcom,enabled-strings to be provided otherwise enabled_strings remains
zero-initialized, resulting in every string-specific register write
(currently only the setup and config functions, brightness follows in a
future patch) to only configure the zero'th string multiple times.
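
The effect of the missing default can be sketched in userspace C. The `sketch_*` names are hypothetical and this is not the driver's setup code; it only relies on the C rule that members left out of an initializer are zero, and shows which string indices a setup loop would then touch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_STRINGS 4

/* Hypothetical miniature of the relevant part of struct wled_config. */
struct sketch_cfg {
	uint32_t num_strings;
	uint32_t enabled_strings[SKETCH_MAX_STRINGS];
};

/* Returns a bitmask of the string indices a per-string setup loop would touch. */
uint32_t sketch_strings_touched(const struct sketch_cfg *cfg)
{
	uint32_t mask = 0;

	assert(cfg->num_strings <= SKETCH_MAX_STRINGS);
	for (uint32_t i = 0; i < cfg->num_strings; ++i) {
		assert(cfg->enabled_strings[i] < SKETCH_MAX_STRINGS);
		mask |= 1U << cfg->enabled_strings[i];
	}
	return mask;
}

void sketch_defaults_demo(void)
{
	/* No default: enabled_strings[] is implicitly {0, 0, 0, 0}. */
	const struct sketch_cfg without_default = { .num_strings = 3 };
	/* With the default added by this patch. */
	const struct sketch_cfg with_default = {
		.num_strings = 3,
		.enabled_strings = { 0, 1, 2, 3 },
	};

	assert(sketch_strings_touched(&without_default) == 0x1); /* string 0, three times */
	assert(sketch_strings_touched(&with_default) == 0x7);    /* strings 0, 1 and 2 */
}
```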

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index c342cd8440e1..a8fb8f19922d 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1077,6 +1077,7 @@ static const struct wled_config wled4_config_defaults = {
.cabc = false,
.external_pfet = false,
.auto_detection_enabled = false,
+   .enabled_strings = {0, 1, 2, 3},
 };

 static int wled5_setup(struct wled *wled)
@@ -1190,6 +1191,7 @@ static const struct wled_config wled5_config_defaults = {
.cabc = false,
.external_pfet = false,
.auto_detection_enabled = false,
+   .enabled_strings = {0, 1, 2, 3},
 };

 static const u32 wled3_boost_i_limit_values[] = {
--
2.33.0



[RESEND PATCH v2 12/13] arm64: dts: qcom: Move WLED num-strings from pmi8994 to sony-xperia-tone

2021-11-11 Thread Marijn Suijten
The number of WLED strings used by a certain platform depends on the
panel connected to that board and may not be the same for every user of
pmi8994.

Signed-off-by: Marijn Suijten 
Reviewed-By: AngeloGioacchino Del Regno 

---
 arch/arm64/boot/dts/qcom/msm8996-sony-xperia-tone.dtsi | 1 +
 arch/arm64/boot/dts/qcom/pmi8994.dtsi  | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8996-sony-xperia-tone.dtsi 
b/arch/arm64/boot/dts/qcom/msm8996-sony-xperia-tone.dtsi
index 507396c4d23b..ff7f39d29dd5 100644
--- a/arch/arm64/boot/dts/qcom/msm8996-sony-xperia-tone.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996-sony-xperia-tone.dtsi
@@ -620,6 +620,7 @@ pmi8994_s11: s11 {
 _wled {
status = "okay";
default-brightness = <512>;
+   qcom,num-strings = <3>;
 };

 _requests {
diff --git a/arch/arm64/boot/dts/qcom/pmi8994.dtsi 
b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
index 89ba4146e747..6e7c252568e6 100644
--- a/arch/arm64/boot/dts/qcom/pmi8994.dtsi
+++ b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
@@ -38,7 +38,6 @@ pmi8994_wled: wled@d800 {
reg = <0xd800 0xd900>;
interrupts = <3 0xd8 0x02 IRQ_TYPE_EDGE_RISING>;
interrupt-names = "short";
-   qcom,num-strings = <3>;
qcom,cabc;
qcom,external-pfet;
status = "disabled";
--
2.33.0



[RESEND PATCH v2 11/13] arm64: dts: qcom: pmi8994: Remove hardcoded linear WLED enabled-strings

2021-11-11 Thread Marijn Suijten
The driver now sets an appropriate default for WLED4 (and WLED5), just
like WLED3, making this linear array from 0-3 redundant.  In addition, the
driver is now able to parse arrays of variable length, resolving the "all
four strings *have to* be defined" comment.

Besides, the driver will now warn when both properties are specified to
prevent ambiguity: the length of the array is enough to imply a set
number of strings.

Signed-off-by: Marijn Suijten 
Reviewed-By: AngeloGioacchino Del Regno 

---
 arch/arm64/boot/dts/qcom/pmi8994.dtsi | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/pmi8994.dtsi 
b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
index a06ea9adae81..89ba4146e747 100644
--- a/arch/arm64/boot/dts/qcom/pmi8994.dtsi
+++ b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
@@ -39,8 +39,6 @@ pmi8994_wled: wled@d800 {
interrupts = <3 0xd8 0x02 IRQ_TYPE_EDGE_RISING>;
interrupt-names = "short";
qcom,num-strings = <3>;
-   /* Yes, all four strings *have to* be defined or things won't work. */
-   qcom,enabled-strings = <0 1 2 3>;
qcom,cabc;
qcom,external-pfet;
status = "disabled";
--
2.33.0



[RESEND PATCH v2 08/13] backlight: qcom-wled: Remove unnecessary double whitespace

2021-11-11 Thread Marijn Suijten
Remove redundant spaces inside for loop conditions.  No other double
spaces were found that are not part of indentation with `[^\s]  `.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index a8fb8f19922d..4524e80591cd 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -235,7 +235,7 @@ static int wled3_set_brightness(struct wled *wled, u16 
brightness)

v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);

-   for (i = 0;  i < wled->cfg.num_strings; ++i) {
+   for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->ctrl_addr +
   WLED3_SINK_REG_BRIGHT(i),
   &v, sizeof(v));
@@ -257,7 +257,7 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)

v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);

-   for (i = 0;  i < wled->cfg.num_strings; ++i) {
+   for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->sink_addr +
   WLED4_SINK_REG_BRIGHT(i),
   &v, sizeof(v));
--
2.33.0



[RESEND PATCH v2 05/13] backlight: qcom-wled: Override default length with qcom,enabled-strings

2021-11-11 Thread Marijn Suijten
The length of qcom,enabled-strings as property array is enough to
determine the number of strings to be enabled, without needing to set
qcom,num-strings to override the default number of strings when fewer
than the default (which is also the maximum) are provided in DT.
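
A minimal sketch of that rule, assuming a hypothetical helper `sketch_resolve_num_strings()` that is not part of the driver: when qcom,enabled-strings is present its element count becomes the number of strings, otherwise the default is kept, and a count above the maximum is rejected.

```c
#include <assert.h>
#include <errno.h>

/*
 * enabled_len is the element count of qcom,enabled-strings, or <= 0 when the
 * property is absent. Hypothetical helper, for illustration only.
 */
int sketch_resolve_num_strings(int default_count, int max_count,
			       int enabled_len, int *num_strings)
{
	assert(num_strings);

	if (enabled_len > max_count)
		return -EINVAL;

	/* The array length alone decides how many strings are enabled. */
	*num_strings = (enabled_len > 0) ? enabled_len : default_count;
	return 0;
}

void sketch_num_strings_demo(void)
{
	int n = 0;

	/* Property absent: keep the default. */
	assert(sketch_resolve_num_strings(4, 4, 0, &n) == 0 && n == 4);
	/* Two entries listed: two strings, no qcom,num-strings needed. */
	assert(sketch_resolve_num_strings(4, 4, 2, &n) == 0 && n == 2);
	/* More entries than the hardware has strings: rejected. */
	assert(sketch_resolve_num_strings(4, 4, 5, &n) == -EINVAL);
}
```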

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index c5232478a343..9bfbf601762a 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1518,6 +1518,8 @@ static int wled_configure(struct wled *wled)
return -EINVAL;
}
}
+
+   cfg->num_strings = string_len;
}

rc = of_property_read_u32(dev->of_node, "qcom,num-strings", &val);
--
2.33.0



[RESEND PATCH v2 00/13] backlight: qcom-wled: fix and solidify handling of enabled-strings

2021-11-11 Thread Marijn Suijten
This patchset fixes WLED's handling of enabled-strings: besides some
cleanup it is now actually possible to specify a non-contiguous array of
enabled strings (not necessarily starting at zero) and the values from
DT are now validated to prevent possible unexpected out-of-bounds
register and array element accesses.
Off-by-one mistakes in the maximum number of strings, also causing
out-of-bounds access, have been addressed as well.

Changes in v2:
- Reordered patch 4/10 (Validate enabled string indices in DT) to sit
  before patch 1/10 (Pass number of elements to read to read_u32_array);
- Pulled qcom,num-strings out of the DT enumeration parser, and moved it
  after qcom,enabled-strings parser to always have final sign-off over
  the number of strings;
- Extra validation for this number of strings against
  qcom,enabled-strings;
- Recombined patch 9 (Consistently use enabled-strings in
  set_brightness) and patch 10 (Consider enabled_strings in
  autodetection), which both solve the same problem in two different
  functions.  In addition the autodetection code uses set_brightness as
  helper already;
- Improved DT configurations for pmi8994 and pm660l, currently in 5.15
  rc's.

v1: 
https://lore.kernel.org/dri-devel/20211004192741.621870-1-marijn.suij...@somainline.org

Marijn Suijten (13):
  backlight: qcom-wled: Validate enabled string indices in DT
  backlight: qcom-wled: Pass number of elements to read to
read_u32_array
  backlight: qcom-wled: Use cpu_to_le16 macro to perform conversion
  backlight: qcom-wled: Fix off-by-one maximum with default num_strings
  backlight: qcom-wled: Override default length with
qcom,enabled-strings
  backlight: qcom-wled: Remove unnecessary 4th default string in WLED3
  backlight: qcom-wled: Provide enabled_strings default for WLED 4 and 5
  backlight: qcom-wled: Remove unnecessary double whitespace
  backlight: qcom-wled: Respect enabled-strings in set_brightness
  arm64: dts: qcom: pmi8994: Fix "eternal"->"external" typo in WLED node
  arm64: dts: qcom: pmi8994: Remove hardcoded linear WLED
enabled-strings
  arm64: dts: qcom: Move WLED num-strings from pmi8994 to
sony-xperia-tone
  arm64: dt: qcom: pm660l: Remove board-specific WLED configuration

 .../dts/qcom/msm8996-sony-xperia-tone.dtsi|   1 +
 arch/arm64/boot/dts/qcom/pm660l.dtsi  |   7 -
 arch/arm64/boot/dts/qcom/pmi8994.dtsi |   5 +-
 drivers/video/backlight/qcom-wled.c   | 131 ++
 4 files changed, 73 insertions(+), 71 deletions(-)

--
2.33.0



[RESEND PATCH v2 06/13] backlight: qcom-wled: Remove unnecessary 4th default string in WLED3

2021-11-11 Thread Marijn Suijten
The previous commit improves num_strings parsing to not go over the
maximum of 3 strings for WLED3 anymore.  Likewise this default index for
a hypothetical 4th string is invalid and could access registers that are
not mapped to the desired purpose.
Removing this value gets rid of undesired confusion and avoids the
possibility of accessing registers at this offset even if the 4th array
element is used by accident.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 9bfbf601762a..c342cd8440e1 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -946,7 +946,7 @@ static const struct wled_config wled3_config_defaults = {
.cs_out_en = false,
.ext_gen = false,
.cabc = false,
-   .enabled_strings = {0, 1, 2, 3},
+   .enabled_strings = {0, 1, 2},
 };

 static int wled4_setup(struct wled *wled)
--
2.33.0



[RESEND PATCH v2 03/13] backlight: qcom-wled: Use cpu_to_le16 macro to perform conversion

2021-11-11 Thread Marijn Suijten
The kernel already provides appropriate primitives to perform endianness
conversion which should be used in favour of manual bit-wrangling.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index d413b913fef3..977cd75827d7 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -231,14 +231,14 @@ struct wled {
 static int wled3_set_brightness(struct wled *wled, u16 brightness)
 {
int rc, i;
-   u8 v[2];
+   u16 v;

-   v[0] = brightness & 0xff;
-   v[1] = (brightness >> 8) & 0xf;
+   v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);

for (i = 0;  i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->ctrl_addr +
-  WLED3_SINK_REG_BRIGHT(i), v, 2);
+  WLED3_SINK_REG_BRIGHT(i),
+  &v, sizeof(v));
if (rc < 0)
return rc;
}
@@ -249,19 +249,18 @@ static int wled3_set_brightness(struct wled *wled, u16 
brightness)
 static int wled4_set_brightness(struct wled *wled, u16 brightness)
 {
int rc, i;
-   u16 low_limit = wled->max_brightness * 4 / 1000;
-   u8 v[2];
+   u16 v, low_limit = wled->max_brightness * 4 / 1000;

/* WLED4's lower limit of operation is 0.4% */
if (brightness > 0 && brightness < low_limit)
brightness = low_limit;

-   v[0] = brightness & 0xff;
-   v[1] = (brightness >> 8) & 0xf;
+   v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);

for (i = 0;  i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->sink_addr +
-  WLED4_SINK_REG_BRIGHT(i), v, 2);
+  WLED4_SINK_REG_BRIGHT(i),
+  &v, sizeof(v));
if (rc < 0)
return rc;
}
@@ -272,22 +271,20 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)
 static int wled5_set_brightness(struct wled *wled, u16 brightness)
 {
int rc, offset;
-   u16 low_limit = wled->max_brightness * 1 / 1000;
-   u8 v[2];
+   u16 v, low_limit = wled->max_brightness * 1 / 1000;

/* WLED5's lower limit is 0.1% */
if (brightness < low_limit)
brightness = low_limit;

-   v[0] = brightness & 0xff;
-   v[1] = (brightness >> 8) & 0x7f;
+   v = cpu_to_le16(brightness & WLED5_SINK_REG_BRIGHT_MAX_15B);

offset = (wled->cfg.mod_sel == MOD_A) ?
  WLED5_SINK_REG_MOD_A_BRIGHTNESS_LSB :
  WLED5_SINK_REG_MOD_B_BRIGHTNESS_LSB;

rc = regmap_bulk_write(wled->regmap, wled->sink_addr + offset,
-  v, 2);
+  &v, sizeof(v));
return rc;
 }

--
2.33.0



[RESEND PATCH v2 02/13] backlight: qcom-wled: Pass number of elements to read to read_u32_array

2021-11-11 Thread Marijn Suijten
of_property_read_u32_array takes the number of elements to read as last
argument. This does not always need to be 4 (sizeof(u32)) but should
instead be the size of the array in DT as read just above with
of_property_count_elems_of_size.

To not make such an error go unnoticed again the driver now bails
accordingly when of_property_read_u32_array returns an error.
Surprisingly the indentation of newlined arguments is lining up again
after prepending `rc = `.
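
The failure mode can be sketched with a mock reader in userspace. `sketch_read_u32_array()` is a hypothetical stand-in that only imitates, as far as I understand it, how of_property_read_u32_array fails with -EOVERFLOW when asked for more elements than the property holds; the property contents are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mock reader: copies count elements, or fails if the property is shorter. */
int sketch_read_u32_array(const uint32_t *prop, size_t prop_len,
			  uint32_t *out, size_t count)
{
	if (count > prop_len)
		return -EOVERFLOW;
	memcpy(out, prop, count * sizeof(*out));
	return 0;
}

void sketch_read_demo(void)
{
	const uint32_t prop[] = { 2, 3 }; /* hypothetical qcom,enabled-strings = <2 3> */
	const size_t string_len = sizeof(prop) / sizeof(prop[0]);
	uint32_t enabled[4] = { 0, 1, 2, 3 }; /* driver defaults */

	/* Old call: always asks for sizeof(u32) == 4 elements, result ignored. */
	assert(sketch_read_u32_array(prop, string_len, enabled,
				     sizeof(uint32_t)) == -EOVERFLOW);
	assert(enabled[0] == 0 && enabled[1] == 1); /* defaults silently kept */

	/* Fixed call: ask for exactly as many elements as were counted. */
	assert(sketch_read_u32_array(prop, string_len, enabled, string_len) == 0);
	assert(enabled[0] == 2 && enabled[1] == 3);
}
```

This would also explain why a four-element array in DT happened to work with the old call, since four elements is the only length for which the fixed request of four succeeds.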

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c b/drivers/video/backlight/qcom-wled.c
index 8a42ed89c59c..d413b913fef3 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1535,10 +1535,15 @@ static int wled_configure(struct wled *wled)
return -EINVAL;
}

-   of_property_read_u32_array(dev->of_node,
+   rc = of_property_read_u32_array(dev->of_node,
"qcom,enabled-strings",
wled->cfg.enabled_strings,
-   sizeof(u32));
+   string_len);
+   if (rc) {
+   dev_err(dev, "Failed to read %d elements from qcom,enabled-strings: %d\n",
+   string_len, rc);
+   return rc;
+   }

for (i = 0; i < string_len; ++i) {
if (wled->cfg.enabled_strings[i] >= wled->max_string_count) {
--
2.33.0
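
To illustrate the count-versus-size mix-up this patch describes, here is a small userspace sketch. `fake_read_u32_array()` is a hypothetical stand-in that only mimics the "read exactly this many elements or fail" contract; it is not the OF API, and its error value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for of_property_read_u32_array(): copy exactly
 * `count` elements, or fail if the property holds fewer than that.
 */
int fake_read_u32_array(const uint32_t *prop, size_t prop_len,
			uint32_t *out, size_t count)
{
	size_t i;

	assert(prop != NULL && out != NULL);
	if (count > prop_len)
		return -1;	/* property too short for the request */
	for (i = 0; i < count; i++)
		out[i] = prop[i];
	return 0;
}

/* The old call: the count is sizeof(u32), i.e. always 4 elements. */
int read_with_sizeof(const uint32_t *prop, size_t prop_len, uint32_t *out)
{
	return fake_read_u32_array(prop, prop_len, out, sizeof(uint32_t));
}

/* The fixed call: ask for the number of elements counted beforehand. */
int read_with_count(const uint32_t *prop, size_t prop_len, uint32_t *out)
{
	return fake_read_u32_array(prop, prop_len, out, prop_len);
}
```

With a three-element property the old call asks for four elements and fails, which previously went unnoticed because the return value was ignored.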



[RESEND PATCH v2 04/13] backlight: qcom-wled: Fix off-by-one maximum with default num_strings

2021-11-11 Thread Marijn Suijten
When not specifying num-strings in the DT the default is used, but +1 is
added to it which turns WLED3 into 4 and WLED4/5 into 5 strings instead
of 3 and 4 respectively, causing out-of-bounds reads and register
read/writes.  This +1 exists for a deficiency in the DT parsing code,
and is simply omitted entirely - solving this oob issue - by parsing the
property separately much like qcom,enabled-strings.

This also allows more stringent checks on the maximum value when
qcom,enabled-strings is provided in the DT.  Note that num-strings is
parsed after enabled-strings to give it final sign-off over the length,
which DT currently utilizes to get around an incorrect fixed read of
four elements from that array (has been addressed in a prior patch).

Fixes: 93c64f1ea1e8 ("leds: add Qualcomm PM8941 WLED driver")
Signed-off-by: Marijn Suijten 
Reviewed-By: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 51 +++--
 1 file changed, 19 insertions(+), 32 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c b/drivers/video/backlight/qcom-wled.c
index 977cd75827d7..c5232478a343 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1253,21 +1253,6 @@ static const struct wled_var_cfg wled5_ovp_cfg = {
.size = 16,
 };

-static u32 wled3_num_strings_values_fn(u32 idx)
-{
-   return idx + 1;
-}
-
-static const struct wled_var_cfg wled3_num_strings_cfg = {
-   .fn = wled3_num_strings_values_fn,
-   .size = 3,
-};
-
-static const struct wled_var_cfg wled4_num_strings_cfg = {
-   .fn = wled3_num_strings_values_fn,
-   .size = 4,
-};
-
 static u32 wled3_switch_freq_values_fn(u32 idx)
 {
return 19200 / (2 * (1 + idx));
@@ -1341,11 +1326,6 @@ static int wled_configure(struct wled *wled)
.val_ptr = >switch_freq,
.cfg = _switch_freq_cfg,
},
-   {
-   .name = "qcom,num-strings",
-   .val_ptr = >num_strings,
-   .cfg = _num_strings_cfg,
-   },
};

const struct wled_u32_opts wled4_opts[] = {
@@ -1369,11 +1349,6 @@ static int wled_configure(struct wled *wled)
.val_ptr = >switch_freq,
.cfg = _switch_freq_cfg,
},
-   {
-   .name = "qcom,num-strings",
-   .val_ptr = >num_strings,
-   .cfg = _num_strings_cfg,
-   },
};

const struct wled_u32_opts wled5_opts[] = {
@@ -1397,11 +1372,6 @@ static int wled_configure(struct wled *wled)
.val_ptr = >switch_freq,
.cfg = _switch_freq_cfg,
},
-   {
-   .name = "qcom,num-strings",
-   .val_ptr = >num_strings,
-   .cfg = _num_strings_cfg,
-   },
{
.name = "qcom,modulator-sel",
.val_ptr = >mod_sel,
@@ -1520,8 +1490,6 @@ static int wled_configure(struct wled *wled)
*bool_opts[i].val_ptr = true;
}

-   cfg->num_strings = cfg->num_strings + 1;
-
string_len = of_property_count_elems_of_size(dev->of_node,
 "qcom,enabled-strings",
 sizeof(u32));
@@ -1552,6 +1520,25 @@ static int wled_configure(struct wled *wled)
}
}

+   rc = of_property_read_u32(dev->of_node, "qcom,num-strings", &val);
+   if (!rc) {
+   if (val < 1 || val > wled->max_string_count) {
+   dev_err(dev, "qcom,num-strings must be between 1 and %d\n",
+   wled->max_string_count);
+   return -EINVAL;
+   }
+
+   if (string_len > 0) {
+   dev_warn(dev, "qcom,num-strings and qcom,enabled-strings are ambiguous\n");
+   if (val > string_len) {
+   dev_err(dev, "qcom,num-strings exceeds qcom,enabled-strings\n");
+   return -EINVAL;
+   }
+   }
+
+   cfg->num_strings = val;
+   }
+
return 0;
 }

--
2.33.0
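
The bounds checks added for qcom,num-strings can be summarised as a standalone function. This is only a sketch of the decision logic in the hunk above: `check_num_strings()` is a made-up name, and a non-positive `string_len` stands for "qcom,enabled-strings not present".

```c
#include <assert.h>
#include <errno.h>

/*
 * Sketch of the validation: val must lie in 1..max_string_count, and,
 * if qcom,enabled-strings was given (string_len > 0), must not exceed it.
 * Returns 0 when val may be used as num_strings, -EINVAL otherwise.
 */
int check_num_strings(unsigned int val, unsigned int max_string_count,
		      int string_len)
{
	assert(max_string_count > 0);

	if (val < 1 || val > max_string_count)
		return -EINVAL;

	if (string_len > 0 && val > (unsigned int)string_len)
		return -EINVAL;

	return 0;
}
```

For WLED3 (three strings at most, going by the commit message) a value of 4 is rejected here, which is exactly the value the removed `+ 1` used to produce from the default.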



[RESEND PATCH v2 01/13] backlight: qcom-wled: Validate enabled string indices in DT

2021-11-11 Thread Marijn Suijten
The strings passed in DT may possibly cause out-of-bounds register
accesses and should be validated before use.

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/video/backlight/qcom-wled.c b/drivers/video/backlight/qcom-wled.c
index d094299c2a48..8a42ed89c59c 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1528,12 +1528,28 @@ static int wled_configure(struct wled *wled)
string_len = of_property_count_elems_of_size(dev->of_node,
 "qcom,enabled-strings",
 sizeof(u32));
-   if (string_len > 0)
+   if (string_len > 0) {
+   if (string_len > wled->max_string_count) {
+   dev_err(dev, "Cannot have more than %d strings\n",
+   wled->max_string_count);
+   return -EINVAL;
+   }
+
of_property_read_u32_array(dev->of_node,
"qcom,enabled-strings",
wled->cfg.enabled_strings,
sizeof(u32));

+   for (i = 0; i < string_len; ++i) {
+   if (wled->cfg.enabled_strings[i] >= wled->max_string_count) {
+   dev_err(dev,
+   "qcom,enabled-strings index %d at %d is out of bounds\n",
+   wled->cfg.enabled_strings[i], i);
+   return -EINVAL;
+   }
+   }
+   }
+
return 0;
 }

--
2.33.0
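
The per-index check added here amounts to a linear scan over the enabled-strings array. A minimal userspace sketch, with the hypothetical helper name `first_bad_string_index()`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the position of the first entry that is not a valid string
 * index (i.e. >= max_string_count), or -1 if every entry is in range.
 */
int first_bad_string_index(const uint32_t *enabled, size_t len,
			   uint32_t max_string_count)
{
	size_t i;

	assert(enabled != NULL || len == 0);

	for (i = 0; i < len; i++) {
		if (enabled[i] >= max_string_count)
			return (int)i;
	}
	return -1;
}
```

Valid indices are zero-based, so with a maximum of three strings the entries 0, 1 and 2 pass and 3 is rejected.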



Re: [RFC PATCH v2 0/3] arm64: imx8mm: Add MIPI DSI support

2021-11-11 Thread Tim Harvey
On Thu, Nov 11, 2021 at 2:15 AM Jagan Teki  wrote:
>
> This series support MIPI DSI on i.MX8MM.
>
> The DSIM bridge still need to work to make it compatible for
> exynos drm dsi hardware block.
>
> This series work directly on to of linux-next with recent
> dispmix-blk-ctrl changes.
>

Jagan,

Thanks - I was able to get this series working using the set of
exynos/drm patches from Michael submitted back in 2020-09-11:
https://patchwork.kernel.org/project/dri-devel/list/?series=347439=both=*

> Tested on i.Core MX8M Mini SoM with EDIMM2.2 and CTOUCH2
> Carrier boards.
>
> Required changes:
> 1. DSIM driver
> https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210704090230.26489-1-ja...@amarulasolutions.com/

This exynos/drm RFC series you posted back in July is where I
recall the discussion about whether the exynos driver could be split
up rather than duplicating parts of it in a separate driver.

There were also some comments about this series. Can you address those
comments, rebase and resend?

I have not been able to get my hardware to work with this series yet
and am still debugging that (currently crashing in
samsung_dsim_host_attach).

> 2. DPHY change
> https://www.spinics.net/lists/devicetree/msg381691.html

This was originally from Marek submitted on Oct 3 2020: [PATCH] phy:
exynos-mipi-video: Add support for NXP i.MX8MM

This one seems to have been acked but never got picked up for some reason.

Marek, can you add the tags and re-submit?

> 3. Bus format fix
> https://github.com/openedev/linux/commit/6ca9781ed53ea75e26341dd57250e63794638b20
>

Jagan, can you submit this?

Best regards,

Tim


> Complete repo:
> https://github.com/openedev/linux/commits/111021-imx8mm-dsim
>
> Any inputs?
> Jagan.
>
> Jagan Teki (3):
>   arm64: dts: imx8mm: Add eLCDIF node support
>   arm64: dts: imx8mm: Add MIPI DSI pipeline
>   arm64: dts: imx8mm-icore: Enable LVDS panel for EDIMM2.2
>
>  .../freescale/imx8mm-icore-mx8mm-edimm2.2.dts | 85 +++
>  arch/arm64/boot/dts/freescale/imx8mm.dtsi | 74 
>  2 files changed, 159 insertions(+)
>
> --
> 2.25.1
>


Re: [PATCH v7 20/20] drm/mediatek: add mediatek-drm of vdosys1 support for MT8195

2021-11-11 Thread Chun-Kuang Hu
Hi, Nancy:

Nancy.Lin wrote on Fri, Oct 29, 2021 at 3:52 PM:
>
> Add driver data of mt8195 vdosys1 to mediatek-drm and the sub driver.
>
> Signed-off-by: Nancy.Lin 
> ---
>  drivers/gpu/drm/mediatek/mtk_disp_merge.c   |  4 ++
>  drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 13 ++---
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 30 +--
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 +
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c  | 56 -
>  5 files changed, 78 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_merge.c 
> b/drivers/gpu/drm/mediatek/mtk_disp_merge.c
> index dff2797a2f68..d64846c38fe1 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_merge.c
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_merge.c
> @@ -8,6 +8,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>
>  #include "mtk_drm_ddp_comp.h"
> @@ -79,6 +80,9 @@ void mtk_merge_stop(struct device *dev)
> struct mtk_disp_merge *priv = dev_get_drvdata(dev);
>
> mtk_merge_stop_cmdq(dev, NULL);
> +
> +   if (priv->async_clk)
> +   device_reset_optional(dev);

Separate this into a merge patch.

>  }
>
>  void mtk_merge_start_cmdq(struct device *dev, struct cmdq_pkt *cmdq_pkt)
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> index 25580106a2c4..d41bd8201371 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> @@ -876,15 +876,10 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev,
> node = priv->comp_node[comp_id];
> comp = >ddp_comp[comp_id];
>
> -   if (!node) {
> -   dev_info(dev,
> -"Not creating crtc %d because component %d 
> is disabled or missing\n",
> -crtc_i, comp_id);
> -   return 0;
> -   }
> -
> -   if (!comp->dev) {
> -   dev_err(dev, "Component %pOF not initialized\n", 
> node);
> +   if (!node && !comp->dev) {
> +   dev_err(dev,
> +   "Not creating crtc %d because component %d is 
> disabled, missing or not initialized\n",
> +   crtc_i, comp_id);

Why do this? If this is necessary, separate it into an independent patch.

> return -ENODEV;
> }
> }
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> index eb9835102d79..279087ae889b 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> @@ -385,6 +385,18 @@ static const struct mtk_ddp_comp_funcs ddp_ufoe = {
> .start = mtk_ufoe_start,
>  };
>
> +static const struct mtk_ddp_comp_funcs ddp_ovl_adaptor = {
> +   .clk_enable = mtk_ovl_adaptor_clk_enable,
> +   .clk_disable = mtk_ovl_adaptor_clk_disable,
> +   .config = mtk_ovl_adaptor_config,
> +   .start = mtk_ovl_adaptor_start,
> +   .stop = mtk_ovl_adaptor_stop,
> +   .layer_nr = mtk_ovl_adaptor_layer_nr,
> +   .layer_config = mtk_ovl_adaptor_layer_config,
> +   .enable_vblank = mtk_ovl_adaptor_enable_vblank,
> +   .disable_vblank = mtk_ovl_adaptor_disable_vblank,
> +};

Separate this into an ovl_adaptor patch.

> +
>  static const char * const mtk_ddp_comp_stem[MTK_DDP_COMP_TYPE_MAX] = {
> [MTK_DISP_AAL] = "aal",
> [MTK_DISP_BLS] = "bls",
> @@ -398,6 +410,7 @@ static const char * const 
> mtk_ddp_comp_stem[MTK_DDP_COMP_TYPE_MAX] = {
> [MTK_DISP_OD] = "od",
> [MTK_DISP_OVL] = "ovl",
> [MTK_DISP_OVL_2L] = "ovl-2l",
> +   [MTK_DISP_OVL_ADAPTOR] = "ovl_adaptor",
> [MTK_DISP_POSTMASK] = "postmask",
> [MTK_DISP_PWM] = "pwm",
> [MTK_DISP_RDMA] = "rdma",
> @@ -443,6 +456,7 @@ static const struct mtk_ddp_comp_match 
> mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
> [DDP_COMPONENT_OVL_2L0] = { MTK_DISP_OVL_2L,0, _ovl },
> [DDP_COMPONENT_OVL_2L1] = { MTK_DISP_OVL_2L,1, _ovl },
> [DDP_COMPONENT_OVL_2L2] = { MTK_DISP_OVL_2L,2, _ovl },
> +   [DDP_COMPONENT_OVL_ADAPTOR] = { MTK_DISP_OVL_ADAPTOR,   0, 
> _ovl_adaptor },
> [DDP_COMPONENT_POSTMASK0]   = { MTK_DISP_POSTMASK,  0, 
> _postmask },
> [DDP_COMPONENT_PWM0]= { MTK_DISP_PWM,   0, NULL },
> [DDP_COMPONENT_PWM1]= { MTK_DISP_PWM,   1, NULL },
> @@ -548,12 +562,17 @@ int mtk_ddp_comp_init(struct device_node *node, struct 
> mtk_ddp_comp *comp,
>
> comp->id = comp_id;
> comp->funcs = mtk_ddp_matches[comp_id].funcs;
> -   comp_pdev = of_find_device_by_node(node);
> -   if (!comp_pdev) {
> -   DRM_INFO("Waiting for device %s\n", node->full_name);
> -   return -EPROBE_DEFER;
> +   /* 

[PATCH] drm/msm: Demote debug message

2021-11-11 Thread Rob Clark
From: Rob Clark 

Mesa attempts to allocate a cached-coherent buffer in order to determine
whether cached-coherent is supported, which results in this error message
once per process with newer Mesa.  There is no reason for this to be more
than a debug message.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 6b03e00cc5f2..27c3ece4d146 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1121,7 +1121,7 @@ static int msm_gem_new_impl(struct drm_device *dev,
break;
fallthrough;
default:
-   DRM_DEV_ERROR(dev->dev, "invalid cache flag: %x\n",
+   DRM_DEV_DEBUG(dev->dev, "invalid cache flag: %x\n",
(flags & MSM_BO_CACHE_MASK));
return -EINVAL;
}
-- 
2.31.1



[PATCH] drm/msm: Make a6xx_gpu_set_freq() static

2021-11-11 Thread Rob Clark
From: Rob Clark 

Reported-by: kernel test robot 
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 8a2af3a27e33..dcde5eff931d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1641,7 +1641,7 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
return (unsigned long)busy_time;
 }
 
-void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
+static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-- 
2.31.1



[PATCH v10 10/10] drm: use DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS in 3 places

2021-11-11 Thread Jim Cromie
add sysfs knobs to enable modules' pr_debug()s ---> tracefs

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/amd/display/dc/core/dc_debug.c |  8 
 drivers/gpu/drm/drm_print.c| 13 ++---
 drivers/gpu/drm/i915/intel_gvt.c   | 15 ---
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
index e49a755c6a69..58c56c1708e7 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
@@ -80,6 +80,14 @@ DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug_dc, __debug_dc,
DC_DYNDBG_BITMAP_DESC(debug_dc),
amdgpu_bitmap);
 
+#if defined(CONFIG_TRACING)
+
+unsigned long __trace_dc;
+EXPORT_SYMBOL(__trace_dc);
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(trace_dc, __trace_dc,
+   DC_DYNDBG_BITMAP_DESC(trace_dc),
+   amdgpu_bitmap);
+#endif
 #endif
 
 #define DC_LOGGER_INIT(logger)
diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index d5e0ffad467b..ee20e9c14ce9 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -72,9 +72,16 @@ static struct dyndbg_bitdesc drm_dyndbg_bitmap[] = {
[8] = { DRM_DBG_CAT_DP },
[9] = { DRM_DBG_CAT_DRMRES }
 };
-DEFINE_DYNAMIC_DEBUG_BITGRPS(debug, __drm_debug, DRM_DEBUG_DESC,
-drm_dyndbg_bitmap);
-
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug, __drm_debug, DRM_DEBUG_DESC,
+   drm_dyndbg_bitmap);
+
+#ifdef CONFIG_TRACING
+struct trace_array *trace_arr;
+unsigned long __drm_trace;
+EXPORT_SYMBOL(__drm_trace);
+DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS(trace, __drm_trace, DRM_DEBUG_DESC,
+ drm_dyndbg_bitmap);
+#endif
 #endif
 
 void __drm_puts_coredump(struct drm_printer *p, const char *str)
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index efaac5777873..84348d4aedf6 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -195,8 +195,17 @@ static struct dyndbg_bitdesc i915_dyndbg_bitmap[] = {
help_(7, "gvt:render:") \
help_(8, "gvt:sched:")
 
-DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
-I915_GVT_CATEGORIES(debug_gvt),
-i915_dyndbg_bitmap);
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug_gvt, __gvt_debug,
+   I915_GVT_CATEGORIES(debug_gvt),
+   i915_dyndbg_bitmap);
 
+#if defined(CONFIG_TRACING)
+
+unsigned long __gvt_trace;
+EXPORT_SYMBOL(__gvt_trace);
+DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS(trace_gvt, __gvt_trace,
+ I915_GVT_CATEGORIES(trace_gvt),
+ i915_dyndbg_bitmap);
+
+#endif
 #endif
-- 
2.31.1



[PATCH v10 09/10] dyndbg: create DEFINE_DYNAMIC_DEBUG_LOG|TRACE_GROUPS

2021-11-11 Thread Jim Cromie
With the recent addition of pr_debug to tracefs via +T flag, we now
want to add drm.trace; like its model: drm.debug, it maps bits to
pr_debug categories, but this one enables/disables writing to tracefs
(iff CONFIG_TRACING).

Do this by:

1. add flags to dyndbg_bitmap_param, holds "p" or "T" to work for either.

   add DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS to init .flags
   DEFINE_DYNAMIC_DEBUG_BITGRPS gets "p" for compat.
   use it from...

2. DEFINE_DYNAMIC_DEBUG_LOG_GROUPS as (1) with "p" flags - print to syslog
   DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS as (1) with "T" flags - trace to tracefs
   add kdoc to these

NOTES

The flags arg in (1) is a string, not just a 'p' or 'T'.  This allows
use of decorator flags ("mflt") too, in case the module author wants
to ensure those decorations are in the trace & log.

The LOG|TRACE (2) macros don't use any decorator flags, (and therefore
don't toggle them), allowing users to control those themselves.

Decorator flags are shared by both LOG and TRACE consumers, so
coordination between users is expected.  ATM, there's no declarative
way to preset decorator flags, but DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS
can be used to explicitly toggle them.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 44 ++-
 lib/dynamic_debug.c   |  4 ++--
 2 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 792bcff0297e..918ac1a92358 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -255,30 +255,52 @@ struct dyndbg_bitdesc {
 
 struct dyndbg_bitmap_param {
unsigned long *bits;/* ref to shared state */
+   const char *flags;
unsigned int maplen;
struct dyndbg_bitdesc *map; /* indexed by bitpos */
 };
 
 #if defined(CONFIG_DYNAMIC_DEBUG) || \
(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
+
+#define DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, _flags, desc, data) \
+   MODULE_PARM_DESC(fsname, desc); \
+   static struct dyndbg_bitmap_param ddcats_##_var =   \
+   { .bits = &(_var), .flags = (_flags),   \
+ .map = data, .maplen = ARRAY_SIZE(data) };\
+   module_param_cb(fsname, &param_ops_dyndbg, &ddcats_##_var, 0644)
+
+#define DEFINE_DYNAMIC_DEBUG_BITGRPS(fsname, _var, desc, data) \
+   DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, "p", desc, data)
+
 /**
- * DEFINE_DYNAMIC_DEBUG_BITGRPS() - bitmap control of pr_debugs, by format match
+ * DEFINE_DYNAMIC_DEBUG_LOG_GROUPS() - bitmap control of grouped pr_debugs --> syslog
+ *
  * @fsname: parameter basename under /sys
  * @_var:   C-identifier holding bitmap
  * @desc:   string summarizing the controls provided
  * @bitmap: C array of struct dyndbg_bitdescs
  *
- * Intended for modules with a systematic use of pr_debug prefixes in
- * the format strings, this allows modules calling pr_debugs to
- * control them in groups by matching against their formats, and map
- * them to bits 0-N of a sysfs control point.
+ * Intended for modules having pr_debugs with prefixed/categorized
+ * formats; this lets you group them by substring match, map groups to
+ * bits, and enable per group to write to syslog, via @fsname.
  */
-#define DEFINE_DYNAMIC_DEBUG_BITGRPS(fsname, _var, desc, data) \
-   MODULE_PARM_DESC(fsname, desc); \
-   static struct dyndbg_bitmap_param ddcats_##_var =   \
-   { .bits = &(_var), .map = data, \
- .maplen = ARRAY_SIZE(data) }; \
-   module_param_cb(fsname, &param_ops_dyndbg, &ddcats_##_var, 0644)
+#define DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(fsname, _var, desc, data)  \
+   DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, "p", desc, data)
+
+/**
+ * DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS() - bitmap control of pr_debugs --> tracefs
+ * @fsname: parameter basename under /sys
+ * @_var:   C-identifier holding bitmap
+ * @desc:   string summarizing the controls provided
+ * @bitmap: C array of struct dyndbg_bitdescs
+ *
+ * Intended for modules having pr_debugs with prefixed/categorized
+ * formats; this lets you group them by substring match, map groups to
+ * bits, and enable per group to write to tracebuf, via @fsname.
+ */
+#define DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS(fsname, _var, desc, data)\
+   DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, "T", desc, data)
 
 extern const struct kernel_param_ops param_ops_dyndbg;
 
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index d493ed6658b9..f5ba07668020 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -634,8 +634,8 @@ int param_set_dyndbg(const char *instr, const struct kernel_param *kp)
for (i = 0; i < p->maplen && i < BITS_PER_LONG; map++, i++) {
if (test_bit(i, ) == test_bit(i, 
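
The layering in this patch (one `_FLAGS` macro, with LOG and TRACE variants that only differ in the flags string) can be sketched in plain userspace C. Everything below is illustrative: the struct, macro and variable names are simplified stand-ins, and the module-parameter plumbing is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct dyndbg_bitmap_param. */
struct bitmap_param {
	unsigned long *bits;		/* ref to shared state */
	const char *flags;		/* "p" for syslog, "T" for tracefs */
	unsigned int maplen;
	const char *const *map;		/* indexed by bit position */
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Base macro: every variant funnels through here with its own flags. */
#define DEFINE_BITGRPS_FLAGS(_name, _var, _flags, _data)		\
	struct bitmap_param _name = {					\
		.bits = &(_var), .flags = (_flags),			\
		.map = (_data), .maplen = ARRAY_LEN(_data) }

#define DEFINE_LOG_GROUPS(_name, _var, _data)				\
	DEFINE_BITGRPS_FLAGS(_name, _var, "p", _data)

#define DEFINE_TRACE_GROUPS(_name, _var, _data)				\
	DEFINE_BITGRPS_FLAGS(_name, _var, "T", _data)

static const char *const demo_map[] = { "demo:core:", "demo:kms:" };
unsigned long demo_debug_bits;
unsigned long demo_trace_bits;

/* Same category map, two independent bitmaps, two destinations. */
DEFINE_LOG_GROUPS(demo_log, demo_debug_bits, demo_map);
DEFINE_TRACE_GROUPS(demo_trace, demo_trace_bits, demo_map);

/* Returns 1 if this control point steers output to the trace buffer. */
int param_targets_trace(const struct bitmap_param *p)
{
	assert(p != NULL && p->flags != NULL);
	return strchr(p->flags, 'T') != NULL;
}
```

Keeping the flags as a string rather than a single character is what leaves room for decorator flags in the `_FLAGS` form.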

[PATCH v10 07/10] drm_print: instrument drm_debug_enabled

2021-11-11 Thread Jim Cromie
Duplicate drm_debug_enabled() code into both "basic" and "dyndbg"
ifdef branches.  Then add a pr_debug("todo: ...") into the "dyndbg"
branch.

Then convert the "dyndbg" branch's code to a macro, so that the
pr_debug() gets its callsite info from the invoking function, instead
of from drm_debug_enabled() itself.

This gives us unique callsite info for the 8 remaining users of
drm_debug_enabled(), and lets us enable them individually to see how
much logging traffic they generate.  The oft-visited callsites can
then be reviewed for runtime cost and possible optimizations.

Here's what we get:

bash-5.1# modprobe drm
dyndbg: 384 debug prints in module drm
bash-5.1# grep todo: /proc/dynamic_debug/control
drivers/gpu/drm/drm_edid.c:1843 [drm]connector_bad_edid =_ "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_print.c:309 [drm]___drm_dbg =p "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_print.c:286 [drm]__drm_dev_dbg =p "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:1491 [drm]drm_vblank_restore =_ "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:787 [drm]drm_crtc_vblank_helper_get_vblank_timestamp_internal =_ "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:410 [drm]drm_crtc_accurate_vblank_count =_ "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_atomic_uapi.c:1457 [drm]drm_mode_atomic_ioctl =_ "todo: maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_edid_load.c:178 [drm]edid_load =_ "todo: maybe avoid via dyndbg\012"

At quick glance, edid won't qualify, drm_print might, drm_vblank is
strongest chance, maybe atomic-ioctl too.

Signed-off-by: Jim Cromie 
---
 include/drm/drm_print.h | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 392cff7cb95c..a902bd4d8c55 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -381,6 +381,11 @@ enum drm_debug_category {
 #define DRM_DBG_CAT_DP DRM_UT_DP
 #define DRM_DBG_CAT_DRMRES DRM_UT_DRMRES
 
+static inline bool drm_debug_enabled(enum drm_debug_category category)
+{
+   return unlikely(__drm_debug & category);
+}
+
 #else /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
 
 /* join prefix + format in cpp so dyndbg can see it */
@@ -414,12 +419,13 @@ enum drm_debug_category {
 #define DRM_DBG_CAT_DP "drm:dp: "
 #define DRM_DBG_CAT_DRMRES "drm:res: "
 
-#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
+#define drm_debug_enabled(category)\
+   ({  \
+   pr_debug("todo: maybe avoid via dyndbg\n"); \
+   unlikely(__drm_debug & (category)); \
+   })
 
-static inline bool drm_debug_enabled(enum drm_debug_category category)
-{
-   return unlikely(__drm_debug & category);
-}
+#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
 
 /*
  * struct device based logging
@@ -582,7 +588,6 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
 #define drm_dbg_drmres(drm, fmt, ...)  \
drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_DBG_CAT_DRMRES, fmt, ##__VA_ARGS__)
 
-
 /*
  * printk based logging
  *
-- 
2.31.1
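
The reason for turning the helper into a macro is that location information is captured where the text is expanded. A small userspace sketch of that effect, using `__LINE__` in place of the dyndbg callsite descriptor; the names are invented for the example and nothing here is DRM code.

```c
#include <assert.h>

/* Line number recorded by the most recent check, as a stand-in for
 * the per-callsite record that pr_debug() creates. */
int last_line;

/* Function form: the location recorded is always inside this helper,
 * no matter who calls it. */
int check_enabled_fn(unsigned int mask, unsigned int category)
{
	assert(category != 0);	/* a zero category could never match */
	last_line = __LINE__;
	return (mask & category) != 0;
}

/* Macro form: __LINE__ expands at the invoking site, so each user
 * gets its own location. */
#define check_enabled_macro(mask, category)				\
	(last_line = __LINE__, ((mask) & (category)) != 0)
```

With the function form all callers share one recorded location; with the macro form each caller is distinguishable, which is what allows the remaining users to be enabled and measured individually.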



[PATCH v10 08/10] dyndbg: add print-to-tracefs, selftest with it - RFC

2021-11-11 Thread Jim Cromie
Sean Paul proposed, in:
https://patchwork.freedesktop.org/series/78133/
drm/trace: Mirror DRM debug logs to tracefs

His patchset's objective is to be able to independently steer some of
the drm.debug stream to an alternate tracing destination, by splitting
drm_debug_enabled() into syslog & trace flavors, and enabling them
separately.  2 advantages were identified:

1- syslog is heavyweight, tracefs is much lighter
2- separate selection of enabled categories means less traffic

Dynamic-Debug can do 2nd exceedingly well:

A- all work is behind jump-label's NOOP, zero off cost.
B- exact site selectivity, precisely the useful traffic.
   can tailor enabled set interactively, at shell.

Since the tracefs interface is effective for drm (the threads suggest
so), adding that interface to dynamic-debug has real potential for
everyone including drm.

if CONFIG_TRACING:

Grab Sean's trace_init/cleanup code, use it to provide tracefs
available by default to all pr_debugs.  This will likely need some
further per-module treatment; perhaps something reflecting hierarchy
of module,file,function,line, maybe with a tuned flattening.

endif CONFIG_TRACING

Add a new +T flag to enable tracing, independent of +p, and add and
use 3 macros: dyndbg_site_is_enabled/logging/tracing(), to encapsulate
the flag checks.  Existing code treats T like other flags.

Add ddebug_validate_flags() as last step in ddebug_parse_flags().  Its
only job is to fail on +T for non-CONFIG_TRACING builds.  It only sees
the new flags, and cannot validate specific state transitions.  This
is fine, since we have no need for that; such a test would have to be
done in ddebug_change(), which actually updates the callsites.

ddebug_change() adjusts the static-key-enable/disable condition to use
_DPRINTK_ENABLED / abstraction macros.

dynamic_emit_prefix() now gates on _DPRINTK_ENABLED too, as an
optimization but mostly to allow decluttering of its users.

__dynamic_pr_debug() et al. get minor changes:

 - call dynamic_emit_prefix() 1st, _enabled() optimizes.
 - if (T) call trace_array_printk
 - if (!p) go around original printk code.
   done to minimize diff,
   goto-ectomy + reindent later/separately
 - share vaf across p|T

WRT _dev, I skipped all the  specific dev_emit_prefix
additions for now.  tracefs is a fast customer with different needs;
it's not clear that pretty device-ID-ish strings are useful tracefs
content (on ingest), or that this couldn't be done more efficiently while
analysing or postprocessing the tracefs buffer.

SELFTEST: test_dynamic_debug.ko:

Uses the tracer facility to implement a kernel module selftest.

TODO:

Earlier core code had (tracerfn)() indirection, allowing a plugin
side-effector we could test the results of.

ATM all the tests which count +T'd callsite executions (and which
expect >0) are failing.

Now it needs a rethink to test from userspace, rather than the current
test-once at module-load.  It needs a parameters/testme button.

So remainder of this is a bit stale 

- A custom tracer counts the number of calls (of T-enabled pr_debugs),
- do_debugging(x) calls a set of categorized pr_debugs x times

- test registers the tracer on the module
  then iteratively:
  manipulates dyndbg states via query-cmds, mostly format ^prefix
  runs do_debugging()
  counts enabled callsite executions
  reports mismatches

- modprobe test_dynamic_debug use_bad_tracer=1
  attaches a bad/recursive tracer
  Bad Things (did) Happen.
  has thrown me interesting panics.
  cannot replicate atm.

RFC: (DONE)

The "tracer" interface probably needs work and a new name.  It is only
1/2 way towards a real tracefs interface; and the code I lifted from
Sean Paul in the next patch could be implemented in dynamic_debug.c
instead, and made available for all pr_debug users.

This would also eliminate need for dynamic_debug_(un)register_tracer(),
since dyndbg could just provide it when TRACING is on.

NOTES:

$> modprobe test_dynamic_debug dyndbg=+p

   it fails 3/29 tests. haven't looked at why.

$> modprobe test_dynamic_debug use_bad_tracer=1

Earlier in dev, bad_tracer() exploded in recursion; I haven't been able
to replicate that lately.

Signed-off-by: Jim Cromie 
---
 .../admin-guide/dynamic-debug-howto.rst   |   7 +-
 MAINTAINERS   |   1 +
 include/linux/dynamic_debug.h |  12 +-
 lib/Kconfig.debug |  11 +
 lib/Makefile  |   1 +
 lib/dynamic_debug.c   | 127 --
 lib/test_dynamic_debug.c  | 222 ++
 7 files changed, 355 insertions(+), 26 deletions(-)
 create mode 100644 lib/test_dynamic_debug.c

diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst b/Documentation/admin-guide/dynamic-debug-howto.rst
index a89cfa083155..bf2a561cc9bc 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -227,7 +227,8 @@ of the 
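
As a reading aid for the flag handling described in the commit message, here is a userspace sketch of the three predicates and the validation step. The bit positions, macro names and function names are hypothetical; only the behaviour (p and T are independent, and +T is refused when tracing is not built in) follows the description.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bit positions; the real values live in dynamic_debug.h. */
#define FLAG_PRINT	(1u << 0)	/* +p: write to syslog */
#define FLAG_TRACE	(1u << 5)	/* +T: write to the trace buffer */
#define FLAGS_ENABLED	(FLAG_PRINT | FLAG_TRACE)

static_assert((FLAG_PRINT & FLAG_TRACE) == 0, "flags must be distinct bits");

/* A site is live if either destination is selected. */
int site_is_enabled(unsigned int flags)
{
	return (flags & FLAGS_ENABLED) != 0;
}

int site_is_logging(unsigned int flags)
{
	return (flags & FLAG_PRINT) != 0;
}

int site_is_tracing(unsigned int flags)
{
	return (flags & FLAG_TRACE) != 0;
}

/*
 * Last step of flag parsing: it sees only the requested flags, so all it
 * can do is reject +T when tracing support is not compiled in.
 */
int validate_flags(unsigned int new_flags, int tracing_built_in)
{
	if ((new_flags & FLAG_TRACE) && !tracing_built_in)
		return -EINVAL;
	return 0;
}
```

Gating the static key on the combined mask is what lets a site that is traced but not printed still take the slow path.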

[PATCH v10 06/10] drm_print: add choice to use dynamic debug in drm-debug

2021-11-11 Thread Jim Cromie
drm's debug system writes 10 distinct categories of messages to syslog
using a small API[1]: drm_dbg*(10 names), DRM_DEV_DEBUG*(3 names),
DRM_DEBUG*(8 names).  There are thousands of these callsites, each
categorized in this systematized way.

These callsites can be enabled at runtime by their category, each
controlled by a bit in drm.debug (/sys/module/drm/parameters/debug).
In the current "basic" implementation, drm_debug_enabled() tests these
bits in __drm_debug each time an API[1] call is executed; while cheap
individually, the costs accumulate with uptime.

This patch uses dynamic-debug with (required) jump-label to patch
enabled callsites onto their respective NOOP slots, avoiding all
runtime bit-checks of __drm_debug by drm_debug_enabled().

Dynamic debug has no concept of category, but we can emulate one by
replacing enum categories with a set of prefix-strings ("drm:core:",
"drm:kms:", "drm:driver:", etc.) and prepending them (at compile time)
to the given formats.

Then we can use:

   # echo module drm format "^drm:core: " +p > control

to enable the whole category with one query.

This conversion yields many new prdbg callsites:

  dyndbg: 207 debug prints in module drm_kms_helper
  dyndbg: 376 debug prints in module drm
  dyndbg: 1811 debug prints in module i915
  dyndbg: 3917 debug prints in module amdgpu

Each site costs 56 bytes of .data, which is a big increase for
drm modules, so CONFIG_DRM_USE_DYNAMIC_DEBUG makes it optional.

CONFIG_JUMP_LABEL is also required, to get the promised optimizations.

The "basic" -> "dyndbg" switchover is layered into the macro scheme

A. A "prefix" version of DRM_UT_ map, named DRM_DBG_CAT_

"basic":  DRM_DBG_CAT_  <===  DRM_UT_.  Identity map.
"dyndbg":
   #define DRM_DBG_CAT_KMS"drm:kms: "
   #define DRM_DBG_CAT_PRIME  "drm:prime: "
   #define DRM_DBG_CAT_ATOMIC "drm:atomic: "

DRM_UT_* are preserved, since they're used elsewhere.  Since the
callback maintains its state in __drm_debug, drm_debug_enabled() will
stay synchronized, and continue to work.  We can address them
separately if they are called enough to be worth fixing.

B. drm_dev_dbg() & drm_debug() are interposed with macros

basic:    forward to renamed fn, with args preserved
enabled:  redirect to pr_debug, dev_dbg, with CATEGORY format catenated

This is where drm_debug_enabled() is avoided.  The prefix is prepended
at compile-time, no category at runtime.

C. API[1] uses DRM_DBG_CAT_s

The API already uses B, now it uses A too, instead of DRM_UT_, to
get the correct token type for "basic" and "dyndbg" configs.

D. use DEFINE_DYNAMIC_DEBUG_CATEGORIES()

This defines the map using DRM_DBG_CAT_s, and creates the sysfs
bitmap to control those categories.

CONFIG_DRM_USE_DYNAMIC_DEBUG is also used to adjust amdgpu, i915
makefiles to add -DDYNAMIC_DEBUG_MODULE; it includes the current
CONFIG_DYNAMIC_DEBUG_CORE and is enabled by the user.

LIMITATIONS:

dev_dbg() et al. effectively prepend twice, category then driver-name,
yielding format strings like so:

bash-5.1# grep amdgpu: /proc/dynamic_debug/control | grep drm: | cut -d= -f2-
_ "amdgpu: drm:core: fence driver on ring %s use gpu addr 0x%016llx\012"
_ "amdgpu: drm:kms: Cannot create framebuffer from imported dma_buf\012"

This means we cannot use anchored "^drm:kms: " to specify the
category, a small loss of precision.

Note that searching on "format ^amdgpu: " works, but this is less
valuable, because the same can be done with "module amdgpu".

NOTES:

Because the dyndbg callback is keeping state in __drm_debug, it
synchronizes with drm_debug_enabled() and its remaining users; the
switchover should be transparent.

Code Review is expected to catch the lack of correspondence between
bit=>prefix definitions (the selector) and the prefixes used in the
API[1] layer above pr_debug().

I've coded the categories with trailing spaces.  This excludes any
sub-categories which might get added later.  This convention protects
any "drm:atomic:fail:" callsites from getting stomped on by `echo 0 >
debug`.  Other categories could differ, but we need some default.

Dyndbg requires that the prefix be in the compiled-in format string;
run-time prefixing evades callsite selection by category.

pr_debug("%s: ...", __func__, ...) // not ideal

Unfortunately __func__ is not a macro, and cannot be catenated at
preprocess/compile time.

If you want that, you might consider +mfl flags instead;)

Signed-off-by: Jim Cromie 
---
v5:
. use DEFINE_DYNAMIC_DEBUG_CATEGORIES in drm_print.c
. s/DRM_DBG_CLASS_/DRM_DBG_CAT_/ - don't need another term
. default=y in Kconfig entry - per @DanVet
. move some commit-log prose to dyndbg commit
. add prototypes for (param_get/set)_dyndbg
. more wrinkles found by 
. relocate ratelimit chunk from elsewhere
v6:
. add kernel doc
. fix cpp paste, drop '#'
v7:
. change __drm_debug to long, to fit with DEFINE_DYNAMIC_DEBUG_CATEGORIES
. add -DDYNAMIC_DEBUG_MODULE to ccflags if DRM_USE_DYNAMIC_DEBUG
v8:
. adapt to altered ^ insertion
. add mem 

[PATCH v10 05/10] i915/gvt: use dyndbg.BITGRPS for existing pr_debugs

2021-11-11 Thread Jim Cromie
The gvt component of this driver has ~120 pr_debugs with formats using
one of 9 fixed string prefixes, which are quite similar to those
enumerated in DRM debug categories.  Following the interface model of
drm.debug, add a parameter to map bits to these format prefixes.

static struct dyndbg_bitdesc i915_bitmap[] = {
[0] = { "gvt:cmd:" },
[1] = { "gvt:core:" },
[2] = { "gvt:dpy:" },
[3] = { "gvt:el:" },
[4] = { "gvt:irq:" },
[5] = { "gvt:mm:" },
[6] = { "gvt:mmio:" },
[7] = { "gvt:render:" },
[8] = { "gvt:sched:" }
};
DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
"dyndbg bitmap desc", i915_bitmap);

If CONFIG_DYNAMIC_DEBUG_CORE=y, then gvt/Makefile adds
-DDYNAMIC_DEBUG_MODULE to cflags, which CONFIG_DYNAMIC_DEBUG=n
(CORE-only) builds need.  This is redone more comprehensively soon.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/i915/Makefile|  2 ++
 drivers/gpu/drm/i915/intel_gvt.c | 38 
 2 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 660bb03de6fc..0fa5f53312a8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -317,6 +317,8 @@ i915-y += intel_gvt.o
 include $(src)/gvt/Makefile
 endif
 
+ccflags-$(CONFIG_DYNAMIC_DEBUG_CORE) += -DDYNAMIC_DEBUG_MODULE
+
 obj-$(CONFIG_DRM_I915) += i915.o
 obj-$(CONFIG_DRM_I915_GVT_KVMGT) += gvt/kvmgt.o
 
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index 4e70c1a9ef2e..efaac5777873 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -162,3 +162,41 @@ void intel_gvt_resume(struct drm_i915_private *dev_priv)
if (intel_gvt_active(dev_priv))
intel_gvt_pm_resume(dev_priv->gvt);
 }
+
+#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
+
+unsigned long __gvt_debug;
+EXPORT_SYMBOL(__gvt_debug);
+
+static struct dyndbg_bitdesc i915_dyndbg_bitmap[] = {
+   [0] = { "gvt:cmd:" },
+   [1] = { "gvt:core:" },
+   [2] = { "gvt:dpy:" },
+   [3] = { "gvt:el:" },
+   [4] = { "gvt:irq:" },
+   [5] = { "gvt:mm:" },
+   [6] = { "gvt:mmio:" },
+   [7] = { "gvt:render:" },
+   [8] = { "gvt:sched:" }
+};
+
+#define help_(_N, _cat)"\t  Bit-" #_N ":\t" _cat "\n"
+
+#define I915_GVT_CATEGORIES(name) \
+   " Enable debug output via /sys/module/i915/parameters/" #name   \
+   ", where each bit enables a debug category.\n"  \
+   help_(0, "gvt:cmd:")\
+   help_(1, "gvt:core:")   \
+   help_(2, "gvt:dpy:")\
+   help_(3, "gvt:el:") \
+   help_(4, "gvt:irq:")\
+   help_(5, "gvt:mm:") \
+   help_(6, "gvt:mmio:")   \
+   help_(7, "gvt:render:") \
+   help_(8, "gvt:sched:")
+
+DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
+I915_GVT_CATEGORIES(debug_gvt),
+i915_dyndbg_bitmap);
+
+#endif
-- 
2.31.1



[PATCH v10 04/10] i915/gvt: trim spaces from pr_debug "gvt: core:" prefixes

2021-11-11 Thread Jim Cromie
Taking embedded spaces out of existing prefixes makes them more easily
searchable; simplifying the extra quoting needed otherwise:

  $> echo format "^gvt: core:" +p >control

Dropping the internal spaces means any trailing space in a query will
more clearly terminate the prefix being searched for.

Consider a generic drm-debug example:

  # turn off ATOMIC reports
  echo format "^drm:atomic: " -p > control

  # turn off all ATOMIC:* reports, including any sub-categories
  echo format "^drm:atomic:" -p > control

  # turn on ATOMIC:FAIL: reports
  echo format "^drm:atomic:fail: " +p > control

Removing embedded spaces in the format prefixes simplifies the
corresponding match string.  This means that "quoted" match-prefixes
are only needed when the trailing space is desired, in order to
exclude explicitly sub-categorized pr-debugs; in this example,
"drm:atomic:fail:".

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/i915/gvt/debug.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/debug.h b/drivers/gpu/drm/i915/gvt/debug.h
index c6027125c1ec..bbecc279e077 100644
--- a/drivers/gpu/drm/i915/gvt/debug.h
+++ b/drivers/gpu/drm/i915/gvt/debug.h
@@ -36,30 +36,30 @@ do {
\
 } while (0)
 
 #define gvt_dbg_core(fmt, args...) \
-   pr_debug("gvt: core: "fmt, ##args)
+   pr_debug("gvt:core: " fmt, ##args)
 
 #define gvt_dbg_irq(fmt, args...) \
-   pr_debug("gvt: irq: "fmt, ##args)
+   pr_debug("gvt:irq: " fmt, ##args)
 
 #define gvt_dbg_mm(fmt, args...) \
-   pr_debug("gvt: mm: "fmt, ##args)
+   pr_debug("gvt:mm: " fmt, ##args)
 
 #define gvt_dbg_mmio(fmt, args...) \
-   pr_debug("gvt: mmio: "fmt, ##args)
+   pr_debug("gvt:mmio: " fmt, ##args)
 
 #define gvt_dbg_dpy(fmt, args...) \
-   pr_debug("gvt: dpy: "fmt, ##args)
+   pr_debug("gvt:dpy: " fmt, ##args)
 
 #define gvt_dbg_el(fmt, args...) \
-   pr_debug("gvt: el: "fmt, ##args)
+   pr_debug("gvt:el: " fmt, ##args)
 
 #define gvt_dbg_sched(fmt, args...) \
-   pr_debug("gvt: sched: "fmt, ##args)
+   pr_debug("gvt:sched: " fmt, ##args)
 
 #define gvt_dbg_render(fmt, args...) \
-   pr_debug("gvt: render: "fmt, ##args)
+   pr_debug("gvt:render: " fmt, ##args)
 
 #define gvt_dbg_cmd(fmt, args...) \
-   pr_debug("gvt: cmd: "fmt, ##args)
+   pr_debug("gvt:cmd: " fmt, ##args)
 
 #endif
-- 
2.31.1



[PATCH v10 03/10] amdgpu: use dyndbg.BITGRPS to control existing pr_debugs

2021-11-11 Thread Jim Cromie
logger_types.h defines many DC_LOG_*() categorized debug wrappers.
Most of these already use DRM debug API, so are controllable using
drm.debug, but others use a bare pr_debug("$prefix: .."), with 1 of 13
different class-prefixes matching [:uppercase:].

Use DEFINE_DYNAMIC_DEBUG_BITGRPS to create a sysfs location which maps
from bits to these 13 sets of categorized pr_debugs to en/disable.

Makefile adds -DDYNAMIC_DEBUG_MODULE for CONFIG_DYNAMIC_DEBUG_CORE,
otherwise BUILD_BUG_ON triggers (obvious errors on subtle misuse are
better than mysterious ones).

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |  2 +
 .../gpu/drm/amd/display/dc/core/dc_debug.c| 47 ++-
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 653726588956..077342ca803f 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -38,6 +38,8 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
-I$(FULL_AMD_DISPLAY_PATH)/amdgpu_dm \
-I$(FULL_AMD_PATH)/amdkfd
 
+ccflags-$(CONFIG_DYNAMIC_DEBUG_CORE) += -DDYNAMIC_DEBUG_MODULE
+
 amdgpu-y := amdgpu_drv.o
 
 # add KMS driver
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
index 21be2a684393..e49a755c6a69 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
@@ -36,8 +36,53 @@
 
 #include "resource.h"
 
-#define DC_LOGGER_INIT(logger)
+#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
+#include 
+
+unsigned long __debug_dc;
+EXPORT_SYMBOL(__debug_dc);
+
+#define help_(_N, _cat)"\t  Bit-" #_N "\t" _cat "\n"
+
+#define DC_DYNDBG_BITMAP_DESC(name)\
+   "Control pr_debugs via /sys/module/amdgpu/parameters/" #name\
+   ", where each bit controls a debug category.\n" \
+   help_(0, "[SURFACE]:")  \
+   help_(1, "[CURSOR]:")   \
+   help_(2, "[PFLIP]:")\
+   help_(3, "[VBLANK]:")   \
+   help_(4, "[HW_LINK_TRAINING]:") \
+   help_(5, "[HW_AUDIO]:") \
+   help_(6, "[SCALER]:")   \
+   help_(7, "[BIOS]:") \
+   help_(8, "[BANDWIDTH_CALCS]:")  \
+   help_(9, "[DML]:")  \
+   help_(10, "[IF_TRACE]:")\
+   help_(11, "[GAMMA]:")   \
+   help_(12, "[SMU_MSG]:")
+
+static struct dyndbg_bitdesc amdgpu_bitmap[] = {
+   [0] = { "[CURSOR]:" },
+   [1] = { "[PFLIP]:" },
+   [2] = { "[VBLANK]:" },
+   [3] = { "[HW_LINK_TRAINING]:" },
+   [4] = { "[HW_AUDIO]:" },
+   [5] = { "[SCALER]:" },
+   [6] = { "[BIOS]:" },
+   [7] = { "[BANDWIDTH_CALCS]:" },
+   [8] = { "[DML]:" },
+   [9] = { "[IF_TRACE]:" },
+   [10] = { "[GAMMA]:" },
+   [11] = { "[SMU_MSG]:" }
+};
+
+DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_dc, __debug_dc,
+   DC_DYNDBG_BITMAP_DESC(debug_dc),
+   amdgpu_bitmap);
+
+#endif
 
+#define DC_LOGGER_INIT(logger)
 
 #define SURFACE_TRACE(...) do {\
if (dc->debug.surface_trace) \
-- 
2.31.1



[PATCH v10 02/10] drm: fix doc grammar

2021-11-11 Thread Jim Cromie
allocates and initializes ...

Signed-off-by: Jim Cromie 
---
 include/drm/drm_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 0cd95953cdf5..4b29261c4537 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -486,7 +486,7 @@ void *__devm_drm_dev_alloc(struct device *parent,
  * @type: the type of the struct which contains struct &drm_device
  * @member: the name of the &drm_device within @type.
  *
- * This allocates and initialize a new DRM device. No device registration is done.
+ * This allocates and initializes a new DRM device. No device registration is done.
  * Call drm_dev_register() to advertice the device to user space and register it
  * with other core subsystems. This should be done last in the device
  * initialization sequence to make sure userspace can't access an inconsistent
-- 
2.31.1



[PATCH v10 01/10] dyndbg: add DEFINE_DYNAMIC_DEBUG_BITGRPS macro and callbacks

2021-11-11 Thread Jim Cromie
DEFINE_DYNAMIC_DEBUG_BITGRPS(fsname, var, bitmap_desc, bitmap)
allows users to create a drm.debug style (bitmap) sysfs interface,
mapping each bit to a group of pr_debugs, matching on their formats.

This works well when the formats systematically include a prefix
string such as ERR|WARN|INFO, etc.

Such groups can (already) be manipulated like so:

echo "format $prefix +p" >control

This macro merely makes it easier to operate on them as groups:

/* standard usage */
static struct dyndbg_bitdesc my_bitmap[] = {
[0] = { "gvt:cmd:" },
[1] = { "gvt:core:" },
[2] = { "gvt:dpy:" },
[3] = { "gvt:el:" },
[4] = { "gvt:irq:" },
[5] = { "gvt:mm:" },
[6] = { "gvt:mmio:" },
[7] = { "gvt:render:" },
[8] = { "gvt:sched:" }
};
DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
 "i915/gvt bitmap desc", my_bitmap);

In addition to the macro, this patch adds:

 - int param_set_dyndbg()
 - int param_get_dyndbg()
 - struct kernel_param_ops param_ops_dyndbg

Following the model of kernel/params.c STANDARD_PARAM_DEFS, these are
non-static and exported.

get/set use an augmented kernel_param; the arg refs a new struct
dyndbg_bitmap_param containing:

A- the map of "categories", an array of struct dyndbg_bitdescs,
   indexed by bitpos, defining the match against pr_debug formats.

B- a pointer to the user module's ulong holding the bits/state.
   By sharing state, we coordinate with code that still uses it
   directly.  This allows drm-debug api to be converted incrementally,
   while still using __drm_debug & drm_debug_enabled() in other parts.

param_set_dyndbg() compares new vs old bits, and only updates prdbgs
on changes.  This maximally preserves the underlying state, which may
have been customized via later `echo $cmd >control`.  So if a user
really wants to know that all prdbgs are set precisely, they must
pre-clear then set.
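
The compare-then-update step can be pictured roughly as follows.  This
is a standalone userspace sketch, not the callback in this patch:
struct bitdesc, diff_bits() and the exact query text are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-bit descriptor in this patch. */
struct bitdesc {
	const char *prefix;
};

/*
 * Compare the old and new bitmaps and build one query per changed bit.
 * Unchanged bits produce no query, so per-callsite tweaks made through
 * the control file are left alone.
 * Returns the number of queries written to out[].
 */
static int diff_bits(unsigned long old_bits, unsigned long new_bits,
		     const struct bitdesc *map, int nbits,
		     char out[][64])
{
	unsigned long changed = old_bits ^ new_bits;
	int i, n = 0;

	assert(nbits >= 0 && nbits <= (int)(8 * sizeof(unsigned long)));

	for (i = 0; i < nbits; i++) {
		if (!(changed & (1UL << i)))
			continue;
		/* '+' enables the group, '-' disables it */
		snprintf(out[n++], 64, "format ^%s %cp", map[i].prefix,
			 (new_bits & (1UL << i)) ? '+' : '-');
	}
	return n;
}
```

Writing the same value twice yields no queries in this model, which is
why a user who wants a known state has to clear the bits first and then
set them.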

dynamic_debug.h:

Add DEFINE_DYNAMIC_DEBUG_BITGRPS() described above, and a stub
throwing a BUILD_BUG (RFC) when used without DYNAMIC_DEBUG support.

Add structs dyndbg_bitdesc, dyndbg_bitmap_param to support the main
macro, and several helper macros wrapping the given categories with
^prefix and ' ' suffix.  This way the callback can be more broadly
used, by using the right helper macro.

Also externs the struct kernel_param_ops param_ops_dyndbg symbol, as is
done in moduleparam.h for all the STANDARD params.

USAGE NOTES:

Using dyndbg to query on "format $str" requires that $str must be
present in the compiled-in format string.  Searching on "%s" does not
define a useful set of callsites.

Using DEFINE_DYNAMIC_DEBUG_BITGRPS without dyndbg support gets a BUILD_BUG.
ISTM there is already action at a declarative distance; nobody needs
mystery as to why the sysfs node didn't appear.

Dyndbg is agnostic wrt the categorization scheme used, in order to
play well with any prefix convention already in use in the codebase.
In fact, "prefix" is not strictly accurate without ^ anchor.

Ad-hoc categories and sub-categories are implicitly allowed; author
discipline and review are expected.

Hierarchical classes/categories are natural:

"^drm::"   is used in a later commit
"^drm:::" is a natural extension.
"^drm:atomic:fail:" has been proposed, sounds directly useful

RFC: in a real sense we abandon enum strictures here, and lose some
compiler help, on spelling errs for example.  Obviously "drm:" != "DRM:".

Some properties of a hierarchical category deserve explication:

Trailing spaces matter !

With 1..3-space ("drm: ", "drm:atomic: ", "drm:atomic:fail: "), the
":" doesn't terminate the search-space, the trailing space does.  So a
"drm:" search spec will match all DRM categories & subcategories, and
will not be useful in an interface where all categories are already
controlled together.  That said, "drm:atomic:" & "drm:atomic: " are
different, and both are useful in cases.

Ad-Hoc categories & sub-categories:

Ad-hoc categories are those format-prefixes already in use; both
amdgpu and i915 have numerous (120,~1800) pr_debugs, most of these use
a system, a small set (9,13) of prefixes, to categorize the output.
Dyndbg already works on these, this patch just allows adding a new
bitmap knob to control them.

Ad-hoc sub-categories are slightly trickier, since
drm_dbg_atomic("fail: ...") is a macro:
pr_debug("drm:atomic:" " " format,...) // cpp-paste in a trailing space

We get "drm:atomic: fail:", with that undesirable embedded space;
obviously not ideal wrt clear and simple prefixes.

a possible fix: drm_dbg_atomic_("fail: ..."); // trailing _ for ad-hoc subcat
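
A small standalone sketch of the difference, assuming a hypothetical
fmt_atomic_() variant that pastes no space.  Neither macro below is the
real drm_dbg_atomic(); they only build the format strings so the
resulting prefixes can be compared.

```c
#include <assert.h>
#include <string.h>

#define CAT_ATOMIC "drm:atomic:"

/* Pastes a space after the category: fine for plain messages. */
#define fmt_atomic(fmt)  CAT_ATOMIC " " fmt

/*
 * Hypothetical trailing-underscore variant: no space is pasted, so the
 * caller's text can extend the prefix into a sub-category.
 */
#define fmt_atomic_(fmt) CAT_ATOMIC fmt

/* Returns nonzero when s begins with prefix. */
static int has_prefix(const char *s, const char *prefix)
{
	assert(s && prefix);
	return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

Here fmt_atomic("fail: ...") begins with "drm:atomic: fail:", so a
search for "drm:atomic:fail:" misses it, while fmt_atomic_("fail: ...")
begins with "drm:atomic:fail: " and is excluded by the space-terminated
"drm:atomic: " prefix.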

Summarizing:

 - "drm:kms: " & "drm:kms:" are different
 - "drm:kms"    also different - includes drm:kms2:
 - "drm:kms:\t" also different - could be troublesome
 - "drm:kms:*"  doesn't work, no wildcard on format atm.

Order matters in DEFINE_DYNAMIC_DEBUG_CATEGORIES(... @bit_descs)

Since bits are/will-stay applied 0-N, 

[PATCH v10 00/10 RESEND] use DYNAMIC_DEBUG to implement DRM.debug & DRM.trace

2021-11-11 Thread Jim Cromie
Hi Jason, Greg, DRM-everyone, everyone,

resend to add more people, after rebasing on master to pick up
306589856399 drm/print: Add deprecation notes to DRM_...() functions

This patchset has 3 separate but related parts:

1. DEFINE_DYNAMIC_DEBUG_BITGRPS macro [patch 1/10]

   Declares DRM.debug style bitmap, bits control pr_debugs by matching formats
   Adds callback to translate bits to $cmd > dynamic_debug/control
   This could obsolete EXPORT(dynamic_debug_exec_queries) not included.

   /* anticipated_usage */
   static struct dyndbg_desc drm_categories_map[] = {
  [0] = { DRM_DBG_CAT_CORE },
  [1] = { DRM_DBG_CAT_DRIVER },
  [2] = { DRM_DBG_CAT_KMS },
  [3] = { DRM_DBG_CAT_PRIME }, ... };

   DEFINE_DYNAMIC_DEBUG_BITGRPS(debug, __drm_debug,
" bits control drm.debug categories ",
drm_categories_map);

   Please consider this patch for -next/now/current:
   - new interface, new code, no users to break
   - allows DRM folks to consider in earnest.
   - api bikeshedding to do ?
 struct dyndbg_desc isn't that great a name, others too probably.

2. use (1) to reimplement drm.debug [patches 3-7]:

   1st in amdgpu & i915 to control existing pr_debugs by their formats
   POC for (1)
   then in drm-print, for all drm.debug API users
   has kernel-footprint impact:
  amdgpu has ~3k pr_debugs.  (120kb callsite data)
  i915.ko has ~2k  

   avoids drm_debug_enabled(), gives NOOP savings & new flexibility.
   changes drm.debug categories from enum to format-prefix-string
   alters in-log format to include the format-prefix-string
   Daniel Vetter liked this at -v3
   https://lore.kernel.org/lkml/YPbPvm%2FxcBlTK1wq@phenom.ffwll.local/
   I'm sure I've (still) missed stuff.


3. separately, Sean Paul proposed: drm.trace to mirror drm.debug to tracefs
   https://patchwork.freedesktop.org/series/78133/

He argues:
   tracefs is fast/lightweight compared to syslog
   independent selection (of drm categories) to tracefs
   gives tailored traffic w.o flooding syslog

ISTM he's correct.  So it follows that write-to-tracefs is also a good
feature for dyndbg, where it's then available for all pr_debug users,
including all of drm, on a per-site basis, via echo +T >control.  (iff
CONFIG_TRACING).

So basically, I borg'd his:
   [patch 14/14] drm/print: Add tracefs support to the drm logging helpers

Then I added a T flag, so it can be toggled from shell:

   # turn on all drm's pr_debug --> tracefs
   echo module drm +T > /proc/dynamic_debug/control

It appears to just work: (RFC)

The instance name is a placeholder, per-module subdirs kinda fits the
tracefs pattern, but full mod/file-basename/function/line feels like
overkill, mod/basename-func.line would flatten it nicely. RFC.


[root@gandalf dyndbg-tracefs]# pwd
/sys/kernel/tracing/instances/dyndbg-tracefs
[root@gandalf dyndbg-tracefs]# echo 1 > /sys/module/drm/parameters/trace
[root@gandalf dyndbg-tracefs]# head -n16 trace | sed -e 's/^#//'
 tracer: nop

 entries-in-buffer/entries-written: 405/405   #P:24

_-=> irqs-off
   / _=> need-resched
  | / _---=> hardirq/softirq
  || / _--=> preempt-depth
  ||| / _-=> migrate-disable
   / delay
   TASK-PID CPU#  |  TIMESTAMP  FUNCTION
  | | |   | | |
   <...>-2254[000] .  7040.894352: __dynamic_pr_debug: 
drm:core: comm="gnome-shel:cs0" pid=2254, dev=0xe200, auth=1, AMDGPU_CS
   <...>-2207[015] .  7040.894654: __dynamic_pr_debug: 
drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, DRM_IOCTL_MODE_ADDFB2
   <...>-2207[015] .  7040.995403: __dynamic_pr_debug: 
drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, DRM_IOCTL_MODE_RMFB
   <...>-2207[015] .  7040.995413: __dynamic_pr_debug: 
drm:core: OBJ ID: 121 (2)

This is the pr-debug doing most of that logging: (from dynamic_debug/control)

  drivers/gpu/drm/drm_ioctl.c:866 [drm]drm_ioctl =T "drm:core: comm=\042%s\042 
pid=%d, dev=0x%lx, auth=%d, %s\012"

Turning on decoration flags changes the trace:

  echo module drm format drm:core: +mflt > /proc/dynamic_debug/control 

   TASK-PID CPU#  |  TIMESTAMP  FUNCTION
  | | |   | | |
   <...>-2254[003] . 15980.936660: __dynamic_pr_debug: [2254] 
drm:drm_ioctl:866: drm:core: comm="gnome-shel:cs0" pid=2254, dev=0xe200, 
auth=1, AMDGPU_CS
   <...>-2207[015] . 15980.936966: __dynamic_pr_debug: [2207] 
drm:drm_ioctl:866: drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, 
DRM_IOCTL_MODE_ADDFB2
   <...>-2207[015] . 15981.037727: __dynamic_pr_debug: [2207] 
drm:drm_ioctl:866: drm:core: comm="gnome-shell" pid=2207, 

RE: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct for HDR planes

2021-11-11 Thread Shankar, Uma


> -Original Message-
> From: Harry Wentland 
> Sent: Friday, November 12, 2021 2:41 AM
> To: Shankar, Uma ; Ville Syrjälä
> 
> Cc: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> ppaala...@gmail.com; brian.star...@arm.com; sebast...@sebastianwick.net;
> shashank.sha...@amd.com
> Subject: Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct 
> for
> HDR planes
> 
> 
> 
> On 2021-11-11 15:42, Shankar, Uma wrote:
> >
> >
> >> -Original Message-
> >> From: Ville Syrjälä 
> >> Sent: Thursday, November 11, 2021 10:13 PM
> >> To: Harry Wentland 
> >> Cc: Shankar, Uma ;
> >> intel-...@lists.freedesktop.org; dri- de...@lists.freedesktop.org;
> >> ppaala...@gmail.com; brian.star...@arm.com;
> >> sebast...@sebastianwick.net; shashank.sha...@amd.com
> >> Subject: Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range
> >> struct for HDR planes
> >>
> >> On Thu, Nov 11, 2021 at 10:17:17AM -0500, Harry Wentland wrote:
> >>>
> >>>
> >>> On 2021-09-06 17:38, Uma Shankar wrote:
>  Define the structure with XE_LPD degamma lut ranges. HDR and SDR
>  planes have different capabilities, implemented respective
>  structure for the HDR planes.
> 
>  Signed-off-by: Uma Shankar 
>  ---
>   drivers/gpu/drm/i915/display/intel_color.c | 52
>  ++
>   1 file changed, 52 insertions(+)
> 
>  diff --git a/drivers/gpu/drm/i915/display/intel_color.c
>  b/drivers/gpu/drm/i915/display/intel_color.c
>  index afcb4bf3826c..6403bd74324b 100644
>  --- a/drivers/gpu/drm/i915/display/intel_color.c
>  +++ b/drivers/gpu/drm/i915/display/intel_color.c
>  @@ -2092,6 +2092,58 @@ static void icl_read_luts(struct
>  intel_crtc_state
> >> *crtc_state)
>   }
>   }
> 
>  + /* FIXME input bpc? */
>  +__maybe_unused
>  +static const struct drm_color_lut_range d13_degamma_hdr[] = {
>  +/* segment 1 */
>  +{
>  +.flags = (DRM_MODE_LUT_GAMMA |
>  +  DRM_MODE_LUT_REFLECT_NEGATIVE |
>  +  DRM_MODE_LUT_INTERPOLATE |
>  +  DRM_MODE_LUT_NON_DECREASING),
>  +.count = 128,
>  +.input_bpc = 24, .output_bpc = 16,
>  +.start = 0, .end = (1 << 24) - 1,
>  +.min = 0, .max = (1 << 24) - 1,
>  +},
>  +/* segment 2 */
>  +{
>  +.flags = (DRM_MODE_LUT_GAMMA |
>  +  DRM_MODE_LUT_REFLECT_NEGATIVE |
>  +  DRM_MODE_LUT_INTERPOLATE |
>  +  DRM_MODE_LUT_REUSE_LAST |
>  +  DRM_MODE_LUT_NON_DECREASING),
>  +.count = 1,
>  +.input_bpc = 24, .output_bpc = 16,
>  +.start = (1 << 24) - 1, .end = 1 << 24,
>  +.min = 0, .max = (1 << 27) - 1,
>  +},
>  +/* Segment 3 */
>  +{
>  +.flags = (DRM_MODE_LUT_GAMMA |
>  +  DRM_MODE_LUT_REFLECT_NEGATIVE |
>  +  DRM_MODE_LUT_INTERPOLATE |
>  +  DRM_MODE_LUT_REUSE_LAST |
>  +  DRM_MODE_LUT_NON_DECREASING),
>  +.count = 1,
>  +.input_bpc = 24, .output_bpc = 16,
>  +.start = 1 << 24, .end = 3 << 24,
>  +.min = 0, .max = (1 << 27) - 1,
>  +},
>  +/* Segment 4 */
>  +{
>  +.flags = (DRM_MODE_LUT_GAMMA |
>  +  DRM_MODE_LUT_REFLECT_NEGATIVE |
>  +  DRM_MODE_LUT_INTERPOLATE |
>  +  DRM_MODE_LUT_REUSE_LAST |
>  +  DRM_MODE_LUT_NON_DECREASING),
>  +.count = 1,
>  +.input_bpc = 24, .output_bpc = 16,
>  +.start = 3 << 24, .end = 7 << 24,
>  +.min = 0, .max = (1 << 27) - 1,
>  +},
>  +};
> >>>
> >>> If I understand this right, userspace would need this definition in
> >>> order to populate the degamma blob. Should this sit in a UAPI header?
> >
> > Hi Harry, Pekka and Ville,
> > Sorry for being a bit late on the replies, got side tracked with various 
> > issues.
> > I am back on this. Apologies for delay.
> >
> >> My original idea (not sure it's fully realized in this series) is to
> >> have a new GAMMA_MODE/etc. enum property on each crtc (or plane) for
> >> which each enum value points to a kernel provided blob that contains one of
> these LUT descriptors.
> >> Userspace can then query them dynamically and pick the best one for
> >> its current use case.
> >
> > We have this as part of the series Ville. Patch 3 of this series
> > creates a 

[PATCH 2/4] drm/i915/dg2: Add Wa_16011777198

2021-11-11 Thread Matt Roper
Coarse power gating for render should not be enabled on some DG2
steppings.
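
A minimal standalone model of the selection logic follows.  The enum
values, bit definitions and helper names are hypothetical stand-ins,
not i915 definitions, and the half-open [since, until) stepping range
is an assumption about how the stepping check behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the i915 definitions. */
enum step { STEP_A0, STEP_B0, STEP_C0, STEP_D0 };
enum variant { G10, G11 };

#define PG_RENDER        (1u << 0)
#define PG_MEDIA         (1u << 1)
#define PG_MEDIA_SAMPLER (1u << 2)

/* Half-open range check: since <= s < until (assumed semantics). */
static bool step_in_range(enum step s, enum step since, enum step until)
{
	assert(since <= until);
	return s >= since && s < until;
}

/* Media powergating is always enabled; render only when unaffected. */
static unsigned int pg_enable_mask(enum variant v, enum step s)
{
	unsigned int mask = PG_MEDIA | PG_MEDIA_SAMPLER;
	bool affected = (v == G10 && step_in_range(s, STEP_A0, STEP_C0)) ||
			(v == G11 && step_in_range(s, STEP_A0, STEP_B0));

	if (!affected)
		mask |= PG_RENDER;
	return mask;
}
```

In this model G10 at A0 or B0 and G11 at A0 leave render powergating
off, and later steppings get all three bits.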

Bspec: 52698
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_rc6.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c
index 43093dd2d0c9..c3155ee58689 100644
--- a/drivers/gpu/drm/i915/gt/intel_rc6.c
+++ b/drivers/gpu/drm/i915/gt/intel_rc6.c
@@ -117,10 +117,17 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6)
GEN6_RC_CTL_RC6_ENABLE |
GEN6_RC_CTL_EI_MODE(1);
 
-   pg_enable =
-   GEN9_RENDER_PG_ENABLE |
-   GEN9_MEDIA_PG_ENABLE |
-   GEN11_MEDIA_SAMPLER_PG_ENABLE;
+   /* Wa_16011777198 - Render powergating must remain disabled */
+   if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
+   IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))
+   pg_enable =
+   GEN9_MEDIA_PG_ENABLE |
+   GEN11_MEDIA_SAMPLER_PG_ENABLE;
+   else
+   pg_enable =
+   GEN9_RENDER_PG_ENABLE |
+   GEN9_MEDIA_PG_ENABLE |
+   GEN11_MEDIA_SAMPLER_PG_ENABLE;
 
if (GRAPHICS_VER(gt->i915) >= 12) {
for (i = 0; i < I915_MAX_VCS; i++)
-- 
2.33.0



[PATCH 3/4] drm/i915/dg2: Add Wa_16013000631

2021-11-11 Thread Matt Roper
From: Ramalingam C 

Invalidate IC cache through pipe control command as part of the ctx
restore flow through indirect ctx pointer

Cc: Chris Wilson 
Signed-off-by: Ramalingam C 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 56156cf18c41..5523d7b2f983 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1176,6 +1176,11 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
cs = gen12_emit_timestamp_wa(ce, cs);
cs = gen12_emit_restore_scratch(ce, cs);
 
+   /* Wa_16013000631:dg2 */
+   if (IS_DG2_GRAPHICS_STEP(ce->engine->i915, G10, STEP_B0, STEP_C0) ||
+   IS_DG2_G11(ce->engine->i915))
+   cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
+
return cs;
 }
 
-- 
2.33.0



[PATCH 4/4] drm/i915/dg2: extend Wa_1409120013 to DG2

2021-11-11 Thread Matt Roper
From: Matt Atwood 

Extend existing workaround 1409120013 to DG2.

Cc: José Roberto de Souza 
Signed-off-by: Matt Atwood 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b3d4710c6b25..e85a43e2dad4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7444,9 +7444,9 @@ static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void gen12lp_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-   /* Wa_1409120013:tgl,rkl,adl-s,dg1 */
+   /* Wa_1409120013:tgl,rkl,adl-s,dg1,dg2 */
if (IS_TIGERLAKE(dev_priv) || IS_ROCKETLAKE(dev_priv) ||
-   IS_ALDERLAKE_S(dev_priv) || IS_DG1(dev_priv))
+   IS_ALDERLAKE_S(dev_priv) || IS_DG1(dev_priv) || IS_DG2(dev_priv))
 intel_uncore_write(&dev_priv->uncore, ILK_DPFC_CHICKEN,
   DPFC_CHICKEN_COMP_DUMMY_PIXEL);
 
-- 
2.33.0



[PATCH 0/4] i915: Additional DG2 workarounds

2021-11-11 Thread Matt Roper
We have a few more DG2 workarounds that weren't included in the initial
batch.


Matt Atwood (1):
  drm/i915/dg2: extend Wa_1409120013 to DG2

Matt Roper (2):
  drm/i915/dg2: Add Wa_14010547955
  drm/i915/dg2: Add Wa_16011777198

Ramalingam C (1):
  drm/i915/dg2: Add Wa_16013000631

 drivers/gpu/drm/i915/display/intel_display.c |  4 
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  5 +
 drivers/gpu/drm/i915/gt/intel_rc6.c  | 15 +++
 drivers/gpu/drm/i915/i915_reg.h  |  5 +++--
 drivers/gpu/drm/i915/intel_pm.c  |  4 ++--
 5 files changed, 25 insertions(+), 8 deletions(-)

-- 
2.33.0



[PATCH 1/4] drm/i915/dg2: Add Wa_14010547955

2021-11-11 Thread Matt Roper
This workaround is documented a bit strangely in the bspec; it's listed
as an A0 workaround, but the description clarifies that the workaround
is implicitly handled by the hardware and what the driver really needs
to do is program a chicken bit to reenable some internal behavior.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display.c | 4 
 drivers/gpu/drm/i915/i915_reg.h  | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 0ceee8ac6671..5d50d06f4eb7 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -988,6 +988,10 @@ static void icl_set_pipe_chicken(const struct intel_crtc_state *crtc_state)
else if (DISPLAY_VER(dev_priv) >= 13)
tmp |= UNDERRUN_RECOVERY_DISABLE_ADLP;
 
+   /* Wa_14010547955:dg2 */
+   if (IS_DG2_DISP_STEP(dev_priv, STEP_B0, STEP_FOREVER))
+   tmp |= DG2_RENDER_CCSTAG_4_3_EN;
+
intel_de_write(dev_priv, PIPE_CHICKEN(pipe), tmp);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 07d6cf76c389..680ace373e00 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8480,8 +8480,9 @@ enum {
   _PIPEB_CHICKEN)
 #define   UNDERRUN_RECOVERY_DISABLE_ADLP   REG_BIT(30)
 #define   UNDERRUN_RECOVERY_ENABLE_DG2 REG_BIT(30)
-#define   PIXEL_ROUNDING_TRUNC_FB_PASSTHRU (1 << 15)
-#define   PER_PIXEL_ALPHA_BYPASS_EN(1 << 7)
+#define   PIXEL_ROUNDING_TRUNC_FB_PASSTHRU REG_BIT(15)
+#define   DG2_RENDER_CCSTAG_4_3_EN REG_BIT(12)
+#define   PER_PIXEL_ALPHA_BYPASS_ENREG_BIT(7)
 
 #define FF_MODE2   _MMIO(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK   REG_GENMASK(31, 24)
-- 
2.33.0



[PATCH] drm/i915/execlists: Weak parallel submission support for execlists

2021-11-11 Thread Matthew Brost
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for
execlists. Doing as little as possible to support this interface for
execlists - basically just passing submit fences between each request
generated and virtual engines are not allowed. This is on par with what
is there for the existing (hopefully soon deprecated) bonding interface.

We perma-pin these execlists contexts to align with GuC implementation.

v2:
 (John Harrison)
  - Drop siblings array as num_siblings must be 1
v3:
 (John Harrison)
  - Drop single submission

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 10 +++--
 drivers/gpu/drm/i915/gt/intel_context.c   |  4 +-
 .../drm/i915/gt/intel_execlists_submission.c  | 40 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  2 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 -
 5 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ebd775cb1661c..d7bf6c8f70b7b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
struct intel_engine_cs **siblings = NULL;
intel_engine_mask_t prev_mask;
 
-   /* FIXME: This is NIY for execlists */
-   if (!(intel_uc_uses_guc_submission(>gt.uc)))
-   return -ENODEV;
-
if (get_user(slot, >engine_index))
return -EFAULT;
 
@@ -583,6 +579,12 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
if (get_user(num_siblings, >num_siblings))
return -EFAULT;
 
+   if (!intel_uc_uses_guc_submission(>gt.uc) && num_siblings != 1) {
+   drm_dbg(>drm, "Only 1 sibling (%d) supported in non-GuC 
mode\n",
+   num_siblings);
+   return -EINVAL;
+   }
+
if (slot >= set->num_engines) {
drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
slot, set->num_engines);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 5634d14052bc9..1bec92e1d8e63 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct intel_context 
*ce)
 
__i915_active_acquire(>active);
 
-   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
+   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) ||
+   intel_context_is_parallel(ce))
return 0;
 
/* Preallocate tracking nodes */
@@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
 * Callers responsibility to validate that this function is used
 * correctly but we use GEM_BUG_ON here ensure that they do.
 */
-   GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
GEM_BUG_ON(intel_context_is_pinned(parent));
GEM_BUG_ON(intel_context_is_child(parent));
GEM_BUG_ON(intel_context_is_pinned(child));
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index ca03880fa7e49..5fd49ee47096d 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2598,6 +2598,45 @@ static void execlists_context_cancel_request(struct 
intel_context *ce,
  current->comm);
 }
 
+static struct intel_context *
+execlists_create_parallel(struct intel_engine_cs **engines,
+ unsigned int num_siblings,
+ unsigned int width)
+{
+   struct intel_context *parent = NULL, *ce, *err;
+   int i;
+
+   GEM_BUG_ON(num_siblings != 1);
+
+   for (i = 0; i < width; ++i) {
+   ce = intel_context_create(engines[i]);
+   if (!ce) {
+   err = ERR_PTR(-ENOMEM);
+   goto unwind;
+   }
+
+   if (i == 0)
+   parent = ce;
+   else
+   intel_context_bind_parent_child(parent, ce);
+   }
+
+   parent->parallel.fence_context = dma_fence_context_alloc(1);
+
+   intel_context_set_nopreempt(parent);
+   for_each_child(parent, ce) {
+   intel_context_set_nopreempt(ce);
+   intel_context_set_single_submission(ce);
+   }
+
+   return parent;
+
+unwind:
+   if (parent)
+   intel_context_put(parent);
+   return err;
+}
+
 static const struct intel_context_ops execlists_context_ops = {
.flags = COPS_HAS_INFLIGHT,
 
@@ -2616,6 +2655,7 @@ static const struct intel_context_ops 
execlists_context_ops = {
.reset = 

Re: [RFC] arm64: dts: imx8mm: Add MIPI and LCDIF nodes

2021-11-11 Thread Tim Harvey
On Thu, Nov 11, 2021 at 2:19 AM Jagan Teki  wrote:
>
> On Wed, Nov 10, 2021 at 11:58 PM Jagan Teki  
> wrote:
> >
> > On Wed, Nov 10, 2021 at 2:24 AM Tim Harvey  wrote:
> > >
> > > On Tue, Nov 9, 2021 at 12:39 PM Marek Vasut  wrote:
> > > >
> > > > On 11/9/21 8:35 PM, Adam Ford wrote:
> > > >
> > > > [...]
> > > >
> > > > >> diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi 
> > > > >> b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > > > >> index 208a0ed840f4..195dcbff7058 100644
> > > > >> --- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > > > >> +++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > > > >> @@ -188,6 +188,12 @@
> > > > >>  clock-output-names = "clk_ext4";
> > > > >>  };
> > > > >>
> > > > >> +   mipi_phy: mipi-video-phy {
> > > > >> +   compatible = "fsl,imx8mm-mipi-video-phy";
> > > > >> +   syscon = <_blk_ctrl>;
> > > > >> +   #phy-cells = <1>;
> > > > >> +   };
> > > > >> +
> > > > >>  psci {
> > > > >>  compatible = "arm,psci-1.0";
> > > > >>  method = "smc";
> > > > >> @@ -1068,6 +1074,68 @@
> > > > >>  #size-cells = <1>;
> > > > >>  ranges = <0x32c0 0x32c0 0x40>;
> > > > >>
> > > > >> +   lcdif: lcdif@32e0 {
> > > > >> +   #address-cells = <1>;
> > > > >> +   #size-cells = <0>;
> > > > >> +   compatible = "fsl,imx8mm-lcdif", 
> > > > >> "fsl,imx6sx-lcdif";
> > > > >
> > > > > The compatible "imx6sx-lcdif" implies MXSFB_V6.  FWICT, it is like
> > > > > MXSFB_V4, but with overlays and those overlays have more registers
> > > > > configured in the mxsfb_kms driver.  Have you tried using imx28-lcdif
> > > > > to see if it makes a difference?
> > > >
> > > > Indeed, MX6SX has AS overlay plane support, MX{2,}8 does not.
> > > >
> > > > LCDIFv3 (as NXP calls it) in MX8MP is like LCDIFv6 (in MX6SX) with
> > > > slightly reordered register bits, but nothing like LCDIF rev3 (in MX23)
> > > > ... just to make sure there is no confusion.
> > > >
> > > > [...]
> > > >
> > > > >> +   mipi_dsi: mipi_dsi@32e1 {
> > > > >> +   #address-cells = <1>;
> > > > >> +   #size-cells = <0>;
> > > > >> +   compatible = "fsl,imx8mm-mipi-dsim";
> > > > >> +   reg = <0x32e1 0x400>;
> > > > >> +   clocks = < IMX8MM_CLK_DSI_CORE>,
> > > > >> +< 
> > > > >> IMX8MM_CLK_DSI_PHY_REF>;
> > > > >> +   clock-names = "bus_clk", "sclk_mipi";
> > > > >> +   assigned-clocks = < 
> > > > >> IMX8MM_CLK_DSI_CORE>,
> > > > >> + < 
> > > > >> IMX8MM_VIDEO_PLL1_OUT>,
> > > > >> + < 
> > > > >> IMX8MM_CLK_DSI_PHY_REF>;
> > > > >> +   assigned-clock-parents = < 
> > > > >> IMX8MM_SYS_PLL1_266M>,
> > > > >> +< 
> > > > >> IMX8MM_VIDEO_PLL1_BYPASS>,
> > > > >> +< 
> > > > >> IMX8MM_VIDEO_PLL1_OUT>;
> > > > >> +   assigned-clock-rates = <26600>, 
> > > > >> <59400>, <2700>;
> > > > >> +   interrupts =  > > > >> IRQ_TYPE_LEVEL_HIGH>;
> > > > >> +   phys = <_phy 0>;
> > > > >> +   phy-names = "dsim";
> > > > >> +   power-domains = <_blk_ctrl 
> > > > >> IMX8MM_DISPBLK_PD_MIPI_DSI>;
> > > > >> +   samsung,burst-clock-frequency = 
> > > > >> <89100>;
> > > > >> +   samsung,esc-clock-frequency = 
> > > > >> <5400>;
> > > > >> +   samsung,pll-clock-frequency = 
> > > > >> <2700>;
> > > >
> > > > This 27 MHz is really IMX8MM_CLK_DSI_PHY_REF and
> > > > samsung,burst-clock-frequency is really the DSI link clock which is
> > > > panel/bridge specific ... but, why do we need to specify such policy in
> > > > DT rather than have the panel/bridge drivers negotiate the best clock
> > > > settings with DSIM bridge driver ? This should be something which should
> > > > be implemented in the DRM subsystem, not hard-coded in DT. These ad-hoc
> > > > samsung,*-clock-frequency properties shouldn't even be needed then.
> > > >
> > > > Also, are the DSIM bindings stable now ?
> > >
> > > Thanks Marek.
> > >
> > > No, there is no dsim driver yet. I'm not clear if there is still
> > > disagreement on if the drm/exynos driver can be split up or if a
> > > whole new somewhat duplicate driver needs to be made. I know Jagan
> > > also has a series he is 

Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct for HDR planes

2021-11-11 Thread Harry Wentland



On 2021-11-11 15:42, Shankar, Uma wrote:
> 
> 
>> -Original Message-
>> From: Ville Syrjälä 
>> Sent: Thursday, November 11, 2021 10:13 PM
>> To: Harry Wentland 
>> Cc: Shankar, Uma ; intel-...@lists.freedesktop.org; 
>> dri-
>> de...@lists.freedesktop.org; ppaala...@gmail.com; brian.star...@arm.com;
>> sebast...@sebastianwick.net; shashank.sha...@amd.com
>> Subject: Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct 
>> for
>> HDR planes
>>
>> On Thu, Nov 11, 2021 at 10:17:17AM -0500, Harry Wentland wrote:
>>>
>>>
>>> On 2021-09-06 17:38, Uma Shankar wrote:
>>>> Define the structure with XE_LPD degamma lut ranges. HDR and SDR
>>>> planes have different capabilities, implemented respective structure
>>>> for the HDR planes.
>>>>
>>>> Signed-off-by: Uma Shankar 
>>>> ---
>>>>  drivers/gpu/drm/i915/display/intel_color.c | 52 ++
>>>>  1 file changed, 52 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/display/intel_color.c
>>>> b/drivers/gpu/drm/i915/display/intel_color.c
>>>> index afcb4bf3826c..6403bd74324b 100644
>>>> --- a/drivers/gpu/drm/i915/display/intel_color.c
>>>> +++ b/drivers/gpu/drm/i915/display/intel_color.c
>>>> @@ -2092,6 +2092,58 @@ static void icl_read_luts(struct intel_crtc_state *crtc_state)
>>>>    }
>>>>  }
>>>>
>>>> + /* FIXME input bpc? */
>>>> +__maybe_unused
>>>> +static const struct drm_color_lut_range d13_degamma_hdr[] = {
>>>> +  /* segment 1 */
>>>> +  {
>>>> +  .flags = (DRM_MODE_LUT_GAMMA |
>>>> +DRM_MODE_LUT_REFLECT_NEGATIVE |
>>>> +DRM_MODE_LUT_INTERPOLATE |
>>>> +DRM_MODE_LUT_NON_DECREASING),
>>>> +  .count = 128,
>>>> +  .input_bpc = 24, .output_bpc = 16,
>>>> +  .start = 0, .end = (1 << 24) - 1,
>>>> +  .min = 0, .max = (1 << 24) - 1,
>>>> +  },
>>>> +  /* segment 2 */
>>>> +  {
>>>> +  .flags = (DRM_MODE_LUT_GAMMA |
>>>> +DRM_MODE_LUT_REFLECT_NEGATIVE |
>>>> +DRM_MODE_LUT_INTERPOLATE |
>>>> +DRM_MODE_LUT_REUSE_LAST |
>>>> +DRM_MODE_LUT_NON_DECREASING),
>>>> +  .count = 1,
>>>> +  .input_bpc = 24, .output_bpc = 16,
>>>> +  .start = (1 << 24) - 1, .end = 1 << 24,
>>>> +  .min = 0, .max = (1 << 27) - 1,
>>>> +  },
>>>> +  /* Segment 3 */
>>>> +  {
>>>> +  .flags = (DRM_MODE_LUT_GAMMA |
>>>> +DRM_MODE_LUT_REFLECT_NEGATIVE |
>>>> +DRM_MODE_LUT_INTERPOLATE |
>>>> +DRM_MODE_LUT_REUSE_LAST |
>>>> +DRM_MODE_LUT_NON_DECREASING),
>>>> +  .count = 1,
>>>> +  .input_bpc = 24, .output_bpc = 16,
>>>> +  .start = 1 << 24, .end = 3 << 24,
>>>> +  .min = 0, .max = (1 << 27) - 1,
>>>> +  },
>>>> +  /* Segment 4 */
>>>> +  {
>>>> +  .flags = (DRM_MODE_LUT_GAMMA |
>>>> +DRM_MODE_LUT_REFLECT_NEGATIVE |
>>>> +DRM_MODE_LUT_INTERPOLATE |
>>>> +DRM_MODE_LUT_REUSE_LAST |
>>>> +DRM_MODE_LUT_NON_DECREASING),
>>>> +  .count = 1,
>>>> +  .input_bpc = 24, .output_bpc = 16,
>>>> +  .start = 3 << 24, .end = 7 << 24,
>>>> +  .min = 0, .max = (1 << 27) - 1,
>>>> +  },
>>>> +};
>>>
>>> If I understand this right, userspace would need this definition in
>>> order to populate the degamma blob. Should this sit in a UAPI header?
> 
> Hi Harry, Pekka and Ville,
> Sorry for being a bit late on the replies, got side tracked with various 
> issues.
> I am back on this. Apologies for delay.
> 
>> My original idea (not sure it's fully realized in this series) is to have a 
>> new
>> GAMMA_MODE/etc. enum property on each crtc (or plane) for which each enum
>> value points to a kernel provided blob that contains one of these LUT 
>> descriptors.
>> Userspace can then query them dynamically and pick the best one for its 
>> current use
>> case.
> 
> We have this as part of the series Ville. Patch 3 of this series creates a 
> DEGAMMA_MODE
> property just for this. With that userspace can just query the blob_id's and 
> will get the
> various degamma mode possible and the respective segment and lut 
> distributions.
> 
> This will be generic, so for userspace it should just be able to query this 
> and parse and get
> the lut distribution and segment ranges.
> 

Thanks for the explanation.

Uma, have you had a chance to sketch some of this out in IGT? I'm trying
to see how userspace would do this in practice and will try to sketch an
IGT test for this myself, but if you have it already we could share the
effort.

>> The algorithm for choosing the best one might be something like:
>> - prefer LUT with bpc >= FB bpc, but perhaps not needlessly high bpc
>> - prefer interpolated vs. direct lookup based on current needs (eg. X
>>   could prefer direct lookup to get directcolor visuals).
>> - prefer one with 

Re: regression with mainline kernel

2021-11-11 Thread Sudip Mukherjee
Hi Linus,

On Thu, Nov 11, 2021 at 2:03 PM Sudip Mukherjee
 wrote:
>
> Hi Linus,
>
> My testing has been failing for the last few days. Last good test was
> with 6f2b76a4a384 and I started seeing the failure with ce840177930f5
> where the boot times out.
>
> Last good test - https://openqa.qa.codethink.co.uk/tests/323
> Failing test - https://openqa.qa.codethink.co.uk/tests/335
>
> Saw a similar issue with 5.10.79-rc1 today and bisect showed the
> problem with 8615ff6dd1ac but that was already in the last good test I
> had.

I did a bisect, and this is a separate issue from the one we saw in 5.10.79-rc1.

The bisect log:
# bad: [ce840177930f591a181f55515fc6ac9e1f56b84a] Merge tag
'defconfig-5.16' of
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
# good: [6f2b76a4a384e05ac8d3349831f29dff5de1e1e2] Merge tag
'Smack-for-5.16' of https://github.com/cschaufler/smack-next
git bisect start 'ce840177930f5' '6f2b76a4a384e05ac8d3349831f29dff5de1e1e2'
# good: [a64a325bf6313aa5cde7ecd691927e92892d1b7f] Merge tag
'afs-next-20211102' of
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
git bisect good a64a325bf6313aa5cde7ecd691927e92892d1b7f
# bad: [dcd68326d29b62f3039e4f4d23d3e38f24d37360] Merge tag
'devicetree-for-5.16' of
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect bad dcd68326d29b62f3039e4f4d23d3e38f24d37360
# bad: [c7c774fe09389fc806bbe4b487c18e45f576c1ae] Merge tag
'drm-intel-next-2021-10-04' of
git://anongit.freedesktop.org/drm/drm-intel into drm-next
git bisect bad c7c774fe09389fc806bbe4b487c18e45f576c1ae
# good: [8017ecb11ebbcdfcbdff14c5edbdf1efc14991f4] drm/amd/display:
Added root clock optimization flags
git bisect good 8017ecb11ebbcdfcbdff14c5edbdf1efc14991f4
# good: [8a1ec3f3275479292613273a7be2ac87f2a7f6e6] drm/i915: Remove
DP_PORT_EN stuff from link training code
git bisect good 8a1ec3f3275479292613273a7be2ac87f2a7f6e6
# bad: [9962601ca5719050906915c3c33a63744ac7b15c] drm/bridge:
dw-hdmi-cec: Make use of the helper function
devm_add_action_or_reset()
git bisect bad 9962601ca5719050906915c3c33a63744ac7b15c
# bad: [606b102876e3741851dfb09d53f3ee57f650a52c] drm: fb_helper: fix
CONFIG_FB dependency
git bisect bad 606b102876e3741851dfb09d53f3ee57f650a52c
# good: [c43da06c24a485308e80d709737b446e8cad175d] dt-bindings:
drm/panel: boe-tv101wum-nl6: Support enabling a 3.3V rail
git bisect good c43da06c24a485308e80d709737b446e8cad175d
# good: [8d6b006e1f51c99016aa39ca9e03947cbdd024e3] drm/virtio:
implement context init: handle VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK
git bisect good 8d6b006e1f51c99016aa39ca9e03947cbdd024e3
# bad: [d0f5d790ae863079025398015eb59347b01db455] drm/ttm: remove
TTM_PAGE_FLAG_NO_RETRY
git bisect bad d0f5d790ae863079025398015eb59347b01db455
# bad: [f5d28856b89baab4232a9f841e565763fcebcdf9] drm/ttm: stop
calling tt_swapin in vm_access
git bisect bad f5d28856b89baab4232a9f841e565763fcebcdf9
# bad: [78aa20fa4381623cf59a85d053486f98784ca3a0] drm/virtio:
implement context init: advertise feature to userspace
git bisect bad 78aa20fa4381623cf59a85d053486f98784ca3a0
# bad: [cd7f5ca33585918febe5e2f6dc090a21cfa775b0] drm/virtio:
implement context init: add virtio_gpu_fence_event
git bisect bad cd7f5ca33585918febe5e2f6dc090a21cfa775b0
# first bad commit: [cd7f5ca33585918febe5e2f6dc090a21cfa775b0]
drm/virtio: implement context init: add virtio_gpu_fence_event

And, indeed reverting cd7f5ca33585 on top of debe436e77c7 has fixed
the problem I was seeing on my qemu test of x86_64. The qemu image is
based on Ubuntu.

Will be happy to test any fix or more debugging if needed.


-- 
Regards
Sudip


RE: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct for HDR planes

2021-11-11 Thread Shankar, Uma



> -Original Message-
> From: Ville Syrjälä 
> Sent: Thursday, November 11, 2021 10:13 PM
> To: Harry Wentland 
> Cc: Shankar, Uma ; intel-...@lists.freedesktop.org; 
> dri-
> de...@lists.freedesktop.org; ppaala...@gmail.com; brian.star...@arm.com;
> sebast...@sebastianwick.net; shashank.sha...@amd.com
> Subject: Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct 
> for
> HDR planes
> 
> On Thu, Nov 11, 2021 at 10:17:17AM -0500, Harry Wentland wrote:
> >
> >
> > On 2021-09-06 17:38, Uma Shankar wrote:
> > > Define the structure with XE_LPD degamma lut ranges. HDR and SDR
> > > planes have different capabilities, implemented respective structure
> > > for the HDR planes.
> > >
> > > Signed-off-by: Uma Shankar 
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_color.c | 52
> > > ++
> > >  1 file changed, 52 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/display/intel_color.c
> > > b/drivers/gpu/drm/i915/display/intel_color.c
> > > index afcb4bf3826c..6403bd74324b 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_color.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_color.c
> > > @@ -2092,6 +2092,58 @@ static void icl_read_luts(struct intel_crtc_state
> *crtc_state)
> > >   }
> > >  }
> > >
> > > + /* FIXME input bpc? */
> > > +__maybe_unused
> > > +static const struct drm_color_lut_range d13_degamma_hdr[] = {
> > > + /* segment 1 */
> > > + {
> > > + .flags = (DRM_MODE_LUT_GAMMA |
> > > +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> > > +   DRM_MODE_LUT_INTERPOLATE |
> > > +   DRM_MODE_LUT_NON_DECREASING),
> > > + .count = 128,
> > > + .input_bpc = 24, .output_bpc = 16,
> > > + .start = 0, .end = (1 << 24) - 1,
> > > + .min = 0, .max = (1 << 24) - 1,
> > > + },
> > > + /* segment 2 */
> > > + {
> > > + .flags = (DRM_MODE_LUT_GAMMA |
> > > +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> > > +   DRM_MODE_LUT_INTERPOLATE |
> > > +   DRM_MODE_LUT_REUSE_LAST |
> > > +   DRM_MODE_LUT_NON_DECREASING),
> > > + .count = 1,
> > > + .input_bpc = 24, .output_bpc = 16,
> > > + .start = (1 << 24) - 1, .end = 1 << 24,
> > > + .min = 0, .max = (1 << 27) - 1,
> > > + },
> > > + /* Segment 3 */
> > > + {
> > > + .flags = (DRM_MODE_LUT_GAMMA |
> > > +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> > > +   DRM_MODE_LUT_INTERPOLATE |
> > > +   DRM_MODE_LUT_REUSE_LAST |
> > > +   DRM_MODE_LUT_NON_DECREASING),
> > > + .count = 1,
> > > + .input_bpc = 24, .output_bpc = 16,
> > > + .start = 1 << 24, .end = 3 << 24,
> > > + .min = 0, .max = (1 << 27) - 1,
> > > + },
> > > + /* Segment 4 */
> > > + {
> > > + .flags = (DRM_MODE_LUT_GAMMA |
> > > +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> > > +   DRM_MODE_LUT_INTERPOLATE |
> > > +   DRM_MODE_LUT_REUSE_LAST |
> > > +   DRM_MODE_LUT_NON_DECREASING),
> > > + .count = 1,
> > > + .input_bpc = 24, .output_bpc = 16,
> > > + .start = 3 << 24, .end = 7 << 24,
> > > + .min = 0, .max = (1 << 27) - 1,
> > > + },
> > > +};
> >
> > If I understand this right, userspace would need this definition in
> > order to populate the degamma blob. Should this sit in a UAPI header?

Hi Harry, Pekka and Ville,
Sorry for the late replies; I got sidetracked with various issues.
I am back on this now. Apologies for the delay.

> My original idea (not sure it's fully realized in this series) is to have a 
> new
> GAMMA_MODE/etc. enum property on each crtc (or plane) for which each enum
> value points to a kernel provided blob that contains one of these LUT 
> descriptors.
> Userspace can then query them dynamically and pick the best one for its 
> current use
> case.

We have this as part of the series, Ville. Patch 3 of this series creates a
DEGAMMA_MODE property just for this. With that, userspace can query the
blob IDs and get the various possible degamma modes along with their
respective segment and LUT distributions.

This will be generic, so userspace should be able to query this, parse it,
and get the LUT distribution and segment ranges.
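
To make the "query and parse" step concrete, here is a rough userspace sketch of walking such a range table once it has been read out of a blob. The struct lut_range layout, the helper names and the assumption that start/end are inclusive input bounds are all hypothetical; only the segment values are copied from the d13_degamma_hdr[] table quoted above, and the flags are left at zero for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of one LUT range descriptor. Field names
 * follow the quoted initializers; the types are assumptions.
 */
struct lut_range {
	uint32_t flags;
	uint16_t count;
	uint8_t input_bpc, output_bpc;
	uint32_t start, end;
	uint32_t min, max;
};

/* Segment values copied from the quoted table; flags omitted. */
static const struct lut_range example_hdr_ranges[] = {
	{ 0, 128, 24, 16, 0,              (1u << 24) - 1, 0, (1u << 24) - 1 },
	{ 0, 1,   24, 16, (1u << 24) - 1, 1u << 24,       0, (1u << 27) - 1 },
	{ 0, 1,   24, 16, 1u << 24,       3u << 24,       0, (1u << 27) - 1 },
	{ 0, 1,   24, 16, 3u << 24,       7u << 24,       0, (1u << 27) - 1 },
};

/* Total number of LUT entries userspace would have to supply. */
static size_t lut_total_entries(const struct lut_range *r, size_t n)
{
	size_t total = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += r[i].count;

	return total;
}

/*
 * Index of the first segment whose [start, end] interval contains the
 * input value, or -1 if the value lies outside every segment.
 */
static int lut_find_segment(const struct lut_range *r, size_t n, uint32_t input)
{
	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (input >= r[i].start && input <= r[i].end)
			return (int)i;

	return -1;
}
```

With the four segments above, lut_total_entries() comes to 131 entries (128 + 1 + 1 + 1), and adjacent segments share their boundary value, so the first matching segment wins.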

> The algorithm for choosing the best one might be something like:
> - prefer LUT with bpc >= FB bpc, but perhaps not needlessly high bpc
> - prefer interpolated vs. direct lookup based on current needs (eg. X
>   could prefer direct lookup to get directcolor visuals).
> - prefer one with extended range values if needed
> - for HDR prefer smaller step size in dark tones,
>   for SDR perhaps prefer a more uniform step size
> 
> Or maybe we should include some kind of usage hints as well?
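
One possible reading of the selection heuristic listed above, sketched as a scoring function. Everything here is an assumption made for illustration: the struct names, the weights, and the reduction of each mode to three properties. The step-size preference for HDR vs. SDR is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one advertised LUT mode. */
struct lut_mode {
	unsigned int bpc;     /* precision of the LUT */
	bool interpolated;    /* false means direct lookup */
	bool extended_range;  /* can represent values outside 0.0-1.0 */
};

/* Hypothetical description of what the client currently needs. */
struct lut_request {
	unsigned int fb_bpc;
	bool want_direct;     /* e.g. X wanting directcolor visuals */
	bool need_extended;
};

static const struct lut_mode example_modes[] = {
	{ 8,  false, false },
	{ 10, true,  false },
	{ 24, true,  true  },
};

/* Higher is better; the weights are arbitrary illustration values. */
static int lut_mode_score(const struct lut_mode *m, const struct lut_request *req)
{
	int score = 0;

	/* Prefer bpc >= FB bpc, but penalise needlessly high precision. */
	if (m->bpc >= req->fb_bpc)
		score += 100 - (int)(m->bpc - req->fb_bpc);
	else
		score -= (int)(req->fb_bpc - m->bpc);

	/* Direct lookup when asked for it, interpolation otherwise. */
	if (m->interpolated != req->want_direct)
		score += 20;

	/* Extended range only counts when it is actually needed. */
	if (req->need_extended && m->extended_range)
		score += 40;

	return score;
}

/* Returns the index of the best mode, or -1 if the list is empty. */
static int lut_pick_mode(const struct lut_mode *modes, size_t n,
			 const struct lut_request *req)
{
	int best = -1, best_score = 0;

	assert(req != NULL);
	for (size_t i = 0; i < n; i++) {
		int s = lut_mode_score(&modes[i], req);

		if (best < 0 || s > best_score) {
			best = (int)i;
			best_score = s;
		}
	}

	return best;
}
```

A scoring approach like this keeps the policy in userspace, which matches the idea that the kernel only advertises descriptors and the client picks the one that fits its current use case.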

I think the segment range and distribution of lut should be enough for a 
userspace
to pick the right ones, but we can add some examples in UAPI 

Re: [Intel-gfx] [PATCH] drm/i915: Drop stealing of bits from i915_sw_fence function pointer

2021-11-11 Thread Teres Alexis, Alan Previn
Hey Matt, apologies for the delay, went thru all the code, LGTM.

Reviewed-by: Alan Previn 

P.S. - As a side note, would be interesting to replay the original reason
behind the overloading of the func ptr bits to begin with... to see what the
initial intention was.


...alan

On Wed, 2021-09-22 at 08:47 -0700, Matthew Brost wrote:
> Rather than stealing bits from i915_sw_fence function pointer use
> separate fields for function pointer and flags. If using two different
> fields, the 4 byte alignment for the i915_sw_fence function pointer can
> also be dropped.
> 
> v2:
>  (CI)
>   - Set new function field rather than flags in __i915_sw_fence_init
> v3:
>  (Tvrtko)
>   - Remove BUG_ON(!fence->flags) in reinit as that will now blow up
>   - Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is
> defined
> 
> Signed-off-by: Matthew Brost 
> Acked-by: Jani Nikula 
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
>  drivers/gpu/drm/i915/i915_request.c   |  4 +--
>  drivers/gpu/drm/i915/i915_sw_fence.c  | 28 +++
>  drivers/gpu/drm/i915/i915_sw_fence.h  | 23 +++
>  drivers/gpu/drm/i915/i915_sw_fence_work.c |  2 +-
>  .../gpu/drm/i915/selftests/i915_sw_fence.c|  2 +-
>  drivers/gpu/drm/i915/selftests/lib_sw_fence.c |  8 +++---
>  8 files changed, 39 insertions(+), 32 deletions(-)
> 
>  
> -- 
> 2.32.0
> 
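
As background for the change described in the quoted commit message, the following sketch contrasts the two layouts. It is not the i915 code: the names, the two-bit mask and the callback signature are assumptions, and the packed variant works on a plain address value so that the example does not depend on how a given compiler aligns functions.

```c
#include <assert.h>
#include <stdint.h>

/* Two low bits are free only when the callback is at least 4-byte aligned. */
#define SKETCH_FLAG_MASK ((uintptr_t)0x3)

/* Old scheme: one word carries both the callback address and the flags. */
static uintptr_t pack_fn_and_flags(uintptr_t fn_addr, unsigned int flags)
{
	assert((fn_addr & SKETCH_FLAG_MASK) == 0); /* alignment requirement */
	assert((flags & ~SKETCH_FLAG_MASK) == 0);  /* only two bits available */

	return fn_addr | flags;
}

static uintptr_t packed_fn(uintptr_t packed)
{
	return packed & ~SKETCH_FLAG_MASK;
}

static unsigned int packed_flags(uintptr_t packed)
{
	return (unsigned int)(packed & SKETCH_FLAG_MASK);
}

/* New scheme: separate fields, so the callback needs no special alignment. */
typedef int (*sketch_notify_t)(void *fence, int state);

struct sketch_fence {
	sketch_notify_t fn;
	unsigned long flags;
};

static int noop_notify(void *fence, int state)
{
	(void)fence;
	(void)state;
	return 0;
}

static void sketch_fence_init(struct sketch_fence *f, sketch_notify_t fn)
{
	f->fn = fn;
	f->flags = 0;
}
```

The packed form saves one word per fence at the cost of masking on every access and an alignment constraint on every callback; the split form trades that word for simpler code.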



Re: [Intel-gfx] [PATCH 1/1] drm/i915/rpm: Enable runtime pm autosuspend by default

2021-11-11 Thread Vivi, Rodrigo
On Thu, 2021-11-11 at 14:42 +0200, Ville Syrjälä wrote:
> On Wed, Nov 10, 2021 at 05:24:22PM -0500, Rodrigo Vivi wrote:
> > On Wed, Nov 10, 2021 at 01:46:46PM +0200, Ville Syrjälä wrote:
> > > On Wed, Nov 10, 2021 at 10:59:26AM +0530, Tilak Tangudu wrote:
> > > > Enable runtime pm autosuspend by default for gen12 and
> > > > later versions.
> > > > 
> > > > Signed-off-by: Tilak Tangudu 
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_runtime_pm.c | 4 
> > > >  1 file changed, 4 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > index eaf7688f517d..ef75f24288ef 100644
> > > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > @@ -600,6 +600,10 @@ void intel_runtime_pm_enable(struct
> > > > intel_runtime_pm *rpm)
> > > > pm_runtime_use_autosuspend(kdev);
> > > > }
> > > >  
> > > > +   /* XXX: Enable by default only for newer platforms for
> > > > now */
> > > > +   if (GRAPHICS_VER(i915) >= 12)
> > > > +   pm_runtime_allow(kdev);
> > > 
> > > If we change some default then we should just do it across the
> > > board.
> > > There is nothing special about tgl+.
> > 
> > Nothing special with tgl and newer platforms indeed. This is why we
> > have the XXX message here.
> > 
> > The problem in the last attempt was with the gen9 platforms.
> 
> What problem was that?

Unfortunately, it looks like the logs are not available anymore. :(

Tilak, could you please send this patch without the if? That way we can at
least make sure we spot the differences and see if there's something
quick that we can do about gen9, or if we should take this path of
gen12 first, then fix gen9, then enable everywhere.

> 
> > Apparently some special there, and I didn't want to block the
> > progress while we cannot get to the gen9 bugs.
> > 
> > > 
> > > > +
> > > > /*
> > > >  * The core calls the driver load handler with an RPM
> > > > reference held.
> > > >  * We drop that here and will reacquire it during
> > > > unloading in
> > > > -- 
> > > > 2.25.1
> > > 
> > > -- 
> > > Ville Syrjälä
> > > Intel
> 



[PATCH 2/2] drm/msm: Restore error return on invalid fence

2021-11-11 Thread Rob Clark
From: Rob Clark 

When converting to use an idr to map userspace fence seqno values back
to a dma_fence, we lost the error return when userspace passes seqno
that is larger than the last submitted fence.  Restore this check.

Reported-by: Akhil P Oommen 
Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark 
---
Note: I will rebase "drm/msm: Handle fence rollover" on top of this,
to simplify backporting this patch to stable kernels

 drivers/gpu/drm/msm/msm_drv.c| 6 ++
 drivers/gpu/drm/msm/msm_gem_submit.c | 1 +
 drivers/gpu/drm/msm/msm_gpu.h| 3 +++
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index cb14d997c174..56500eb5219e 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -967,6 +967,12 @@ static int wait_fence(struct msm_gpu_submitqueue *queue, 
uint32_t fence_id,
struct dma_fence *fence;
int ret;
 
+   if (fence_id > queue->last_fence) {
+   DRM_ERROR_RATELIMITED("waiting on invalid fence: %u (of %u)\n",
+ fence_id, queue->last_fence);
+   return -EINVAL;
+   }
+
/*
 * Map submitqueue scoped "seqno" (which is actually an idr key)
 * back to underlying dma-fence
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 151d19e4453c..a38f23be497d 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -911,6 +911,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
drm_sched_entity_push_job(>base, queue->entity);
 
args->fence = submit->fence_id;
+   queue->last_fence = submit->fence_id;
 
msm_reset_syncobjs(syncobjs_to_reset, args->nr_in_syncobjs);
msm_process_post_deps(post_deps, args->nr_out_syncobjs,
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index bd4e0024033e..e73a5bb03544 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -376,6 +376,8 @@ static inline int msm_gpu_convert_priority(struct msm_gpu 
*gpu, int prio,
  * @ring_nr:   the ringbuffer used by this submitqueue, which is determined
  * by the submitqueue's priority
  * @faults:the number of GPU hangs associated with this submitqueue
+ * @last_fence: the sequence number of the last allocated fence (for error
+ * checking)
  * @ctx:   the per-drm_file context associated with the submitqueue (ie.
  * which set of pgtables do submits jobs associated with the
  * submitqueue use)
@@ -391,6 +393,7 @@ struct msm_gpu_submitqueue {
u32 flags;
u32 ring_nr;
int faults;
+   uint32_t last_fence;
struct msm_file_private *ctx;
struct list_head node;
struct idr fence_idr;
-- 
2.31.1



[PATCH 1/2] drm/msm: Fix wait_fence submitqueue leak

2021-11-11 Thread Rob Clark
From: Rob Clark 

We weren't dropping the submitqueue reference in all paths.  In
particular, when the fence has already been signalled. Split out
a helper to simplify handling this in the various different return
paths.

Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_drv.c | 49 +--
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 73e827641024..cb14d997c174 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -961,29 +961,12 @@ static int msm_ioctl_gem_info(struct drm_device *dev, 
void *data,
return ret;
 }
 
-static int msm_ioctl_wait_fence(struct drm_device *dev, void *data,
-   struct drm_file *file)
+static int wait_fence(struct msm_gpu_submitqueue *queue, uint32_t fence_id,
+ ktime_t timeout)
 {
-   struct msm_drm_private *priv = dev->dev_private;
-   struct drm_msm_wait_fence *args = data;
-   ktime_t timeout = to_ktime(args->timeout);
-   struct msm_gpu_submitqueue *queue;
-   struct msm_gpu *gpu = priv->gpu;
struct dma_fence *fence;
int ret;
 
-   if (args->pad) {
-   DRM_ERROR("invalid pad: %08x\n", args->pad);
-   return -EINVAL;
-   }
-
-   if (!gpu)
-   return 0;
-
-   queue = msm_submitqueue_get(file->driver_priv, args->queueid);
-   if (!queue)
-   return -ENOENT;
-
/*
 * Map submitqueue scoped "seqno" (which is actually an idr key)
 * back to underlying dma-fence
@@ -995,7 +978,7 @@ static int msm_ioctl_wait_fence(struct drm_device *dev, 
void *data,
ret = mutex_lock_interruptible(>lock);
if (ret)
return ret;
-   fence = idr_find(>fence_idr, args->fence);
+   fence = idr_find(>fence_idr, fence_id);
if (fence)
fence = dma_fence_get_rcu(fence);
mutex_unlock(>lock);
@@ -1011,6 +994,32 @@ static int msm_ioctl_wait_fence(struct drm_device *dev, 
void *data,
}
 
dma_fence_put(fence);
+
+   return ret;
+}
+
+static int msm_ioctl_wait_fence(struct drm_device *dev, void *data,
+   struct drm_file *file)
+{
+   struct msm_drm_private *priv = dev->dev_private;
+   struct drm_msm_wait_fence *args = data;
+   struct msm_gpu_submitqueue *queue;
+   int ret;
+
+   if (args->pad) {
+   DRM_ERROR("invalid pad: %08x\n", args->pad);
+   return -EINVAL;
+   }
+
+   if (!priv->gpu)
+   return 0;
+
+   queue = msm_submitqueue_get(file->driver_priv, args->queueid);
+   if (!queue)
+   return -ENOENT;
+
+   ret = wait_fence(queue, args->fence, to_ktime(args->timeout));
+
msm_submitqueue_put(queue);
 
return ret;
-- 
2.31.1
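
The shape of this fix, moving the early returns into a helper so that the caller has a single get/put pair, can be sketched outside the kernel. All names below are hypothetical, the reference count is a plain integer, and the wait itself is reduced to an enum describing what the lookup found.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical refcounted object standing in for the submitqueue. */
struct refcounted_queue {
	int refcount;
};

enum fence_state { FENCE_NOT_FOUND, FENCE_SIGNALLED, FENCE_TIMED_OUT };

static void queue_get(struct refcounted_queue *q) { q->refcount++; }
static void queue_put(struct refcounted_queue *q) { q->refcount--; }

/* Helper with several return paths; it never touches the refcount. */
static int wait_fence_sketch(struct refcounted_queue *q, enum fence_state st)
{
	(void)q;

	if (st == FENCE_NOT_FOUND)
		return 0;          /* treated as already retired */
	if (st == FENCE_SIGNALLED)
		return 0;
	return -ETIMEDOUT;
}

/* Caller: one get, one put, regardless of which path the helper took. */
static int ioctl_wait_fence_sketch(struct refcounted_queue *q, enum fence_state st)
{
	int ret;

	queue_get(q);
	ret = wait_fence_sketch(q, st);
	queue_put(q);

	return ret;
}
```

Because the helper owns no references, adding another early return to it later cannot reintroduce the leak.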



[PATCH 0/2] drm/msm: wait_fence fixes

2021-11-11 Thread Rob Clark
From: Rob Clark 

A couple of wait_fence related fixes.

Rob Clark (2):
  drm/msm: Fix wait_fence submitqueue leak
  drm/msm: Restore error return on invalid fence

 drivers/gpu/drm/msm/msm_drv.c| 49 ++--
 drivers/gpu/drm/msm/msm_gem_submit.c |  1 +
 drivers/gpu/drm/msm/msm_gpu.h|  3 ++
 3 files changed, 36 insertions(+), 17 deletions(-)

-- 
2.31.1



Re: [PATCH] drm/v3d: pass null pointers using NULL

2021-11-11 Thread Melissa Wen
On 11/10, Colin Ian King wrote:
> There are a couple of calls that are passing null pointers as
> integer zeros rather than NULL. Fix this by using NULL instead.
> 
> Fixes: 07c2a41658c4 ("drm/v3d: alloc and init job in one shot")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/v3d/v3d_gem.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index e47ae40a865a..c7ed2e1cbab6 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -774,7 +774,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>  
>   if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
>   ret = v3d_job_init(v3d, file_priv, (void *)_job, 
> sizeof(*clean_job),
> -v3d_job_free, 0, 0, V3D_CACHE_CLEAN);
> +v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
>   if (ret)
>   goto fail;
>  
> @@ -1007,7 +1007,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
>   goto fail;
>  
>   ret = v3d_job_init(v3d, file_priv, (void *)_job, 
> sizeof(*clean_job),
> -v3d_job_free, 0, 0, V3D_CACHE_CLEAN);
> +v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
>   if (ret)
>   goto fail;

Hi Colin,

This fix has been already done:
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=75ad021f21927311b8d454939eb248a50df92525

Thanks, anyway.

Melissa
>  
> -- 
> 2.32.0
> 




Re: [PATCH 3/3] drm/i915/dg2: Program recommended HW settings

2021-11-11 Thread Clint Taylor

Reviewed-by: Clint Taylor 

-Clint


On 11/2/21 3:25 PM, Matt Roper wrote:

The bspec's performance guide suggests programming specific values into
a few registers for optimal performance.  Although these aren't
workarounds, it's easiest to handle them inside the GT workaround
functions (which will also ensure that the values set here are properly
melded with other bits in the same registers that _are_ set by
workarounds).

Bspec: 68331, 45395

Cc: Matt Atwood 
Cc: Lucas De Marchi 
Cc: Siddiqui Ayaz A 
Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/gt/intel_workarounds.c | 26 -
  drivers/gpu/drm/i915/i915_reg.h |  9 +++
  2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 37fd541a9719..51591119da15 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -558,6 +558,22 @@ static void icl_ctx_workarounds_init(struct 
intel_engine_cs *engine,
wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
  }
  
+/*

+ * These settings aren't actually workarounds, but general tuning settings that
+ * need to be programmed on dg2 platform.
+ */
+static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
+  struct i915_wa_list *wal)
+{
+   wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
+   wa_add(wal,
+  FF_MODE2,
+  FF_MODE2_TDS_TIMER_MASK,
+  FF_MODE2_TDS_TIMER_128,
+  0, false);
+}
+
  /*
   * These settings aren't actually workarounds, but general tuning settings 
that
   * need to be programmed on several platforms.
@@ -647,7 +663,7 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs 
*engine,
  static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 struct i915_wa_list *wal)
  {
-   gen12_ctx_gt_tuning_init(engine, wal);
+   dg2_ctx_gt_tuning_init(engine, wal);
  
  	/* Wa_16011186671:dg2_g11 */

if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
@@ -1482,6 +1498,14 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct 
i915_wa_list *wal)
  
  	/* Wa_14014830051:dg2 */

wa_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
+
+   /*
+* The following are not actually "workarounds" but rather
+* recommended tuning settings documented in the bspec's
+* performance guide section.
+*/
+   wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
+   wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
  }
  
  static void

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ee39d6bd0f3c..ef3b5732faad 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -731,6 +731,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
  
  #define GEN12_OA_TLB_INV_CR _MMIO(0xceec)
  
+#define GEN12_SQCM		_MMIO(0x8724)

+#define   EN_32B_ACCESSREG_BIT(30)
+
  /* Gen12 OAR unit */
  #define GEN12_OAR_OACONTROL _MMIO(0x2960)
  #define  GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT 1
@@ -8506,6 +8509,12 @@ enum {
  #define  GEN8_LQSC_FLUSH_COHERENT_LINES   (1 << 21)
  #define  GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22)
  
+#define GEN11_L3SQCREG5_MMIO(0xb158)

+#define   L3_PWM_TIMER_INIT_VAL_MASK   REG_GENMASK(9, 0)
+
+#define XEHP_L3SCQREG7 _MMIO(0xb188)
+#define   BLEND_FILL_CACHING_OPT_DIS   REG_BIT(3)
+
  /* GEN8 chicken */
  #define HDC_CHICKEN0  _MMIO(0x7300)
  #define ICL_HDC_MODE  _MMIO(0xE5F4)
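
As a rough illustration of what the L3_PWM_TIMER_INIT_VAL_MASK programming above amounts to, here is a minimal userspace sketch of a "clear field, then set value" update. It assumes wa_write_clr_set() has read-modify-write semantics of that form; DEMO_GENMASK(), demo_field_prep() and demo_clr_set() are hypothetical stand-ins written for this note, not the i915 macros or helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for REG_GENMASK(): bits high..low set, inclusive. */
#define DEMO_GENMASK(high, low) \
	((uint32_t)((~UINT32_C(0) >> (31 - (high))) & (~UINT32_C(0) << (low))))

/* Hypothetical stand-in for REG_FIELD_PREP(): shift val into the field
 * described by mask and drop anything that does not fit. */
static inline uint32_t demo_field_prep(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	if (!mask)
		return 0;
	while (!((mask >> shift) & 1u))
		shift++;
	return (val << shift) & mask;
}

/* Assumed semantics of a clear-then-set register update: bits in clr are
 * zeroed first, then bits in set are ORed in; all other bits are kept. */
static inline uint32_t demo_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}

/* Self-check with made-up register contents. */
static void demo_clr_set_selfcheck(void)
{
	const uint32_t mask = DEMO_GENMASK(9, 0);

	assert(mask == 0x3ffu);
	assert(demo_field_prep(mask, 0x7f) == 0x7fu);
	/* Bits 31:10 of the old value survive, bits 9:0 become 0x7f. */
	assert(demo_clr_set(0xdeadbeefu, mask, demo_field_prep(mask, 0x7f)) ==
	       0xdeadbc7fu);
}
```

The point of routing tuning values through the same list as the workarounds is visible here: only the masked field changes, so other bits in the register that are set by real workarounds are preserved.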


Re: [Freedreno] [PATCH v4 07/13] drm/msm: Track "seqno" fences by idr

2021-11-11 Thread Rob Clark
On Thu, Nov 11, 2021 at 7:54 AM Akhil P Oommen  wrote:
>
> On 11/10/2021 10:25 PM, Rob Clark wrote:
> > On Wed, Nov 10, 2021 at 7:28 AM Akhil P Oommen  
> > wrote:
> >>
> >> On 7/28/2021 6:36 AM, Rob Clark wrote:
> >>> From: Rob Clark 
> >>>
> >>> Previously the (non-fd) fence returned from submit ioctl was a raw
> >>> seqno, which is scoped to the ring.  But from UABI standpoint, the
> >>> ioctls related to seqno fences all specify a submitqueue.  We can
> >>> take advantage of that to replace the seqno fences with a cyclic idr
> >>> handle.
> >>>
> >>> This is in preparation for moving to drm scheduler, at which point
> >>> the submit ioctl will return after queuing the submit job to the
> >>> scheduler, but before the submit is written into the ring (and
> >>> therefore before a ring seqno has been assigned).  Which means we
> >>> need to replace the dma_fence that userspace may need to wait on
> >>> with a scheduler fence.
> >>>
> >>> Signed-off-by: Rob Clark 
> >>> Acked-by: Christian König 
> >>> ---
> >>>drivers/gpu/drm/msm/msm_drv.c | 30 +--
> >>>drivers/gpu/drm/msm/msm_fence.c   | 42 ---
> >>>drivers/gpu/drm/msm/msm_fence.h   |  3 --
> >>>drivers/gpu/drm/msm/msm_gem.h |  1 +
> >>>drivers/gpu/drm/msm/msm_gem_submit.c  | 23 ++-
> >>>drivers/gpu/drm/msm/msm_gpu.h |  5 
> >>>drivers/gpu/drm/msm/msm_submitqueue.c |  5 
> >>>7 files changed, 61 insertions(+), 48 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> >>> index 9b8fa2ad0d84..1594ae39d54f 100644
> >>> --- a/drivers/gpu/drm/msm/msm_drv.c
> >>> +++ b/drivers/gpu/drm/msm/msm_drv.c
> >>> @@ -911,6 +911,7 @@ static int msm_ioctl_wait_fence(struct drm_device 
> >>> *dev, void *data,
> >>>ktime_t timeout = to_ktime(args->timeout);
> >>>struct msm_gpu_submitqueue *queue;
> >>>struct msm_gpu *gpu = priv->gpu;
> >>> + struct dma_fence *fence;
> >>>int ret;
> >>>
> >>>if (args->pad) {
> >>> @@ -925,10 +926,35 @@ static int msm_ioctl_wait_fence(struct drm_device 
> >>> *dev, void *data,
> >>>if (!queue)
> >>>return -ENOENT;
> >>>
> >>> - ret = msm_wait_fence(gpu->rb[queue->prio]->fctx, args->fence, 
> >>> ,
> >>> - true);
> >>> + /*
> >>> +  * Map submitqueue scoped "seqno" (which is actually an idr key)
> >>> +  * back to underlying dma-fence
> >>> +  *
> >>> +  * The fence is removed from the fence_idr when the submit is
> >>> +  * retired, so if the fence is not found it means there is nothing
> >>> +  * to wait for
> >>> +  */
> >>> + ret = mutex_lock_interruptible(>lock);
> >>> + if (ret)
> >>> + return ret;
> >>> + fence = idr_find(>fence_idr, args->fence);
> >>> + if (fence)
> >>> + fence = dma_fence_get_rcu(fence);
> >>> + mutex_unlock(>lock);
> >>> +
> >>> + if (!fence)
> >>> + return 0;
> >>>
> >>> + ret = dma_fence_wait_timeout(fence, true, 
> >>> timeout_to_jiffies());
> >>> + if (ret == 0) {
> >>> + ret = -ETIMEDOUT;
> >>> + } else if (ret != -ERESTARTSYS) {
> >>> + ret = 0;
> >>> + }
> >>> +
> >>> + dma_fence_put(fence);
> >>>msm_submitqueue_put(queue);
> >>> +
> >>>return ret;
> >>>}
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/msm_fence.c 
> >>> b/drivers/gpu/drm/msm/msm_fence.c
> >>> index b92a9091a1e2..f2cece542c3f 100644
> >>> --- a/drivers/gpu/drm/msm/msm_fence.c
> >>> +++ b/drivers/gpu/drm/msm/msm_fence.c
> >>> @@ -24,7 +24,6 @@ msm_fence_context_alloc(struct drm_device *dev, 
> >>> volatile uint32_t *fenceptr,
> >>>strncpy(fctx->name, name, sizeof(fctx->name));
> >>>fctx->context = dma_fence_context_alloc(1);
> >>>fctx->fenceptr = fenceptr;
> >>> - init_waitqueue_head(>event);
> >>>spin_lock_init(>spinlock);
> >>>
> >>>return fctx;
> >>> @@ -45,53 +44,12 @@ static inline bool fence_completed(struct 
> >>> msm_fence_context *fctx, uint32_t fenc
> >>>(int32_t)(*fctx->fenceptr - fence) >= 0;
> >>>}
> >>>
> >>> -/* legacy path for WAIT_FENCE ioctl: */
> >>> -int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence,
> >>> - ktime_t *timeout, bool interruptible)
> >>> -{
> >>> - int ret;
> >>> -
> >>> - if (fence > fctx->last_fence) {
> >>> - DRM_ERROR_RATELIMITED("%s: waiting on invalid fence: %u (of 
> >>> %u)\n",
> >>> - fctx->name, fence, fctx->last_fence);
> >>> - return -EINVAL;
> >>
> >> Rob, we changed this pre-existing behaviour in this patch. Now, when
> >> userspace tries to wait on a future fence, we don't return an error.
> >>
> >> I just want to check if this was accidental or not?
> >
> > Hmm, perhaps we should do this to restore the previous behavior:
> >
> > -

Re: [PATCH 5/5] drm/msm: Add debugfs to disable hw err handling

2021-11-11 Thread Akhil P Oommen

On 11/9/2021 11:41 PM, Rob Clark wrote:

From: Rob Clark 

Add a debugfs interface to ignore hw error irqs, in order to force
fallback to sw hangcheck mechanism.  Because the hw error detection is
pretty good on newer gens, we need this for igt tests to test the sw
hang detection.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 ++
  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
  drivers/gpu/drm/msm/msm_debugfs.c | 3 +++
  drivers/gpu/drm/msm/msm_drv.h | 9 +
  4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 6163990a4d09..ec8e043c9d38 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1252,6 +1252,7 @@ static void a5xx_fault_detect_irq(struct msm_gpu *gpu)
  
  static irqreturn_t a5xx_irq(struct msm_gpu *gpu)

  {
+   struct msm_drm_private *priv = gpu->dev->dev_private;
u32 status = gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS);
  
  	/*

@@ -1261,6 +1262,11 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
gpu_write(gpu, REG_A5XX_RBBM_INT_CLEAR_CMD,
status & ~A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR);
  
+	if (priv->disable_err_irq) {

+   status &= A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS |
+ A5XX_RBBM_INT_0_MASK_CP_SW;
+   }
+
/* Pass status to a5xx_rbbm_err_irq because we've already cleared it */
if (status & RBBM_ERROR_MASK)
a5xx_rbbm_err_irq(gpu, status);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3d2da81cb2c9..8a2af3a27e33 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1373,10 +1373,14 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
  
  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)

  {
+   struct msm_drm_private *priv = gpu->dev->dev_private;
u32 status = gpu_read(gpu, REG_A6XX_RBBM_INT_0_STATUS);
  
  	gpu_write(gpu, REG_A6XX_RBBM_INT_CLEAR_CMD, status);
  
+	if (priv->disable_err_irq)

+   status &= A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS;
+
if (status & A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT)
a6xx_fault_detect_irq(gpu);
  
diff --git a/drivers/gpu/drm/msm/msm_debugfs.c b/drivers/gpu/drm/msm/msm_debugfs.c

index 6a99e8b5d25d..956b1efc3721 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -242,6 +242,9 @@ void msm_debugfs_init(struct drm_minor *minor)
debugfs_create_u32("hangcheck_period_ms", 0600, minor->debugfs_root,
>hangcheck_period);
  
+	debugfs_create_bool("disable_err_irq", 0600, minor->debugfs_root,

+   >disable_err_irq);
+
debugfs_create_file("shrink", S_IRWXU, minor->debugfs_root,
dev, _fops);
  
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h

index 2943c21d9aac..a8da7a7efb84 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -246,6 +246,15 @@ struct msm_drm_private {
  
  	/* For hang detection, in ms */

unsigned int hangcheck_period;
+
+   /**
+* disable_err_irq:
+*
+* Disable handling of GPU hw error interrupts, to force fallback to
+* sw hangcheck timer.  Written (via debugfs) by igt tests to test
+* the sw hangcheck mechanism.
+*/
+   bool disable_err_irq;
  };
  
  struct msm_format {




Reviewed-by: Akhil P Oommen 

-Akhil.
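
For readers skimming the thread, the effect of the new flag can be sketched in isolation as below. The bit positions and the demo_* names are hypothetical and chosen only for illustration; the real masks live in the a5xx/a6xx register headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt status bits, not the real A5XX/A6XX layout. */
#define DEMO_INT_HANG_DETECT	(UINT32_C(1) << 0)
#define DEMO_INT_AHB_ERROR	(UINT32_C(1) << 1)
#define DEMO_INT_CACHE_FLUSH_TS	(UINT32_C(1) << 2)

/*
 * When error handling is disabled, keep only the "work completed" bit so
 * that retire processing still runs, and drop every error bit before the
 * handler dispatches on it. With hw error reporting filtered out, a hung
 * GPU can then only be noticed by the sw hangcheck timer.
 */
static inline uint32_t demo_filter_irq_status(uint32_t status,
					      bool disable_err_irq)
{
	if (disable_err_irq)
		status &= DEMO_INT_CACHE_FLUSH_TS;
	return status;
}
```

Note that in the quoted patch the status is still written to the interrupt clear register before the filtering, so ignored error interrupts are acknowledged rather than left pending.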


Re: [PATCH 4/5] drm/msm: Handle fence rollover

2021-11-11 Thread Akhil P Oommen

On 11/9/2021 11:41 PM, Rob Clark wrote:

From: Rob Clark 

Add some helpers for fence comparison, which handle rollover properly,
and stop open coding fence seqno comparisons.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/msm_fence.h | 12 
  drivers/gpu/drm/msm/msm_gpu.c   |  6 +++---
  drivers/gpu/drm/msm/msm_gpu.h   |  2 +-
  3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 4783db528bcc..17ee3822b423 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -60,4 +60,16 @@ void msm_update_fence(struct msm_fence_context *fctx, 
uint32_t fence);
  
  struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx);
  
+static inline bool

+fence_before(uint32_t a, uint32_t b)
+{
+   return (int32_t)(a - b) < 0;


This is good enough when a and b are close in value, which is a
reasonable assumption for KMD-generated seqnos.


Reviewed-by: Akhil P Oommen 

-Akhil.


+}
+
+static inline bool
+fence_after(uint32_t a, uint32_t b)
+{
+   return (int32_t)(a - b) > 0;
+}
+
  #endif
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 13de1241d595..0f78c2615272 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -172,7 +172,7 @@ static void update_fences(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
  
  	spin_lock_irqsave(>submit_lock, flags);

list_for_each_entry(submit, >submits, node) {
-   if (submit->seqno > fence)
+   if (fence_after(submit->seqno, fence))
break;
  
  		msm_update_fence(submit->ring->fctx,

@@ -509,7 +509,7 @@ static void hangcheck_handler(struct timer_list *t)
if (fence != ring->hangcheck_fence) {
/* some progress has been made.. ya! */
ring->hangcheck_fence = fence;
-   } else if (fence < ring->seqno) {
+   } else if (fence_before(fence, ring->seqno)) {
/* no progress and not done.. hung! */
ring->hangcheck_fence = fence;
DRM_DEV_ERROR(dev->dev, "%s: hangcheck detected gpu lockup rb 
%d!\n",
@@ -523,7 +523,7 @@ static void hangcheck_handler(struct timer_list *t)
}
  
  	/* if still more pending work, reset the hangcheck timer: */

-   if (ring->seqno > ring->hangcheck_fence)
+   if (fence_after(ring->seqno, ring->hangcheck_fence))
hangcheck_timer_reset(gpu);
  
  	/* workaround for missing irq: */

diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 0dcc31c27ac3..bd4e0024033e 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -258,7 +258,7 @@ static inline bool msm_gpu_active(struct msm_gpu *gpu)
for (i = 0; i < gpu->nr_rings; i++) {
struct msm_ringbuffer *ring = gpu->rb[i];
  
-		if (ring->seqno > ring->memptrs->fence)

+   if (fence_after(ring->seqno, ring->memptrs->fence))
return true;
}
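
To make the review comment about "close values" concrete, here is a small standalone sketch of the same comparison outside the kernel. The two helpers mirror the ones in the quoted patch; fence_rollover_selfcheck() is a hypothetical addition for this note. It assumes the usual two's-complement conversion from uint32_t to int32_t, which is what gcc and clang implement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare two 32-bit sequence numbers by the sign of their wrapped
 * difference, as in the quoted patch. */
static inline bool fence_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static inline bool fence_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Hypothetical self-check, not part of the patch. */
static void fence_rollover_selfcheck(void)
{
	/* Plain ordering, no wrap involved. */
	assert(fence_before(1, 2));
	assert(fence_after(2, 1));

	/* Across the wrap: seqno 1 was issued after 0xfffffffe, even
	 * though the open-coded '>' comparison says the opposite. */
	assert(fence_after(0x00000001u, 0xfffffffeu));
	assert(!(0x00000001u > 0xfffffffeu));

	/* Limit of the scheme: values exactly 2^31 apart compare as
	 * "before" in both directions, so the result is only meaningful
	 * when the two seqnos are less than 2^31 apart. */
	assert(fence_before(0x80000000u, 0));
	assert(fence_before(0, 0x80000000u));
}
```

Since the kernel assigns seqnos one at a time per ring, two values being compared should in practice be far closer than 2^31, which is the assumption the review comment refers to.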
  





Re: [Intel-gfx] [PATCH] drm/i915/execlists: Weak parallel submission support for execlists

2021-11-11 Thread Matthew Brost
On Mon, Nov 01, 2021 at 10:35:09AM +, Tvrtko Ursulin wrote:
> 
> On 27/10/2021 21:10, Matthew Brost wrote:
> > On Wed, Oct 27, 2021 at 01:04:49PM -0700, John Harrison wrote:
> > > On 10/27/2021 12:17, Matthew Brost wrote:
> > > > On Tue, Oct 26, 2021 at 02:58:00PM -0700, John Harrison wrote:
> > > > > On 10/20/2021 14:47, Matthew Brost wrote:
> > > > > > A weak implementation of parallel submission (multi-bb execbuf 
> > > > > > IOCTL) for
> > > > > > execlists. Doing as little as possible to support this interface for
> > > > > > execlists - basically just passing submit fences between each 
> > > > > > request
> > > > > > generated and virtual engines are not allowed. This is on par with 
> > > > > > what
> > > > > > is there for the existing (hopefully soon deprecated) bonding 
> > > > > > interface.
> > > > > > 
> > > > > > We perma-pin these execlists contexts to align with GuC 
> > > > > > implementation.
> > > > > > 
> > > > > > v2:
> > > > > > (John Harrison)
> > > > > >  - Drop siblings array as num_siblings must be 1
> > > > > > 
> > > > > > Signed-off-by: Matthew Brost 
> > > > > > ---
> > > > > > drivers/gpu/drm/i915/gem/i915_gem_context.c   | 10 +++--
> > > > > > drivers/gpu/drm/i915/gt/intel_context.c   |  4 +-
> > > > > > .../drm/i915/gt/intel_execlists_submission.c  | 44 
> > > > > > ++-
> > > > > > drivers/gpu/drm/i915/gt/intel_lrc.c   |  2 +
> > > > > > .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 -
> > > > > > 5 files changed, 52 insertions(+), 10 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > > > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > index fb33d0322960..35e87a7d0ea9 100644
> > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > @@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
> > > > > > i915_user_extension __user *base,
> > > > > > struct intel_engine_cs **siblings = NULL;
> > > > > > intel_engine_mask_t prev_mask;
> > > > > > -   /* FIXME: This is NIY for execlists */
> > > > > > -   if (!(intel_uc_uses_guc_submission(>gt.uc)))
> > > > > > -   return -ENODEV;
> > > > > > -
> > > > > > if (get_user(slot, >engine_index))
> > > > > > return -EFAULT;
> > > > > > @@ -583,6 +579,12 @@ set_proto_ctx_engines_parallel_submit(struct 
> > > > > > i915_user_extension __user *base,
> > > > > > if (get_user(num_siblings, >num_siblings))
> > > > > > return -EFAULT;
> > > > > > +   if (!intel_uc_uses_guc_submission(>gt.uc) && num_siblings 
> > > > > > != 1) {
> > > > > > +   drm_dbg(>drm, "Only 1 sibling (%d) supported in 
> > > > > > non-GuC mode\n",
> > > > > > +   num_siblings);
> > > > > > +   return -EINVAL;
> > > > > > +   }
> > > > > > +
> > > > > > if (slot >= set->num_engines) {
> > > > > > drm_dbg(>drm, "Invalid placement value, 
> > > > > > %d >= %d\n",
> > > > > > slot, set->num_engines);
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > > > > > b/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > index 5634d14052bc..1bec92e1d8e6 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > @@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct 
> > > > > > intel_context *ce)
> > > > > > __i915_active_acquire(>active);
> > > > > > -   if (intel_context_is_barrier(ce) || 
> > > > > > intel_engine_uses_guc(ce->engine))
> > > > > > +   if (intel_context_is_barrier(ce) || 
> > > > > > intel_engine_uses_guc(ce->engine) ||
> > > > > > +   intel_context_is_parallel(ce))
> > > > > > return 0;
> > > > > > /* Preallocate tracking nodes */
> > > > > > @@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct 
> > > > > > intel_context *parent,
> > > > > >  * Callers responsibility to validate that this 
> > > > > > function is used
> > > > > >  * correctly but we use GEM_BUG_ON here ensure that 
> > > > > > they do.
> > > > > >  */
> > > > > > -   GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
> > > > > > GEM_BUG_ON(intel_context_is_pinned(parent));
> > > > > > GEM_BUG_ON(intel_context_is_child(parent));
> > > > > > GEM_BUG_ON(intel_context_is_pinned(child));
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > > > > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > index bedb80057046..2865b422300d 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > @@ -927,8 +927,7 @@ static void 

Re: [PATCH 2/5] drm/msm: Drop priv->lastctx

2021-11-11 Thread Akhil P Oommen

On 11/9/2021 11:41 PM, Rob Clark wrote:

From: Rob Clark 

cur_ctx_seqno already does the same thing, but handles the edge cases
where a refcnt'd context can live after lastclose.  So let's not have
two ways to do the same thing.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/adreno/a2xx_gpu.c |  3 +--
  drivers/gpu/drm/msm/adreno/a3xx_gpu.c |  3 +--
  drivers/gpu/drm/msm/adreno/a4xx_gpu.c |  3 +--
  drivers/gpu/drm/msm/adreno/a5xx_gpu.c |  8 +++-
  drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  9 +++--
  drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 10 --
  drivers/gpu/drm/msm/msm_drv.c |  6 --
  drivers/gpu/drm/msm/msm_drv.h |  2 +-
  drivers/gpu/drm/msm/msm_gpu.c |  2 +-
  drivers/gpu/drm/msm/msm_gpu.h | 11 +++
  10 files changed, 22 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
index bdc989183c64..22e8295a5e2b 100644
--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
@@ -12,7 +12,6 @@ static bool a2xx_idle(struct msm_gpu *gpu);
  
  static void a2xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

  {
-   struct msm_drm_private *priv = gpu->dev->dev_private;
struct msm_ringbuffer *ring = submit->ring;
unsigned int i;
  
@@ -23,7 +22,7 @@ static void a2xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
/* ignore if there has not been a ctx switch: */
-   if (priv->lastctx == submit->queue->ctx)
+   if (gpu->cur_ctx_seqno == submit->queue->ctx->seqno)
break;
fallthrough;
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 8fb847c174ff..2e481e2692ba 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -30,7 +30,6 @@ static bool a3xx_idle(struct msm_gpu *gpu);
  
  static void a3xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

  {
-   struct msm_drm_private *priv = gpu->dev->dev_private;
struct msm_ringbuffer *ring = submit->ring;
unsigned int i;
  
@@ -41,7 +40,7 @@ static void a3xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
/* ignore if there has not been a ctx switch: */
-   if (priv->lastctx == submit->queue->ctx)
+   if (gpu->cur_ctx_seqno == submit->queue->ctx->seqno)
break;
fallthrough;
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index a96ee79cc5e0..c5524d6e8705 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -24,7 +24,6 @@ static bool a4xx_idle(struct msm_gpu *gpu);
  
  static void a4xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

  {
-   struct msm_drm_private *priv = gpu->dev->dev_private;
struct msm_ringbuffer *ring = submit->ring;
unsigned int i;
  
@@ -35,7 +34,7 @@ static void a4xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
/* ignore if there has not been a ctx switch: */
-   if (priv->lastctx == submit->queue->ctx)
+   if (gpu->cur_ctx_seqno == submit->queue->ctx->seqno)
break;
fallthrough;
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 5e2750eb3810..6163990a4d09 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -65,7 +65,6 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer 
*ring,
  
  static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit)

  {
-   struct msm_drm_private *priv = gpu->dev->dev_private;
struct msm_ringbuffer *ring = submit->ring;
struct msm_gem_object *obj;
uint32_t *ptr, dwords;
@@ -76,7 +75,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
msm_gem_submit *submit
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == submit->queue->ctx)
+   if (gpu->cur_ctx_seqno == submit->queue->ctx->seqno)
break;
fallthrough;
case MSM_SUBMIT_CMD_BUF:
@@ -126,12 +125,11 @@ static 

Re: [PATCH v4] drm/ttm: Clarify that the TTM_PL_SYSTEM is under TTMs control

2021-11-11 Thread Zack Rusin
On Wed, 2021-11-10 at 09:50 -0500, Zack Rusin wrote:
> TTM takes full control over TTM_PL_SYSTEM placed buffers. This makes
> driver internal usage of TTM_PL_SYSTEM prone to errors because it
> requires the drivers to manually handle all interactions between the
> driver and TTM, which can swap out those buffers whenever it thinks
> it's the right thing to do.
> 
> CPU buffers which need to be fenced and shared with accelerators should
> be placed in driver specific placements that can explicitly handle
> CPU/accelerator buffer fencing.
> Currently, apart from things silently failing, nothing is enforcing
> that requirement, which means that it's easy for drivers and new
> developers to get this wrong. To avoid the confusion we can document
> this requirement and clarify the solution.
> 
> This came up during a discussion on dri-devel:
> https://lore.kernel.org/dri-devel/232f45e9-8748-1243-09bf-56763e666...@amd.com


Polite and gentle ping on that one. Are we ok with the wording here?

z


Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct for HDR planes

2021-11-11 Thread Ville Syrjälä
On Thu, Nov 11, 2021 at 10:17:17AM -0500, Harry Wentland wrote:
> 
> 
> On 2021-09-06 17:38, Uma Shankar wrote:
> > Define the structure with XE_LPD degamma lut ranges. HDR and SDR
> > planes have different capabilities, so implement the respective
> > structure for the HDR planes.
> > 
> > Signed-off-by: Uma Shankar 
> > ---
> >  drivers/gpu/drm/i915/display/intel_color.c | 52 ++
> >  1 file changed, 52 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
> > b/drivers/gpu/drm/i915/display/intel_color.c
> > index afcb4bf3826c..6403bd74324b 100644
> > --- a/drivers/gpu/drm/i915/display/intel_color.c
> > +++ b/drivers/gpu/drm/i915/display/intel_color.c
> > @@ -2092,6 +2092,58 @@ static void icl_read_luts(struct intel_crtc_state 
> > *crtc_state)
> > }
> >  }
> >  
> > + /* FIXME input bpc? */
> > +__maybe_unused
> > +static const struct drm_color_lut_range d13_degamma_hdr[] = {
> > +   /* segment 1 */
> > +   {
> > +   .flags = (DRM_MODE_LUT_GAMMA |
> > + DRM_MODE_LUT_REFLECT_NEGATIVE |
> > + DRM_MODE_LUT_INTERPOLATE |
> > + DRM_MODE_LUT_NON_DECREASING),
> > +   .count = 128,
> > +   .input_bpc = 24, .output_bpc = 16,
> > +   .start = 0, .end = (1 << 24) - 1,
> > +   .min = 0, .max = (1 << 24) - 1,
> > +   },
> > +   /* segment 2 */
> > +   {
> > +   .flags = (DRM_MODE_LUT_GAMMA |
> > + DRM_MODE_LUT_REFLECT_NEGATIVE |
> > + DRM_MODE_LUT_INTERPOLATE |
> > + DRM_MODE_LUT_REUSE_LAST |
> > + DRM_MODE_LUT_NON_DECREASING),
> > +   .count = 1,
> > +   .input_bpc = 24, .output_bpc = 16,
> > +   .start = (1 << 24) - 1, .end = 1 << 24,
> > +   .min = 0, .max = (1 << 27) - 1,
> > +   },
> > +   /* Segment 3 */
> > +   {
> > +   .flags = (DRM_MODE_LUT_GAMMA |
> > + DRM_MODE_LUT_REFLECT_NEGATIVE |
> > + DRM_MODE_LUT_INTERPOLATE |
> > + DRM_MODE_LUT_REUSE_LAST |
> > + DRM_MODE_LUT_NON_DECREASING),
> > +   .count = 1,
> > +   .input_bpc = 24, .output_bpc = 16,
> > +   .start = 1 << 24, .end = 3 << 24,
> > +   .min = 0, .max = (1 << 27) - 1,
> > +   },
> > +   /* Segment 4 */
> > +   {
> > +   .flags = (DRM_MODE_LUT_GAMMA |
> > + DRM_MODE_LUT_REFLECT_NEGATIVE |
> > + DRM_MODE_LUT_INTERPOLATE |
> > + DRM_MODE_LUT_REUSE_LAST |
> > + DRM_MODE_LUT_NON_DECREASING),
> > +   .count = 1,
> > +   .input_bpc = 24, .output_bpc = 16,
> > +   .start = 3 << 24, .end = 7 << 24,
> > +   .min = 0, .max = (1 << 27) - 1,
> > +   },
> > +};
> 
> If I understand this right, userspace would need this definition in order
> to populate the degamma blob. Should this sit in a UAPI header?

My original idea (not sure it's fully realized in this series) is to
have a new GAMMA_MODE/etc. enum property on each crtc (or plane) for
which each enum value points to a kernel provided blob that contains
one of these LUT descriptors. Userspace can then query them dynamically
and pick the best one for its current use case.

The algorithm for choosing the best one might be something like:
- prefer LUT with bpc >= FB bpc, but perhaps not needlessly high bpc
- prefer interpolated vs. direct lookup based on current needs (eg. X
  could prefer direct lookup to get directcolor visuals).
- prefer one with extended range values if needed
- for HDR prefer smaller step size in dark tones,
  for SDR perhaps prefer a more uniform step size
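
A very rough sketch of how userspace might rank the advertised LUT modes along the lines of the list above. Everything here is hypothetical: the demo_* structs, the field names and the scoring weights are invented for illustration and do not correspond to any existing or proposed uAPI. The step-size criterion is left out because it would need the full range descriptors.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one kernel-advertised LUT mode. */
struct demo_lut_mode {
	unsigned int bpc;	/* precision of the LUT entries */
	bool interpolated;	/* interpolated vs. direct lookup */
	bool extended_range;	/* can represent extended range values */
};

/* Hypothetical description of what the client currently needs. */
struct demo_lut_request {
	unsigned int fb_bpc;
	bool want_interpolated;
	bool need_extended_range;
};

static int demo_lut_score(const struct demo_lut_mode *m,
			  const struct demo_lut_request *r)
{
	int score = 0;

	/* A hard requirement: without it the mode is unusable. */
	if (r->need_extended_range && !m->extended_range)
		return INT_MIN;

	/* Prefer bpc >= FB bpc, but penalise needlessly high bpc. */
	if (m->bpc >= r->fb_bpc)
		score += 100 - (int)(m->bpc - r->fb_bpc);
	else
		score -= 10 * (int)(r->fb_bpc - m->bpc);

	/* Prefer the lookup style the client asked for. */
	if (m->interpolated == r->want_interpolated)
		score += 50;

	return score;
}

/* Returns the index of the best mode, or -1 if none is usable. */
static int demo_pick_lut(const struct demo_lut_mode *modes, size_t n,
			 const struct demo_lut_request *r)
{
	int best = -1;
	int best_score = INT_MIN;

	for (size_t i = 0; i < n; i++) {
		int s = demo_lut_score(&modes[i], r);

		if (s > best_score) {
			best_score = s;
			best = (int)i;
		}
	}
	return best;
}
```

The weights are arbitrary; the only structural point is that some criteria act as filters (extended range) while others are preferences that can be traded against each other.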

Or maybe we should include some kind of usage hints as well?

And I was thinking of even adding a new property type (eg.
ENUM_BLOB) just for this sort of usecase. That could let us
have a bit more generic code to do all the validation around
the property values and whatnot.

The one nagging concern I really have with GAMMA_MODE is how a
mix of old and new userspace would work. Though that is more 
of a generic issue with any new property really.

-- 
Ville Syrjälä
Intel


Re: [RFC PATCH v2 2/3] arm64: dts: imx8mm: Add MIPI DSI pipeline

2021-11-11 Thread Jagan Teki
On Thu, Nov 11, 2021 at 3:51 PM Marek Vasut  wrote:
>
> On 11/11/21 11:14 AM, Jagan Teki wrote:
>
> [...]
>
> > + dsi: dsi@32e1 {
> > + compatible = "fsl,imx8mm-mipi-dsim";
> > + reg = <0x32e1 0x400>;
> > + clocks = < IMX8MM_CLK_DSI_CORE>,
> > +  < IMX8MM_CLK_DSI_PHY_REF>;
> > + clock-names = "bus_clk", "sclk_mipi";
> > + assigned-clocks = < IMX8MM_CLK_DSI_CORE>,
> > +   < 
> > IMX8MM_VIDEO_PLL1_OUT>,
> > +   < 
> > IMX8MM_CLK_DSI_PHY_REF>;
> > + assigned-clock-parents = < 
> > IMX8MM_SYS_PLL1_266M>,
> > +  < 
> > IMX8MM_VIDEO_PLL1_BYPASS>,
> > +  < 
> > IMX8MM_VIDEO_PLL1_OUT>;
> > + assigned-clock-rates = <26600>, 
> > <59400>, <2700>;
> > + interrupts = ;
> > + phys = <_phy 0>;
> > + phy-names = "dsim";
> > + power-domains = <_blk_ctrl 
> > IMX8MM_DISPBLK_PD_MIPI_DSI>;
> > + samsung,burst-clock-frequency = <89100>;
> > + samsung,esc-clock-frequency = <5400>;
> > + samsung,pll-clock-frequency = <2700>;
> > + status = "disabled";
>
>
> This 27 MHz is really IMX8MM_CLK_DSI_PHY_REF and
> samsung,burst-clock-frequency is really the DSI link clock which is
> panel/bridge specific ... but, why do we need to specify such policy in
> DT rather than have the panel/bridge drivers negotiate the best clock
> settings with DSIM bridge driver ? This should be something which should
> be implemented in the DRM subsystem, not hard-coded in DT. These ad-hoc
> samsung,*-clock-frequency properties shouldn't even be needed then.

This looks confusing to me. All three clocks are used directly, as in
exynos, and they are indeed used to compute the PLL for this bridge;
the clock control fields of the DSIM registers are updated from the
resulting values. I have no thoughts as of now on how to handle these
externally and update the internal registers based on those values.

>
> Also, are the DSIM bindings stable now ?

The issue still lies on the exynos dsi side; the final driver is not
binding properly. I'm trying to send the next version of the patches
only to convert the existing exynos dsi into a bridge, and subsequently
add the i.MX8MM specifics. The bigger problem for me is testing it on
exynos boards, as I don't have any.

Jagan.


Re: [PATCH v5 1/2] drm/bridge: parade-ps8640: Enable runtime power management

2021-11-11 Thread Doug Anderson
Hi,

On Thu, Oct 28, 2021 at 12:39 PM Doug Anderson  wrote:
>
> Hi,
>
> On Thu, Oct 28, 2021 at 11:02 AM Philip Chen  wrote:
> >
> > Add "Sam Ravnborg " to cc list for vis.
> > Remove "Andrzej Hajda " from cc list as the
> > address can't be found.
>
> Looking at 
> ,
> it should be Andrzej Hajda . I've added.
>
>
> > On Thu, Oct 28, 2021 at 10:58 AM Philip Chen  
> > wrote:
> > >
> > > Fit ps8640 driver into runtime power management framework:
> > >
> > > First, break _poweron() to 3 parts: (1) turn on power and wait for
> > > ps8640's internal MCU to finish init (2) check panel HPD (which is
> > > proxied by GPIO9) (3) the other configs. As runtime_resume() can be
> > > called before panel is powered, we only add (1) to _resume() and leave
> > > (2)(3) to _pre_enable(). We also add (2) to _aux_transfer() as we want
> > > to ensure panel HPD is asserted before we start AUX CH transactions.
> > >
> > > Second, the original driver has a mysterious delay of 50 ms between (2)
> > > and (3). Since Parade's support can't explain what the delay is for,
> > > and we don't see removing the delay break any boards at hand, remove
> > > the delay to fit into this driver change.
> > >
> > > In addition, rename "powered" to "pre_enabled" and don't check for it
> > > in the pm_runtime calls. The pm_runtime calls are already refcounted
> > > so there's no reason to check there. The other user of "powered",
> > > _get_edid(), only cares if pre_enable() has already been called.
> > >
> > > Lastly, change some existing DRM_...() logging to dev_...() along the
> > > way, since DRM_...() seem to be deprecated in [1].
> > >
> > > [1] https://patchwork.freedesktop.org/patch/454760/
> > >
> > > Signed-off-by: Philip Chen 
> > > Reviewed-by: Douglas Anderson 
> > > Reviewed-by: Stephen Boyd 
> > > ---
> > > In v3, I added pm_suspend_ignore_children() in ps8640_probe().
> > > Also, I moved the change of "put_sync_suspend" from patch 2/2 to here.
> > > But I forgot to mention both changes. So edit v3 change log retroactively.
> > >
> > > In v4, I moved the change of "ps8640_ensure_hpd" return data type
> > > from patch 2/2 to here. But I forgot to mention it. So edit v4 change log
> > > retroactively.
> > >
> > > Changes in v5:
> > > - Move the implementation of _runtime_disable() around to resolve merge
> > >   conflict when rebasing.
> > > - Improve the document for how autosuspend_delay is picked.
>
> The new text looks good to me, thanks!
>
> Since this is from @chromium.org and only reviewed-by @chromium.org
> people, I'll plan to give it a 2-week snooze to give others ample time
> to comment on these two patches. If 2 weeks pass w/ no comments then
> I'll land to drm-misc-next. If someone gives an Ack and/or Reviewed-by
> then I'll likely land sooner.

My 2-week snooze went off, so this is now pushed to drm-misc-next, after
fixing a small whitespace warning that the dim tool complained about.

e9d9f9582c3d drm/bridge: parade-ps8640: Populate devices on aux-bus
826cff3f7ebb drm/bridge: parade-ps8640: Enable runtime power management

-Doug


[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125

2021-11-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205089

--- Comment #22 from Joey Espinosa (jlouis.espin...@gmail.com) ---
Kernel version would help too probably :-/

5.14.16-301.fc35.x86_64

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125

2021-11-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205089

Joey Espinosa (jlouis.espin...@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jlouis.espin...@gmail.com

--- Comment #21 from Joey Espinosa (jlouis.espin...@gmail.com) ---
That didn't fix it for me. I'm having the exact same issue (same behavior,
anyway), and I'm on linux-firmware 20211027-126.fc35 (Fedora 35).

I started experiencing it after an update a few days ago, and I thought
upgrading the OS from 34 -> 35 would maybe fix it. It didn't.

OS: Fedora 35
CPU: Ryzen 5950X
GPU: RX 6900 XT


Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-11-11 Thread Andrey Grodzovsky



On 2021-11-10 8:24 a.m., Daniel Vetter wrote:

On Wed, Nov 10, 2021 at 11:09:50AM +0100, Christian König wrote:

Am 10.11.21 um 10:50 schrieb Daniel Vetter:

On Tue, Nov 09, 2021 at 08:17:01AM -0800, Rob Clark wrote:

On Tue, Nov 9, 2021 at 1:07 AM Daniel Vetter  wrote:

On Mon, Nov 08, 2021 at 03:39:17PM -0800, Rob Clark wrote:

I stumbled across this thread when I ran into the same issue, while
working out how to move drm/msm to use scheduler's retire +
timeout/recovery (and get rid of our own mirror list of in-flight
jobs).  We already have hw error detection enabled, and it can signal
quite fast, so assuming the first job on the list is the guilty job
just won't work.

But I was considering a slightly different approach to fixing this,
instead just handling it all in drm_sched_main() and getting rid of
the complicated kthread parking gymnastics.  Ie. something along the
lines of:

So handling timeouts in the main sched thread won't work as soon as you
have multiple engines and a reset that impacts across engines:

- Nothing is simplified since you still need to stop the other scheduler
threads.

- You get deadlocks if 2 schedulers time out at the same time, and both
want to stop the other one.

Hence workqueue. Now the rule for the wq is that you can only have one per
reset domain, so
- single engine you just take the one drm/sched provides
- if reset affects all your engines in the chip, then you allocate one in
the drm_device and pass that to all
- if you have a complex of gpus all interconnected (e.g. xgmi hive for
amd), then it's one wq for the entire hive

_All_ reset related things must be run on that workqueue or things break,
which means if you get a hw fault that also needs to be run there. I guess
we should either patch drm/sched to check you call that function from the
right workqueue, or just handle it internally.

Hmm, ok.. I guess it would be useful to better document the reasoning
for the current design, that would have steered me more towards the
approach taken in this patch.

Maybe this was because you worked on an old kernel? Boris did update the
kerneldoc as part of making gpu reset work for panfrost, which has this
multi-engine reset problem. If that's not yet clear then we need to
improve the docs further.

AMD's problem is even worse, because their reset domain is the entire xgmi
hive, so multiple pci devices.

I'm pushing for quite a while that we get something like an
amdgpu_reset_domain structure or similar for this, but we unfortunately
don't have that yet.

Maybe it would be a good idea to have something like a drm_sched_domain or
similar with all the necessary information for the inter scheduler handling.

E.g. a workqueue for reset etc...

Yeah I think as soon as we have more stuff than just the wq then a
drm_sched_reset_domain sounds good.

But if it's just driver stuff (e.g. the xgmi locking you have in amdgpu
reset comes to mind) then I think just a driver_reset_domain struct that
bundles all that stuff up seems good enough.

E.g. on i915 I'm also pondering whether some of the fw requests should be
processed by the reset wq, to avoid locking headaches, so I don't think
hiding that work too much in abstractions is a good idea.
-Daniel



I suggest we keep the drm_sched_reset_domain as a base struct to hold the wq
(and possibly something else shared across drivers in the future) and then embed
it in a derived driver-specific struct to hold driver-specific stuff like
the XGMI lock you mentioned.
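
To make the layering concrete, here is a rough user-space sketch of it.
Every name in it (sched_reset_domain, hive_reset_domain, struct workqueue,
to_hive_domain) is made up for illustration and nothing like this exists in
drm/sched today; the container_of() macro is a plain re-spelling of the
kernel helper so that the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel workqueue; only its identity matters. */
struct workqueue { const char *name; };

/* Hypothetical common base: whatever is shared across drivers (for now the wq). */
struct sched_reset_domain {
	struct workqueue *wq;
};

/* Hypothetical driver-specific wrapper, e.g. for an XGMI hive. */
struct hive_reset_domain {
	struct sched_reset_domain base;   /* embedded, not pointed to */
	int hive_lock_held;               /* stand-in for the driver's XGMI lock */
};

/* Same idea as the kernel's container_of(), spelled out for user space. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the driver struct from the base handed around by common code. */
static struct hive_reset_domain *to_hive_domain(struct sched_reset_domain *d)
{
	return container_of(d, struct hive_reset_domain, base);
}

/* Two schedulers are in the same reset domain iff they share the base object. */
static int same_reset_domain(const struct sched_reset_domain *a,
			     const struct sched_reset_domain *b)
{
	return a == b;
}
```

Common code would only ever see the base pointer, while the driver's reset
handler converts it back to get at its own lock.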

Andrey





Regards,
Christian.


Also there might be more issues in drm/sched ofc, e.g. I've looked a bit at
ordering/barriers and I'm pretty sure a lot are still missing. Or at least
we should have comments in the code explaining why it all works.
-Daniel


BR,
-R


-Daniel


-
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index 67382621b429..4d6ce775c316 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -764,6 +764,45 @@ static bool drm_sched_blocked(struct
drm_gpu_scheduler *sched)
  return false;
   }

+static bool handle_timeout(struct drm_gpu_scheduler *sched)
+{
+   struct drm_sched_job *bad;
+
+   if (!sched->has_timeout)
+   return false;
+
+   sched->has_timeout = false;
+
+   spin_lock(&sched->job_list_lock);
+   bad = list_first_entry_or_null(&sched->pending_list,
+  struct drm_sched_job, list);
+
+   if (!bad) {
+   spin_unlock(&sched->job_list_lock);
+   return false;
+   }
+
+   spin_unlock(&sched->job_list_lock);
+
+   if (sched->timeout_wq == system_wq) {
+   /*
+* If driver has no specific requirements about serializing
+* reset wrt. other engines, just call timedout_job() directly
+*/
+   sched->ops->timedout_job(bad);
+   } else {
+   /*
+* Otherwise 

Re: [PATCH v4 07/13] drm/msm: Track "seqno" fences by idr

2021-11-11 Thread Akhil P Oommen

On 11/10/2021 10:25 PM, Rob Clark wrote:

On Wed, Nov 10, 2021 at 7:28 AM Akhil P Oommen  wrote:


On 7/28/2021 6:36 AM, Rob Clark wrote:

From: Rob Clark 

Previously the (non-fd) fence returned from submit ioctl was a raw
seqno, which is scoped to the ring.  But from UABI standpoint, the
ioctls related to seqno fences all specify a submitqueue.  We can
take advantage of that to replace the seqno fences with a cyclic idr
handle.

This is in preparation for moving to drm scheduler, at which point
the submit ioctl will return after queuing the submit job to the
scheduler, but before the submit is written into the ring (and
therefore before a ring seqno has been assigned).  Which means we
need to replace the dma_fence that userspace may need to wait on
with a scheduler fence.

Signed-off-by: Rob Clark 
Acked-by: Christian König 
---
   drivers/gpu/drm/msm/msm_drv.c | 30 +--
   drivers/gpu/drm/msm/msm_fence.c   | 42 ---
   drivers/gpu/drm/msm/msm_fence.h   |  3 --
   drivers/gpu/drm/msm/msm_gem.h |  1 +
   drivers/gpu/drm/msm/msm_gem_submit.c  | 23 ++-
   drivers/gpu/drm/msm/msm_gpu.h |  5 
   drivers/gpu/drm/msm/msm_submitqueue.c |  5 
   7 files changed, 61 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 9b8fa2ad0d84..1594ae39d54f 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -911,6 +911,7 @@ static int msm_ioctl_wait_fence(struct drm_device *dev, 
void *data,
   ktime_t timeout = to_ktime(args->timeout);
   struct msm_gpu_submitqueue *queue;
   struct msm_gpu *gpu = priv->gpu;
+ struct dma_fence *fence;
   int ret;

   if (args->pad) {
@@ -925,10 +926,35 @@ static int msm_ioctl_wait_fence(struct drm_device *dev, 
void *data,
   if (!queue)
   return -ENOENT;

- ret = msm_wait_fence(gpu->rb[queue->prio]->fctx, args->fence, &timeout,
- true);
+ /*
+  * Map submitqueue scoped "seqno" (which is actually an idr key)
+  * back to underlying dma-fence
+  *
+  * The fence is removed from the fence_idr when the submit is
+  * retired, so if the fence is not found it means there is nothing
+  * to wait for
+  */
+ ret = mutex_lock_interruptible(&queue->lock);
+ if (ret)
+ return ret;
+ fence = idr_find(&queue->fence_idr, args->fence);
+ if (fence)
+ fence = dma_fence_get_rcu(fence);
+ mutex_unlock(&queue->lock);
+
+ if (!fence)
+ return 0;

+ ret = dma_fence_wait_timeout(fence, true, timeout_to_jiffies(&timeout));
+ if (ret == 0) {
+ ret = -ETIMEDOUT;
+ } else if (ret != -ERESTARTSYS) {
+ ret = 0;
+ }
+
+ dma_fence_put(fence);
   msm_submitqueue_put(queue);
+
   return ret;
   }

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index b92a9091a1e2..f2cece542c3f 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -24,7 +24,6 @@ msm_fence_context_alloc(struct drm_device *dev, volatile 
uint32_t *fenceptr,
   strncpy(fctx->name, name, sizeof(fctx->name));
   fctx->context = dma_fence_context_alloc(1);
   fctx->fenceptr = fenceptr;
- init_waitqueue_head(&fctx->event);
   spin_lock_init(&fctx->spinlock);

   return fctx;
@@ -45,53 +44,12 @@ static inline bool fence_completed(struct msm_fence_context 
*fctx, uint32_t fenc
   (int32_t)(*fctx->fenceptr - fence) >= 0;
   }

-/* legacy path for WAIT_FENCE ioctl: */
-int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence,
- ktime_t *timeout, bool interruptible)
-{
- int ret;
-
- if (fence > fctx->last_fence) {
- DRM_ERROR_RATELIMITED("%s: waiting on invalid fence: %u (of 
%u)\n",
- fctx->name, fence, fctx->last_fence);
- return -EINVAL;


Rob, we changed this pre-existing behaviour in this patch. Now, when
userspace tries to wait on a future fence, we don't return an error.

I just want to check if this was accidental or not?
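
To spell out the difference being asked about, here is a simplified
user-space model of the two behaviours. The helper names and the flat array
standing in for the per-queue fence_idr are assumptions made for
illustration; the real code waits on a dma_fence rather than returning 1.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Old behaviour: a seqno beyond the last one handed out is rejected. */
static int wait_fence_old(uint32_t fence, uint32_t last_fence)
{
	if (fence > last_fence)
		return -EINVAL;
	return 0;	/* the real code would go on to wait for completion */
}

/*
 * New behaviour: the id is looked up in a per-queue table of live fences.
 * A miss means "nothing to wait for", so an id that was never issued looks
 * exactly like one that has already been retired.
 */
static int wait_fence_new(uint32_t fence, const uint32_t *live, int nr_live)
{
	for (int i = 0; i < nr_live; i++)
		if (live[i] == fence)
			return 1;	/* found: the caller would wait on it */
	return 0;			/* retired or never issued */
}
```

With live ids {5, 6}, waiting on 9 gave -EINVAL in the old model but returns
0 in the new one, the same as waiting on the already retired id 3.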


Hmm, perhaps we should do this to restore the previous behavior:

-
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 73e827641024..3dd6da56eae6 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1000,8 +1000,12 @@ static int msm_ioctl_wait_fence(struct
drm_device *dev, void *data,
 fence = dma_fence_get_rcu(fence);
 mutex_unlock(&queue->lock);

-   if (!fence)
-   return 0;
+   if (!fence) {
+   struct msm_fence_context *fctx = gpu->rb[queue->ring_nr]->fctx;
+   DRM_ERROR_RATELIMITED("%s: waiting on invalid fence:
%u (of %u)\n",
+ fctx->name, fence, fctx->last_fence);
+   return -EINVAL;
+   }


With this, when userspace tries to wait on a fence 

Re: [PATCH] drm/i915: Use per device iommu check

2021-11-11 Thread Tvrtko Ursulin



On 10/11/2021 14:37, Robin Murphy wrote:

On 2021-11-10 14:11, Tvrtko Ursulin wrote:


On 10/11/2021 12:35, Lu Baolu wrote:

On 2021/11/10 20:08, Tvrtko Ursulin wrote:


On 10/11/2021 12:04, Lu Baolu wrote:

On 2021/11/10 17:30, Tvrtko Ursulin wrote:


On 10/11/2021 07:12, Lu Baolu wrote:

Hi Tvrtko,

On 2021/11/9 20:17, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin

On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
and probe presence of iommu domain per device to accurately reflect its
status.

Signed-off-by: Tvrtko Ursulin
Cc: Lu Baolu
---
Baolu, is my understanding here correct? Maybe I am confused by both
intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
driver. But it certainly appears the setup can assign some iommu ops (and
assign the discrete i915 to iommu group) when those two are set to off.


diff --git a/drivers/gpu/drm/i915/i915_drv.h 
b/drivers/gpu/drm/i915/i915_drv.h

index e967cd08f23e..9fb38a54f1fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
  #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
(IS_ROCKETLAKE(dev_priv) || \

    IS_ALDERLAKE_S(dev_priv))

-static inline bool intel_vtd_active(void)
+static inline bool intel_vtd_active(struct drm_i915_private *i915)
  {
-#ifdef CONFIG_INTEL_IOMMU
-    if (intel_iommu_gfx_mapped)
+    if (iommu_get_domain_for_dev(i915->drm.dev))
  return true;
-#endif

  /* Running as a guest, we assume the host is enforcing VT'd */
  return run_as_guest();
  }

Have you verified this change? I am afraid that
iommu_get_domain_for_dev() always gets a valid iommu domain even if
intel_iommu_gfx_mapped == 0.


Yes it seems to work as is:

default:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

intel_iommu=igfx_off:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

On my system dri device 0 is integrated graphics and 1 is discrete.


The drm device 0 has a dedicated iommu. When the user requests that igfx not be
mapped, the VT-d implementation will turn it off to save power. But for a
shared iommu, you definitely will get it enabled.


Sorry I am not following, what exactly do you mean? Is there a
platform with integrated graphics without a dedicated iommu, in which
case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and
iommu_get_domain_for_dev returning non-NULL?


Your code always works for an igfx with a dedicated iommu. This might
always be true on today's platforms. But from the driver's point of view, we
should not make such an assumption.

For example, if the iommu implementation decides not to turn off the
graphic iommu (perhaps due to some hw quirk or for graphic
virtualization), your code will be broken.


If I got it right, this would go back to your earlier recommendation 
to have the check look like this:


static bool intel_vtd_active(struct drm_i915_private *i915)
{
 struct iommu_domain *domain;

 domain = iommu_get_domain_for_dev(i915->drm.dev);
 if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
 return true;
 ...

This would be okay as a first step?

Elsewhere in the thread Robin suggested looking at the dev->dma_ops
and comparing against iommu_dma_ops. These two solutions would be
effectively the same?


Effectively, yes. See iommu_setup_dma_ops() - the only way to end up 
with iommu_dma_ops is if a managed translation domain is present; if the 
IOMMU is present but the default domain type has been set to passthrough 
(either globally or forced for the given device) it will do nothing and 
leave you with dma-direct, while if the IOMMU has been ignored entirely 
then it should never even be called. Thus it neatly encapsulates what 
you're after here.


One concern I have is whether the pass-through mode truly does nothing 
or addresses perhaps still go through the dmar hardware just with no 
translation?


If the latter, then the most like-for-like change is actually exactly what
the first version of my patch did. That is, replace intel_iommu_gfx_mapped
with a plain non-NULL check on iommu_get_domain_for_dev.
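
To illustrate how the two checks differ, here is a small user-space model.
The flag names and bit values are modelled on include/linux/iommu.h but
should be treated as assumptions, and struct domain and both helpers are
hypothetical stand-ins rather than i915 or IOMMU core code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; values are assumptions, not copied from a header. */
#define DOMAIN_PAGING   (1U << 0)  /* IOMMU page tables are in use */
#define DOMAIN_DMA_API  (1U << 1)  /* domain is managed for the DMA API */
#define DOMAIN_PT       (1U << 2)  /* identity / pass-through mapping */
#define DOMAIN_DMA_FQ   (1U << 3)  /* DMA API domain with a flush queue */

#define TYPE_IDENTITY   (DOMAIN_PT)
#define TYPE_DMA        (DOMAIN_PAGING | DOMAIN_DMA_API)
#define TYPE_DMA_FQ     (DOMAIN_PAGING | DOMAIN_DMA_API | DOMAIN_DMA_FQ)

/* Hypothetical stand-in for struct iommu_domain. */
struct domain { unsigned int type; };

/* First version of the patch: any domain at all counts as "active". */
static bool vtd_active_non_null(const struct domain *d)
{
	return d != NULL;
}

/* Suggested version: only domains that actually translate count. */
static bool vtd_active_paging(const struct domain *d)
{
	return d && (d->type & DOMAIN_PAGING);
}
```

The two only disagree for an identity (pass-through) domain: the non-NULL
check reports it as active, the paging check does not.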


Regards,

Tvrtko


Re: [RFC v2 05/22] drm/i915/xelpd: Define Degamma Lut range struct for HDR planes

2021-11-11 Thread Harry Wentland



On 2021-09-06 17:38, Uma Shankar wrote:
> Define the structure with XE_LPD degamma lut ranges. HDR and SDR
> planes have different capabilities, implemented respective
> structure for the HDR planes.
> 
> Signed-off-by: Uma Shankar 
> ---
>  drivers/gpu/drm/i915/display/intel_color.c | 52 ++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
> b/drivers/gpu/drm/i915/display/intel_color.c
> index afcb4bf3826c..6403bd74324b 100644
> --- a/drivers/gpu/drm/i915/display/intel_color.c
> +++ b/drivers/gpu/drm/i915/display/intel_color.c
> @@ -2092,6 +2092,58 @@ static void icl_read_luts(struct intel_crtc_state 
> *crtc_state)
>   }
>  }
>  
> + /* FIXME input bpc? */
> +__maybe_unused
> +static const struct drm_color_lut_range d13_degamma_hdr[] = {
> + /* segment 1 */
> + {
> + .flags = (DRM_MODE_LUT_GAMMA |
> +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> +   DRM_MODE_LUT_INTERPOLATE |
> +   DRM_MODE_LUT_NON_DECREASING),
> + .count = 128,
> + .input_bpc = 24, .output_bpc = 16,
> + .start = 0, .end = (1 << 24) - 1,
> + .min = 0, .max = (1 << 24) - 1,
> + },
> + /* segment 2 */
> + {
> + .flags = (DRM_MODE_LUT_GAMMA |
> +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> +   DRM_MODE_LUT_INTERPOLATE |
> +   DRM_MODE_LUT_REUSE_LAST |
> +   DRM_MODE_LUT_NON_DECREASING),
> + .count = 1,
> + .input_bpc = 24, .output_bpc = 16,
> + .start = (1 << 24) - 1, .end = 1 << 24,
> + .min = 0, .max = (1 << 27) - 1,
> + },
> + /* Segment 3 */
> + {
> + .flags = (DRM_MODE_LUT_GAMMA |
> +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> +   DRM_MODE_LUT_INTERPOLATE |
> +   DRM_MODE_LUT_REUSE_LAST |
> +   DRM_MODE_LUT_NON_DECREASING),
> + .count = 1,
> + .input_bpc = 24, .output_bpc = 16,
> + .start = 1 << 24, .end = 3 << 24,
> + .min = 0, .max = (1 << 27) - 1,
> + },
> + /* Segment 4 */
> + {
> + .flags = (DRM_MODE_LUT_GAMMA |
> +   DRM_MODE_LUT_REFLECT_NEGATIVE |
> +   DRM_MODE_LUT_INTERPOLATE |
> +   DRM_MODE_LUT_REUSE_LAST |
> +   DRM_MODE_LUT_NON_DECREASING),
> + .count = 1,
> + .input_bpc = 24, .output_bpc = 16,
> + .start = 3 << 24, .end = 7 << 24,
> + .min = 0, .max = (1 << 27) - 1,
> + },
> +};

If I understand this right, userspace would need this definition in order
to populate the degamma blob. Should this sit in a UAPI header?
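
As a rough illustration of why userspace would need it, here is a sketch of
how a client might size the blob from such a table. The struct is a local
mirror of only the fields used here, not a real uAPI header, the helper
names are hypothetical, and treating .count as the number of entries per
segment is my reading of the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the fields used from the quoted table (assumption). */
struct lut_range {
	uint16_t count;       /* number of LUT entries in this segment */
	uint32_t start, end;  /* input range covered by the segment */
};

/* Values copied from d13_degamma_hdr above (flags, bpc, min/max omitted). */
static const struct lut_range degamma_hdr[] = {
	{ .count = 128, .start = 0,              .end = (1u << 24) - 1 },
	{ .count = 1,   .start = (1u << 24) - 1, .end = 1u << 24 },
	{ .count = 1,   .start = 1u << 24,       .end = 3u << 24 },
	{ .count = 1,   .start = 3u << 24,       .end = 7u << 24 },
};

/* Total number of entries userspace would have to supply for the whole LUT. */
static size_t lut_total_entries(const struct lut_range *r, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += r[i].count;
	return total;
}

/* Segments are contiguous if each one starts where the previous one ended. */
static int lut_ranges_contiguous(const struct lut_range *r, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (r[i].start != r[i - 1].end)
			return 0;
	return 1;
}
```

Without the segment layout, a client has no way to know how many entries to
provide or which input range each of them covers.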

Harry

> +
>  void intel_color_init(struct intel_crtc *crtc)
>  {
>   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> 



Re: [PATCH] drm/i915: Use per device iommu check

2021-11-11 Thread Tvrtko Ursulin



On 10/11/2021 12:35, Lu Baolu wrote:

On 2021/11/10 20:08, Tvrtko Ursulin wrote:


On 10/11/2021 12:04, Lu Baolu wrote:

On 2021/11/10 17:30, Tvrtko Ursulin wrote:


On 10/11/2021 07:12, Lu Baolu wrote:

Hi Tvrtko,

On 2021/11/9 20:17, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin

On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
and probe presence of iommu domain per device to accurately reflect its
status.

Signed-off-by: Tvrtko Ursulin
Cc: Lu Baolu
---
Baolu, is my understanding here correct? Maybe I am confused by both
intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
driver. But it certainly appears the setup can assign some iommu ops (and
assign the discrete i915 to iommu group) when those two are set to off.


diff --git a/drivers/gpu/drm/i915/i915_drv.h 
b/drivers/gpu/drm/i915/i915_drv.h

index e967cd08f23e..9fb38a54f1fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
  #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
(IS_ROCKETLAKE(dev_priv) || \

    IS_ALDERLAKE_S(dev_priv))

-static inline bool intel_vtd_active(void)
+static inline bool intel_vtd_active(struct drm_i915_private *i915)
  {
-#ifdef CONFIG_INTEL_IOMMU
-    if (intel_iommu_gfx_mapped)
+    if (iommu_get_domain_for_dev(i915->drm.dev))
  return true;
-#endif

  /* Running as a guest, we assume the host is enforcing VT'd */
  return run_as_guest();
  }

Have you verified this change? I am afraid that
iommu_get_domain_for_dev() always gets a valid iommu domain even if
intel_iommu_gfx_mapped == 0.


Yes it seems to work as is:

default:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

intel_iommu=igfx_off:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

On my system dri device 0 is integrated graphics and 1 is discrete.


The drm device 0 has a dedicated iommu. When the user requests that igfx not be
mapped, the VT-d implementation will turn it off to save power. But for a
shared iommu, you definitely will get it enabled.


Sorry I am not following, what exactly do you mean? Is there a 
platform with integrated graphics without a dedicated iommu, in which 
case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
iommu_get_domain_for_dev returning non-NULL?


Your code always works for an igfx with a dedicated iommu. This might
always be true on today's platforms. But from the driver's point of view, we
should not make such an assumption.

For example, if the iommu implementation decides not to turn off the
graphic iommu (perhaps due to some hw quirk or for graphic
virtualization), your code will be broken.


I tried your suggestion (checking for __IOMMU_DOMAIN_PAGING) and it works 
better, however I have observed one odd behaviour (for me at least).

In short - why does the DMAR mode for the discrete device change depending on 
igfx_off parameter?

Consider the laptop has these two graphics cards:

# cat /sys/kernel/debug/dri/0/name
i915 dev=:00:02.0 unique=:00:02.0 # integrated

# cat /sys/kernel/debug/dri/1/name
i915 dev=:03:00.0 unique=:03:00.0 # discrete

Booting with different options:
===

default / intel_iommu=on
------------------------

# cat /sys/class/iommu/dmar0/devices/:00:02.0/iommu_group/type
DMA-FQ
# cat /sys/class/iommu/dmar2/devices/:03:00.0/iommu_group/type
DMA-FQ

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

All good.

intel_iommu=igfx_off
--------------------

## no dmar0 in sysfs
# cat /sys/class/iommu/dmar2/devices/:03:00.0/iommu_group/type
identity

Unexpected!?

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled # At least the i915 patch detects it correctly.

intel_iommu=off
---

## no dmar0 in sysfs
## no dmar2 in sysfs

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled

All good.

The fact that the discrete graphics device changes from translated to
pass-through when igfx_off is set is surprising to me. Is this a bug?

Regards,

Tvrtko


Re: [PATCH v6 3/8] dt-bindings: display: Add ingenic, jz4780-dw-hdmi DT Schema

2021-11-11 Thread Rob Herring
On Wed, 10 Nov 2021 20:43:28 +0100, H. Nikolaus Schaller wrote:
> From: Sam Ravnborg 
> 
> Add DT bindings for the hdmi driver for the Ingenic JZ4780 SoC.
> Based on .txt binding from Zubair Lutfullah Kakakhel
> 
> We also add generic ddc-i2c-bus to synopsys,dw-hdmi.yaml
> 
> Signed-off-by: Sam Ravnborg 
> Signed-off-by: H. Nikolaus Schaller 
> Cc: Rob Herring 
> Cc: devicet...@vger.kernel.org
> ---
>  .../display/bridge/synopsys,dw-hdmi.yaml  |  3 +
>  .../bindings/display/ingenic-jz4780-hdmi.yaml | 76 +++
>  2 files changed, 79 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/ingenic-jz4780-hdmi.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:
./Documentation/devicetree/bindings/display/ingenic-jz4780-hdmi.yaml:36:5: 
[warning] wrong indentation: expected 2 but found 4 (indentation)

dtschema/dtc warnings/errors:
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/ingenic-jz4780-hdmi.example.dt.yaml:
 hdmi@1018: 'clock-names', 'ddc-i2c-bus', 'interrupt-parent', 'interrupts', 
'reg' do not match any of the regexes: 'pinctrl-[0-9]+'
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/ingenic-jz4780-hdmi.yaml

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/1553577

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.



Re: [PATCH v2 6/8] drm: vkms: Refactor the plane composer to accept new formats

2021-11-11 Thread Pekka Paalanen
On Thu, 11 Nov 2021 11:07:21 -0300
Igor Torrente  wrote:

> Hi Pekka,
> 
> On Thu, Nov 11, 2021 at 6:33 AM Pekka Paalanen  wrote:
> >
> > On Wed, 10 Nov 2021 13:56:54 -0300
> > Igor Torrente  wrote:
> >  
> > > > On Tue, Nov 9, 2021 at 8:40 AM Pekka Paalanen  wrote:
> > > >
> > > > Hi Igor,
> > > >
> > > > again, that is a really nice speed-up. Unfortunately, I find the code
> > > > rather messy and hard to follow. I hope my comments below help with
> > > > re-designing it to be easier to understand.
> > > >
> > > >
> > > > On Tue, 26 Oct 2021 08:34:06 -0300
> > > > Igor Torrente  wrote:
> > > >  
> > > > > > Currently the blend function only accepts XRGB_8888 and ARGB_8888
> > > > > as a color input.
> > > > >
> > > > > > This patch refactors all the functions related to the plane composition
> > > > > > to overcome this limitation.
> > > > >
> > > > > > Now the blend function receives a struct `vkms_pixel_composition_functions`
> > > > > > containing two handlers.
> > > > >
> > > > > One will generate a buffer of each line of the frame with the pixels
> > > > > converted to ARGB16161616. And the other will take this line buffer,
> > > > > do some computation on it, and store the pixels in the destination.
> > > > >
> > > > > > Both the handlers have the same signature. They receive a pointer to
> > > > > > the pixels that will be processed(`pixels_addr`), the number of pixels
> > > > > > that will be treated(`length`), and the intermediate buffer of the size
> > > > > > of a frame line (`line_buffer`).
> > > > > The first function has been totally described previously.  
> > > >
> > > > What does this sentence mean?  
> > >
> > > In the sentence "One will generate...", I give an overview of the two types of
> > > handlers. And the overview of the first handler describes the full behavior of
> > > it.
> > >
> > > But it doesn't look clear enough, I will improve it in the future.
> > >  
> > > >  
> > > > >
> > > > > The second is more interesting, as it has to perform two roles 
> > > > > depending
> > > > > on where it is called in the code.
> > > > >
> > > > > The first is to convert(if necessary) the data received in the
> > > > > `line_buffer` and write in the memory pointed by `pixels_addr`.
> > > > >
> > > > > The second role is to perform the `alpha_blend`. So, it takes the 
> > > > > pixels
> > > > > in the `line_buffer` and `pixels_addr`, executes the blend, and stores
> > > > > the result back to the `pixels_addr`.
> > > > >
> > > > > The per-line implementation was chosen for performance reasons.
> > > > > The per-pixel functions were having performance issues due to indirect
> > > > > function call overhead.
> > > > >
> > > > > The per-line code trades off memory for execution time. The 
> > > > > `line_buffer`
> > > > > allows us to diminish the number of function calls.
> > > > >
> > > > > Results in the IGT test `kms_cursor_crc`:
> > > > >
> > > > > > |                     Frametime                      |
> > > > > > |:---------------:|:---------:|:----------:|:--------:|
> > > > > > | implementation  |  Current  |  Per-pixel | Per-line |
> > > > > > | frametime range |  8~22 ms  |  32~56 ms  |  6~19 ms |
> > > > > > |     Average     |  10.0 ms  |   35.8 ms  |  8.6 ms  |
> > > > >
> > > > > Reported-by: kernel test robot 
> > > > > Signed-off-by: Igor Torrente 
> > > > > ---
> > > > > > V2: Improves the performance drastically, by performing the operations
> > > > > > per-line and not per-pixel(Pekka Paalanen).
> > > > > Minor improvements(Pekka Paalanen).
> > > > > ---
> > > > >  drivers/gpu/drm/vkms/vkms_composer.c | 321 
> > > > > ---
> > > > >  drivers/gpu/drm/vkms/vkms_formats.h  | 155 +
> > > > >  2 files changed, 342 insertions(+), 134 deletions(-)
> > > > >  create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h
> > > > >
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
> > > > > b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > index 383ca657ddf7..69fe3a89bdc9 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c

...

> > > > > +struct vkms_pixel_composition_functions {
> > > > > + void (*get_src_line)(void *pixels_addr, int length, u64 
> > > > > *line_buffer);
> > > > > + void (*set_output_line)(void *pixels_addr, int length, u64 
> > > > > *line_buffer);  
> > > >
> > > > I would be a little more comfortable if instead of u64 *line_buffer you
> > > > would have something like
> > > >
> > > > struct line_buffer {
> > > > u16 *row;
> > > > size_t nelem;
> > > > }
> > > >
> > > > so that the functions to be plugged into these function pointers could
> > > > assert that you do not accidentally overflow the array (which would
> > > > imply a code bug in kernel).
> > > >
> > > > One could perhaps go even for:
> > > >
> > > > struct line_pixel {
> > > > u16 r, g, b, a;
> > > > };
> > > >
> > > > struct 

Re: [PATCH v2 6/8] drm: vkms: Refactor the plane composer to accept new formats

2021-11-11 Thread Igor Torrente
Hi Pekka,

On Thu, Nov 11, 2021 at 6:33 AM Pekka Paalanen  wrote:
>
> On Wed, 10 Nov 2021 13:56:54 -0300
> Igor Torrente  wrote:
>
> > On Tue, Nov 9, 2021 at 8:40 AM Pekka Paalanen  wrote:
> > >
> > > Hi Igor,
> > >
> > > again, that is a really nice speed-up. Unfortunately, I find the code
> > > rather messy and hard to follow. I hope my comments below help with
> > > re-designing it to be easier to understand.
> > >
> > >
> > > On Tue, 26 Oct 2021 08:34:06 -0300
> > > Igor Torrente  wrote:
> > >
> > > > > Currently the blend function only accepts XRGB_8888 and ARGB_8888
> > > > as a color input.
> > > >
> > > > This patch refactors all the functions related to the plane composition
> > > > to overcome this limitation.
> > > >
> > > > > Now the blend function receives a struct `vkms_pixel_composition_functions`
> > > > > containing two handlers.
> > > >
> > > > One will generate a buffer of each line of the frame with the pixels
> > > > converted to ARGB16161616. And the other will take this line buffer,
> > > > do some computation on it, and store the pixels in the destination.
> > > >
> > > > Both the handlers have the same signature. They receive a pointer to
> > > > the pixels that will be processed(`pixels_addr`), the number of pixels
> > > > that will be treated(`length`), and the intermediate buffer of the size
> > > > of a frame line (`line_buffer`).
> > > >
> > > > The first function has been totally described previously.
> > >
> > > What does this sentence mean?
> >
> > In the sentence "One will generate...", I give an overview of the two types
> > of handlers. The overview of the first handler already describes its full
> > behavior.
> >
> > But it isn't clear enough; I will improve it in the next version.
> >
> > >
> > > >
> > > > The second is more interesting, as it has to perform two roles depending
> > > > on where it is called in the code.
> > > >
> > > > > The first is to convert (if necessary) the data received in the
> > > > > `line_buffer` and write it to the memory pointed to by `pixels_addr`.
> > > >
> > > > The second role is to perform the `alpha_blend`. So, it takes the pixels
> > > > in the `line_buffer` and `pixels_addr`, executes the blend, and stores
> > > > the result back to the `pixels_addr`.
> > > >
> > > > The per-line implementation was chosen for performance reasons.
> > > > The per-pixel functions were having performance issues due to indirect
> > > > function call overhead.
> > > >
> > > > > The per-line code trades off memory for execution time. The
> > > > > `line_buffer` allows us to reduce the number of function calls.
> > > >
> > > > Results in the IGT test `kms_cursor_crc`:
> > > >
> > > > > |                      Frametime                      |
> > > > > |:---------------:|:---------:|:----------:|:--------:|
> > > > > | implementation  |  Current  |  Per-pixel | Per-line |
> > > > > | frametime range |  8~22 ms  |  32~56 ms  |  6~19 ms |
> > > > > |     Average     |  10.0 ms  |   35.8 ms  |  8.6 ms  |
> > > >
> > > > Reported-by: kernel test robot 
> > > > Signed-off-by: Igor Torrente 
> > > > ---
> > > > > V2: Improves the performance drastically, by performing the operations
> > > > > per-line and not per-pixel (Pekka Paalanen).
> > > > > Minor improvements (Pekka Paalanen).
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_composer.c | 321 ---
> > > >  drivers/gpu/drm/vkms/vkms_formats.h  | 155 +
> > > >  2 files changed, 342 insertions(+), 134 deletions(-)
> > > >  create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h
> > > >
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
> > > > b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > index 383ca657ddf7..69fe3a89bdc9 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > @@ -9,18 +9,26 @@
> > > >  #include 
> > > >
> > > >  #include "vkms_drv.h"
> > > > -
> > > > -static u32 get_pixel_from_buffer(int x, int y, const u8 *buffer,
> > > > -  const struct vkms_composer *composer)
> > > > -{
> > > > - u32 pixel;
> > > > - int src_offset = composer->offset + (y * composer->pitch)
> > > > -   + (x * composer->cpp);
> > > > -
> > > > > - pixel = *(u32 *)&buffer[src_offset];
> > > > -
> > > > - return pixel;
> > > > -}
> > > > +#include "vkms_formats.h"
> > > > +
> > > > +#define get_output_vkms_composer(buffer_pointer, composer)   \
> > > > + ((struct vkms_composer) {   \
> > > > + .fb = &(struct drm_framebuffer) {   \
> > > > + .format = &(struct drm_format_info) {   \
> > > > + .format = DRM_FORMAT_ARGB16161616,  \
> > > > + },  \
> > >
> > > Is that really how one can initialize a drm_format_info? Does that
> > > struct not have a lot more 

Re: [igt-dev] [RFC PATCH 0/1] drm: selftest: Convert to KUnit

2021-11-11 Thread Petri Latvala
On Wed, Nov 10, 2021 at 09:34:52PM -0300, André Almeida wrote:
> Hi,
> 
> This RFC is a preview of the progress we made in the KUnit hackathon[0].
> This patch, made by Maíra and Arthur, converts the damage helper test
> from the original DRM selftest framework to use the KUnit framework.
> 
> [0] https://groups.google.com/g/kunit-dev/c/YqFR1q2uZvk/m/IbvItSfHBAAJ
> 
> The IGT part of this work can be found here:
> https://gitlab.freedesktop.org/isinyaaa/igt-gpu-tools/-/tree/introduce-kunit

IGT side approach looks good. There's a couple of obscure bugs that I
spotted but nothing that is unfixable when it's time to review in
detail.


-- 
Petri Latvala


[PATCH v3] drm/i915: Skip error capture when wedged on init

2021-11-11 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Trying to capture uninitialised engines when we wedged on init ends in
tears. Skip that together with uC capture, since failure to initialise the
latter can actually be one of the reasons for wedging on init.

v2:
 * Use i915_disable_error_state when wedging on init/fini.

v3:
 * Handle mock tests.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Matthew Auld  # v1
---
 drivers/gpu/drm/i915/gt/intel_reset.c| 2 ++
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 51b56b8e5003..0fbd6dbadce7 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1448,6 +1448,7 @@ void intel_gt_set_wedged_on_init(struct intel_gt *gt)
BUILD_BUG_ON(I915_RESET_ENGINE + I915_NUM_ENGINES >
 I915_WEDGED_ON_INIT);
intel_gt_set_wedged(gt);
+   i915_disable_error_state(gt->i915, -ENODEV);
set_bit(I915_WEDGED_ON_INIT, &gt->reset.flags);
 
/* Wedged on init is non-recoverable */
@@ -1457,6 +1458,7 @@ void intel_gt_set_wedged_on_init(struct intel_gt *gt)
 void intel_gt_set_wedged_on_fini(struct intel_gt *gt)
 {
intel_gt_set_wedged(gt);
+   i915_disable_error_state(gt->i915, -ENODEV);
set_bit(I915_WEDGED_ON_FINI, &gt->reset.flags);
intel_gt_retire_requests(gt); /* cleanup any wedged requests */
 }
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c 
b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 9ab3f284d1dd..d0e2e61de8d4 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -177,6 +177,8 @@ struct drm_i915_private *mock_gem_device(void)
 
mock_uncore_init(&i915->uncore, i915);
 
+   spin_lock_init(&i915->gpu_error.lock);
+
i915_gem_init__mm(i915);
intel_gt_init_early(&i915->gt, i915);
atomic_inc(&i915->gt.wakeref.count); /* disable; no hw support */
-- 
2.30.2



[PATCH v2 6/6] drm/i915/ttm: Update i915_gem_obj_copy_ttm() to be asynchronous

2021-11-11 Thread Thomas Hellström
Update the copy function i915_gem_obj_copy_ttm() to be asynchronous for
future users and update the only current user to sync the objects
as needed after this function.

Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c | 40 ++--
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c   |  2 +
 2 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index ae2c49fc3500..53ed3972c7be 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -811,33 +811,49 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
.interruptible = intr,
};
struct i915_refct_sgt *dst_rsgt;
-   struct dma_fence *copy_fence;
-   int ret;
+   struct dma_fence *copy_fence, *dep_fence;
+   struct i915_deps deps;
+   int ret, shared_err;
 
assert_object_held(dst);
assert_object_held(src);
+   i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
 
/*
-* Sync for now. This will change with async moves.
+* We plan to add a shared fence only for the source. If that
+* fails, we await all source fences before commencing
+* the copy instead of only the exclusive.
 */
-   ret = ttm_bo_wait_ctx(dst_bo, &ctx);
+   shared_err = dma_resv_reserve_shared(src_bo->base.resv, 1);
+   ret = i915_deps_add_resv(&deps, dst_bo->base.resv, true, false, &ctx);
if (!ret)
-   ret = ttm_bo_wait_ctx(src_bo, &ctx);
+   ret = i915_deps_add_resv(&deps, src_bo->base.resv,
+!!shared_err, false, &ctx);
if (ret)
return ret;
 
+   dep_fence = i915_deps_to_fence(&deps, &ctx);
+   if (IS_ERR(dep_fence))
+   return PTR_ERR(dep_fence);
+
dst_rsgt = i915_ttm_resource_get_st(dst, dst_bo->resource);
copy_fence = __i915_ttm_move(src_bo, false, dst_bo->resource,
-dst_bo->ttm, dst_rsgt, allow_accel, NULL);
+dst_bo->ttm, dst_rsgt, allow_accel,
+dep_fence);
 
i915_refct_sgt_put(dst_rsgt);
-   if (IS_ERR(copy_fence))
-   return PTR_ERR(copy_fence);
+   if (IS_ERR_OR_NULL(copy_fence))
+   return PTR_ERR_OR_ZERO(copy_fence);
 
-   if (copy_fence) {
-   dma_fence_wait(copy_fence, false);
-   dma_fence_put(copy_fence);
-   }
+   dma_resv_add_excl_fence(dst_bo->base.resv, copy_fence);
+
+   /* If we failed to reserve a shared slot, add an exclusive fence */
+   if (shared_err)
+   dma_resv_add_excl_fence(src_bo->base.resv, copy_fence);
+   else
+   dma_resv_add_shared_fence(src_bo->base.resv, copy_fence);
+
+   dma_fence_put(copy_fence);
 
return 0;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
index 60d10ab55d1e..9aad84059d56 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -80,6 +80,7 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region 
*apply,
 
err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false);
GEM_WARN_ON(err);
+   ttm_bo_wait_ctx(backup_bo, &ctx);
 
obj->ttm.backup = backup;
return 0;
@@ -170,6 +171,7 @@ static int i915_ttm_restore(struct i915_gem_apply_to_region 
*apply,
err = i915_gem_obj_copy_ttm(obj, backup, pm_apply->allow_gpu,
false);
GEM_WARN_ON(err);
+   ttm_bo_wait_ctx(backup_bo, &ctx);
 
obj->ttm.backup = NULL;
err = 0;
-- 
2.31.1



[PATCH v2 5/6] drm/i915/ttm: Implement asynchronous TTM moves

2021-11-11 Thread Thomas Hellström
Don't wait synchronously while migrating, but rather make the GPU blit await
the dependencies and add a moving fence to the object.

This also enables asynchronous VRAM management in that on eviction,
rather than waiting for the moving fence to expire before freeing VRAM,
it is freed immediately and the fence is stored with the VRAM manager and
handed out to newly allocated objects to await before clears and swapins,
or for kernel objects before setting up gpu vmas or mapping.

To collect dependencies before migrating, add a set of utilities that
coalesce these to a single dma_fence.
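
The storage idea behind such a dependency collection can be pictured with a small user-space sketch. This is not the driver code: `toy_deps` and its helpers are hypothetical names, plain `const void *` pointers stand in for refcounted `struct dma_fence` pointers, and `malloc()` stands in for the kernel allocator. It only assumes what the struct documentation in the patch states: a single dependency is stored inline, and an array of pointers is allocated once more than one is collected.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DEPS_MIN_ALLOC_CHUNK 8U

/* Hypothetical stand-in for the collection; not the i915 structure. */
struct toy_deps {
	const void *single;   /* inline storage for the one-dependency case */
	const void **items;   /* points at &single or at a heap array */
	unsigned int num;     /* number of dependencies collected so far */
	unsigned int size;    /* capacity of items, in pointers */
};

static void toy_deps_init(struct toy_deps *d)
{
	d->single = NULL;
	d->items = &d->single;
	d->num = 0;
	d->size = 1;
}

static void toy_deps_fini(struct toy_deps *d)
{
	/* Only free storage that was actually heap allocated. */
	if (d->items != &d->single)
		free(d->items);
	toy_deps_init(d);
}

/* Returns 0 on success, -1 if growing the array failed. */
static int toy_deps_add(struct toy_deps *d, const void *dep)
{
	if (d->num == d->size) {
		unsigned int new_size = d->size * 2;
		const void **new_items;

		if (new_size < TOY_DEPS_MIN_ALLOC_CHUNK)
			new_size = TOY_DEPS_MIN_ALLOC_CHUNK;

		new_items = malloc(new_size * sizeof(*new_items));
		if (!new_items)
			return -1;

		/* Carry over what was collected, including the inline entry. */
		memcpy(new_items, d->items, d->num * sizeof(*new_items));
		if (d->items != &d->single)
			free(d->items);

		d->items = new_items;
		d->size = new_size;
	}

	d->items[d->num++] = dep;
	return 0;
}
```

With one dependency collected, `items` still points at `single`, so the common case needs no allocation. Note that the structure points into itself, so in this sketch it must not be copied by value.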

What is still missing for fully asynchronous operation is asynchronous vma
unbinding, which is still to be implemented.

This commit substantially reduces execution time in the gem_lmem_swapping
test.

v2:
- Make a couple of functions static.

Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c  |  10 +
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c | 329 +--
 drivers/gpu/drm/i915/gem/i915_gem_wait.c |   4 +-
 4 files changed, 318 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index a1df49378a0f..111a4282d779 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -326,6 +326,9 @@ static bool i915_ttm_eviction_valuable(struct 
ttm_buffer_object *bo,
 {
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
 
+   if (!obj)
+   return false;
+
/*
 * EXTERNAL objects should never be swapped out by TTM, instead we need
 * to handle that ourselves. TTM will already skip such objects for us,
@@ -448,6 +451,10 @@ static int i915_ttm_shrinker_release_pages(struct 
drm_i915_gem_object *obj,
if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
return 0;
 
+   ret = ttm_bo_wait_ctx(bo, );
+   if (ret)
+   return ret;
+
bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
ret = ttm_bo_validate(bo, , );
if (ret) {
@@ -549,6 +556,9 @@ static void i915_ttm_swap_notify(struct ttm_buffer_object 
*bo)
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
int ret = i915_ttm_move_notify(bo);
 
+   if (!obj)
+   return;
+
GEM_WARN_ON(ret);
GEM_WARN_ON(obj->ttm.cached_io_rsgt);
if (!ret && obj->mm.madv != I915_MADV_WILLNEED)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.h 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.h
index 82cdabb542be..9d698ad00853 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.h
@@ -37,7 +37,7 @@ void i915_ttm_bo_destroy(struct ttm_buffer_object *bo);
 static inline struct drm_i915_gem_object *
 i915_ttm_to_gem(struct ttm_buffer_object *bo)
 {
-   if (GEM_WARN_ON(bo->destroy != i915_ttm_bo_destroy))
+   if (bo->destroy != i915_ttm_bo_destroy)
return NULL;
 
return container_of(bo, struct drm_i915_gem_object, __do_not_access);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index f35b386c56ca..ae2c49fc3500 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -3,6 +3,8 @@
  * Copyright © 2021 Intel Corporation
  */
 
+#include 
+
 #include 
 
 #include "i915_drv.h"
@@ -41,6 +43,228 @@ void i915_ttm_migrate_set_failure_modes(bool gpu_migration,
 }
 #endif
 
+/**
+ * DOC: Set of utilities to dynamically collect dependencies and
+ * eventually coalesce them into a single fence which is fed into
+ * the migration code. That single fence is, in the case of dependencies
+ * from multiple contexts, a struct dma_fence_array, since the
+ * i915 request code can break that up and await the individual
+ * fences.
+ *
+ * While collecting the individual dependencies, we store the refcounted
+ * struct dma_fence pointers in a realloc-type-managed pointer array, since
+ * that can be easily fed into a dma_fence_array. Other options are
+ * available, like for example an xarray for similarity with drm/sched.
+ * Can be changed easily if needed.
+ *
+ * We might want to break this out into a separate file as a utility.
+ */
+
+#define I915_DEPS_MIN_ALLOC_CHUNK 8U
+
+/**
+ * struct i915_deps - Collect dependencies into a single dma-fence
+ * @single: Storage for pointer if the collection is a single fence.
+ * @fence: Allocated array of fence pointers if more than a single fence;
+ * otherwise points to the address of @single.
+ * @num_deps: Current number of dependency fences.
+ * @fences_size: Size of the @fences array in number of pointers.
+ * @gfp: Allocation mode.
+ */
+struct i915_deps {
+   struct dma_fence *single;
+   struct dma_fence **fences;
+   unsigned int num_deps;
+   unsigned int fences_size;
+   gfp_t gfp;
+};
+
+static void 

Re: [PATCH] drm: pre-fill getfb2 modifier array with INVALID

2021-11-11 Thread Ville Syrjälä
On Thu, Nov 11, 2021 at 10:10:54AM +, Simon Ser wrote:
> User-space shouldn't look up the modifier array when the modifier
> flag is missing, but at the moment no docs make this clear (working
> on it). Right now the modifier array is pre-filled with zeroes, aka.
> LINEAR. Instead, pre-fill with INVALID to avoid footguns.
> 
> This is a uAPI change, but OTOH any user-space which looks up the
> modifier array without checking the flag is broken already, so
> should be fine.
> 
> Signed-off-by: Simon Ser 
> Cc: Daniel Vetter 
> Cc: Pekka Paalanen 
> Cc: Daniel Stone 

Isn't this going to break the test where we pass the getfb2 result
back into addfb2?
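
For reference, the reader-side convention that the quoted description relies on can be sketched as below. This is a user-space toy, not kernel or libdrm code: `toy_fb_cmd2`, `toy_getfb2_prefill()` and `toy_plane_modifier()` are hypothetical names, and the `TOY_*` constants are local stand-ins whose values are assumed to mirror `DRM_MODE_FB_MODIFIERS`, `DRM_FORMAT_MOD_INVALID` and `DRM_FORMAT_MOD_LINEAR`. It does not model what addfb2 does with the array.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins; the values are assumptions, check the uAPI headers. */
#define TOY_FB_MODIFIERS       (1u << 1)
#define TOY_FORMAT_MOD_INVALID 0x00ffffffffffffffULL
#define TOY_FORMAT_MOD_LINEAR  0ULL

/* Hypothetical, trimmed-down version of the ioctl argument. */
struct toy_fb_cmd2 {
	uint32_t flags;
	uint64_t modifier[4];
};

/* Pre-fill every plane slot with INVALID, as the patch proposes. */
static void toy_getfb2_prefill(struct toy_fb_cmd2 *r)
{
	int i;

	for (i = 0; i < 4; i++)
		r->modifier[i] = TOY_FORMAT_MOD_INVALID;
}

/* A careful reader consults the array only when the flag is set. */
static uint64_t toy_plane_modifier(const struct toy_fb_cmd2 *r, int plane)
{
	if (!(r->flags & TOY_FB_MODIFIERS))
		return TOY_FORMAT_MOD_INVALID;

	return r->modifier[plane];
}
```

A reader written this way gets the same answer whether the array is pre-filled with zeroes or with INVALID; only code that skips the flag check observes the difference.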

> ---
>  drivers/gpu/drm/drm_framebuffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_framebuffer.c 
> b/drivers/gpu/drm/drm_framebuffer.c
> index 07f5abc875e9..f7041c0a0407 100644
> --- a/drivers/gpu/drm/drm_framebuffer.c
> +++ b/drivers/gpu/drm/drm_framebuffer.c
> @@ -601,7 +601,7 @@ int drm_mode_getfb2_ioctl(struct drm_device *dev,
>   r->handles[i] = 0;
>   r->pitches[i] = 0;
>   r->offsets[i] = 0;
> - r->modifier[i] = 0;
> + r->modifier[i] = DRM_FORMAT_MOD_INVALID;
>   }
>  
>   for (i = 0; i < fb->format->num_planes; i++) {
> -- 
> 2.33.1
> 

-- 
Ville Syrjälä
Intel


[PATCH v2 4/6] drm/i915/ttm: Break refcounting loops at device region unref time

2021-11-11 Thread Thomas Hellström
There is an interesting refcounting loop:
struct intel_memory_region has a struct ttm_resource_manager,
ttm_resource_manager->move may hold a reference to i915_request,
i915_request may hold a reference to intel_context,
intel_context may hold a reference to drm_i915_gem_object,
drm_i915_gem_object may hold a reference to intel_memory_region.

Break this loop when we drop the device reference count on the
region by putting the region move fence.

Also delay dropping the device reference count until all objects of
the region have been deleted, to avoid issues if proceeding with the
device takedown while the region is still present.
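
The loop and the way it is broken can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the driver code: `toy_region` and `toy_move_fence` are hypothetical types, plain integers stand in for krefs, and the whole chain request -> context -> object -> region is collapsed into one back-pointer from the fence to the region.

```c
#include <assert.h>
#include <stddef.h>

struct toy_region;

/* Stands in for the move fence and everything it keeps alive. */
struct toy_move_fence {
	int refcount;
	struct toy_region *region; /* indirect reference back to the region */
};

struct toy_region {
	int refcount;                /* starts at 1: the device's own reference */
	struct toy_move_fence *move; /* reference held via the resource manager */
};

static void toy_region_put(struct toy_region *r)
{
	r->refcount--;
}

static void toy_move_fence_put(struct toy_move_fence *f)
{
	/* The last put releases whatever the fence was pinning. */
	if (--f->refcount == 0 && f->region) {
		toy_region_put(f->region);
		f->region = NULL;
	}
}

/* Install a move fence: this closes the loop region -> fence -> region. */
static void toy_region_set_move(struct toy_region *r, struct toy_move_fence *f)
{
	f->refcount++;
	r->move = f;
	f->region = r;
	r->refcount++;
}

/* Break the loop explicitly instead of waiting for a final region put. */
static void toy_region_disable(struct toy_region *r)
{
	struct toy_move_fence *f = r->move;

	if (f) {
		r->move = NULL;
		toy_move_fence_put(f);
	}
}
```

Without the explicit disable step, the region count could never fall to zero through ordinary puts, because the fence is only released when the region is. After the disable step only the device's own reference remains, which is the condition the patch waits for.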

Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c |  1 +
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |  1 +
 drivers/gpu/drm/i915/intel_memory_region.c  |  5 +++-
 drivers/gpu/drm/i915/intel_memory_region.h  |  1 +
 drivers/gpu/drm/i915/intel_region_ttm.c | 28 +
 drivers/gpu/drm/i915/intel_region_ttm.h |  2 ++
 6 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 537a81445b90..a1df49378a0f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -1044,6 +1044,7 @@ int __i915_gem_ttm_object_init(struct intel_memory_region 
*mem,
 
 static const struct intel_memory_region_ops ttm_system_region_ops = {
.init_object = __i915_gem_ttm_object_init,
+   .disable = intel_region_ttm_disable,
 };
 
 struct intel_memory_region *
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index aec838ecb2ef..956916fd21f8 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -108,6 +108,7 @@ region_lmem_init(struct intel_memory_region *mem)
 static const struct intel_memory_region_ops intel_region_lmem_ops = {
.init = region_lmem_init,
.release = region_lmem_release,
+   .disable = intel_region_ttm_disable,
.init_object = __i915_gem_ttm_object_init,
 };
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c 
b/drivers/gpu/drm/i915/intel_memory_region.c
index e7f7e6627750..1f67d2b68c24 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -233,8 +233,11 @@ void intel_memory_regions_driver_release(struct 
drm_i915_private *i915)
struct intel_memory_region *region =
fetch_and_zero(&i915->mm.regions[i]);
 
-   if (region)
+   if (region) {
+   if (region->ops->disable)
+   region->ops->disable(region);
intel_memory_region_put(region);
+   }
}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h 
b/drivers/gpu/drm/i915/intel_memory_region.h
index 3feae3353d33..9bb77eacd206 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -52,6 +52,7 @@ struct intel_memory_region_ops {
 
int (*init)(struct intel_memory_region *mem);
void (*release)(struct intel_memory_region *mem);
+   void (*disable)(struct intel_memory_region *mem);
 
int (*init_object)(struct intel_memory_region *mem,
   struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c 
b/drivers/gpu/drm/i915/intel_region_ttm.c
index 2e901a27e259..4219d83a2b19 100644
--- a/drivers/gpu/drm/i915/intel_region_ttm.c
+++ b/drivers/gpu/drm/i915/intel_region_ttm.c
@@ -114,6 +114,34 @@ void intel_region_ttm_fini(struct intel_memory_region *mem)
mem->region_private = NULL;
 }
 
+/**
+ * intel_region_ttm_disable - A TTM region disable callback helper
+ * @mem: The memory region.
+ *
+ * A helper that ensures that nothing any longer references a region at
+ * device takedown. Breaks refcounting loops and waits for objects in the
+ * region to be deleted.
+ */
+void intel_region_ttm_disable(struct intel_memory_region *mem)
+{
+   struct ttm_resource_manager *man = mem->region_private;
+
+   /*
+* Put the region's move fences. This releases requests that
+* may hold on to contexts and vms that may hold on to buffer
+* objects that may have a refcount on the region. :/
+*/
+   if (man)
+   ttm_resource_manager_cleanup(man);
+
+   /* Flush objects that may just have been freed */
+   i915_gem_flush_free_objects(mem->i915);
+
+   /* Wait until the only region reference left is our own. */
+   while (kref_read(&mem->kref) > 1)
+   msleep(20);
+}
+
 /**
  * intel_region_ttm_resource_to_rsgt -
  * Convert an opaque TTM resource manager resource to a refcounted sg_table.
diff --git a/drivers/gpu/drm/i915/intel_region_ttm.h 
b/drivers/gpu/drm/i915/intel_region_ttm.h
index 7bbe2b46b504..197a8c179370 

[PATCH v2 3/6] drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function

2021-11-11 Thread Thomas Hellström
Move the i915_gem_obj_copy_ttm() function to i915_gem_ttm_move.c.
This will help keep a number of functions static when introducing
async moves.

Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c  | 47 ---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h  |  4 --
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c | 63 
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.h | 10 ++--
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c   |  1 +
 5 files changed, 56 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 68cfe6e9ceab..537a81445b90 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -1063,50 +1063,3 @@ i915_gem_ttm_system_setup(struct drm_i915_private *i915,
intel_memory_region_set_name(mr, "system-ttm");
return mr;
 }
-
-/**
- * i915_gem_obj_copy_ttm - Copy the contents of one ttm-based gem object to
- * another
- * @dst: The destination object
- * @src: The source object
- * @allow_accel: Allow using the blitter. Otherwise TTM memcpy is used.
- * @intr: Whether to perform waits interruptible:
- *
- * Note: The caller is responsible for assuring that the underlying
- * TTM objects are populated if needed and locked.
- *
- * Return: Zero on success. Negative error code on error. If @intr == true,
- * then it may return -ERESTARTSYS or -EINTR.
- */
-int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
- struct drm_i915_gem_object *src,
- bool allow_accel, bool intr)
-{
-   struct ttm_buffer_object *dst_bo = i915_gem_to_ttm(dst);
-   struct ttm_buffer_object *src_bo = i915_gem_to_ttm(src);
-   struct ttm_operation_ctx ctx = {
-   .interruptible = intr,
-   };
-   struct i915_refct_sgt *dst_rsgt;
-   int ret;
-
-   assert_object_held(dst);
-   assert_object_held(src);
-
-   /*
-* Sync for now. This will change with async moves.
-*/
-   ret = ttm_bo_wait_ctx(dst_bo, &ctx);
-   if (!ret)
-   ret = ttm_bo_wait_ctx(src_bo, &ctx);
-   if (ret)
-   return ret;
-
-   dst_rsgt = i915_ttm_resource_get_st(dst, dst_bo->resource);
-   __i915_ttm_move(src_bo, false, dst_bo->resource, dst_bo->ttm,
-   dst_rsgt, allow_accel);
-
-   i915_refct_sgt_put(dst_rsgt);
-
-   return 0;
-}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.h 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.h
index 074a7c08ff31..82cdabb542be 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.h
@@ -49,10 +49,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region 
*mem,
   resource_size_t page_size,
   unsigned int flags);
 
-int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
- struct drm_i915_gem_object *src,
- bool allow_accel, bool intr);
-
 /* Internal I915 TTM declarations and definitions below. */
 
 #define I915_PL_LMEM0 TTM_PL_PRIV
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index ef22d4ed66ad..f35b386c56ca 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -378,18 +378,10 @@ i915_ttm_memcpy_work_arm(struct i915_ttm_memcpy_work 
*work,
return &work->fence;
 }
 
-/**
- * __i915_ttm_move - helper to perform TTM moves or clears.
- * @bo: The source buffer object.
- * @clear: Whether this is a clear operation.
- * @dst_mem: The destination ttm resource.
- * @dst_ttm: The destination ttm page vector.
- * @dst_rsgt: The destination refcounted sg-list.
- * @allow_accel: Whether to allow acceleration.
- */
-void __i915_ttm_move(struct ttm_buffer_object *bo, bool clear,
-struct ttm_resource *dst_mem, struct ttm_tt *dst_ttm,
-struct i915_refct_sgt *dst_rsgt, bool allow_accel)
+static void __i915_ttm_move(struct ttm_buffer_object *bo, bool clear,
+   struct ttm_resource *dst_mem,
+   struct ttm_tt *dst_ttm,
+   struct i915_refct_sgt *dst_rsgt, bool allow_accel)
 {
struct i915_ttm_memcpy_work *copy_work = NULL;
struct i915_ttm_memcpy_arg _arg, *arg = &_arg;
@@ -521,3 +513,50 @@ int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
i915_ttm_adjust_gem_after_move(obj);
return 0;
 }
+
+/**
+ * i915_gem_obj_copy_ttm - Copy the contents of one ttm-based gem object to
+ * another
+ * @dst: The destination object
+ * @src: The source object
+ * @allow_accel: Allow using the blitter. Otherwise TTM memcpy is used.
+ * @intr: Whether to perform waits interruptible:
+ *
+ * Note: The caller is responsible for assuring that the underlying
+ * TTM objects are populated if 

[PATCH v2 2/6] drm/i915: Add support for asynchronous moving fence waiting

2021-11-11 Thread Thomas Hellström
From: Maarten Lankhorst 

For now, we will only allow async migration when TTM is used,
so the paths we care about are related to TTM.

The mmap path is handled by having the fence in ttm_bo->moving.
When pinning, the binding only becomes available after the moving
fence is signaled, and pinning a cpu map will only work after
the moving fence signals.

This should close all holes where userspace can read a buffer
before it's fully migrated.

v2:
- Fix a couple of SPARSE warnings

Co-developed-by: Thomas Hellström 
Signed-off-by: Thomas Hellström 
Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/display/intel_fbdev.c|  7 ++--
 drivers/gpu/drm/i915/display/intel_overlay.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  6 +++
 .../i915/gem/selftests/i915_gem_coherency.c   |  4 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c| 22 ++-
 drivers/gpu/drm/i915/i915_vma.c   | 39 ++-
 drivers/gpu/drm/i915/i915_vma.h   |  3 ++
 drivers/gpu/drm/i915/selftests/i915_vma.c |  4 +-
 8 files changed, 69 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
b/drivers/gpu/drm/i915/display/intel_fbdev.c
index adc3a81be9f7..5902ad0c2bd8 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -265,11 +265,12 @@ static int intelfb_create(struct drm_fb_helper *helper,
info->fix.smem_len = vma->node.size;
}
 
-   vaddr = i915_vma_pin_iomap(vma);
+   vaddr = i915_vma_pin_iomap_unlocked(vma);
if (IS_ERR(vaddr)) {
-   drm_err(&dev_priv->drm,
-   "Failed to remap framebuffer into virtual memory\n");
ret = PTR_ERR(vaddr);
+   if (ret != -EINTR && ret != -ERESTARTSYS)
+   drm_err(&dev_priv->drm,
+   "Failed to remap framebuffer into virtual 
memory\n");
goto out_unpin;
}
info->screen_base = vaddr;
diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c 
b/drivers/gpu/drm/i915/display/intel_overlay.c
index 7e3f5c6ca484..21593f3f2664 100644
--- a/drivers/gpu/drm/i915/display/intel_overlay.c
+++ b/drivers/gpu/drm/i915/display/intel_overlay.c
@@ -1357,7 +1357,7 @@ static int get_registers(struct intel_overlay *overlay, 
bool use_phys)
overlay->flip_addr = sg_dma_address(obj->mm.pages->sgl);
else
overlay->flip_addr = i915_ggtt_offset(vma);
-   overlay->regs = i915_vma_pin_iomap(vma);
+   overlay->regs = i915_vma_pin_iomap_unlocked(vma);
i915_vma_unpin(vma);
 
if (IS_ERR(overlay->regs)) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index c4f684b7cc51..49c6e55c68ce 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -418,6 +418,12 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object 
*obj,
}
 
if (!ptr) {
+   err = i915_gem_object_wait_moving_fence(obj, true);
+   if (err) {
+   ptr = ERR_PTR(err);
+   goto err_unpin;
+   }
+
if (GEM_WARN_ON(type == I915_MAP_WC &&
!static_cpu_has(X86_FEATURE_PAT)))
ptr = ERR_PTR(-ENODEV);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 13b088cc787e..067c512961ba 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -101,7 +101,7 @@ static int gtt_set(struct context *ctx, unsigned long 
offset, u32 v)
 
intel_gt_pm_get(vma->vm->gt);
 
-   map = i915_vma_pin_iomap(vma);
+   map = i915_vma_pin_iomap_unlocked(vma);
i915_vma_unpin(vma);
if (IS_ERR(map)) {
err = PTR_ERR(map);
@@ -134,7 +134,7 @@ static int gtt_get(struct context *ctx, unsigned long 
offset, u32 *v)
 
intel_gt_pm_get(vma->vm->gt);
 
-   map = i915_vma_pin_iomap(vma);
+   map = i915_vma_pin_iomap_unlocked(vma);
i915_vma_unpin(vma);
if (IS_ERR(map)) {
err = PTR_ERR(map);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 6d30cdfa80f3..5d54181c2145 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -125,12 +125,13 @@ static int check_partial_mapping(struct 
drm_i915_gem_object *obj,
n = page - view.partial.offset;
GEM_BUG_ON(n >= view.partial.size);
 
-   io = i915_vma_pin_iomap(vma);
+   io = i915_vma_pin_iomap_unlocked(vma);
i915_vma_unpin(vma);
if (IS_ERR(io)) {
-   pr_err("Failed to iomap partial view: offset=%lu; err=%d\n",

[PATCH v2 1/6] drm/i915: Add functions to set/get moving fence

2021-11-11 Thread Thomas Hellström
From: Maarten Lankhorst 

We want to get rid of i915_vma tracking to simplify the code and
lifetimes. Add a way to set, get and wait for the moving fence, in
preparation for removing the tracking.
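
The reference handling of the three helpers can be sketched in user space as follows. This is a simplified model, not the kernel code: `toy_fence` and `toy_obj` are hypothetical types, an integer replaces the fence refcount, and the actual blocking wait is reduced to a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dma_fence and the GEM object. */
struct toy_fence {
	int refcount;
	int error;
};

struct toy_obj {
	struct toy_fence *moving;
};

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

static void toy_fence_put(struct toy_fence *f)
{
	if (f)
		f->refcount--;
}

/* Hand out a new reference; the caller must put it when done. */
static struct toy_fence *toy_obj_get_moving(struct toy_obj *obj)
{
	return toy_fence_get(obj->moving);
}

/* Drop the reference to the old fence and take one on the new fence. */
static void toy_obj_set_moving(struct toy_obj *obj, struct toy_fence *fence)
{
	toy_fence_put(obj->moving);
	obj->moving = toy_fence_get(fence);
}

/* On success the fence is cleared; on error it is left in place. */
static int toy_obj_wait_moving(struct toy_obj *obj)
{
	struct toy_fence *fence = obj->moving;

	if (!fence)
		return 0;

	/* A real implementation would block here until the fence signals. */
	if (fence->error)
		return fence->error;

	obj->moving = NULL;
	toy_fence_put(fence);
	return 0;
}
```

The object owns exactly one reference to its moving fence at any time, so a later wait can clear the pointer and drop that reference once the fence has completed without error.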

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 37 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  9 ++
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 591ee3cb7275..ec4313836597 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -33,6 +33,7 @@
 #include "i915_gem_object.h"
 #include "i915_memcpy.h"
 #include "i915_trace.h"
+#include "i915_gem_ttm.h"
 
 static struct kmem_cache *slab_objects;
 
@@ -726,6 +727,42 @@ static const struct drm_gem_object_funcs 
i915_gem_object_funcs = {
.export = i915_gem_prime_export,
 };
 
+struct dma_fence *
+i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj)
+{
+   return dma_fence_get(i915_gem_to_ttm(obj)->moving);
+}
+
+void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
+ struct dma_fence *fence)
+{
+   dma_fence_put(i915_gem_to_ttm(obj)->moving);
+
+   i915_gem_to_ttm(obj)->moving = dma_fence_get(fence);
+}
+
+int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
+ bool intr)
+{
+   struct dma_fence *fence = i915_gem_to_ttm(obj)->moving;
+   int ret;
+
+   assert_object_held(obj);
+   if (!fence)
+   return 0;
+
+   ret = dma_fence_wait(fence, intr);
+   if (ret)
+   return ret;
+
+   if (fence->error)
+   return fence->error;
+
+   i915_gem_to_ttm(obj)->moving = NULL;
+   dma_fence_put(fence);
+   return 0;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/huge_gem_object.c"
 #include "selftests/huge_pages.c"
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 133963b46135..36bf3e2e602f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -517,6 +517,15 @@ i915_gem_object_finish_access(struct drm_i915_gem_object 
*obj)
i915_gem_object_unpin_pages(obj);
 }
 
+struct dma_fence *
+i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj);
+
+void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
+ struct dma_fence *fence);
+
+int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
+ bool intr);
+
 void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 unsigned int cache_level);
 bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj);
-- 
2.31.1



[PATCH v2 0/6] drm/i915/ttm: Async migration

2021-11-11 Thread Thomas Hellström
This patch series deals with async migration and async vram management.
It still leaves out an important part, async unbinding, which will
reduce latency further, at least when trying to migrate already active
objects.

Patches 1/6 and 2/6 deal with accessing and waiting for the TTM moving
fence from i915 GEM.
Patch 3 is pure code reorganization, no functional change.
Patch 4 breaks a refcounting loop involving the TTM moving fence.
Patch 5 uses TTM to implement the ttm move() callback asynchronously. It also
introduces a utility to collect dependencies and turn them into a
single dma_fence, which is needed for the intel_migrate code.
This also affects the gem object migrate code.
Patch 6 makes the object copy utility async as well, mainly for future
users since the only current user, suspend backup and restore, typically
will want to sync anyway.

v2:
- Fix a couple of SPARSE warnings.

Maarten Lankhorst (2):
  drm/i915: Add functions to set/get moving fence
  drm/i915: Add support for asynchronous moving fence waiting

Thomas Hellström (4):
  drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function
  drm/i915/ttm: Break refcounting loops at device region unref time
  drm/i915/ttm: Implement asynchronous TTM moves
  drm/i915/ttm: Update i915_gem_obj_copy_ttm() to be asynchronous

 drivers/gpu/drm/i915/display/intel_fbdev.c|   7 +-
 drivers/gpu/drm/i915/display/intel_overlay.c  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  37 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   9 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  58 +--
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h   |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 396 --
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.h  |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c|   3 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c  |   4 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |   4 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  22 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |   1 +
 drivers/gpu/drm/i915/i915_vma.c   |  39 +-
 drivers/gpu/drm/i915/i915_vma.h   |   3 +
 drivers/gpu/drm/i915/intel_memory_region.c|   5 +-
 drivers/gpu/drm/i915/intel_memory_region.h|   1 +
 drivers/gpu/drm/i915/intel_region_ttm.c   |  28 ++
 drivers/gpu/drm/i915/intel_region_ttm.h   |   2 +
 drivers/gpu/drm/i915/selftests/i915_vma.c |   4 +-
 21 files changed, 538 insertions(+), 109 deletions(-)

-- 
2.31.1



Re: [Intel-gfx] [PATCH 1/1] drm/i915/rpm: Enable runtime pm autosuspend by default

2021-11-11 Thread Ville Syrjälä
On Wed, Nov 10, 2021 at 05:24:22PM -0500, Rodrigo Vivi wrote:
> On Wed, Nov 10, 2021 at 01:46:46PM +0200, Ville Syrjälä wrote:
> > On Wed, Nov 10, 2021 at 10:59:26AM +0530, Tilak Tangudu wrote:
> > > Enable runtime pm autosuspend by default for gen12 and
> > > later versions.
> > > 
> > > Signed-off-by: Tilak Tangudu 
> > > ---
> > >  drivers/gpu/drm/i915/intel_runtime_pm.c | 4 
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > index eaf7688f517d..ef75f24288ef 100644
> > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > @@ -600,6 +600,10 @@ void intel_runtime_pm_enable(struct intel_runtime_pm *rpm)
> > >   pm_runtime_use_autosuspend(kdev);
> > >   }
> > >  
> > > + /* XXX: Enable by default only for newer platforms for now */
> > > + if (GRAPHICS_VER(i915) >= 12)
> > > + pm_runtime_allow(kdev);
> > 
> > If we change some default then we should just do it across the board.
> > There is nothing special about tgl+.
> 
> Nothing special with tgl and newer platforms indeed. This is why we
> have the XXX message here.
> 
> The problem in the last attempt was with the gen9 platforms.

What problem was that?

> Apparently there is something special there, and I didn't want to block the
> progress while we cannot get to the gen9 bugs.
> 
> > 
> > > +
> > >   /*
> > >* The core calls the driver load handler with an RPM reference held.
> > >* We drop that here and will reacquire it during unloading in
> > > -- 
> > > 2.25.1
> > 
> > -- 
> > Ville Syrjälä
> > Intel

-- 
Ville Syrjälä
Intel


Re: [PATCH v3] fbdev: Prevent probing generic drivers if a FB is already registered

2021-11-11 Thread Ilya Trukhanov
On Thu, Nov 11, 2021 at 12:57:57PM +0100, Javier Martinez Canillas wrote:
> The efifb and simplefb drivers just render to a pre-allocated frame buffer
> and rely on the display hardware being initialized before the kernel boots.
> 
> But if another driver already probed correctly and registered a fbdev, the
> generic drivers shouldn't be probed since an actual driver for the display
> hardware is already present.
> 
> This is more likely to occur after commit d391c5827107 ("drivers/firmware:
> move x86 Generic System Framebuffers support") since the "efi-framebuffer"
> and "simple-framebuffer" platform devices are registered at a later time.
> 
> Link: https://lore.kernel.org/r/2020200253.rfudkt3edbd3nsyj@lahvuun/
> Fixes: d391c5827107 ("drivers/firmware: move x86 Generic System Framebuffers support")
> Reported-by: Ilya Trukhanov 
> Cc:  # 5.15.x
> Signed-off-by: Javier Martinez Canillas 
> Reviewed-by: Daniel Vetter 
> ---
> 
> Changes in v3:
> - Cc  since a Fixes: tag is not enough (gregkh).
> 
> Changes in v2:
> - Add a Link: tag with a reference to the bug report (Thorsten Leemhuis).
> - Add a comment explaining why the probe fails earlier (Daniel Vetter).
> - Add a Fixes: tag for stable to pick the fix (Daniel Vetter).
> - Add Daniel Vetter's Reviewed-by: tag.
> - Improve the commit message and mention the culprit commit
> 
>  drivers/video/fbdev/efifb.c| 11 +++
>  drivers/video/fbdev/simplefb.c | 11 +++
>  2 files changed, 22 insertions(+)
> 
> diff --git drivers/video/fbdev/efifb.c drivers/video/fbdev/efifb.c
> index edca3703b964..ea42ba6445b2 100644
> --- drivers/video/fbdev/efifb.c
> +++ drivers/video/fbdev/efifb.c
> @@ -351,6 +351,17 @@ static int efifb_probe(struct platform_device *dev)
>   char *option = NULL;
>   efi_memory_desc_t md;
>  
> + /*
> +  * Generic drivers must not be registered if a framebuffer exists.
> +  * If a native driver was probed, the display hardware was already
> +  * taken and attempting to use the system framebuffer is dangerous.
> +  */
> + if (num_registered_fb > 0) {
> + dev_err(&dev->dev,
> + "efifb: a framebuffer is already registered\n");
> + return -EINVAL;
> + }
> +
>   if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled)
>   return -ENODEV;
>  
> diff --git drivers/video/fbdev/simplefb.c drivers/video/fbdev/simplefb.c
> index 62f0ded70681..b63074fd892e 100644
> --- drivers/video/fbdev/simplefb.c
> +++ drivers/video/fbdev/simplefb.c
> @@ -407,6 +407,17 @@ static int simplefb_probe(struct platform_device *pdev)
>   struct simplefb_par *par;
>   struct resource *mem;
>  
> + /*
> +  * Generic drivers must not be registered if a framebuffer exists.
> +  * If a native driver was probed, the display hardware was already
> +  * taken and attempting to use the system framebuffer is dangerous.
> +  */
> + if (num_registered_fb > 0) {
> + dev_err(&pdev->dev,
> + "simplefb: a framebuffer is already registered\n");
> + return -EINVAL;
> + }
> +
>   if (fb_get_options("simplefb", NULL))
>   return -ENODEV;
>  
> -- 
> 2.33.1
> 

This patch fixes the suspend issue I was experiencing. Thank you.

Tested-by: Ilya Trukhanov  


[PATCH v3] fbdev: Prevent probing generic drivers if a FB is already registered

2021-11-11 Thread Javier Martinez Canillas
The efifb and simplefb drivers just render to a pre-allocated frame buffer
and rely on the display hardware being initialized before the kernel boots.

But if another driver already probed correctly and registered a fbdev, the
generic drivers shouldn't be probed since an actual driver for the display
hardware is already present.

This is more likely to occur after commit d391c5827107 ("drivers/firmware:
move x86 Generic System Framebuffers support") since the "efi-framebuffer"
and "simple-framebuffer" platform devices are registered at a later time.

Link: https://lore.kernel.org/r/2020200253.rfudkt3edbd3nsyj@lahvuun/
Fixes: d391c5827107 ("drivers/firmware: move x86 Generic System Framebuffers support")
Reported-by: Ilya Trukhanov 
Cc:  # 5.15.x
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Daniel Vetter 
---

Changes in v3:
- Cc  since a Fixes: tag is not enough (gregkh).

Changes in v2:
- Add a Link: tag with a reference to the bug report (Thorsten Leemhuis).
- Add a comment explaining why the probe fails earlier (Daniel Vetter).
- Add a Fixes: tag for stable to pick the fix (Daniel Vetter).
- Add Daniel Vetter's Reviewed-by: tag.
- Improve the commit message and mention the culprit commit

 drivers/video/fbdev/efifb.c| 11 +++
 drivers/video/fbdev/simplefb.c | 11 +++
 2 files changed, 22 insertions(+)

diff --git drivers/video/fbdev/efifb.c drivers/video/fbdev/efifb.c
index edca3703b964..ea42ba6445b2 100644
--- drivers/video/fbdev/efifb.c
+++ drivers/video/fbdev/efifb.c
@@ -351,6 +351,17 @@ static int efifb_probe(struct platform_device *dev)
char *option = NULL;
efi_memory_desc_t md;
 
+   /*
+* Generic drivers must not be registered if a framebuffer exists.
+* If a native driver was probed, the display hardware was already
+* taken and attempting to use the system framebuffer is dangerous.
+*/
+   if (num_registered_fb > 0) {
+   dev_err(&dev->dev,
+   "efifb: a framebuffer is already registered\n");
+   return -EINVAL;
+   }
+
if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled)
return -ENODEV;
 
diff --git drivers/video/fbdev/simplefb.c drivers/video/fbdev/simplefb.c
index 62f0ded70681..b63074fd892e 100644
--- drivers/video/fbdev/simplefb.c
+++ drivers/video/fbdev/simplefb.c
@@ -407,6 +407,17 @@ static int simplefb_probe(struct platform_device *pdev)
struct simplefb_par *par;
struct resource *mem;
 
+   /*
+* Generic drivers must not be registered if a framebuffer exists.
+* If a native driver was probed, the display hardware was already
+* taken and attempting to use the system framebuffer is dangerous.
+*/
+   if (num_registered_fb > 0) {
+   dev_err(&pdev->dev,
+   "simplefb: a framebuffer is already registered\n");
+   return -EINVAL;
+   }
+
if (fb_get_options("simplefb", NULL))
return -ENODEV;
 
-- 
2.33.1
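
The ordering problem the patch guards against can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: `toy_num_registered_fb`, `toy_register_fb()` and `toy_generic_probe()` are hypothetical stand-ins for `num_registered_fb`, framebuffer registration and the efifb/simplefb probe functions, and none of the real platform-device machinery is modelled.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the fbdev core's num_registered_fb counter. */
static int toy_num_registered_fb;

/* Stand-in for a driver successfully registering a framebuffer. */
static int toy_register_fb(void)
{
	toy_num_registered_fb++;
	return 0;
}

/*
 * Stand-in for a generic (efifb/simplefb style) probe. It bails out first
 * if any framebuffer is already registered, then applies its usual checks.
 */
static int toy_generic_probe(int firmware_fb_available)
{
	assert(toy_num_registered_fb >= 0);

	if (toy_num_registered_fb > 0)
		return -EINVAL;

	if (!firmware_fb_available)
		return -ENODEV;

	return toy_register_fb();
}
```

When the generic probe runs first it registers normally. When a native driver has registered a framebuffer beforehand, the same probe returns -EINVAL without touching the firmware framebuffer.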



Re: [PATCH] drm: document DRM_IOCTL_MODE_GETFB2

2021-11-11 Thread Daniel Stone
On Tue, 9 Nov 2021 at 08:56, Simon Ser  wrote:
> There are a few details specific to the GETFB2 IOCTL.
>
> It's not immediately clear how user-space should check for the
> number of planes. Suggest using the pitches field.

In fairness it is perfectly clear, it's just that I never considered
calling it without master/admin because why would you ever do that?
Anyway, the docs look correct and the more the better for sure, so
this is:
Acked-by: Daniel Stone 
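
The suggestion to use the pitches field could look roughly like the sketch below. `struct toy_fb_cmd2` is a hypothetical, cut-down mirror of the handle, pitch and offset arrays of `struct drm_mode_fb_cmd2`, and `toy_fb_num_planes()` is an illustrative helper, not a libdrm or kernel function. It assumes unused planes report a pitch of zero and deliberately ignores the handles, which, as the reply above hints, a caller without master/admin may not get filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PLANES 4

/* Hypothetical subset of struct drm_mode_fb_cmd2 (per-plane arrays only). */
struct toy_fb_cmd2 {
	uint32_t handles[TOY_MAX_PLANES];
	uint32_t pitches[TOY_MAX_PLANES];
	uint32_t offsets[TOY_MAX_PLANES];
};

/*
 * Count planes by walking the pitches array until the first zero entry.
 * Handles are not consulted, so the result does not depend on whether
 * the caller was allowed to see them.
 */
static unsigned int toy_fb_num_planes(const struct toy_fb_cmd2 *fb)
{
	unsigned int n = 0;

	assert(fb != NULL);
	while (n < TOY_MAX_PLANES && fb->pitches[n] != 0)
		n++;

	return n;
}
```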


Re: [PATCH] drm: pre-fill getfb2 modifier array with INVALID

2021-11-11 Thread Daniel Stone
On Thu, 11 Nov 2021 at 10:11, Simon Ser  wrote:
> User-space shouldn't look up the modifier array when the modifier
> flag is missing, but at the moment no docs make this clear (working
> on it). Right now the modifier array is pre-filled with zeroes, aka.
> LINEAR. Instead, pre-fill with INVALID to avoid footguns.
>
> This is a uAPI change, but OTOH any user-space which looks up the
> modifier array without checking the flag is broken already, so
> should be fine.

I don't know of any userspace which this would break.

Acked-by: Daniel Stone 
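
A rough sketch of the user-space side of the rule discussed above, i.e. only trusting the modifier array when the modifier flag is set. The `TOY_*` constants are local mirrors of `DRM_MODE_FB_MODIFIERS`, `DRM_FORMAT_MOD_LINEAR` and `DRM_FORMAT_MOD_INVALID`; `struct toy_fb_mods` and `toy_fb_plane_modifier()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the uAPI values, redefined so the sketch is standalone. */
#define TOY_FB_MODIFIERS (1u << 1)              /* DRM_MODE_FB_MODIFIERS */
#define TOY_MOD_LINEAR   0ULL                   /* DRM_FORMAT_MOD_LINEAR */
#define TOY_MOD_INVALID  0x00ffffffffffffffULL  /* DRM_FORMAT_MOD_INVALID */

/* Hypothetical subset of struct drm_mode_fb_cmd2. */
struct toy_fb_mods {
	uint32_t flags;
	uint64_t modifier[4];
};

/*
 * Return the modifier for a plane, or INVALID when the framebuffer does
 * not carry the modifier flag and the array contents are meaningless.
 */
static uint64_t toy_fb_plane_modifier(const struct toy_fb_mods *fb,
				      unsigned int plane)
{
	assert(fb != NULL && plane < 4);

	if (!(fb->flags & TOY_FB_MODIFIERS))
		return TOY_MOD_INVALID;

	return fb->modifier[plane];
}
```

With a check like this, a zero-filled array without the flag is never misread as LINEAR, which is the footgun the patch removes on the kernel side by pre-filling with INVALID.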


Re: [PATCH v2] fbdev: Prevent probing generic drivers if a FB is already registered

2021-11-11 Thread Greg Kroah-Hartman
On Thu, Nov 11, 2021 at 12:11:20PM +0100, Javier Martinez Canillas wrote:
> The efifb and simplefb drivers just render to a pre-allocated frame buffer
> and rely on the display hardware being initialized before the kernel boots.
> 
> But if another driver already probed correctly and registered a fbdev, the
> generic drivers shouldn't be probed since an actual driver for the display
> hardware is already present.
> 
> This is more likely to occur after commit d391c5827107 ("drivers/firmware:
> move x86 Generic System Framebuffers support") since the "efi-framebuffer"
> and "simple-framebuffer" platform devices are registered at a later time.
> 
> Link: https://lore.kernel.org/r/2020200253.rfudkt3edbd3nsyj@lahvuun/
> Fixes: d391c5827107 ("drivers/firmware: move x86 Generic System Framebuffers support")
> Reported-by: Ilya Trukhanov 
> Signed-off-by: Javier Martinez Canillas 
> Reviewed-by: Daniel Vetter 
> ---
> 
> Changes in v2:
> - Add a Link: tag with a reference to the bug report (Thorsten Leemhuis).
> - Add a comment explaining why the probe fails earlier (Daniel Vetter).
> - Add a Fixes: tag for stable to pick the fix (Daniel Vetter).

That does not mean that it will make it into the stable tree.  Please
read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

thanks,

greg k-h

