Re: [PATCH 5/6] drm/rcar_du: changes to rcar-du driver resulting from drm_writeback_connector structure changes

2022-02-21 Thread Laurent Pinchart
Hi Dmitry,

On Tue, Feb 22, 2022 at 06:32:50AM +0300, Dmitry Baryshkov wrote:
> On Thu, 10 Feb 2022 at 07:59, Laurent Pinchart wrote:
> > On Wed, Feb 09, 2022 at 05:40:29PM -0800, Abhinav Kumar wrote:
> > > Hi Laurent
> > >
> > > Gentle reminder on this.
> >
> > I won't have time before next week I'm afraid.
> 
> Laurent, another gentle ping.

I'm really late on this so I probably deserve a bit of a rougher ping,
but thanks for being gentle :-)

> > > On 2/6/2022 11:20 PM, Abhinav Kumar wrote:
> > > > On 2/6/2022 3:32 PM, Dmitry Baryshkov wrote:
> > > >> On Wed, 2 Feb 2022 at 16:26, Laurent Pinchart wrote:
> > > >>> On Wed, Feb 02, 2022 at 03:15:03PM +0200, Jani Nikula wrote:
> > > >>>> On Wed, 02 Feb 2022, Laurent Pinchart wrote:
> > > >>>>> On Wed, Feb 02, 2022 at 02:24:28PM +0530, Kandpal Suraj wrote:
> > > >>>>>> Changing rcar_du driver to accommodate the change of
> > > >>>>>> drm_writeback_connector.base and drm_writeback_connector.encoder
> > > >>>>>> to a pointer the reason for which is explained in the
> > > >>>>>> Patch(drm: add writeback pointers to drm_connector).
> > > >>>>>>
> > > >>>>>> Signed-off-by: Kandpal Suraj 
> > > >>>>>> ---
> > > >>>>>>   drivers/gpu/drm/rcar-du/rcar_du_crtc.h      | 2 ++
> > > >>>>>>   drivers/gpu/drm/rcar-du/rcar_du_writeback.c | 8 +++++---
> > > >>>>>>   2 files changed, 7 insertions(+), 3 deletions(-)
> > > >>>>>>
> > > >>>>>> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > > >>>>>> index 66e8839db708..68f387a04502 100644
> > > >>>>>> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > > >>>>>> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > > >>>>>> @@ -72,6 +72,8 @@ struct rcar_du_crtc {
> > > >>>>>> const char *const *sources;
> > > >>>>>> unsigned int sources_count;
> > > >>>>>>
> > > >>>>>> +  struct drm_connector connector;
> > > >>>>>> +  struct drm_encoder encoder;
> > > >>>>>
> > > >>>>> Those fields are, at best, poorly named. Furthermore, there's no need in
> > > >>>>> this driver or in other drivers using drm_writeback_connector to create
> > > >>>>> an encoder or connector manually. Let's not pollute all drivers because
> > > >>>>> i915 doesn't have its abstractions right.
> > > >>>>
> > > >>>> i915 uses the quite common model for struct inheritance:
> > > >>>>
> > > >>>>       struct intel_connector {
> > > >>>>               struct drm_connector base;
> > > >>>>               /* ... */
> > > >>>>       }
> > > >>>>
> > > >>>> Same with at least amd, ast, fsl-dcu, hisilicon, mga200, msm, nouveau,
> > > >>>> radeon, tilcdc, and vboxvideo.
> > > >>>>
> > > >>>> We could argue about the relative merits of that abstraction, but I
> > > >>>> think the bottom line is that it's popular and the drivers using it are
> > > >>>> not going to be persuaded to move away from it.
> > > >>>
> > > >>> Nobody said inheritance is bad.
> > > >>>
> > > >>>> It's no coincidence that the drivers who've implemented writeback so far
> > > >>>> (komeda, mali, rcar-du, vc4, and vkms) do not use the abstraction,
> > > >>>> because the drm_writeback_connector midlayer does, forcing the issue.
> > > >>>
> > > >>> Are you sure it's not a coincidence ? :-)
> > > >>>
> > > >>> The encoder and especially connector created by drm_writeback_connector
> > > >>> are there only because KMS requires a drm_encoder and a drm_connector to
> > > >>> be exposed to userspace (and I could argue that using a connector for
> > > >>> writeback is a hack, but that won't change). The connector is "virtual",
> > > >>> I still fail to see why i915 or any other driver would need to wrap it
> > > >>> into something else. The whole point of the drm_writeback_connector
> > > >>> abstraction is that drivers do not have to manage the writeback
> > > >>> drm_connector manually, they shouldn't touch it at all.
> > > >>
> > > >> Laurent, I wanted to shift a bit from the question of drm_connector to
> > > >> the question of drm_encoder being embedded in the 
> > > >> drm_writeback_connector.
> > > >> In case of the msm driver the drm_encoder is not a lightweight entity,
> > > >> but a full-featured driver part. Significant part of it can be shared
> > > >> with the writeback implementation, if we allow using a pointer to the
> > > >> external drm_encoder with the drm_writeback_connector.
> > > >> Does the following patch set stand a chance to receive your ack?
> > > >>   - Switch drm_writeback_connector to point to drm_encoder rather than
> > > >> embedding it?
> > > >>   - Create drm_encoder for the drm_writeback_connector when one is not
> > > >> specified, so the current drivers can be left unchanged.
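
[Editorial sketch] The two-step proposal quoted above (hold a pointer to the encoder, and create one internally when the driver supplies none) can be illustrated in plain userspace C. This is a minimal sketch under stated assumptions, not the DRM API: struct sketch_encoder, struct sketch_writeback_connector and sketch_writeback_init() are hypothetical names invented for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real DRM structures. */
struct sketch_encoder {
	const char *name;
};

struct sketch_writeback_connector {
	/* Pointer instead of an embedded struct, as proposed above. */
	struct sketch_encoder *encoder;
	/* Fallback storage, used only when the driver passes no encoder. */
	struct sketch_encoder internal_encoder;
};

/*
 * Attach the driver's own encoder when one is given; otherwise fall back
 * to the internal one so that existing callers need no changes.
 */
static void sketch_writeback_init(struct sketch_writeback_connector *wb,
				  struct sketch_encoder *driver_encoder)
{
	assert(wb != NULL);

	if (driver_encoder) {
		wb->encoder = driver_encoder;
		return;
	}

	wb->internal_encoder.name = "writeback-internal";
	wb->encoder = &wb->internal_encoder;
}
```

A driver with a full-featured encoder would pass it in, while a driver that passes NULL keeps today's behaviour of an encoder owned by the writeback connector.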

The situation is a bit different for the encoder indeed.

The encoder concept is loosely defined nowadays, with more and more of
the "real" encoders being implemented as a drm_bridge. That's what I
usually recommend when reviewing new 

Re: [PATCH 1/3] drm/edid: parse multiple CEA extension block

2022-02-21 Thread Ville Syrjälä
On Tue, Feb 22, 2022 at 02:38:17PM +0800, Lee Shawn C wrote:
> Try to find and parse more CEA ext blocks if edid->extensions
> is greater than one.
> 
> Cc: Jani Nikula 
> Cc: Ville Syrjala 
> Cc: Ankit Nautiyal 
> Signed-off-by: Lee Shawn C 
> ---
>  drivers/gpu/drm/drm_edid.c | 75 +++---
>  1 file changed, 45 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 12893e7be89b..3d5dbbeca7f9 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -4313,43 +4313,58 @@ add_cea_modes(struct drm_connector *connector, struct 
> edid *edid)
>   const u8 *cea = drm_find_cea_extension(edid);
>   const u8 *db, *hdmi = NULL, *video = NULL;
>   u8 dbl, hdmi_len, video_len = 0;
> - int modes = 0;
> + int modes = 0, j;
>  
> - if (cea && cea_revision(cea) >= 3) {
> - int i, start, end;
> + if (!cea)
> + return 0;
>  
> - if (cea_db_offsets(cea, &start, &end))
> - return 0;
> + for (j = (cea - (u8 *)edid) / EDID_LENGTH; j <= edid->extensions;) {

That looks rather illegible. I think we want a
drm_find_cea_extension(const struct edid *edid, int *ext_index)
and then just loop until it stops giving us stuff.

There are also several other callers of drm_find_cea_extension().
Why don't they require the same treatment?
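
[Editorial sketch] The "loop until it stops giving us stuff" shape suggested above can be shown in userspace C. This is only an illustration under stated assumptions: sketch_find_ext() and sketch_count_cea_blocks() are hypothetical names, the EDID is treated as a flat byte buffer of 128-byte blocks with the extension count at byte 126 of the base block, and 0x02 is the CEA extension tag. It is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EDID_LENGTH      128   /* size of one EDID block */
#define SKETCH_EXT_COUNT_OFFSET 126   /* extension count in the base block */
#define SKETCH_CEA_EXT          0x02  /* tag in byte 0 of a CEA extension */

/*
 * Return the next extension block with tag ext_id at or after *ext_index,
 * and advance *ext_index past it. Return NULL when none is left.
 */
static const uint8_t *sketch_find_ext(const uint8_t *edid, uint8_t ext_id,
				      int *ext_index)
{
	int count, i;

	assert(ext_index != NULL);
	if (!edid)
		return NULL;

	count = edid[SKETCH_EXT_COUNT_OFFSET];
	for (i = *ext_index; i < count; i++) {
		const uint8_t *ext = edid + SKETCH_EDID_LENGTH * (i + 1);

		if (ext[0] == ext_id) {
			*ext_index = i + 1;
			return ext;
		}
	}

	return NULL;
}

/* Caller side: keep asking until the finder returns NULL. */
static int sketch_count_cea_blocks(const uint8_t *edid)
{
	int ext_index = 0;
	int found = 0;

	while (sketch_find_ext(edid, SKETCH_CEA_EXT, &ext_index))
		found++;

	return found;
}
```

With the cursor kept by the caller, the loop needs no pointer arithmetic on the block address to work out where it is.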

> + if (cea && cea_revision(cea) >= 3) {
> + int i, start, end;
>  
> - for_each_cea_db(cea, i, start, end) {
> - db = &cea[i];
> - dbl = cea_db_payload_len(db);
> + if (cea_db_offsets(cea, &start, &end))
> + continue;
>  
> - if (cea_db_tag(db) == VIDEO_BLOCK) {
> - video = db + 1;
> - video_len = dbl;
> - modes += do_cea_modes(connector, video, dbl);
> - } else if (cea_db_is_hdmi_vsdb(db)) {
> - hdmi = db;
> - hdmi_len = dbl;
> - } else if (cea_db_is_y420vdb(db)) {
> - const u8 *vdb420 = &db[2];
> -
> - /* Add 4:2:0(only) modes present in EDID */
> - modes += do_y420vdb_modes(connector,
> -   vdb420,
> -   dbl - 1);
> + for_each_cea_db(cea, i, start, end) {
> + db = &cea[i];
> + dbl = cea_db_payload_len(db);
> +
> + if (cea_db_tag(db) == VIDEO_BLOCK) {
> + video = db + 1;
> + video_len = dbl;
> + modes += do_cea_modes(connector, video, 
> dbl);
> + } else if (cea_db_is_hdmi_vsdb(db)) {
> + hdmi = db;
> + hdmi_len = dbl;
> + } else if (cea_db_is_y420vdb(db)) {
> + const u8 *vdb420 = &db[2];
> +
> + /* Add 4:2:0(only) modes present in 
> EDID */
> + modes += do_y420vdb_modes(connector,
> +   vdb420,
> +   dbl - 1);
> + }
>   }
>   }
> - }
>  
> - /*
> -  * We parse the HDMI VSDB after having added the cea modes as we will
> -  * be patching their flags when the sink supports stereo 3D.
> -  */
> - if (hdmi)
> - modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, video,
> - video_len);
> + /*
> +  * We parse the HDMI VSDB after having added the cea modes as 
> we will
> +  * be patching their flags when the sink supports stereo 3D.
> +  */
> + if (hdmi) {
> + modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, 
> video,
> + video_len);
> + hdmi  = NULL;
> + video = NULL;
> + hdmi_len = 0;
> + video_len = 0;
> + }
> +
> + /* move to next CEA extension block */
> + cea = drm_find_edid_extension(edid, CEA_EXT, &j);
> + if (!cea)
> + break;
> + }
>  
>   return modes;
>  }
> -- 
> 2.17.1

-- 
Ville Syrjälä
Intel


Re: [Intel-gfx] [PATCH v3 08/11] drm/i915: Separate wakeref tracking

2022-02-21 Thread Ville Syrjälä
On Tue, Feb 22, 2022 at 12:25:39AM +0100, Andrzej Hajda wrote:
> -static noinline depot_stack_handle_t
> +static intel_wakeref_t
>  track_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
>  {
> - depot_stack_handle_t stack, *stacks;
> - unsigned long flags;
> -
> - if (rpm->no_wakeref_tracking)
> - return -1;
> -
> - stack = __save_depot_stack();
> - if (!stack)
> + if (!rpm->available)
>   return -1;

Still not the same.

-- 
Ville Syrjälä
Intel


Re: [PATCH v5 0/5] drm: exynos: dsi: Convert drm bridge

2022-02-21 Thread Jagan Teki
On Wed, Feb 2, 2022 at 9:54 PM Jagan Teki  wrote:
>
> Hi Marek,
>
> On Fri, Jan 21, 2022 at 6:14 PM Marek Szyprowski
>  wrote:
> >
> > Hi Jagan,
> >
> > On 21.01.2022 12:40, Jagan Teki wrote:
> > > On Fri, Jan 21, 2022 at 5:06 PM Marek Szyprowski
> > >  wrote:
> > >> On 17.01.2022 09:42, Jagan Teki wrote:
> > >>> Updated series about drm bridge conversion of exynos dsi.
> > >>>
> > >>> Previous version can be accessible, here [1].
> > >>>
> > >>> Patch 1: connector reset
> > >>>
> > >>> Patch 2: panel_bridge API
> > >>>
> > >>> Patch 3: bridge conversion
> > >>>
> > >>> Patch 4: atomic functions
> > >>>
> > >>> Patch 5: DSI init in pre_enable
> > >>>
> > >>> Apply below patches to test on Exynos DSI:
> > >>> - 
> > >>> https://protect2.fireeye.com/v1/url?k=53bdf119-0c26c815-53bc7a56-000babff3563-792dc1a6b54db43e=1=9a4ea3ad-9e7d-443d-ad21-ce694a7cd352=https%3A%2F%2Fpatchwork.amarulasolutions.com%2Fpatch%2F1825%2F
> > >>> - 
> > >>> https://protect2.fireeye.com/v1/url?k=cb269ea3-94bda7af-cb2715ec-000babff3563-e6f545b4a32558ba=1=9a4ea3ad-9e7d-443d-ad21-ce694a7cd352=https%3A%2F%2Fpatchwork.amarulasolutions.com%2Fpatch%2F1838%2F
> > >>>
> > >>> [1] 
> > >>> https://protect2.fireeye.com/v1/url?k=ee1dae12-b186971e-ee1c255d-000babff3563-83eaf8e86e67e0d9=1=9a4ea3ad-9e7d-443d-ad21-ce694a7cd352=https%3A%2F%2Fpatchwork.amarulasolutions.com%2Fcover%2F1826%2F
> > >>>
> > >>> Any inputs?
> > >> I've tried a few times, but I am unable to find what is the base for
> > >> this patchset. I always get a conflict around exynos_dsi_mode_set()
> > >> function. I've tried current linux-next, drm-next, v5.16-rc1 and v5.16.
> > >> It looks that I must have missed applying some patch before playing with
> > >> this.
> > >>
> > >> I've also tried to apply only the last patch, as if I got it right, it
> > >> is the only difference between v4 and v5 and updated 'drm: of: Lookup if
> > >> child node has panel or bridge'. In such case the board freezes during
> > >> the drm initialization.
> > > Please use drm-misc/drm-misc-next with below patches and then apply this 
> > > series.
> >
> > > I don't have good news. It doesn't work. The last patch even breaks
> > DSI operation:
> >
> > [4.520276] [drm] Exynos DRM: using 1380.decon device for DMA
> > mapping operations
> > [4.520578] exynos-drm exynos-drm: bound 1380.decon (ops
> > decon_component_ops)
> > [4.580473] exynos-drm exynos-drm: bound 1388.decon (ops
> > decon_component_ops)
> > [4.580726] exynos-drm exynos-drm: bound 1393.mic (ops
> > exynos_mic_component_ops)
> > [4.584304] exynos-dsi 1390.dsi: [drm:exynos_dsi_host_attach]
> > Attached s6e3hf2 device
> > [4.585141] exynos-drm exynos-drm: bound 1390.dsi (ops
> > exynos_dsi_component_ops)
> > [4.593189] rc_core: Couldn't load IR keymap rc-cec
> > [4.594133] Registered IR keymap rc-empty
> > [4.598760] rc rc0: sii8620 as /devices/virtual/rc/rc0
> > [4.605219] input: sii8620 as /devices/virtual/rc/rc0/input1
> > [4.610238] exynos-drm exynos-drm: bound 1397.hdmi (ops
> > hdmi_component_ops)
> > [4.920038] exynos-dsi 1390.dsi: xfer timed out: 39 03 00 00 f0 5a 5a
> > [5.024033] [ cut here ]
> > [5.024055] [CRTC:49:crtc-0] vblank wait timed out
> > [5.024129] WARNING: CPU: 4 PID: 151 at
> > drivers/gpu/drm/drm_atomic_helper.c:1530
> > drm_atomic_helper_wait_for_vblanks.part.24+0x298/0x2a8
> > [5.024171] Modules linked in:
> > [5.024195] CPU: 4 PID: 151 Comm: kworker/4:7 Not tainted 5.16.0-rc5+
> > #11232
> > [5.024219] Hardware name: Samsung TM2E board (DT)
> > [5.024232] Workqueue: events output_poll_execute
> > [5.024262] pstate: 6005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS
> > BTYPE=--)
> > [5.024285] pc : drm_atomic_helper_wait_for_vblanks.part.24+0x298/0x2a8
> > [5.024308] lr : drm_atomic_helper_wait_for_vblanks.part.24+0x298/0x2a8
> > [5.024327] sp : 800013b5b970
> > [5.024340] x29: 800013b5b970 x28:  x27:
> > 2437e400
> > [5.024391] x26:  x25:  x24:
> > 800011aa0c60
> > [5.024437] x23: 0001 x22: 25113000 x21:
> > 0001
> > [5.024482] x20: 316fc800 x19:  x18:
> > 
> > [5.024526] x17: 00480a11 x16: 0028 x15:
> > 800011b66df8
> > [5.024571] x14:  x13: 0a74756f2064656d x12:
> > 6974207469617720
> > [5.024615] x11: 656820747563205b x10: 003a x9 :
> > 7e82f035
> > [5.024661] x8 : 800011b66df8 x7 : 800013b5b740 x6 :
> > 0001
> > [5.024704] x5 : 0001 x4 :  x3 :
> > 0007
> > [5.024747] x2 : 800012524ea0 x1 : 68a66f6a76622200 x0 :
> > 
> > [5.024791] Call trace:
> > [5.024802] drm_atomic_helper_wait_for_vblanks.part.24+0x298/0x2a8
> > [5.024825]  

Re: [PATCH] drm/panel: panel-simple: Fix proper bpc for AM-1280800N3TZQW-T00H

2022-02-21 Thread Jagan Teki
On Mon, Feb 7, 2022 at 6:34 PM Jagan Teki  wrote:
>
> Hi Sam,
>
> On Mon, Dec 20, 2021 at 1:45 PM Sam Ravnborg  wrote:
> >
> > Hi Jagan,
> >
> > On Sun, Dec 19, 2021 at 10:10:10PM +0530, Jagan Teki wrote:
> > > Hi Sam,
> > >
> > > On Thu, Nov 11, 2021 at 3:11 PM Jagan Teki  
> > > wrote:
> > > >
> > > > > AM-1280800N3TZQW-T00H panel supports 8 bpc, not 6 bpc, as per
> > > > recent testing in i.MX8MM platform.
> > > >
> > > > Fix it.
> > > >
> > > > Fixes: bca684e69c4c ("drm/panel: simple: Add AM-1280800N3TZQW-T00H")
> > > > Signed-off-by: Jagan Teki 
> > > > ---
> > > >  drivers/gpu/drm/panel/panel-simple.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/panel/panel-simple.c 
> > > > b/drivers/gpu/drm/panel/panel-simple.c
> > > > index eb475a3a774b..cf3f21f649cb 100644
> > > > --- a/drivers/gpu/drm/panel/panel-simple.c
> > > > +++ b/drivers/gpu/drm/panel/panel-simple.c
> > > > @@ -719,7 +719,7 @@ static const struct drm_display_mode 
> > > > ampire_am_1280800n3tzqw_t00h_mode = {
> > > >  static const struct panel_desc ampire_am_1280800n3tzqw_t00h = {
> > > > > .modes = &ampire_am_1280800n3tzqw_t00h_mode,
> > > > .num_modes = 1,
> > > > -   .bpc = 6,
> > > > +   .bpc = 8,
> > >
> > > Any response on this?
> >
> > I am too busy with other stuff at the moment to spend time on Linux
> > stuff, but expect to re-surface sometime after xmas.
>
> Any further comments?

Gentle Ping.


Re: [PATCH 0/2] DSI host and peripheral initialisation ordering

2022-02-21 Thread Laurent Pinchart
Hello,

On Fri, Feb 18, 2022 at 02:20:19PM +0100, Andrzej Hajda wrote:
> On 16.02.2022 17:59, Dave Stevenson wrote:
> > Hi All
> >
> > Hopefully I've cc'ed all those that have bashed this problem around previously,
> > or are otherwise linked to DRM bridges.
> >
> > There have been numerous discussions around how DSI support is currently broken
> > as it doesn't support initialising the PHY to LP-11 and potentially the clock
> > lane to HS prior to configuring the DSI peripheral. There is no op where the
> > interface is initialised but HS video isn't also being sent.
> > Currently you have:
> > - peripheral pre_enable (host not initialised yet)
> > - host pre_enable
> > - encoder enable
> > - host enable
> > - peripheral enable (video already running)
> >
> > vc4 and exynos currently implement the DSI host as an encoder, and split the
> > bridge_chain. This fails if you want to switch to being a bridge and/or use
> > atomic calls as the state of all the elements split off are not added by
> > drm_atomic_add_encoder_bridges.
> >
> > dw-mipi-dsi[1] and now msm[2] use the mode_set hook to initialise the PHY, so
> > the bridge/panel pre_enable can send commands. In their post_disable they then
> > call the downstream bridge/panel post_disable op manually so that shutdown
> > commands can be sent before shutting down the PHY. Nothing handles that fact,
> > so the framework then continues down the bridge chain and calls the post_disable
> > again, so we get unbalanced panel prepare/unprepare calls being reported [3].
> >
> > There have been patches[4] proposing reversing the entire direction of
> > pre_enable and post_disable, but that risks driving voltage into devices that
> > have yet to be powered up.
> > There have been discussions about adding either a pre_pre_enable, or adding a
> > DSI host_op to initialise the host[5]. Both require significant reworking to all
> > existing drivers in moving initialisation phases.
> > We have patches that look like they may well be addressing race conditions in
> > starting up a DSI peripheral[6].
> >
> > This patch takes a hybrid of the two: an optional reversing of the order for
> > specific links within the bridge chain within pre_enable and post_disable done
> > within the drm_bridge framework.
> > I'm more than happy to move where the flag exists in structures (currently as
> > DRM_BRIDGE_OP_UPSTREAM_FIRST in drm_bridge_ops, but it isn't an op),

API-wise that's my only concern, the flag should go somewhere else.

> > but does
> > this solve the problem posed? If not, then can you describe the actual scenario
> > it doesn't cover?
> > A DSI peripheral can set the flag to get the DSI host initialised first, and
> > therefore it has a stable LP-11 state before pre_enable. Likewise the peripheral
> > can still send shutdown commands prior to the DSI host being shut down in
> > post_disable. It also handles the case where there are multiple devices in the
> > chain that all want their upstream bridge enabled first, so should there be a
> > DSI mux between host and peripheral, then it can still get the host to the
> > correct state.
> >
> > An example tree is at [7] which is drm-misc-next with these patches and then a
> > conversion of vc4_dsi to use the atomic bridge functions (will be upstreamed
> > once we're over this hurdle). It is working happily with the Toshiba TC358762 on
> > a Raspberry Pi 7" panel.
> > The same approach but on our vendor 5.15 tree[8] has also been tested
> > successfully on a TI SN65DSI83 and LVDS panel.
> >
> > Whilst here, I've also documented the expected behaviour of DSI hosts and
> > peripherals to aid those who come along after.
> 
> Good summary, of multiple attempts of solving the issue (however I still 
> could add some more :) ).

Definitely good, thank you very much Dave for tackling this issue.

> I think the main issue is that we try to squeeze different hardware 
> protocol requirements into one quite restrictive framework - whole 
> crtc->encoder->bridges->(panel ||connector) is managed directly by drm core.
> No place to negotiate configuration directly between players 
> (bridges/panels).
> This patchset slightly loosens the restrictions, so hopefully will help 
> for some time, but still every developer needs to solve riddles what to 
> put into callbacks, to allow driver working in different pipelines.

That's true, but documentation can help a lot there. Patch 2/2 turns the
riddle-solving task into documentation reading. Granted, not everybody
will read the documentation (and we should probably link to it from the
documentation of the pre_enable and post_disable operations), but the
behaviour is now defined, which is a major step forward.

> 
> Ideally I would like to drop idea of the bridge/panel and build 
> abstraction on data links.
> So for example DSI/EDP bridge during probe would register DSI sink with 
> their ops, 

Re: [PATCH 1/2] drm: Introduce DRM_BRIDGE_OP_UPSTREAM_FIRST to alter bridge init order

2022-02-21 Thread Laurent Pinchart
Hi Dave,

Thank you for the patch.

On Wed, Feb 16, 2022 at 04:59:43PM +, Dave Stevenson wrote:
> DSI sink devices typically want the DSI host powered up and configured
> before they are powered up. pre_enable is the place this would normally
> happen, but they are called in reverse order from panel/connector towards
> the encoder, which is the "wrong" order.
> 
> Add a new flag DRM_BRIDGE_OP_UPSTREAM_FIRST that any bridge can set
> to swap the order of pre_enable (and post_disable) so that any upstream
> bridges are called first to create the desired state.
> 
> eg:
> - Panel
> - Bridge 1
> - Bridge 2 DRM_BRIDGE_OP_UPSTREAM_FIRST
> - Bridge 3
> - Encoder
> Would result in pre_enable's being called as Panel, Bridge 1, Bridge 3,
> Bridge 2.

If there was a Bridge 4 between Bridge 3 and Encoder, would it be 

Panel, Bridge 1, Bridge 3, Bridge 4, Bridge 2

? I'd capture that here, to be explicit.
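
[Editorial sketch] One way to reason about the ordering is a small userspace model of the rule in the commit message: a bridge that sets the flag has its pre_enable deferred until its upstream neighbour has been called. This is a sketch of that reading only, not the patch's code; struct sketch_bridge and sketch_pre_enable_order() are hypothetical names, and the chain is a plain array with index 0 closest to the encoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bridge; not struct drm_bridge. */
struct sketch_bridge {
	const char *name;
	bool upstream_first;	/* models DRM_BRIDGE_OP_UPSTREAM_FIRST */
};

/*
 * chain[0] is closest to the encoder, chain[n - 1] closest to the panel.
 * Writes the indices of the bridges into order[] in the sequence their
 * pre_enable would run under this model, and returns how many were written.
 */
static size_t sketch_pre_enable_order(const struct sketch_bridge *chain,
				      size_t n, size_t *order)
{
	size_t count = 0;
	size_t i = n;

	assert(chain != NULL && order != NULL);

	while (i > 0) {
		size_t last = i - 1;	/* downstream end of the group */
		size_t first = last;	/* becomes the upstream end */

		/* Pull in the upstream neighbour while the member is flagged. */
		while (first > 0 && chain[first].upstream_first)
			first--;

		/* Within a group, the upstream bridge goes first. */
		for (size_t k = first; k <= last; k++)
			order[count++] = k;

		i = first;
	}

	return count;
}
```

For the four-element chain in the commit message this model reproduces Panel, Bridge 1, Bridge 3, Bridge 2. With a hypothetical unflagged Bridge 4 added next to the encoder, the model gives Panel, Bridge 1, Bridge 3, Bridge 2, Bridge 4, since only the flagged bridge's immediate upstream neighbour is pulled forward. Whether the patch is meant to behave that way is exactly what the question above asks to have written down.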

> Signed-off-by: Dave Stevenson 
> ---
>  drivers/gpu/drm/drm_bridge.c | 197 
> +--
>  include/drm/drm_bridge.h |   8 ++
>  2 files changed, 180 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
> index c96847fc0ebc..7c24e8340efa 100644
> --- a/drivers/gpu/drm/drm_bridge.c
> +++ b/drivers/gpu/drm/drm_bridge.c
> @@ -522,21 +522,58 @@ EXPORT_SYMBOL(drm_bridge_chain_disable);
>   * Calls &drm_bridge_funcs.post_disable op for all the bridges in the
>   * encoder chain, starting from the first bridge to the last. These are called
>   * after completing the encoder's prepare op.

Missing blank line, as well as in three locations below.

> + * If a bridge sets the DRM_BRIDGE_OP_UPSTREAM_FIRST, then the post_disable for
> + * that bridge will be called before the previous one to reverse the pre_enable
> + * calling direction.
>   *
>   * Note: the bridge passed should be the one closest to the encoder
>   */
>  void drm_bridge_chain_post_disable(struct drm_bridge *bridge)
>  {
>   struct drm_encoder *encoder;
> + struct drm_bridge *next, *limit;
>  
>   if (!bridge)
>   return;
>  
>   encoder = bridge->encoder;
>   list_for_each_entry_from(bridge, &encoder->bridge_chain, chain_node) {
> + limit = NULL;
> +
> + if (!list_is_last(&bridge->chain_node, &encoder->bridge_chain)) {
> + next = list_next_entry(bridge, chain_node);
> +
> + if (next->ops & DRM_BRIDGE_OP_UPSTREAM_FIRST) {
> + limit = next;
> +
> + list_for_each_entry_from(next, &encoder->bridge_chain,
> +  chain_node) {
> + if (!(next->ops &
> + DRM_BRIDGE_OP_UPSTREAM_FIRST)) {
> + next = list_prev_entry(next, chain_node);
> + limit = next;
> + break;
> + }
> + }
> +
> + list_for_each_entry_from_reverse(next, &encoder->bridge_chain,
> +  chain_node) {
> + if (next == bridge)
> + break;
> +
> + if (next->funcs->post_disable)
> + next->funcs->post_disable(next);
> + }
> + }
> + }
> +
>   if (bridge->funcs->post_disable)
>   bridge->funcs->post_disable(bridge);
> +
> + if (limit)
> + bridge = limit;
>   }
> +
>  }
>  EXPORT_SYMBOL(drm_bridge_chain_post_disable);
>  
> @@ -577,22 +614,53 @@ EXPORT_SYMBOL(drm_bridge_chain_mode_set);
>   * Calls &drm_bridge_funcs.pre_enable op for all the bridges in the encoder
>   * chain, starting from the last bridge to the first. These are called
>   * before calling the encoder's commit op.
> + * If a bridge sets the DRM_BRIDGE_OP_UPSTREAM_FIRST, then the pre_enable for
> + * the previous bridge will be called before pre_enable of this bridge.
>   *
>   * Note: the bridge passed should be the one closest to the encoder
>   */
>  void drm_bridge_chain_pre_enable(struct drm_bridge *bridge)
>  {
>   struct drm_encoder *encoder;
> - struct drm_bridge *iter;
> + struct drm_bridge *iter, *next, *limit;
>  
>   if (!bridge)
>   return;
>  
>   encoder = bridge->encoder;
> +
>   list_for_each_entry_reverse(iter, &encoder->bridge_chain, chain_node) {
> + if (iter->ops & DRM_BRIDGE_OP_UPSTREAM_FIRST) {
> + next = iter;
> + limit = bridge;
> + list_for_each_entry_from_reverse(next,
> +   

[PATCH 3/3] drm/edid: parse HF-EEODB CEA extension block

2022-02-21 Thread Lee Shawn C
While adding CEA modes, read the extension block count from the
HF-EEODB if one is available. Use that count to bound the extension
blocks that are parsed, so that the CEA information in all of them is
retrieved and more CEA modes are added.

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_displayid.c |  2 +-
 drivers/gpu/drm/drm_edid.c  | 19 +++
 include/drm/drm_edid.h  |  2 +-
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_displayid.c b/drivers/gpu/drm/drm_displayid.c
index 32da557b960f..ef0dfc9fa6f9 100644
--- a/drivers/gpu/drm/drm_displayid.c
+++ b/drivers/gpu/drm/drm_displayid.c
@@ -37,7 +37,7 @@ static const u8 *drm_find_displayid_extension(const struct 
edid *edid,
  int *length, int *idx,
  int *ext_index)
 {
-   const u8 *displayid = drm_find_edid_extension(edid, DISPLAYID_EXT, 
ext_index);
+   const u8 *displayid = drm_find_edid_extension(edid, edid->extensions, 
DISPLAYID_EXT, ext_index);
const struct displayid_header *base;
int ret;
 
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index a7391e427d69..9a987c77f203 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3364,23 +3364,23 @@ add_detailed_modes(struct drm_connector *connector, 
struct edid *edid,
  * Search EDID for CEA extension block.
  */
 const u8 *drm_find_edid_extension(const struct edid *edid,
- int ext_id, int *ext_index)
+ int num_ext_blk, int ext_id, int *ext_index)
 {
const u8 *edid_ext = NULL;
int i;
 
/* No EDID or EDID extensions */
-   if (edid == NULL || edid->extensions == 0)
+   if (edid == NULL || edid->extensions == 0 || *ext_index >= num_ext_blk)
return NULL;
 
/* Find CEA extension */
-   for (i = *ext_index; i < edid->extensions; i++) {
+   for (i = *ext_index; i < num_ext_blk; i++) {
edid_ext = (const u8 *)edid + EDID_LENGTH * (i + 1);
if (edid_ext[0] == ext_id)
break;
}
 
-   if (i >= edid->extensions)
+   if (i >= num_ext_blk)
return NULL;
 
*ext_index = i + 1;
@@ -3397,7 +3397,7 @@ static const u8 *drm_find_cea_extension(const struct edid 
*edid)
 
/* Look for a top level CEA extension block */
/* FIXME: make callers iterate through multiple CEA ext blocks? */
-   cea = drm_find_edid_extension(edid, CEA_EXT, &ext_index);
+   cea = drm_find_edid_extension(edid, edid->extensions, CEA_EXT, 
&ext_index);
if (cea)
return cea;
 
@@ -4378,13 +4378,16 @@ add_cea_modes(struct drm_connector *connector, struct 
edid *edid)
 {
const u8 *cea = drm_find_cea_extension(edid);
const u8 *db, *hdmi = NULL, *video = NULL;
-   u8 dbl, hdmi_len, video_len = 0;
+   u8 dbl, hdmi_len, video_len = 0, num_ext_blk = edid->extensions;
int modes = 0, j;
 
if (!cea)
return 0;
 
-   for (j = (cea - (u8 *)edid) / EDID_LENGTH; j <= edid->extensions;) {
+   if (num_ext_blk && drm_edid_is_hf_eeodb_blk_available(edid))
+   num_ext_blk = drm_edid_read_hf_eeodb_blk_size(edid);
+
+   for (j = (cea - (u8 *)edid) / EDID_LENGTH; j <= num_ext_blk;) {
if (cea && cea_revision(cea) >= 3) {
int i, start, end;
 
@@ -4427,7 +4430,7 @@ add_cea_modes(struct drm_connector *connector, struct 
edid *edid)
}
 
/* move to next CEA extension block */
-   cea = drm_find_edid_extension(edid, CEA_EXT, &j);
+   cea = drm_find_edid_extension(edid, num_ext_blk, CEA_EXT, &j);
if (!cea)
break;
}
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index ba2812432ead..a9c2708b63a1 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -591,7 +591,7 @@ struct drm_display_mode *
 drm_display_mode_from_cea_vic(struct drm_device *dev,
  u8 video_code);
 const u8 *drm_find_edid_extension(const struct edid *edid,
- int ext_id, int *ext_index);
+ int num_ext_blk, int ext_id, int *ext_index);
 
 bool drm_edid_is_hf_eeodb_blk_available(const struct edid *edid);
 u8 drm_edid_read_hf_eeodb_blk_size(const struct edid *edid);
-- 
2.17.1



[PATCH 2/3] drm/edid: read HF-EEODB ext block

2022-02-21 Thread Lee Shawn C
Add support for reading the HF-EEODB block required by the HDMI 2.1 specification.

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_connector.c |  5 ++-
 drivers/gpu/drm/drm_edid.c  | 76 ++---
 include/drm/drm_edid.h  |  2 +
 3 files changed, 77 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index a50c82bc2b2f..0f9e3ef00be7 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2137,8 +2137,11 @@ int drm_connector_update_edid_property(struct 
drm_connector *connector,
if (connector->override_edid)
return 0;
 
-   if (edid)
+   if (edid) {
size = EDID_LENGTH * (1 + edid->extensions);
+   if (drm_edid_is_hf_eeodb_blk_available(edid))
+   size = EDID_LENGTH * (1 + 
drm_edid_read_hf_eeodb_blk_size(edid));
+   }
 
/* Set the display info, using edid if available, otherwise
 * resetting the values to defaults. This duplicates the work
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 3d5dbbeca7f9..a7391e427d69 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1991,7 +1991,7 @@ struct edid *drm_do_get_edid(struct drm_connector 
*connector,
void *data)
 {
int i, j = 0, valid_extensions = 0;
-   u8 *edid, *new;
+   u8 *edid, *new, ext_eeodb_blk_size;
struct edid *override;
 
override = drm_get_override_edid(connector);
@@ -2051,7 +2051,40 @@ struct edid *drm_do_get_edid(struct drm_connector 
*connector,
}
 
kfree(edid);
+   return (struct edid *)new;
+   }
+
+   if (drm_edid_is_hf_eeodb_blk_available((struct edid *)edid)) {
+   ext_eeodb_blk_size = drm_edid_read_hf_eeodb_blk_size((struct 
edid *)edid);
+
+   // no more ext blk wait for read
+   if (ext_eeodb_blk_size <= 1)
+   return (struct edid *)edid;
+
+   new = krealloc(edid, (ext_eeodb_blk_size + 1) * EDID_LENGTH, 
GFP_KERNEL);
+   if (!new)
+   goto out;
edid = new;
+
+   valid_extensions = ext_eeodb_blk_size - 1;
+   for (j = 2; j <= ext_eeodb_blk_size; j++) {
+   u8 *block = edid + j * EDID_LENGTH;
+
+   for (i = 0; i < 4; i++) {
+   if (get_edid_block(data, block, j, EDID_LENGTH))
+   goto out;
+   if (drm_edid_block_valid(block, j, false, NULL))
+   break;
+   }
+
+   if (i == 4)
+   valid_extensions--;
+   }
+
+   if (valid_extensions != ext_eeodb_blk_size - 1) {
+   DRM_ERROR("Not able to retrieve proper EDID contain 
HF-EEODB data.\n");
+   goto out;
+   }
}
 
return (struct edid *)edid;
@@ -3315,15 +3348,17 @@ add_detailed_modes(struct drm_connector *connector, 
struct edid *edid,
 #define VIDEO_BLOCK 0x02
 #define VENDOR_BLOCK0x03
 #define SPEAKER_BLOCK  0x04
-#define HDR_STATIC_METADATA_BLOCK  0x6
-#define USE_EXTENDED_TAG 0x07
-#define EXT_VIDEO_CAPABILITY_BLOCK 0x00
+#define EXT_VIDEO_CAPABILITY_BLOCK 0x00
+#define HDR_STATIC_METADATA_BLOCK  0x06
+#define USE_EXTENDED_TAG   0x07
 #define EXT_VIDEO_DATA_BLOCK_420   0x0E
-#define EXT_VIDEO_CAP_BLOCK_Y420CMDB 0x0F
+#define EXT_VIDEO_CAP_BLOCK_Y420CMDB   0x0F
+#define EXT_VIDEO_HF_EEODB_DATA_BLOCK  0x78
 #define EDID_BASIC_AUDIO   (1 << 6)
 #define EDID_CEA_YCRCB444  (1 << 5)
 #define EDID_CEA_YCRCB422  (1 << 4)
 #define EDID_CEA_VCDB_QS   (1 << 6)
+#define HF_EEODB_LENGTH2
 
 /*
  * Search EDID for CEA extension block.
@@ -4222,6 +4257,20 @@ static bool cea_db_is_hdmi_forum_vsdb(const u8 *db)
return oui(db[3], db[2], db[1]) == HDMI_FORUM_IEEE_OUI;
 }
 
+static bool cea_db_is_hdmi_forum_eeodb(const u8 *db)
+{
+   if (cea_db_tag(db) != USE_EXTENDED_TAG)
+   return false;
+
+   if (cea_db_payload_len(db) != HF_EEODB_LENGTH)
+   return false;
+
+   if (cea_db_extended_tag(db) != EXT_VIDEO_HF_EEODB_DATA_BLOCK)
+   return false;
+
+   return true;
+}
+
 static bool cea_db_is_vcdb(const u8 *db)
 {
if (cea_db_tag(db) != USE_EXTENDED_TAG)
@@ -4264,6 +4313,23 @@ static bool cea_db_is_y420vdb(const u8 *db)
return true;
 }
 
+bool drm_edid_is_hf_eeodb_blk_available(const struct edid *edid)
+{
+   const u8 *eeodb_header = (u8 *)edid + EDID_LENGTH + 4;
+
+   if (!edid->extensions)
+   return false;
+
+   return cea_db_is_hdmi_forum_eeodb(eeodb_header);
+}
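
For readers following the data block checks in this hunk, here is a minimal standalone sketch of the same test outside the kernel. It assumes the usual CTA-861 data block layout (tag code in bits 7:5 of the header byte, payload length in bits 4:0, extended tag in the byte after the header). The names db_tag, db_payload_len and is_hf_eeodb are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define USE_EXTENDED_TAG              0x07
#define EXT_VIDEO_HF_EEODB_DATA_BLOCK 0x78
#define HF_EEODB_LENGTH               2

/* Tag code lives in the top three bits of the data block header byte. */
static unsigned int db_tag(const uint8_t *db)
{
	return db[0] >> 5;
}

/* Payload length (bytes following the header) lives in the low five bits. */
static unsigned int db_payload_len(const uint8_t *db)
{
	return db[0] & 0x1f;
}

/* Hypothetical helper mirroring the three checks made in the patch. */
static bool is_hf_eeodb(const uint8_t *db)
{
	assert(db != NULL);

	return db_tag(db) == USE_EXTENDED_TAG &&
	       db_payload_len(db) == HF_EEODB_LENGTH &&
	       db[1] == EXT_VIDEO_HF_EEODB_DATA_BLOCK;
}
```

With this layout a matching block starts with the header byte 0xE2 (tag 7, length 2) followed by 0x78.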

[PATCH 1/3] drm/edid: parse multiple CEA extension block

2022-02-21 Thread Lee Shawn C
Try to find and parse more CEA ext blocks if edid->extensions
is greater than one.

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_edid.c | 75 +++---
 1 file changed, 45 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 12893e7be89b..3d5dbbeca7f9 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4313,43 +4313,58 @@ add_cea_modes(struct drm_connector *connector, struct 
edid *edid)
const u8 *cea = drm_find_cea_extension(edid);
const u8 *db, *hdmi = NULL, *video = NULL;
u8 dbl, hdmi_len, video_len = 0;
-   int modes = 0;
+   int modes = 0, j;
 
-   if (cea && cea_revision(cea) >= 3) {
-   int i, start, end;
+   if (!cea)
+   return 0;
 
-   if (cea_db_offsets(cea, &start, &end))
-   return 0;
+   for (j = (cea - (u8 *)edid) / EDID_LENGTH; j <= edid->extensions;) {
+   if (cea && cea_revision(cea) >= 3) {
+   int i, start, end;
 
-   for_each_cea_db(cea, i, start, end) {
-   db = &cea[i];
-   dbl = cea_db_payload_len(db);
+   if (cea_db_offsets(cea, &start, &end))
+   continue;
 
-   if (cea_db_tag(db) == VIDEO_BLOCK) {
-   video = db + 1;
-   video_len = dbl;
-   modes += do_cea_modes(connector, video, dbl);
-   } else if (cea_db_is_hdmi_vsdb(db)) {
-   hdmi = db;
-   hdmi_len = dbl;
-   } else if (cea_db_is_y420vdb(db)) {
-   const u8 *vdb420 = &db[2];
-
-   /* Add 4:2:0(only) modes present in EDID */
-   modes += do_y420vdb_modes(connector,
- vdb420,
- dbl - 1);
+   for_each_cea_db(cea, i, start, end) {
+   db = &cea[i];
+   dbl = cea_db_payload_len(db);
+
+   if (cea_db_tag(db) == VIDEO_BLOCK) {
+   video = db + 1;
+   video_len = dbl;
+   modes += do_cea_modes(connector, video, 
dbl);
+   } else if (cea_db_is_hdmi_vsdb(db)) {
+   hdmi = db;
+   hdmi_len = dbl;
+   } else if (cea_db_is_y420vdb(db)) {
+   const u8 *vdb420 = &db[2];
+
+   /* Add 4:2:0(only) modes present in 
EDID */
+   modes += do_y420vdb_modes(connector,
+ vdb420,
+ dbl - 1);
+   }
}
}
-   }
 
-   /*
-* We parse the HDMI VSDB after having added the cea modes as we will
-* be patching their flags when the sink supports stereo 3D.
-*/
-   if (hdmi)
-   modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, video,
-   video_len);
+   /*
+* We parse the HDMI VSDB after having added the cea modes as 
we will
+* be patching their flags when the sink supports stereo 3D.
+*/
+   if (hdmi) {
+   modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, 
video,
+   video_len);
+   hdmi  = NULL;
+   video = NULL;
+   hdmi_len = 0;
+   video_len = 0;
+   }
+
+   /* move to next CEA extension block */
+   cea = drm_find_edid_extension(edid, CEA_EXT, &j);
+   if (!cea)
+   break;
+   }
 
return modes;
 }
-- 
2.17.1
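
The "move to next CEA extension block" step in this patch relies on a lookup that resumes from a saved extension index. A minimal standalone sketch of that pattern follows. It assumes 128-byte EDID blocks with the extension tag in the first byte of each extension block; next_cea_ext and count_cea_exts are hypothetical names for illustration, not the kernel's drm_find_edid_extension().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_LENGTH 128
#define CEA_EXT     0x02

/*
 * Hypothetical helper: return the next CEA extension block at or after
 * extension index *ext_index (0 is the first extension), or NULL if none
 * is left. On success *ext_index is advanced past the returned block, so
 * the next call continues the search instead of finding the same block.
 * num_blocks counts the base block plus all extensions.
 */
static const uint8_t *next_cea_ext(const uint8_t *edid, size_t num_blocks,
				   size_t *ext_index)
{
	size_t i;

	assert(edid != NULL && ext_index != NULL);

	for (i = *ext_index; i + 1 < num_blocks; i++) {
		const uint8_t *blk = edid + (i + 1) * EDID_LENGTH;

		if (blk[0] == CEA_EXT) {
			*ext_index = i + 1;
			return blk;
		}
	}

	return NULL;
}

/* Walk every CEA extension, the way a multi-block parser would. */
static unsigned int count_cea_exts(const uint8_t *edid, size_t num_blocks)
{
	size_t idx = 0;
	unsigned int n = 0;

	while (next_cea_ext(edid, num_blocks, &idx))
		n++;

	return n;
}
```

The point of the sketch is that the index must advance on every iteration, including any path that skips a block, otherwise the loop never terminates.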



[PATCH 1/4] drm/msm/dpu: document INTF_EDP/INTF_DP difference

2022-02-21 Thread Dmitry Baryshkov
Based on the discussions on the mailing list, document enum
dpu_intf_type and its controversial fields: INTF_DP and INTF_EDP.

INTF_EDP is used for the older eDP interface found on msm8x74/msm8x84.
INTF_DP is used for both eDP and DP interfaces handled by the msm/dp
driver. The DPU driver does not distinguish between them.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index bb9ceadeb0bb..4f8336cc7911 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -205,12 +205,20 @@ enum dpu_intf {
INTF_MAX
 };
 
+/*
+ * Historically these values correspond to the values written to the
+ * DISP_INTF_SEL register, which had to be programmed manually. On newer MDP
+ * generations this register is NOP, but we keep the values for historical
+ * reasons.
+ */
 enum dpu_intf_type {
INTF_NONE = 0x0,
INTF_DSI = 0x1,
INTF_HDMI = 0x3,
INTF_LCDC = 0x5,
+   /* old eDP found on 8x74 and 8x84 */
INTF_EDP = 0x9,
+   /* both DP and eDP,  handled by the new DP driver */
INTF_DP = 0xa,
INTF_TYPE_MAX,
 
-- 
2.34.1



[PATCH 4/4] drm/msm/dpu: drop INTF_EDP from interface type conditions

2022-02-21 Thread Dmitry Baryshkov
To remove possible confusion between (old) INTF_EDP and newer INTF_DP,
stop using INTF_EDP in DPU's code. Until the 8x74/8x84 SoCs are
supported by DPU driver, there is no point in using INTF_EDP.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 3 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c  | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index 478a608ba7f2..e76d240f554d 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -92,8 +92,7 @@ static void drm_mode_to_intf_timing_params(
}
 
/* for DP/EDP, Shift timings to align it to bottom right */
-   if ((phys_enc->hw_intf->cap->type == INTF_DP) ||
-   (phys_enc->hw_intf->cap->type == INTF_EDP)) {
+   if (phys_enc->hw_intf->cap->type == INTF_DP) {
timing->h_back_porch += timing->h_front_porch;
timing->h_front_porch = 0;
timing->v_back_porch += timing->v_front_porch;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
index 116e2b5b1a90..1548614c508b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
@@ -141,7 +141,7 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
hsync_ctl = (hsync_period << 16) | p->hsync_pulse_width;
display_hctl = (hsync_end_x << 16) | hsync_start_x;
 
-   if (ctx->cap->type == INTF_EDP || ctx->cap->type == INTF_DP) {
+   if (ctx->cap->type == INTF_DP) {
active_h_start = hsync_start_x;
active_h_end = active_h_start + p->xres - 1;
active_v_start = display_v_start;
-- 
2.34.1
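
The "shift timings to align it to bottom right" step touched by this patch folds the front porch into the back porch, so the total blanking stays the same while the active area moves to the end of the line and frame. A minimal standalone sketch, assuming a cut-down timing structure (struct timing_sketch and fold_front_porch are hypothetical names; the final v_front_porch assignment is assumed by symmetry, since the hunk context ends before it):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the interface timing parameters. */
struct timing_sketch {
	uint32_t h_back_porch;
	uint32_t h_front_porch;
	uint32_t v_back_porch;
	uint32_t v_front_porch;
};

/* Move all front porch blanking into the back porch. */
static void fold_front_porch(struct timing_sketch *t)
{
	assert(t != NULL);

	t->h_back_porch += t->h_front_porch;
	t->h_front_porch = 0;
	t->v_back_porch += t->v_front_porch;
	t->v_front_porch = 0; /* assumed, not visible in the quoted hunk */
}
```

After the patch this adjustment is applied only when the interface type is INTF_DP.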



[PATCH 3/4] drm/msm/dpu: drop obsolete INTF_EDP comment

2022-02-21 Thread Dmitry Baryshkov
DPU driver never supported INTF_EDP, so let's drop the obsolete comment.
If at some point 8x74/8x84's INTF_EDP is ported to DPU driver,
corresponding handling will have to be ported too. Until that time, the
comment serves no purpose.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index f49f42e70b29..478a608ba7f2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -91,17 +91,6 @@ static void drm_mode_to_intf_timing_params(
timing->vsync_polarity = 0;
}
 
-   /*
-* For edp only:
-* DISPLAY_V_START = (VBP * HCYCLE) + HBP
-* DISPLAY_V_END = (VBP + VACTIVE) * HCYCLE - 1 - HFP
-*/
-   /*
-* if (vid_enc->hw->cap->type == INTF_EDP) {
-* display_v_start += mode->htotal - mode->hsync_start;
-* display_v_end -= mode->hsync_start - mode->hdisplay;
-* }
-*/
/* for DP/EDP, Shift timings to align it to bottom right */
if ((phys_enc->hw_intf->cap->type == INTF_DP) ||
(phys_enc->hw_intf->cap->type == INTF_EDP)) {
-- 
2.34.1



[PATCH 2/4] drm/msm/dpu: drop INTF_TYPE_MAX symbol

2022-02-21 Thread Dmitry Baryshkov
This enum value does not correspond to any of actual interface types,
it's not used by the driver, and the value of INTF_WB is greater than
INTF_TYPE_MAX. Thus this symbol serves no purpose and can be removed.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index 4f8336cc7911..a9b6d0955539 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -220,7 +220,6 @@ enum dpu_intf_type {
INTF_EDP = 0x9,
/* both DP and eDP,  handled by the new DP driver */
INTF_DP = 0xa,
-   INTF_TYPE_MAX,
 
/* virtual interfaces */
INTF_WB = 0x100,
-- 
2.34.1



[PATCH 0/4] drm/msm/dpu: clearly document INTF_DP vs INTF_EDP difference

2022-02-21 Thread Dmitry Baryshkov
Recent discussion on the mailing list [1], [2] outlined a need to document
which intf type is used for DP and which one is used for eDP interfaces.

This series implements my proposal [3]:

- Keep INTF_EDP reserved for 8x74/8x84
- Use INTF_DP for all contemporary DP and eDP ports
- Document this in dpu_hw_mdss.h
- Remove INTF_EDP usage in dpu1 driver.

Main reasons behind this proposal:
- It's not always possible to separate eDP and DP. For example INTF_5 on
  sc7280 is connected to combo eDP/DP PHY.
- Using INTF_EDP would require us to split too many pieces, ending up
  with a significant amount of code duplication...
- ... for nothing. From the DPU point of view there is no difference
  between DP and eDP interfaces as found on current SoC generations.

[1]: 
https://lore.kernel.org/linux-arm-msm/0dac8ffa-89a6-d972-bdc1-3f7755c51...@linaro.org/
[2]: 
https://lore.kernel.org/linux-arm-msm/be397e2e-05ab-5c18-8e2d-16c443f0a...@quicinc.com/
[3]: 
https://lore.kernel.org/linux-arm-msm/e2fab93e-82a6-4837-4ee5-ee1b16caa...@linaro.org/

Dmitry Baryshkov (4):
  drm/msm/dpu: document INTF_EDP/INTF_DP difference
  drm/msm/dpu: drop INTF_TYPE_MAX symbol
  drm/msm/dpu: drop obsolete INTF_EDP comment
  drm/msm/dpu: drop INTF_EDP from interface type conditions

 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c   | 14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c|  2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h|  9 -
 3 files changed, 10 insertions(+), 15 deletions(-)

-- 
2.34.1



Re: [PATCH 2/2] drm/bridge: Document the expected behaviour of DSI host controllers

2022-02-21 Thread Laurent Pinchart
Hi Dave,

Thank you for the patch.


On Wed, Feb 16, 2022 at 04:59:44PM +, Dave Stevenson wrote:
> The exact behaviour of DSI host controllers is not specified,
> therefore define it.
> 
> Signed-off-by: Dave Stevenson 
> ---
>  Documentation/gpu/drm-kms-helpers.rst |  7 +++
>  drivers/gpu/drm/drm_bridge.c  | 38 
> +++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-kms-helpers.rst 
> b/Documentation/gpu/drm-kms-helpers.rst
> index c3ce91eecbc1..362afdb867c6 100644
> --- a/Documentation/gpu/drm-kms-helpers.rst
> +++ b/Documentation/gpu/drm-kms-helpers.rst
> @@ -185,6 +185,13 @@ Bridge Helper Reference
>  .. kernel-doc:: drivers/gpu/drm/drm_bridge.c
> :export:
>  
> +MIPI-DSI bridge operation
> +-------------------------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_bridge.c
> +   :doc: dsi bridge operations
> +
> +
>  Bridge Connector Helper Reference
>  ---------------------------------
>  
> diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
> index 7c24e8340efa..14c2ee9e0328 100644
> --- a/drivers/gpu/drm/drm_bridge.c
> +++ b/drivers/gpu/drm/drm_bridge.c
> @@ -152,6 +152,44 @@
>   * situation when probing.
>   */
>  
> +/**
> + * DOC: dsi bridge operations
> + *
> + * DSI host interfaces are expected to be implemented as bridges rather than
> + * encoders, however there are a few aspects of their operation that need to
> + * be defined in order to provide a consistent interface.
> + *
> + * A DSI host should keep the PHY powered down until the pre_enable op is

I'd write "operation" in full everywhere to avoid mixing the two.

> + * called. All lanes should be in an idle state (not LP-11) up to this point.

Is the idle state LP-00 ? If so I'd state that explicitly.

"[...] in an idle state (LP-00, not LP-11) [...]"

> + * pre_enable should initialise the PHY, set the data lanes to LP-11, and the
> + * clock lane to either LP-11 or HS dependent on the mode_flag

s/dependent/depending/ ?

> + * MIPI_DSI_CLOCK_NON_CONTINUOUS.
> + *
> + * Ordinarily the downstream bridge DSI peripheral pre_enable will have been
> + * called before the DSI host. If the DSI peripheral requires LP-11 and/or
> + * the clock lane to be in HS mode prior to pre_enable, then it can set the
> + * DRM_BRIDGE_OP_UPSTREAM_FIRST flag to request the pre_enable (and
> + * post_disable) order to be altered to enable the DSI host first.
> + *
> + * Either the CRTC being enabled, or the DSI host enable op should switch the
> + * host to actively transmitting video on the data lanes.
> + *
> + * The reverse also applies. The DSI host disable op or stopping the CRTC should
> + * stop transmitting video, and the data lanes should return to the LP-11 state.
> + * The DSI host post_disable op should disable the PHY.
> + * If the DRM_BRIDGE_OP_UPSTREAM_FIRST flag is set, then the DSI peripheral's
> + * bridge post_disable will be called before the DSI host's post_disable.
> + *
> + * Whilst it is valid to call host_transfer prior to pre_enable or after
> + * post_disable, the exact state of the lanes is undefined at this point. The
> + * DSI host should initialise the interface, transmit the data, and then disable
> + * the interface again.
> + *
> + * Ultra Low Power State (ULPS) is not explicitly supported by DRM. If
> + * implemented, it therefore needs to be either handled entirely within the DSI

s/either // (or you need an "or ..." :-))

Reviewed-by: Laurent Pinchart 

> + * Host driver.
> + */
> +
>  static DEFINE_MUTEX(bridge_lock);
>  static LIST_HEAD(bridge_list);
>  

-- 
Regards,

Laurent Pinchart
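
The ordering described in the documentation under review can be read as a small state machine for the data lanes. The sketch below is only an illustration of that reading. The enum and function names are hypothetical, and it models the data lanes alone: the clock lane, the DRM_BRIDGE_OP_UPSTREAM_FIRST reordering and host_transfer outside the enabled window are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dsi_lane_state {
	DSI_PHY_OFF,  /* PHY powered down, lanes idle (not LP-11) */
	DSI_LP11,     /* PHY initialised, data lanes in LP-11 */
	DSI_HS_VIDEO, /* actively transmitting video */
};

enum dsi_host_op {
	DSI_OP_PRE_ENABLE,
	DSI_OP_ENABLE,
	DSI_OP_DISABLE,
	DSI_OP_POST_DISABLE,
};

/* Apply one bridge operation; return false if it arrives out of order. */
static bool dsi_host_apply(enum dsi_lane_state *state, enum dsi_host_op op)
{
	static const struct {
		enum dsi_lane_state from, to;
	} transitions[] = {
		[DSI_OP_PRE_ENABLE]   = { DSI_PHY_OFF,  DSI_LP11 },
		[DSI_OP_ENABLE]       = { DSI_LP11,     DSI_HS_VIDEO },
		[DSI_OP_DISABLE]      = { DSI_HS_VIDEO, DSI_LP11 },
		[DSI_OP_POST_DISABLE] = { DSI_LP11,     DSI_PHY_OFF },
	};

	assert(state != NULL);

	if ((size_t)op >= sizeof(transitions) / sizeof(transitions[0]) ||
	    *state != transitions[op].from)
		return false;

	*state = transitions[op].to;
	return true;
}
```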


Re: [PATCH 5/6] drm/rcar_du: changes to rcar-du driver resulting from drm_writeback_connector structure changes

2022-02-21 Thread Dmitry Baryshkov
On Thu, 10 Feb 2022 at 07:59, Laurent Pinchart
 wrote:
>
> Hi Abhinav,
>
> On Wed, Feb 09, 2022 at 05:40:29PM -0800, Abhinav Kumar wrote:
> > Hi Laurent
> >
> > Gentle reminder on this.
>
> I won't have time before next week I'm afraid.

Laurent, another gentle ping.

>
> > On 2/6/2022 11:20 PM, Abhinav Kumar wrote:
> > > Hi Laurent
> > >
> > > On 2/6/2022 3:32 PM, Dmitry Baryshkov wrote:
> > >> On Wed, 2 Feb 2022 at 16:26, Laurent Pinchart
> > >>  wrote:
> > >>>
> > >>> Hi Jani,
> > >>>
> > >>> On Wed, Feb 02, 2022 at 03:15:03PM +0200, Jani Nikula wrote:
> >  On Wed, 02 Feb 2022, Laurent Pinchart wrote:
> > > On Wed, Feb 02, 2022 at 02:24:28PM +0530, Kandpal Suraj wrote:
> > >> Changing rcar_du driver to accomadate the change of
> > >> drm_writeback_connector.base and drm_writeback_connector.encoder
> > >> to a pointer the reason for which is explained in the
> > >> Patch(drm: add writeback pointers to drm_connector).
> > >>
> > >> Signed-off-by: Kandpal Suraj 
> > >> ---
> > >>   drivers/gpu/drm/rcar-du/rcar_du_crtc.h  | 2 ++
> > >>   drivers/gpu/drm/rcar-du/rcar_du_writeback.c | 8 +---
> > >>   2 files changed, 7 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > >> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > >> index 66e8839db708..68f387a04502 100644
> > >> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > >> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> > >> @@ -72,6 +72,8 @@ struct rcar_du_crtc {
> > >> const char *const *sources;
> > >> unsigned int sources_count;
> > >>
> > >> +  struct drm_connector connector;
> > >> +  struct drm_encoder encoder;
> > >
> > > Those fields are, at best, poorly named. Furthermore, there's no
> > > need in
> > > this driver or in other drivers using drm_writeback_connector to
> > > create
> > > an encoder or connector manually. Let's not polute all drivers because
> > > i915 doesn't have its abstractions right.
> > 
> >  i915 uses the quite common model for struct inheritance:
> > 
> > struct intel_connector {
> > struct drm_connector base;
> > /* ... */
> > }
> > 
> >  Same with at least amd, ast, fsl-dcu, hisilicon, mga200, msm, nouveau,
> >  radeon, tilcdc, and vboxvideo.
> > 
> >  We could argue about the relative merits of that abstraction, but I
> >  think the bottom line is that it's popular and the drivers using it are
> >  not going to be persuaded to move away from it.
> > >>>
> > >>> Nobody said inheritance is bad.
> > >>>
> >  It's no coincidence that the drivers who've implemented writeback so
> >  far
> >  (komeda, mali, rcar-du, vc4, and vkms) do not use the abstraction,
> >  because the drm_writeback_connector midlayer does, forcing the issue.
> > >>>
> > >>> Are you sure it's not a coincidence ? :-)
> > >>>
> > >>> The encoder and especially connector created by drm_writeback_connector
> > >>> are there only because KMS requires a drm_encoder and a drm_connector to
> > >>> be exposed to userspace (and I could argue that using a connector for
> > >>> writeback is a hack, but that won't change). The connector is "virtual",
> > >>> I still fail to see why i915 or any other driver would need to wrap it
> > >>> into something else. The whole point of the drm_writeback_connector
> > >>> abstraction is that drivers do not have to manage the writeback
> > >>> drm_connector manually, they shouldn't touch it at all.
> > >>
> > >> Laurent, I wanted to shift a bit from the question of drm_connector to
> > >> the question of drm_encoder being embedded in the
> > >> drm_writeback_connector.
> > >> In case of the msm driver the drm_encoder is not a lightweight entity,
> > >> but a full-featured driver part. Significant part of it can be shared
> > >> with the writeback implementation, if we allow using a pointer to the
> > >> external drm_encoder with the drm_writeback_connector.
> > >> Does the following patch set stand a chance to receive your ack?
> > >>   - Switch drm_writeback_connector to point to drm_encoder rather than
> > >> embedding it?
> > >>   - Create drm_encoder for the drm_writeback_connector when one is not
> > >> specified, so the current drivers can be left unchanged.
> > >>
> > >
> > > I second Dmitry's request here. For the reasons he has mentioned along
> > > with the possibility of the writeback encoder being shared across
> > > display pipelines, strengthens our request of the drm encoder being a
> > > pointer inside the drm_writeback_connector instead of embedding it.
> > >
> > > Like I had shown in my RFC, in case the other drivers dont specify one,
> > > we can allocate one:
> > >
> > > https://patchwork.kernel.org/project/dri-devel/patch/1642732195-25349-1-git-send-email-quic_abhin...@quicinc.com/
> > >
> > >
> > > We think this 

[Bug 215631] New: Some Desktop oriented mode setting drivers are missing DRM PRIME support

2022-02-21 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215631

Bug ID: 215631
   Summary: Some Desktop oriented mode setting drivers are missing
DRM PRIME support
   Product: Drivers
   Version: 2.5
Kernel Version: 5.14
  Hardware: Intel
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: bluescreen_aven...@verizon.net
Regression: No

Hi

It seems that not all Desktop GPU oriented DRM drivers support PRIME, which it
now appears that wlroots requires. Seeing that even SimpleDRM supports PRIME
though, I think that it should be possible for other drivers.  

The drivers appear to be
ast
gma500
bochs
i2c
vboxvideo
These are the ones that Debian has turned on that to my knowledge are missing
this support, so that is what I am considering to be a Desktop GPU, although I
might be wrong. I admit I am being very x86-centric.

I am not sure if I should open a separate bug for each driver, However I am not
able to test all of them, as I really can only test the two virtual ones. The
others I am going by grepping the kernel source for lines that I have been told
to check for, and an online database https://drmdb.emersion.fr/capabilities

I am invested in this because I think pairing the wlroots based cage with a
terminal emulator like foot is probably the best solution, and in a recent
release of wlroots, DRM Prime is a hard requirement. (probably for how they do
multiple GPUs, I am not sure)


Thanks


[PATCH] drm/msm/dpu: wire up MSM8998's DSPP blocks

2022-02-21 Thread Dmitry Baryshkov
The commit adding msm8998 support added msm8998's DSPP blocks
configuration, but did not use them in msm8998_cfg_init(). Wire them up
to be used for display post processing.

Reported-by: kernel test robot 
Fixes: 94391a14fc27 ("drm/msm/dpu1: Add MSM8998 to hw catalog")
Cc: AngeloGioacchino Del Regno 
Cc: Jami Kettunen 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index aa4d20762ccb..f74bc7acd901 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -1496,6 +1496,8 @@ static void msm8998_cfg_init(struct dpu_mdss_cfg *dpu_cfg)
.sspp = msm8998_sspp,
.mixer_count = ARRAY_SIZE(msm8998_lm),
.mixer = msm8998_lm,
+   .dspp_count = ARRAY_SIZE(msm8998_dspp),
+   .dspp = msm8998_dspp,
.pingpong_count = ARRAY_SIZE(sdm845_pp),
.pingpong = sdm845_pp,
.intf_count = ARRAY_SIZE(msm8998_intf),
-- 
2.34.1
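
The fix above follows the usual count-plus-pointer catalog pattern: a table is only visible to the rest of the driver once both fields are set in the designated initializer, and fields left out default to zero/NULL. A minimal standalone sketch of that pattern, with made-up structure and table names (nothing here is the real dpu_mdss_cfg layout):

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-block description. */
struct block_cfg {
	const char *name;
};

/* Hypothetical catalog with count/pointer pairs per block type. */
struct catalog_sketch {
	size_t mixer_count;
	const struct block_cfg *mixer;
	size_t dspp_count;
	const struct block_cfg *dspp;
};

static const struct block_cfg demo_lm[] = {
	{ "lm_0" }, { "lm_1" }, { "lm_2" },
};

static const struct block_cfg demo_dspp[] = {
	{ "dspp_0" }, { "dspp_1" },
};

/* Before the fix: the DSPP table exists but is never referenced. */
static const struct catalog_sketch catalog_before = {
	.mixer_count = ARRAY_SIZE(demo_lm),
	.mixer = demo_lm,
};

/* After the fix: both DSPP fields are wired up. */
static const struct catalog_sketch catalog_after = {
	.mixer_count = ARRAY_SIZE(demo_lm),
	.mixer = demo_lm,
	.dspp_count = ARRAY_SIZE(demo_dspp),
	.dspp = demo_dspp,
};

/* Number of DSPP blocks a consumer of the catalog can actually use. */
static size_t usable_dspp(const struct catalog_sketch *cat)
{
	assert(cat != NULL);

	return cat->dspp ? cat->dspp_count : 0;
}
```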



Re: [PATCH v2 2/4] drm/bridge: use atomic enable/disable for bridge callbacks

2022-02-21 Thread Dmitry Baryshkov
On Mon, 21 Feb 2022 at 17:52, Vinod Polimera  wrote:
>
> Use atomic enable/disable for bridge callbacks to access certain
> states like self-refresh.
>
> This change avoids panel prepare/unprepare based on self-refresh
> state.

Please split this into two patches:
- change to atomic_* callbacks
- introduction of PSR support.

>
> Signed-off-by: Sankeerth Billakanti 
> Signed-off-by: Kalyan Thota 
> Signed-off-by: Vinod Polimera 
>
> Changes in V2:
> - As per review suggestion by Dmitry.
> ---
>  drivers/gpu/drm/bridge/panel.c | 102 
> +
>  1 file changed, 94 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
> index b32295a..5c7dc82 100644
> --- a/drivers/gpu/drm/bridge/panel.c
> +++ b/drivers/gpu/drm/bridge/panel.c
> @@ -41,6 +41,40 @@ static int panel_bridge_connector_get_modes(struct 
> drm_connector *connector)
> return drm_panel_get_modes(panel_bridge->panel, connector);
>  }
>
> +static struct drm_crtc *bridge_drm_get_old_connector_crtc(struct drm_encoder 
> *encoder,
> +   struct drm_atomic_state 
> *state)
> +{
> +   struct drm_connector *connector;
> +   struct drm_connector_state *conn_state;
> +
> +   connector = drm_atomic_get_old_connector_for_encoder(state, encoder);
> +   if (!connector)
> +   return NULL;
> +
> +   conn_state = drm_atomic_get_old_connector_state(state, connector);
> +   if (!conn_state)
> +   return NULL;
> +
> +   return conn_state->crtc;
> +}
> +
> +static struct drm_crtc *bridge_drm_get_new_connector_crtc(struct drm_encoder 
> *encoder,
> +   struct drm_atomic_state 
> *state)
> +{
> +   struct drm_connector *connector;
> +   struct drm_connector_state *conn_state;
> +
> +   connector = drm_atomic_get_new_connector_for_encoder(state, encoder);
> +   if (!connector)
> +   return NULL;
> +
> +   conn_state = drm_atomic_get_new_connector_state(state, connector);
> +   if (!conn_state)
> +   return NULL;
> +
> +   return conn_state->crtc;
> +}
> +
>  static const struct drm_connector_helper_funcs
>  panel_bridge_connector_helper_funcs = {
> .get_modes = panel_bridge_connector_get_modes,
> @@ -102,30 +136,82 @@ static void panel_bridge_detach(struct drm_bridge 
> *bridge)
> drm_connector_cleanup(connector);
>  }
>
> -static void panel_bridge_pre_enable(struct drm_bridge *bridge)
> +static void panel_bridge_pre_enable(struct drm_bridge *bridge,
> +   struct drm_bridge_state *old_bridge_state)
>  {
> struct panel_bridge *panel_bridge = 
> drm_bridge_to_panel_bridge(bridge);
> +   struct drm_atomic_state *old_state = old_bridge_state->base.state;
> +   struct drm_encoder *encoder = bridge->encoder;
> +   struct drm_crtc *crtc;
> +   struct drm_crtc_state *old_crtc_state;
> +
> +   crtc = bridge_drm_get_new_connector_crtc(encoder, old_state);
> +   if (!crtc)
> +   return;
> +
> +   old_crtc_state = drm_atomic_get_old_crtc_state(old_state, crtc);
> +   if (old_crtc_state && old_crtc_state->self_refresh_active)
> +   return;
>
> drm_panel_prepare(panel_bridge->panel);
>  }
>
> -static void panel_bridge_enable(struct drm_bridge *bridge)
> +static void panel_bridge_enable(struct drm_bridge *bridge,
> +   struct drm_bridge_state *old_bridge_state)
>  {
> struct panel_bridge *panel_bridge = 
> drm_bridge_to_panel_bridge(bridge);
> +   struct drm_atomic_state *old_state = old_bridge_state->base.state;
> +   struct drm_encoder *encoder = bridge->encoder;
> +   struct drm_crtc *crtc;
> +   struct drm_crtc_state *old_crtc_state;
> +
> +   crtc = bridge_drm_get_new_connector_crtc(encoder, old_state);
> +   if (!crtc)
> +   return;
> +
> +   old_crtc_state = drm_atomic_get_old_crtc_state(old_state, crtc);
> +   if (old_crtc_state && old_crtc_state->self_refresh_active)
> +   return;
>
> drm_panel_enable(panel_bridge->panel);
>  }
>
> -static void panel_bridge_disable(struct drm_bridge *bridge)
> +static void panel_bridge_disable(struct drm_bridge *bridge,
> +   struct drm_bridge_state *old_bridge_state)
>  {
> struct panel_bridge *panel_bridge = 
> drm_bridge_to_panel_bridge(bridge);
> +   struct drm_atomic_state *old_state = old_bridge_state->base.state;
> +   struct drm_encoder *encoder = bridge->encoder;
> +   struct drm_crtc *crtc;
> +   struct drm_crtc_state *new_crtc_state;
> +
> +   crtc = bridge_drm_get_old_connector_crtc(encoder, old_state);
> +   if (!crtc)
> +   return;
> +
> +   new_crtc_state = drm_atomic_get_new_crtc_state(old_state, crtc);
> +   if (new_crtc_state && new_crtc_state->self_refresh_active)
> +   return;

Re: [PATCH v2 3/4] drm/msm/disp/dpu1: use atomic enable/disable callbacks for encoder functions

2022-02-21 Thread Dmitry Baryshkov
On Mon, 21 Feb 2022 at 17:52, Vinod Polimera  wrote:
>
> Use atomic variants for encoder callback functions such that
> certain states like self-refresh can be accessed as part of
> enable/disable sequence.
>
> Signed-off-by: Kalyan Thota 
> Signed-off-by: Vinod Polimera 

Reviewed-by: Dmitry Baryshkov 

>
> Changes in v2:
> - As per review suggestion by Dmitry.
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index 1e648db..6eac417 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -1138,7 +1138,8 @@ void dpu_encoder_virt_runtime_resume(struct drm_encoder 
> *drm_enc)
> mutex_unlock(&dpu_enc->enc_lock);
>  }
>
> -static void dpu_encoder_virt_enable(struct drm_encoder *drm_enc)
> +static void dpu_encoder_virt_enable(struct drm_encoder *drm_enc,
> +   struct drm_atomic_state *state)
>  {
> struct dpu_encoder_virt *dpu_enc = NULL;
> int ret = 0;
> @@ -1176,7 +1177,8 @@ static void dpu_encoder_virt_enable(struct drm_encoder 
> *drm_enc)
> mutex_unlock(&dpu_enc->enc_lock);
>  }
>
> -static void dpu_encoder_virt_disable(struct drm_encoder *drm_enc)
> +static void dpu_encoder_virt_disable(struct drm_encoder *drm_enc,
> +   struct drm_atomic_state *state)
>  {
> struct dpu_encoder_virt *dpu_enc = NULL;
> struct msm_drm_private *priv;
> @@ -2094,8 +2096,8 @@ static void dpu_encoder_frame_done_timeout(struct 
> timer_list *t)
>
>  static const struct drm_encoder_helper_funcs dpu_encoder_helper_funcs = {
> .mode_set = dpu_encoder_virt_mode_set,
> -   .disable = dpu_encoder_virt_disable,
> -   .enable = dpu_encoder_virt_enable,
> +   .atomic_disable = dpu_encoder_virt_disable,
> +   .atomic_enable = dpu_encoder_virt_enable,
> .atomic_check = dpu_encoder_virt_atomic_check,
>  };
>
> --
> 2.7.4
>


-- 
With best wishes
Dmitry


Re: [PATCH v2 1/4] drm/msm/dp: Add basic PSR support for eDP

2022-02-21 Thread Dmitry Baryshkov
On Mon, 21 Feb 2022 at 17:52, Vinod Polimera  wrote:
>
> Add support for basic panel self refresh (PSR) feature for eDP.
> Add a new interface to set PSR state in the sink from DPU.
> Program the eDP controller to issue PSR enter and exit SDP to
> the sink.
>
> Signed-off-by: Sankeerth Billakanti 
>
> Changes in v2:
>   - Use dp bridge to set psr entry/exit instead of dpu_encoder
>   - Don't modify whitespaces
>   - set self refresh aware from atomic_check
>   - set self refresh aware only if psr is supported
>   - provide a stub for msm_dp_display_set_psr
> ---
>  drivers/gpu/drm/msm/dp/dp_catalog.c |  81 +
>  drivers/gpu/drm/msm/dp/dp_catalog.h |   4 +
>  drivers/gpu/drm/msm/dp/dp_ctrl.c|  63 +
>  drivers/gpu/drm/msm/dp/dp_ctrl.h|   3 +
>  drivers/gpu/drm/msm/dp/dp_display.c |  14 +++
>  drivers/gpu/drm/msm/dp/dp_display.h |   1 +
>  drivers/gpu/drm/msm/dp/dp_drm.c | 177 
> ++--
>  drivers/gpu/drm/msm/dp/dp_link.c|  22 +
>  drivers/gpu/drm/msm/dp/dp_panel.c   |  21 +
>  drivers/gpu/drm/msm/dp/dp_panel.h   |   6 ++
>  drivers/gpu/drm/msm/dp/dp_reg.h |  19 
>  drivers/gpu/drm/msm/msm_drv.h   |   6 ++
>  12 files changed, 411 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c 
> b/drivers/gpu/drm/msm/dp/dp_catalog.c
> index 8a6d3ea..3cd223d 100644
> --- a/drivers/gpu/drm/msm/dp/dp_catalog.c
> +++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
> @@ -45,6 +45,14 @@
>  #define DP_INTERRUPT_STATUS2_MASK \
> (DP_INTERRUPT_STATUS2 << DP_INTERRUPT_STATUS_MASK_SHIFT)
>
> +#define DP_INTERRUPT_STATUS4 \
> +   (PSR_UPDATE_INT | PSR_CAPTURE_INT | PSR_EXIT_INT | \
> +   PSR_UPDATE_ERROR_INT | PSR_WAKE_ERROR_INT)
> +
> +#define DP_INTERRUPT_MASK4 \
> +   (PSR_UPDATE_MASK | PSR_CAPTURE_MASK | PSR_EXIT_MASK | \
> +   PSR_UPDATE_ERROR_MASK | PSR_WAKE_ERROR_MASK)
> +
>  struct dp_catalog_private {
> struct device *dev;
> struct dp_io *io;
> @@ -343,6 +351,20 @@ void dp_catalog_ctrl_lane_mapping(struct dp_catalog 
> *dp_catalog)
> ln_mapping);
>  }
>
> +void dp_catalog_ctrl_psr_mainlink_enable(struct dp_catalog *dp_catalog,
> +   bool enable)
> +{
> +   u32 mainlink_ctrl;
> +   struct dp_catalog_private *catalog = container_of(dp_catalog,
> +   struct dp_catalog_private, dp_catalog);
> +
> +   mainlink_ctrl = dp_read_link(catalog, REG_DP_MAINLINK_CTRL);
> +   mainlink_ctrl &= ~DP_MAINLINK_CTRL_ENABLE;
> +   mainlink_ctrl |= (enable & DP_MAINLINK_CTRL_ENABLE);
> +
> +   dp_write_link(catalog, REG_DP_MAINLINK_CTRL, mainlink_ctrl);
> +}
> +
>  void dp_catalog_ctrl_mainlink_ctrl(struct dp_catalog *dp_catalog,
> bool enable)
>  {
> @@ -581,6 +603,51 @@ void dp_catalog_ctrl_hpd_config(struct dp_catalog 
> *dp_catalog)
> dp_write_aux(catalog, REG_DP_DP_HPD_CTRL, DP_DP_HPD_CTRL_HPD_EN);
>  }
>
> +static void dp_catalog_enable_sdp(struct dp_catalog_private *catalog)
> +{
> +   /* trigger sdp */
> +   dp_write_link(catalog, MMSS_DP_SDP_CFG3, 0x1);
> +   dp_write_link(catalog, MMSS_DP_SDP_CFG3, 0x0);
> +}
> +
> +void dp_catalog_ctrl_config_psr(struct dp_catalog *dp_catalog)
> +{
> +   struct dp_catalog_private *catalog = container_of(dp_catalog,
> +   struct dp_catalog_private, dp_catalog);
> +   u32 psr_config;
> +
> +   /* enable PSR1 function */
> +   psr_config = dp_read_link(catalog, REG_PSR_CONFIG);
> +   psr_config |= BIT(0);
> +   dp_write_link(catalog, REG_PSR_CONFIG, psr_config);
> +
> +   dp_write_ahb(catalog, REG_DP_INTR_MASK4, DP_INTERRUPT_MASK4);
> +   dp_catalog_enable_sdp(catalog);
> +}
> +
> +void dp_catalog_ctrl_set_psr(struct dp_catalog *dp_catalog, bool enter)
> +{
> +   struct dp_catalog_private *catalog = container_of(dp_catalog,
> +   struct dp_catalog_private, dp_catalog);
> +   u32 psr_cmd;
> +
> +   psr_cmd = dp_read_link(catalog, REG_PSR_CMD);
> +
> +   /*
> +* BIT(0) - send psr entry SDP
> +* BIT(1) - send psr exit SDP
> +*/
> +   psr_cmd &= ~(BIT(0) | BIT(1));
> +
> +   if (enter)
> +   psr_cmd |= BIT(0);
> +   else
> +   psr_cmd |= BIT(1);
> +
> +   dp_catalog_enable_sdp(catalog);
> +   dp_write_link(catalog, REG_PSR_CMD, psr_cmd);
> +}
> +
>  u32 dp_catalog_link_is_connected(struct dp_catalog *dp_catalog)
>  {
> struct dp_catalog_private *catalog = container_of(dp_catalog,
> @@ -608,6 +675,20 @@ u32 dp_catalog_hpd_get_intr_status(struct dp_catalog 
> *dp_catalog)
> return isr;
>  }
>
> +int dp_catalog_ctrl_get_psr_interrupt(struct dp_catalog *dp_catalog)
> +{
> +   struct dp_catalog_private *catalog = container_of(dp_catalog,
> +   struct 

Re: [RFC PATCH] drm/msm/dpu1: Add a common DPU1 compatible

2022-02-21 Thread Dmitry Baryshkov
Hi,

On Tue, 22 Feb 2022 at 04:26, Konrad Dybcio
 wrote:
>
> There is *almost no reason* to keep separate compatibles for different
> SoCs utilizing the DPU1 driver, as it checks the HW version at runtime.
>
> Introduce a common compatible, while not removing the old ones to keep
> old DT compatibility.

I don't quite like this idea. Specifying a more or less exact
compatibility string gives us more flexibility.
A few recent use cases to mention:
- qcom,mdp5 compatibility. If we had SoC-specific compatibles, we
would be able to switch the drivers w/o changing the dts. With a
single compatible we would have to change the dts if we were to
change one of the boards from mdp5 to dpu1.
- qcom,mdss-dsi-ctrl vs qcm2290. We have to add a special compat string
to account for the different io addresses. If we were using
soc-specific compats, it would be one from many, not one vs many
usage.
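
To illustrate the first point, here is a minimal user-space sketch (not
kernel code). struct match_entry, match_table and lookup_driver() are
made-up names standing in for an of_device_id-style table; the compatible
strings are the ones from the patch plus the proposed generic one, and the
"driver" column is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an of_device_id-style entry. */
struct match_entry {
	const char *compatible;
	const char *driver;	/* driver that claims this compatible */
};

/*
 * With SoC-specific strings each SoC is routed on its own, so moving one
 * SoC to another driver is a one-line change here and the DT is untouched.
 */
static const struct match_entry match_table[] = {
	{ "qcom,sdm845-dpu", "dpu1" },
	{ "qcom,sc7180-dpu", "dpu1" },
	{ "qcom,sc7280-dpu", "dpu1" },
	/* Generic string: every board using it is routed the same way. */
	{ "qcom,dpu1",       "dpu1" },
};

/* Return the driver claiming @compat, or NULL if nothing matches. */
static const char *lookup_driver(const char *compat)
{
	size_t i;

	for (i = 0; i < sizeof(match_table) / sizeof(match_table[0]); i++)
		if (strcmp(match_table[i].compatible, compat) == 0)
			return match_table[i].driver;

	return NULL;
}
```

In this model, re-routing only sc7180 means editing a single table row,
whereas boards that only carry the generic string can only be re-routed
together or by touching their dts.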

>
> Signed-off-by: Konrad Dybcio 
> ---
> Bar some very very very unlikely edge cases (such as the need for some random
> quirk being applied to one SoC from a family that shares a DPU hw rev, but
> not the others), there is little to no reason to keep adding compatibles
> that don't mean anything.
>
> If this change is cool, then the question about what to do with
> dt-bindings arises...
>
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> index 47fe11a84a77..335018542a3a 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> @@ -1348,6 +1348,9 @@ static const struct dev_pm_ops dpu_pm_ops = {
>  };
>
>  const struct of_device_id dpu_dt_match[] = {
> +   { .compatible = "qcom,dpu1" },
> +
> +   /* Legacy compatibles for old DTs */
> { .compatible = "qcom,sdm845-dpu", },
> { .compatible = "qcom,sc7180-dpu", },
> { .compatible = "qcom,sc7280-dpu", },
> --
> 2.35.1
>


-- 
With best wishes
Dmitry


[PATCH v2] drm/amdgpu: check vm ready by amdgpu_vm->evicting flag

2022-02-21 Thread Qiang Yu
Workstation application ANSA/META v21.1.4 gets this error in dmesg when
running the CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. the CPU maps the buffer and accessing it causes a vm_fault, which
   tries to move it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
   evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
   will set amdgpu_vm->evicting, but later, because the bo is not in
   visible VRAM, it won't really be evicted, so it is not added to
   amdgpu_vm->evicted
5. before the next CS clears amdgpu_vm->evicting, a user VM ops
   ioctl will pass amdgpu_vm_ready() (which checks amdgpu_vm->evicted)
   but fail in amdgpu_vm_bo_update_mapping() (which checks
   amdgpu_vm->evicting) and produce this error log

This error won't affect functionality, as the next CS will finish the
waiting VM ops. But we'd better clear the error log by checking
the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling
amdgpu_vm_bo_update_mapping() later.

Another reason is that the amdgpu_vm->evicted list holds all BOs (both
user buffers and page tables), but only the eviction of page table BOs
prevents VM ops. The amdgpu_vm->evicting flag is set only for page
table BOs, so we should use the evicting flag instead of the evicted
list in amdgpu_vm_ready().

A side effect of this change: a previously blocked VM op (user
buffer in the "evicted" list but no page table in it) gets done
immediately.
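
The mismatch between the two checks can be modelled in a few lines of
user-space C. This is only a toy sketch: struct toy_vm and the toy_*
helpers are made-up names, locking is left out, and -16 stands for the
-EBUSY seen in the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two pieces of VM state named above. */
struct toy_vm {
	bool evicting;		/* set while a page table BO is being considered */
	int evicted_count;	/* number of BOs on the "evicted" list */
};

/* Old readiness check: looks only at the evicted list. */
static bool toy_vm_ready_old(const struct toy_vm *vm)
{
	return vm->evicted_count == 0;
}

/* New readiness check: looks only at the evicting flag. */
static bool toy_vm_ready_new(const struct toy_vm *vm)
{
	return !vm->evicting;
}

/* Stand-in for the mapping update, which refuses to run while evicting. */
static int toy_update_mapping(const struct toy_vm *vm)
{
	return vm->evicting ? -16 /* -EBUSY */ : 0;
}
```

With evicting set and an empty evicted list (step 4 above), the old check
reports ready and the update then fails, while the new check reports not
ready, so the update is never attempted.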

v2: update commit comments.

Reviewed-by: Christian König 
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37acd8911168..2cd9f1a2e5fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -770,11 +770,16 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
  * Check if all VM PDs/PTs are ready for updates
  *
  * Returns:
- * True if eviction list is empty.
+ * True if VM is not evicting.
  */
 bool amdgpu_vm_ready(struct amdgpu_vm *vm)
 {
-   return list_empty(&vm->evicted);
+   bool ret;
+
+   amdgpu_vm_eviction_lock(vm);
+   ret = !vm->evicting;
+   amdgpu_vm_eviction_unlock(vm);
+   return ret;
 }
 
 /**
-- 
2.25.1



2022 X.Org Board of Directors Elections Nomination period is NOW

2022-02-21 Thread Lyude Paul
We are seeking nominations for candidates for election to the X.Org Foundation
Board of Directors. All X.Org Foundation members are eligible for election to
the board.

Nominations for the 2022 election are now open and will remain open until
23:59 UTC on 06 March 2022.

The Board consists of directors elected from the membership. Each year, an
election is held to bring the total number of directors to eight. The four
members receiving the highest vote totals will serve as directors for two year
terms.

The directors who received two year terms starting in 2021 were Lyude Paul,
Samuel Iglesias Gonsálvez, Manasi D Navare and Daniel Vetter. They will
continue to serve until their term ends in 2023. Current directors whose term
expires in 2022 are Emma Anholt, Keith Packard, Harry Wentland and Mark
Filion.

A director is expected to participate in the fortnightly IRC meeting to
discuss current business and to attend the annual meeting of the X.Org
Foundation, which will be held at a location determined in advance by the
Board of Directors.

A member may nominate themselves or any other member they feel is qualified.
Nominations should be sent to the Election Committee at elections at x.org.

Nominees shall be required to be current members of the X.Org Foundation, and
submit a personal statement of up to 200 words that will be provided to
prospective voters. The collected statements, along with the statement of
contribution to the X.Org Foundation in the member's account page on
http://members.x.org, will be made available to all voters to help them make
their voting decisions.

Nominations, membership applications or renewals and completed personal
statements must be received no later than 23:59 UTC on 6th March 2022.

The slate of candidates will be published 14 March 2022 and candidate Q&A will
begin then. The deadline for Xorg membership applications and renewals is 17
March 2022.

Cheers, Lyude Paul, on behalf of the X.Org BoD




[RFC PATCH] drm/msm/dpu1: Add a common DPU1 compatible

2022-02-21 Thread Konrad Dybcio
There is *almost no reason* to keep separate compatibles for different
SoCs utilizing the DPU1 driver, as it checks the HW version at runtime.

Introduce a common compatible, while not removing the old ones to keep
old DT compatibility.

Signed-off-by: Konrad Dybcio 
---
Bar some very very very unlikely edge cases (such as the need for some random
quirk being applied to one SoC from a family that shares a DPU hw rev, but
not the others), there is little to no reason to keep adding compatibles
that don't mean anything.

If this change is cool, then the question about what to do with
dt-bindings arises...

 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 47fe11a84a77..335018542a3a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1348,6 +1348,9 @@ static const struct dev_pm_ops dpu_pm_ops = {
 };
 
 const struct of_device_id dpu_dt_match[] = {
+   { .compatible = "qcom,dpu1" },
+
+   /* Legacy compatibles for old DTs */
{ .compatible = "qcom,sdm845-dpu", },
{ .compatible = "qcom,sc7180-dpu", },
{ .compatible = "qcom,sc7280-dpu", },
-- 
2.35.1



[PATCH 1/3] drm/msm/adreno: Add A619 support

2022-02-21 Thread Konrad Dybcio
Add support for the Adreno 619 GPU, as found in Snapdragon 690 (SM6350),
480 (SM4350) and 750G (SM7225).

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 11 ++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 70 +-
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c  | 66 +++-
 drivers/gpu/drm/msm/adreno/adreno_device.c | 14 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h| 13 +++-
 5 files changed, 166 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 3e325e2a2b1b..e8d4cca6cd46 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -527,6 +527,8 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
pdc_in_aop = true;
else if (adreno_is_a618(adreno_gpu) || 
adreno_is_a640_family(adreno_gpu))
pdc_address_offset = 0x30090;
+   else if (adreno_is_a619(adreno_gpu))
+   pdc_address_offset = 0x300a0;
else
pdc_address_offset = 0x30080;
 
@@ -601,7 +603,8 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
 
pdc_write(pdcptr, REG_A6XX_PDC_GPU_TCS3_CMD0_MSGID + 4, 0x10108);
pdc_write(pdcptr, REG_A6XX_PDC_GPU_TCS3_CMD0_ADDR + 4, 0x3);
-   if (adreno_is_a618(adreno_gpu) || adreno_is_a650_family(adreno_gpu))
+   if (adreno_is_a618(adreno_gpu) || adreno_is_a619(adreno_gpu) ||
+   adreno_is_a650_family(adreno_gpu))
pdc_write(pdcptr, REG_A6XX_PDC_GPU_TCS3_CMD0_DATA + 4, 0x2);
else
pdc_write(pdcptr, REG_A6XX_PDC_GPU_TCS3_CMD0_DATA + 4, 0x3);
@@ -1537,7 +1540,7 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
SZ_16M - SZ_16K, 0x04000, "icache");
if (ret)
goto err_memory;
-   } else if (adreno_is_a640_family(adreno_gpu)) {
+   } else {
ret = a6xx_gmu_memory_alloc(gmu, &gmu->icache,
SZ_256K - SZ_16K, 0x04000, "icache");
if (ret)
@@ -1547,9 +1550,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
SZ_256K - SZ_16K, 0x44000, "dcache");
if (ret)
goto err_memory;
-   } else {
-   BUG_ON(adreno_is_a660_family(adreno_gpu));
+   }
 
+   if (adreno_is_a630(adreno_gpu) || adreno_is_a615_family(adreno_gpu)) {
/* HFI v1, has sptprac */
gmu->legacy = true;
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 17cfad6424db..ed9abb2d5e5c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -224,6 +224,74 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
a6xx_flush(gpu, ring);
 }
 
+/* For a615 family (a615, a616, a618 and a619) */
+const struct adreno_reglist a615_hwcg[] = {
+   {REG_A6XX_RBBM_CLOCK_CNTL_SP0,  0x0222},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220},
+   {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0080},
+   {REG_A6XX_RBBM_CLOCK_HYST_SP0,  0xF3CF},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TP0,  0x0222},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TP1,  0x0222},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_TP1, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_HYST_TP0,  0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_TP1,  0x},
+   {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST2_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST3_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007},
+   {REG_A6XX_RBBM_CLOCK_HYST4_TP1, 0x0007},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY2_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY3_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001},
+   {REG_A6XX_RBBM_CLOCK_DELAY4_TP1, 0x0001},
+   {REG_A6XX_RBBM_CLOCK_CNTL_UCHE,  0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_UCHE, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_UCHE, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_UCHE, 0x0022},
+   {REG_A6XX_RBBM_CLOCK_HYST_UCHE,  0x0004},
+   {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x},
+   

[PATCH 2/3] drm/msm/a6xx: Add speedbin support for A619 GPU

2022-02-21 Thread Konrad Dybcio
There are various SKUs of A619, ranging from 565 MHz to 850 MHz, depending
on the bin. Add support for distinguishing them, so that proper frequency
ranges can be applied, depending on the HW.

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index ed9abb2d5e5c..019df7a226b7 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1804,12 +1804,30 @@ static u32 a618_get_speed_bin(u32 fuse)
return UINT_MAX;
 }
 
+static u32 a619_get_speed_bin(u32 fuse)
+{
+   if (fuse == 0)
+   return 0;
+   else if (fuse == 120)
+   return 4;
+   else if (fuse == 138)
+   return 3;
+   else if (fuse == 169)
+   return 2;
+   else if (fuse == 180)
+   return 1;
+
+   return UINT_MAX;
+}
+
 static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 fuse)
 {
u32 val = UINT_MAX;
 
if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
val = a618_get_speed_bin(fuse);
+   else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
+   val = a619_get_speed_bin(fuse);
 
if (val == UINT_MAX) {
DRM_DEV_ERROR(dev,
-- 
2.35.1
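
For readers following along, the fuse lookup in this patch can also be
written as a table. The sketch below is user-space C, not the driver code:
struct fuse_bin, a619_bins, a619_speed_bin() and bin_to_mask() are
illustrative names, and turning the bin index into a one-hot mask is an
assumption about how the result is consumed. The fuse/bin pairs are the
ones from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One fuse value and the speed bin it selects (pairs taken from the patch). */
struct fuse_bin {
	uint32_t fuse;
	uint32_t bin;
};

static const struct fuse_bin a619_bins[] = {
	{ 0, 0 }, { 120, 4 }, { 138, 3 }, { 169, 2 }, { 180, 1 },
};

/* Return the speed bin for @fuse, or UINT32_MAX if the fuse is unknown. */
static uint32_t a619_speed_bin(uint32_t fuse)
{
	size_t i;

	for (i = 0; i < sizeof(a619_bins) / sizeof(a619_bins[0]); i++)
		if (a619_bins[i].fuse == fuse)
			return a619_bins[i].bin;

	return UINT32_MAX;
}

/* Assumed consumer: one bit per bin; unknown bins are passed through. */
static uint32_t bin_to_mask(uint32_t bin)
{
	return bin == UINT32_MAX ? UINT32_MAX : UINT32_C(1) << bin;
}
```

A table keeps the fuse/bin pairs in one place, at the cost of a small loop
instead of the if/else chain.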



[PATCH 3/3] drm/msm/adreno: Fix up formatting

2022-02-21 Thread Konrad Dybcio
Leading spaces are not something checkpatch likes, and it says so when
they are present. Use tabs consistently to indent function body and
unwrap a 83-char-long line, as 100 is cool nowadays.

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 9e3b4ea7f9bc..e1f9d7442114 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -198,7 +198,7 @@ static inline int adreno_is_a420(struct adreno_gpu *gpu)
 
 static inline int adreno_is_a430(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 430;
+   return gpu->revn == 430;
 }
 
 static inline int adreno_is_a506(struct adreno_gpu *gpu)
@@ -238,7 +238,7 @@ static inline int adreno_is_a540(struct adreno_gpu *gpu)
 
 static inline int adreno_is_a618(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 618;
+   return gpu->revn == 618;
 }
 
 static inline int adreno_is_a619(struct adreno_gpu *gpu)
@@ -248,7 +248,7 @@ static inline int adreno_is_a619(struct adreno_gpu *gpu)
 
 static inline int adreno_is_a630(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 630;
+   return gpu->revn == 630;
 }
 
 static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
@@ -258,18 +258,18 @@ static inline int adreno_is_a640_family(struct adreno_gpu 
*gpu)
 
 static inline int adreno_is_a650(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 650;
+   return gpu->revn == 650;
 }
 
 static inline int adreno_is_7c3(struct adreno_gpu *gpu)
 {
/* The order of args is important here to handle ANY_ID correctly */
-   return adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), gpu->rev);
+   return adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), gpu->rev);
 }
 
 static inline int adreno_is_a660(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 660;
+   return gpu->revn == 660;
 }
 
 /* check for a615, a616, a618, a619 or any derivatives */
@@ -280,14 +280,13 @@ static inline int adreno_is_a615_family(struct adreno_gpu 
*gpu)
 
 static inline int adreno_is_a660_family(struct adreno_gpu *gpu)
 {
-   return adreno_is_a660(gpu) || adreno_is_7c3(gpu);
+   return adreno_is_a660(gpu) || adreno_is_7c3(gpu);
 }
 
 /* check for a650, a660, or any derivatives */
 static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
 {
-   return gpu->revn == 650 || gpu->revn == 620 ||
-  adreno_is_a660_family(gpu);
+   return gpu->revn == 650 || gpu->revn == 620 || 
adreno_is_a660_family(gpu);
 }
 
 int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value);
-- 
2.35.1



Re: [PATCH] drm/i915: Check input parameter for NULL

2022-02-21 Thread kernel test robot
Hi Yongzhi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v5.17-rc5 next-20220217]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Yongzhi-Liu/drm-i915-Check-input-parameter-for-NULL/20220221-225508
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-a004-20220221 
(https://download.01.org/0day-ci/archive/20220222/202202220935.3r4emo4y-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/c54be425a38b3f4cb82c5badecf6b343f9e24a90
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Yongzhi-Liu/drm-i915-Check-input-parameter-for-NULL/20220221-225508
git checkout c54be425a38b3f4cb82c5badecf6b343f9e24a90
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/gem/i915_gem_phys.c: In function 
'i915_gem_object_put_pages_phys':
>> drivers/gpu/drm/i915/gem/i915_gem_phys.c:100:2: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
 100 |  dma_addr_t dma = sg_dma_address(pages->sgl);
 |  ^~
   cc1: all warnings being treated as errors
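
One common way to satisfy this check is to keep all declarations at the top
of the block and assign only after the early return. The sketch below shows
the pattern in isolation; struct toy_table and toy_first_or_default() are
made-up names and are not the i915 code.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up type standing in for the table being dereferenced. */
struct toy_table {
	int first;
};

/*
 * Declarations come first and the NULL check second, so no declaration
 * follows a statement and -Wdeclaration-after-statement stays quiet.
 */
static int toy_first_or_default(const struct toy_table *t, int def)
{
	int value;	/* declared before any statement */

	if (!t)
		return def;

	value = t->first;	/* dereferenced only after the NULL check */
	return value;
}
```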


vim +100 drivers/gpu/drm/i915/gem/i915_gem_phys.c

f033428db28bdf Chris Wilson      2019-05-28   93  
a61170975718d5 Maarten Lankhorst 2021-03-23   94  void
f033428db28bdf Chris Wilson      2019-05-28   95  i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
f033428db28bdf Chris Wilson      2019-05-28   96  			       struct sg_table *pages)
f033428db28bdf Chris Wilson      2019-05-28   97  {
c54be425a38b3f Yongzhi Liu       2022-02-21   98  	if (!pages)
c54be425a38b3f Yongzhi Liu       2022-02-21   99  		return;
c6790dc22312f5 Chris Wilson      2020-02-02 @100  	dma_addr_t dma = sg_dma_address(pages->sgl);
c6790dc22312f5 Chris Wilson      2020-02-02  101  	void *vaddr = sg_page(pages->sgl);
c6790dc22312f5 Chris Wilson      2020-02-02  102  
f033428db28bdf Chris Wilson      2019-05-28  103  	__i915_gem_object_release_shmem(obj, pages, false);
f033428db28bdf Chris Wilson      2019-05-28  104  
f033428db28bdf Chris Wilson      2019-05-28  105  	if (obj->mm.dirty) {
f033428db28bdf Chris Wilson      2019-05-28  106  		struct address_space *mapping = obj->base.filp->f_mapping;
c6790dc22312f5 Chris Wilson      2020-02-02  107  		void *src = vaddr;
f033428db28bdf Chris Wilson      2019-05-28  108  		int i;
f033428db28bdf Chris Wilson      2019-05-28  109  
f033428db28bdf Chris Wilson      2019-05-28  110  		for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
f033428db28bdf Chris Wilson      2019-05-28  111  			struct page *page;
f033428db28bdf Chris Wilson      2019-05-28  112  			char *dst;
f033428db28bdf Chris Wilson      2019-05-28  113  
f033428db28bdf Chris Wilson      2019-05-28  114  			page = shmem_read_mapping_page(mapping, i);
f033428db28bdf Chris Wilson      2019-05-28  115  			if (IS_ERR(page))
f033428db28bdf Chris Wilson      2019-05-28  116  				continue;
f033428db28bdf Chris Wilson      2019-05-28  117  
f033428db28bdf Chris Wilson      2019-05-28  118  			dst = kmap_atomic(page);
c6790dc22312f5 Chris Wilson      2020-02-02  119  			drm_clflush_virt_range(src, PAGE_SIZE);
c6790dc22312f5 Chris Wilson      2020-02-02  120  			memcpy(dst, src, PAGE_SIZE);
f033428db28bdf Chris Wilson      2019-05-28  121  			kunmap_atomic(dst);
f033428db28bdf Chris Wilson      2019-05-28  122  
f033428db28bdf Chris Wilson      2019-05-28  123  			set_page_dirty(page);
f033428db28bdf Chris Wilson      2019-05-28  124  			if (obj->mm.madv == I915_MADV_WILLNEED)
f033428db28bdf Chris Wilson      2019-05-28  125  				mark_page_accessed(page);
f033428db28bdf Chris Wilson      2019-05-28  126  			put_page(page);
c6790dc22312f5 Chris Wilson      2020-02-02  127  
c6790dc22312f5 Chris Wilson      2020-02-02  128  			src += PAGE_SIZE;
f033428db28bdf Chris Wilson      2019-05-28  129  		}
f033428db28bdf Chris Wilson      2019-05-28  130  		obj->mm.dirty = false;

Re: [PATCH v4 7/9] drm: vkms: Refactor the plane composer to accept new formats

2022-02-21 Thread Igor Torrente

Hi Pekka,

On 2/21/22 06:18, Pekka Paalanen wrote:

On Sun, 20 Feb 2022 22:02:12 -0300
Igor Torrente  wrote:


Hi Melissa,

On 2/9/22 18:45, Melissa Wen wrote:

On 02/08, Igor Torrente wrote:

Hi Melissa,

On 2/8/22 07:40, Melissa Wen wrote:

On 01/21, Igor Torrente wrote:

Currently the blend function only accepts XRGB_8888 and ARGB_8888
as a color input.

This patch refactors all the functions related to the plane composition
to overcome this limitation.

A new internal format(`struct pixel`) is introduced to deal with all
possible inputs. It consists of 16 bits fields that represent each of
the channels.

The pixels blend is done using this internal format. And new handlers
are being added to convert a specific format to/from this internal format.

So the blend operation depends on these handlers to convert to this common
format. The blended result, if necessary, is converted to the writeback
buffer format.

This patch introduces three major differences to the blend function.
1 - All the planes are blended at once.
2 - The blend calculation is done per line instead of per pixel.
3 - It is responsible for calculating the CRC and writing the writeback
   buffer (if necessary).

These changes allow us to allocate far less memory for the intermediate
buffer used by these operations, because we no longer need to hold the
entire intermediate image at once; a single line is
enough.
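
As a rough illustration of the per-line approach described above, here is a
self-contained user-space sketch. It is not the patch's code: struct pixel16
and the helper names are made up, the input is assumed to be packed 32-bit
ARGB with 8 bits per channel, and the source is assumed to be pre-multiplied
by alpha.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical internal pixel: one 16-bit field per channel. */
struct pixel16 {
	uint16_t a, r, g, b;
};

/* Widen an 8-bit channel to 16 bits so that 0xff maps to 0xffff. */
static uint16_t widen8(uint8_t c)
{
	return (uint16_t)(c * 257u);
}

/* Convert one line of packed 32-bit ARGB into the internal format. */
static void argb_line_to_internal(const uint32_t *src, struct pixel16 *dst,
				  size_t width)
{
	size_t x;

	for (x = 0; x < width; x++) {
		dst[x].a = widen8((uint8_t)(src[x] >> 24));
		dst[x].r = widen8((uint8_t)(src[x] >> 16));
		dst[x].g = widen8((uint8_t)(src[x] >> 8));
		dst[x].b = widen8((uint8_t)src[x]);
	}
}

/* dst = src + dst * (1 - src_alpha), for one pre-multiplied channel. */
static uint16_t blend_channel(uint16_t src, uint16_t dst, uint16_t src_alpha)
{
	uint32_t inv = 0xffffu - src_alpha;

	return (uint16_t)(src + (uint32_t)dst * inv / 0xffffu);
}

/* Blend one source line over one destination line, in place. */
static void blend_line(const struct pixel16 *src, struct pixel16 *dst,
		       size_t width)
{
	size_t x;

	for (x = 0; x < width; x++) {
		dst[x].r = blend_channel(src[x].r, dst[x].r, src[x].a);
		dst[x].g = blend_channel(src[x].g, dst[x].g, src[x].a);
		dst[x].b = blend_channel(src[x].b, dst[x].b, src[x].a);
		dst[x].a = 0xffff;	/* output is treated as opaque */
	}
}
```

Only two line-sized buffers are live at any time in this sketch (the
converted source line and the destination line), which is where a figure
like 2 * Width would come from.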

| Memory consumption (output dimensions) |
|:--------------------------------------:|
|      Current       |    This patch     |
|:------------------:|:-----------------:|
|   Width * Height   |     2 * Width     |

Beyond memory, we also have a minor performance benefit from all
these changes. Results running the IGT tests `*kms_cursor_crc*`:
  

First, thanks for this improvement.

Some recent changes in kms_cursor_crc caused VKMS to fail in most test
cases (iirc, only size-change and alpha-opaque are passing currently).


I updated my igt and kernel (from drm_misc/drm-misc-next) to the latest
commit[1][2] and I'm getting mixed results. Sometimes most of the tests
pass, sometimes almost nothing passes.

hmm.. is it happening when running kms_cursor_crc? Is the results
variation random or is it possible to follow a set of steps to reproduce
it? When failing, what is the reason displayed by the log?


I investigated it a little bit and discovered that the KMS cursor
tests (".*kms_cursor_crc*") are failing after the execution of the
writeback tests (".*kms_writeback.*").

I don't know what is causing it, but they are failing while trying to
commit the KMS changes.

out.txt:
IGT-Version: 1.26-NO-GIT (x86_64) (Linux: 5.17.0-rc2 x86_64)
Stack trace:
#0 ../lib/igt_core.c:1754 __igt_fail_assert()
#1 ../lib/igt_kms.c:3795 do_display_commit()
#2 ../lib/igt_kms.c:3901 igt_display_commit2()
#3 ../tests/kms_cursor_crc.c:820 __igt_uniquereal_main814()
#4 ../tests/kms_cursor_crc.c:814 main()
#5 ../csu/libc-start.c:308 __libc_start_main()
#6 [_start+0x2a]
Subtest pipe-A-cursor-size-change: FAIL

err.txt:
(kms_cursor_crc:1936) igt_kms-CRITICAL: Test assertion failure function
do_display_commit, file ../lib/igt_kms.c:3795:
(kms_cursor_crc:1936) igt_kms-CRITICAL: Failed assertion: ret == 0
(kms_cursor_crc:1936) igt_kms-CRITICAL: Last errno: 22, Invalid argument
(kms_cursor_crc:1936) igt_kms-CRITICAL: error: -22 != 0



From my side, only the first two subtests of kms_cursor_crc are passing
before this patch. And after your changes here, all subtests are
successful again, except those related to 32x10 cursor size (that needs
further investigation). I didn't check how the recent changes in
kms_cursor_crc affect VKMS performance on it, but I bet that clearing
the alpha channel is the reason to have the performance back.


Yeah, I also don't understand why the 32x10 cursor tests are failing.



Hi,

are the tests putting the cursor partially outside of the CRTC area?
Or partially outside of primary plane area (which IIRC you used when you
should have used the CRTC area?)?

Does the writeback test forget to unlink the writeback connector? Or
does VKMS not handle unlinking the writeback connector?


I don't know the answer to all these questions.

I did try to find the commit that introduced this issue, and I found
that it has been happening since writeback was introduced in Aug
2020 (dbd9d80c).

And the failure related to the 32x10 cursor was happening before my
changes.



If both are a problem, the latter would be just an unrelated bug that
exposes the first bug in VKMS, because whether writeback is used or not
probably should not affect where the cursor plane is allowed to be.


Yeah, I don't think those are related.

Best Regards.
---
Igor Torrente




Thanks,
pq


Re: [PATCH] drm/i915: Check input parameter for NULL

2022-02-21 Thread kernel test robot
Hi Yongzhi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on v5.17-rc5 next-20220217]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Yongzhi-Liu/drm-i915-Check-input-parameter-for-NULL/20220221-225508
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a001-20220221 
(https://download.01.org/0day-ci/archive/20220222/202202220847.76w2ewnu-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/c54be425a38b3f4cb82c5badecf6b343f9e24a90
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Yongzhi-Liu/drm-i915-Check-input-parameter-for-NULL/20220221-225508
git checkout c54be425a38b3f4cb82c5badecf6b343f9e24a90
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/gem/i915_gem_phys.c: In function 
'i915_gem_object_put_pages_phys':
>> drivers/gpu/drm/i915/gem/i915_gem_phys.c:100:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
 100 |  dma_addr_t dma = sg_dma_address(pages->sgl);
 |  ^~


vim +100 drivers/gpu/drm/i915/gem/i915_gem_phys.c

f033428db28bdf Chris Wilson      2019-05-28   93  
a61170975718d5 Maarten Lankhorst 2021-03-23   94  void
f033428db28bdf Chris Wilson      2019-05-28   95  i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
f033428db28bdf Chris Wilson      2019-05-28   96  			       struct sg_table *pages)
f033428db28bdf Chris Wilson      2019-05-28   97  {
c54be425a38b3f Yongzhi Liu       2022-02-21   98  	if (!pages)
c54be425a38b3f Yongzhi Liu       2022-02-21   99  		return;
c6790dc22312f5 Chris Wilson      2020-02-02 @100  	dma_addr_t dma = sg_dma_address(pages->sgl);
c6790dc22312f5 Chris Wilson      2020-02-02  101  	void *vaddr = sg_page(pages->sgl);
c6790dc22312f5 Chris Wilson      2020-02-02  102  
f033428db28bdf Chris Wilson      2019-05-28  103  	__i915_gem_object_release_shmem(obj, pages, false);
f033428db28bdf Chris Wilson      2019-05-28  104  
f033428db28bdf Chris Wilson      2019-05-28  105  	if (obj->mm.dirty) {
f033428db28bdf Chris Wilson      2019-05-28  106  		struct address_space *mapping = obj->base.filp->f_mapping;
c6790dc22312f5 Chris Wilson      2020-02-02  107  		void *src = vaddr;
f033428db28bdf Chris Wilson      2019-05-28  108  		int i;
f033428db28bdf Chris Wilson      2019-05-28  109  
f033428db28bdf Chris Wilson      2019-05-28  110  		for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
f033428db28bdf Chris Wilson      2019-05-28  111  			struct page *page;
f033428db28bdf Chris Wilson      2019-05-28  112  			char *dst;
f033428db28bdf Chris Wilson      2019-05-28  113  
f033428db28bdf Chris Wilson      2019-05-28  114  			page = shmem_read_mapping_page(mapping, i);
f033428db28bdf Chris Wilson      2019-05-28  115  			if (IS_ERR(page))
f033428db28bdf Chris Wilson      2019-05-28  116  				continue;
f033428db28bdf Chris Wilson      2019-05-28  117  
f033428db28bdf Chris Wilson      2019-05-28  118  			dst = kmap_atomic(page);
c6790dc22312f5 Chris Wilson      2020-02-02  119  			drm_clflush_virt_range(src, PAGE_SIZE);
c6790dc22312f5 Chris Wilson      2020-02-02  120  			memcpy(dst, src, PAGE_SIZE);
f033428db28bdf Chris Wilson      2019-05-28  121  			kunmap_atomic(dst);
f033428db28bdf Chris Wilson      2019-05-28  122  
f033428db28bdf Chris Wilson      2019-05-28  123  			set_page_dirty(page);
f033428db28bdf Chris Wilson      2019-05-28  124  			if (obj->mm.madv == I915_MADV_WILLNEED)
f033428db28bdf Chris Wilson      2019-05-28  125  				mark_page_accessed(page);
f033428db28bdf Chris Wilson      2019-05-28  126  			put_page(page);
c6790dc22312f5 Chris Wilson      2020-02-02  127  
c6790dc22312f5 Chris Wilson      2020-02-02  128  			src += PAGE_SIZE;
f033428db28bdf Chris Wilson      2019-05-28  129  		}
f033428db28bdf Chris Wilson      2019-05-28  130  		obj->mm.dirty = false;
f033428db28bdf Chris Wilson   

Re: [PATCH] drm/i915: Check input parameter for NULL

2022-02-21 Thread kernel test robot
Hi Yongzhi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on v5.17-rc5 next-20220217]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Yongzhi-Liu/drm-i915-Check-input-parameter-for-NULL/20220221-225508
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-a014-20220221 
(https://download.01.org/0day-ci/archive/20220222/202202220722.25bhjj6r-...@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 
d271fc04d5b97b12e6b797c6067d3c96a8d7470e)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/c54be425a38b3f4cb82c5badecf6b343f9e24a90
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Yongzhi-Liu/drm-i915-Check-input-parameter-for-NULL/20220221-225508
git checkout c54be425a38b3f4cb82c5badecf6b343f9e24a90
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_phys.c:100:13: warning: mixing declarations and code is a C99 extension [-Wdeclaration-after-statement]
   dma_addr_t dma = sg_dma_address(pages->sgl);
  ^
   1 warning generated.


vim +100 drivers/gpu/drm/i915/gem/i915_gem_phys.c

f033428db28bdf Chris Wilson      2019-05-28   93  
a61170975718d5 Maarten Lankhorst 2021-03-23   94  void
f033428db28bdf Chris Wilson      2019-05-28   95  i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
f033428db28bdf Chris Wilson      2019-05-28   96                                 struct sg_table *pages)
f033428db28bdf Chris Wilson      2019-05-28   97  {
c54be425a38b3f Yongzhi Liu       2022-02-21   98          if (!pages)
c54be425a38b3f Yongzhi Liu       2022-02-21   99                  return;
c6790dc22312f5 Chris Wilson      2020-02-02 @100          dma_addr_t dma = sg_dma_address(pages->sgl);
c6790dc22312f5 Chris Wilson      2020-02-02  101          void *vaddr = sg_page(pages->sgl);
c6790dc22312f5 Chris Wilson      2020-02-02  102  
f033428db28bdf Chris Wilson      2019-05-28  103          __i915_gem_object_release_shmem(obj, pages, false);
f033428db28bdf Chris Wilson      2019-05-28  104  
f033428db28bdf Chris Wilson      2019-05-28  105          if (obj->mm.dirty) {
f033428db28bdf Chris Wilson      2019-05-28  106                  struct address_space *mapping = obj->base.filp->f_mapping;
c6790dc22312f5 Chris Wilson      2020-02-02  107                  void *src = vaddr;
f033428db28bdf Chris Wilson      2019-05-28  108                  int i;
f033428db28bdf Chris Wilson      2019-05-28  109  
f033428db28bdf Chris Wilson      2019-05-28  110                  for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
f033428db28bdf Chris Wilson      2019-05-28  111                          struct page *page;
f033428db28bdf Chris Wilson      2019-05-28  112                          char *dst;
f033428db28bdf Chris Wilson      2019-05-28  113  
f033428db28bdf Chris Wilson      2019-05-28  114                          page = shmem_read_mapping_page(mapping, i);
f033428db28bdf Chris Wilson      2019-05-28  115                          if (IS_ERR(page))
f033428db28bdf Chris Wilson      2019-05-28  116                                  continue;
f033428db28bdf Chris Wilson      2019-05-28  117  
f033428db28bdf Chris Wilson      2019-05-28  118                          dst = kmap_atomic(page);
c6790dc22312f5 Chris Wilson      2020-02-02  119                          drm_clflush_virt_range(src, PAGE_SIZE);
c6790dc22312f5 Chris Wilson      2020-02-02  120                          memcpy(dst, src, PAGE_SIZE);
f033428db28bdf Chris Wilson      2019-05-28  121                          kunmap_atomic(dst);
f033428db28bdf Chris Wilson      2019-05-28  122  
f033428db28bdf Chris Wilson      2019-05-28  123                          set_page_dirty(page);
f033428db28bdf Chris Wilson      2019-05-28  124                          if (obj->mm.madv == I915_MADV_WILLNEED)
f033428db28bdf Chris Wilson      2019-05-28  125                                  mark_page_accessed(page);
f033428db28bdf Chris Wilson      2019-05-28  126                          put_page(page);
c6790dc22312f5 Chris Wilson      2020-02-02  127  
c6790dc22312f5 Chris Wilson  2020-02-02  128s

[PATCH v3 11/11] drm/i915: replace Intel internal tracker with kernel core ref_tracker

2022-02-21 Thread Andrzej Hajda
Besides reusing existing code, the main advantage of ref_tracker is
that it tracks each wakeref instance separately. This also allows
catching double puts.
On the other hand, we lose information about the first acquire and
the last release, but the advantages outweigh this loss.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig.debug|  11 +-
 drivers/gpu/drm/i915/Makefile |   3 -
 .../drm/i915/display/intel_display_power.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   |  25 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h   |   2 +-
 drivers/gpu/drm/i915/intel_wakeref.c  |   8 +-
 drivers/gpu/drm/i915/intel_wakeref.h  |  72 +-
 drivers/gpu/drm/i915/intel_wakeref_tracker.c  | 234 --
 drivers/gpu/drm/i915/intel_wakeref_tracker.h  |  76 --
 11 files changed, 87 insertions(+), 350 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.c
 delete mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 3bdc73f30a9e1..6c57f3e265f20 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -32,6 +32,7 @@ config DRM_I915_DEBUG
select DEBUG_FS
select PREEMPT_COUNT
select I2C_CHARDEV
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
select DRM_DP_AUX_CHARDEV
@@ -46,7 +47,6 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_GEM
select DRM_I915_DEBUG_GEM_ONCE
select DRM_I915_DEBUG_MMIO
-   select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_DEBUG_WAKEREF
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -238,18 +238,13 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 
  If in doubt, say "N".
 
-config DRM_I915_TRACK_WAKEREF
-   depends on STACKDEPOT
-   depends on STACKTRACE
-   bool
-
 config DRM_I915_DEBUG_RUNTIME_PM
bool "Enable extra state checking for runtime PM"
depends on DRM_I915
default n
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
-   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking for the
  runtime PM functionality. This may introduce overhead during
@@ -263,9 +258,9 @@ config DRM_I915_DEBUG_WAKEREF
bool "Enable extra tracking for wakerefs"
depends on DRM_I915
default n
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
-   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking and usage
  tracking for the wakerefPM functionality. This may introduce
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 88a403d3294cb..1f8d71430e2e6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -76,9 +76,6 @@ i915-$(CONFIG_DEBUG_FS) += \
display/intel_display_debugfs.o \
display/intel_pipe_crc.o
 
-i915-$(CONFIG_DRM_I915_TRACK_WAKEREF) += \
-   intel_wakeref_tracker.o
-
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # "Graphics Technology" (aka we talk to the gpu)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 9ebae7ac32356..0e1bf724f89b5 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -2107,7 +2107,7 @@ print_async_put_domains_state(struct i915_power_domains 
*power_domains)
 struct drm_i915_private,
 power_domains);
 
-   drm_dbg(>drm, "async_put_wakeref %u\n",
+   drm_dbg(>drm, "async_put_wakeref %lu\n",
power_domains->async_put_wakeref);
 
print_power_domains(power_domains, "async_put_domains[0]",
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 52e46e7830ff5..cf8cc348942cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -273,7 +273,7 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
 {
struct intel_runtime_pm *rpm = engine->uncore->rpm;
 
-   intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);
+   intel_wakeref_init(&engine->wakeref, rpm, &wf_ops, engine->name);
intel_engine_init_heartbeat(engine);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 7ee65a93f926f..01a055d0d0989 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -129,7 +129,7 @@ static const struct intel_wakeref_ops wf_ops = {
 
 void 

[PATCH v3 10/11] drm/i915: Correct type of wakeref variable

2022-02-21 Thread Andrzej Hajda
Wakeref has a dedicated type. The assumption that it will stay
int-compatible forever is incorrect.

Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7799939c38945..b308dd0866eaf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2797,7 +2797,7 @@ static void destroyed_worker_func(struct work_struct *w)
struct intel_guc *guc = container_of(w, struct intel_guc,
 submission_state.destroyed_worker);
struct intel_gt *gt = guc_to_gt(guc);
-   int tmp;
+   intel_wakeref_t tmp;
 
with_intel_gt_pm(gt, tmp)
deregister_destroyed_contexts(guc);
-- 
2.25.1
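
To illustrate why the loop variable should carry the cookie type, here is
a minimal standalone sketch of a scoped "with" macro in the spirit of
with_intel_gt_pm(). Everything below (the typedef, the struct, the macro
shape) is an assumption made for illustration and is not the i915
definition.

```c
#include <assert.h>

/* Hypothetical cookie type; wider than int on LP64 targets. */
typedef unsigned long wakeref_sketch_t;

struct gt_sketch {
	int wakeref_count;      /* references currently held */
	wakeref_sketch_t next;  /* last cookie handed out */
};

/* Cookies start at 1, so 0 can mean "no wakeref held". */
static wakeref_sketch_t gt_pm_get_sketch(struct gt_sketch *gt)
{
	gt->wakeref_count++;
	return ++gt->next;
}

static void gt_pm_put_sketch(struct gt_sketch *gt, wakeref_sketch_t wf)
{
	assert(wf != 0);
	gt->wakeref_count--;
}

/* Runs the following statement exactly once with a wakeref held. */
#define with_gt_pm_sketch(gt, wf) \
	for ((wf) = gt_pm_get_sketch(gt); (wf); \
	     gt_pm_put_sketch((gt), (wf)), (wf) = 0)

/* Returns the reference count observed inside the scoped block. */
static int count_inside_scope_sketch(struct gt_sketch *gt)
{
	wakeref_sketch_t tmp;   /* same type as the cookie, not int */
	int seen = -1;

	with_gt_pm_sketch(gt, tmp)
		seen = gt->wakeref_count;

	return seen;
}
```

If tmp were declared as int, the macro would silently narrow the cookie as
soon as the cookie type grew beyond int, which is the situation the patch
above guards against.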



[PATCH v3 07/11] lib/ref_tracker: remove warnings in case of allocation failure

2022-02-21 Thread Andrzej Hajda
The library can handle allocation failures. To avoid allocation warnings,
__GFP_NOWARN has been added everywhere. Moreover, GFP_ATOMIC has been
replaced with GFP_NOWAIT for the stack depot allocation in the tracker
free call.

Signed-off-by: Andrzej Hajda 
---
 lib/ref_tracker.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 2ef4596b6b36f..cae4498fcfd70 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -189,7 +189,7 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
struct ref_tracker *tracker;
unsigned int nr_entries;
-   gfp_t gfp_mask = gfp;
+   gfp_t gfp_mask = gfp | __GFP_NOWARN;
unsigned long flags;
 
WARN_ON_ONCE(dir->dead);
@@ -237,7 +237,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
return -EEXIST;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
+   stack_handle = stack_depot_save(entries, nr_entries,
+   GFP_NOWAIT | __GFP_NOWARN);
 
spin_lock_irqsave(&dir->lock, flags);
if (tracker->dead) {
-- 
2.25.1



[PATCH v3 08/11] drm/i915: Separate wakeref tracking

2022-02-21 Thread Andrzej Hajda
From: Chris Wilson 

Extract the callstack tracking of intel_runtime_pm.c into its own
utility so that we can reuse it for other online debugging of
scoped wakerefs.

Signed-off-by: Chris Wilson 
Reviewed-by: Andrzej Hajda 
Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/Kconfig.debug   |   9 +
 drivers/gpu/drm/i915/Makefile|   4 +
 drivers/gpu/drm/i915/intel_runtime_pm.c  | 244 +++
 drivers/gpu/drm/i915/intel_runtime_pm.h  |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.h |   6 +-
 drivers/gpu/drm/i915/intel_wakeref_tracker.c | 234 ++
 drivers/gpu/drm/i915/intel_wakeref_tracker.h |  76 ++
 7 files changed, 355 insertions(+), 228 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.c
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index e7fd3e76f8a20..8b1973146e848 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -33,6 +33,7 @@ config DRM_I915_DEBUG
select PREEMPT_COUNT
select I2C_CHARDEV
select STACKDEPOT
+   select STACKTRACE
select DRM_DP_AUX_CHARDEV
select X86_MSR # used by igt/pm_rpm
select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
@@ -45,6 +46,7 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_GEM
select DRM_I915_DEBUG_GEM_ONCE
select DRM_I915_DEBUG_MMIO
+   select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
@@ -235,11 +237,18 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 
  If in doubt, say "N".
 
+config DRM_I915_TRACK_WAKEREF
+   depends on STACKDEPOT
+   depends on STACKTRACE
+   bool
+
 config DRM_I915_DEBUG_RUNTIME_PM
bool "Enable extra state checking for runtime PM"
depends on DRM_I915
default n
select STACKDEPOT
+   select STACKTRACE
+   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking for the
  runtime PM functionality. This may introduce overhead during
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9d588d936e3dc..88a403d3294cb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -75,6 +75,10 @@ i915-$(CONFIG_DEBUG_FS) += \
i915_debugfs_params.o \
display/intel_display_debugfs.o \
display/intel_pipe_crc.o
+
+i915-$(CONFIG_DRM_I915_TRACK_WAKEREF) += \
+   intel_wakeref_tracker.o
+
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # "Graphics Technology" (aka we talk to the gpu)
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 6ed5786bcd299..7bd10efa56bf3 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -52,182 +52,37 @@
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
 
-#include 
-
-#define STACKDEPTH 8
-
-static noinline depot_stack_handle_t __save_depot_stack(void)
-{
-   unsigned long entries[STACKDEPTH];
-   unsigned int n;
-
-   n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
-}
-
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
-   spin_lock_init(&rpm->debug.lock);
-   stack_depot_init();
+   intel_wakeref_tracker_init(&rpm->debug);
 }
 
-static noinline depot_stack_handle_t
+static intel_wakeref_t
 track_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
-   depot_stack_handle_t stack, *stacks;
-   unsigned long flags;
-
-   if (rpm->no_wakeref_tracking)
-   return -1;
-
-   stack = __save_depot_stack();
-   if (!stack)
+   if (!rpm->available)
return -1;
 
-   spin_lock_irqsave(&rpm->debug.lock, flags);
-
-   if (!rpm->debug.count)
-   rpm->debug.last_acquire = stack;
-
-   stacks = krealloc(rpm->debug.owners,
- (rpm->debug.count + 1) * sizeof(*stacks),
- GFP_NOWAIT | __GFP_NOWARN);
-   if (stacks) {
-   stacks[rpm->debug.count++] = stack;
-   rpm->debug.owners = stacks;
-   } else {
-   stack = -1;
-   }
-
-   spin_unlock_irqrestore(&rpm->debug.lock, flags);
-
-   return stack;
+   return intel_wakeref_tracker_add(&rpm->debug);
 }
 
 static void untrack_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm,
-depot_stack_handle_t stack)
+intel_wakeref_t wakeref)
 {
-   struct drm_i915_private *i915 = container_of(rpm,
-struct drm_i915_private,
-

[PATCH v3 09/11] drm/i915: Track leaked gt->wakerefs

2022-02-21 Thread Andrzej Hajda
From: Chris Wilson 

Track every intel_gt_pm_get() until its corresponding release in
intel_gt_pm_put() by returning a cookie to the caller on acquire that
must be passed back on release. When there is an imbalance, we can see who
either tried to free a stale wakeref, or who forgot to free theirs.
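
The cookie scheme described above can be sketched in standalone C as
follows. This is only an illustration under stated assumptions: the slot
table, the names, and the string used to identify the owner are all
hypothetical, and the real tracker records stack traces rather than
strings.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COOKIES_SKETCH 8

/* Hypothetical tracker: slot i is busy while cookie i + 1 is outstanding. */
struct pm_tracker_sketch {
	const char *owner[MAX_COOKIES_SKETCH];  /* who acquired, NULL if free */
};

/* Returns a non-zero cookie, or 0 when the table is full (untracked). */
static unsigned long pm_get_sketch(struct pm_tracker_sketch *t, const char *who)
{
	size_t i;

	assert(who != NULL);
	for (i = 0; i < MAX_COOKIES_SKETCH; i++) {
		if (!t->owner[i]) {
			t->owner[i] = who;
			return i + 1;
		}
	}
	return 0;
}

/* Returns 0 on success, -1 if the cookie is stale or was never issued. */
static int pm_put_sketch(struct pm_tracker_sketch *t, unsigned long cookie)
{
	if (cookie == 0 || cookie > MAX_COOKIES_SKETCH || !t->owner[cookie - 1])
		return -1;
	t->owner[cookie - 1] = NULL;
	return 0;
}

/* Counts acquisitions that were never released. */
static int pm_count_leaks_sketch(const struct pm_tracker_sketch *t)
{
	int leaks = 0;
	size_t i;

	for (i = 0; i < MAX_COOKIES_SKETCH; i++)
		if (t->owner[i])
			leaks++;
	return leaks;
}
```

A second put with the same cookie is rejected, and any slot still owned at
teardown points at the caller that forgot to release. Note that this sketch
reuses slots, so a stale cookie could match a later acquisition; a real
tracker would need to avoid that.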

v2: Rebase from backporting wakeref leak (Umesh)

Signed-off-by: Chris Wilson 
Reviewed-by: Andrzej Hajda 
Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/Kconfig.debug| 15 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  7 ++--
 .../i915/gem/selftests/i915_gem_coherency.c   | 10 +++--
 .../drm/i915/gem/selftests/i915_gem_mman.c| 14 ---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 13 --
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 +
 .../drm/i915/gt/intel_execlists_submission.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 10 +++--
 drivers/gpu/drm/i915/gt/intel_gt_pm.h | 36 
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |  4 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  | 20 +
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c  |  5 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  | 10 +++--
 drivers/gpu/drm/i915/gt/selftest_rps.c| 17 
 drivers/gpu/drm/i915/gt/selftest_slpc.c   | 10 +++--
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  9 ++--
 drivers/gpu/drm/i915/i915_pmu.c   | 16 +++
 drivers/gpu/drm/i915/intel_wakeref.c  |  4 ++
 drivers/gpu/drm/i915/intel_wakeref.h  | 42 +++
 21 files changed, 182 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 8b1973146e848..3bdc73f30a9e1 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -48,6 +48,7 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_MMIO
select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
+   select DRM_I915_DEBUG_WAKEREF
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
select BROKEN # for prototype uAPI
@@ -257,3 +258,17 @@ config DRM_I915_DEBUG_RUNTIME_PM
  Recommended for driver developers only.
 
  If in doubt, say "N"
+
+config DRM_I915_DEBUG_WAKEREF
+   bool "Enable extra tracking for wakerefs"
+   depends on DRM_I915
+   default n
+   select STACKDEPOT
+   select STACKTRACE
+   select DRM_I915_TRACK_WAKEREF
+   help
+ Choose this option to turn on extra state checking and usage
+ tracking for the wakerefPM functionality. This may introduce
+ overhead during driver runtime.
+
+ If in doubt, say "N"
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 13c975da77474..4b6c144f706da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,6 +252,7 @@ struct i915_execbuffer {
struct intel_gt *gt; /* gt for the execbuf */
struct intel_context *context; /* logical state for the request */
struct i915_gem_context *gem_context; /** caller's context */
+   intel_wakeref_t wakeref;
 
/** our requests to build */
struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
@@ -2679,7 +2680,7 @@ eb_select_engine(struct i915_execbuffer *eb)
 
for_each_child(ce, child)
intel_context_get(child);
-   intel_gt_pm_get(ce->engine->gt);
+   eb->wakeref = intel_gt_pm_get(ce->engine->gt);
 
if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
err = intel_context_alloc_state(ce);
@@ -2713,7 +2714,7 @@ eb_select_engine(struct i915_execbuffer *eb)
return err;
 
 err:
-   intel_gt_pm_put(ce->engine->gt);
+   intel_gt_pm_put(ce->engine->gt, eb->wakeref);
for_each_child(ce, child)
intel_context_put(child);
intel_context_put(ce);
@@ -2725,7 +2726,7 @@ eb_put_engine(struct i915_execbuffer *eb)
 {
struct intel_context *child;
 
-   intel_gt_pm_put(eb->gt);
+   intel_gt_pm_put(eb->context->engine->gt, eb->wakeref);
for_each_child(eb->context, child)
intel_context_put(child);
intel_context_put(eb->context);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 13b088cc787eb..553f2730c2a76 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -85,6 +85,7 @@ static int cpu_get(struct context *ctx, unsigned long offset, 
u32 *v)
 
 static int gtt_set(struct context *ctx, unsigned long offset, u32 v)
 {
+   intel_wakeref_t wakeref;
struct i915_vma *vma;
u32 __iomem *map;
  

[PATCH v3 06/11] lib/ref_tracker: add printing to memory buffer

2022-02-21 Thread Andrzej Hajda
This is useful in case one wants to show the stats via debugfs.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h |  8 ++
 lib/ref_tracker.c   | 56 +++--
 2 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index a2cf1f6309adb..2fdbfd2e14797 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -50,6 +50,8 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit);
 
+int ref_tracker_dir_snprint(struct ref_tracker_dir *dir, char *buf, size_t size);
+
 int ref_tracker_alloc(struct ref_tracker_dir *dir,
  struct ref_tracker **trackerp, gfp_t gfp);
 
@@ -78,6 +80,12 @@ static inline void ref_tracker_dir_print(struct 
ref_tracker_dir *dir,
 {
 }
 
+static inline int ref_tracker_dir_snprint(struct ref_tracker_dir *dir,
+ char *buf, size_t size)
+{
+   return 0;
+}
+
 static inline int ref_tracker_alloc(struct ref_tracker_dir *dir,
struct ref_tracker **trackerp,
gfp_t gfp)
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index ab1253fde244e..2ef4596b6b36f 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -62,8 +62,27 @@ ref_tracker_get_stats(struct ref_tracker_dir *dir, unsigned 
int limit)
return stats;
 }
 
-void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
+struct ostream {
+   char *buf;
+   int size, used;
+};
+
+#define pr_ostream(stream, fmt, args...) \
+({ \
+   struct ostream *_s = (stream); \
+\
+   if (!_s->buf) { \
+   pr_err(fmt, ##args); \
+   } else { \
+   int ret, len = _s->size - _s->used; \
+   ret = snprintf(_s->buf + _s->used, len, pr_fmt(fmt), ##args); \
+   _s->used += min(ret, len); \
+   } \
+})
+
+static void
+__ref_tracker_dir_pr_ostream(struct ref_tracker_dir *dir,
+unsigned int display_limit, struct ostream *s)
 {
struct ref_tracker_dir_stats *stats;
unsigned int i = 0, skipped;
@@ -77,8 +96,8 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
 
stats = ref_tracker_get_stats(dir, display_limit);
if (IS_ERR(stats)) {
-   pr_err("%s@%pK: couldn't get stats, error %pe\n",
-  dir->name, dir, stats);
+   pr_ostream(s, "%s@%pK: couldn't get stats, error %pe\n",
+  dir->name, dir, stats);
return;
}
 
@@ -88,19 +107,27 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
stack = stats->stacks[i].stack_handle;
if (sbuf && !stack_depot_snprint(stack, sbuf, STACK_BUF_SIZE, 4))
sbuf[0] = 0;
-   pr_err("%s@%pK has %d/%d users at\n%s\n", dir->name, dir,
-  stats->stacks[i].count, stats->total, sbuf);
+   pr_ostream(s, "%s@%pK has %d/%d users at\n%s\n", dir->name, dir,
+  stats->stacks[i].count, stats->total, sbuf);
skipped -= stats->stacks[i].count;
}
 
if (skipped)
-   pr_err("%s@%pK skipped reports about %d/%d users.\n",
-  dir->name, dir, skipped, stats->total);
+   pr_ostream(s, "%s@%pK skipped reports about %d/%d users.\n",
+  dir->name, dir, skipped, stats->total);
 
kfree(sbuf);
 
kfree(stats);
 }
+
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ostream os = {};
+
+   __ref_tracker_dir_pr_ostream(dir, display_limit, &os);
+}
 EXPORT_SYMBOL(__ref_tracker_dir_print);
 
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
@@ -114,6 +141,19 @@ void ref_tracker_dir_print(struct ref_tracker_dir *dir,
 }
 EXPORT_SYMBOL(ref_tracker_dir_print);
 
+int ref_tracker_dir_snprint(struct ref_tracker_dir *dir, char *buf, size_t size)
+{
+   struct ostream os = { .buf = buf, .size = size };
+   unsigned long flags;
+
+   spin_lock_irqsave(&dir->lock, flags);
+   __ref_tracker_dir_pr_ostream(dir, 16, &os);
+   spin_unlock_irqrestore(&dir->lock, flags);
+
+   return os.used;
+}
+EXPORT_SYMBOL(ref_tracker_dir_snprint);
+
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 {
struct ref_tracker *tracker, *n;
-- 
2.25.1
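
The buffer accounting done by pr_ostream() can be tried out in userspace
with a small sketch like the one below. It mirrors only the bookkeeping
from the patch (append at buf + used, then advance used by at most the
remaining length); the names are hypothetical, and the pr_err() fallback
for a NULL buffer is left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

struct ostream_sketch {
	char *buf;
	int size, used;
};

/* Appends formatted text; on truncation, 'used' saturates at 'size'. */
static void os_printf_sketch(struct ostream_sketch *s, const char *fmt, ...)
{
	int len = s->size - s->used;
	va_list ap;
	int ret;

	assert(s->buf && len >= 0);

	va_start(ap, fmt);
	ret = vsnprintf(s->buf + s->used, (size_t)len, fmt, ap);
	va_end(ap);

	if (ret < 0)
		return;  /* output error: leave the stream untouched */
	s->used += ret < len ? ret : len;
}
```

Because used can reach size on truncation while the last byte holds the
terminating NUL, a caller that treats the return value of
ref_tracker_dir_snprint() as a string length may want to account for that.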



[PATCH v3 05/11] lib/ref_tracker: __ref_tracker_dir_print improve printing

2022-02-21 Thread Andrzej Hajda
To improve readability of ref_tracker printing, the following changes
have been made:
- reports are printed per stack_handle - log is more compact,
- added display name for ref_tracker_dir,
- stack trace is printed indented, in the same printk call,
- total number of references is printed every time,
- print info about dropped references.
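
The per-stack_handle grouping listed above can be sketched in standalone C
as below. The loop follows the shape of ref_tracker_get_stats() in this
patch, but the fixed-size array, the plain unsigned handles, and all names
are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define STATS_LIMIT_SKETCH 4

struct stack_stat_sketch {
	unsigned int handle;  /* stands in for depot_stack_handle_t */
	unsigned int count;
};

struct dir_stats_sketch {
	int total;  /* all references seen */
	int count;  /* distinct handles recorded */
	struct stack_stat_sketch stacks[STATS_LIMIT_SKETCH];
};

static void collect_stats_sketch(const unsigned int *handles, size_t n,
				 struct dir_stats_sketch *stats)
{
	size_t k;

	stats->total = 0;
	stats->count = 0;

	for (k = 0; k < n; k++) {
		int i;

		++stats->total;
		for (i = 0; i < stats->count; ++i)
			if (stats->stacks[i].handle == handles[k])
				break;
		if (i >= STATS_LIMIT_SKETCH)
			continue;  /* no room for a new handle: skipped */
		if (i >= stats->count) {
			stats->stacks[i].handle = handles[k];
			stats->stacks[i].count = 0;
			++stats->count;
		}
		++stats->stacks[i].count;
	}
}

/* References that did not fit into any reported bucket. */
static int skipped_sketch(const struct dir_stats_sketch *stats)
{
	int skipped = stats->total;
	int i;

	for (i = 0; i < stats->count; ++i)
		skipped -= (int)stats->stacks[i].count;
	return skipped;
}
```

With a limit of four buckets, a fifth distinct handle still counts toward
the total but is reported only through the skipped figure.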

Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h | 15 +--
 lib/ref_tracker.c   | 90 -
 2 files changed, 91 insertions(+), 14 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 3e9e9df2a41f5..a2cf1f6309adb 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -17,12 +17,19 @@ struct ref_tracker_dir {
booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
+   charname[32];
 #endif
 };
 
 #ifdef CONFIG_REF_TRACKER
-static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
-   unsigned int quarantine_count)
+
+// Temporary allow two and three arguments, until consumers are converted
+#define ref_tracker_dir_init(_d, _q, args...) _ref_tracker_dir_init(_d, _q, ##args, #_d)
+#define _ref_tracker_dir_init(_d, _q, _n, ...) __ref_tracker_dir_init(_d, _q, _n)
+
+static inline void __ref_tracker_dir_init(struct ref_tracker_dir *dir,
+   unsigned int quarantine_count,
+   const char *name)
 {
INIT_LIST_HEAD(&dir->list);
INIT_LIST_HEAD(&dir->quarantine);
@@ -31,6 +38,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
dir->dead = false;
refcount_set(&dir->untracked, 1);
refcount_set(&dir->no_tracker, 1);
+   strlcpy(dir->name, name, sizeof(dir->name));
stack_depot_init();
 }
 
@@ -51,7 +59,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
 #else /* CONFIG_REF_TRACKER */
 
 static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
-   unsigned int quarantine_count)
+   unsigned int quarantine_count,
+   ...)
 {
 }
 
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 5e9f90bbf771b..ab1253fde244e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -1,11 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
+
+#define pr_fmt(fmt) "ref_tracker: " fmt
+
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
 #define REF_TRACKER_STACK_ENTRIES 16
+#define STACK_BUF_SIZE 1024
 
 struct ref_tracker {
struct list_headhead;   /* anchor into dir->list or 
dir->quarantine */
@@ -14,24 +19,87 @@ struct ref_tracker {
depot_stack_handle_tfree_stack_handle;
 };
 
-void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
+struct ref_tracker_dir_stats {
+   int total;
+   int count;
+   struct {
+   depot_stack_handle_t stack_handle;
+   unsigned int count;
+   } stacks[];
+};
+
+static struct ref_tracker_dir_stats *
+ref_tracker_get_stats(struct ref_tracker_dir *dir, unsigned int limit)
 {
+   struct ref_tracker_dir_stats *stats;
struct ref_tracker *tracker;
-   unsigned int i = 0;
 
-   lockdep_assert_held(&dir->lock);
+   stats = kmalloc(struct_size(stats, stacks, limit),
+   GFP_NOWAIT | __GFP_NOWARN);
+   if (!stats)
+   return ERR_PTR(-ENOMEM);
+   stats->total = 0;
+   stats->count = 0;
 
list_for_each_entry(tracker, &dir->list, head) {
-   if (i < display_limit) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
-   i++;
-   } else {
-   break;
+   depot_stack_handle_t stack = tracker->alloc_stack_handle;
+   int i;
+
+   ++stats->total;
+   for (i = 0; i < stats->count; ++i)
+   if (stats->stacks[i].stack_handle == stack)
+   break;
+   if (i >= limit)
+   continue;
+   if (i >= stats->count) {
+   stats->stacks[i].stack_handle = stack;
+   stats->stacks[i].count = 0;
+   ++stats->count;
}
+   ++stats->stacks[i].count;
+   }
+
+   return stats;
+}
+
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ref_tracker_dir_stats *stats;
+   unsigned int i = 0, skipped;
+   depot_stack_handle_t stack;
+  

[PATCH v3 03/11] [DO NOT MERGE] ref_tracker: remove filter_irq_stacks() call

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

After commit e94006608949 ("lib/stackdepot: always do filter_irq_stacks()
in stack_depot_save()") it became unnecessary to filter the stack
before calling stack_depot_save().

Signed-off-by: Eric Dumazet 
Cc: Marco Elver 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 lib/ref_tracker.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 9c0c2e09df666..dc7b14aa3431e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -89,7 +89,6 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
return -ENOMEM;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   nr_entries = filter_irq_stacks(entries, nr_entries);
tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
 
spin_lock_irqsave(&dir->lock, flags);
@@ -120,7 +119,6 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
return -EEXIST;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   nr_entries = filter_irq_stacks(entries, nr_entries);
stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
 
spin_lock_irqsave(&dir->lock, flags);
-- 
2.25.1



[PATCH v3 04/11] lib/ref_tracker: add unlocked leak print helper

2022-02-21 Thread Andrzej Hajda
To detect leaks reliably, the caller must be able to check both the
tracked counter and the leaks under the same lock. dir.lock is the natural
candidate for such a lock, and the unlocked print helper can be called
with this lock taken.
As a bonus, we can reuse this helper in ref_tracker_dir_exit.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h |  8 +
 lib/ref_tracker.c   | 66 +
 2 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 9ca353ab712b5..3e9e9df2a41f5 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -36,6 +36,9 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
 
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
 
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit);
+
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit);
 
@@ -56,6 +59,11 @@ static inline void ref_tracker_dir_exit(struct 
ref_tracker_dir *dir)
 {
 }
 
+static inline void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+}
+
 static inline void ref_tracker_dir_print(struct ref_tracker_dir *dir,
 unsigned int display_limit)
 {
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index dc7b14aa3431e..5e9f90bbf771b 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -14,6 +14,38 @@ struct ref_tracker {
depot_stack_handle_tfree_stack_handle;
 };
 
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ref_tracker *tracker;
+   unsigned int i = 0;
+
+   lockdep_assert_held(&dir->lock);
+
+   list_for_each_entry(tracker, &dir->list, head) {
+   if (i < display_limit) {
+   pr_err("leaked reference.\n");
+   if (tracker->alloc_stack_handle)
+   stack_depot_print(tracker->alloc_stack_handle);
+   i++;
+   } else {
+   break;
+   }
+   }
+}
+EXPORT_SYMBOL(__ref_tracker_dir_print);
+
+void ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&dir->lock, flags);
+   __ref_tracker_dir_print(dir, display_limit);
+   spin_unlock_irqrestore(&dir->lock, flags);
+}
+EXPORT_SYMBOL(ref_tracker_dir_print);
+
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 {
struct ref_tracker *tracker, *n;
@@ -27,13 +59,13 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
kfree(tracker);
dir->quarantine_avail++;
}
-   list_for_each_entry_safe(tracker, n, &dir->list, head) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
+   if (!list_empty(&dir->list)) {
+   __ref_tracker_dir_print(dir, 16);
leak = true;
-   list_del(&tracker->head);
-   kfree(tracker);
+   list_for_each_entry_safe(tracker, n, &dir->list, head) {
+   list_del(&tracker->head);
+   kfree(tracker);
+   }
}
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
@@ -42,28 +74,6 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 }
 EXPORT_SYMBOL(ref_tracker_dir_exit);
 
-void ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
-{
-   struct ref_tracker *tracker;
-   unsigned long flags;
-   unsigned int i = 0;
-
-   spin_lock_irqsave(&dir->lock, flags);
-   list_for_each_entry(tracker, &dir->list, head) {
-   if (i < display_limit) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
-   i++;
-   } else {
-   break;
-   }
-   }
-   spin_unlock_irqrestore(&dir->lock, flags);
-}
-EXPORT_SYMBOL(ref_tracker_dir_print);
-
 int ref_tracker_alloc(struct ref_tracker_dir *dir,
  struct ref_tracker **trackerp,
  gfp_t gfp)
-- 
2.25.1
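
The split between an unlocked helper and a locking wrapper can be sketched
in userspace as below. This is a single-threaded illustration only: the
boolean flag stands in for the spinlock, the assert plays the role of
lockdep_assert_held(), and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical directory; 'locked' stands in for a held spinlock. */
struct dir_sketch {
	bool locked;
	int tracked;  /* references the caller believes are outstanding */
	int leaks;    /* trackers still on the active list */
};

static void dir_lock_sketch(struct dir_sketch *d)
{
	assert(!d->locked);
	d->locked = true;
}

static void dir_unlock_sketch(struct dir_sketch *d)
{
	assert(d->locked);
	d->locked = false;
}

/* Unlocked helper: the caller must already hold the lock. */
static int dir_leaks_unlocked_sketch(const struct dir_sketch *d)
{
	assert(d->locked);
	return d->leaks;
}

/* Locking wrapper for callers that do not hold the lock. */
static int dir_leaks_sketch(struct dir_sketch *d)
{
	int n;

	dir_lock_sketch(d);
	n = dir_leaks_unlocked_sketch(d);
	dir_unlock_sketch(d);
	return n;
}

/* Checks the counter and the leaks within one critical section. */
static bool dir_consistent_sketch(struct dir_sketch *d)
{
	bool ok;

	dir_lock_sketch(d);
	ok = (d->tracked == dir_leaks_unlocked_sketch(d));
	dir_unlock_sketch(d);
	return ok;
}
```

The last function is the case the commit message describes: both values
are read under the same lock, which the locking wrapper alone could not
guarantee.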



[PATCH v3 02/11] [DO NOT MERGE] ref_tracker: add a count of untracked references

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
from a dev_hold_track()

To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h |  2 ++
 lib/ref_tracker.c   | 12 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index a443abda937d8..9ca353ab712b5 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t  lock;
unsigned intquarantine_avail;
refcount_t  untracked;
+   refcount_t  no_tracker;
booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
@@ -29,6 +30,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
dir->quarantine_avail = quarantine_count;
dir->dead = false;
refcount_set(&dir->untracked, 1);
+   refcount_set(&dir->no_tracker, 1);
stack_depot_init();
 }
 
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 32ff6bd497f8e..9c0c2e09df666 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -38,6 +38,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
+   WARN_ON_ONCE(refcount_read(&dir->no_tracker) != 1);
 }
 EXPORT_SYMBOL(ref_tracker_dir_exit);
 
@@ -75,6 +76,10 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
 
WARN_ON_ONCE(dir->dead);
 
+   if (!trackerp) {
+   refcount_inc(&dir->no_tracker);
+   return 0;
+   }
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -98,13 +103,18 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
 struct ref_tracker **trackerp)
 {
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
-   struct ref_tracker *tracker = *trackerp;
depot_stack_handle_t stack_handle;
+   struct ref_tracker *tracker;
unsigned int nr_entries;
unsigned long flags;
 
WARN_ON_ONCE(dir->dead);
 
+   if (!trackerp) {
+   refcount_dec(&dir->no_tracker);
+   return 0;
+   }
+   tracker = *trackerp;
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
-- 
2.25.1



[PATCH v3 01/11] [DO NOT MERGE] ref_tracker: implement use-after-free detection

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

Whenever ref_tracker_dir_exit() is called, mark the struct ref_tracker_dir
as dead.

Test the dead status from ref_tracker_alloc() and ref_tracker_free().

This should detect buggy dev_put()/dev_hold() calls happening too late
in the netdevice dismantle process.
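
The detection scheme can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: the toy_* names are hypothetical, and a counter stands in for the WARN_ON_ONCE() the patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the "dead" marker. */
struct toy_dir {
	bool dead;			/* set once the directory is torn down */
	unsigned int late_calls;	/* stands in for WARN_ON_ONCE() firing */
};

static void toy_dir_init(struct toy_dir *dir)
{
	assert(dir != NULL);
	dir->dead = false;
	dir->late_calls = 0;
}

static void toy_dir_exit(struct toy_dir *dir)
{
	assert(dir != NULL);
	dir->dead = true;
}

/* Any get or put arriving after toy_dir_exit() is reported. */
static void toy_ref_get(struct toy_dir *dir)
{
	if (dir->dead)
		dir->late_calls++;
}

static void toy_ref_put(struct toy_dir *dir)
{
	if (dir->dead)
		dir->late_calls++;
}
```

Note that the flag only reports calls that arrive after teardown; it does not by itself say which reference was involved.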

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h | 2 ++
 lib/ref_tracker.c   | 5 +
 2 files changed, 7 insertions(+)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 60f3453be23e6..a443abda937d8 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t  lock;
unsigned intquarantine_avail;
refcount_t  untracked;
+   booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
 #endif
@@ -26,6 +27,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
INIT_LIST_HEAD(&dir->quarantine);
spin_lock_init(&dir->lock);
dir->quarantine_avail = quarantine_count;
+   dir->dead = false;
refcount_set(&dir->untracked, 1);
stack_depot_init();
 }
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index a6789c0c626b0..32ff6bd497f8e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -20,6 +20,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
unsigned long flags;
bool leak = false;
 
+   dir->dead = true;
spin_lock_irqsave(&dir->lock, flags);
list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
list_del(&tracker->head);
@@ -72,6 +73,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
gfp_t gfp_mask = gfp;
unsigned long flags;
 
+   WARN_ON_ONCE(dir->dead);
+
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -100,6 +103,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
unsigned int nr_entries;
unsigned long flags;
 
+   WARN_ON_ONCE(dir->dead);
+
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
-- 
2.25.1



[PATCH v3 00/11] drm/i915: use ref_tracker library for tracking wakerefs

2022-02-21 Thread Andrzej Hajda
Hi,

The arrival of the ref_tracker library makes it possible to drop the custom
wakeref tracking solution used in i915 and reuse the library instead.
To that end a few adjustments have been made to ref_tracker; details are in
the patches. I hope the changes are OK with the original author.

The patchset has been rebased on top of drm-tip so that CI can test the changes.
Patches marked "[DO NOT MERGE]" are cherry-picked from linux-next (they are
not yet in drm-tip) so that CI can build and run the patchset (it works only
on the drm-tip tree).

CCed netdev, as it is the only user of the library at the moment.

v2:
  - replaced list_sort with ref_tracker_dir_stats to avoid potentially
    expensive sorting; if the number of reports is expected to be large (???),
    we can replace the linear search in ref_tracker_dir_stats.stacks with a
    binary heap (min_heap),
  - refactored gfp flags,
  - fixed i915 handling of no-tracking flag.
v3:
  - fixed mess with duplicated mails
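
To illustrate the first v2 item, here is a rough userspace sketch of aggregating users per stack handle with a linear search over a small bounded table instead of sorting a list. The toy_* names, the table size and the over-limit policy are assumptions made for this sketch and are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_STACKS 4	/* hypothetical report limit */

struct toy_stack_stat {
	unsigned int handle;	/* stands in for a depot stack handle */
	unsigned int count;	/* number of users sharing that stack */
};

struct toy_stats {
	unsigned int total;	/* all users seen */
	unsigned int used;	/* distinct stacks recorded */
	struct toy_stack_stat stacks[TOY_MAX_STACKS];
};

/* Count users per stack handle; unseen handles get the next free slot. */
static void toy_collect(const unsigned int *handles, size_t n,
			struct toy_stats *st)
{
	size_t i;
	unsigned int j;

	assert(st != NULL);
	st->total = 0;
	st->used = 0;

	for (i = 0; i < n; i++) {
		st->total++;

		/* Linear search: cheap while the table stays small. */
		for (j = 0; j < st->used; j++)
			if (st->stacks[j].handle == handles[i])
				break;

		if (j == st->used) {
			if (st->used == TOY_MAX_STACKS)
				continue;	/* over the limit: total only */
			st->stacks[j].handle = handles[i];
			st->stacks[j].count = 0;
			st->used++;
		}
		st->stacks[j].count++;
	}
}
```

The difference between total and the sum of the per-stack counts gives the number of users that fell outside the table, which is the kind of figure a "skipped reports" line can be derived from.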

Regards
Andrzej


Andrzej Hajda (6):
  lib/ref_tracker: add unlocked leak print helper
  lib/ref_tracker: __ref_tracker_dir_print improve printing
  lib/ref_tracker: add printing to memory buffer
  lib/ref_tracker: remove warnings in case of allocation failure
  drm/i915: Correct type of wakeref variable
  drm/i915: replace Intel internal tracker with kernel core ref_tracker

Chris Wilson (2):
  drm/i915: Separate wakeref tracking
  drm/i915: Track leaked gt->wakerefs

Eric Dumazet (3):
  [DO NOT MERGE] ref_tracker: implement use-after-free detection
  [DO NOT MERGE] ref_tracker: add a count of untracked references
  [DO NOT MERGE] ref_tracker: remove filter_irq_stacks() call

 drivers/gpu/drm/i915/Kconfig.debug|  19 ++
 drivers/gpu/drm/i915/Makefile |   1 +
 .../drm/i915/display/intel_display_power.c|   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   7 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |  10 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  14 +-
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   |  13 +-
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |   3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   6 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   2 +
 .../drm/i915/gt/intel_execlists_submission.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  12 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h |  36 ++-
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |   4 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  20 +-
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c  |   5 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  10 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c|  17 +-
 drivers/gpu/drm/i915/gt/selftest_slpc.c   |  10 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  11 +-
 drivers/gpu/drm/i915/i915_pmu.c   |  16 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   | 239 ++
 drivers/gpu/drm/i915/intel_runtime_pm.h   |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.c  |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.h  | 112 +++-
 include/linux/ref_tracker.h   |  35 ++-
 lib/ref_tracker.c | 198 ---
 27 files changed, 480 insertions(+), 344 deletions(-)

-- 
2.25.1



Re: [PATCH v2 00/11] drm/i915: use ref_tracker library for tracking wakerefs

2022-02-21 Thread Andrzej Hajda



On 22.02.2022 00:16, Andrzej Hajda wrote:

> Hi,
>
> The arrival of the ref_tracker library makes it possible to drop the custom
> wakeref tracking solution used in i915 and reuse the library instead.
> To that end a few adjustments have been made to ref_tracker; details are in
> the patches. I hope the changes are OK with the original author.
>
> The patchset has been rebased on top of drm-tip so that CI can test the changes.
> Patches marked "[DO NOT MERGE]" are cherry-picked from linux-next (they are
> not yet in drm-tip) so that CI can build and run the patchset (it works only
> on the drm-tip tree).
>
> CCed netdev, as it is the only user of the library at the moment.
>
> v2:
>   - replaced list_sort with ref_tracker_dir_stats to avoid potentially
>     expensive sorting; if the number of reports is expected to be large (???),
>     we can replace the linear search in ref_tracker_dir_stats.stacks with a
>     binary heap (min_heap),
>   - refactored gfp flags,
>   - fixed i915 handling of no-tracking flag.
>
> Regards
> Andrzej


Sorry for the mess; something went wrong with my scripts and I mixed up the
patches. I will resend the series properly.


Regards
Andrzej



[PATCH v2 11/11] drm/i915: replace Intel internal tracker with kernel core ref_tracker

2022-02-21 Thread Andrzej Hajda
Besides reusing existing code, the main advantage of ref_tracker is that it
tracks each wakeref instance separately, which also makes it possible to
catch double puts.
On the other hand, we lose information about the first acquire and
the last release, but the advantages outweigh this loss.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig.debug|  11 +-
 drivers/gpu/drm/i915/Makefile |   3 -
 .../drm/i915/display/intel_display_power.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   |  25 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h   |   2 +-
 drivers/gpu/drm/i915/intel_wakeref.c  |   8 +-
 drivers/gpu/drm/i915/intel_wakeref.h  |  72 +-
 drivers/gpu/drm/i915/intel_wakeref_tracker.c  | 234 --
 drivers/gpu/drm/i915/intel_wakeref_tracker.h  |  76 --
 11 files changed, 87 insertions(+), 350 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.c
 delete mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 3bdc73f30a9e1..6c57f3e265f20 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -32,6 +32,7 @@ config DRM_I915_DEBUG
select DEBUG_FS
select PREEMPT_COUNT
select I2C_CHARDEV
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
select DRM_DP_AUX_CHARDEV
@@ -46,7 +47,6 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_GEM
select DRM_I915_DEBUG_GEM_ONCE
select DRM_I915_DEBUG_MMIO
-   select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_DEBUG_WAKEREF
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -238,18 +238,13 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 
  If in doubt, say "N".
 
-config DRM_I915_TRACK_WAKEREF
-   depends on STACKDEPOT
-   depends on STACKTRACE
-   bool
-
 config DRM_I915_DEBUG_RUNTIME_PM
bool "Enable extra state checking for runtime PM"
depends on DRM_I915
default n
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
-   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking for the
  runtime PM functionality. This may introduce overhead during
@@ -263,9 +258,9 @@ config DRM_I915_DEBUG_WAKEREF
bool "Enable extra tracking for wakerefs"
depends on DRM_I915
default n
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
-   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking and usage
  tracking for the wakerefPM functionality. This may introduce
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 88a403d3294cb..1f8d71430e2e6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -76,9 +76,6 @@ i915-$(CONFIG_DEBUG_FS) += \
display/intel_display_debugfs.o \
display/intel_pipe_crc.o
 
-i915-$(CONFIG_DRM_I915_TRACK_WAKEREF) += \
-   intel_wakeref_tracker.o
-
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # "Graphics Technology" (aka we talk to the gpu)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 9ebae7ac32356..0e1bf724f89b5 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -2107,7 +2107,7 @@ print_async_put_domains_state(struct i915_power_domains 
*power_domains)
 struct drm_i915_private,
 power_domains);
 
-   drm_dbg(&i915->drm, "async_put_wakeref %u\n",
+   drm_dbg(&i915->drm, "async_put_wakeref %lu\n",
power_domains->async_put_wakeref);
 
print_power_domains(power_domains, "async_put_domains[0]",
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 52e46e7830ff5..cf8cc348942cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -273,7 +273,7 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
 {
struct intel_runtime_pm *rpm = engine->uncore->rpm;
 
-   intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);
+   intel_wakeref_init(&engine->wakeref, rpm, &wf_ops, engine->name);
intel_engine_init_heartbeat(engine);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 7ee65a93f926f..01a055d0d0989 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -129,7 +129,7 @@ static const struct intel_wakeref_ops wf_ops = {
 
 void 

[PATCH v2 10/11] drm/i915: Correct type of wakeref variable

2022-02-21 Thread Andrzej Hajda
Wakeref has a dedicated type. The assumption that it will remain
int-compatible forever is incorrect.

Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7799939c38945..b308dd0866eaf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2797,7 +2797,7 @@ static void destroyed_worker_func(struct work_struct *w)
struct intel_guc *guc = container_of(w, struct intel_guc,
 submission_state.destroyed_worker);
struct intel_gt *gt = guc_to_gt(guc);
-   int tmp;
+   intel_wakeref_t tmp;
 
with_intel_gt_pm(gt, tmp)
deregister_destroyed_contexts(guc);
-- 
2.25.1



[PATCH v2 09/11] drm/i915: Track leaked gt->wakerefs

2022-02-21 Thread Andrzej Hajda
From: Chris Wilson 

Track every intel_gt_pm_get() until its corresponding release in
intel_gt_pm_put() by returning a cookie to the caller on acquire that
must be passed back on release. When there is an imbalance, we can see who
either tried to free a stale wakeref, or who forgot to free theirs.
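
The cookie scheme can be sketched in userspace along these lines. It is a toy model under stated assumptions: the toy_* names, the fixed-size table and the convention that 0 is never a valid cookie are all hypothetical and not taken from the i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REFS 8			/* hypothetical table size */

typedef unsigned long toy_wakeref_t;	/* 0 is never a valid cookie here */

struct toy_pm {
	bool held[TOY_MAX_REFS];	/* one slot per outstanding cookie */
	unsigned int bad_puts;		/* stale or repeated releases seen */
};

/* Acquire: hand the caller a cookie identifying this particular get. */
static toy_wakeref_t toy_pm_get(struct toy_pm *pm)
{
	unsigned int i;

	assert(pm != NULL);
	for (i = 0; i < TOY_MAX_REFS; i++) {
		if (!pm->held[i]) {
			pm->held[i] = true;
			return i + 1;
		}
	}
	return 0;			/* table full: this get is untracked */
}

/* Release: the cookie must name a get that is still outstanding. */
static void toy_pm_put(struct toy_pm *pm, toy_wakeref_t cookie)
{
	assert(pm != NULL);
	if (cookie == 0 || cookie > TOY_MAX_REFS || !pm->held[cookie - 1]) {
		pm->bad_puts++;		/* stale wakeref or double put */
		return;
	}
	pm->held[cookie - 1] = false;
}

/* Anything still held at teardown was leaked by its owner. */
static unsigned int toy_pm_leaked(const struct toy_pm *pm)
{
	unsigned int i, n = 0;

	for (i = 0; i < TOY_MAX_REFS; i++)
		if (pm->held[i])
			n++;
	return n;
}
```

A real tracker would additionally record a stack trace per cookie so that the owner of a leaked or stale reference can be named; that part is omitted here.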

v2: Rebase from backporting wakeref leak (Umesh)

Signed-off-by: Chris Wilson 
Reviewed-by: Andrzej Hajda 
Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/Kconfig.debug| 15 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  7 ++--
 .../i915/gem/selftests/i915_gem_coherency.c   | 10 +++--
 .../drm/i915/gem/selftests/i915_gem_mman.c| 14 ---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 13 --
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 +
 .../drm/i915/gt/intel_execlists_submission.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 10 +++--
 drivers/gpu/drm/i915/gt/intel_gt_pm.h | 36 
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |  4 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  | 20 +
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c  |  5 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  | 10 +++--
 drivers/gpu/drm/i915/gt/selftest_rps.c| 17 
 drivers/gpu/drm/i915/gt/selftest_slpc.c   | 10 +++--
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  9 ++--
 drivers/gpu/drm/i915/i915_pmu.c   | 16 +++
 drivers/gpu/drm/i915/intel_wakeref.c  |  4 ++
 drivers/gpu/drm/i915/intel_wakeref.h  | 42 +++
 21 files changed, 182 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 8b1973146e848..3bdc73f30a9e1 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -48,6 +48,7 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_MMIO
select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
+   select DRM_I915_DEBUG_WAKEREF
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
select BROKEN # for prototype uAPI
@@ -257,3 +258,17 @@ config DRM_I915_DEBUG_RUNTIME_PM
  Recommended for driver developers only.
 
  If in doubt, say "N"
+
+config DRM_I915_DEBUG_WAKEREF
+   bool "Enable extra tracking for wakerefs"
+   depends on DRM_I915
+   default n
+   select STACKDEPOT
+   select STACKTRACE
+   select DRM_I915_TRACK_WAKEREF
+   help
+ Choose this option to turn on extra state checking and usage
+ tracking for the wakerefPM functionality. This may introduce
+ overhead during driver runtime.
+
+ If in doubt, say "N"
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 13c975da77474..4b6c144f706da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,6 +252,7 @@ struct i915_execbuffer {
struct intel_gt *gt; /* gt for the execbuf */
struct intel_context *context; /* logical state for the request */
struct i915_gem_context *gem_context; /** caller's context */
+   intel_wakeref_t wakeref;
 
/** our requests to build */
struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
@@ -2679,7 +2680,7 @@ eb_select_engine(struct i915_execbuffer *eb)
 
for_each_child(ce, child)
intel_context_get(child);
-   intel_gt_pm_get(ce->engine->gt);
+   eb->wakeref = intel_gt_pm_get(ce->engine->gt);
 
if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
err = intel_context_alloc_state(ce);
@@ -2713,7 +2714,7 @@ eb_select_engine(struct i915_execbuffer *eb)
return err;
 
 err:
-   intel_gt_pm_put(ce->engine->gt);
+   intel_gt_pm_put(ce->engine->gt, eb->wakeref);
for_each_child(ce, child)
intel_context_put(child);
intel_context_put(ce);
@@ -2725,7 +2726,7 @@ eb_put_engine(struct i915_execbuffer *eb)
 {
struct intel_context *child;
 
-   intel_gt_pm_put(eb->gt);
+   intel_gt_pm_put(eb->context->engine->gt, eb->wakeref);
for_each_child(eb->context, child)
intel_context_put(child);
intel_context_put(eb->context);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 13b088cc787eb..553f2730c2a76 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -85,6 +85,7 @@ static int cpu_get(struct context *ctx, unsigned long offset, 
u32 *v)
 
 static int gtt_set(struct context *ctx, unsigned long offset, u32 v)
 {
+   intel_wakeref_t wakeref;
struct i915_vma *vma;
u32 __iomem *map;
  

[PATCH v2 9/9] drm/i915: replace Intel internal tracker with kernel core ref_tracker

2022-02-21 Thread Andrzej Hajda
Besides reusing existing code, the main advantage of ref_tracker is that it
tracks each wakeref instance separately, which also makes it possible to
catch double puts.
On the other hand, we lose information about the first acquire and
the last release, but the advantages outweigh this loss.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig.debug|  11 +-
 drivers/gpu/drm/i915/Makefile |   3 -
 .../drm/i915/display/intel_display_power.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   |  23 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h   |   2 +-
 drivers/gpu/drm/i915/intel_wakeref.c  |   8 +-
 drivers/gpu/drm/i915/intel_wakeref.h  |  72 +-
 drivers/gpu/drm/i915/intel_wakeref_tracker.c  | 234 --
 drivers/gpu/drm/i915/intel_wakeref_tracker.h  |  76 --
 11 files changed, 86 insertions(+), 349 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.c
 delete mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 3bdc73f30a9e1..6c57f3e265f20 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -32,6 +32,7 @@ config DRM_I915_DEBUG
select DEBUG_FS
select PREEMPT_COUNT
select I2C_CHARDEV
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
select DRM_DP_AUX_CHARDEV
@@ -46,7 +47,6 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_GEM
select DRM_I915_DEBUG_GEM_ONCE
select DRM_I915_DEBUG_MMIO
-   select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_DEBUG_WAKEREF
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -238,18 +238,13 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 
  If in doubt, say "N".
 
-config DRM_I915_TRACK_WAKEREF
-   depends on STACKDEPOT
-   depends on STACKTRACE
-   bool
-
 config DRM_I915_DEBUG_RUNTIME_PM
bool "Enable extra state checking for runtime PM"
depends on DRM_I915
default n
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
-   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking for the
  runtime PM functionality. This may introduce overhead during
@@ -263,9 +258,9 @@ config DRM_I915_DEBUG_WAKEREF
bool "Enable extra tracking for wakerefs"
depends on DRM_I915
default n
+   select REF_TRACKER
select STACKDEPOT
select STACKTRACE
-   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking and usage
  tracking for the wakerefPM functionality. This may introduce
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 88a403d3294cb..1f8d71430e2e6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -76,9 +76,6 @@ i915-$(CONFIG_DEBUG_FS) += \
display/intel_display_debugfs.o \
display/intel_pipe_crc.o
 
-i915-$(CONFIG_DRM_I915_TRACK_WAKEREF) += \
-   intel_wakeref_tracker.o
-
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # "Graphics Technology" (aka we talk to the gpu)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 9ebae7ac32356..0e1bf724f89b5 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -2107,7 +2107,7 @@ print_async_put_domains_state(struct i915_power_domains 
*power_domains)
 struct drm_i915_private,
 power_domains);
 
-   drm_dbg(&i915->drm, "async_put_wakeref %u\n",
+   drm_dbg(&i915->drm, "async_put_wakeref %lu\n",
power_domains->async_put_wakeref);
 
print_power_domains(power_domains, "async_put_domains[0]",
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 52e46e7830ff5..cf8cc348942cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -273,7 +273,7 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
 {
struct intel_runtime_pm *rpm = engine->uncore->rpm;
 
-   intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);
+   intel_wakeref_init(&engine->wakeref, rpm, &wf_ops, engine->name);
intel_engine_init_heartbeat(engine);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 7ee65a93f926f..01a055d0d0989 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -129,7 +129,7 @@ static const struct intel_wakeref_ops wf_ops = {
 
 void 

[PATCH v2 08/11] drm/i915: Separate wakeref tracking

2022-02-21 Thread Andrzej Hajda
From: Chris Wilson 

Extract the callstack tracking of intel_runtime_pm.c into its own
utility so that we can reuse it for other online debugging of
scoped wakerefs.

Signed-off-by: Chris Wilson 
Reviewed-by: Andrzej Hajda 
Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/Kconfig.debug   |   9 +
 drivers/gpu/drm/i915/Makefile|   4 +
 drivers/gpu/drm/i915/intel_runtime_pm.c  | 244 +++
 drivers/gpu/drm/i915/intel_runtime_pm.h  |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.h |   6 +-
 drivers/gpu/drm/i915/intel_wakeref_tracker.c | 234 ++
 drivers/gpu/drm/i915/intel_wakeref_tracker.h |  76 ++
 7 files changed, 355 insertions(+), 228 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.c
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index e7fd3e76f8a20..8b1973146e848 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -33,6 +33,7 @@ config DRM_I915_DEBUG
select PREEMPT_COUNT
select I2C_CHARDEV
select STACKDEPOT
+   select STACKTRACE
select DRM_DP_AUX_CHARDEV
select X86_MSR # used by igt/pm_rpm
select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
@@ -45,6 +46,7 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_GEM
select DRM_I915_DEBUG_GEM_ONCE
select DRM_I915_DEBUG_MMIO
+   select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
@@ -235,11 +237,18 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 
  If in doubt, say "N".
 
+config DRM_I915_TRACK_WAKEREF
+   depends on STACKDEPOT
+   depends on STACKTRACE
+   bool
+
 config DRM_I915_DEBUG_RUNTIME_PM
bool "Enable extra state checking for runtime PM"
depends on DRM_I915
default n
select STACKDEPOT
+   select STACKTRACE
+   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking for the
  runtime PM functionality. This may introduce overhead during
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9d588d936e3dc..88a403d3294cb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -75,6 +75,10 @@ i915-$(CONFIG_DEBUG_FS) += \
i915_debugfs_params.o \
display/intel_display_debugfs.o \
display/intel_pipe_crc.o
+
+i915-$(CONFIG_DRM_I915_TRACK_WAKEREF) += \
+   intel_wakeref_tracker.o
+
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # "Graphics Technology" (aka we talk to the gpu)
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 6ed5786bcd299..7bd10efa56bf3 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -52,182 +52,37 @@
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
 
-#include 
-
-#define STACKDEPTH 8
-
-static noinline depot_stack_handle_t __save_depot_stack(void)
-{
-   unsigned long entries[STACKDEPTH];
-   unsigned int n;
-
-   n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
-}
-
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
-   spin_lock_init(&rpm->debug.lock);
-   stack_depot_init();
+   intel_wakeref_tracker_init(&rpm->debug);
 }
 
-static noinline depot_stack_handle_t
+static intel_wakeref_t
 track_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
-   depot_stack_handle_t stack, *stacks;
-   unsigned long flags;
-
-   if (rpm->no_wakeref_tracking)
-   return -1;
-
-   stack = __save_depot_stack();
-   if (!stack)
+   if (!rpm->available)
return -1;
 
-   spin_lock_irqsave(&rpm->debug.lock, flags);
-
-   if (!rpm->debug.count)
-   rpm->debug.last_acquire = stack;
-
-   stacks = krealloc(rpm->debug.owners,
- (rpm->debug.count + 1) * sizeof(*stacks),
- GFP_NOWAIT | __GFP_NOWARN);
-   if (stacks) {
-   stacks[rpm->debug.count++] = stack;
-   rpm->debug.owners = stacks;
-   } else {
-   stack = -1;
-   }
-
-   spin_unlock_irqrestore(&rpm->debug.lock, flags);
-
-   return stack;
+   return intel_wakeref_tracker_add(&rpm->debug);
 }
 
 static void untrack_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm,
-depot_stack_handle_t stack)
+intel_wakeref_t wakeref)
 {
-   struct drm_i915_private *i915 = container_of(rpm,
-struct drm_i915_private,
-

[PATCH v2 07/11] lib/ref_tracker: remove warnings in case of allocation failure

2022-02-21 Thread Andrzej Hajda
The library can handle allocation failures. To avoid allocation warnings,
__GFP_NOWARN has been added everywhere. Moreover, GFP_ATOMIC has been
replaced with GFP_NOWAIT for the stack depot allocation in the tracker
free call.

Signed-off-by: Andrzej Hajda 
---
 lib/ref_tracker.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 2ef4596b6b36f..cae4498fcfd70 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -189,7 +189,7 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
struct ref_tracker *tracker;
unsigned int nr_entries;
-   gfp_t gfp_mask = gfp;
+   gfp_t gfp_mask = gfp | __GFP_NOWARN;
unsigned long flags;
 
WARN_ON_ONCE(dir->dead);
@@ -237,7 +237,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
return -EEXIST;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
+   stack_handle = stack_depot_save(entries, nr_entries,
+   GFP_NOWAIT | __GFP_NOWARN);
 
spin_lock_irqsave(&dir->lock, flags);
if (tracker->dead) {
-- 
2.25.1



[PATCH v2 8/9] drm/i915: Correct type of wakeref variable

2022-02-21 Thread Andrzej Hajda
Wakeref has a dedicated type. The assumption that it will remain
int-compatible forever is incorrect.

Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7799939c38945..b308dd0866eaf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2797,7 +2797,7 @@ static void destroyed_worker_func(struct work_struct *w)
struct intel_guc *guc = container_of(w, struct intel_guc,
 submission_state.destroyed_worker);
struct intel_gt *gt = guc_to_gt(guc);
-   int tmp;
+   intel_wakeref_t tmp;
 
with_intel_gt_pm(gt, tmp)
deregister_destroyed_contexts(guc);
-- 
2.25.1



[PATCH v2 06/11] lib/ref_tracker: add printing to memory buffer

2022-02-21 Thread Andrzej Hajda
Add ref_tracker_dir_snprint(), which prints the tracker stats into a
caller-supplied memory buffer, in case one wants to show them via debugfs.
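
The buffer handling behind this can be sketched in userspace as below. It loosely follows the pr_ostream() macro from the diff, but it is only an approximation: the toy_* names are hypothetical, a variadic function replaces the macro, and vfprintf(stderr, ...) stands in for pr_err().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Output sink: a NULL buf means "print to the log instead". */
struct toy_ostream {
	char *buf;
	int size, used;
};

static void toy_ostream_printf(struct toy_ostream *s, const char *fmt, ...)
{
	va_list ap;
	int len, ret;

	assert(s != NULL);
	va_start(ap, fmt);
	if (!s->buf) {
		vfprintf(stderr, fmt, ap);	/* stand-in for pr_err() */
	} else {
		len = s->size - s->used;	/* space left, including NUL */
		ret = vsnprintf(s->buf + s->used, len, fmt, ap);
		/* On truncation, clamp so that used never exceeds size. */
		if (ret > 0)
			s->used += ret < len ? ret : len;
	}
	va_end(ap);
}
```

Once the buffer is full, further calls write nothing and leave used unchanged, so a caller can detect likely truncation by comparing used with size.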

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h |  8 ++
 lib/ref_tracker.c   | 56 +++--
 2 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index a2cf1f6309adb..2fdbfd2e14797 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -50,6 +50,8 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit);
 
+int ref_tracker_dir_snprint(struct ref_tracker_dir *dir, char *buf, size_t 
size);
+
 int ref_tracker_alloc(struct ref_tracker_dir *dir,
  struct ref_tracker **trackerp, gfp_t gfp);
 
@@ -78,6 +80,12 @@ static inline void ref_tracker_dir_print(struct 
ref_tracker_dir *dir,
 {
 }
 
+static inline int ref_tracker_dir_snprint(struct ref_tracker_dir *dir,
+ char *buf, size_t size)
+{
+   return 0;
+}
+
 static inline int ref_tracker_alloc(struct ref_tracker_dir *dir,
struct ref_tracker **trackerp,
gfp_t gfp)
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index ab1253fde244e..2ef4596b6b36f 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -62,8 +62,27 @@ ref_tracker_get_stats(struct ref_tracker_dir *dir, unsigned 
int limit)
return stats;
 }
 
-void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
+struct ostream {
+   char *buf;
+   int size, used;
+};
+
+#define pr_ostream(stream, fmt, args...) \
+({ \
+   struct ostream *_s = (stream); \
+\
+   if (!_s->buf) { \
+   pr_err(fmt, ##args); \
+   } else { \
+   int ret, len = _s->size - _s->used; \
+   ret = snprintf(_s->buf + _s->used, len, pr_fmt(fmt), ##args); \
+   _s->used += min(ret, len); \
+   } \
+})
+
+static void
+__ref_tracker_dir_pr_ostream(struct ref_tracker_dir *dir,
+unsigned int display_limit, struct ostream *s)
 {
struct ref_tracker_dir_stats *stats;
unsigned int i = 0, skipped;
@@ -77,8 +96,8 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
 
stats = ref_tracker_get_stats(dir, display_limit);
if (IS_ERR(stats)) {
-   pr_err("%s@%pK: couldn't get stats, error %pe\n",
-  dir->name, dir, stats);
+   pr_ostream(s, "%s@%pK: couldn't get stats, error %pe\n",
+  dir->name, dir, stats);
return;
}
 
@@ -88,19 +107,27 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
stack = stats->stacks[i].stack_handle;
if (sbuf && !stack_depot_snprint(stack, sbuf, STACK_BUF_SIZE, 
4))
sbuf[0] = 0;
-   pr_err("%s@%pK has %d/%d users at\n%s\n", dir->name, dir,
-  stats->stacks[i].count, stats->total, sbuf);
+   pr_ostream(s, "%s@%pK has %d/%d users at\n%s\n", dir->name, dir,
+  stats->stacks[i].count, stats->total, sbuf);
skipped -= stats->stacks[i].count;
}
 
if (skipped)
-   pr_err("%s@%pK skipped reports about %d/%d users.\n",
-  dir->name, dir, skipped, stats->total);
+   pr_ostream(s, "%s@%pK skipped reports about %d/%d users.\n",
+  dir->name, dir, skipped, stats->total);
 
kfree(sbuf);
 
kfree(stats);
 }
+
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ostream os = {};
+
+   __ref_tracker_dir_pr_ostream(dir, display_limit, &os);
+}
 EXPORT_SYMBOL(__ref_tracker_dir_print);
 
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
@@ -114,6 +141,19 @@ void ref_tracker_dir_print(struct ref_tracker_dir *dir,
 }
 EXPORT_SYMBOL(ref_tracker_dir_print);
 
+int ref_tracker_dir_snprint(struct ref_tracker_dir *dir, char *buf, size_t size)
+{
+   struct ostream os = { .buf = buf, .size = size };
+   unsigned long flags;
+
+   spin_lock_irqsave(&dir->lock, flags);
+   __ref_tracker_dir_pr_ostream(dir, 16, &os);
+   spin_unlock_irqrestore(&dir->lock, flags);
+
+   return os.used;
+}
+EXPORT_SYMBOL(ref_tracker_dir_snprint);
+
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 {
struct ref_tracker *tracker, *n;
-- 
2.25.1



[PATCH v2 7/9] drm/i915: Track leaked gt->wakerefs

2022-02-21 Thread Andrzej Hajda
From: Chris Wilson 

Track every intel_gt_pm_get() until its corresponding release in
intel_gt_pm_put() by returning a cookie to the caller on acquire that
must be passed back on release. When there is an imbalance, we can see who
either tried to free a stale wakeref, or who forgot to free theirs.
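
The cookie scheme described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: struct demo_tracker,
demo_get(), demo_put() and demo_leaks() are hypothetical names invented for
illustration and are not part of the i915 API or of this patch.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in i915. */
#define DEMO_MAX_OWNERS 8

struct demo_tracker {
	int cookies[DEMO_MAX_OWNERS];	/* 0 means "slot free" */
	int next_cookie;		/* last cookie handed out */
	int count;			/* plain reference count */
};

/* Acquire: bump the count and return a non-zero cookie, or -1 if untracked. */
static int demo_get(struct demo_tracker *t)
{
	t->count++;
	for (size_t i = 0; i < DEMO_MAX_OWNERS; i++) {
		if (!t->cookies[i]) {
			t->cookies[i] = ++t->next_cookie;
			return t->cookies[i];
		}
	}
	return -1;	/* table full: reference is held but not tracked */
}

/* Release: returns 0 on success, -1 if the cookie is stale or unknown. */
static int demo_put(struct demo_tracker *t, int cookie)
{
	t->count--;
	if (cookie == -1)
		return 0;	/* untracked acquire, nothing to look up */
	for (size_t i = 0; i < DEMO_MAX_OWNERS; i++) {
		if (t->cookies[i] == cookie) {
			t->cookies[i] = 0;
			return 0;
		}
	}
	return -1;
}

/* Number of cookies that were handed out and never released. */
static int demo_leaks(const struct demo_tracker *t)
{
	int leaks = 0;

	for (size_t i = 0; i < DEMO_MAX_OWNERS; i++)
		if (t->cookies[i])
			leaks++;
	return leaks;
}
```

In this model a second demo_put() with the same cookie reports a stale
release, and any cookie still present at teardown identifies a holder that
forgot to release.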

v2: Rebase from backporting wakeref leak (Umesh)

Signed-off-by: Chris Wilson 
Reviewed-by: Andrzej Hajda 
Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/Kconfig.debug| 15 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  7 ++--
 .../i915/gem/selftests/i915_gem_coherency.c   | 10 +++--
 .../drm/i915/gem/selftests/i915_gem_mman.c| 14 ---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 13 --
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 +
 .../drm/i915/gt/intel_execlists_submission.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 10 +++--
 drivers/gpu/drm/i915/gt/intel_gt_pm.h | 36 
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |  4 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  | 20 +
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c  |  5 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  | 10 +++--
 drivers/gpu/drm/i915/gt/selftest_rps.c| 17 
 drivers/gpu/drm/i915/gt/selftest_slpc.c   | 10 +++--
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  9 ++--
 drivers/gpu/drm/i915/i915_pmu.c   | 16 +++
 drivers/gpu/drm/i915/intel_wakeref.c  |  4 ++
 drivers/gpu/drm/i915/intel_wakeref.h  | 42 +++
 21 files changed, 182 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 8b1973146e848..3bdc73f30a9e1 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -48,6 +48,7 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_MMIO
select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
+   select DRM_I915_DEBUG_WAKEREF
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
select BROKEN # for prototype uAPI
@@ -257,3 +258,17 @@ config DRM_I915_DEBUG_RUNTIME_PM
  Recommended for driver developers only.
 
  If in doubt, say "N"
+
+config DRM_I915_DEBUG_WAKEREF
+   bool "Enable extra tracking for wakerefs"
+   depends on DRM_I915
+   default n
+   select STACKDEPOT
+   select STACKTRACE
+   select DRM_I915_TRACK_WAKEREF
+   help
+ Choose this option to turn on extra state checking and usage
+ tracking for the wakerefPM functionality. This may introduce
+ overhead during driver runtime.
+
+ If in doubt, say "N"
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 13c975da77474..4b6c144f706da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,6 +252,7 @@ struct i915_execbuffer {
struct intel_gt *gt; /* gt for the execbuf */
struct intel_context *context; /* logical state for the request */
struct i915_gem_context *gem_context; /** caller's context */
+   intel_wakeref_t wakeref;
 
/** our requests to build */
struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
@@ -2679,7 +2680,7 @@ eb_select_engine(struct i915_execbuffer *eb)
 
for_each_child(ce, child)
intel_context_get(child);
-   intel_gt_pm_get(ce->engine->gt);
+   eb->wakeref = intel_gt_pm_get(ce->engine->gt);
 
if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
err = intel_context_alloc_state(ce);
@@ -2713,7 +2714,7 @@ eb_select_engine(struct i915_execbuffer *eb)
return err;
 
 err:
-   intel_gt_pm_put(ce->engine->gt);
+   intel_gt_pm_put(ce->engine->gt, eb->wakeref);
for_each_child(ce, child)
intel_context_put(child);
intel_context_put(ce);
@@ -2725,7 +2726,7 @@ eb_put_engine(struct i915_execbuffer *eb)
 {
struct intel_context *child;
 
-   intel_gt_pm_put(eb->gt);
+   intel_gt_pm_put(eb->context->engine->gt, eb->wakeref);
for_each_child(eb->context, child)
intel_context_put(child);
intel_context_put(eb->context);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 13b088cc787eb..553f2730c2a76 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -85,6 +85,7 @@ static int cpu_get(struct context *ctx, unsigned long offset, 
u32 *v)
 
 static int gtt_set(struct context *ctx, unsigned long offset, u32 v)
 {
+   intel_wakeref_t wakeref;
struct i915_vma *vma;
u32 __iomem *map;
  

[PATCH v2 6/9] drm/i915: Separate wakeref tracking

2022-02-21 Thread Andrzej Hajda
From: Chris Wilson 

Extract the callstack tracking of intel_runtime_pm.c into its own
utility so that we can reuse it for other online debugging of
scoped wakerefs.

Signed-off-by: Chris Wilson 
Reviewed-by: Andrzej Hajda 
Signed-off-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/Kconfig.debug   |   9 +
 drivers/gpu/drm/i915/Makefile|   4 +
 drivers/gpu/drm/i915/intel_runtime_pm.c  | 244 +++
 drivers/gpu/drm/i915/intel_runtime_pm.h  |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.h |   6 +-
 drivers/gpu/drm/i915/intel_wakeref_tracker.c | 234 ++
 drivers/gpu/drm/i915/intel_wakeref_tracker.h |  76 ++
 7 files changed, 355 insertions(+), 228 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.c
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref_tracker.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index e7fd3e76f8a20..8b1973146e848 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -33,6 +33,7 @@ config DRM_I915_DEBUG
select PREEMPT_COUNT
select I2C_CHARDEV
select STACKDEPOT
+   select STACKTRACE
select DRM_DP_AUX_CHARDEV
select X86_MSR # used by igt/pm_rpm
select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
@@ -45,6 +46,7 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_GEM
select DRM_I915_DEBUG_GEM_ONCE
select DRM_I915_DEBUG_MMIO
+   select DRM_I915_TRACK_WAKEREF
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
@@ -235,11 +237,18 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 
  If in doubt, say "N".
 
+config DRM_I915_TRACK_WAKEREF
+   depends on STACKDEPOT
+   depends on STACKTRACE
+   bool
+
 config DRM_I915_DEBUG_RUNTIME_PM
bool "Enable extra state checking for runtime PM"
depends on DRM_I915
default n
select STACKDEPOT
+   select STACKTRACE
+   select DRM_I915_TRACK_WAKEREF
help
  Choose this option to turn on extra state checking for the
  runtime PM functionality. This may introduce overhead during
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9d588d936e3dc..88a403d3294cb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -75,6 +75,10 @@ i915-$(CONFIG_DEBUG_FS) += \
i915_debugfs_params.o \
display/intel_display_debugfs.o \
display/intel_pipe_crc.o
+
+i915-$(CONFIG_DRM_I915_TRACK_WAKEREF) += \
+   intel_wakeref_tracker.o
+
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # "Graphics Technology" (aka we talk to the gpu)
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 6ed5786bcd299..7bd10efa56bf3 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -52,182 +52,37 @@
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
 
-#include 
-
-#define STACKDEPTH 8
-
-static noinline depot_stack_handle_t __save_depot_stack(void)
-{
-   unsigned long entries[STACKDEPTH];
-   unsigned int n;
-
-   n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
-}
-
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
-   spin_lock_init(&rpm->debug.lock);
-   stack_depot_init();
+   intel_wakeref_tracker_init(&rpm->debug);
 }
 
-static noinline depot_stack_handle_t
+static intel_wakeref_t
 track_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
-   depot_stack_handle_t stack, *stacks;
-   unsigned long flags;
-
-   if (rpm->no_wakeref_tracking)
-   return -1;
-
-   stack = __save_depot_stack();
-   if (!stack)
+   if (!rpm->available)
return -1;
 
-   spin_lock_irqsave(&rpm->debug.lock, flags);
-
-   if (!rpm->debug.count)
-   rpm->debug.last_acquire = stack;
-
-   stacks = krealloc(rpm->debug.owners,
- (rpm->debug.count + 1) * sizeof(*stacks),
- GFP_NOWAIT | __GFP_NOWARN);
-   if (stacks) {
-   stacks[rpm->debug.count++] = stack;
-   rpm->debug.owners = stacks;
-   } else {
-   stack = -1;
-   }
-
-   spin_unlock_irqrestore(&rpm->debug.lock, flags);
-
-   return stack;
+   return intel_wakeref_tracker_add(&rpm->debug);
 }
 
 static void untrack_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm,
-depot_stack_handle_t stack)
+intel_wakeref_t wakeref)
 {
-   struct drm_i915_private *i915 = container_of(rpm,
-struct drm_i915_private,
-

[PATCH v2 5/9] lib/ref_tracker: improve allocation flags

2022-02-21 Thread Andrzej Hajda
The library can be called in non-sleeping context, so it should not use
__GFP_NOFAIL. Instead, it should handle allocation failures gracefully;
__GFP_NOWARN is added for this purpose as well.
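
As a rough illustration of the flag selection performed in the diff below,
here is a userspace sketch. The DEMO_* bit values are made up for the example
and do not correspond to the real kernel GFP bits; demo_tracker_gfp() is a
hypothetical helper, not a function from lib/ref_tracker.c.

```c
/* Hypothetical flag bits; the real __GFP_* values differ. */
#define DEMO_DIRECT_RECLAIM	0x1u	/* caller may sleep */
#define DEMO_NOFAIL		0x2u	/* allocator retries until success */
#define DEMO_NOWARN		0x4u	/* no warning on allocation failure */

/*
 * Mirror of the selection in the patch: always suppress the failure
 * warning, and request no-fail behaviour only when the caller allows
 * direct reclaim (that is, when sleeping is permitted).
 */
static unsigned int demo_tracker_gfp(unsigned int gfp)
{
	gfp |= DEMO_NOWARN;
	return (gfp & DEMO_DIRECT_RECLAIM) ? (gfp | DEMO_NOFAIL) : gfp;
}
```

With these stand-in values, a non-sleeping caller passing 0 gets only the
no-warn bit, while a caller passing the reclaim bit also gets the no-fail bit.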

Signed-off-by: Andrzej Hajda 
---
 lib/ref_tracker.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 7b00bca300043..c8441ffbb058a 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -59,7 +59,7 @@ __ref_tracker_dir_pr_ostream(struct ref_tracker_dir *dir,
if (list_empty(&dir->list))
return;
 
-   sbuf = kmalloc(STACK_BUF_SIZE, GFP_NOWAIT);
+   sbuf = kmalloc(STACK_BUF_SIZE, GFP_NOWAIT | __GFP_NOWARN);
 
list_for_each_entry(tracker, &dir->list, head)
++total;
@@ -154,11 +154,11 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
struct ref_tracker *tracker;
unsigned int nr_entries;
-   gfp_t gfp_mask = gfp;
+   gfp_t gfp_mask;
unsigned long flags;
 
-   if (gfp & __GFP_DIRECT_RECLAIM)
-   gfp_mask |= __GFP_NOFAIL;
+   gfp |= __GFP_NOWARN;
+   gfp_mask = (gfp & __GFP_DIRECT_RECLAIM) ? (gfp | __GFP_NOFAIL) : gfp;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
if (unlikely(!tracker)) {
pr_err_once("memory allocation failure, unreliable refcount tracker.\n");
@@ -191,7 +191,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
nr_entries = filter_irq_stacks(entries, nr_entries);
-   stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
+   stack_handle = stack_depot_save(entries, nr_entries,
+   GFP_NOWAIT | __GFP_NOWARN);
 
spin_lock_irqsave(&dir->lock, flags);
if (tracker->dead) {
-- 
2.25.1



[PATCH v2 05/11] lib/ref_tracker: __ref_tracker_dir_print improve printing

2022-02-21 Thread Andrzej Hajda
To improve readability of ref_tracker printing, the following changes
have been made:
- reports are printed per stack_handle - log is more compact,
- added display name for ref_tracker_dir,
- stack trace is printed indented, in the same printk call,
- total number of references is printed every time,
- print info about dropped references.
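
The per-stack_handle grouping listed above can be sketched in userspace as
follows. It is modelled on the ref_tracker_get_stats() loop in the diff below,
but struct demo_stats, demo_collect() and demo_skipped() are hypothetical names
and the fixed DEMO_LIMIT replaces the dynamically sized array used by the patch.

```c
#include <stddef.h>

#define DEMO_LIMIT 4	/* illustrative stand-in for display_limit */

struct demo_stats {
	int total;	/* every reference seen */
	int count;	/* distinct handles recorded */
	struct {
		unsigned int handle;
		unsigned int count;
	} stacks[DEMO_LIMIT];
};

/* Count references per handle, recording at most DEMO_LIMIT handles. */
static void demo_collect(const unsigned int *handles, size_t n,
			 struct demo_stats *st)
{
	st->total = 0;
	st->count = 0;

	for (size_t k = 0; k < n; k++) {
		int i;

		st->total++;
		for (i = 0; i < st->count; i++)
			if (st->stacks[i].handle == handles[k])
				break;
		if (i >= DEMO_LIMIT)
			continue;	/* new handle, but no slot left */
		if (i >= st->count) {
			st->stacks[i].handle = handles[k];
			st->stacks[i].count = 0;
			st->count++;
		}
		st->stacks[i].count++;
	}
}

/* References that did not fit into any recorded slot. */
static int demo_skipped(const struct demo_stats *st)
{
	int skipped = st->total;

	for (int i = 0; i < st->count; i++)
		skipped -= (int)st->stacks[i].count;
	return skipped;
}
```

The skipped value corresponds to the "skipped reports" line: references whose
stack did not get a slot are still counted in the total.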

Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h | 15 +--
 lib/ref_tracker.c   | 90 -
 2 files changed, 91 insertions(+), 14 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 3e9e9df2a41f5..a2cf1f6309adb 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -17,12 +17,19 @@ struct ref_tracker_dir {
booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
+   charname[32];
 #endif
 };
 
 #ifdef CONFIG_REF_TRACKER
-static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
-   unsigned int quarantine_count)
+
+// Temporary allow two and three arguments, until consumers are converted
+#define ref_tracker_dir_init(_d, _q, args...) _ref_tracker_dir_init(_d, _q, ##args, #_d)
+#define _ref_tracker_dir_init(_d, _q, _n, ...) __ref_tracker_dir_init(_d, _q, _n)
+
+static inline void __ref_tracker_dir_init(struct ref_tracker_dir *dir,
+   unsigned int quarantine_count,
+   const char *name)
 {
INIT_LIST_HEAD(&dir->list);
INIT_LIST_HEAD(&dir->quarantine);
@@ -31,6 +38,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
dir->dead = false;
refcount_set(&dir->untracked, 1);
refcount_set(&dir->no_tracker, 1);
+   strlcpy(dir->name, name, sizeof(dir->name));
stack_depot_init();
 }
 
@@ -51,7 +59,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
 #else /* CONFIG_REF_TRACKER */
 
 static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
-   unsigned int quarantine_count)
+   unsigned int quarantine_count,
+   ...)
 {
 }
 
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 5e9f90bbf771b..ab1253fde244e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -1,11 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
+
+#define pr_fmt(fmt) "ref_tracker: " fmt
+
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
 #define REF_TRACKER_STACK_ENTRIES 16
+#define STACK_BUF_SIZE 1024
 
 struct ref_tracker {
struct list_headhead;   /* anchor into dir->list or 
dir->quarantine */
@@ -14,24 +19,87 @@ struct ref_tracker {
depot_stack_handle_tfree_stack_handle;
 };
 
-void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
+struct ref_tracker_dir_stats {
+   int total;
+   int count;
+   struct {
+   depot_stack_handle_t stack_handle;
+   unsigned int count;
+   } stacks[];
+};
+
+static struct ref_tracker_dir_stats *
+ref_tracker_get_stats(struct ref_tracker_dir *dir, unsigned int limit)
 {
+   struct ref_tracker_dir_stats *stats;
struct ref_tracker *tracker;
-   unsigned int i = 0;
 
-   lockdep_assert_held(&dir->lock);
+   stats = kmalloc(struct_size(stats, stacks, limit),
+   GFP_NOWAIT | __GFP_NOWARN);
+   if (!stats)
+   return ERR_PTR(-ENOMEM);
+   stats->total = 0;
+   stats->count = 0;
 
list_for_each_entry(tracker, &dir->list, head) {
-   if (i < display_limit) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
-   i++;
-   } else {
-   break;
+   depot_stack_handle_t stack = tracker->alloc_stack_handle;
+   int i;
+
+   ++stats->total;
+   for (i = 0; i < stats->count; ++i)
+   if (stats->stacks[i].stack_handle == stack)
+   break;
+   if (i >= limit)
+   continue;
+   if (i >= stats->count) {
+   stats->stacks[i].stack_handle = stack;
+   stats->stacks[i].count = 0;
+   ++stats->count;
}
+   ++stats->stacks[i].count;
+   }
+
+   return stats;
+}
+
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ref_tracker_dir_stats *stats;
+   unsigned int i = 0, skipped;
+   depot_stack_handle_t stack;
+  

[PATCH v2 04/11] lib/ref_tracker: add unlocked leak print helper

2022-02-21 Thread Andrzej Hajda
To detect leaks reliably, the caller must be able to check both the tracked
counter and the leaks under the same lock. dir.lock is the natural candidate
for such a lock, and the unlocked print helper can be called with this lock held.
As a bonus, we can reuse this helper in ref_tracker_dir_exit.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h |  8 +
 lib/ref_tracker.c   | 66 +
 2 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 9ca353ab712b5..3e9e9df2a41f5 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -36,6 +36,9 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
 
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
 
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit);
+
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit);
 
@@ -56,6 +59,11 @@ static inline void ref_tracker_dir_exit(struct 
ref_tracker_dir *dir)
 {
 }
 
+static inline void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+}
+
 static inline void ref_tracker_dir_print(struct ref_tracker_dir *dir,
 unsigned int display_limit)
 {
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index dc7b14aa3431e..5e9f90bbf771b 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -14,6 +14,38 @@ struct ref_tracker {
depot_stack_handle_tfree_stack_handle;
 };
 
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ref_tracker *tracker;
+   unsigned int i = 0;
+
+   lockdep_assert_held(&dir->lock);
+
+   list_for_each_entry(tracker, &dir->list, head) {
+   if (i < display_limit) {
+   pr_err("leaked reference.\n");
+   if (tracker->alloc_stack_handle)
+   stack_depot_print(tracker->alloc_stack_handle);
+   i++;
+   } else {
+   break;
+   }
+   }
+}
+EXPORT_SYMBOL(__ref_tracker_dir_print);
+
+void ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&dir->lock, flags);
+   __ref_tracker_dir_print(dir, display_limit);
+   spin_unlock_irqrestore(&dir->lock, flags);
+}
+EXPORT_SYMBOL(ref_tracker_dir_print);
+
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 {
struct ref_tracker *tracker, *n;
@@ -27,13 +59,13 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
kfree(tracker);
dir->quarantine_avail++;
}
-   list_for_each_entry_safe(tracker, n, &dir->list, head) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
+   if (!list_empty(&dir->list)) {
+   __ref_tracker_dir_print(dir, 16);
leak = true;
-   list_del(&tracker->head);
-   kfree(tracker);
+   list_for_each_entry_safe(tracker, n, &dir->list, head) {
+   list_del(&tracker->head);
+   kfree(tracker);
+   }
}
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
@@ -42,28 +74,6 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 }
 EXPORT_SYMBOL(ref_tracker_dir_exit);
 
-void ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
-{
-   struct ref_tracker *tracker;
-   unsigned long flags;
-   unsigned int i = 0;
-
-   spin_lock_irqsave(&dir->lock, flags);
-   list_for_each_entry(tracker, &dir->list, head) {
-   if (i < display_limit) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
-   i++;
-   } else {
-   break;
-   }
-   }
-   spin_unlock_irqrestore(&dir->lock, flags);
-}
-EXPORT_SYMBOL(ref_tracker_dir_print);
-
 int ref_tracker_alloc(struct ref_tracker_dir *dir,
  struct ref_tracker **trackerp,
  gfp_t gfp)
-- 
2.25.1



[PATCH v2 03/11] ref_tracker: remove filter_irq_stacks() call

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

After commit e94006608949 ("lib/stackdepot: always do filter_irq_stacks()
in stack_depot_save()") it became unnecessary to filter the stack
before calling stack_depot_save().

Signed-off-by: Eric Dumazet 
Cc: Marco Elver 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 lib/ref_tracker.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 9c0c2e09df666..dc7b14aa3431e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -89,7 +89,6 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
return -ENOMEM;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   nr_entries = filter_irq_stacks(entries, nr_entries);
tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
 
spin_lock_irqsave(&dir->lock, flags);
@@ -120,7 +119,6 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
return -EEXIST;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   nr_entries = filter_irq_stacks(entries, nr_entries);
stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
 
spin_lock_irqsave(&dir->lock, flags);
-- 
2.25.1



[PATCH v2 4/9] lib/ref_tracker: add printing to memory buffer

2022-02-21 Thread Andrzej Hajda
In case one wants to show stats via debugfs.
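
The buffer-or-log dispatch that the diff below introduces with pr_ostream()
can be modelled in userspace roughly as follows. This is a sketch, not the
patch itself: demo_ostream and demo_pr() are hypothetical names, stderr stands
in for pr_err(), and vsnprintf() stands in for the kernel snprintf().

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for the patch's struct ostream. */
struct demo_ostream {
	char *buf;	/* NULL selects the "log" path (stderr here) */
	int size;	/* capacity of buf in bytes */
	int used;	/* bytes accounted for so far */
};

/* Append formatted text to the buffer, or print it when there is no buffer. */
static void demo_pr(struct demo_ostream *s, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	if (!s->buf) {
		vfprintf(stderr, fmt, ap);
	} else {
		int len = s->size - s->used;
		int ret = vsnprintf(s->buf + s->used, len, fmt, ap);

		/* On truncation ret >= len, so used is clamped to size. */
		if (ret > 0)
			s->used += ret < len ? ret : len;
	}
	va_end(ap);
}
```

Once the output has been truncated, used equals size and later calls write
nothing, so a caller comparing the return value with the buffer size can tell
that the report was cut short.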

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h |  8 ++
 lib/ref_tracker.c   | 52 -
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 090230e5b485d..6d2634590ee5a 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -46,6 +46,8 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit);
 
+int ref_tracker_dir_snprint(struct ref_tracker_dir *dir, char *buf, size_t size);
+
 int ref_tracker_alloc(struct ref_tracker_dir *dir,
  struct ref_tracker **trackerp, gfp_t gfp);
 
@@ -74,6 +76,12 @@ static inline void ref_tracker_dir_print(struct 
ref_tracker_dir *dir,
 {
 }
 
+static inline int ref_tracker_dir_snprint(struct ref_tracker_dir *dir,
+ char *buf, size_t size)
+{
+   return 0;
+}
+
 static inline int ref_tracker_alloc(struct ref_tracker_dir *dir,
struct ref_tracker **trackerp,
gfp_t gfp)
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 943cff08110e3..7b00bca300043 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -27,8 +27,27 @@ static int ref_tracker_cmp(void *priv, const struct 
list_head *a, const struct l
return ta->alloc_stack_handle - tb->alloc_stack_handle;
 }
 
-void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
+struct ostream {
+   char *buf;
+   int size, used;
+};
+
+#define pr_ostream(stream, fmt, args...) \
+({ \
+   struct ostream *_s = (stream); \
+\
+   if (!_s->buf) { \
+   pr_err(fmt, ##args); \
+   } else { \
+   int ret, len = _s->size - _s->used; \
+   ret = snprintf(_s->buf + _s->used, len, pr_fmt(fmt), ##args); \
+   _s->used += min(ret, len); \
+   } \
+})
+
+static void
+__ref_tracker_dir_pr_ostream(struct ref_tracker_dir *dir,
+unsigned int display_limit, struct ostream *s)
 {
unsigned int i = 0, count = 0, total = 0;
struct ref_tracker *tracker;
@@ -58,16 +77,24 @@ void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
 
if (sbuf && !stack_depot_snprint(stack, sbuf, STACK_BUF_SIZE, 
4))
sbuf[0] = 0;
-   pr_err("%s@%pK has %d/%d users at\n%s\n",
-  dir->name, dir, count, total, sbuf);
+   pr_ostream(s, "%s@%pK has %d/%d users at\n%s\n",
+  dir->name, dir, count, total, sbuf);
count = 0;
}
if (i > display_limit)
-   pr_err("%s@%pK skipped %d/%d reports with %d unique stacks.\n",
-  dir->name, dir, count, total, i - display_limit);
+   pr_ostream(s, "%s@%pK skipped %d/%d reports with %d unique 
stacks.\n",
+  dir->name, dir, count, total, i - display_limit);
 
kfree(sbuf);
 }
+
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ostream os = {};
+
+   __ref_tracker_dir_pr_ostream(dir, display_limit, &os);
+}
 EXPORT_SYMBOL(__ref_tracker_dir_print);
 
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
@@ -81,6 +108,19 @@ void ref_tracker_dir_print(struct ref_tracker_dir *dir,
 }
 EXPORT_SYMBOL(ref_tracker_dir_print);
 
+int ref_tracker_dir_snprint(struct ref_tracker_dir *dir, char *buf, size_t size)
+{
+   struct ostream os = { .buf = buf, .size = size };
+   unsigned long flags;
+
+   spin_lock_irqsave(&dir->lock, flags);
+   __ref_tracker_dir_pr_ostream(dir, 16, &os);
+   spin_unlock_irqrestore(&dir->lock, flags);
+
+   return os.used;
+}
+EXPORT_SYMBOL(ref_tracker_dir_snprint);
+
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 {
struct ref_tracker *tracker, *n;
-- 
2.25.1



[PATCH v2 3/9] lib/ref_tracker: __ref_tracker_dir_print improve printing

2022-02-21 Thread Andrzej Hajda
To improve readability of ref_tracker printing, the following changes
have been made:
- added display name for ref_tracker_dir,
- stack trace is printed indented, in the same printk call,
- total number of references is printed every time,
- print info about dropped references.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h | 15 ---
 lib/ref_tracker.c   | 28 ++--
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index b9c968a716483..090230e5b485d 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -15,18 +15,26 @@ struct ref_tracker_dir {
refcount_t  untracked;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
+   charname[32];
 #endif
 };
 
 #ifdef CONFIG_REF_TRACKER
-static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
-   unsigned int quarantine_count)
+
+// Temporary allow two and three arguments, until consumers are converted
+#define ref_tracker_dir_init(_d, _q, args...) _ref_tracker_dir_init(_d, _q, ##args, #_d)
+#define _ref_tracker_dir_init(_d, _q, _n, ...) __ref_tracker_dir_init(_d, _q, _n)
+
+static inline void __ref_tracker_dir_init(struct ref_tracker_dir *dir,
+   unsigned int quarantine_count,
+   const char *name)
 {
INIT_LIST_HEAD(&dir->list);
INIT_LIST_HEAD(&dir->quarantine);
spin_lock_init(&dir->lock);
dir->quarantine_avail = quarantine_count;
refcount_set(&dir->untracked, 1);
+   strlcpy(dir->name, name, sizeof(dir->name));
stack_depot_init();
 }
 
@@ -47,7 +55,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
 #else /* CONFIG_REF_TRACKER */
 
 static inline void ref_tracker_dir_init(struct ref_tracker_dir *dir,
-   unsigned int quarantine_count)
+   unsigned int quarantine_count,
+   ...)
 {
 }
 
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 0e9c7d2828ccb..943cff08110e3 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -1,4 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
+
+#define pr_fmt(fmt) "ref_tracker: " fmt
+
 #include 
 #include 
 #include 
@@ -7,6 +10,7 @@
 #include 
 
 #define REF_TRACKER_STACK_ENTRIES 16
+#define STACK_BUF_SIZE 1024
 
 struct ref_tracker {
struct list_headhead;   /* anchor into dir->list or 
dir->quarantine */
@@ -26,31 +30,43 @@ static int ref_tracker_cmp(void *priv, const struct 
list_head *a, const struct l
 void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit)
 {
-   unsigned int i = 0, count = 0;
+   unsigned int i = 0, count = 0, total = 0;
struct ref_tracker *tracker;
depot_stack_handle_t stack;
+   char *sbuf;
 
lockdep_assert_held(&dir->lock);
 
if (list_empty(&dir->list))
return;
 
+   sbuf = kmalloc(STACK_BUF_SIZE, GFP_NOWAIT);
+
+   list_for_each_entry(tracker, &dir->list, head)
+   ++total;
+
list_sort(NULL, &dir->list, ref_tracker_cmp);
 
list_for_each_entry(tracker, &dir->list, head) {
-   if (i++ >= display_limit)
-   break;
if (!count++)
stack = tracker->alloc_stack_handle;
if (stack == tracker->alloc_stack_handle &&
!list_is_last(&tracker->head, &dir->list))
continue;
+   if (i++ >= display_limit)
+   continue;
 
-   pr_err("leaked %d references.\n", count);
-   if (stack)
-   stack_depot_print(stack);
+   if (sbuf && !stack_depot_snprint(stack, sbuf, STACK_BUF_SIZE, 
4))
+   sbuf[0] = 0;
+   pr_err("%s@%pK has %d/%d users at\n%s\n",
+  dir->name, dir, count, total, sbuf);
count = 0;
}
+   if (i > display_limit)
+   pr_err("%s@%pK skipped %d/%d reports with %d unique stacks.\n",
+  dir->name, dir, count, total, i - display_limit);
+
+   kfree(sbuf);
 }
 EXPORT_SYMBOL(__ref_tracker_dir_print);
 
-- 
2.25.1



[PATCH v2 03/11] [DO NOT MERGE] ref_tracker: remove filter_irq_stacks() call

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

After commit e94006608949 ("lib/stackdepot: always do filter_irq_stacks()
in stack_depot_save()") it became unnecessary to filter the stack
before calling stack_depot_save().

Signed-off-by: Eric Dumazet 
Cc: Marco Elver 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 lib/ref_tracker.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 9c0c2e09df666..dc7b14aa3431e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -89,7 +89,6 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
return -ENOMEM;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   nr_entries = filter_irq_stacks(entries, nr_entries);
tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
 
spin_lock_irqsave(&dir->lock, flags);
@@ -120,7 +119,6 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
return -EEXIST;
}
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
-   nr_entries = filter_irq_stacks(entries, nr_entries);
stack_handle = stack_depot_save(entries, nr_entries, GFP_ATOMIC);
 
spin_lock_irqsave(&dir->lock, flags);
-- 
2.25.1



[PATCH v2 02/11] ref_tracker: add a count of untracked references

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
from a dev_hold_track()

To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.
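
The NULL-@trackerp path can be pictured with a small userspace model. It is
a sketch under assumptions: struct demo_dir and the demo_* functions are
hypothetical, plain ints replace refcount_t, and calloc()/free() replace the
kernel allocator. It models only the dedicated counter, not the stack
recording or the quarantine list.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_tracker {
	int unused;
};

struct demo_dir {
	int no_tracker;	/* starts at 1, like the refcount_t in the patch */
	int tracked;	/* number of live trackers */
};

static void demo_dir_init(struct demo_dir *d)
{
	d->no_tracker = 1;
	d->tracked = 0;
}

/* With a NULL tp, only the dedicated counter is touched. */
static int demo_alloc(struct demo_dir *d, struct demo_tracker **tp)
{
	if (!tp) {
		d->no_tracker++;
		return 0;
	}
	*tp = calloc(1, sizeof(**tp));
	if (!*tp)
		return -1;
	d->tracked++;
	return 0;
}

static int demo_free(struct demo_dir *d, struct demo_tracker **tp)
{
	if (!tp) {
		d->no_tracker--;
		return 0;
	}
	if (!*tp)
		return -1;
	free(*tp);
	*tp = NULL;
	d->tracked--;
	return 0;
}

/* Exit-time check: counter back at its initial value, no live trackers. */
static int demo_dir_balanced(const struct demo_dir *d)
{
	return d->no_tracker == 1 && d->tracked == 0;
}
```

A rogue release through the NULL path without a matching acquire leaves the
counter below 1, which is the imbalance the exit-time check is meant to expose.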

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h |  2 ++
 lib/ref_tracker.c   | 12 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index a443abda937d8..9ca353ab712b5 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t  lock;
unsigned intquarantine_avail;
refcount_t  untracked;
+   refcount_t  no_tracker;
booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
@@ -29,6 +30,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
dir->quarantine_avail = quarantine_count;
dir->dead = false;
refcount_set(&dir->untracked, 1);
+   refcount_set(&dir->no_tracker, 1);
stack_depot_init();
 }
 
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 32ff6bd497f8e..9c0c2e09df666 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -38,6 +38,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
+   WARN_ON_ONCE(refcount_read(&dir->no_tracker) != 1);
 }
 EXPORT_SYMBOL(ref_tracker_dir_exit);
 
@@ -75,6 +76,10 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
 
WARN_ON_ONCE(dir->dead);
 
+   if (!trackerp) {
+   refcount_inc(&dir->no_tracker);
+   return 0;
+   }
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -98,13 +103,18 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
 struct ref_tracker **trackerp)
 {
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
-   struct ref_tracker *tracker = *trackerp;
depot_stack_handle_t stack_handle;
+   struct ref_tracker *tracker;
unsigned int nr_entries;
unsigned long flags;
 
WARN_ON_ONCE(dir->dead);
 
+   if (!trackerp) {
+   refcount_dec(&dir->no_tracker);
+   return 0;
+   }
+   tracker = *trackerp;
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
-- 
2.25.1
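
The NULL @trackerp handling described in this patch can be modelled in plain userspace C. The following is only a minimal sketch under stated assumptions: every model_* name is hypothetical, plain unsigned counters stand in for refcount_t, and there is no locking or stack capture. It merely shows how a separate counter that starts at 1 exposes an extra put that has no matching hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct model_tracker {
	int unused;
};

struct model_dir {
	unsigned int tracked;    /* references that own a tracker object */
	unsigned int no_tracker; /* starts at 1, mirroring refcount_set(..., 1) */
};

static void model_dir_init(struct model_dir *dir)
{
	assert(dir != NULL);
	dir->tracked = 0;
	dir->no_tracker = 1;
}

/* A NULL trackerp means: count the reference, but do not allocate a tracker. */
static int model_ref_alloc(struct model_dir *dir, struct model_tracker **trackerp)
{
	assert(dir != NULL);
	if (!trackerp) {
		dir->no_tracker++;
		return 0;
	}
	*trackerp = calloc(1, sizeof(**trackerp));
	if (!*trackerp)
		return -1;
	dir->tracked++;
	return 0;
}

static int model_ref_free(struct model_dir *dir, struct model_tracker **trackerp)
{
	assert(dir != NULL);
	if (!trackerp) {
		dir->no_tracker--;
		return 0;
	}
	if (!*trackerp)
		return -1;
	free(*trackerp);
	*trackerp = NULL;
	dir->tracked--;
	return 0;
}

/* Balanced when both counters are back at their initial values. */
static int model_dir_balanced(const struct model_dir *dir)
{
	assert(dir != NULL);
	return dir->tracked == 0 && dir->no_tracker == 1;
}
```

In this model, a free with a NULL pointer that has no matching alloc leaves no_tracker at 0 instead of 1, which is the kind of imbalance the WARN_ON_ONCE() added to ref_tracker_dir_exit() is meant to report.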



[PATCH v2 2/9] lib/ref_tracker: compact stacktraces before printing

2022-02-21 Thread Andrzej Hajda
When references are taken alternately on multiple execution paths, the leak
report can grow substantially. Sorting and grouping leaks by stack_handle
makes it possible to compact the report.

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 lib/ref_tracker.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 1b0c6d645d64a..0e9c7d2828ccb 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -14,23 +15,41 @@ struct ref_tracker {
depot_stack_handle_tfree_stack_handle;
 };
 
+static int ref_tracker_cmp(void *priv, const struct list_head *a, const struct 
list_head *b)
+{
+   const struct ref_tracker *ta = list_entry(a, const struct ref_tracker, 
head);
+   const struct ref_tracker *tb = list_entry(b, const struct ref_tracker, 
head);
+
+   return ta->alloc_stack_handle - tb->alloc_stack_handle;
+}
+
 void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit)
 {
+   unsigned int i = 0, count = 0;
struct ref_tracker *tracker;
-   unsigned int i = 0;
+   depot_stack_handle_t stack;
 
lockdep_assert_held(&dir->lock);
 
+   if (list_empty(&dir->list))
+   return;
+
+   list_sort(NULL, &dir->list, ref_tracker_cmp);
+
list_for_each_entry(tracker, &dir->list, head) {
-   if (i < display_limit) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
-   i++;
-   } else {
+   if (i++ >= display_limit)
break;
-   }
+   if (!count++)
+   stack = tracker->alloc_stack_handle;
+   if (stack == tracker->alloc_stack_handle &&
+   !list_is_last(&tracker->head, &dir->list))
+   continue;
+
+   pr_err("leaked %d references.\n", count);
+   if (stack)
+   stack_depot_print(stack);
+   count = 0;
}
 }
 EXPORT_SYMBOL(__ref_tracker_dir_print);
-- 
2.25.1
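
The sort-then-group idea from this patch can be illustrated outside the kernel. This is a hedged userspace sketch, not the patch's code: stack_handle_t, struct leak_group and compact_leaks() are hypothetical names, qsort() stands in for list_sort(), and a flat array stands in for the tracker list. The comparator uses an explicit three-way comparison; that is a choice made for this sketch, since subtracting two unsigned 32-bit handles and returning the result as int can yield the wrong sign when the difference is large.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for depot_stack_handle_t. */
typedef uint32_t stack_handle_t;

struct leak_group {
	stack_handle_t handle;
	unsigned int count;
};

/* Three-way comparison that cannot wrap around. */
static int handle_cmp(const void *a, const void *b)
{
	stack_handle_t ha = *(const stack_handle_t *)a;
	stack_handle_t hb = *(const stack_handle_t *)b;

	return (ha > hb) - (ha < hb);
}

/*
 * Sort the handles in place, then collapse runs of equal handles into
 * groups. Returns the number of groups written to out, which never
 * exceeds max_groups.
 */
static size_t compact_leaks(stack_handle_t *handles, size_t n,
			    struct leak_group *out, size_t max_groups)
{
	size_t groups = 0;
	size_t i;

	if (n == 0 || max_groups == 0)
		return 0;
	assert(handles != NULL && out != NULL);

	qsort(handles, n, sizeof(*handles), handle_cmp);

	for (i = 0; i < n; i++) {
		if (groups > 0 && out[groups - 1].handle == handles[i]) {
			out[groups - 1].count++;
			continue;
		}
		if (groups == max_groups)
			break;
		out[groups].handle = handles[i];
		out[groups].count = 1;
		groups++;
	}
	return groups;
}
```

Each resulting group corresponds to one "leaked N references" line followed by a single stack dump, instead of one dump per leaked reference.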



[PATCH v2 02/11] [DO NOT MERGE] ref_tracker: add a count of untracked references

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
from a dev_hold_track()

To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h |  2 ++
 lib/ref_tracker.c   | 12 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index a443abda937d8..9ca353ab712b5 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t  lock;
unsigned intquarantine_avail;
refcount_t  untracked;
+   refcount_t  no_tracker;
booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
@@ -29,6 +30,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
dir->quarantine_avail = quarantine_count;
dir->dead = false;
refcount_set(&dir->untracked, 1);
+   refcount_set(&dir->no_tracker, 1);
stack_depot_init();
 }
 
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index 32ff6bd497f8e..9c0c2e09df666 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -38,6 +38,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
WARN_ON_ONCE(refcount_read(&dir->untracked) != 1);
+   WARN_ON_ONCE(refcount_read(&dir->no_tracker) != 1);
 }
 EXPORT_SYMBOL(ref_tracker_dir_exit);
 
@@ -75,6 +76,10 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
 
WARN_ON_ONCE(dir->dead);
 
+   if (!trackerp) {
+   refcount_inc(&dir->no_tracker);
+   return 0;
+   }
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -98,13 +103,18 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
 struct ref_tracker **trackerp)
 {
unsigned long entries[REF_TRACKER_STACK_ENTRIES];
-   struct ref_tracker *tracker = *trackerp;
depot_stack_handle_t stack_handle;
+   struct ref_tracker *tracker;
unsigned int nr_entries;
unsigned long flags;
 
WARN_ON_ONCE(dir->dead);
 
+   if (!trackerp) {
+   refcount_dec(&dir->no_tracker);
+   return 0;
+   }
+   tracker = *trackerp;
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
-- 
2.25.1



[PATCH v2 01/11] ref_tracker: implement use-after-free detection

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

Whenever ref_tracker_dir_exit() is called, mark the struct ref_tracker_dir
as dead.

Test the dead status from ref_tracker_alloc() and ref_tracker_free().

This should detect buggy dev_put()/dev_hold() happening too late
in the netdevice dismantle process.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h | 2 ++
 lib/ref_tracker.c   | 5 +
 2 files changed, 7 insertions(+)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 60f3453be23e6..a443abda937d8 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t  lock;
unsigned intquarantine_avail;
refcount_t  untracked;
+   booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
 #endif
@@ -26,6 +27,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
INIT_LIST_HEAD(&dir->quarantine);
spin_lock_init(&dir->lock);
dir->quarantine_avail = quarantine_count;
+   dir->dead = false;
refcount_set(&dir->untracked, 1);
stack_depot_init();
 }
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index a6789c0c626b0..32ff6bd497f8e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -20,6 +20,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
unsigned long flags;
bool leak = false;
 
+   dir->dead = true;
spin_lock_irqsave(&dir->lock, flags);
list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
list_del(&tracker->head);
@@ -72,6 +73,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
gfp_t gfp_mask = gfp;
unsigned long flags;
 
+   WARN_ON_ONCE(dir->dead);
+
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -100,6 +103,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
unsigned int nr_entries;
unsigned long flags;
 
+   WARN_ON_ONCE(dir->dead);
+
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
-- 
2.25.1
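
The "dead" flag logic in this patch is small enough to model directly. The following is a rough userspace sketch with hypothetical model_* names; a counter of late calls stands in for WARN_ON_ONCE(), which in the kernel only logs a warning and does not abort the operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tracker directory carrying only the dead flag. */
struct model_dir {
	bool dead;
	unsigned int late_calls; /* stands in for WARN_ON_ONCE() hits */
};

static void model_dir_init(struct model_dir *dir)
{
	assert(dir != NULL);
	dir->dead = false;
	dir->late_calls = 0;
}

/* Mirrors the patch: the directory is marked dead when it is torn down. */
static void model_dir_exit(struct model_dir *dir)
{
	assert(dir != NULL);
	dir->dead = true;
}

/* Any alloc or free on a dead directory is recorded as a late call. */
static void model_ref_op(struct model_dir *dir)
{
	assert(dir != NULL);
	if (dir->dead)
		dir->late_calls++;
}
```

A hold or put that arrives after teardown shows up as a non-zero late_calls value in this model, which corresponds to the warning the patch adds to ref_tracker_alloc() and ref_tracker_free().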



[PATCH v2 1/9] lib/ref_tracker: add unlocked leak print helper

2022-02-21 Thread Andrzej Hajda
To detect leaks reliably, the caller must be able to check both the tracked
counter and the leaks under the same lock. dir.lock is the natural candidate
for such a lock, and the unlocked print helper can be called with this lock held.
As a bonus, we can reuse this helper in ref_tracker_dir_exit().

Signed-off-by: Andrzej Hajda 
Reviewed-by: Chris Wilson 
---
 include/linux/ref_tracker.h |  8 +
 lib/ref_tracker.c   | 66 +
 2 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 60f3453be23e6..b9c968a716483 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -32,6 +32,9 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
 
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir);
 
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit);
+
 void ref_tracker_dir_print(struct ref_tracker_dir *dir,
   unsigned int display_limit);
 
@@ -52,6 +55,11 @@ static inline void ref_tracker_dir_exit(struct 
ref_tracker_dir *dir)
 {
 }
 
+static inline void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+}
+
 static inline void ref_tracker_dir_print(struct ref_tracker_dir *dir,
 unsigned int display_limit)
 {
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index a6789c0c626b0..1b0c6d645d64a 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -14,6 +14,38 @@ struct ref_tracker {
depot_stack_handle_tfree_stack_handle;
 };
 
+void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   struct ref_tracker *tracker;
+   unsigned int i = 0;
+
+   lockdep_assert_held(&dir->lock);
+
+   list_for_each_entry(tracker, &dir->list, head) {
+   if (i < display_limit) {
+   pr_err("leaked reference.\n");
+   if (tracker->alloc_stack_handle)
+   stack_depot_print(tracker->alloc_stack_handle);
+   i++;
+   } else {
+   break;
+   }
+   }
+}
+EXPORT_SYMBOL(__ref_tracker_dir_print);
+
+void ref_tracker_dir_print(struct ref_tracker_dir *dir,
+  unsigned int display_limit)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&dir->lock, flags);
+   __ref_tracker_dir_print(dir, display_limit);
+   spin_unlock_irqrestore(&dir->lock, flags);
+}
+EXPORT_SYMBOL(ref_tracker_dir_print);
+
 void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 {
struct ref_tracker *tracker, *n;
@@ -26,13 +58,13 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
kfree(tracker);
dir->quarantine_avail++;
}
-   list_for_each_entry_safe(tracker, n, &dir->list, head) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
+   if (!list_empty(&dir->list)) {
+   __ref_tracker_dir_print(dir, 16);
leak = true;
-   list_del(&tracker->head);
-   kfree(tracker);
+   list_for_each_entry_safe(tracker, n, &dir->list, head) {
+   list_del(&tracker->head);
+   kfree(tracker);
+   }
}
spin_unlock_irqrestore(&dir->lock, flags);
WARN_ON_ONCE(leak);
@@ -40,28 +72,6 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
 }
 EXPORT_SYMBOL(ref_tracker_dir_exit);
 
-void ref_tracker_dir_print(struct ref_tracker_dir *dir,
-  unsigned int display_limit)
-{
-   struct ref_tracker *tracker;
-   unsigned long flags;
-   unsigned int i = 0;
-
-   spin_lock_irqsave(&dir->lock, flags);
-   list_for_each_entry(tracker, &dir->list, head) {
-   if (i < display_limit) {
-   pr_err("leaked reference.\n");
-   if (tracker->alloc_stack_handle)
-   stack_depot_print(tracker->alloc_stack_handle);
-   i++;
-   } else {
-   break;
-   }
-   }
-   spin_unlock_irqrestore(&dir->lock, flags);
-}
-EXPORT_SYMBOL(ref_tracker_dir_print);
-
 int ref_tracker_alloc(struct ref_tracker_dir *dir,
  struct ref_tracker **trackerp,
  gfp_t gfp)
-- 
2.25.1



[PATCH v2 01/11] [DO NOT MERGE] ref_tracker: implement use-after-free detection

2022-02-21 Thread Andrzej Hajda
From: Eric Dumazet 

Whenever ref_tracker_dir_exit() is called, mark the struct ref_tracker_dir
as dead.

Test the dead status from ref_tracker_alloc() and ref_tracker_free().

This should detect buggy dev_put()/dev_hold() happening too late
in the netdevice dismantle process.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Andrzej Hajda 
---
 include/linux/ref_tracker.h | 2 ++
 lib/ref_tracker.c   | 5 +
 2 files changed, 7 insertions(+)

diff --git a/include/linux/ref_tracker.h b/include/linux/ref_tracker.h
index 60f3453be23e6..a443abda937d8 100644
--- a/include/linux/ref_tracker.h
+++ b/include/linux/ref_tracker.h
@@ -13,6 +13,7 @@ struct ref_tracker_dir {
spinlock_t  lock;
unsigned intquarantine_avail;
refcount_t  untracked;
+   booldead;
struct list_headlist; /* List of active trackers */
struct list_headquarantine; /* List of dead trackers */
 #endif
@@ -26,6 +27,7 @@ static inline void ref_tracker_dir_init(struct 
ref_tracker_dir *dir,
INIT_LIST_HEAD(&dir->quarantine);
spin_lock_init(&dir->lock);
dir->quarantine_avail = quarantine_count;
+   dir->dead = false;
refcount_set(&dir->untracked, 1);
stack_depot_init();
 }
diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index a6789c0c626b0..32ff6bd497f8e 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -20,6 +20,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
unsigned long flags;
bool leak = false;
 
+   dir->dead = true;
spin_lock_irqsave(&dir->lock, flags);
list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
list_del(&tracker->head);
@@ -72,6 +73,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
gfp_t gfp_mask = gfp;
unsigned long flags;
 
+   WARN_ON_ONCE(dir->dead);
+
if (gfp & __GFP_DIRECT_RECLAIM)
gfp_mask |= __GFP_NOFAIL;
*trackerp = tracker = kzalloc(sizeof(*tracker), gfp_mask);
@@ -100,6 +103,8 @@ int ref_tracker_free(struct ref_tracker_dir *dir,
unsigned int nr_entries;
unsigned long flags;
 
+   WARN_ON_ONCE(dir->dead);
+
if (!tracker) {
refcount_dec(&dir->untracked);
return -EEXIST;
-- 
2.25.1



[PATCH v2 00/11] drm/i915: use ref_tracker library for tracking wakerefs

2022-02-21 Thread Andrzej Hajda
Hi,

The appearance of the ref_tracker library makes it possible to drop the custom
wakeref tracking solution used in i915 and reuse the library instead.
For this, a few adjustments have been made to ref_tracker; details are in the
patches. I hope the changes are OK with the original author.

The patchset has been rebased on top of drm-tip to allow CI to test the changes.
Patches marked "[DO NOT MERGE]" are cherry-picked from linux-next (they are
not yet in drm-tip), to allow CI to build and run the patchset (it works only
on the drm-tip tree).

Added CC to netdev as the only user of the library atm.

v2:
  - replaced list_sort with ref_tracker_dir_stats, to avoid potentially
    extensive sorting; if the number of reports is expected to be big enough (???)
    we can replace the linear search in ref_tracker_dir_stats.stacks with a binary
    heap (min_heap),
  - refactored gfp flags,
  - fixed i915 handling of no-tracking flag.

Regards
Andrzej


Andrzej Hajda (6):
  lib/ref_tracker: add unlocked leak print helper
  lib/ref_tracker: __ref_tracker_dir_print improve printing
  lib/ref_tracker: add printing to memory buffer
  lib/ref_tracker: remove warnings in case of allocation failure
  drm/i915: Correct type of wakeref variable
  drm/i915: replace Intel internal tracker with kernel core ref_tracker

Chris Wilson (2):
  drm/i915: Separate wakeref tracking
  drm/i915: Track leaked gt->wakerefs

Eric Dumazet (3):
  [DO NOT MERGE] ref_tracker: implement use-after-free detection
  [DO NOT MERGE] ref_tracker: add a count of untracked references
  [DO NOT MERGE] ref_tracker: remove filter_irq_stacks() call

 drivers/gpu/drm/i915/Kconfig.debug|  19 ++
 drivers/gpu/drm/i915/Makefile |   1 +
 .../drm/i915/display/intel_display_power.c|   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   7 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |  10 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  14 +-
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   |  13 +-
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |   3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   6 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   2 +
 .../drm/i915/gt/intel_execlists_submission.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  12 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h |  36 ++-
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |   4 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  20 +-
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c  |   5 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  10 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c|  17 +-
 drivers/gpu/drm/i915/gt/selftest_slpc.c   |  10 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  11 +-
 drivers/gpu/drm/i915/i915_pmu.c   |  16 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   | 239 ++
 drivers/gpu/drm/i915/intel_runtime_pm.h   |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.c  |  10 +-
 drivers/gpu/drm/i915/intel_wakeref.h  | 112 +++-
 include/linux/ref_tracker.h   |  35 ++-
 lib/ref_tracker.c | 198 ---
 27 files changed, 480 insertions(+), 344 deletions(-)

-- 
2.25.1



Re: [PATCH] drm/amdgpu: Fix typo in *whether* in comment

2022-02-21 Thread Alex Deucher
Applied.  Thanks!

On Fri, Feb 18, 2022 at 11:56 PM Paul Menzel  wrote:
>
> Signed-off-by: Paul Menzel 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 63a089992645..430e56583751 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -740,7 +740,7 @@ MODULE_PARM_DESC(debug_largebar,
>   * systems with a broken CRAT table.
>   *
>   * Default is auto (according to asic type, iommu_v2, and crat table, to 
> decide
> - * whehter use CRAT)
> + * whether use CRAT)
>   */
>  int ignore_crat;
>  module_param(ignore_crat, int, 0444);
> --
> 2.35.1
>


[PATCH 2/2] drm/i915/vlv_dsi: Add DMI quirk for wrong panel size on Lenovo Yoga Tablet 2 series

2022-02-21 Thread Hans de Goede
On the Lenovo Yoga Tablet 2 830 / 1050 the VBT contains a bogus
192mm x 120mm size. This is especially a problem on the 8" 830 version
which uses a 10:16 portrait screen, whereas the bogus size is 16:10.

Add a DMI quirk to override the wrong panel size with the correct one.
Note both the 10" 1050 models as well as the 8" 830 models use the same
mainboard and thus the same DMI strings. The 10" 1050 uses a 1920x1200
landscape screen, whereas the 8" 830 uses a 1200x1920 portrait screen,
so the quirk handling uses the display resolution to detect the model.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/vlv_dsi.c | 37 ++
 1 file changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/vlv_dsi.c 
b/drivers/gpu/drm/i915/display/vlv_dsi.c
index 66f5cf32bb66..e370a039e991 100644
--- a/drivers/gpu/drm/i915/display/vlv_dsi.c
+++ b/drivers/gpu/drm/i915/display/vlv_dsi.c
@@ -1847,6 +1847,29 @@ static void vlv_dsi_asus_tf103c_mode_fixup(struct 
drm_connector *connector,
fixed_mode->crtc_vtotal = 816;
 }
 
+/*
+ * On the Lenovo Yoga Tablet 2 830 / 1050 width_/height_mm contain a bogus
+ * 192mm x 120mm size. This is especially a problem on the 8" 830 version which
+ * uses a 10:16 portrait screen, whereas the bogus size is 16:10.
+ */
+static void vlv_dsi_lenovo_yoga_tab2_mode_fixup(struct drm_connector 
*connector,
+   struct drm_display_mode 
*fixed_mode)
+{
+   struct drm_display_info *info = &connector->display_info;
+
+   /*
+* The 10" 1050 uses a 1920x1200 landscape screen, whereas the 8" 830
+* uses a 1200x1920 portrait screen.
+*/
+   if (fixed_mode->hdisplay == 1920) {
+   info->width_mm = 216;
+   info->height_mm = 135;
+   } else {
+   info->width_mm = 107;
+   info->height_mm = 171;
+   }
+}
+
 static const struct dmi_system_id dmi_mode_fixup_table[] = {
{
/* Asus Transformer Pad TF103C */
@@ -1856,6 +1879,20 @@ static const struct dmi_system_id dmi_mode_fixup_table[] 
= {
},
.driver_data = (void *)vlv_dsi_asus_tf103c_mode_fixup,
},
+   {
+   /*
+* Lenovo Yoga Tablet 2 830F/L or 1050F/L (The 8" and 10"
+* Lenovo Yoga Tablet 2 use the same mainboard)
+*/
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Intel Corp."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "VALLEYVIEW C0 PLATFORM"),
+   DMI_MATCH(DMI_BOARD_NAME, "BYT-T FFD8"),
+   /* Partial match on beginning of BIOS version */
+   DMI_MATCH(DMI_BIOS_VERSION, "BLADE_21"),
+   },
+   .driver_data = (void *)vlv_dsi_lenovo_yoga_tab2_mode_fixup,
+   },
{ }
 };
 
-- 
2.35.1
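
The model detection in this quirk boils down to choosing a physical size from the horizontal resolution. Below is a small userspace sketch of that decision; the millimetre values are the ones from the patch, while struct panel_size and yoga_tab2_panel_size() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical container for the two fields the quirk overrides. */
struct panel_size {
	int width_mm;
	int height_mm;
};

/*
 * Pick the physical panel size from the horizontal resolution:
 * 1920 pixels wide means the 10" landscape panel, anything else is
 * treated as the 8" portrait panel, as in the patch.
 */
static struct panel_size yoga_tab2_panel_size(int hdisplay)
{
	struct panel_size size;

	assert(hdisplay > 0);
	if (hdisplay == 1920) {
		size.width_mm = 216;
		size.height_mm = 135;
	} else {
		size.width_mm = 107;
		size.height_mm = 171;
	}
	return size;
}
```

As a sanity check on the numbers, 216 x 135 is exactly 16:10, and 107 x 171 is within rounding of 10:16, so both sizes agree with the aspect ratios of the corresponding modes.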



[PATCH 1/2] drm/i915/vlv_dsi: Add DMI quirk for wrong panel modeline in BIOS on Asus TF103C

2022-02-21 Thread Hans de Goede
Vtotal is wrong in the BIOS-supplied modeline for the DSI panel on
the Asus TF103C, leading to the last line of the display being shown
as the first line.

The factory-installed Android has a hardcoded modeline in its kernel,
so it does not suffer from this BIOS bug.

The Android boot-splash, which uses the EFI FB and therefore does have
this bug, has an all-black last line, so the bug is not visible there.

This commit introduces a generic DMI based mechanism for doing modeline
fixups, in case we need similar fixups on other models in the future.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/vlv_dsi.c | 36 ++
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/vlv_dsi.c 
b/drivers/gpu/drm/i915/display/vlv_dsi.c
index 06ef822c27bd..66f5cf32bb66 100644
--- a/drivers/gpu/drm/i915/display/vlv_dsi.c
+++ b/drivers/gpu/drm/i915/display/vlv_dsi.c
@@ -23,6 +23,7 @@
  * Author: Jani Nikula 
  */
 
+#include 
 #include 
 
 #include 
@@ -1831,6 +1832,33 @@ static void vlv_dphy_param_init(struct intel_dsi 
*intel_dsi)
intel_dsi_log_params(intel_dsi);
 }
 
+typedef void (*vlv_dsi_mode_fixup_func)(struct drm_connector *connector,
+   struct drm_display_mode *fixed_mode);
+
+/*
+ * Vtotal is wrong on the Asus TF103C leading to the last line of the display
+ * being shown as the first line. The factory installed Android has a hardcoded
+ * modeline, causing it to not suffer from this BIOS bug.
+ */
+static void vlv_dsi_asus_tf103c_mode_fixup(struct drm_connector *connector,
+  struct drm_display_mode *fixed_mode)
+{
+   fixed_mode->vtotal = 816;
+   fixed_mode->crtc_vtotal = 816;
+}
+
+static const struct dmi_system_id dmi_mode_fixup_table[] = {
+   {
+   /* Asus Transformer Pad TF103C */
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "TF103C"),
+   },
+   .driver_data = (void *)vlv_dsi_asus_tf103c_mode_fixup,
+   },
+   { }
+};
+
 void vlv_dsi_init(struct drm_i915_private *dev_priv)
 {
struct drm_device *dev = &dev_priv->drm;
@@ -1840,6 +1868,8 @@ void vlv_dsi_init(struct drm_i915_private *dev_priv)
struct intel_connector *intel_connector;
struct drm_connector *connector;
struct drm_display_mode *current_mode, *fixed_mode;
+   const struct dmi_system_id *dmi_id;
+   vlv_dsi_mode_fixup_func mode_fixup;
enum port port;
enum pipe pipe;
 
@@ -1968,6 +1998,12 @@ void vlv_dsi_init(struct drm_i915_private *dev_priv)
goto err_cleanup_connector;
}
 
+   dmi_id = dmi_first_match(dmi_mode_fixup_table);
+   if (dmi_id) {
+   mode_fixup = (vlv_dsi_mode_fixup_func)dmi_id->driver_data;
+   mode_fixup(connector, fixed_mode);
+   }
+
intel_panel_init(&intel_connector->panel, fixed_mode, NULL);
intel_backlight_setup(intel_connector, INVALID_PIPE);
 
-- 
2.35.1
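
The generic fixup mechanism introduced here is a table of match strings plus a callback, searched once at init time. A minimal userspace sketch of that pattern follows. All names (struct quirk_entry, first_match(), apply_mode_fixup(), struct mode) are hypothetical, the vendor and product strings are passed in explicitly instead of being read from firmware, and the callback is stored as a typed function pointer rather than being cast through void * as the patch does with driver_data. Substring matching via strstr() is used on the assumption that DMI_MATCH() performs a substring comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two drm_display_mode fields being fixed. */
struct mode {
	int vtotal;
	int crtc_vtotal;
};

typedef void (*mode_fixup_func)(struct mode *mode);

struct quirk_entry {
	const char *vendor;
	const char *product;
	mode_fixup_func fixup;
};

/* Same values as the TF103C fixup in the patch. */
static void tf103c_fixup(struct mode *mode)
{
	mode->vtotal = 816;
	mode->crtc_vtotal = 816;
}

/* The table ends with an all-NULL sentinel, like the { } entry in the patch. */
static const struct quirk_entry quirk_table[] = {
	{ "ASUSTeK COMPUTER INC.", "TF103C", tf103c_fixup },
	{ NULL, NULL, NULL }
};

/* Return the first entry whose strings all occur in the given identifiers. */
static const struct quirk_entry *first_match(const char *vendor,
					     const char *product)
{
	const struct quirk_entry *q;

	assert(vendor != NULL && product != NULL);
	for (q = quirk_table; q->vendor; q++) {
		if (strstr(vendor, q->vendor) && strstr(product, q->product))
			return q;
	}
	return NULL;
}

/* Apply the matching fixup, if any. Returns 1 when a fixup was applied. */
static int apply_mode_fixup(const char *vendor, const char *product,
			    struct mode *mode)
{
	const struct quirk_entry *q = first_match(vendor, product);

	if (!q)
		return 0;
	q->fixup(mode);
	return 1;
}
```

Adding a quirk for another model then only requires a new callback and one more table entry, which is the extensibility the commit message refers to.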



[PATCH] drm/simpledrm: Add "panel orientation" property on non-upright mounted LCD panels

2022-02-21 Thread Hans de Goede
Some devices use e.g. a portrait panel in a standard laptop casing made
for landscape panels. efifb calls drm_get_panel_orientation_quirk() and
sets fb_info.fbcon_rotate_hint to make fbcon rotate the console so that
it shows upright instead of on its side.

After switching to simpledrm, fbcon renders on its side. Call the
drm_connector_set_panel_orientation_with_quirk() helper to add
a "panel orientation" property on devices listed in the quirk table,
to make fbcon (and aware userspace apps) rotate the image so that it
displays properly.

Cc: Javier Martinez Canillas 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/tiny/simpledrm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/tiny/simpledrm.c b/drivers/gpu/drm/tiny/simpledrm.c
index 04146da2d1d8..11576e0297e4 100644
--- a/drivers/gpu/drm/tiny/simpledrm.c
+++ b/drivers/gpu/drm/tiny/simpledrm.c
@@ -798,6 +798,9 @@ static int simpledrm_device_init_modeset(struct 
simpledrm_device *sdev)
if (ret)
return ret;
drm_connector_helper_add(connector, &simpledrm_connector_helper_funcs);
+   drm_connector_set_panel_orientation_with_quirk(connector,
+  
DRM_MODE_PANEL_ORIENTATION_UNKNOWN,
+  mode->hdisplay, 
mode->vdisplay);
 
formats = simpledrm_device_formats(sdev, );
 
-- 
2.35.1



Re: [PATCH] drm/amdgpu: fix printk format for size_t variable

2022-02-21 Thread Tom Rix



On 2/21/22 12:53 PM, Luben Tuikov wrote:

On 2022-02-21 15:36, Tom Rix wrote:

On 2/21/22 11:57 AM, Luben Tuikov wrote:

Hi Tom,

This was already fixed with this patch, and LKML was CC-ed. See the CC tags in 
the patch below,

commit 4f7d7cda90cbd7
Author: Luben Tuikov 
Date:   Wed Feb 16 16:47:32 2022 -0500

  drm/amdgpu: Fix ARM compilation warning
  
  Fix this ARM warning:

I'm glad it wasn't just mips ;)

There have been a couple of build breaks with amdgpu recently.

Nick asked about adding clang to your ci.

Could at least one non x86_64 gcc also be added, maybe aarch64 ?

Yeah, that's a great idea. I tried the make.cross (for ARM) as per
the initial breakage report, but when I tried it, it got into a loop of
"make ARCH=arm mrproper" --> "make prepare" --> "make ARCH=arm mrproper" --> "make 
prepare" --> ...
and I couldn't figure out why.

Maybe need to set CROSS_COMPILE ?

I don't mind adding ARM cross compilation into my local setup.


For crosses, I generate a 'make' script like

#!/bin/sh

export PATH=/bin:$PATH

make ARCH=arm64 CROSS_COMPILE=aarch64-elf- $@

so workflow looks like normal, replacing make with ./make

Tom



Regards,
Luben



Tom

  
  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld'

  expects argument of type 'long int', but argument 4 has type 'size_t' {aka
  'unsigned int'} [-Wformat=]
  
  Cc: Alex Deucher 

  Cc: kbuild-...@lists.01.org
  Cc: linux-ker...@vger.kernel.org
  Reported-by: kernel test robot 
  Fixes: 7e60fbfbdc10a0 ("drm/amdgpu: Show IP discovery in sysfs")
  Signed-off-by: Luben Tuikov 
  Acked-by: Alex Deucher 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 2506bcf36c870c..6c7ec058125e1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct amdgpu_device 
*adev,
  le16_to_cpu(ip->hw_id) != ii)
  goto next_ip;
   
-   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

+   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
   
  /* We have a hw_id match; register the hw

   * block if not yet registered.

Regards,
Luben

On 2022-02-21 12:37, t...@redhat.com wrote:

From: Tom Rix 

On mips64 allyesconfig, there is this build break
amdgpu_discovery.c:671:35: error: format '%ld' expects
argument of type 'long int', but argument 4 has
type 'size_t' {aka 'unsigned int'}
DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

For size_t, use %zu.

Fixes: a6c40b178092 ("drm/amdgpu: Show IP discovery in sysfs")
Signed-off-by: Tom Rix 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 7c7e28fd912e..58238f67b1d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct amdgpu_device 
*adev,
le16_to_cpu(ip->hw_id) != ii)
goto next_ip;
   
-			DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

+   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
   
   			/* We have a hw_id match; register the hw

 * block if not yet registered.

Regards,

Regards,




Re: [PATCH] drm/amdgpu: fix printk format for size_t variable

2022-02-21 Thread Luben Tuikov
On 2022-02-21 15:36, Tom Rix wrote:
> 
> On 2/21/22 11:57 AM, Luben Tuikov wrote:
>> Hi Tom,
>>
>> This was already fixed with this patch, and LKML was CC-ed. See the CC tags 
>> in the patch below,
>>
>> commit 4f7d7cda90cbd7
>> Author: Luben Tuikov 
>> Date:   Wed Feb 16 16:47:32 2022 -0500
>>
>>  drm/amdgpu: Fix ARM compilation warning
>>  
>>  Fix this ARM warning:
> 
> I'm glad it wasn't just mips ;)
> 
> There have been a couple of build breaks with amdgpu recently.
> 
> Nick asked about adding clang to your ci.
> 
> Could at least one non x86_64 gcc also be added, maybe aarch64 ?

Yeah, that's a great idea. I tried the make.cross (for ARM) as per
the initial breakage report, but when I tried it, it got into a loop of
"make ARCH=arm mrproper" --> "make prepare" --> "make ARCH=arm mrproper" --> 
"make prepare" --> ...
and I couldn't figure out why.

I don't mind adding ARM cross compilation into my local setup.

Regards,
Luben


> 
> Tom
> 
>>  
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format 
>> '%ld'
>>  expects argument of type 'long int', but argument 4 has type 'size_t' 
>> {aka
>>  'unsigned int'} [-Wformat=]
>>  
>>  Cc: Alex Deucher 
>>  Cc: kbuild-...@lists.01.org
>>  Cc: linux-ker...@vger.kernel.org
>>  Reported-by: kernel test robot 
>>  Fixes: 7e60fbfbdc10a0 ("drm/amdgpu: Show IP discovery in sysfs")
>>  Signed-off-by: Luben Tuikov 
>>  Acked-by: Alex Deucher 
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
>> index 2506bcf36c870c..6c7ec058125e1d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
>> @@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct 
>> amdgpu_device *adev,
>>  le16_to_cpu(ip->hw_id) != ii)
>>  goto next_ip;
>>   
>> -   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
>> +   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
>>   
>>  /* We have a hw_id match; register the hw
>>   * block if not yet registered.
>>
>> Regards,
>> Luben
>>
>> On 2022-02-21 12:37, t...@redhat.com wrote:
>>> From: Tom Rix 
>>>
>>> On mips64 allyesconfig, there is this build break
>>> amdgpu_discovery.c:671:35: error: format '%ld' expects
>>>argument of type 'long int', but argument 4 has
>>>type 'size_t' {aka 'unsigned int'}
>>>DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
>>>
>>> For size_t, use %zu.
>>>
>>> Fixes: a6c40b178092 ("drm/amdgpu: Show IP discovery in sysfs")
>>> Signed-off-by: Tom Rix 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
>>> index 7c7e28fd912e..58238f67b1d3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
>>> @@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct 
>>> amdgpu_device *adev,
>>> le16_to_cpu(ip->hw_id) != ii)
>>> goto next_ip;
>>>   
>>> -   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
>>> +   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
>>>   
>>> /* We have a hw_id match; register the hw
>>>  * block if not yet registered.
>> Regards,
> 

Regards,
-- 
Luben
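
For reference, the format fix discussed in this thread can be reproduced in userspace, since the z length modifier is standard C99 as well. This is only an illustrative sketch: format_match() is a hypothetical helper and snprintf() stands in for DRM_DEBUG(). With %zu the argument type matches size_t whether size_t is 32 or 64 bits wide, which is why the warning goes away on both kinds of target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the debug message with %zu so the conversion matches size_t
 * on every ABI. Returns the number of characters that snprintf()
 * would have written, not counting the terminating NUL.
 */
static int format_match(char *buf, size_t len, int ii, size_t ip_offset)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "match:%d @ ip_offset:%zu", ii, ip_offset);
}
```

Using %ld here instead would only be correct on targets where size_t happens to be unsigned long, and even there the signedness would not match.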


Re: [PATCH] drm/amdgpu: fix printk format for size_t variable

2022-02-21 Thread Tom Rix



On 2/21/22 11:57 AM, Luben Tuikov wrote:

Hi Tom,

This was already fixed with this patch, and LKML was CC-ed. See the CC tags in 
the patch below,

commit 4f7d7cda90cbd7
Author: Luben Tuikov 
Date:   Wed Feb 16 16:47:32 2022 -0500

 drm/amdgpu: Fix ARM compilation warning
 
 Fix this ARM warning:


I'm glad it wasn't just mips ;)

There have been a couple of build breaks with amdgpu recently.

Nick asked about adding clang to your ci.

Could at least one non x86_64 gcc also be added, maybe aarch64 ?

Tom

 
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld'

 expects argument of type 'long int', but argument 4 has type 'size_t' {aka
 'unsigned int'} [-Wformat=]
 
 Cc: Alex Deucher 

 Cc: kbuild-...@lists.01.org
 Cc: linux-ker...@vger.kernel.org
 Reported-by: kernel test robot 
 Fixes: 7e60fbfbdc10a0 ("drm/amdgpu: Show IP discovery in sysfs")
 Signed-off-by: Luben Tuikov 
 Acked-by: Alex Deucher 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 2506bcf36c870c..6c7ec058125e1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct amdgpu_device 
*adev,
 le16_to_cpu(ip->hw_id) != ii)
 goto next_ip;
  
-   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

+   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
  
 /* We have a hw_id match; register the hw

  * block if not yet registered.

Regards,
Luben

On 2022-02-21 12:37, t...@redhat.com wrote:

From: Tom Rix 

On mips64 allyesconfig, there is this build break
amdgpu_discovery.c:671:35: error: format '%ld' expects
   argument of type 'long int', but argument 4 has
   type 'size_t' {aka 'unsigned int'}
   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

For size_t, use %zu.

Fixes: a6c40b178092 ("drm/amdgpu: Show IP discovery in sysfs")
Signed-off-by: Tom Rix 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 7c7e28fd912e..58238f67b1d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct amdgpu_device 
*adev,
le16_to_cpu(ip->hw_id) != ii)
goto next_ip;
  
-			DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

+   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
  
  			/* We have a hw_id match; register the hw

 * block if not yet registered.

Regards,




Re: [PATCH libdrm v2 00/25] Update Tegra support

2022-02-21 Thread Dmitry Osipenko
18.02.2022 12:31, Mikko Perttunen wrote:
> On 2/17/22 21:16, Thierry Reding wrote:
>> ...
> 
> Reviewed-by: Mikko Perttunen 
> 
> Left one cosmetic comment in the VIC4.0 patch, but overall looks OK. I
> think it would be fine to have some basic tests in libdrm as well.

There is a question about who is going to use this libdrm API. Are you
going to use it in the VAAPI driver?

Grate drivers can't use this API because:

1. More features are needed
2. There is no stable API
3. It's super painful to keep all drivers and libdrm in sync from a
packaging perspective.

It's much more practical nowadays to use DRM directly, without
SoC-specific libdrm API, i.e. to bundle that SoC-specific API within the
drivers.


Re: [PATCH] drm/omap: switch to drm_of_find_panel_or_bridge

2022-02-21 Thread kernel test robot
Hi José,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm/drm-next]
[also build test ERROR on v5.17-rc5 next-20220217]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Jos-Exp-sito/drm-omap-switch-to-drm_of_find_panel_or_bridge/20220221-035403
base:   git://anongit.freedesktop.org/drm/drm drm-next
config: arm-allmodconfig 
(https://download.01.org/0day-ci/archive/20220222/202202220451.vbtgfzsa-...@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/9a465e2c1dba123efe08cf2f4a5ae11b07be4142
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Jos-Exp-sito/drm-omap-switch-to-drm_of_find_panel_or_bridge/20220221-035403
git checkout 9a465e2c1dba123efe08cf2f4a5ae11b07be4142
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross 
O=build_dir ARCH=arm SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/omapdrm/dss/output.c: In function 
'omapdss_device_init_output':
>> drivers/gpu/drm/omapdrm/dss/output.c:25:15: error: implicit declaration of 
>> function 'drm_of_find_panel_or_bridge' 
>> [-Werror=implicit-function-declaration]
  25 | ret = drm_of_find_panel_or_bridge(out->dev->of_node, 
out->of_port, 0,
 |   ^~~
   cc1: some warnings being treated as errors


vim +/drm_of_find_panel_or_bridge +25 drivers/gpu/drm/omapdrm/dss/output.c

19  
20  int omapdss_device_init_output(struct omap_dss_device *out,
21 struct drm_bridge *local_bridge)
22  {
23  int ret;
24  
  > 25  ret = drm_of_find_panel_or_bridge(out->dev->of_node, 
out->of_port, 0,
26 &out->panel, &out->bridge);
27  if (ret) {
28  if (ret == -ENODEV) {
29  dev_dbg(out->dev, "failed to find video 
sink\n");
30  return 0;
31  }
32  goto error;
33  }
34  
35  if (out->panel) {
36  struct drm_bridge *bridge;
37  
38  bridge = drm_panel_bridge_add(out->panel);
39  if (IS_ERR(bridge)) {
40  dev_err(out->dev,
41  "unable to create panel bridge (%ld)\n",
42  PTR_ERR(bridge));
43  ret = PTR_ERR(bridge);
44  goto error;
45  }
46  
47  out->bridge = bridge;
48  }
49  
50  if (local_bridge) {
51  if (!out->bridge) {
52  ret = -EPROBE_DEFER;
53  goto error;
54  }
55  
56  out->next_bridge = out->bridge;
57  out->bridge = local_bridge;
58  }
59  
60  if (!out->bridge) {
61  ret = -EPROBE_DEFER;
62  goto error;
63  }
64  
65  return 0;
66  
67  error:
68  omapdss_device_cleanup_output(out);
69  return ret;
70  }
71  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH v3 8/9] drm/tegra: vic: Implement get_streamid_offset

2022-02-21 Thread Dmitry Osipenko
21.02.2022 14:44, Mikko Perttunen wrote:
> On 2/19/22 20:54, Dmitry Osipenko wrote:
>> 19.02.2022 21:49, Dmitry Osipenko wrote:
>>> 18.02.2022 14:39, Mikko Perttunen wrote:
 +static int vic_get_streamid_offset(struct tegra_drm_client *client)
 +{
 +    struct vic *vic = to_vic(client);
 +    int err;
 +
 +    err = vic_load_firmware(vic);
>>>
>>> You can't invoke vic_load_firmware() while RPM is suspended. Either
>>> replace this with RPM get/put or do something else.
> 
> Why not, I'm not seeing any HW accesses in vic_load_firmware? Although
> it looks like it might race with the vic_load_firmware call in
> vic_runtime_resume which probably needs to be fixed.

It was not clear from the function's name that h/w is untouched, I read
"load" as "upload" and then looked at vic_runtime_resume(). I'd rename
vic_load_firmware() to vic_prepare_firmware_image().

And yes, technically lock is needed.
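
The race mentioned above (two callers preparing the firmware image at
the same time) can be illustrated with a small user-space sketch. This
is not the driver code: struct fw_state, fw_prepare_once() and the
load_count field are hypothetical names, and a pthread mutex stands in
for whatever lock the driver would actually use (in the kernel that
would more likely be a struct mutex).

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device firmware state. */
struct fw_state {
	pthread_mutex_t lock;	/* serializes image preparation */
	bool loaded;		/* set once the image is ready */
	int load_count;		/* counts how often preparation really ran */
};

/*
 * Prepare the firmware image at most once, no matter how many
 * callers arrive concurrently. Returns 0 on success.
 */
int fw_prepare_once(struct fw_state *fw)
{
	pthread_mutex_lock(&fw->lock);
	if (!fw->loaded) {
		/* Placeholder for the real work (request and parse the image). */
		fw->load_count++;
		fw->loaded = true;
	}
	pthread_mutex_unlock(&fw->lock);

	return 0;
}
```

With the check and the update both under the lock, a second caller
either waits or sees loaded == true and returns without repeating the
work.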


Re: [PATCH v3 9/9] drm/tegra: Support context isolation

2022-02-21 Thread Dmitry Osipenko
21.02.2022 15:06, Mikko Perttunen wrote:
> On 2/19/22 20:35, Dmitry Osipenko wrote:
>> 18.02.2022 14:39, Mikko Perttunen wrote:
>>> +    if (context->memory_context &&
>>> context->client->ops->get_streamid_offset) {
>>  ^^^
>>> +    int offset =
>>> context->client->ops->get_streamid_offset(context->client);
>>> +
>>> +    if (offset >= 0) {
>>> +    job->context = context->memory_context;
>>> +    job->engine_streamid_offset = offset;
>>> +    host1x_context_get(job->context);
>>> +    }
>>
>> You should bump refcount unconditionally or you'll get refcnt underflow
>> on put, when offset < 0.
> 
> This refcount is intended to be dropped from 'release_job', where it's
> dropped if job->context is set, which it is from this path.
> 
>>
>>> +    }
>>> +
>>>   /*
>>>    * job_data is now part of job reference counting, so don't
>>> release
>>>    * it from here.
>>> diff --git a/drivers/gpu/drm/tegra/uapi.c b/drivers/gpu/drm/tegra/uapi.c
>>> index 9ab9179d2026..be33da54d12c 100644
>>> --- a/drivers/gpu/drm/tegra/uapi.c
>>> +++ b/drivers/gpu/drm/tegra/uapi.c
>>> @@ -33,6 +33,9 @@ static void tegra_drm_channel_context_close(struct
>>> tegra_drm_context *context)
>>>   struct tegra_drm_mapping *mapping;
>>>   unsigned long id;
>>>   +    if (context->memory_context)
>>> +    host1x_context_put(context->memory_context);
>>
>> The "if (context->memory_context &&
>> context->client->ops->get_streamid_offset)" above doesn't match the "if
>> (context->memory_context)". You'll get refcount underflow.
> 
> And this drop is for the refcount implicitly added when allocating the
> memory context through host1x_context_alloc; so these two places should
> be independent.
> 
> Please elaborate if I missed something.

You named context as memory_context and then have
context=context->memory_context. Please try to improve the variable
names, like drm_ctx->host1x_ctx for example.

I'm also not a big fan of the "if (ref) put(ref)" pattern. Wouldn't it
be better to move all the "if (!NULL)" checks inside of get()/put() and
make the invocations unconditional?
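
A minimal sketch of the suggestion above, assuming a hypothetical
refcounted struct ctx with helpers ctx_get() and ctx_put() (these are
illustrative names, not the host1x API). The NULL checks live inside
the helpers, so callers can invoke them unconditionally. The counter
is a plain int for brevity; real kernel code would use kref or
refcount_t.

```c
#include <stddef.h>

/* Hypothetical refcounted object, standing in for a memory context. */
struct ctx {
	int refcount;
};

/* Take a reference. A NULL pointer is accepted and passed through. */
struct ctx *ctx_get(struct ctx *c)
{
	if (c)
		c->refcount++;
	return c;
}

/*
 * Drop a reference. A NULL pointer is a no-op.
 * Returns 1 when the last reference was dropped, 0 otherwise.
 */
int ctx_put(struct ctx *c)
{
	if (!c)
		return 0;
	return --c->refcount == 0;
}
```

A caller then writes ctx_put(context->memory_context) without a
preceding "if", and the get and put sites no longer need matching
conditions.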


Re: [PATCH] drm/amdgpu: fix printk format for size_t variable

2022-02-21 Thread Luben Tuikov
Hi Tom,

This was already fixed with this patch, and LKML was CC-ed. See the CC tags in 
the patch below,

commit 4f7d7cda90cbd7
Author: Luben Tuikov 
Date:   Wed Feb 16 16:47:32 2022 -0500

drm/amdgpu: Fix ARM compilation warning

Fix this ARM warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld'
expects argument of type 'long int', but argument 4 has type 'size_t' {aka
'unsigned int'} [-Wformat=]

Cc: Alex Deucher 
Cc: kbuild-...@lists.01.org
Cc: linux-ker...@vger.kernel.org
Reported-by: kernel test robot 
Fixes: 7e60fbfbdc10a0 ("drm/amdgpu: Show IP discovery in sysfs")
Signed-off-by: Luben Tuikov 
Acked-by: Alex Deucher 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 2506bcf36c870c..6c7ec058125e1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct amdgpu_device 
*adev,
le16_to_cpu(ip->hw_id) != ii)
goto next_ip;
 
-   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
+   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
 
/* We have a hw_id match; register the hw
 * block if not yet registered.

Regards,
Luben

On 2022-02-21 12:37, t...@redhat.com wrote:
> From: Tom Rix 
> 
> On mips64 allyesconfig, there is this build break
> amdgpu_discovery.c:671:35: error: format '%ld' expects
>   argument of type 'long int', but argument 4 has
>   type 'size_t' {aka 'unsigned int'}
>   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
> 
> For size_t, use %zu.
> 
> Fixes: a6c40b178092 ("drm/amdgpu: Show IP discovery in sysfs")
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index 7c7e28fd912e..58238f67b1d3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct 
> amdgpu_device *adev,
>   le16_to_cpu(ip->hw_id) != ii)
>   goto next_ip;
>  
> - DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
> + DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
>  
>   /* We have a hw_id match; register the hw
>* block if not yet registered.

Regards,
-- 
Luben
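
The portability point behind this fix can be shown in plain C. On
ABIs where size_t is 'unsigned int', '%ld' does not match the
argument, while '%zu' matches size_t everywhere. The helper name
format_ip_offset() is made up for this sketch, and snprintf() stands
in for DRM_DEBUG().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a hw_id and a size_t offset into buf.
 * Returns the value of snprintf(), i.e. the length of the full string.
 */
int format_ip_offset(char *buf, size_t buflen, int hw_id, size_t ip_offset)
{
	/* %zu is the conversion for size_t, whether it is 32 or 64 bits wide. */
	return snprintf(buf, buflen, "match:%d @ ip_offset:%zu",
			hw_id, ip_offset);
}
```

Building with -Wformat on a 32-bit target and switching the
conversion back to '%ld' should reproduce the warning quoted above.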


[PATCH v2 5/5] drm: Add TODO item for optimizing format helpers

2022-02-21 Thread Thomas Zimmermann
Add a TODO item for optimizing blitting and format-conversion helpers
in DRM and fbdev. There's always demand for faster graphics output.

Signed-off-by: Thomas Zimmermann 
---
 Documentation/gpu/todo.rst | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 7bf7f2111696..7f113c6a02dd 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -241,6 +241,28 @@ Contact: Thomas Zimmermann , Daniel 
Vetter
 
 Level: Advanced
 
+Benchmark and optimize blitting and format-conversion function
+--------------------------------------------------------------
+
+Drawing to display memory quickly is crucial for many applications'
+performance.
+
+On at least x86-64, sys_imageblit() is significantly slower than
+cfb_imageblit(), even though both use the same blitting algorithm and
+the latter is written for I/O memory. It turns out that cfb_imageblit()
+uses movl instructions, while sys_imageblit apparently does not. This
+seems to be a problem with gcc's optimizer. DRM's format-conversion
+helpers might be subject to similar issues.
+
+Benchmark and optimize fbdev's sys_() helpers and DRM's format-conversion
+helpers. In cases that can be further optimized, maybe implement a different
+algorithm. For micro-optimizations, use movl/movq instructions explicitly.
+That might possibly require architecture specific helpers (e.g., storel()
+storeq()).
+
+Contact: Thomas Zimmermann 
+
+Level: Intermediate
 
 drm_framebuffer_funcs and drm_mode_config_funcs.fb_create cleanup
 -----------------------------------------------------------------
-- 
2.35.1



[PATCH v2 1/5] fbdev: Improve performance of sys_fillrect()

2022-02-21 Thread Thomas Zimmermann
Improve the performance of sys_fillrect() by using word-aligned
32/64-bit mov instructions. While the code tried to implement this,
the compiler failed to create fast instructions. The resulting
binary instructions were even slower than cfb_fillrect(), which
uses the same algorithm, but operates on I/O memory.

A microbenchmark measures the average number of CPU cycles
for sys_fillrect() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging). The value
for CFB is given as a reference.

  sys_fillrect(), new:  26586 cycles
  sys_fillrect(), old: 166603 cycles
  cfb_fillrect():   41012 cycles

In the optimized case, sys_fillrect() is now ~6x faster than before
and ~1.5x faster than the CFB implementation.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Sam Ravnborg 
---
 drivers/video/fbdev/core/sysfillrect.c | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/video/fbdev/core/sysfillrect.c 
b/drivers/video/fbdev/core/sysfillrect.c
index 33ee3d34f9d2..bcdcaeae6538 100644
--- a/drivers/video/fbdev/core/sysfillrect.c
+++ b/drivers/video/fbdev/core/sysfillrect.c
@@ -50,19 +50,9 @@ bitfill_aligned(struct fb_info *p, unsigned long *dst, int 
dst_idx,
 
/* Main chunk */
n /= bits;
-   while (n >= 8) {
-   *dst++ = pat;
-   *dst++ = pat;
-   *dst++ = pat;
-   *dst++ = pat;
-   *dst++ = pat;
-   *dst++ = pat;
-   *dst++ = pat;
-   *dst++ = pat;
-   n -= 8;
-   }
-   while (n--)
-   *dst++ = pat;
+   memset_l(dst, pat, n);
+   dst += n;
+
/* Trailing bits */
if (last)
*dst = comp(pat, *dst, last);
-- 
2.35.1
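
The main-chunk change in the sys_fillrect() patch above replaces a
hand-unrolled store loop with a single word-fill call. A user-space
sketch of what that call does is shown here. fill_words() is a
hypothetical stand-in written for illustration; the kernel's
memset_l() may be implemented with architecture-specific
instructions.

```c
#include <stddef.h>

/*
 * Store pat into n consecutive unsigned long words starting at dst.
 * Returns the address just past the last word written, which mirrors
 * the "dst += n" that follows memset_l() in the patch.
 */
unsigned long *fill_words(unsigned long *dst, unsigned long pat, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = pat;

	return dst + n;
}
```

Keeping the fill in one helper leaves the choice of store width and
unrolling to a single place instead of repeating it in every caller.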



[PATCH v2 2/5] fbdev: Improve performance of sys_imageblit()

2022-02-21 Thread Thomas Zimmermann
Improve the performance of sys_imageblit() by manually unrolling
the inner blitting loop and moving some invariants out. The compiler
failed to do this automatically. The resulting binary code was even
slower than the cfb_imageblit() helper, which uses the same algorithm,
but operates on I/O memory.

A microbenchmark measures the average number of CPU cycles
for sys_imageblit() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging). The value
for CFB is given as a reference.

  sys_imageblit(), new: 25934 cycles
  sys_imageblit(), old: 35944 cycles
  cfb_imageblit():  30566 cycles

In the optimized case, sys_imageblit() is now ~30% faster than before
and ~20% faster than cfb_imageblit().

v2:
* move switch out of inner loop (Gerd)
* remove test for alignment of dst1 (Sam)

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
Acked-by: Sam Ravnborg 
---
 drivers/video/fbdev/core/sysimgblt.c | 49 +---
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/video/fbdev/core/sysimgblt.c 
b/drivers/video/fbdev/core/sysimgblt.c
index a4d05b1b17d7..722c327a381b 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -188,23 +188,29 @@ static void fast_imageblit(const struct fb_image *image, 
struct fb_info *p,
 {
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
-   u32 bit_mask, end_mask, eorx, shift;
+   u32 bit_mask, eorx;
const char *s = image->data, *src;
u32 *dst;
-   const u32 *tab = NULL;
+   const u32 *tab;
+   size_t tablen;
+   u32 colortab[16];
int i, j, k;
 
switch (bpp) {
case 8:
tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
+   tablen = 16;
break;
case 16:
tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
+   tablen = 4;
break;
case 32:
-   default:
tab = cfb_tab32;
+   tablen = 2;
break;
+   default:
+   return;
}
 
for (i = ppw-1; i--; ) {
@@ -218,19 +224,40 @@ static void fast_imageblit(const struct fb_image *image, 
struct fb_info *p,
eorx = fgx ^ bgx;
k = image->width/ppw;
 
+   for (i = 0; i < tablen; ++i)
+   colortab[i] = (tab[i] & eorx) ^ bgx;
+
for (i = image->height; i--; ) {
dst = dst1;
-   shift = 8;
src = s;
 
-   for (j = k; j--; ) {
-   shift -= ppw;
-   end_mask = tab[(*src >> shift) & bit_mask];
-   *dst++ = (end_mask & eorx) ^ bgx;
-   if (!shift) {
-   shift = 8;
-   src++;
+   switch (ppw) {
+   case 4: /* 8 bpp */
+   for (j = k; j; j -= 2, ++src) {
+   *dst++ = colortab[(*src >> 4) & bit_mask];
+   *dst++ = colortab[(*src >> 0) & bit_mask];
+   }
+   break;
+   case 2: /* 16 bpp */
+   for (j = k; j; j -= 4, ++src) {
+   *dst++ = colortab[(*src >> 6) & bit_mask];
+   *dst++ = colortab[(*src >> 4) & bit_mask];
+   *dst++ = colortab[(*src >> 2) & bit_mask];
+   *dst++ = colortab[(*src >> 0) & bit_mask];
+   }
+   break;
+   case 1: /* 32 bpp */
+   for (j = k; j; j -= 8, ++src) {
+   *dst++ = colortab[(*src >> 7) & bit_mask];
+   *dst++ = colortab[(*src >> 6) & bit_mask];
+   *dst++ = colortab[(*src >> 5) & bit_mask];
+   *dst++ = colortab[(*src >> 4) & bit_mask];
+   *dst++ = colortab[(*src >> 3) & bit_mask];
+   *dst++ = colortab[(*src >> 2) & bit_mask];
+   *dst++ = colortab[(*src >> 1) & bit_mask];
+   *dst++ = colortab[(*src >> 0) & bit_mask];
}
+   break;
}
dst1 += p->fix.line_length;
s += spitch;
-- 
2.35.1
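
The colortab idea from the sys_imageblit() patch above can be
sketched in user space for the 32 bpp case. The function name
expand_row_32bpp() and its parameters are assumptions made for this
example, and the two-entry mask table is written out locally. Unlike
the patch, the inner loop is not unrolled by hand here; the point is
only that the "(mask & eorx) ^ bg" computation moves out of the pixel
loop.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expand one row of a 1-bit-per-pixel bitmap into 32-bpp pixels.
 * src holds width_bytes bytes; dst must have room for
 * 8 * width_bytes pixels. Bit 7 of each byte is the leftmost pixel.
 */
void expand_row_32bpp(const uint8_t *src, size_t width_bytes,
		      uint32_t fg, uint32_t bg, uint32_t *dst)
{
	/* Mask table for 32 bpp: one entry per possible bit value. */
	static const uint32_t tab32[2] = { 0x00000000u, 0xffffffffu };
	uint32_t colortab[2];
	uint32_t eorx = fg ^ bg;
	size_t i;
	int bit;

	/* Loop invariant: resolve each mask to its final color once. */
	for (i = 0; i < 2; i++)
		colortab[i] = (tab32[i] & eorx) ^ bg;

	for (i = 0; i < width_bytes; i++) {
		for (bit = 7; bit >= 0; bit--)
			*dst++ = colortab[(src[i] >> bit) & 1u];
	}
}
```

After the precomputation, colortab[0] is the background color and
colortab[1] is the foreground color, so each pixel costs one shift,
one mask and one table lookup.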



[PATCH v2 3/5] fbdev: Remove trailing whitespaces from cfbimgblt.c

2022-02-21 Thread Thomas Zimmermann
Fix coding style. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/fbdev/core/cfbimgblt.c | 60 ++--
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/video/fbdev/core/cfbimgblt.c 
b/drivers/video/fbdev/core/cfbimgblt.c
index a2bb276a8b24..01b01a279681 100644
--- a/drivers/video/fbdev/core/cfbimgblt.c
+++ b/drivers/video/fbdev/core/cfbimgblt.c
@@ -16,15 +16,15 @@
  *  must be laid out exactly in the same format as the framebuffer. Yes I know
  *  their are cards with hardware that coverts images of various depths to the
  *  framebuffer depth. But not every card has this. All images must be rounded
- *  up to the nearest byte. For example a bitmap 12 bits wide must be two 
- *  bytes width. 
+ *  up to the nearest byte. For example a bitmap 12 bits wide must be two
+ *  bytes width.
  *
- *  Tony: 
- *  Incorporate mask tables similar to fbcon-cfb*.c in 2.4 API.  This speeds 
+ *  Tony:
+ *  Incorporate mask tables similar to fbcon-cfb*.c in 2.4 API.  This speeds
  *  up the code significantly.
- *  
+ *
  *  Code for depths not multiples of BITS_PER_LONG is still kludgy, which is
- *  still processed a bit at a time.   
+ *  still processed a bit at a time.
  *
  *  Also need to add code to deal with cards endians that are different than
  *  the native cpu endians. I also need to deal with MSB position in the word.
@@ -72,8 +72,8 @@ static const u32 cfb_tab32[] = {
 #define FB_WRITEL fb_writel
 #define FB_READL  fb_readl
 
-static inline void color_imageblit(const struct fb_image *image, 
-  struct fb_info *p, u8 __iomem *dst1, 
+static inline void color_imageblit(const struct fb_image *image,
+  struct fb_info *p, u8 __iomem *dst1,
   u32 start_index,
   u32 pitch_index)
 {
@@ -92,7 +92,7 @@ static inline void color_imageblit(const struct fb_image 
*image,
dst = (u32 __iomem *) dst1;
shift = 0;
val = 0;
-   
+
if (start_index) {
u32 start_mask = ~fb_shifted_pixels_mask_u32(p,
start_index, bswapmask);
@@ -109,8 +109,8 @@ static inline void color_imageblit(const struct fb_image 
*image,
val |= FB_SHIFT_HIGH(p, color, shift ^ bswapmask);
if (shift >= null_bits) {
FB_WRITEL(val, dst++);
-   
-   val = (shift == null_bits) ? 0 : 
+
+   val = (shift == null_bits) ? 0 :
FB_SHIFT_LOW(p, color, 32 - shift);
}
shift += bpp;
@@ -134,9 +134,9 @@ static inline void color_imageblit(const struct fb_image 
*image,
}
 }
 
-static inline void slow_imageblit(const struct fb_image *image, struct fb_info 
*p, 
+static inline void slow_imageblit(const struct fb_image *image, struct fb_info 
*p,
  u8 __iomem *dst1, u32 fgcolor,
- u32 bgcolor, 
+ u32 bgcolor,
  u32 start_index,
  u32 pitch_index)
 {
@@ -172,7 +172,7 @@ static inline void slow_imageblit(const struct fb_image 
*image, struct fb_info *
l--;
color = (*s & (1 << l)) ? fgcolor : bgcolor;
val |= FB_SHIFT_HIGH(p, color, shift ^ bswapmask);
-   
+
/* Did the bitshift spill bits to the next long? */
if (shift >= null_bits) {
FB_WRITEL(val, dst++);
@@ -191,16 +191,16 @@ static inline void slow_imageblit(const struct fb_image 
*image, struct fb_info *
 
FB_WRITEL((FB_READL(dst) & end_mask) | val, dst);
}
-   
+
dst1 += pitch;
-   src += spitch;  
+   src += spitch;
if (pitch_index) {
dst2 += pitch;
dst1 = (u8 __iomem *)((long __force)dst2 & 
~(sizeof(u32) - 1));
start_index += pitch_index;
start_index &= 32 - 1;
}
-   
+
}
 }
 
@@ -212,9 +212,9 @@ static inline void slow_imageblit(const struct fb_image 
*image, struct fb_info *
  *   fix->line_legth is divisible by 4;
  *   beginning and end of a scanline is dword aligned
  */
-static inline void fast_imageblit(const struct fb_image *image, struct fb_info 
*p, 
- u8 __iomem *dst1, u32 fgcolor, 
- u32 bgcolor) 
+static inline void fast_imageblit(const struct fb_image *image, struct fb_info 
*p,
+ 

[PATCH v2 4/5] fbdev: Improve performance of cfb_imageblit()

2022-02-21 Thread Thomas Zimmermann
Improve the performance of cfb_imageblit() by manually unrolling
the inner blitting loop and moving some invariants out. The compiler
failed to do this automatically. This change keeps cfb_imageblit()
in sync with sys_imageblit().

A microbenchmark measures the average number of CPU cycles
for cfb_imageblit() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging).

cfb_imageblit(), new: 15724 cycles
cfb_imageblit(), old: 30566 cycles

In the optimized case, cfb_imageblit() is now ~2x faster than before.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/fbdev/core/cfbimgblt.c | 51 +++-
 1 file changed, 42 insertions(+), 9 deletions(-)

diff --git a/drivers/video/fbdev/core/cfbimgblt.c 
b/drivers/video/fbdev/core/cfbimgblt.c
index 01b01a279681..7361cfabdd85 100644
--- a/drivers/video/fbdev/core/cfbimgblt.c
+++ b/drivers/video/fbdev/core/cfbimgblt.c
@@ -218,23 +218,29 @@ static inline void fast_imageblit(const struct fb_image 
*image, struct fb_info *
 {
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
-   u32 bit_mask, end_mask, eorx, shift;
+   u32 bit_mask, eorx;
const char *s = image->data, *src;
u32 __iomem *dst;
const u32 *tab = NULL;
+   size_t tablen;
+   u32 colortab[16];
int i, j, k;
 
switch (bpp) {
case 8:
tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
+   tablen = 16;
break;
case 16:
tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
+   tablen = 4;
break;
case 32:
-   default:
tab = cfb_tab32;
+   tablen = 2;
break;
+   default:
+   return;
}
 
for (i = ppw-1; i--; ) {
@@ -248,15 +254,42 @@ static inline void fast_imageblit(const struct fb_image 
*image, struct fb_info *
eorx = fgx ^ bgx;
k = image->width/ppw;
 
-   for (i = image->height; i--; ) {
-   dst = (u32 __iomem *) dst1, shift = 8; src = s;
+   for (i = 0; i < tablen; ++i)
+   colortab[i] = (tab[i] & eorx) ^ bgx;
 
-   for (j = k; j--; ) {
-   shift -= ppw;
-   end_mask = tab[(*src >> shift) & bit_mask];
-   FB_WRITEL((end_mask & eorx)^bgx, dst++);
-   if (!shift) { shift = 8; src++; }
+   for (i = image->height; i--; ) {
+   dst = (u32 __iomem *)dst1;
+   src = s;
+
+   switch (ppw) {
+   case 4: /* 8 bpp */
+   for (j = k; j; j -= 2, ++src) {
+   FB_WRITEL(colortab[(*src >> 4) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 0) & bit_mask], 
dst++);
+   }
+   break;
+   case 2: /* 16 bpp */
+   for (j = k; j; j -= 4, ++src) {
+   FB_WRITEL(colortab[(*src >> 6) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 4) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 2) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 0) & bit_mask], 
dst++);
+   }
+   break;
+   case 1: /* 32 bpp */
+   for (j = k; j; j -= 8, ++src) {
+   FB_WRITEL(colortab[(*src >> 7) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 6) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 5) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 4) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 3) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 2) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 1) & bit_mask], 
dst++);
+   FB_WRITEL(colortab[(*src >> 0) & bit_mask], 
dst++);
+   }
+   break;
}
+
dst1 += p->fix.line_length;
s += spitch;
}
-- 
2.35.1



[PATCH v2 0/5] fbdev: Improve performance of fbdev console

2022-02-21 Thread Thomas Zimmermann
Optimize performance of the fbdev console for the common case of
software-based clearing and image blitting.

The commit description of each patch contains results of a simple
microbenchmark. I also tested the full patchset's effect on the
console output by printing directory listings (i7-4790, FullHD,
simpledrm, kernel with debugging).

  > time find /usr/share/doc -type f

In the unoptimized case:

  real0m6.173s
  user0m0.044s
  sys 0m6.107s

With optimizations applied:

  real0m4.754s
  user0m0.044s
  sys 0m4.698s

In the optimized case, printing the directory listing is ~25% faster
than before.

In v2 of the patchset, after implementing Sam's suggestion to update
cfb_imageblit() as well, it turns out that the compiled code in
sys_imageblit() is still significantly slower than the CFB version. A
fix is probably a larger task and would include architecture-specific
changes. A new TODO item suggests investigating the performance of the
various helpers and format-conversion functions in DRM and fbdev.

v2:
* improve readability for sys_imageblit() (Gerd, Sam)
* new TODO item for further optimization

Thomas Zimmermann (5):
  fbdev: Improve performance of sys_fillrect()
  fbdev: Improve performance of sys_imageblit()
  fbdev: Remove trailing whitespaces from cfbimgblt.c
  fbdev: Improve performance of cfb_imageblit()
  drm: Add TODO item for optimizing format helpers

 Documentation/gpu/todo.rst |  22 +
 drivers/video/fbdev/core/cfbimgblt.c   | 107 -
 drivers/video/fbdev/core/sysfillrect.c |  16 +---
 drivers/video/fbdev/core/sysimgblt.c   |  49 ---
 4 files changed, 133 insertions(+), 61 deletions(-)

-- 
2.35.1



Re: Report 2 in ext4 and journal based on v5.17-rc1

2022-02-21 Thread Jan Kara


So I was trying to understand what this report is about for some time but
honestly I have failed...

On Thu 17-02-22 20:10:04, Byungchul Park wrote:
> [9.008161] ===
> [9.008163] DEPT: Circular dependency has been detected.
> [9.008164] 5.17.0-rc1-00015-gb94f67143867-dirty #2 Tainted: GW
> [9.008166] ---
> [9.008167] summary
> [9.008167] ---
> [9.008168] *** DEADLOCK ***
> [9.008168]
> [9.008168] context A
> [9.008169] [S] 
> (unknown)(&(>j_wait_transaction_locked)->dmap:0)
> [9.008171] [W] wait(&(>j_wait_commit)->dmap:0)
> [9.008172] [E] event(&(>j_wait_transaction_locked)->dmap:0)
> [9.008173]
> [9.008173] context B
> [9.008174] [S] down_write(mapping.invalidate_lock:0)
> [9.008175] [W] wait(&(>j_wait_transaction_locked)->dmap:0)
> [9.008176] [E] up_write(mapping.invalidate_lock:0)
> [9.008177]
> [9.008178] context C
> [9.008179] [S] (unknown)(&(>j_wait_commit)->dmap:0)
> [9.008180] [W] down_write(mapping.invalidate_lock:0)
> [9.008181] [E] event(&(>j_wait_commit)->dmap:0)
> [9.008181]
> [9.008182] [S]: start of the event context
> [9.008183] [W]: the wait blocked
> [9.008183] [E]: the event not reachable

So what situation is your tool complaining about here? Can you perhaps show
it here in more common visualization like:

TASK1   TASK2
does foo, grabs Z
does X, grabs lock Y
blocks on Z
blocks on Y

or something like that? Because I was not able to decipher this from the
report even after trying for some time...

Honza



> [9.008184] ---
> [9.008184] context A's detail
> [9.008185] ---
> [9.008186] context A
> [9.008186] [S] 
> (unknown)(&(>j_wait_transaction_locked)->dmap:0)
> [9.008187] [W] wait(&(>j_wait_commit)->dmap:0)
> [9.008188] [E] event(&(>j_wait_transaction_locked)->dmap:0)
> [9.008189]
> [9.008190] [S] (unknown)(&(>j_wait_transaction_locked)->dmap:0):
> [9.008191] (N/A)
> [9.008191]
> [9.008192] [W] wait(&(>j_wait_commit)->dmap:0):
> [9.008193] prepare_to_wait (kernel/sched/wait.c:275) 
> [9.008197] stacktrace:
> [9.008198] __schedule (kernel/sched/sched.h:1318 
> kernel/sched/sched.h:1616 kernel/sched/core.c:6213) 
> [9.008200] schedule (kernel/sched/core.c:6373 (discriminator 1)) 
> [9.008201] kjournald2 (fs/jbd2/journal.c:250) 
> [9.008203] kthread (kernel/kthread.c:377) 
> [9.008206] ret_from_fork (arch/x86/entry/entry_64.S:301) 
> [9.008209]
> [9.008209] [E] event(&(>j_wait_transaction_locked)->dmap:0):
> [9.008210] __wake_up_common (kernel/sched/wait.c:108) 
> [9.008212] stacktrace:
> [9.008213] dept_event (kernel/dependency/dept.c:2337) 
> [9.008215] __wake_up_common (kernel/sched/wait.c:109) 
> [9.008217] __wake_up_common_lock (./include/linux/spinlock.h:428 
> (discriminator 1) kernel/sched/wait.c:141 (discriminator 1)) 
> [9.008218] jbd2_journal_commit_transaction (fs/jbd2/commit.c:583) 
> [9.008221] kjournald2 (fs/jbd2/journal.c:214 (discriminator 3)) 
> [9.008223] kthread (kernel/kthread.c:377) 
> [9.008224] ret_from_fork (arch/x86/entry/entry_64.S:301) 
> [9.008226] ---
> [9.008226] context B's detail
> [9.008227] ---
> [9.008228] context B
> [9.008228] [S] down_write(mapping.invalidate_lock:0)
> [9.008229] [W] wait(&(>j_wait_transaction_locked)->dmap:0)
> [9.008230] [E] up_write(mapping.invalidate_lock:0)
> [9.008231]
> [9.008232] [S] down_write(mapping.invalidate_lock:0):
> [9.008233] ext4_da_write_begin (fs/ext4/truncate.h:21 
> fs/ext4/inode.c:2963) 
> [9.008237] stacktrace:
> [9.008237] down_write (kernel/locking/rwsem.c:1514) 
> [9.008239] ext4_da_write_begin (fs/ext4/truncate.h:21 
> fs/ext4/inode.c:2963) 
> [9.008241] generic_perform_write (mm/filemap.c:3784) 
> [9.008243] ext4_buffered_write_iter (fs/ext4/file.c:269) 
> [9.008245] ext4_file_write_iter (fs/ext4/file.c:677) 
> [9.008247] new_sync_write (fs/read_write.c:504 (discriminator 1)) 
> [9.008250] vfs_write (fs/read_write.c:590) 
> [9.008251] ksys_write (fs/read_write.c:644) 
> [9.008253] do_syscall_64 (arch/x86/entry/common.c:50 
> arch/x86/entry/common.c:80) 
> [9.008255] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) 
> [9.008258]
> [9.008258] [W] wait(&(>j_wait_transaction_locked)->dmap:0):
> [

Re: [PATCH v13 1/5] drm: improve drm_buddy_alloc function

2022-02-21 Thread Christian König
Going to let that sit on the mailing list till tomorrow. If I don't get
any objections by then, I will push it to drm-misc-next.


Thanks,
Christian.

On 21.02.22 at 17:45, Arunpravin wrote:

- Make drm_buddy_alloc a single function to handle
   range allocation and non-range allocation demands

- Implemented a new function alloc_range() which allocates
   the requested power-of-two block while complying with range limitations

- Moved order computation and memory alignment logic from
   i915 driver to drm buddy

v2:
   merged below changes to keep the build unbroken
- drm_buddy_alloc_range() becomes obsolete and may be removed
- enable ttm range allocation (fpfn / lpfn) support in i915 driver
- apply enhanced drm_buddy_alloc() function to i915 driver

v3(Matthew Auld):
   - Fix alignment issues and remove unnecessary list_empty check
   - add more validation checks for input arguments
   - make alloc_range() block allocations as bottom-up
   - optimize order computation logic
   - replace uint64_t with u64, which is preferred in the kernel

v4(Matthew Auld):
   - keep drm_buddy_alloc_range() function implementation for generic
 actual range allocations
   - keep alloc_range() implementation for end bias allocations

v5(Matthew Auld):
   - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
 as place->lpfn will currently always be zero for i915

v6(Matthew Auld):
   - fixup potential uaf - If we are unlucky and can't allocate
 enough memory when splitting blocks, where we temporarily
 end up with the given block and its buddy on the respective
 free list, then we need to ensure we delete both blocks,
 and not just the buddy, before potentially freeing them

   - fix warnings reported by kernel test robot 

v7(Matthew Auld):
   - revert fixup potential uaf
   - keep __alloc_range() add node to the list logic same as
 drm_buddy_alloc_blocks() by having a temporary list variable
   - at drm_buddy_alloc_blocks() keep i915 range_overflows macro
 and add a new check for end variable

v8:
   - fix warnings reported by kernel test robot 

v9(Matthew Auld):
   - remove DRM_BUDDY_RANGE_ALLOCATION flag
   - remove unnecessary function description

v10:
- keep DRM_BUDDY_RANGE_ALLOCATION flag as removing the flag
  and replacing with (end < size) logic fails amdgpu driver load

Signed-off-by: Arunpravin 
Reviewed-by: Matthew Auld 
---
  drivers/gpu/drm/drm_buddy.c   | 292 +-
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
  include/drm/drm_buddy.h   |  13 +-
  4 files changed, 257 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index d60878bc9c20..1d801c88b286 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -282,23 +282,97 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct 
list_head *objects)
  }
  EXPORT_SYMBOL(drm_buddy_free_list);
  
-/**
- * drm_buddy_alloc_blocks - allocate power-of-two blocks
- *
- * @mm: DRM buddy manager to allocate from
- * @order: size of the allocation
- *
- * The order value here translates to:
- *
- * 0 = 2^0 * mm->chunk_size
- * 1 = 2^1 * mm->chunk_size
- * 2 = 2^2 * mm->chunk_size
- *
- * Returns:
- * allocated ptr to the &drm_buddy_block on success
- */
-struct drm_buddy_block *
-drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= s2 && e1 >= e2;
+}
+
+static struct drm_buddy_block *
+alloc_range_bias(struct drm_buddy *mm,
+u64 start, u64 end,
+unsigned int order)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   LIST_HEAD(dfs);
+   int err;
+   int i;
+
+   end = end - 1;
+
+   for (i = 0; i < mm->n_roots; ++i)
+   list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+   do {
+   u64 block_start;
+   u64 block_end;
+
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (drm_buddy_block_order(block) < order)
+   continue;
+
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
+
+   if (!overlaps(start, end, block_start, block_end))
+   continue;
+
+   if (drm_buddy_block_is_allocated(block))
+   continue;
+
+   if (contains(start, end, block_start, block_end) &&
+   order == 
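
For readers following the range checks in the (truncated) hunk above, here is a minimal standalone sketch of the two helpers, assuming inclusive [start, end] ranges as implied by the "end = end - 1" adjustment in alloc_range_bias(). The block_size() helper is hypothetical and only restates the "order n = 2^n * chunk_size" rule from the removed kernel-doc comment; the userspace types stand in for the kernel's u64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Both ranges are inclusive: [s1, e1] and [s2, e2]. */
static inline bool overlaps(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= e2 && e1 >= s2;
}

/* True when [s2, e2] lies entirely inside [s1, e1]. */
static inline bool contains(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= s2 && e1 >= e2;
}

/* Hypothetical helper: size in bytes of a block of the given order. */
static inline uint64_t block_size(uint64_t chunk_size, unsigned int order)
{
	return chunk_size << order;
}
```

With a 4 KiB chunk size, an order-2 block covers 16 KiB, and two adjacent 4 KiB blocks [0, 4095] and [4096, 8191] do not overlap because the end offsets are inclusive.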

Re: [PATCH v6 21/23] drm: rockchip: Add VOP2 driver

2022-02-21 Thread Lucas Stach
Hi Andy,

Am Montag, dem 21.02.2022 um 19:51 +0800 schrieb Andy Yan:
> Hi Sascha:
> 
> On 2/17/22 16:29, Sascha Hauer wrote:
> > From: Andy Yan 
> > 
> > The VOP2 unit is found on Rockchip SoCs beginning with rk3566/rk3568.
> > It replaces the VOP unit found in the older Rockchip SoCs.
> > 
> > This driver has been derived from the downstream Rockchip Kernel and
> > heavily modified:
> > 
> > - All nonstandard DRM properties have been removed
> > - dropped struct vop2_plane_state and pass around less data between
> >functions
> > - Dropped all DRM_FORMAT_* not known on upstream
> > - rework register access to get rid of excessively used macros
> > - Drop all waiting for framesyncs
> 
> All the frame sync waits in the downstream driver are there to fix
> specific problems, and some of them are inherited from the upstream VOP
> driver, for example the fb_unref_work. It was submitted by Tomas Figa to
> the upstream VOP driver to make waiting for flips and asynchronous
> cursor updates work correctly.
> 
> VOP2 shares the same hardware design for vblank, so I don't think this
> code is useless.

We've discussed this quite a bit internally before dropping this custom
frame synchronization. We believe that this is okay to do, as Sascha
also dropped a lot of code that would have required more
synchronization than what the common DRM atomic core code provides.

Fundamentally, a lot of the extra synchronization, also in the
upstream VOP driver, seems like it is working around the fact that the
vblank event might get sent out too early, so references to still
active framebuffers are dropped from the DRM core and userspace is
allowed to make further progress.

With this VOP2 submission, the vblank event is only armed at
atomic_flush time after all the HW programming is in place, which
ensures that even though we still race with the vblank IRQ, we only
lose this race in a predictable way and the only bad thing caused by
the race is a temporary hiccup where a page-flip is deferred for one
more vblank cycle, but all the reference counts are still correct and
nothing gets freed prematurely.

Features not included in this submission, like dynamic assignment of
overlay planes to CRTCs, might need additional frame synchronization,
but at the current state of the driver any additional synchronization
is just cargo-cult and not actually needed.

Regards,
Lucas

> 
> [0] 
> https://patchwork.kernel.org/project/linux-rockchip/patch/1473857701-9250-5-git-send-email-tf...@chromium.org/
> 
> [1] 
> https://patchwork.kernel.org/project/linux-rockchip/patch/1473857701-9250-6-git-send-email-tf...@chromium.org/
> 
> [2] 
> https://patchwork.kernel.org/project/linux-rockchip/patch/1473857701-9250-4-git-send-email-tf...@chromium.org/
> 
> 
> > 
> > The driver is tested with HDMI and MIPI-DSI display on a RK3568-EVB
> > board. Overlay support is tested with the modetest utility. AFBC support
> > on the cluster windows is tested with weston-simple-dmabuf-egl on
> > weston using the (yet to be upstreamed) panfrost driver support.
> > 
> > Signed-off-by: Andy Yan 
> > Signed-off-by: Sascha Hauer 
> > ---
> > 
> > Notes:
> >  Changes since v5:
> >  - consistently use u8/u16/u32 rather than uint8_t/uint16_t/uint32_t
> >  - Use spin_lock rather than spin_lock_irqsave
> >  - replace printk with drm_dbg
> >  - break some overlong lines
> >  
> >  Changes since v4:
> >  - Avoid stack frame overflow by not allocating big array on the stack
> >  
> >  Changes since v3:
> >  - Sort includes
> >  - fix typos
> >  - Drop spinlock
> >  - Use regmap_set_bits()/regmap_clear_bits()
> >  - simplify vop2_scale_factor()
> >  - simplify vop2_afbc_transform_offset()
> >  
> >  Changes since v4:
> >  - Sort nodes alphabetically
> >  
> >  Changes since v3:
> >  - Fix HDMI connector type
> > 
> >   drivers/gpu/drm/rockchip/Kconfig |6 +
> >   drivers/gpu/drm/rockchip/Makefile|1 +
> >   drivers/gpu/drm/rockchip/rockchip_drm_drv.c  |1 +
> >   drivers/gpu/drm/rockchip/rockchip_drm_drv.h  |6 +-
> >   drivers/gpu/drm/rockchip/rockchip_drm_fb.c   |2 +
> >   drivers/gpu/drm/rockchip/rockchip_drm_vop.h  |   15 +
> >   drivers/gpu/drm/rockchip/rockchip_drm_vop2.c | 2708 ++
> >   drivers/gpu/drm/rockchip/rockchip_drm_vop2.h |  477 +++
> >   drivers/gpu/drm/rockchip/rockchip_vop2_reg.c |  281 ++
> >   9 files changed, 3496 insertions(+), 1 deletion(-)
> >   create mode 100644 drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
> >   create mode 100644 drivers/gpu/drm/rockchip/rockchip_drm_vop2.h
> >   create mode 100644 drivers/gpu/drm/rockchip/rockchip_vop2_reg.c
> > 
> > diff --git a/drivers/gpu/drm/rockchip/Kconfig 
> > b/drivers/gpu/drm/rockchip/Kconfig
> > index b9b156308460a..4ff0043f0ee70 100644
> > --- a/drivers/gpu/drm/rockchip/Kconfig
> > +++ b/drivers/gpu/drm/rockchip/Kconfig
> > @@ -28,6 +28,12 @@ config ROCKCHIP_VOP
> >   This 

Re: [PATCH v12 1/5] drm: improve drm_buddy_alloc function

2022-02-21 Thread Arunpravin



On 16/02/22 1:37 pm, Arunpravin wrote:
> 
> 
> On 14/02/22 2:42 pm, Christian König wrote:
>>
>>
>> Am 14.02.22 um 09:36 schrieb Matthew Auld:
>>> On Mon, 14 Feb 2022 at 06:32, Christian König
>>>  wrote:
 Am 13.02.22 um 09:52 schrieb Arunpravin:
> - Make drm_buddy_alloc a single function to handle
> range allocation and non-range allocation demands
>
> - Implemented a new function alloc_range() which allocates
> the requested power-of-two block comply with range limitations
>
> - Moved order computation and memory alignment logic from
> i915 driver to drm buddy
>
> v2:
> merged below changes to keep the build unbroken
>  - drm_buddy_alloc_range() becomes obsolete and may be removed
>  - enable ttm range allocation (fpfn / lpfn) support in i915 driver
>  - apply enhanced drm_buddy_alloc() function to i915 driver
>
> v3(Matthew Auld):
> - Fix alignment issues and remove unnecessary list_empty check
> - add more validation checks for input arguments
> - make alloc_range() block allocations as bottom-up
> - optimize order computation logic
> - replace uint64_t with u64, which is preferred in the kernel
>
> v4(Matthew Auld):
> - keep drm_buddy_alloc_range() function implementation for generic
>   actual range allocations
> - keep alloc_range() implementation for end bias allocations
>
> v5(Matthew Auld):
> - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
>   as place->lpfn will currently always be zero for i915
>
> v6(Matthew Auld):
> - fixup potential uaf - If we are unlucky and can't allocate
>   enough memory when splitting blocks, where we temporarily
>   end up with the given block and its buddy on the respective
>   free list, then we need to ensure we delete both blocks,
>   and not just the buddy, before potentially freeing them
>
> - fix warnings reported by kernel test robot 
>
> v7(Matthew Auld):
> - revert fixup potential uaf
> - keep __alloc_range() add node to the list logic same as
>   drm_buddy_alloc_blocks() by having a temporary list variable
> - at drm_buddy_alloc_blocks() keep i915 range_overflows macro
>   and add a new check for end variable
>
> v8:
> - fix warnings reported by kernel test robot 
>
> v9(Matthew Auld):
> - remove DRM_BUDDY_RANGE_ALLOCATION flag
> - remove unnecessary function description
>
> Signed-off-by: Arunpravin 
> Reviewed-by: Matthew Auld 
 As long as nobody objects I'm going to push patches 1-3 to drm-misc-next
 in the next hour or so:
>>> As part of this could you also push
>>> https://patchwork.freedesktop.org/series/99842/
>>>  ?
>>
>> Sure, but Arun said in our internal chat that I should wait with that 
>> anyway since he wanted to sort out one more issue.
>>
>> Christian.
>>
> 
> working on 2 issues,
> 1. I think we need to keep DRM_BUDDY_RANGE_ALLOCATION flag, some corner
> case didnt allow amdgpu driver load
> 
> 2. rebasing the existing amdgpu_vram_mgr.c and resolving all conflicts
> as there are many changes merged in with the below patch
> - drm/amdgpu: remove VRAM accounting v2

Hi Christian,
I sent the v13 patches, selftest cases are passed.

Thanks,
Arun
>>>
 Then going to take a deeper look into patches 4 and 5 to get them reviewed.

 Thanks,
 Christian.

> ---
>drivers/gpu/drm/drm_buddy.c   | 292 +-
>drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  63 ++--
>drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
>include/drm/drm_buddy.h   |  11 +-
>4 files changed, 250 insertions(+), 118 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index d60878bc9c20..e0c0d786a572 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -282,23 +282,97 @@ void drm_buddy_free_list(struct drm_buddy *mm, 
> struct list_head *objects)
>}
>EXPORT_SYMBOL(drm_buddy_free_list);
>
> -/**
> - * drm_buddy_alloc_blocks - allocate power-of-two blocks
> - *
> - * @mm: DRM buddy manager to allocate from
> - * @order: size of the allocation
> - *
> - * The order value here translates to:
> - *
> - * 0 = 2^0 * mm->chunk_size
> - * 1 = 2^1 * mm->chunk_size
> - * 2 = 2^2 * 

[PATCH] drm/amdgpu: fix printk format for size_t variable

2022-02-21 Thread trix
From: Tom Rix 

On mips64 allyesconfig, there is this build break
amdgpu_discovery.c:671:35: error: format '%ld' expects
  argument of type 'long int', but argument 4 has
  type 'size_t' {aka 'unsigned int'}
  DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);

For size_t, use %zu.

Fixes: a6c40b178092 ("drm/amdgpu: Show IP discovery in sysfs")
Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 7c7e28fd912e..58238f67b1d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -668,7 +668,7 @@ static int amdgpu_discovery_sysfs_ips(struct amdgpu_device 
*adev,
le16_to_cpu(ip->hw_id) != ii)
goto next_ip;
 
-   DRM_DEBUG("match:%d @ ip_offset:%ld", ii, ip_offset);
+   DRM_DEBUG("match:%d @ ip_offset:%zu", ii, ip_offset);
 
/* We have a hw_id match; register the hw
 * block if not yet registered.
-- 
2.26.3
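
As a standalone illustration of the fix, here is a minimal userspace sketch, assuming a hypothetical format_match() helper that reuses the format string from the patch. The point is only that %zu matches size_t on every ABI, whereas %ld is wrong wherever size_t is not a long (such as the build reported above).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of amdgpu: formats the same message as
 * the DRM_DEBUG() call in the patch, using %zu for the size_t argument.
 */
static int format_match(char *buf, size_t len, int ii, size_t ip_offset)
{
	return snprintf(buf, len, "match:%d @ ip_offset:%zu", ii, ip_offset);
}
```

Compiling a %ld variant of this with -Wformat on a target where size_t is unsigned int should reproduce the same class of diagnostic.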



Re: [PATCH v3 8/9] drm/tegra: vic: Implement get_streamid_offset

2022-02-21 Thread Mikko Perttunen

On 2/21/22 19:27, Robin Murphy wrote:

On 2022-02-18 11:39, Mikko Perttunen via iommu wrote:

Implement the get_streamid_offset required for supporting context
isolation. Since old firmware cannot support context isolation
without hacks that we don't want to implement, check the firmware
binary to see if context isolation should be enabled.

Signed-off-by: Mikko Perttunen 
---
  drivers/gpu/drm/tegra/vic.c | 38 +
  1 file changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index 1e342fa3d27b..2863ee5e0e67 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -38,6 +38,8 @@ struct vic {
  struct clk *clk;
  struct reset_control *rst;
+    bool can_use_context;
+
  /* Platform configuration */
  const struct vic_config *config;
  };
@@ -229,6 +231,7 @@ static int vic_load_firmware(struct vic *vic)
  {
  struct host1x_client *client = &vic->client.base;
  struct tegra_drm *tegra = vic->client.drm;
+    u32 fce_bin_data_offset;
  dma_addr_t iova;
  size_t size;
  void *virt;
@@ -277,6 +280,25 @@ static int vic_load_firmware(struct vic *vic)
  vic->falcon.firmware.phys = phys;
  }
+    /*
+ * Check if firmware is new enough to not require mapping firmware
+ * to data buffer domains.
+ */
+    fce_bin_data_offset = *(u32 *)(virt + VIC_UCODE_FCE_DATA_OFFSET);
+
+    if (!vic->config->supports_sid) {
+    vic->can_use_context = false;
+    } else if (fce_bin_data_offset != 0x0 && fce_bin_data_offset != 
0xa5a5a5a5) {

+    /*
+ * Firmware will access FCE through STREAMID0, so context
+ * isolation cannot be used.
+ */
+    vic->can_use_context = false;
+    dev_warn_once(vic->dev, "context isolation disabled due to 
old firmware\n");

+    } else {
+    vic->can_use_context = true;
+    }
+
  return 0;
  cleanup:
@@ -358,10 +380,26 @@ static void vic_close_channel(struct 
tegra_drm_context *context)

  host1x_channel_put(context->channel);
  }
+static int vic_get_streamid_offset(struct tegra_drm_client *client)
+{
+    struct vic *vic = to_vic(client);
+    int err;
+
+    err = vic_load_firmware(vic);
+    if (err < 0)
+    return err;
+
+    if (vic->can_use_context)
+    return 0x30;
+    else
+    return -ENOTSUPP;
+}
+
  static const struct tegra_drm_client_ops vic_ops = {
  .open_channel = vic_open_channel,
  .close_channel = vic_close_channel,
  .submit = tegra_drm_submit,
+    .get_streamid_offset = vic_get_streamid_offset,


The patch order seems off here, since the .get_streamid_offset member 
isn't defined yet.


Robin.


Indeed, will fix.

Thanks,
Mikko




  };
  #define NVIDIA_TEGRA_124_VIC_FIRMWARE "nvidia/tegra124/vic03_ucode.bin"




Re: [PATCH v3 8/9] drm/tegra: vic: Implement get_streamid_offset

2022-02-21 Thread Robin Murphy

On 2022-02-18 11:39, Mikko Perttunen via iommu wrote:

Implement the get_streamid_offset required for supporting context
isolation. Since old firmware cannot support context isolation
without hacks that we don't want to implement, check the firmware
binary to see if context isolation should be enabled.

Signed-off-by: Mikko Perttunen 
---
  drivers/gpu/drm/tegra/vic.c | 38 +
  1 file changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index 1e342fa3d27b..2863ee5e0e67 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -38,6 +38,8 @@ struct vic {
struct clk *clk;
struct reset_control *rst;
  
+	bool can_use_context;

+
/* Platform configuration */
const struct vic_config *config;
  };
@@ -229,6 +231,7 @@ static int vic_load_firmware(struct vic *vic)
  {
struct host1x_client *client = &vic->client.base;
struct tegra_drm *tegra = vic->client.drm;
+   u32 fce_bin_data_offset;
dma_addr_t iova;
size_t size;
void *virt;
@@ -277,6 +280,25 @@ static int vic_load_firmware(struct vic *vic)
vic->falcon.firmware.phys = phys;
}
  
+	/*

+* Check if firmware is new enough to not require mapping firmware
+* to data buffer domains.
+*/
+   fce_bin_data_offset = *(u32 *)(virt + VIC_UCODE_FCE_DATA_OFFSET);
+
+   if (!vic->config->supports_sid) {
+   vic->can_use_context = false;
+   } else if (fce_bin_data_offset != 0x0 && fce_bin_data_offset != 
0xa5a5a5a5) {
+   /*
+* Firmware will access FCE through STREAMID0, so context
+* isolation cannot be used.
+*/
+   vic->can_use_context = false;
+   dev_warn_once(vic->dev, "context isolation disabled due to old 
firmware\n");
+   } else {
+   vic->can_use_context = true;
+   }
+
return 0;
  
  cleanup:

@@ -358,10 +380,26 @@ static void vic_close_channel(struct tegra_drm_context 
*context)
host1x_channel_put(context->channel);
  }
  
+static int vic_get_streamid_offset(struct tegra_drm_client *client)

+{
+   struct vic *vic = to_vic(client);
+   int err;
+
+   err = vic_load_firmware(vic);
+   if (err < 0)
+   return err;
+
+   if (vic->can_use_context)
+   return 0x30;
+   else
+   return -ENOTSUPP;
+}
+
  static const struct tegra_drm_client_ops vic_ops = {
.open_channel = vic_open_channel,
.close_channel = vic_close_channel,
.submit = tegra_drm_submit,
+   .get_streamid_offset = vic_get_streamid_offset,


The patch order seems off here, since the .get_streamid_offset member 
isn't defined yet.


Robin.


  };
  
  #define NVIDIA_TEGRA_124_VIC_FIRMWARE "nvidia/tegra124/vic03_ucode.bin"
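
To make the decision in the vic_load_firmware() hunk easier to follow, here is a minimal sketch of the same three-way check as a pure function. The function name and the macro name are hypothetical; the two "no FCE data" values (0x0 and 0xa5a5a5a5) and the supports_sid condition are taken from the patch as quoted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the fill pattern compared against in the patch. */
#define VIC_FCE_OFFSET_FILL 0xa5a5a5a5u

/*
 * Hypothetical helper: returns true when context isolation can be used,
 * following the same order of checks as the quoted hunk.
 */
static bool vic_fw_can_use_context(bool supports_sid, uint32_t fce_bin_data_offset)
{
	if (!supports_sid)
		return false;

	/* Any other offset is treated as old firmware that uses STREAMID0. */
	if (fce_bin_data_offset != 0x0 &&
	    fce_bin_data_offset != VIC_FCE_OFFSET_FILL)
		return false;

	return true;
}
```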


Re: [PATCH] drm: rcar-du: Simplify division/shift logic

2022-02-21 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Mon, Feb 21, 2022 at 05:26:15PM +0100, Geert Uytterhoeven wrote:
> "a / (1 << b)" == "a >> b".
> 
> No change in generated code.

If there's no change in generated code, isn't the current code more
readable ? :-)

> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/gpu/drm/rcar-du/rcar_lvds.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rcar-du/rcar_lvds.c 
> b/drivers/gpu/drm/rcar-du/rcar_lvds.c
> index 72a272cfc11ee129..30afc1d3482a9670 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_lvds.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_lvds.c
> @@ -229,7 +229,7 @@ static void rcar_lvds_d3_e3_pll_calc(struct rcar_lvds 
> *lvds, struct clk *clk,
>* the PLL, followed by a an optional fixed /7
>* divider.
>*/
> - fout = fvco / (1 << e) / div7;
> + fout = (fvco >> e) / div7;
>   div = max(1UL, DIV_ROUND_CLOSEST(fout, target));
>   diff = abs(fout / div - target);
>  
> @@ -249,7 +249,7 @@ static void rcar_lvds_d3_e3_pll_calc(struct rcar_lvds 
> *lvds, struct clk *clk,
>   }
>  
>  done:
> - output = fin * pll->pll_n / pll->pll_m / (1 << pll->pll_e)
> + output = (fin * pll->pll_n / pll->pll_m >> pll->pll_e)
>  / div7 / pll->div;
>   error = (long)(output - target) * 1 / (long)target;
>  

-- 
Regards,

Laurent Pinchart
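
For reference, a minimal sketch of the identity quoted in the commit message, assuming unsigned operands and a shift count smaller than the type width. The helper names are hypothetical and not taken from rcar_lvds.c.

```c
#include <stdint.h>

/* "a / (1 << b)" form; b must be smaller than 64. */
static inline uint64_t div_by_pow2(uint64_t a, unsigned int b)
{
	return a / (UINT64_C(1) << b);
}

/* "a >> b" form; b must be smaller than 64. */
static inline uint64_t shift_right(uint64_t a, unsigned int b)
{
	return a >> b;
}
```

The two forms agree for unsigned values. For negative signed values they can differ, since division truncates toward zero while a right shift of a negative value is implementation-defined in C.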


[PATCH 2/2] drm: rcar-du: Don't restart group when enabling plane on Gen3

2022-02-21 Thread Laurent Pinchart
On Gen3 hardware enabling a VSP plane doesn't change any register that
requires DRES to take effect. Avoid a group restart in that case.

Signed-off-by: Laurent Pinchart 
---
 drivers/gpu/drm/rcar-du/rcar_du_plane.c | 6 ++
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c   | 9 -
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_plane.c 
b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
index 9b058d6cb032..22aeeb1cc1fb 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_plane.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
@@ -560,6 +560,12 @@ void __rcar_du_plane_setup(struct rcar_du_group *rgrp,
if (rcdu->vspd1_sink != vspd1_sink) {
rcdu->vspd1_sink = vspd1_sink;
rcar_du_set_dpad0_vsp1_routing(rcdu);
+
+   /*
+* Changes to the VSP1 sink take effect on DRES and thus
+* need a restart of the group.
+*/
+   rgrp->need_restart = true;
}
}
 }
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c 
b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
index b7fc5b069cbc..32530d698e75 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
@@ -84,15 +84,6 @@ void rcar_du_vsp_enable(struct rcar_du_crtc *crtc)
 
__rcar_du_plane_setup(crtc->group, );
 
-   /*
-* Ensure that the plane source configuration takes effect by requesting
-* a restart of the group. See rcar_du_plane_atomic_update() for a more
-* detailed explanation.
-*
-* TODO: Check whether this is still needed on Gen3.
-*/
-   crtc->group->need_restart = true;
-
vsp1_du_setup_lif(crtc->vsp->vsp, crtc->vsp_pipe, );
 }
 
-- 
Regards,

Laurent Pinchart
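
The behavioural change in __rcar_du_plane_setup() can be modelled in a few lines. This is a toy sketch only: the struct and function names are hypothetical, and it folds the device-level vspd1_sink and the group-level need_restart flag into one structure for brevity. It shows that a restart is requested only when the sink actually changes.

```c
#include <stdbool.h>

/* Toy model, not the driver's data structures. */
struct du_group_model {
	unsigned int vspd1_sink;
	bool need_restart;
};

/* Request a restart only if the VSP1 sink selection changes. */
static void model_set_vspd1_sink(struct du_group_model *g, unsigned int sink)
{
	if (g->vspd1_sink != sink) {
		g->vspd1_sink = sink;
		g->need_restart = true;
	}
}
```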



[PATCH 1/2] drm: rcar-du: Don't select VSP1 sink on Gen3

2022-02-21 Thread Laurent Pinchart
The VSP1 sink selection through register DEFR8 is only available on Gen2
hardware. Skip it on Gen3.

Signed-off-by: Laurent Pinchart 
---
 drivers/gpu/drm/rcar-du/rcar_du_plane.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_plane.c 
b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
index 862197be1e01..9b058d6cb032 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_plane.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
@@ -549,8 +549,10 @@ void __rcar_du_plane_setup(struct rcar_du_group *rgrp,
rcar_du_plane_setup_format(rgrp, (state->hwindex + 1) % 8,
   state);
 
-   if (rcdu->info->gen < 3)
-   rcar_du_plane_setup_scanout(rgrp, state);
+   if (rcdu->info->gen >= 3)
+   return;
+
+   rcar_du_plane_setup_scanout(rgrp, state);
 
if (state->source == RCAR_DU_PLANE_VSPD1) {
unsigned int vspd1_sink = rgrp->index ? 2 : 0;
-- 
Regards,

Laurent Pinchart



[PATCH 0/2] drm: rcar-du: Avoid flicker when enabling a VSP plane

2022-02-21 Thread Laurent Pinchart
Hello,

This patch series avoids flicker in some scenarios related to dual
output configuration.

The issue was originally reported by Michael Rodin in [1]. The problem
is described in detail there, and copied here to facilitate discussion:


Restarting a display unit group can cause a visible flicker on the display.
Particularly when a LVDS display is connected to a Salvator board and an
HDMI display is (re)connected, then there will be 2 visible flickers on the
LVDS display:

 1. during atomic_flush (The need_restart flag is set in this case by
rcar_du_vsp_enable.):
  rcar_du_crtc_atomic_flush
rcar_du_crtc_update_planes
  ...
  ...
  /* Restart the group if plane sources have changed. */
  if (rcrtc->group->need_restart)
  rcar_du_group_restart(rcrtc->group);
 2. during atomic_enable:
  rcar_du_crtc_atomic_enable
rcar_du_crtc_start
  rcar_du_group_start_stop(rcrtc->group, true);

To avoid flickers in all use cases, do not restart DU groups on the Gen3
SoCs at all, since it is not required any more.


The proposed patch unfortunately introduced a regression. This series
fixes the issue in the first scenario described above. The second
scenario still leads to flicker, and I don't think that can be fixed as
the hardware requires the whole group of outputs to be stopped for some
register changes to take effect.

[1] 
https://lore.kernel.org/dri-devel/1637680811-90510-1-git-send-email-mro...@de.adit-jv.com

Laurent Pinchart (2):
  drm: rcar-du: Don't select VSP1 sink on Gen3
  drm: rcar-du: Don't restart group when enabling plane on Gen3

 drivers/gpu/drm/rcar-du/rcar_du_plane.c | 12 ++--
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c   |  9 -
 2 files changed, 10 insertions(+), 11 deletions(-)

-- 
Regards,

Laurent Pinchart



Re: [PATCH] drm/sched: Add device pointer to drm_gpu_scheduler

2022-02-21 Thread kernel test robot
Hi Jiawei,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm/drm-next]
[also build test ERROR on drm-intel/for-linux-next drm-exynos/exynos-drm-next 
tegra-drm/drm/tegra/for-next v5.17-rc5 next-20220217]
[cannot apply to drm-tip/drm-tip]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Jiawei-Gu/drm-sched-Add-device-pointer-to-drm_gpu_scheduler/20220221-175818
base:   git://anongit.freedesktop.org/drm/drm drm-next
config: hexagon-allmodconfig 
(https://download.01.org/0day-ci/archive/20220222/202202220108.kzxhno9i-...@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 
d271fc04d5b97b12e6b797c6067d3c96a8d7470e)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/9fdafca855faca0a3b8f213f024985c4112fa0bb
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Jiawei-Gu/drm-sched-Add-device-pointer-to-drm_gpu_scheduler/20220221-175818
git checkout 9fdafca855faca0a3b8f213f024985c4112fa0bb
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=hexagon SHELL=/bin/bash drivers/gpu/drm/msm/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/msm/msm_ringbuffer.c:92:41: error: too few arguments to 
>> function call, expected 9, have 8
   NULL, NULL, to_msm_bo(ring->bo)->name);
^
   include/drm/gpu_scheduler.h:463:5: note: 'drm_sched_init' declared here
   int drm_sched_init(struct drm_gpu_scheduler *sched,
   ^
   1 error generated.


vim +92 drivers/gpu/drm/msm/msm_ringbuffer.c

1d8a5ca436ee4a Rob Clark 2021-07-27   47  
f97decac5f4c2d Jordan Crouse 2017-10-20   48  struct msm_ringbuffer 
*msm_ringbuffer_new(struct msm_gpu *gpu, int id,
f97decac5f4c2d Jordan Crouse 2017-10-20   49void *memptrs, uint64_t 
memptrs_iova)
7198e6b03155f6 Rob Clark 2013-07-19   50  {
7198e6b03155f6 Rob Clark 2013-07-19   51struct msm_ringbuffer *ring;
1d8a5ca436ee4a Rob Clark 2021-07-27   52long sched_timeout;
f97decac5f4c2d Jordan Crouse 2017-10-20   53char name[32];
7198e6b03155f6 Rob Clark 2013-07-19   54int ret;
7198e6b03155f6 Rob Clark 2013-07-19   55  
f97decac5f4c2d Jordan Crouse 2017-10-20   56/* We assume everwhere that 
MSM_GPU_RINGBUFFER_SZ is a power of 2 */
f97decac5f4c2d Jordan Crouse 2017-10-20   57
BUILD_BUG_ON(!is_power_of_2(MSM_GPU_RINGBUFFER_SZ));
7198e6b03155f6 Rob Clark 2013-07-19   58  
7198e6b03155f6 Rob Clark 2013-07-19   59ring = kzalloc(sizeof(*ring), 
GFP_KERNEL);
7198e6b03155f6 Rob Clark 2013-07-19   60if (!ring) {
7198e6b03155f6 Rob Clark 2013-07-19   61ret = -ENOMEM;
7198e6b03155f6 Rob Clark 2013-07-19   62goto fail;
7198e6b03155f6 Rob Clark 2013-07-19   63}
7198e6b03155f6 Rob Clark 2013-07-19   64  
7198e6b03155f6 Rob Clark 2013-07-19   65ring->gpu = gpu;
f97decac5f4c2d Jordan Crouse 2017-10-20   66ring->id = id;
84c6127580c1ce Jordan Crouse 2018-11-07   67  
f97decac5f4c2d Jordan Crouse 2017-10-20   68ring->start = 
msm_gem_kernel_new(gpu->dev, MSM_GPU_RINGBUFFER_SZ,
604234f33658cd Jordan Crouse 2020-09-03   69check_apriv(gpu, 
MSM_BO_WC | MSM_BO_GPU_READONLY),
604234f33658cd Jordan Crouse 2020-09-03   70gpu->aspace, &ring->bo, 
&ring->iova);
8223286d62e296 Jordan Crouse 2017-07-27   71  
69a834c28fb514 Rob Clark 2016-05-24   72if (IS_ERR(ring->start)) {
69a834c28fb514 Rob Clark 2016-05-24   73ret = 
PTR_ERR(ring->start);
375f9a63a66bae Rob Clark 2021-07-27   74ring->start = NULL;
69a834c28fb514 Rob Clark 2016-05-24   75goto fail;
69a834c28fb514 Rob Clark 2016-05-24   76}
0815d7749a6852 Jordan Crouse 2018-11-07   77  
0815d7749a6852 Jordan Crouse 2018-11-07   78
msm_gem_object_set_name(ring->bo, "ring%d", id);
0815d7749a6852 Jordan Crouse 2018-11-07   79  
f97decac5f4c2d Jordan Crouse 2017-10-20   80ring->end   = ring->start + 
(MSM_GPU_RINGBUFFER_SZ >> 2);
4c7085a5d581a5 Jordan Crouse 2017-10-20   81ring->next  = ring->start;
7198e6b03155f6 Rob Clark 2013-07-19   82ring->cur   = ring->start;
7198e6b03155f6 Rob Clark 2013-07-19   83  
f97decac5f4c2d Jordan Crouse 2017-10-20   84ring->memptrs
