[AMD Official Use Only - General]
Thank you Lyude!
I just have a tentative patch set to deal with this. Would like to give a test
on some platforms first.
If the result is positive, will send out for review immediately.
Thanks for tracking on this : )
Regards,
Wayne Lin
> -Original Message-
> From: Lyude Paul
> Sent: Thursday, April 21, 2022 7:01 AM
> To: Lin, Wayne
> Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] WIP: drm/dp_mst: Add support for dumping topology ref
> histories from debugfs
>
> Hey! Figured I'd check if there's been any status updates here since it's been
> a while, just to make sure I haven't dropped this issue from my radar. No
> problem if you're busy :)
>
> On Wed, 2022-03-16 at 10:46 +, Lin, Wayne wrote:
> > [Public]
> >
> > > -Original Message-
> > > From: Lyude Paul
> > > Sent: Wednesday, March 16, 2022 8:48 AM
> > > To: Lin, Wayne
> > > Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> > > Subject: Re: [PATCH] WIP: drm/dp_mst: Add support for dumping topology ref
> > > histories from debugfs
> > >
> > > (Adding this back to the dri-devel mailing list since I didn't notice it
> > > got
> > > dropped from there)
> > >
> > > Hm, some comments on this issue down below. Sorry for the delayed
> > > response, I
> > > was going to try this right after I finished the MST legacy removal but
> > > that's
> > > ending up taking longer than I hoped.
> > >
> > > On Tue, 2022-03-08 at 01:46 +, Lin, Wayne wrote:
> > > > [AMD Official Use Only]
> > > >
> > > > Oops...
> > > > I didn't notice that I replied to wrong mail previously : P
> > > > Sorry for the confusion and reply to the correct thread.
> > > >
> > > > Good day!
> > > > I probably know why you can't reproduce the issue there. Please refer to
> > > > the
> > > > attached
> > > > patch file which makes me easier to explain this.
> > > >
> > > > In dm_dp_mst_get_modes(), we will create a new dc_sink by calling
> > > > dc_link_add_remote_sink() and add that dc_sink into the array dc_link-
> > > > > remote_sinks[].
> > > > However, if we find that aconnector->dc_sink is not null, then we won't
> > > > create a new
> > > > dc_sink for that aconnector and won't add the sink to the array dc_link-
> > > > > remote_sinks[].
> > > >
> > > > When the issue that I mentioned before occurs, we won't be able to
> > > > release
> > > > the dc_sink
> > > > for the aconnector when we unplug the sst monitor. Hence, we can't
> > > > create
> > > > new dc_sink
> > > > for the latter on connected new stream sink of that port. Which leads to
> > > > two
> > > > issues
> > > > 1. Unplug monitor and plug in another monitor "to the same port"
> > > > => Since we use the same dc_sink to reflect the capability of the new
> > > > connected stream
> > > > sink, we might send out inappropriate stream to the new connected sink.
> > > > 2. Because we can't release dc_sink in the array dc_link-
> > > > >remote_sinks[],
> > > > when we
> > > > plug/unplug sst monitor to more than 4 different mst port, we will break
> > > > the
> > > > criteria
> > > > "dc_link->sink_count >= MAX_SINKS_PER_LINK" in
> > > > link_add_remote_sink_helper().
> > >
> > > Ok, a lot of this seems to be happening in amdgpu which certainly explains
> > > why
> > > I had so much trouble following along with this originally (also, having
> > > learned a bit more about how DC works definitely has helped a bit). I
> > > already
> > > see a bunch of issues though with how this code is structured that would
> > > definitely break things, though note I haven't sat down and tried this on
> > > a
> > > real machine yet - will try tomorrow.
> > >
> > > So - it seems like dc_sink == a bunch of hotplugging state for a dm
> > > connector,
> > > which actually tells me for one that this is definitely the wrong spot for
> > > this code. get_modes() really should just handle filling out DRM modes and
> > > pretty much nothing else, because callers from DRM aren't going to expect
> > > side-effects like this when get_modes() is called - which could
> > > potentially
> > > lead to all sorts of weirdness down the line if the DRM call paths that
> > > use
> > > this ever change. i915 and nouveau have good examples of what these
> > > functions
> > > should typically look like, and amdgpu actually seems to mostly follow
> > > this
> > > advice for it's other get_modes() callback.
> > >
> > > Note there's also another problem here: the assumption "no EDID ==
> > > disconnected" isn't correct. It's totally possible to run into EDID-less
> > > displays if the display is some ancient pre-historic relic, or if the ROM
> > > (or
> > > EEPROM? can't remember what type of chip computer displays tend to use...)
> > > chip
> > > in the monitor containing the EDID has died. Note that the second
> > > situation is
> > > suprisingly more common then one might think! I ran into a 140Hz 4k ASUS
> > > display where this happened, and I