On Mon, Jul 07, 2025 at 12:13:19PM +0200, Luca Ceresoli wrote: > Hi Maxime, > > ouch, e-mail sent by mistake unfinished and without proof-reading... > well, let me continue it below. > > On Mon, 7 Jul 2025 11:58:53 +0200 > Luca Ceresoli <luca.ceres...@bootlin.com> wrote: > > > On Mon, 7 Jul 2025 08:16:49 +0200 > > Maxime Ripard <mrip...@kernel.org> wrote: > > > > > Hi Luca, > > > > > > On Wed, Jun 25, 2025 at 06:45:04PM +0200, Luca Ceresoli wrote: > > > > This series is the first attempt at avoiding DSI host drivers to have > > > > pointers to DSI devices (struct mipi_dsi_device), as discussed during > > > > the > > > > Linux Plumbers Conference 2024 with Maxime and Dmitry. > > > > > > > > It is working, but I consider this a draft in order to discuss and > > > > challenge the proposed approach. > > > > > > > > Overall work > > > > ============ > > > > > > > > This is part of the work towards removal of bridges from a still > > > > existing > > > > DRM pipeline without use-after-free. The grand plan as discussed in [1]. > > > > Here's the work breakdown (➜ marks the current series): > > > > > > > > 1. … add refcounting to DRM bridges (struct drm_bridge) > > > > (based on devm_drm_bridge_alloc() [0]) > > > > A. ✔ add new alloc API and refcounting (in v6.16-rc1) > > > > B. ✔ convert all bridge drivers to new API (now in drm-misc-next) > > > > C. ✔ kunit tests (now in drm-misc-next) > > > > D. … add get/put to drm_bridge_add/remove() + attach/detach() > > > > and warn on old allocation pattern (under review) > > > > E. … add get/put on drm_bridge accessors > > > > 1. … drm_bridge_chain_get_first_bridge() + add a cleanup action > > > > 2. … drm_bridge_chain_get_last_bridge() > > > > 3. drm_bridge_get_prev_bridge() > > > > 4. drm_bridge_get_next_bridge() > > > > 5. drm_for_each_bridge_in_chain() > > > > 6. drm_bridge_connector_init > > > > 7. of_drm_find_bridge > > > > 8. drm_of_find_panel_or_bridge, *_of_get_bridge > > > > F. debugfs improvements > > > > 2. handle gracefully atomic updates during bridge removal > > > > 3. ➜ avoid DSI host drivers to have dangling pointers to DSI devices > > > > (this series) > > > > 4. finish the hotplug bridge work, removing the "always-disconnected" > > > > connector, moving code to the core and potentially removing the > > > > hotplug-bridge itself (this needs to be clarified as points 1-3 are > > > > developed) > > > > > > > > [0] > > > > https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/0cc6aadd7fc1e629b715ea3d1ba537ef2da95eec > > > > [1] > > > > https://lore.kernel.org/lkml/20250206-hotplug-drm-bridge-v6-0-9d6f2c9c3...@bootlin.com/t/#u > > > > > > > > Motivation > > > > ========== > > > > > > > > The motivation for this series is that with hot-pluggable hardware a DSI > > > > device can be disconnected from the DSI host at runtime, and later on > > > > reconnected, potentially with a different model having different bus > > > > parameters. > > > > > > > > DSI host drivers currently receive a struct mipi_dsi_device pointer in > > > > the > > > > attach callback and some store it permanently for later access to the > > > > bur > > > > format data (lanes, channel, pixel format etc). The stored pointer can > > > > become dangling if the device is removed, leading to a use-after-free. > > > > > > > > Currently the data exchange between DSI host and device happens > > > > primarily > > > > by two means: > > > > > > > > * the device requests attach, detach and message transfer to the host > > > > by > > > > calling mipi_dsi_attach/detach/transfer which in turn call the > > > > callbacks > > > > in struct mipi_dsi_host_ops > > > > - for this to work, struct mipi_dsi_device has a pointer to the > > > > host: > > > > this is OK because the goal is supporting hotplug of the "remote" > > > > part of the DRM pipeline > > > > * the host accesses directly the fields of struct mipi_dsi_device, to > > > > which it receives a pointer in the .attach and .detach callbacks > > > > > > > > The second bullet is the problematic one, which we want to remove. > > > > > > > > Strategy > > > > ======== > > > > > > > > I devised two possible strategies to address it: > > > > > > > > 1. change the host ops to not pass a struct mipi_dsi_device, but > > > > instead > > > > to pass only a copy of the needed information (bus format mainly), > > > > so > > > > the host driver does never access any info from the device > > > > > > > > 2. let the host get info from the device as needed, but without having > > > > a > > > > pointer to it; this is be based on: > > > > - storing a __private mipi_dsi_device pointer in struct > > > > mipi_dsi_host > > > > - adding getters to the DSI core for the host to query the needed > > > > info, e.g. drm_mipi_dsi_host_get_device_lanes(host) (the getters > > > > would be allowed to dereference the device pointer) > > > > > > > > This series implements strategy 1. It does so by adding a .attach_new > > > > host > > > > op, which does not take a mipi_dsi_device pointer, and converting most > > > > host > > > > drivers to it. Once all drivers are converted, the old op can be > > > > removed, > > > > and .attach_new renamed to .attach. > > > > > > I don't recall discussing this particular aspect at Plumbers, so sorry > > > if we're coming back to the same discussion we had. > > > > > > I'm not necessarily opposed to changing the MIPI-DSI bus API, but I > > > don't think changing the semantics to remove the fact that a particular > > > device is connected or not is a good idea. > > > > > > I would have expected to have bus driver (maybe) take a device pointer > > > at attach, and drop it at detach. > > > > > > Then, when we detect the hotplug of a DSI device, we detach it from its > > > parent, and we're done. > > > > > > What prevents us from using that approach? > > > > I probably should have done a recap of the whole discussion, so let me > > do it now. > > > > It all starts with one fact: a DSI device can be disconnected and then > > a different one connected later on, having a different DSI bus format > > (lanes, channel, mode flags, whatever). A detach/attach sequence would > > handle that, but only in the simple case when there is a host/device > > pair. Let's how consider this topology: > > > > ┌──────────────────┐ > > │ DSI bridge │ > > ┌─────────┐ A │ │ B ┌───────────┐ > > │ DSI host├────►│device host├────►│DSI device │ > > └─────────┘ └──────────────────┘ └───────────┘ > > > > Here link A is always connected, link B is hot-pluggable. When the tail > > device is removed and a different one plugged, a detach/attach sequence > > can update the bus format on the DSI bridge, but then the DSI bridge > > cannot update the format on the first host without faking a > > detach/attach that does not map a real event. > > > > The above topology is probably not common, but it is exactly what the > > hotplug-bridge introduces [0]. Whether the hotplug-bridge will have to > > eventually exist or not to support hotplug is still to be defined, but > > regardless there is another problematic aspect. > > > > The second problematic aspect is that several DSI host drivers will not > > even drm_bridge_add() until they have an attached DSI device. One such > > example is samsung-dsim, which calls drm_bridge_add() > > in samsung_dsim_host_attach(). When such a driver implements the first > > DSI host, the DSI bridge must register a DSI device before the DRM card > > can be instantiated. See the lengthy comment before > > hotplug_bridge_dsi_attach() in [0] for more gory details, but the > > outcome is that the hotplug-bridge needs to attach a DSI device with > > a fake format once initially just to let the DRM card probe, and the > > detach and reattach with the correct format once an actual DSI device > > is connected at the tail. > > > > [0] > > https://lore.kernel.org/all/20240917-hotplug-drm-bridge-v4-4-bc4dfee61...@bootlin.com/ > > > > The above would be improved if the DSI host API provided a way to > > notify to the host about a bus format change, which is however not > > present currently. > > > > The naive solution would be adding a new DSI host op: > > > > struct mipi_dsi_host_ops { > > int (*attach)(struct mipi_dsi_host *host, > > struct mipi_dsi_device *dsi); > > int (*detach)(struct mipi_dsi_host *host, > > struct mipi_dsi_device *dsi); > > + int (*bus_fmt_changed)(struct mipi_dsi_host *host, > > + struct mipi_dsi_device *dsi); > > ssize_t (*transfer)(struct mipi_dsi_host *host, > > const struct mipi_dsi_msg *msg); > > }; > > > > This would allow reduce the current sequence: > > 1. attach with dummy format (no tail device yet) > > 2. fake detach > > 3. attach > > > > with: > > 1. attach with dummy format (no tail device yet) > > 2. update format > > > > Adding such a new op would be part of chapter 4 of this work, being it > > quite useless without hotplug. > > > > However while reasoning about this I noticed the DSI host drivers peek > > into the struct mipi_dsi_device fields to read the format, so there is > > no sort of isolation between host and device. Introducing a struct to > > contain all the format fields looked like a good improvement in terms > > of code organization. > > > > Yet another aspect is that several host drivers keep a pointer to the > > device, and thus in case of format change in the DSI device they might > > be reading different fields at different moments, ending up with an > > inconsistent format. > > > > The above considerations, which are all partially overlapped, led me to > > the idea of introducing a struct to exchange a DSI bus format, to be > > exchanged as a whole ("atomically") between host and device. What's > > your opinion about introducing such a struct? > > > > The second aspect of this series is not passing pointers, and that's > > the core topic you questioned. I realize it is not strictly necessary > > to reach the various goals discussed in this e-mail. The work I'm doing > > on the drm_bridge struct is actually a way to store a pointer while > > avoiding use-after-free, so that can obviously be done for a simpler > > scenario such as DSI host-device. However I thought not passing a > > pointer would be a more radical solution: if a driver receives no > > pointer, then it cannot by mistake keep it stored when it shouldn't, > > maybe in a rare case within a complex driver where it is hard to spot. > > > > I'll be OK to change the approach and keep the pointer passed in the > > attach/detach ops, if that is the best option. However I'd like to have > > your opinion about the above topics before working towards that > > direction, and ensure I fully grasp the usefulness of keeping the > > pointer. > > > > Post scriptum. The very initial issue that led to all this discussion > > when writing the hotplug-bridge driver is that the samsung-dsim driver > > will not drm_bridge_add() until a DSI device does .attach to it. Again, > > see the comments before hotplug_bridge_dsi_attach() in [0] for details. > > However by re-examining the driver for the N-th time now from a new > > POV, I _think_ this is not correct and potentially easy to solve. But this > > leads to one fundamental question: > > The question is: should a DSI host bridge driver: > > A) wait for a DSI device to .attach before drm_bridge_add()ing itself, > or > B) drm_bridge_add() itself unconditionally, and let the DSI device > .attach whenever it happens? > > A) is what many drivers (IIRC the majority) does. It implies the card > will not be populated until .attach, which in the hotplug case could > happen very late > > B) is done by a few drivers and allows the card to appear in the > hotplug case without the device, which is needed for hotplug. > > I had tried simply moving drm_bridge_add() from .attach to probe in > the samsung-dsim driver in the pase but that would not work. Now I did > yet another check at the code and I suspect it can be done with a small > additional change, but cannot access the hardware to test it currently. > > Answering this last question might change and simplify the requirements > discussed in the (very lengthy, sorry about that) discussion above.
You're not going to like the answer too much, but I think that it is that "nobody knows". The bad news is that I can't give you an answer, but the good one is that it gives us some leeway. As I said in my previous mail, we should probably figure it out, document it, and then make it work for your drivers. Once we have a documentation we can point to, the rest will fall in line. Maxime
signature.asc
Description: PGP signature