Re: [Nouveau] 4.20.0-rc3 nouveau/Quadro P2000 Mobile: runpm causing ACPI errors, lockups
On Tue, Nov 27, 2018 at 11:36:50AM +0200, Mika Westerberg wrote: > +linux-acpi > > Hi Michael, > > On Mon, Nov 26, 2018 at 10:53:26PM -0500, Michael S. Tsirkin wrote: > > So a new thinkpad: > > 01:00.0 VGA compatible controller: NVIDIA Corporation GP107GLM [Quadro > > P2000 Mobile] (rev a1) > > > > Hangs whenever I try to poke at the card. It starts happily enough with > > > > [3.971515] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch > > - Found [Buffer], ACPI requires [Package] (20181003/nsarguments-66) > > [3.971553] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type > > mismatch - Found [Buffer], ACPI requires [Package] (20181003/nsarguments-66) > > [3.971721] pci :01:00.0: optimus capabilities: enabled, status > > dynamic power, hda bios codec supported > > [3.971726] VGA switcheroo: detected Optimus DSM method > > \_SB_.PCI0.PEG0.PEGP handle > > [3.971727] nouveau: detected PR support, will not use DSM BTW this is also a bit strange. It says it will not use DSM - but did it maybe use DSM previously? The ACPI Warning seems to indicate that it did ... And just to complete the picture here's the _DSM from ACPI: Method (_DSM, 4, Serialized) // _DSM: Device-Specific Method { If ((Arg0 == ToUUID ("3e5b41c6-eb1d-4260-9d15-c71fbadae414"))) { Switch (ToInteger (Arg2)) { Case (Zero) { If ((Arg1 == One)) { Return (0x0001E7FF) } } Case (One) { If ((Arg1 == One)) { If (((S0ID == One) && (OSYS < 0x07DF))) { If (((DerefOf (Arg3 [Zero]) & 0xFF) == One)) { \GUAM (One) } Local0 = (DerefOf (Arg3 [One]) & 0xFF) If ((Local0 == Zero)) { \GUAM (Zero) } } If ((DerefOf (Arg3 [Zero]) == Zero)) { Local0 = CLID /* \_SB_.PCI0.GFX0.CLID */ If ((0x8000 & Local0)) { CLID &= 0x0F GLID (CLID) } } Return (One) } } Case (0x02) { If ((Arg1 == One)) { Return (One) } } Case (0x03) { If ((Arg1 == One)) { Return (One) } } Case (0x04) { If ((Arg1 == One)) { Return (One) } } Case (0x05) { If ((Arg1 == One)) { Return (One) } } Case (0x06) { If ((Arg1 == One)) { Return (One) } } Case (0x07) { If ((Arg1 == One)) { IBTT = (DerefOf (Arg3 [Zero]) & 0xFF) Return (One) } } Case (0x08) { If ((Arg1 == One)) { IPSC = (DerefOf (Arg3 [Zero]) & 0xFF) If ((DerefOf (Arg3 [One]) & 0xFF)) { IPAT = (DerefOf (Arg3 [One]) & 0xFF) IPAT-- } IBIA = ((DerefOf (Arg3 [0x02]) >> 0x04) & 0x07) Return (One) } } Case (0x09) { If ((Arg1 == One)) { Return (One) } } Case
Re: [Nouveau] [PATCH v6 3/6] drm/dp_mst: Start tracking per-port VCPI allocations
On Tue, 2018-11-27 at 20:44 +0100, Daniel Vetter wrote: > On Tue, Nov 27, 2018 at 12:48:59PM -0500, Lyude Paul wrote: > > On Mon, 2018-11-26 at 22:22 +0100, Daniel Vetter wrote: > > > On Mon, Nov 26, 2018 at 10:04:21PM +0100, Daniel Vetter wrote: > > > > On Thu, Nov 15, 2018 at 07:50:05PM -0500, Lyude Paul wrote: > > > > > There has been a TODO waiting for quite a long time in > > > > > drm_dp_mst_topology.c: > > > > > > > > > > /* We cannot rely on port->vcpi.num_slots to update > > > > >* topology_state->avail_slots as the port may not exist if > > > > > the parent > > > > >* branch device was unplugged. This should be fixed by > > > > > tracking > > > > >* per-port slot allocation in drm_dp_mst_topology_state > > > > > instead of > > > > >* depending on the caller to tell us how many slots to > > > > > release. > > > > >*/ > > > > > > > > > > That's not the only reason we should fix this: forcing the driver to > > > > > track the VCPI allocations throughout a state's atomic check is > > > > > error prone, because it means that extra care has to be taken with > > > > > the > > > > > order that drm_dp_atomic_find_vcpi_slots() and > > > > > drm_dp_atomic_release_vcpi_slots() are called in in order to ensure > > > > > idempotency. Currently the only driver actually using these helpers, > > > > > i915, doesn't even do this correctly: multiple ->best_encoder() > > > > > checks > > > > > with i915's current implementation would not be idempotent and would > > > > > over-allocate VCPI slots, something I learned trying to implement > > > > > fallback retraining in MST. > > > > > > > > > > So: simplify this whole mess, and teach > > > > > drm_dp_atomic_find_vcpi_slots() > > > > > and drm_dp_atomic_release_vcpi_slots() to track the VCPI allocations > > > > > for > > > > > each port. This allows us to ensure idempotency without having to > > > > > rely > > > > > on the driver as much. Additionally: the driver doesn't need to do > > > > > any > > > > > kind of VCPI slot tracking anymore if it doesn't need it for it's > > > > > own > > > > > internal state. > > > > > > > > > > Additionally; this adds a new drm_dp_mst_atomic_check() helper which > > > > > must be used by atomic drivers to perform validity checks for the > > > > > new > > > > > VCPI allocations incurred by a state. > > > > > > > > > > Also: update the documentation and make it more obvious that these > > > > > /must/ be called by /all/ atomic drivers supporting MST. > > > > > > > > > > Changes since v6: > > > > > - Keep a kref to all of the ports we have allocations on. This > > > > > required > > > > >a good bit of changing to when we call drm_dp_find_vcpi_slots(), > > > > >mainly that we need to ensure that we only redo VCPI allocations > > > > > on > > > > >actual mode or CRTC changes, not crtc_state->active changes. > > > > >Additionally, we no longer take the registration of the DRM > > > > > connector > > > > >for each port into account because so long as we have a kref to > > > > > the > > > > >port in the new or previous atomic state, the connector will stay > > > > >registered. > > > > > > > > I write an entire pile of small nitpits (still included most of them > > > > below), until I realized this here wont work. Delaying the call to > > > > destroy > > > > the connector (well, unregister it really) wreaks the design we've > > > > come up > > > > with thus far, resulting in most of my comments here. > > > > > > > > Instead, all we need to do is delay the kfree(port) at the bottom of > > > > drm_dp_destroy_port(). The vcpi allocation structure _only_ needs the > > > > pointer value to stay valid, as a lookup key. It doesn't care at all > > > > about > > > > anything actually stored in there. So the only thing we need to delay > > > > is > > > > the kfree. I think the simplest way to achieve this is to add a 2nd > > > > kref > > > > (port->kfree_ref or something like that), with on reference held by > > > > the > > > > port itself (released in drm_dp_destroy_port), and the other one held > > > > as-needed by the vcpi allocation lists. > > > > > > > > I think if we go with this design instead of retrofitting a semantic > > > > change of the port lifetime itself, all the complications I complain > > > > about > > > > below should disappear. > > > > > > In the above I meant drm_dp_destroy_port or > > > drm_dp_destroy_connector_work. > > > > > > Aside: I think creating a kref for the final kfree would also solve a > > > bunch of other issues in a much neater way: In > > > > > > commit f038c5b99fc1332f558b495d136d4f433ee65caa > > > Author: Lyude Paul > > > Date: Tue Nov 13 17:46:14 2018 -0500 > > > > > > drm/dp_mst: Skip validating ports during destruction, just ref > > > > > > we could use that kfree reference to make sure the port pointer is > > > alive. > > > This of course means that drm_dp_update_payload_part1() would also need > > > to > > > use the kfree
Re: [Nouveau] [PATCH 0/6] Remove all bad dp_mst_port uses and hide struct def
For the series: Acked-by: Ben Skeggs On Sat, 17 Nov 2018 at 10:21, Lyude Paul wrote: > > So we don't ever have to worry about drivers touching drm_dp_mst_port > structs without verifying them and crashing again. > > Lyude Paul (6): > drm/dp_mst: Add drm_dp_get_payload_info() > drm/nouveau: Use drm_dp_get_payload_info() for getting payload/vcpi > drm/nouveau: Stop reading port->mgr in nv50_mstc_get_modes() > drm/nouveau: Stop reading port->mgr in nv50_mstc_detect() > drm/dp_mst: Hide drm_dp_mst_port contents from drivers > drm/i915: Start using struct drm_dp_mst_port again > > drivers/gpu/drm/drm_dp_mst_topology.c | 115 > drivers/gpu/drm/i915/intel_dp_mst.c | 2 +- > drivers/gpu/drm/i915/intel_drv.h| 2 +- > drivers/gpu/drm/nouveau/dispnv50/disp.c | 60 + > include/drm/drm_dp_mst_helper.h | 65 ++ > 5 files changed, 146 insertions(+), 98 deletions(-) > > -- > 2.19.1 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/6] drm/dp_mst: Add drm_dp_get_payload_info()
On Tue, 2018-11-27 at 22:23 +0100, Daniel Vetter wrote: > On Fri, Nov 16, 2018 at 07:21:15PM -0500, Lyude Paul wrote: > > Some hardware (nvidia hardware in particular) needs to be notified of > > the exact VCPI and payload settings that the topology manager decided on > > for each mstb port. Since there isn't currently any way to get this > > information without going through port (which drivers are very much not > > supposed to do by themselves, ever), let's add one. > > > > Signed-off-by: Lyude Paul > > --- > > drivers/gpu/drm/drm_dp_mst_topology.c | 56 +++ > > include/drm/drm_dp_mst_helper.h | 5 ++- > > 2 files changed, 60 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > > b/drivers/gpu/drm/drm_dp_mst_topology.c > > index 529414556962..4336d17ce904 100644 > > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > > @@ -1982,6 +1982,62 @@ int drm_dp_update_payload_part2(struct > > drm_dp_mst_topology_mgr *mgr) > > } > > EXPORT_SYMBOL(drm_dp_update_payload_part2); > > > > +/** > > + * drm_dp_get_payload_info() - Retrieve payload/vcpi information for the > > given > > + * @port > > + * @mgr: manager to use > > + * @port: the port to get the relevant payload information for > > + * @vcpi_out: where to copy the port's VCPI information to > > + * @payload_out: where to copy the port's payload information to > > + * > > + * Searches the current payloads for @mgr and finds the relevant payload > > and > > + * VCPI information that was programmed by the topology mgr, then copies > > it > > + * into @vcpi_out and @payload_out. Drivers which need to know this > > + * information must use this helper as opposed to checking @port > > themselves, > > + * as this helper will ensure the port reference is still valid and grab > > the > > + * appropriate locks in @mgr. > > + * > > + * Returns: > > + * 0 on success, negative error code if the port is no longer valid or a > > + * programmed payload could not be found for @port. > > + */ > > +int drm_dp_get_payload_info(struct drm_dp_mst_topology_mgr *mgr, > > + struct drm_dp_mst_port *port, > > + struct drm_dp_vcpi *vcpi_out, > > + struct drm_dp_payload *payload_out) > > +{ > > + struct drm_dp_payload *payload = NULL; > > + int i; > > + int ret = 0; > > + > > + port = drm_dp_get_validated_port_ref(mgr, port); > > + if (!port) > > + return -EINVAL; > > This is the part that I mean in our other/irc discussions. The > dp_get_validated_port here could fail when it's going to surprise the > driver. With the dp_port_malloc_get stuff we could instead just grab a > port_malloc_kref when storing the port in mgr->payload, which would > guarantee that the port based lookup below still works. Yeah, that makes more sense now that there's context :P, sgtm. > > > + > > + mutex_lock(>payload_lock); > > + /* Figure out which of the payloads belongs to this port */ > > + for (i = 0; i < mgr->max_payloads; i++) { > > + if (mgr->payloads[i].vcpi == port->vcpi.vcpi) { > > Or maybe even rework the lookup here to use the port pointer (as an > abstract key) instead of port->vcpi.vcpi. With port_malloc_kref we could > guarantee that it keeps working, even after the port has been destroyed. > > And (without checking) I think that's needed anyway to clean up the > payload update hacks in the connector destroy work ... > -Daniel > > > + payload = >payloads[i]; > > + break; > > + } > > + } > > + > > + if (!payload) { > > + DRM_DEBUG_KMS("Failed to find payload for port %p\n", port); > > + ret = -EINVAL; > > + goto out; > > + } > > + > > + *payload_out = *payload; > > + *vcpi_out = port->vcpi; > > +out: > > + mutex_unlock(>payload_lock); > > + drm_dp_put_port(port); > > + return ret; > > +} > > +EXPORT_SYMBOL(drm_dp_get_payload_info); > > + > > #if 0 /* unused as of yet */ > > static int drm_dp_send_dpcd_read(struct drm_dp_mst_topology_mgr *mgr, > > struct drm_dp_mst_port *port, > > diff --git a/include/drm/drm_dp_mst_helper.h > > b/include/drm/drm_dp_mst_helper.h > > index 59f005b419cf..9cc93ea60e7e 100644 > > --- a/include/drm/drm_dp_mst_helper.h > > +++ b/include/drm/drm_dp_mst_helper.h > > @@ -592,7 +592,10 @@ bool drm_dp_mst_allocate_vcpi(struct > > drm_dp_mst_topology_mgr *mgr, > > struct drm_dp_mst_port *port, int pbn, int > > slots); > > > > int drm_dp_mst_get_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr, struct > > drm_dp_mst_port *port); > > - > > +int drm_dp_get_payload_info(struct drm_dp_mst_topology_mgr *mgr, > > + struct drm_dp_mst_port *port, > > + struct drm_dp_vcpi *vcpi_out, > > + struct drm_dp_payload
Re: [Nouveau] [PATCH 1/6] drm/dp_mst: Add drm_dp_get_payload_info()
On Fri, Nov 16, 2018 at 07:21:15PM -0500, Lyude Paul wrote: > Some hardware (nvidia hardware in particular) needs to be notified of > the exact VCPI and payload settings that the topology manager decided on > for each mstb port. Since there isn't currently any way to get this > information without going through port (which drivers are very much not > supposed to do by themselves, ever), let's add one. > > Signed-off-by: Lyude Paul > --- > drivers/gpu/drm/drm_dp_mst_topology.c | 56 +++ > include/drm/drm_dp_mst_helper.h | 5 ++- > 2 files changed, 60 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > b/drivers/gpu/drm/drm_dp_mst_topology.c > index 529414556962..4336d17ce904 100644 > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > @@ -1982,6 +1982,62 @@ int drm_dp_update_payload_part2(struct > drm_dp_mst_topology_mgr *mgr) > } > EXPORT_SYMBOL(drm_dp_update_payload_part2); > > +/** > + * drm_dp_get_payload_info() - Retrieve payload/vcpi information for the > given > + * @port > + * @mgr: manager to use > + * @port: the port to get the relevant payload information for > + * @vcpi_out: where to copy the port's VCPI information to > + * @payload_out: where to copy the port's payload information to > + * > + * Searches the current payloads for @mgr and finds the relevant payload and > + * VCPI information that was programmed by the topology mgr, then copies it > + * into @vcpi_out and @payload_out. Drivers which need to know this > + * information must use this helper as opposed to checking @port themselves, > + * as this helper will ensure the port reference is still valid and grab the > + * appropriate locks in @mgr. > + * > + * Returns: > + * 0 on success, negative error code if the port is no longer valid or a > + * programmed payload could not be found for @port. > + */ > +int drm_dp_get_payload_info(struct drm_dp_mst_topology_mgr *mgr, > + struct drm_dp_mst_port *port, > + struct drm_dp_vcpi *vcpi_out, > + struct drm_dp_payload *payload_out) > +{ > + struct drm_dp_payload *payload = NULL; > + int i; > + int ret = 0; > + > + port = drm_dp_get_validated_port_ref(mgr, port); > + if (!port) > + return -EINVAL; This is the part that I mean in our other/irc discussions. The dp_get_validated_port here could fail when it's going to surprise the driver. With the dp_port_malloc_get stuff we could instead just grab a port_malloc_kref when storing the port in mgr->payload, which would guarantee that the port based lookup below still works. > + > + mutex_lock(>payload_lock); > + /* Figure out which of the payloads belongs to this port */ > + for (i = 0; i < mgr->max_payloads; i++) { > + if (mgr->payloads[i].vcpi == port->vcpi.vcpi) { Or maybe even rework the lookup here to use the port pointer (as an abstract key) instead of port->vcpi.vcpi. With port_malloc_kref we could guarantee that it keeps working, even after the port has been destroyed. And (without checking) I think that's needed anyway to clean up the payload update hacks in the connector destroy work ... -Daniel > + payload = >payloads[i]; > + break; > + } > + } > + > + if (!payload) { > + DRM_DEBUG_KMS("Failed to find payload for port %p\n", port); > + ret = -EINVAL; > + goto out; > + } > + > + *payload_out = *payload; > + *vcpi_out = port->vcpi; > +out: > + mutex_unlock(>payload_lock); > + drm_dp_put_port(port); > + return ret; > +} > +EXPORT_SYMBOL(drm_dp_get_payload_info); > + > #if 0 /* unused as of yet */ > static int drm_dp_send_dpcd_read(struct drm_dp_mst_topology_mgr *mgr, >struct drm_dp_mst_port *port, > diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h > index 59f005b419cf..9cc93ea60e7e 100644 > --- a/include/drm/drm_dp_mst_helper.h > +++ b/include/drm/drm_dp_mst_helper.h > @@ -592,7 +592,10 @@ bool drm_dp_mst_allocate_vcpi(struct > drm_dp_mst_topology_mgr *mgr, > struct drm_dp_mst_port *port, int pbn, int slots); > > int drm_dp_mst_get_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr, struct > drm_dp_mst_port *port); > - > +int drm_dp_get_payload_info(struct drm_dp_mst_topology_mgr *mgr, > + struct drm_dp_mst_port *port, > + struct drm_dp_vcpi *vcpi_out, > + struct drm_dp_payload *payload_out); > > void drm_dp_mst_reset_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr, struct > drm_dp_mst_port *port); > > -- > 2.19.1 > > ___ > Nouveau mailing list > Nouveau@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau --
Re: [Nouveau] [PATCH v6 3/6] drm/dp_mst: Start tracking per-port VCPI allocations
On Tue, Nov 27, 2018 at 12:48:59PM -0500, Lyude Paul wrote: > On Mon, 2018-11-26 at 22:22 +0100, Daniel Vetter wrote: > > On Mon, Nov 26, 2018 at 10:04:21PM +0100, Daniel Vetter wrote: > > > On Thu, Nov 15, 2018 at 07:50:05PM -0500, Lyude Paul wrote: > > > > There has been a TODO waiting for quite a long time in > > > > drm_dp_mst_topology.c: > > > > > > > > /* We cannot rely on port->vcpi.num_slots to update > > > > * topology_state->avail_slots as the port may not exist if the > > > > parent > > > > * branch device was unplugged. This should be fixed by tracking > > > > * per-port slot allocation in drm_dp_mst_topology_state > > > > instead of > > > > * depending on the caller to tell us how many slots to release. > > > > */ > > > > > > > > That's not the only reason we should fix this: forcing the driver to > > > > track the VCPI allocations throughout a state's atomic check is > > > > error prone, because it means that extra care has to be taken with the > > > > order that drm_dp_atomic_find_vcpi_slots() and > > > > drm_dp_atomic_release_vcpi_slots() are called in in order to ensure > > > > idempotency. Currently the only driver actually using these helpers, > > > > i915, doesn't even do this correctly: multiple ->best_encoder() checks > > > > with i915's current implementation would not be idempotent and would > > > > over-allocate VCPI slots, something I learned trying to implement > > > > fallback retraining in MST. > > > > > > > > So: simplify this whole mess, and teach drm_dp_atomic_find_vcpi_slots() > > > > and drm_dp_atomic_release_vcpi_slots() to track the VCPI allocations for > > > > each port. This allows us to ensure idempotency without having to rely > > > > on the driver as much. Additionally: the driver doesn't need to do any > > > > kind of VCPI slot tracking anymore if it doesn't need it for it's own > > > > internal state. > > > > > > > > Additionally; this adds a new drm_dp_mst_atomic_check() helper which > > > > must be used by atomic drivers to perform validity checks for the new > > > > VCPI allocations incurred by a state. > > > > > > > > Also: update the documentation and make it more obvious that these > > > > /must/ be called by /all/ atomic drivers supporting MST. > > > > > > > > Changes since v6: > > > > - Keep a kref to all of the ports we have allocations on. This required > > > >a good bit of changing to when we call drm_dp_find_vcpi_slots(), > > > >mainly that we need to ensure that we only redo VCPI allocations on > > > >actual mode or CRTC changes, not crtc_state->active changes. > > > >Additionally, we no longer take the registration of the DRM connector > > > >for each port into account because so long as we have a kref to the > > > >port in the new or previous atomic state, the connector will stay > > > >registered. > > > > > > I write an entire pile of small nitpits (still included most of them > > > below), until I realized this here wont work. Delaying the call to destroy > > > the connector (well, unregister it really) wreaks the design we've come up > > > with thus far, resulting in most of my comments here. > > > > > > Instead, all we need to do is delay the kfree(port) at the bottom of > > > drm_dp_destroy_port(). The vcpi allocation structure _only_ needs the > > > pointer value to stay valid, as a lookup key. It doesn't care at all about > > > anything actually stored in there. So the only thing we need to delay is > > > the kfree. I think the simplest way to achieve this is to add a 2nd kref > > > (port->kfree_ref or something like that), with on reference held by the > > > port itself (released in drm_dp_destroy_port), and the other one held > > > as-needed by the vcpi allocation lists. > > > > > > I think if we go with this design instead of retrofitting a semantic > > > change of the port lifetime itself, all the complications I complain about > > > below should disappear. > > > > In the above I meant drm_dp_destroy_port or drm_dp_destroy_connector_work. > > > > Aside: I think creating a kref for the final kfree would also solve a > > bunch of other issues in a much neater way: In > > > > commit f038c5b99fc1332f558b495d136d4f433ee65caa > > Author: Lyude Paul > > Date: Tue Nov 13 17:46:14 2018 -0500 > > > > drm/dp_mst: Skip validating ports during destruction, just ref > > > > we could use that kfree reference to make sure the port pointer is alive. > > This of course means that drm_dp_update_payload_part1() would also need to > > use the kfree reference for the vcpi allocations. And probably everywhere > > else (e.g. in your nouveau series for payload information). > > > > That would give us a very clear separation overall between "port can still > > be used because it's not yet unplugged" vs. "port only hangs around > > because a bunch of vcpi allocations and other payload things upstream of > > the port might still need the port lookup key to free
Re: [Nouveau] [PATCH v6 3/6] drm/dp_mst: Start tracking per-port VCPI allocations
On Mon, 2018-11-26 at 22:22 +0100, Daniel Vetter wrote: > On Mon, Nov 26, 2018 at 10:04:21PM +0100, Daniel Vetter wrote: > > On Thu, Nov 15, 2018 at 07:50:05PM -0500, Lyude Paul wrote: > > > There has been a TODO waiting for quite a long time in > > > drm_dp_mst_topology.c: > > > > > > /* We cannot rely on port->vcpi.num_slots to update > > >* topology_state->avail_slots as the port may not exist if the parent > > >* branch device was unplugged. This should be fixed by tracking > > >* per-port slot allocation in drm_dp_mst_topology_state instead of > > >* depending on the caller to tell us how many slots to release. > > >*/ > > > > > > That's not the only reason we should fix this: forcing the driver to > > > track the VCPI allocations throughout a state's atomic check is > > > error prone, because it means that extra care has to be taken with the > > > order that drm_dp_atomic_find_vcpi_slots() and > > > drm_dp_atomic_release_vcpi_slots() are called in in order to ensure > > > idempotency. Currently the only driver actually using these helpers, > > > i915, doesn't even do this correctly: multiple ->best_encoder() checks > > > with i915's current implementation would not be idempotent and would > > > over-allocate VCPI slots, something I learned trying to implement > > > fallback retraining in MST. > > > > > > So: simplify this whole mess, and teach drm_dp_atomic_find_vcpi_slots() > > > and drm_dp_atomic_release_vcpi_slots() to track the VCPI allocations for > > > each port. This allows us to ensure idempotency without having to rely > > > on the driver as much. Additionally: the driver doesn't need to do any > > > kind of VCPI slot tracking anymore if it doesn't need it for it's own > > > internal state. > > > > > > Additionally; this adds a new drm_dp_mst_atomic_check() helper which > > > must be used by atomic drivers to perform validity checks for the new > > > VCPI allocations incurred by a state. > > > > > > Also: update the documentation and make it more obvious that these > > > /must/ be called by /all/ atomic drivers supporting MST. > > > > > > Changes since v6: > > > - Keep a kref to all of the ports we have allocations on. This required > > >a good bit of changing to when we call drm_dp_find_vcpi_slots(), > > >mainly that we need to ensure that we only redo VCPI allocations on > > >actual mode or CRTC changes, not crtc_state->active changes. > > >Additionally, we no longer take the registration of the DRM connector > > >for each port into account because so long as we have a kref to the > > >port in the new or previous atomic state, the connector will stay > > >registered. > > > > I write an entire pile of small nitpits (still included most of them > > below), until I realized this here wont work. Delaying the call to destroy > > the connector (well, unregister it really) wreaks the design we've come up > > with thus far, resulting in most of my comments here. > > > > Instead, all we need to do is delay the kfree(port) at the bottom of > > drm_dp_destroy_port(). The vcpi allocation structure _only_ needs the > > pointer value to stay valid, as a lookup key. It doesn't care at all about > > anything actually stored in there. So the only thing we need to delay is > > the kfree. I think the simplest way to achieve this is to add a 2nd kref > > (port->kfree_ref or something like that), with on reference held by the > > port itself (released in drm_dp_destroy_port), and the other one held > > as-needed by the vcpi allocation lists. > > > > I think if we go with this design instead of retrofitting a semantic > > change of the port lifetime itself, all the complications I complain about > > below should disappear. > > In the above I meant drm_dp_destroy_port or drm_dp_destroy_connector_work. > > Aside: I think creating a kref for the final kfree would also solve a > bunch of other issues in a much neater way: In > > commit f038c5b99fc1332f558b495d136d4f433ee65caa > Author: Lyude Paul > Date: Tue Nov 13 17:46:14 2018 -0500 > > drm/dp_mst: Skip validating ports during destruction, just ref > > we could use that kfree reference to make sure the port pointer is alive. > This of course means that drm_dp_update_payload_part1() would also need to > use the kfree reference for the vcpi allocations. And probably everywhere > else (e.g. in your nouveau series for payload information). > > That would give us a very clear separation overall between "port can still > be used because it's not yet unplugged" vs. "port only hangs around > because a bunch of vcpi allocations and other payload things upstream of > the port might still need the port lookup key to free resources". Instead > of again sprinkling complicated conditions and magic tricks all over the > code to figure out whethe we should allow the get_validate_port to succeed > or not, or the modeset to succeed or not. mhm, I've been experimenting with this and I think I agree this is
Re: [Nouveau] 4.20.0-rc3 nouveau/Quadro P2000 Mobile: runpm causing ACPI errors, lockups
+linux-acpi Hi Michael, On Mon, Nov 26, 2018 at 10:53:26PM -0500, Michael S. Tsirkin wrote: > So a new thinkpad: > 01:00.0 VGA compatible controller: NVIDIA Corporation GP107GLM [Quadro P2000 > Mobile] (rev a1) > > Hangs whenever I try to poke at the card. It starts happily enough with > > [3.971515] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - > Found [Buffer], ACPI requires [Package] (20181003/nsarguments-66) > [3.971553] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type > mismatch - Found [Buffer], ACPI requires [Package] (20181003/nsarguments-66) > [3.971721] pci :01:00.0: optimus capabilities: enabled, status > dynamic power, hda bios codec supported > [3.971726] VGA switcheroo: detected Optimus DSM method > \_SB_.PCI0.PEG0.PEGP handle > [3.971727] nouveau: detected PR support, will not use DSM > [3.971745] nouveau :01:00.0: enabling device (0006 -> 0007) > [3.971923] nouveau :01:00.0: NVIDIA GP107 (137000a1) > [4.009875] PM: Image not found (code -22) > [4.135752] nouveau :01:00.0: DRM: VRAM: 4096 MiB > [4.135753] nouveau :01:00.0: DRM: GART: 536870912 MiB > [4.135754] nouveau :01:00.0: DRM: BIT table 'A' not found > [4.135755] nouveau :01:00.0: DRM: BIT table 'L' not found > [4.135756] nouveau :01:00.0: DRM: TMDS table version 2.0 > [4.135756] nouveau :01:00.0: DRM: DCB version 4.1 > [4.135757] nouveau :01:00.0: DRM: DCB outp 00: 02800f76 04600020 > [4.135758] nouveau :01:00.0: DRM: DCB outp 01: 02011f62 00020010 > [4.135759] nouveau :01:00.0: DRM: DCB outp 02: 01022f46 04600010 > [4.135760] nouveau :01:00.0: DRM: DCB outp 03: 01033f56 04600020 > [4.135761] nouveau :01:00.0: DRM: DCB conn 00: 00020047 > [4.135761] nouveau :01:00.0: DRM: DCB conn 01: 00010161 > [4.135762] nouveau :01:00.0: DRM: DCB conn 02: 1246 > [4.135763] nouveau :01:00.0: DRM: DCB conn 03: 2346 > [4.508355] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). > [4.508355] [drm] Driver supports precise vblank timestamp query. > [4.509812] [drm] Cannot find any crtc or sizes > [4.510144] [drm] Initialized nouveau 1.3.1 20120801 for :01:00.0 on > minor 2 > > > Although that type mismatch is a bit worrying. And I'm not sure what > prints PM: Image not found. That is fine, it just says it cannot find a hibernation image from swap device. I guess you have resume=... in the kernel command line. > But after a short while it gets pretty busy: > > > [ 52.917009] No Local Variables are initialized for Method [NVPO] > [ 52.917011] No Arguments are initialized for method [NVPO] > [ 52.917012] ACPI Error: Method parse/execution failed > \_SB.PCI0.PEG0.PEGP.NVPO, AE_AML_LOOP_TIMEOUT (20181003/psparse-516) > [ 52.917063] ACPI Error: Method parse/execution failed \_SB.PCI0.PGON, > AE_AML_LOOP_TIMEOUT (20181003/psparse-516) > [ 52.917084] ACPI Error: Method parse/execution failed > \_SB.PCI0.PEG0.PG00._ON, AE_AML_LOOP_TIMEOUT (20181003/psparse-516) > [ 52.917108] acpi device:00: Failed to change power state to D0 Here it seems to fail to turn on the power resource for the device. > [ 52.969287] video LNXVIDEO:00: Cannot transition to power state D0 for > parent in (unknown) > [ 52.969289] pci_raw_set_power_state: 2 callbacks suppressed > [ 52.969291] nouveau :01:00.0: Refused to change power state, currently > in D3 > [ 53.029514] video LNXVIDEO:00: Cannot transition to power state D0 for > parent in (unknown) > [ 53.041027] nouveau :01:00.0: Refused to change power state, currently > in D3 > [ 53.041035] video LNXVIDEO:00: Cannot transition to power state D0 for > parent in (unknown) > [ 53.053008] nouveau :01:00.0: Refused to change power state, currently > in D3 > > > And then kernel proceeds to throw up errors at random places, e.g. > > [ 67.021892] cfg80211: failed to load regulatory.db > [ 67.021895] cfg80211: failed to load regulatory.db > [ 67.021897] cfg80211: failed to load regulatory.db > [ 67.021900] cfg80211: failed to load regulatory.db > [ 67.021927] cfg80211: failed to load regulatory.db > [ 67.021928] cfg80211: failed to load regulatory.db > [ 67.021932] cfg80211: failed to load regulatory.db > [ 67.021934] cfg80211: failed to load regulatory.db > [ 67.024463] cfg80211: failed to load regulatory.db > [ 99.980625] iwlwifi :00:14.3: Error sending STATISTICS_CMD: time out > after 2000ms. > > followed by soft lockups and sometimes hard lockups in places > like attempts to walk skb lists. > > Adding runpm=0 does away with this issue. > > The specific test was with noaccel=1 - it does not seem to change > things for me. > > I poked at the ACPI method NVPO and yes it does actually > seem to execute a while loop waiting for some register > to become 0. Which I guess never happens? Because card > is in a low power state and so
Re: [Nouveau] [PATCH v6 3/6] drm/dp_mst: Start tracking per-port VCPI allocations
On Mon, Nov 26, 2018 at 04:36:46PM -0500, Lyude Paul wrote: > On Mon, 2018-11-26 at 22:04 +0100, Daniel Vetter wrote: > > On Thu, Nov 15, 2018 at 07:50:05PM -0500, Lyude Paul wrote: > > > There has been a TODO waiting for quite a long time in > > > drm_dp_mst_topology.c: > > > > > > /* We cannot rely on port->vcpi.num_slots to update > > >* topology_state->avail_slots as the port may not exist if the parent > > >* branch device was unplugged. This should be fixed by tracking > > >* per-port slot allocation in drm_dp_mst_topology_state instead of > > >* depending on the caller to tell us how many slots to release. > > >*/ > > > > > > That's not the only reason we should fix this: forcing the driver to > > > track the VCPI allocations throughout a state's atomic check is > > > error prone, because it means that extra care has to be taken with the > > > order that drm_dp_atomic_find_vcpi_slots() and > > > drm_dp_atomic_release_vcpi_slots() are called in in order to ensure > > > idempotency. Currently the only driver actually using these helpers, > > > i915, doesn't even do this correctly: multiple ->best_encoder() checks > > > with i915's current implementation would not be idempotent and would > > > over-allocate VCPI slots, something I learned trying to implement > > > fallback retraining in MST. > > > > > > So: simplify this whole mess, and teach drm_dp_atomic_find_vcpi_slots() > > > and drm_dp_atomic_release_vcpi_slots() to track the VCPI allocations for > > > each port. This allows us to ensure idempotency without having to rely > > > on the driver as much. Additionally: the driver doesn't need to do any > > > kind of VCPI slot tracking anymore if it doesn't need it for it's own > > > internal state. > > > > > > Additionally; this adds a new drm_dp_mst_atomic_check() helper which > > > must be used by atomic drivers to perform validity checks for the new > > > VCPI allocations incurred by a state. > > > > > > Also: update the documentation and make it more obvious that these > > > /must/ be called by /all/ atomic drivers supporting MST. > > > > > > Changes since v6: > > > - Keep a kref to all of the ports we have allocations on. This required > > >a good bit of changing to when we call drm_dp_find_vcpi_slots(), > > >mainly that we need to ensure that we only redo VCPI allocations on > > >actual mode or CRTC changes, not crtc_state->active changes. > > >Additionally, we no longer take the registration of the DRM connector > > >for each port into account because so long as we have a kref to the > > >port in the new or previous atomic state, the connector will stay > > >registered. > > > > I write an entire pile of small nitpits (still included most of them > > below), until I realized this here wont work. Delaying the call to destroy > > the connector (well, unregister it really) wreaks the design we've come up > > with thus far, resulting in most of my comments here. > > > > Instead, all we need to do is delay the kfree(port) at the bottom of > > drm_dp_destroy_port(). The vcpi allocation structure _only_ needs the > > pointer value to stay valid, as a lookup key. It doesn't care at all about > > anything actually stored in there. So the only thing we need to delay is > > the kfree. I think the simplest way to achieve this is to add a 2nd kref > > (port->kfree_ref or something like that), with on reference held by the > > port itself (released in drm_dp_destroy_port), and the other one held > > as-needed by the vcpi allocation lists. > > > > I think if we go with this design instead of retrofitting a semantic > > change of the port lifetime itself, all the complications I complain about > > below should disappear. > > > > Piles of comments below. > > > > Cheers, Daniel > > > > > - Use the small changes to drm_dp_put_port() to add even more error > > >checking to make misusage of the helpers more obvious. I added this > > >after having to chase down various use-after-free conditions that > > >started popping up from the new helpers so no one else has to > > >troubleshoot that. > > > - Move some accidental DRM_DEBUG_KMS() calls to DRM_DEBUG_ATOMIC() > > > - Update documentation again, note that find/release() should both not be > > >called on the same port in a single atomic check phase (but multiple > > >calls to one or the other is OK) > > > > > > Changes since v4: > > > - Don't skip the atomic checks for VCPI allocations if no new VCPI > > >allocations happen in a state. This makes the next change I'm about > > >to list here a lot easier to implement. > > > - Don't ignore VCPI allocations on destroyed ports, instead ensure that > > >when ports are destroyed and still have VCPI allocations in the > > >topology state, the only state changes allowed are releasing said > > >ports' VCPI. This prevents a state with a mix of VCPI allocations > > >from destroyed ports, and allocations from valid
[Nouveau] [Bug 84721] [NVC1] Nvidia Geforce GT 630 using nouveau on 3.16 kernel. dangerous Fan speed
https://bugs.freedesktop.org/show_bug.cgi?id=84721 --- Comment #21 from Javier Fernandez --- Created attachment 142627 --> https://bugs.freedesktop.org/attachment.cgi?id=142627=edit VBIOS nVidia GT 630 vbios -- You are receiving this mail because: You are the assignee for the bug.___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [Bug 84721] [NVC1] Nvidia Geforce GT 630 using nouveau on 3.16 kernel. dangerous Fan speed
https://bugs.freedesktop.org/show_bug.cgi?id=84721 --- Comment #20 from Javier Fernandez --- (In reply to Javier Fernandez from comment #19) > Hi there, > > im using kernel 4.18 (Ubuntu), and nouveau 1.0.15 > > fan speed is high here too > > Here is my VBIOS well, i cant attach it! i get this error message: "The content type is invalid. Valid types must be of the form foo/bar where foo is one of application, audio, image, message, model, multipart, text, video and bar must not contain any special characters (such as "=", "?", ...)." -- You are receiving this mail because: You are the assignee for the bug.___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [Bug 84721] [NVC1] Nvidia Geforce GT 630 using nouveau on 3.16 kernel. dangerous Fan speed
https://bugs.freedesktop.org/show_bug.cgi?id=84721 --- Comment #19 from Javier Fernandez --- Hi there, im using kernel 4.18 (Ubuntu), and nouveau 1.0.15 fan speed is high here too Here is my VBIOS -- You are receiving this mail because: You are the assignee for the bug.___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau