Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Mon, Aug 03, 2015 at 12:25:11PM -0400, Theodore Ts'o wrote:
> On Mon, Aug 03, 2015 at 05:27:29PM +0200, Daniel Vetter wrote:
> > Ok I updated fixes-stuff with just 2 patches which seem to be enough
> > to fix it. Plus a patch to convert Linus' hack into something we can
> > keep plus a drive-by WARNING fix in mst that got in the way for me.
> > Seems to work here in getting rid of the Oops. If this tests out for
> > you too I'll send a pull to Linus.
>
> I've just tried pulling in your updated fixes-stuff, and it avoids the
> oops and allows the external monitor to work correctly.
>
> However, I'm still seeing a large number of drm/i915 related warning
> messages and other kernel kvetching.

Involved a bit of head-scratching since I'm not too familiar with the
watermark code and it gained a lot of complexity for atomic. But the
below patch should be able to fix this WARNING (and it looks like it was
a genuine one). If it works for you I'll bake it into a proper patch.

Thanks, Daniel

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 30e0f54ba19d..ae07fd0c395c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15121,6 +15121,11 @@ void intel_modeset_setup_hw_state(struct drm_device *dev,
 
 	intel_modeset_readout_hw_state(dev);
 
+	if (IS_GEN9(dev))
+		skl_wm_get_hw_state(dev);
+	else if (HAS_PCH_SPLIT(dev))
+		ilk_wm_get_hw_state(dev);
+
 	/*
 	 * Now that we have the config, copy it to each CRTC struct
 	 * Note that this could go away if we move to using crtc_config
@@ -15162,11 +15167,6 @@ void intel_modeset_setup_hw_state(struct drm_device *dev,
 		pll->on = false;
 	}
 
-	if (IS_GEN9(dev))
-		skl_wm_get_hw_state(dev);
-	else if (HAS_PCH_SPLIT(dev))
-		ilk_wm_get_hw_state(dev);
-
 	if (force_restore) {
 		i915_redisable_vga(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > I have 4 patches in
> >
> > git://people.freedesktop.org/~danvet/drm fixes-stuff
> >
> > but I couldn't test them yet since no dp mst here and I didn't find
> > anything that would ship faster than 1-2 weeks yet. I'll try to get
> > some other people here to test it meanwhile too.
>
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer crashing
> on boot.
>
> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.
>
> So even though Linus's patch wasn't supposed to work, I think I'm going
> to go back to it

Ok I updated fixes-stuff with just 2 patches which seem to be enough to
fix it. Plus a patch to convert Linus' hack into something we can keep
plus a drive-by WARNING fix in mst that got in the way for me. Seems to
work here in getting rid of the Oops. If this tests out for you too I'll
send a pull to Linus.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Mon, Aug 03, 2015 at 05:27:29PM +0200, Daniel Vetter wrote:
> Ok I updated fixes-stuff with just 2 patches which seem to be enough to
> fix it. Plus a patch to convert Linus' hack into something we can keep
> plus a drive-by WARNING fix in mst that got in the way for me. Seems to
> work here in getting rid of the Oops. If this tests out for you too I'll
> send a pull to Linus.

I've just tried pulling in your updated fixes-stuff, and it avoids the
oops and allows the external monitor to work correctly.

However, I'm still seeing a large number of drm/i915 related warning
messages and other kernel kvetching.

Thanks!!

- Ted

[4.084198] [drm] Initialized drm 1.1.0 20060810
[4.129576] [drm] Memory usable by graphics device = 2048M
[4.129616] [drm] Replacing VGA console driver
[4.130315] Console: switching to colour dummy device 80x25
[4.145332] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[4.145334] [drm] Driver supports precise vblank timestamp query.
[4.146184] vgaarb: device changed decodes: PCI::00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[4.163778] usbcore: registered new interface driver btusb
[4.170719] [ cut here ]
[4.170749] WARNING: CPU: 0 PID: 463 at /usr/projects/linux/linux/drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
[4.170751] WARN_ON(!r->enable)
[4.170752] Modules linked in:
[4.170754] btusb btrtl btbcm btintel iwlmvm(+) bluetooth mac80211 iwlwifi snd_hda_intel i915(+) drm_kms_helper snd_hda_codec cfg80211 drm snd_hwdep lpc_ich snd_hda_core intel_gtt thinkpad_acpi tpm_tis nvram tpm intel_smartconnect uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core sch_fq_codel kvm_intel kvm ecryptfs parport_pc ppdev lp parport autofs4 btrfs xor hid_generic usbhid hid raid6_pq microcode rtsx_pci_sdmmc ehci_pci e1000e rtsx_pci ehci_hcd xhci_pci ptp mfd_core pps_core xhci_hcd
[4.170805] CPU: 0 PID: 463 Comm: systemd-udevd Not tainted 4.2.0-rc5-14194-g130583b #18
[4.170807] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[4.170809] 0009 880403f0f4c8 8161aaee 0006
[4.170814] 880403f0f518 880403f0f508 8107e5f0 0006
[4.170818] c05ade43 8800c8b7 8800c7f16000 880405fb48b8
[4.170823] Call Trace:
[4.170829] [8161aaee] dump_stack+0x4c/0x65
[4.170833] [8107e5f0] warn_slowpath_common+0xa1/0xbb
[4.170856] [c05ade43] ? ilk_update_wm+0x71a/0xb27 [i915]
[4.170859] [8107e650] warn_slowpath_fmt+0x46/0x48
[4.170879] [c05abb1e] ? ilk_compute_wm_maximums+0x43/0xa2 [i915]
[4.170899] [c05ade43] ilk_update_wm+0x71a/0xb27 [i915]
[4.170921] [c05afb2b] intel_update_watermarks+0x1e/0x20 [i915]
[4.170957] [c05ff8d4] haswell_crtc_disable+0x270/0x2ae [i915]
[4.170989] [c060199d] intel_crtc_control+0xa0/0xe1 [i915]
[4.171020] [c0601a2b] intel_crtc_update_dpms+0x4d/0x5d [i915]
[4.171052] [c0607dd9] intel_modeset_setup_hw_state+0x7b0/0xa90 [i915]
[4.171081] [c05ec6de] ? hsw_write64+0xcd/0xcd [i915]
[4.171113] [c060ab44] ? ilk_fbc_disable+0x29/0x69 [i915]
[4.171142] [c0609512] intel_modeset_init+0x130d/0x14e3 [i915]
[4.171179] [c0636962] i915_driver_load+0xf05/0x1139 [i915]
[4.171183] [810ba787] ? mark_held_locks+0x56/0x6c
[4.171186] [81620c06] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[4.171189] [810ba90e] ? trace_hardirqs_on_caller+0x171/0x18d
[4.171204] [c042cf19] drm_dev_register+0x84/0xfd [drm]
[4.171215] [c042f77e] drm_get_pci_dev+0x102/0x1bc [drm]
[4.171237] [c05a61e2] i915_pci_probe+0x4f/0x51 [i915]
[4.171240] [81333c33] pci_device_probe+0x74/0xd6
[4.171245] [813d4b8e] ? driver_probe_device+0x387/0x387
[4.171248] [813d4966] driver_probe_device+0x15f/0x387
[4.171250] [813d4b8e] ? driver_probe_device+0x387/0x387
[4.171252] [813d4be1] __driver_attach+0x53/0x74
[4.171255] [813d2c00] bus_for_each_dev+0x6f/0x89
[4.171257] [813d4350] driver_attach+0x1e/0x20
[4.171260] [813d3f93] bus_add_driver+0x140/0x238
[4.171263] [813d5538] driver_register+0x8f/0xcc
[4.171266] [81332d41] __pci_register_driver+0x5e/0x62
[4.171268] [c069c000] ? 0xc069c000
[4.171278] [c042f890] drm_pci_init+0x58/0xda [drm]
[4.171281] [c069c000] ? 0xc069c000
[4.171301] [c069c0a0] i915_init+0xa0/0xa8 [i915]
[4.171303] [c069c000] ? 0xc069c000
[4.171307] [810003c7] do_one_initcall+0x19a/0x1af
[4.171310] [81619d1d] ?
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Tuesday, August 04, 2015 12:05:14 AM Daniel Vetter wrote:
> On Mon, Aug 3, 2015 at 7:24 PM, Linus Torvalds torva...@linux-foundation.org wrote:
> > > However, I'm still seeing a large number of drm/i915 related
> > > warning messages and other kernel kvetching.
> >
> > I suspect I can live with that for now.
> >
> > The lockdep one looks like it's mainly an initialization issue, so
> > you'd never get the actual deadlock in practice, but it's obviously
> > annoying.
> >
> > The intel_pm.c one I'll have to defer to the i915 people for..
>
> The lockdep splat is just acpi being inconsistent with init_mutex vs.
> backlight notifier_chain (which has its own lock) calls. init_mutex is
> new in 4.2 and has been added in
>
> commit 87521e16a7abbf3fa337f56cb4d1e18247f15e8a
> Author: Hans de Goede hdego...@redhat.com
> Date:   Tue Jun 16 16:27:48 2015 +0200
>
>     acpi-video-detect: Rewrite backlight interface selection logic
>
> Not mine ;-) But adding relevant people.

Hans, can you have a look at this, please?

Rafael
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Mon, Aug 3, 2015 at 7:24 PM, Linus Torvalds torva...@linux-foundation.org wrote:
> > However, I'm still seeing a large number of drm/i915 related warning
> > messages and other kernel kvetching.
>
> I suspect I can live with that for now.
>
> The lockdep one looks like it's mainly an initialization issue, so
> you'd never get the actual deadlock in practice, but it's obviously
> annoying.
>
> The intel_pm.c one I'll have to defer to the i915 people for..

The lockdep splat is just acpi being inconsistent with init_mutex vs.
backlight notifier_chain (which has its own lock) calls. init_mutex is
new in 4.2 and has been added in

commit 87521e16a7abbf3fa337f56cb4d1e18247f15e8a
Author: Hans de Goede hdego...@redhat.com
Date:   Tue Jun 16 16:27:48 2015 +0200

    acpi-video-detect: Rewrite backlight interface selection logic

Not mine ;-) But adding relevant people. I'll send you a pull for the
mst one tomorrow and look into the watermark fail in intel_pm.c too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
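For readers following the lockdep discussion: the report in this thread comes down to two code paths taking init_mutex and the backlight notifier rwsem in opposite orders. The sketch below is a toy user-space order tracker, not lockdep itself and not kernel code. Every name in it (order_tracker, tracker_acquire, MAX_LOCKS, the integer lock ids) is hypothetical and chosen only for illustration. It records which lock was taken while which other lock was held, and flags an acquisition that would close a cycle in that record.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8	/* hypothetical limit on distinct lock ids */

struct order_tracker {
	/* after[a][b] is true once lock b has been taken while a was held */
	bool after[MAX_LOCKS][MAX_LOCKS];
	bool held[MAX_LOCKS];
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(const struct order_tracker *t, int from, int to,
		      bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < MAX_LOCKS; n++)
		if (t->after[from][n] && !seen[n] && reachable(t, n, to, seen))
			return true;
	return false;
}

/*
 * Record that 'lock' is being acquired. Returns false when an earlier
 * acquisition already established the opposite order (an inversion),
 * true otherwise. The lock is marked held in both cases.
 */
static bool tracker_acquire(struct order_tracker *t, int lock)
{
	bool ok = true;

	assert(lock >= 0 && lock < MAX_LOCKS);
	for (int h = 0; h < MAX_LOCKS; h++) {
		bool seen[MAX_LOCKS] = { false };

		if (!t->held[h])
			continue;
		if (reachable(t, lock, h, seen))
			ok = false;	/* lock -> ... -> h was seen before */
		t->after[h][lock] = true;
	}
	t->held[lock] = true;
	return ok;
}

static void tracker_release(struct order_tracker *t, int lock)
{
	assert(lock >= 0 && lock < MAX_LOCKS);
	t->held[lock] = false;
}
```

Fed the two orders from the report (init_mutex then the notifier rwsem on one path, the notifier rwsem then init_mutex on the other), the second path's final tracker_acquire() call returns false. As noted in the thread, such a report does not mean the deadlock will actually occur in practice.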
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Mon, Aug 03, 2015 at 10:24:53AM -0700, Linus Torvalds wrote:
> On Mon, Aug 3, 2015 at 9:25 AM, Theodore Ts'o ty...@mit.edu wrote:
> > I've just tried pulling in your updated fixes-stuff, and it avoids
> > the oops and allows the external monitor to work correctly.
>
> Good. Have either of you tested the suspend/resume behavior? Is that
> fixed too?

No, I haven't had a chance to test the suspend/resume behavior, because
that requires suspending at work, going home, and connecting to a dock
which has a different monitor attached to it, and resuming (or vice
versa of suspending at home and then resuming at work). So it's a bit
trickier for me to test.

It's also not a regression; the workaround of rebooting is annoying,
but I've lived with it for several releases now. I'll try the two
patches/changes that folks had suggested, hopefully later this week.

- Ted
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Mon, Aug 3, 2015 at 9:25 AM, Theodore Ts'o ty...@mit.edu wrote:
> I've just tried pulling in your updated fixes-stuff, and it avoids the
> oops and allows the external monitor to work correctly.

Good. Have either of you tested the suspend/resume behavior? Is that
fixed too?

> However, I'm still seeing a large number of drm/i915 related warning
> messages and other kernel kvetching.

I suspect I can live with that for now.

The lockdep one looks like it's mainly an initialization issue, so you'd
never get the actual deadlock in practice, but it's obviously annoying.

The intel_pm.c one I'll have to defer to the i915 people for..

I'll be travelling much of this week (flying to Finland tomorrow, back
on Sunday - yay, 30h in airplanes for three days on the ground, but it's
my dad's bday), and my internet will be sporadic. But I'll have a laptop
and be able to pull stuff every once in a while.

It would be good to have this one resolved, and I just need to worry
about the remaining VM problem..

Linus

> [4.170749] WARNING: CPU: 0 PID: 463 at drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
> [4.170751] WARN_ON(!r->enable)
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> >  drivers/gpu/drm/drm_atomic_helper.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 5b59d5ad7d1c..aac212297b49 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> >  	}
> >
> >  	connector_state->best_encoder = new_encoder;
> > -	idx = drm_crtc_index(connector_state->crtc);
> > +	if (connector_state->crtc) {
> > +		idx = drm_crtc_index(connector_state->crtc);
> >
> > -	crtc_state = state->crtc_states[idx];
> > -	crtc_state->mode_changed = true;
> > +		crtc_state = state->crtc_states[idx];
> > +		crtc_state->mode_changed = true;
> > +	}
>
> This shouldn't happen since if it does we ended up stealing the encoder
> from the connector itself (we do check for connector_state->crtc
> earlier) and that would be a bug. I haven't figured out a precise
> theory but my guess is on the best_encoder selection, and indeed dp mst
> encoder selection seems to have gone belly up in 4.2 with the bisected
> commit.

Well, I just tested Linus's patch and it works.

BTW, is there any chance that I can suspend my laptop, and then move it
from my docking station at home (where I have a Dell 30 display) to my
docking station at work (where I have a Dell 24 display), and actually
have the new monitor be detected? For at least the past year, I have to
reboot in order to be able to use the external monitor?

This used to work, but it's been a very long-standing regression. I
understand that Multi-stream DP is an evil horrible hack, and supporting
it is painful, but this used to work, and it hasn't in a long time. :-(

- Ted
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
>  drivers/gpu/drm/drm_atomic_helper.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 5b59d5ad7d1c..aac212297b49 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
>  	}
>
>  	connector_state->best_encoder = new_encoder;
> -	idx = drm_crtc_index(connector_state->crtc);
> +	if (connector_state->crtc) {
> +		idx = drm_crtc_index(connector_state->crtc);
>
> -	crtc_state = state->crtc_states[idx];
> -	crtc_state->mode_changed = true;
> +		crtc_state = state->crtc_states[idx];
> +		crtc_state->mode_changed = true;
> +	}

This shouldn't happen since if it does we ended up stealing the encoder
from the connector itself (we do check for connector_state->crtc
earlier) and that would be a bug. I haven't figured out a precise theory
but my guess is on the best_encoder selection, and indeed dp mst encoder
selection seems to have gone belly up in 4.2 with the bisected commit.

I have 4 patches in

git://people.freedesktop.org/~danvet/drm fixes-stuff

but I couldn't test them yet since no dp mst here and I didn't find
anything that would ship faster than 1-2 weeks yet. I'll try to get some
other people here to test it meanwhile too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
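To illustrate the guard that the quoted patch adds, here is a minimal user-space sketch. The struct and function names (sketch_crtc, sketch_crtc_index, sketch_mark_mode_changed) are hypothetical stand-ins, not the real DRM definitions, and the index lookup is simplified to a stored field. It only shows the shape of the change: skip the CRTC state update when the connector has no CRTC, rather than dereferencing a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's DRM structures. */
struct sketch_device {
	int id;
};

struct sketch_crtc {
	struct sketch_device *dev;	/* first member, so its offset is 0 */
	int index;
};

struct sketch_crtc_state {
	int mode_changed;
};

/* Returns the CRTC index, or -1 when no CRTC is attached. */
static int sketch_crtc_index(const struct sketch_crtc *crtc)
{
	if (crtc == NULL)
		return -1;
	return crtc->index;
}

/*
 * Marks the state of the given CRTC as changed. Returns 1 when a state
 * entry was updated and 0 when the update was skipped because there is
 * no CRTC or its index is out of range.
 */
static int sketch_mark_mode_changed(const struct sketch_crtc *crtc,
				    struct sketch_crtc_state *states,
				    size_t nstates)
{
	int idx;

	assert(states != NULL);

	idx = sketch_crtc_index(crtc);
	if (idx < 0 || (size_t)idx >= nstates)
		return 0;

	states[idx].mode_changed = 1;
	return 1;
}
```

Calling sketch_mark_mode_changed(NULL, states, n) returns 0 and leaves the array untouched. That is the behaviour the patch gives the NULL case. It does not explain why the CRTC pointer ends up NULL in the first place, which is the open question raised in the reply above.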
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 5:32 PM, Theodore Ts'o ty...@mit.edu wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> > >  drivers/gpu/drm/drm_atomic_helper.c | 8 +---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 5b59d5ad7d1c..aac212297b49 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> > >  	}
> > >
> > >  	connector_state->best_encoder = new_encoder;
> > > -	idx = drm_crtc_index(connector_state->crtc);
> > > +	if (connector_state->crtc) {
> > > +		idx = drm_crtc_index(connector_state->crtc);
> > >
> > > -	crtc_state = state->crtc_states[idx];
> > > -	crtc_state->mode_changed = true;
> > > +		crtc_state = state->crtc_states[idx];
> > > +		crtc_state->mode_changed = true;
> > > +	}
> >
> > This shouldn't happen since if it does we ended up stealing the
> > encoder from the connector itself (we do check for
> > connector_state->crtc earlier) and that would be a bug. I haven't
> > figured out a precise theory but my guess is on the best_encoder
> > selection, and indeed dp mst encoder selection seems to have gone
> > belly up in 4.2 with the bisected commit.
>
> Well, I just tested Linus's patch and it works.

That's seriously surprising if you mean display and everything actually
works. Is dpms on/off and suspend and all that also still working?

Can you please change the check into a

	if (!connector_state->crtc)
		return 0;

so that we don't blow up on the debug line below and then grab dmesg
with drm.debug=0x1e when this happens? Note there will be lots of noise,
so you might need to dig out the full dmesg from logs.

> BTW, is there any chance that I can suspend my laptop, and then move it
> from my docking station at home (where I have a Dell 30 display) to my
> docking station at work (where I have a Dell 24 display), and actually
> have the new monitor be detected? For at least the past year, I have to
> reboot in order to be able to use the external monitor?
>
> This used to work, but it's been a very long-standing regression. I
> understand that Multi-stream DP is an evil horrible hack, and
> supporting it is painful, but this used to work, and it hasn't in a
> long time. :-(

Hm we seem to not reprobe mst state on resume. The quick hack below
should help (but totally untested since still no dp mst hub here).
-Daniel

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 884b4f9b81c4..c0677c83a0e9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -775,6 +775,9 @@ static int i915_drm_resume(struct drm_device *dev)
 	/* Config may have changed between suspend and resume */
 	drm_helper_hpd_irq_event(dev);
 
+	dev_priv->short_hpd_port_mask = ~0;
+	queue_work(dev_priv->dp_wq, &dev_priv->dig_port_work);
+
 	intel_opregion_init(dev);
 
 	intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING, false);
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer crashing
> on boot.
>
> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.

Well, that's not fair. Even with Linus's fix, there is still a LOCKDEP
failure. And a few more i915 WARNINGS. But at least the external monitor
works, so this is what I'm using.

Enclosed please find a dmesg with the lockdep and i915 warnings and my
.config. The kernel that I used can be found at:

https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/log/?h=i915-test-4.2.0-rc4

- Ted

dmesg.gz Description: application/gzip
config.gz Description: application/gzip
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, 30 Jul 2015 17:32:28 +0200, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> > >  drivers/gpu/drm/drm_atomic_helper.c | 8 +---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 5b59d5ad7d1c..aac212297b49 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> > >  	}
> > >
> > >  	connector_state->best_encoder = new_encoder;
> > > -	idx = drm_crtc_index(connector_state->crtc);
> > > +	if (connector_state->crtc) {
> > > +		idx = drm_crtc_index(connector_state->crtc);
> > >
> > > -	crtc_state = state->crtc_states[idx];
> > > -	crtc_state->mode_changed = true;
> > > +		crtc_state = state->crtc_states[idx];
> > > +		crtc_state->mode_changed = true;
> > > +	}
> >
> > This shouldn't happen since if it does we ended up stealing the
> > encoder from the connector itself (we do check for
> > connector_state->crtc earlier) and that would be a bug. I haven't
> > figured out a precise theory but my guess is on the best_encoder
> > selection, and indeed dp mst encoder selection seems to have gone
> > belly up in 4.2 with the bisected commit.
>
> Well, I just tested Linus's patch and it works.
>
> BTW, is there any chance that I can suspend my laptop, and then move it
> from my docking station at home (where I have a Dell 30 display) to my
> docking station at work (where I have a Dell 24 display), and actually
> have the new monitor be detected? For at least the past year, I have to
> reboot in order to be able to use the external monitor?
>
> This used to work, but it's been a very long-standing regression. I
> understand that Multi-stream DP is an evil horrible hack, and
> supporting it is painful, but this used to work, and it hasn't in a
> long time. :-(

Relevant with this?

https://bugs.freedesktop.org/show_bug.cgi?id=89589

I wanted to check this by myself, too, as the same bug was reported to
openSUSE bugzilla, but I had no hardware showing it.

Takashi
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> I have 4 patches in
>
> git://people.freedesktop.org/~danvet/drm fixes-stuff
>
> but I couldn't test them yet since no dp mst here and I didn't find
> anything that would ship faster than 1-2 weeks yet. I'll try to get
> some other people here to test it meanwhile too.

I've tried pulling in your patches from fixes-stuff, onto Linus's tree
(without Linus's fix), and the good news is that I'm no longer crashing
on boot.

The *bad* news is that (a) it breaks the external monitor attached to
the docking station completely (this was working with Linus's patch),
and (b) it's triggering a LOCKDEP failure.

So even though Linus's patch wasn't supposed to work, I think I'm going
to go back to it

- Ted

Jul 30 11:46:49 closure kernel: [4.221951]
Jul 30 11:46:49 closure kernel: [4.221954] ==
Jul 30 11:46:49 closure kernel: [4.221957] [ INFO: possible circular locking dependency detected ]
Jul 30 11:46:49 closure kernel: [4.221960] 4.2.0-rc4-13906-g5f1b75cd #16 Not tainted
Jul 30 11:46:49 closure kernel: [4.221963] ---
Jul 30 11:46:49 closure kernel: [4.221966] modprobe/503 is trying to acquire lock:
Jul 30 11:46:49 closure kernel: [4.221968] (init_mutex){+.+.+.}, at: [8138b380] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [4.221977]
Jul 30 11:46:49 closure kernel: [4.221977] but task is already holding lock:
Jul 30 11:46:49 closure kernel: [4.221979] ((backlight_notifier)->rwsem){..}, at: [8109a7c9] __blocking_notifier_call_chain+0x37/0x69
Jul 30 11:46:49 closure kernel: [4.221987]
Jul 30 11:46:49 closure kernel: [4.221987] which lock already depends on the new lock.
Jul 30 11:46:49 closure kernel: [4.221987]
Jul 30 11:46:49 closure kernel: [4.221990]
Jul 30 11:46:49 closure kernel: [4.221990] the existing dependency chain (in reverse order) is:
Jul 30 11:46:49 closure kernel: [4.221995]
Jul 30 11:46:49 closure kernel: [4.221995] -> #1 ((backlight_notifier)->rwsem){..}:
Jul 30 11:46:49 closure kernel: [4.222001] [810bbe08] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [4.222007] [8161f1db] down_write+0x46/0x8a
Jul 30 11:46:49 closure kernel: [4.222012] [8109a6c0] blocking_notifier_chain_register+0x36/0x57
Jul 30 11:46:49 closure kernel: [4.222017] [8134eb4e] backlight_register_notifier+0x18/0x1a
Jul 30 11:46:49 closure kernel: [4.222022] [8138b463] acpi_video_get_backlight_type+0xfa/0x164
Jul 30 11:46:49 closure kernel: [4.222028] [c03a1e45] 0xc03a1e45
Jul 30 11:46:49 closure audispd: No plugins found, exiting
Jul 30 11:46:49 closure kernel: [4.222032] [c03a28a8] 0xc03a28a8
Jul 30 11:46:49 closure kernel: [4.222036] [810003c7] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [4.222042] [81619985] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [4.222047] [810f0a5b] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [4.222052] [810f1046] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [4.222056] [8162109b] entry_SYSCALL_64_fastpath+0x16/0x73
Jul 30 11:46:49 closure kernel: [4.222060]
Jul 30 11:46:49 closure kernel: [4.222060] -> #0 (init_mutex){+.+.+.}:
Jul 30 11:46:49 closure kernel: [4.222065] [810bb77a] __lock_acquire+0xc55/0xf54
Jul 30 11:46:49 closure kernel: [4.222070] [810bbe08] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [4.222074] [8161d83a] mutex_lock_nested+0x70/0x391
Jul 30 11:46:49 closure kernel: [4.222078] [8138b380] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [4.222083] [8138b505] acpi_video_backlight_notify+0x19/0x2f
Jul 30 11:46:49 closure kernel: [4.222088] [8109a442] notifier_call_chain+0x4c/0x71
Jul 30 11:46:49 closure kernel: [4.222092] [8109a7e2] __blocking_notifier_call_chain+0x50/0x69
Jul 30 11:46:49 closure kernel: [4.222098] [8109a80f] blocking_notifier_call_chain+0x14/0x16
Jul 30 11:46:49 closure kernel: [4.222103] [8134f023] backlight_device_register+0x1df/0x1f1
Jul 30 11:46:49 closure kernel: [4.222108] [c07b3061] intel_backlight_register+0xf0/0x157 [i915]
Jul 30 11:46:49 closure kernel: [4.222146] [c078c843] intel_modeset_gem_init+0x158/0x164 [i915]
Jul 30 11:46:49 closure kernel: [4.222176] [c07b997c] i915_driver_load+0xf1c/0x1139 [i915]
Jul 30 11:46:49 closure kernel:
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > I have 4 patches in
> >
> > git://people.freedesktop.org/~danvet/drm fixes-stuff
> >
> > but I couldn't test them yet since no dp mst here and I didn't find
> > anything that would ship faster than 1-2 weeks yet. I'll try to get
> > some other people here to test it meanwhile too.
>
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer crashing
> on boot.

Ok so I'm not completely clueless yet, the encoder confusion indeed
resulted in the follow-up crash. But obviously I don't understand yet
exactly what's going on if this breaks the display.

> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.

The lockdep splat is all in the driver load before we do any modeset at
all, so shouldn't have changed between these patches. Are you sure it's
a regression due to mine and wasn't there before?

> So even though Linus's patch wasn't supposed to work, I think I'm going
> to go back to it

Well I found some dp mst hubs meanwhile so hopefully tomorrow I can test
myself what's going wrong here.
-Daniel
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Thu, Jul 30, 2015 at 8:57 AM, Takashi Iwai ti...@suse.de wrote:
> On Thu, 30 Jul 2015 17:32:28 +0200, Theodore Ts'o wrote:
> > BTW, is there any chance that I can suspend my laptop, and then move
> > it from my docking station at home (where I have a Dell 30 display)
> > to my docking station at work (where I have a Dell 24 display), and
> > actually have the new monitor be detected? For at least the past
> > year, I have to reboot in order to be able to use the external
> > monitor?
> >
> > This used to work, but it's been a very long-standing regression. I
> > understand that Multi-stream DP is an evil horrible hack, and
> > supporting it is painful, but this used to work, and it hasn't in a
> > long time. :-(
>
> Relevant with this?
>
> https://bugs.freedesktop.org/show_bug.cgi?id=89589
>
> I wanted to check this by myself, too, as the same bug was reported to
> openSUSE bugzilla, but I had no hardware showing it.

Hmm. That commit e7d6f7d70829 looks like it should still revert fairly
cleanly (just move the call to intel_dp_mst_resume() to before the
intel_modeset_setup_hw_state() call and locking).

Ted, worth checking out, even if that presumably ends up re-introducing
some WARN_ON's..

Linus
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On 30 July 2015 at 15:18, Linus Torvalds torva...@linux-foundation.org wrote:
> On Wed, Jul 29, 2015 at 6:39 PM, Theodore Ts'o ty...@mit.edu wrote:
>>
>> It's here:
>>
>> https://goo.gl/photos/xHjn2Z97JQEw6k2C9
>
> You didn't catch enough of the code line to decode the code, but it's
> early enough in drm_crtc_index() (just five bytes in) that it's almost
> certainly the very first dereference, so it's almost guaranteed to be
> that "crtc->dev" access as part of list_for_each_entry(), with crtc
> being NULL.
>
> And yes, "->dev" is the very first field, so the offset is zero too
> (while the "->mode_config" list access would not be at offset zero).
>
> And it looks like it is called from drm_atomic_helper_check_modeset():
> the reason it has a question mark in the backtrace is because the
> fault happens before the stack frame has even been set up.
>
> There are multiple calls to drm_crtc_index() from that function, I
> can't tell which one it is. Looking at the code generation I get, I
> think it's because update_connector_routing() gets inlined, and that
> one does several calls. Most of them look like this:
>
>         if (connector->state->crtc) {
>                 idx = drm_crtc_index(connector->state->crtc);
>
> ie they check that the crtc is non-NULL, but that last one does not:
>
>         connector_state->best_encoder = new_encoder;
>         idx = drm_crtc_index(connector_state->crtc);
>
>         crtc_state = state->crtc_states[idx];
>         crtc_state->mode_changed = true;
>
> and I suspect the fix might be something like the attached. Totally
> untested. Ted?
>
> This whole atomic modeset series has been one royal fuck-up, guys.
> We've had too many of these kinds of crap issues.

It hasn't been that bad, on a scale of 1 to "MD eats my raid array",
I'd say we are barely at a 5.

There have been a lot of small and seemingly easily fixed teething
problems. Essentially rewriting the DRM API to provide a new userspace
API and internal interface, porting some drivers partly to the new
interface, while trying to maintain the old ABI/API on top seamlessly,
was always going to be an impossible task. It was never going to
magically all just work in -next and land in your tree fully formed
smelling of lavender and elderberries. This is a massive undertaking,
and doing it over a few kernels was the only possible way it could
ever land.

I think the biggest problem we've had is that the QA team at Intel got
reorganised or something right when they really needed to be doing
testing on this stuff, so what was sitting in -next never got as much
testing as it had previously, and you can see that in the types of
cases that are getting through.

I think the other thing we can learn is that when Android forks the
kernel we should just say "this shit is too hard", let Google go and
create a new API and a complete set of graphics drivers, and deal with
it in 10 years, because that was seriously the only other option.

So yes, it's a pity other kernel developers are seeing our fallout,
but I've experienced lots of other kernel developers' fallout over the
years, and generally the idea is to get this stuff fixed to a
reasonable state before you release a final kernel.

Note I'm not personally involved in the development of atomic
modesetting at all; I'm running the kernels with it where and when I
can, and I trust the developers who work on it are doing as much as
they can to make it work.

That said, hopefully Daniel can find a bag of fucks to debug and write
a proper patch, instead of rage quitting the universe, and just
git reset --hard v4.0 drivers/gpu/drm/i915..

Dave.
[Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Wed, Jul 29, 2015 at 08:49:37PM -0400, Theodore Ts'o wrote:

Unfortunately the failure causes a series of recursive faults and I
haven't been able to capture the stack trace, but on 4.2-rcX kernels,
I can reliably cause the system to crash if my T540p is booted with
the docking station attached. It will also crash if I boot the system
first, and then insert the laptop into the dockstation.

Unfortunately, I can't get a stack trace because there are a huge
number of recursive/double faults, and the system dies so quickly that
nothing ends up in the log files. If you really need a stack dump I
can try to rig something, but modern laptops don't have serial
consoles any more, alas, so it's a bit of a pain.

The bad news is that I tried to use kdump to capture a crashdump and
hopefully get more information, and kdump utterly wedged on the panic.
The good news is that because it wedged the system, I was able to get
the console stackdump before it scrolled off due to a whole series of
recursive oops messages. It's here:

https://goo.gl/photos/xHjn2Z97JQEw6k2C9

Hopefully this is useful.

It's not obvious how to revert this change, since there were a large
number of changes to i915 after this. If someone could help me with a
revert, I'd be happy to test it.

Thanks,

                                - Ted

I was able to bisect it down to this commit, however:

8c7b5ccb72987: drm/i915: Use atomic helpers for computing changed flags:

Is there any chance Intel could add a Lenovo Dockstation with a
Multistream DP output to part of your test hardware? Unfortunately it
seems pretty common that I see regressions with my particular
hardware. Maybe there aren't enough people using Thinkpads any more?
:-(

                                - Ted

P.S.
The git bisect log:

git bisect start
# bad: [421d125c06c4be4c5005cb69840206bd09b71dd6] builddeb: sign the modules after splitting out the debuginfo files
git bisect bad 421d125c06c4be4c5005cb69840206bd09b71dd6
# good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345
# good: [aeaa2122af4e53f3bfd28e8f294557bb95af43fc] drm/i915/skl: Add the INIT power domain to the MISC I/O power well
git bisect good aeaa2122af4e53f3bfd28e8f294557bb95af43fc
# bad: [4d70f38a760ad2879d2ebd84001c92980180f630] drm/i915/bios: remove a redundant NULL pointer check
git bisect bad 4d70f38a760ad2879d2ebd84001c92980180f630
# bad: [27a1b688d9f1fa2abd14bfe6a8729a19fb3b1b25] drm/i915/bxt: Enable WaEnableYV12BugFixInHalfSliceChicken7 for Broxton
git bisect bad 27a1b688d9f1fa2abd14bfe6a8729a19fb3b1b25
# good: [4be0731786de10d0e9ae1d159504c83c6b052647] drm/i915: Add crtc states before calling compute_config()
git bisect good 4be0731786de10d0e9ae1d159504c83c6b052647
# good: [d5432a9d19b61ba6a2b3d88f3026e0ca60eb57a1] drm/i915: Stage new modeset state straight into atomic state
git bisect good d5432a9d19b61ba6a2b3d88f3026e0ca60eb57a1
# bad: [a821fc46bc7bb6d4cf9a5f8d2787fd70231c2c10] drm/i915: Swap atomic state in legacy modeset
git bisect bad a821fc46bc7bb6d4cf9a5f8d2787fd70231c2c10
# bad: [8c7b5ccb729870e606321b3703e2c2e698c49a95] drm/i915: Use atomic helpers for computing changed flags
git bisect bad 8c7b5ccb729870e606321b3703e2c2e698c49a95
# good: [0f63cca2afdc38877e86acfa9821020f6e2213fd] drm/i915: Update crtc state active flag based on DPMS
git bisect good 0f63cca2afdc38877e86acfa9821020f6e2213fd
# good: [840bfe953384a134c8639f2964d9b74bfa671e16] drm/atomic: Make mode_fixup() optional for check_modeset()
git bisect good 840bfe953384a134c8639f2964d9b74bfa671e16
# first bad commit: [8c7b5ccb729870e606321b3703e2c2e698c49a95] drm/i915: Use atomic helpers for computing changed flags
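For readers unfamiliar with how a log like the one above narrows down to
a single commit: the idea is a binary search between a known-good and a
known-bad point. The following is only a rough sketch of that idea in C,
assuming a strictly linear history numbered 0..n where everything from
the first bad commit onward is bad. Real git history is a graph and git
picks its test points differently; first_bad(), is_bad_fn and
bad_from_threshold() are hypothetical names made up for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical test hook: returns true if the kernel built from the
 * commit at position i shows the bug. */
typedef bool (*is_bad_fn)(size_t i, void *ctx);

/* Binary search for the first bad position.
 * Preconditions (assumed, not checked beyond good < bad):
 *   - position `good` tests good and position `bad` tests bad
 *   - once a position is bad, every later position is bad too */
static size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;   /* bug already present: look earlier */
        else
            good = mid;  /* bug not present yet: look later */
    }
    return bad;
}

/* Example predicate for trying the sketch out: everything at or after
 * the threshold stored in ctx counts as bad. */
static bool bad_from_threshold(size_t i, void *ctx)
{
    return i >= *(const size_t *)ctx;
}
```

Each loop iteration corresponds to one "git bisect good" or "git bisect
bad" answer in the log, which is why a range of many commits needs only
a handful of test boots.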
Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
On Wed, Jul 29, 2015 at 6:39 PM, Theodore Ts'o ty...@mit.edu wrote:
>
> It's here:
>
> https://goo.gl/photos/xHjn2Z97JQEw6k2C9

You didn't catch enough of the code line to decode the code, but it's
early enough in drm_crtc_index() (just five bytes in) that it's almost
certainly the very first dereference, so it's almost guaranteed to be
that "crtc->dev" access as part of list_for_each_entry(), with crtc
being NULL.

And yes, "->dev" is the very first field, so the offset is zero too
(while the "->mode_config" list access would not be at offset zero).

And it looks like it is called from drm_atomic_helper_check_modeset():
the reason it has a question mark in the backtrace is because the
fault happens before the stack frame has even been set up.

There are multiple calls to drm_crtc_index() from that function, I
can't tell which one it is. Looking at the code generation I get, I
think it's because update_connector_routing() gets inlined, and that
one does several calls. Most of them look like this:

        if (connector->state->crtc) {
                idx = drm_crtc_index(connector->state->crtc);

ie they check that the crtc is non-NULL, but that last one does not:

        connector_state->best_encoder = new_encoder;
        idx = drm_crtc_index(connector_state->crtc);

        crtc_state = state->crtc_states[idx];
        crtc_state->mode_changed = true;

and I suspect the fix might be something like the attached. Totally
untested. Ted?

This whole atomic modeset series has been one royal fuck-up, guys.
We've had too many of these kinds of crap issues.
                  Linus

 drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 5b59d5ad7d1c..aac212297b49 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
 	}
 
 	connector_state->best_encoder = new_encoder;
-	idx = drm_crtc_index(connector_state->crtc);
+	if (connector_state->crtc) {
+		idx = drm_crtc_index(connector_state->crtc);
 
-	crtc_state = state->crtc_states[idx];
-	crtc_state->mode_changed = true;
+		crtc_state = state->crtc_states[idx];
+		crtc_state->mode_changed = true;
+	}
 
 	DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using [ENCODER:%d:%s] on [CRTC:%d]\n",
 			 connector->base.id,
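As a reading aid, here is a small userspace sketch of the two points
made in this message: why a NULL crtc faults at offset zero, and what
the guard in the patch buys. It is not the DRM code. Every toy_* name
is a hypothetical stand-in invented for the illustration, and the
bounds check is an extra that the patch itself does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real DRM structures. */
struct toy_device {
    int id;
};

struct toy_crtc {
    struct toy_device *dev;  /* first field, so it lives at offset 0 */
    unsigned int index;
};

struct toy_crtc_state {
    bool mode_changed;
};

struct toy_connector_state {
    struct toy_crtc *crtc;   /* NULL when no CRTC is routed */
};

/* With crtc == NULL, reading crtc->dev touches address 0 + 0, which is
 * why the fault shows up as a plain NULL dereference. */
static_assert(offsetof(struct toy_crtc, dev) == 0,
              "dev is expected to be the first field");

/* Guarded update in the spirit of the patch: only touch the CRTC state
 * when the connector actually has a CRTC. Returns true if a state was
 * marked, false if there was nothing to mark. */
static bool toy_mark_mode_changed(const struct toy_connector_state *conn,
                                  struct toy_crtc_state *states,
                                  size_t nstates)
{
    if (!conn->crtc)
        return false;  /* unguarded code would dereference NULL here */
    if (conn->crtc->index >= nstates)
        return false;  /* extra safety for the sketch only */
    states[conn->crtc->index].mode_changed = true;
    return true;
}
```

Calling toy_mark_mode_changed() with a connector state whose crtc is
NULL simply returns false instead of faulting, which is the behaviour
the added "if (connector_state->crtc)" is meant to give.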