[Intel-gfx] debugging Haswell eDP black screen after S3
I am attempting to debug an issue with some Haswell laptop systems
which do not restore their screen after resuming from S3 when running
on the stable 3.8 kernel (3.8.13).
The backlight is OK, but the screen is just black.

In trying to determine what was going wrong, I compared the output of
intel_reg_dumper in a good and a bad case:

diff -u good_reg.txt bad_reg.txt
--- good_reg.txt 2013-05-14 15:08:44.361997000 +
+++ bad_reg.txt 2013-05-14 15:09:20.48000 +
@@ -1,5 +1,4 @@
-                 DCC: 0x (0xf340 0xf37f 0x��
-�)
+                 DCC: 0x (0xf340 0xf37f 0x��=�)
            CHDECMISC: 0x (none, ch2 enh disabled, ch1 enh disabled, ch0 enh disabled, flex disabled, ep not present)
               C0DRB0: 0x (0x)
               C0DRB1: 0x (0x)
@@ -63,17 +62,17 @@
      PIPEA_DP_LINK_N: 0x
        CURSOR_A_BASE: 0x01061000
     CURSOR_A_CONTROL: 0x0427
-   CURSOR_A_POSITION: 0x03a3032f
+   CURSOR_A_POSITION: 0x01bb03fb
                 FPA0: 0x (n = 0, m1 = 0, m2 = 0)
                 FPA1: 0x (n = 0, m1 = 0, m2 = 0)
               DPLL_A: 0x (disabled, non-dvo, VGA, default clock, unknown mode, p1 = 0, p2 = 0)
            DPLL_A_MD: 0x
-            HTOTAL_A: 0x0821077f (1920 active, 2082 total)
-            HBLANK_A: 0x0821077f (1920 start, 2082 end)
-             HSYNC_A: 0x081307af (1968 start, 2068 end)
-            VTOTAL_A: 0x045f0437 (1080 active, 1120 total)
-            VBLANK_A: 0x045f0437 (1080 start, 1120 end)
-             VSYNC_A: 0x044b0441 (1090 start, 1100 end)
+            HTOTAL_A: 0x (1 active, 1 total)
+            HBLANK_A: 0x (1 start, 1 end)
+             HSYNC_A: 0x (1 start, 1 end)
+            VTOTAL_A: 0x (1 active, 1 total)
+            VBLANK_A: 0x (1 start, 1 end)
+             VSYNC_A: 0x (1 start, 1 end)
            BCLRPAT_A: 0x
         VSYNCSHIFT_A: 0x
             DSPBCNTR: 0x4000 (disabled, pipe A)

It appears that the register save and restore done in
i915_save_modeset_reg / i915_restore_modeset_reg is not working
properly.

When I put some debug in, I discovered that it was bailing out of
i915_save_modeset_reg early since the DRIVER_MODESET bit was cleared.
However, it was set at the end of i915_init().
This, of course, confuses me.

Am I seeing memory corruption here?
Any insight is appreciated.

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

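For anyone reading the register diff: the values that intel_reg_dumper prints in parentheses for the timing registers can be reproduced by hand. The sketch below assumes each 32-bit register holds two fields, each stored as "value minus one", in its lower and upper 16 bits (the real hardware fields may be narrower). `decode_timing` is a hypothetical helper name, not something from intel-gpu-tools. Under that assumption an all-zero register decodes to "1, 1", so the bad dump looks like pipe A timings that are simply zero, not timings holding stale values.

```python
def decode_timing(reg):
    """Split a 32-bit timing register into its two fields.

    Assumption for this sketch: each 16-bit half stores (value - 1).
    """
    low = (reg & 0xFFFF) + 1           # active / start
    high = ((reg >> 16) & 0xFFFF) + 1  # total / end
    return low, high


# Register values copied from the "good" side of the diff.
good = {
    "HTOTAL_A": 0x0821077F,
    "HSYNC_A": 0x081307AF,
    "VTOTAL_A": 0x045F0437,
    "VSYNC_A": 0x044B0441,
}

for name, value in good.items():
    print(name, decode_timing(value))

# A register that reads back as zero decodes to (1, 1).
print("zeroed", decode_timing(0))
```

The decoded pairs agree with the "good" annotations in the diff (1920/2082, 1968/2068, 1080/1120, 1090/1100).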
Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Tue, May 14, 2013 at 5:01 PM, Ben Guthro wrote:
> [...]
> When I put some debug in, I discovered that it was bailing out of
> i915_save_modeset_reg early since the DRIVER_MODESET bit was cleared.
> However, it was set at the end of i915_init()
> This, of course, confuses me.
>
> Am I seeing memory corruption here?

It looks like I misread the code here, inverting the sense of an if
statement.

That said, I don't really have any clues as to why the display is
black after resuming from S3.

Is this an eDP training issue? Are there any changesets I can try backporting?
I tried this, but it didn't seem to help:
https://patchwork.kernel.org/patch/2516601/

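The mix-up is an easy one to make. Below is a minimal model of the check, assuming (as the correction implies) that the legacy save path returns early when DRIVER_MODESET is set, presumably because under KMS the display state is restored by other means. The flag value and the function name are illustrative stand-ins for this sketch, not the kernel's actual code.

```python
DRIVER_MODESET = 0x2000  # illustrative flag value, assumed for this sketch


def save_modeset_reg(driver_features):
    """Model of a legacy (non-KMS) register save with an early return."""
    if driver_features & DRIVER_MODESET:
        # KMS is active: the legacy save path is skipped on purpose.
        return "skipped"
    return "saved"


print(save_modeset_reg(DRIVER_MODESET))  # KMS driver
print(save_modeset_reg(0))               # legacy driver
```

Read this way, an early return with the bit set is the expected behaviour and not a sign of memory corruption.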
Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Wed, May 15, 2013 at 4:42 PM, Ben Guthro wrote:
> [...]
> Is this an eDP training issue? Are there any changesets I can try backporting?
> I tried this, but it didn't seem to help:
> https://patchwork.kernel.org/patch/2516601/

Below is a serial dump with drm.debug=4, after resuming from S3

If anyone sees anything awry, being pointed in the right direction
would be appreciated:

[ 119.676134] ACPI: Low-level resume complete
[ 119.676200] PM: Restoring platform NVS memory
[ 119.676585] xen-acpi-processor: Uploading Xen processor PM info
[ 119.678302] Enabling non-boot CPUs ...
[ 119.678351] installing Xen timer for CPU 1
[ 119.678380] cpu 1 spinlock event irq 48
[ 119.679469] CPU1 is up
[ 119.679496] installing Xen timer for CPU 2
[ 119.679505] cpu 2 spinlock event irq 55
[ 119.680524] CPU2 is up
[ 119.680586] installing Xen timer for CPU 3
[ 119.680590] cpu 3 spinlock event irq 62
[ 119.681463] CPU3 is up
[ 119.681482] installing Xen timer for CPU 4
[ 119.681487] cpu 4 spinlock event irq 69
[ 119.682448] CPU4 is up
[ 119.682478] installing Xen timer for CPU 5
[ 119.682482] cpu 5 spinlock event irq 76
[ 119.683463] CPU5 is up
[ 119.683490] installing Xen timer for CPU 6
[ 119.683494] cpu 6 spinlock event irq 83
[ 119.684483] CPU6 is up
[ 119.684512] installing Xen timer for CPU 7
[ 119.684517] cpu 7 spinlock event irq 90
[ 119.685523] CPU7 is up
[ 119.685941] ACPI: Waking up from system sleep state S3
[ 120.546804] pci :01:00.0: power state changed by ACPI to D0
[ 120.586931] PM: noirq resume of devices complete after 160.133 msecs
[ 120.587261] PM: early resume of devices complete after 0.247 msecs
[ 120.587438] xen: registering gsi 16 triggering 0 polarity 1
[ 120.587447] i915 :00:02.0: setting latency timer to 64
[ 120.587449] Already setup the GSI :16
[ 120.587569] xen: registering gsi 22 triggering 0 polarity 1
[ 120.587578] Already setup the GSI :22
[ 120.587749] pciehp :00:1c.2:pcie04: pciehp_resume ENTRY
[ 120.587769] pciehp :00:1c.0:pcie04: pciehp_resume ENTRY
[ 120.587837] pciehp :00:1c.

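When scanning a dump like this, one quick check is where the time goes. Here is a small sketch that parses the bracketed timestamps and reports the largest gap between consecutive lines. The sample lines are copied from the dump, and `largest_gap` is a hypothetical helper written for this note.

```python
import re

# Matches "[ 119.685523] message text"
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def largest_gap(lines):
    """Return (gap_seconds, message_before, message_after) for the biggest jump.

    Lines without a leading timestamp are ignored. Returns None if fewer
    than two timestamped lines are present.
    """
    parsed = []
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if match:
            parsed.append((float(match.group(1)), match.group(2)))

    best = None
    for (t0, m0), (t1, m1) in zip(parsed, parsed[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


sample = [
    "[ 119.685523] CPU7 is up",
    "[ 119.685941] ACPI: Waking up from system sleep state S3",
    "[ 120.546804] pci :01:00.0: power state changed by ACPI to D0",
    "[ 120.586931] PM: noirq resume of devices complete after 160.133 msecs",
]

gap, before, after = largest_gap(sample)
print(f"{gap:.3f}s between {before!r} and {after!r}")
```

On these lines the largest jump, a little under a second, sits between the S3 wake-up message and the first PCI power-state change. That is before the i915 lines appear, so on its own it says nothing about the display.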
Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Thu, May 16, 2013 at 9:24 AM, Ben Guthro wrote:
> On Wed, May 15, 2013 at 4:42 PM, Ben Guthro wrote:
>> [...]
>> That said, I don't really have any clues as to why the display is
>> black after resuming from S3

It appears that S3 is not necessary.

I can reproduce the black screen with just vbetool:
vbetool dpms off
vbetool dpms on

Does this suggest a BIOS issue?

Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Fri, May 17, 2013 at 7:52 AM, Ben Guthro wrote:
> [...]
> It appears that S3 is not necessary.
>
> I can reproduce the black screen with just vbetool:
> vbetool dpms off
> vbetool dpms on
>
> Does this suggest a bios issue?

This can be reliably reproduced on this machine, and worked around by
saving the vbestate and restoring it after the fact:

(in a working state)
vbetool vbestate save > vbe.save

Break the system:
vbetool dpms off
vbetool dpms on

The following brings the screen back, but in a low-resolution corner of X:
vbetool vbestate restore < vbe.save

We can then get the full resolution back with the following:
xrandr --output eDP1 --off
xrandr --output eDP1 --auto

This is clearly not an ideal solution to build a product on.

Does this point to a BIOS issue?
Is anyone out there?

Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Fri, May 17, 2013 at 09:26:18AM -0400, Ben Guthro wrote:
> [...]
> This can be reliably reproduced on this machine, and worked around by
> saving the vbestate, and restoring it after the fact:
>
> (in a working state)
> vbetool vbestate save > vbe.save
>
> break the system:
> vbetool dpms off
> vbetool dpms on

This will break KMS, since now you have the VBIOS and the Linux KMS driver
fighting over the same piece of hw. Does

xset dpms force off
xset dpms force on

cause similar issues?

If not, please make sure that vbetool isn't badly interfering with the
kernel modeset driver on suspend/resume. At least looking at your dmesg
and reg dumps, vbe wreaking havoc with the KMS driver seems like a rather
likely scenario. Also, can you please test the latest 3.10-rc kernels?

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Tue, May 21, 2013 at 4:00 AM, Daniel Vetter wrote:
> [...]
> This will break kms since now you have the vbios and the linux kms driver
> fighting over the same piece of hw. Does
>
> xset dpms force off
> xset dpms force on
>
> cause similar issues?

No, these work as expected (on 3.8).
I didn't realize that the vbetool commands broke with KMS. I'll stick with
the S3 reproduction.

> If not please make sure that vbetool isn't badly interfering with the
> kernel modeset driver on suspend/resume. At least looking at your dmesg
> and reg dumps vbe wreaking havoc with the kms driver seems like a rather
> likely scenario. Also, can you please test latest 3.10-rc kernels?

3.10-rc2 doesn't seem to work at all - it boots to a black screen every time.

Ben

Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Tue, May 21, 2013 at 9:44 AM, Ben Guthro wrote:
> On Tue, May 21, 2013 at 4:00 AM, Daniel Vetter wrote:
>> [...]
>> If not please make sure that vbetool isn't badly interfering with the
>> kernel modeset driver on suspend/resume. At least looking at your dmesg
>> and reg dumps vbe wreaking havoc with the kms driver seems like a rather
>> likely scenario.

vbetool was not installed on the system when I took those S3 dumps, so
it seems unlikely to be the root of the problem, IMO.

Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Tue, May 21, 2013 at 3:44 PM, Ben Guthro wrote:
>> This will break kms since now you have the vbios and the linux kms driver
>> fighting over the same piece of hw. Does
>>
>> xset dpms force off
>> xset dpms force on
>>
>> cause similar issues?
>
> No, these work as expected (on 3.8)
> I didn't realize that these broke with KMS. I'll stick with the S3
> reproduction.

Ok, so things are at least not terribly broken.

>> If not please make sure that vbetool isn't badly interfering with the
>> kernel modeset driver on suspend/resume. At least looking at your dmesg
>> and reg dumps vbe wreaking havoc with the kms driver seems like a rather
>> likely scenario. Also, can you please test latest 3.10-rc kernels?
>
> 3.10-rc2 doesn't seem to work at all - it boots to a black screen every time.

That otoh is ugly. It could be, though, that this is the same bug as (or a
similar bug to) your resume issue - in the last few kernel releases
we've tried very hard to unify the code between initial driver load at
boot-up and resume.

So can you please try to bisect where the boot-up regression was
introduced between 3.8 and 3.10-rc2?

Thanks, Daniel

Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Tue, May 21, 2013 at 10:02 AM, Daniel Vetter wrote:
> [...]
>> 3.10-rc2 doesn't seem to work at all - it boots to a black screen every time.
>
> That otoh is ugly. Could be that though that this is the same (or a
> similar bug) to your resume issue - in the last few kernel releases
> we've tried very hard to unify the code between initial driver load at
> boot-up and resume.

Perhaps I should qualify "at all".

It seems that it fails somewhat late in the boot process. If I remove
the "boot splash" cli params, I can see it transition into the high-res
mode, and seemingly get into init.
However, even if I boot to single-user mode, the screen goes black.

Unfortunately, both times I tried to test this and then reboot, I
ended up at a "grub rescue" prompt, with an unusable system.

> So can you please try to bisect where the boot-up regression has been
> introduced between 3.8 and 3.10-rc2?

I'm not sure I'll be able to do this.
With the failure condition I describe above, I am unable to even ssh
into this machine to debug, never mind install a new kernel.
This means I need to generate a new kernel, and an install kit with that
kernel, for every bisection test.

This may be more time than I am able to dedicate to this problem - but I'll try.

Ben

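To put the cost in perspective: bisection needs roughly log2(N) build-and-test cycles for N candidate commits, and it only works if each test gives a reliable good/bad answer. Below is a sketch of the search itself. The commit count and the position of the first bad commit are made up for illustration; neither comes from the real history between 3.8 and 3.10-rc2.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in an ordered list.

    Assumes the last commit is bad and that every commit after the first
    bad one is bad too. Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one kernel build + install + boot per test
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo], tests


# Hypothetical numbers, for illustration only.
commits = list(range(20000))
first_bad = 13337

found, tests = bisect_first_bad(commits, lambda c: c >= first_bad)
print(found, tests)
```

Even with tens of thousands of commits the search stays within about fifteen tests, so the real cost is the per-test overhead of building a kernel and an install kit each time.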
Re: [Intel-gfx] debugging Haswell eDP black screen after S3
On Tue, May 21, 2013 at 1:28 PM, Ben Guthro wrote: > On Tue, May 21, 2013 at 10:02 AM, Daniel Vetter wrote: >> On Tue, May 21, 2013 at 3:44 PM, Ben Guthro wrote: This will break kms since now you have the vbios and the linux kms driver fighting over the same piece of hw. Does xset dpms force off xset dpms force on cause similar issues? >>> >>> No, these work as expected (on 3.8) >>> I didn't realize that these broke with KMS. I'll stick with the S3 >>> reproduction. >> >> Ok, so things are at least not terribly broken. >> If not please make sure that vbetool isn't badly interfering with the kernel modeset driver on suspend/resume. At least looking at your dmesg and reg dumps vbe wreaking havoc with the kms driver seems like a rather likely scenario. Also, can you please test latest 3.10-rc kernels? >>> >>> 3.10-rc2 doesn't seem to work at all - it boots to a black screen every >>> time. >> >> That otoh is ugly. Could be that though that this is the same (or a >> similar bug) to your resume issue - in the last few kernel releases >> we've tried very hard to unify the code between initial driver load at >> boot-up and resume. > > Perhaps I should qualify "at all" > > It seems that it fails somewhat late in the boot process. If I remove > the "boot splash" cli params, I can see it transition into the high > res mode, and seemingly get into init. > However, even if I boot to single user mode, the screen goes black. > > Unfortunately, both times I tried to test this, and then reboot, I > ended up at a "grub rescue" prompt, with an unusable system. > >> >> So can you please try to bisect where the boot-up regression has been >> introduced between 3.8 and 3.10-rc2? > > I'm not sure I'll be able to do this. > With the failure condition I describe above, I am unable to even ssh > into this machine to debug, nevermind install a new kernel. > This means I need to generate a new kernel, and install kit with that > kernel for every bisection test. 
>
> This may be more time than I am able to dedicate to this problem - but I'll
> try.
>
> Ben

It appears I did not CC the list on my last 2 replies.
My apologies - I'll re-paste them below.

I tried to bisect this, but was unsuccessful, in that I didn't seem to have a
reproducible test case to get back into this failure condition. It seemed that
it always would succeed for me...which of course makes bisecting near
impossible.

I tried updating to 3.10-RC3...well, actually to this changeset at the tip of
Linus' tree:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=58f8bbd2e39c3732c55698494338ee19a92c53a0

I can get X to come up now on this machine - albeit very slowly.
Once it comes up, it seems to hang, and respawn

I get a lot of these in the log now, as well:
[ 392.195734] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

Things in the log that look suspicious to me are:
[ 34.293452] [drm:intel_pipe_set_base] *ERROR* pin & fence failed
[ 34.293486] [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

I get the following errors in the X log, that prevent it from coming up:
[76.142] (EE) intel(0): failed to set mode: No space left on device
[76.142] Fatal server error:
[76.142] AddScreen/ScreenInit failed for driver 0
[76.142]
[76.142] (EE)

Xorg also crashes in the following manner:
[ 218.876] (EE) Backtrace:
[ 218.880] (EE) 0: X (xorg_backtrace+0x34) [0x7fe44fff9754]
[ 218.880] (EE) 1: X (0x7fe44fe44000+0x1b96a9) [0x7fe44fffd6a9]
[ 218.880] (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fe44f16a000+0xfcb0) [0x7fe44f179cb0]
[ 218.880] (EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (0x7fe44ddcf000+0x148c6b) [0x7fe44df17c6b]
[ 218.880] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0x17c36) [0x7fe44cb71c36]
[ 218.880] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0x19857) [0x7fe44cb73857]
[ 218.880] (EE) 6: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0xed429) [0x7fe44cc47429]
[ 218.880] (EE) 7: X (0x7fe44fe44000+0x13e8ac) [0x7fe44ff828ac]
[ 218.880] (EE) 8: X (0x7fe44fe44000+0x5239e) [0x7fe44fe9639e]
[ 218.880] (EE) 9: X (0x7fe44fe44000+0x557a1) [0x7fe44fe997a1]
[ 218.880] (EE) 10: X (0x7fe44fe44000+0x4415a) [0x7fe44fe8815a]
[ 218.880] (EE) 11: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xed) [0x7fe44ddf076d]
[ 218.880] (EE) 12: X (0x7fe44fe44000+0x444b1) [0x7fe44fe884b1]
[ 218.880] (EE)
[ 218.880] (EE) Bus error at address 0x7fe44a6c9080
[ 218.880] Fatal server error:
[ 218.881] Caught signal 7 (Bus error). Server aborting
[ 218.881]
[ 218.881] (EE)

I recognize that this isn't terribly helpful without the symbol resolution.
I tried installing debug symbols, but they didn't seem to help.

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freed
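
Regarding the unresolved frames in the Xorg backtrace: each one is printed as a module path, a load base plus an offset, and the resulting address. The following is only a rough sketch, not anything from the thread, that pulls the per-module offsets out of such lines so they could be handed to an offset-based resolver (for example binutils addr2line, assuming debug symbols matching the installed binaries are available). The names FRAME_RE and parse_frames are made up for illustration, and the sample lines are copied from the log above.

```python
import re

# Matches backtrace frames of the form
#   "4: /path/to/module.so (0xBASE+0xOFFSET) [0xADDRESS]"
# Frames already printed as "symbol+0xNN" are deliberately not matched.
FRAME_RE = re.compile(
    r"(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\(0x(?P<base>[0-9a-f]+)\+0x(?P<offset>[0-9a-f]+)\)\s+"
    r"\[0x(?P<address>[0-9a-f]+)\]"
)


def parse_frames(log_text):
    """Return one dict per unresolved frame found in log_text."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        base = int(m.group("base"), 16)
        offset = int(m.group("offset"), 16)
        address = int(m.group("address"), 16)
        frames.append({
            "index": int(m.group("index")),
            "module": m.group("module"),
            "offset": offset,
            # Sanity check: base + offset should equal the printed address.
            "consistent": base + offset == address,
        })
    return frames


# Three lines copied from the backtrace in this mail.
SAMPLE = """
[ 218.880] (EE) 0: X (xorg_backtrace+0x34) [0x7fe44fff9754]
[ 218.880] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0x17c36) [0x7fe44cb71c36]
[ 218.880] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0x19857) [0x7fe44cb73857]
"""

for frame in parse_frames(SAMPLE):
    print(f'{frame["index"]}: {frame["module"]} +{frame["offset"]:#x}')
```

The offset, rather than the absolute address, is the useful number here, since the load base changes from run to run while the offset into intel_drv.so stays the same for a given build.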