[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
This patch was picked up as part of an upstream stable release. This commit first appeared in 3.0.0-13.21.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/820746

Title:
  Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
Patch for this bug that also addresses other chipsets affected on resume.

** Patch added: Fix for 820746
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2636981/+files/video_820746.patch
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@a-r-karthick, I tried to apply that patch, but it seems to be corrupt. Can you attach the patch email to this bug exactly as you received it? I'll then see if I can apply that. Thanks.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@a-r-karthick, I don't see this patch in Linus' tree. Has this been submitted upstream?

** Changed in: linux (Ubuntu)
       Status: Confirmed => Incomplete
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@brad-figg Yes, we did mail lkml and dri-devel. They acked it back then and proposed to resolve it by simply resetting the ring buffer write index to 0 on resume, to be safe. But I'm not sure it got submitted upstream, as the reply asked for confirmation. I am sure the patch would have worked, but I had only debugged the issue, which @mynk reproduced on his hardware, as it is specific to his setup. Since I didn't have access to the hardware and couldn't reproduce it locally, I just debugged it with the objdump. Maybe you should pitch for it upstream if it hasn't been merged. Here is the mail that was sent back in response to our submission:

From c564bc8e6d449216d74ee134d5bf470221f79e8d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Thu, 8 Sep 2011 11:09:39 +0200
Subject: [PATCH] drm/radeon: Don't read from CP ring write pointer registers.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The patch below is what I had in mind. Does this fix the problem above?

Apparently this doesn't always work reliably, e.g. at resume time.

Just initialize to 0, so the ring is considered empty.

Tested with hibernation on Sumo and Cayman cards.

Should fix https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/ .

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 drivers/gpu/drm/radeon/evergreen.c |    4 ++--
 drivers/gpu/drm/radeon/ni.c        |   12 ++++++------
 drivers/gpu/drm/radeon/r100.c      |    6 ++----
 drivers/gpu/drm/radeon/r600.c      |    4 ++--
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 15bd047..f2bd90a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1378,7 +1378,8 @@ int evergreen_cp_resume(struct radeon_device *rdev)
 	/* Initialize the ring buffer's read and write pointers */
 	WREG32(CP_RB_CNTL, tmp | RB_RPTR_WR_ENA);
 	WREG32(CP_RB_RPTR_WR, 0);
-	WREG32(CP_RB_WPTR, 0);
+	rdev->cp.wptr = 0;
+	WREG32(CP_RB_WPTR, rdev->cp.wptr);
 
 	/* set the wb address wether it's enabled or not */
 	WREG32(CP_RB_RPTR_ADDR,
@@ -1403,7 +1404,6 @@ int evergreen_cp_resume(struct radeon_device *rdev)
 	WREG32(CP_DEBUG, (1 << 27) | (1 << 28));
 
 	rdev->cp.rptr = RREG32(CP_RB_RPTR);
-	rdev->cp.wptr = RREG32(CP_RB_WPTR);
 
 	evergreen_cp_start(rdev);
 	rdev->cp.ready = true;
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 559dbd4..e3489ee 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1182,7 +1182,8 @@ int cayman_cp_resume(struct radeon_device *rdev)
 
 	/* Initialize the ring buffer's read and write pointers */
 	WREG32(CP_RB0_CNTL, tmp | RB_RPTR_WR_ENA);
-	WREG32(CP_RB0_WPTR, 0);
+	rdev->cp.wptr = 0;
+	WREG32(CP_RB0_WPTR, rdev->cp.wptr);
 
 	/* set the wb address wether it's enabled or not */
 	WREG32(CP_RB0_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) & 0xFFFFFFFC);
@@ -1202,7 +1203,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
 	WREG32(CP_RB0_BASE, rdev->cp.gpu_addr >> 8);
 
 	rdev->cp.rptr = RREG32(CP_RB0_RPTR);
-	rdev->cp.wptr = RREG32(CP_RB0_WPTR);
 
 	/* ring1 - compute only */
 	/* Set ring buffer size */
@@ -1215,7 +1215,8 @@ int cayman_cp_resume(struct radeon_device *rdev)
 
 	/* Initialize the ring buffer's read and write pointers */
 	WREG32(CP_RB1_CNTL, tmp | RB_RPTR_WR_ENA);
-	WREG32(CP_RB1_WPTR, 0);
+	rdev->cp1.wptr = 0;
+	WREG32(CP_RB1_WPTR, rdev->cp1.wptr);
 
 	/* set the wb address wether it's enabled or not */
 	WREG32(CP_RB1_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP1_RPTR_OFFSET) & 0xFFFFFFFC);
@@ -1227,7 +1228,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
 	WREG32(CP_RB1_BASE, rdev->cp1.gpu_addr >> 8);
 
 	rdev->cp1.rptr = RREG32(CP_RB1_RPTR);
-	rdev->cp1.wptr = RREG32(CP_RB1_WPTR);
 
 	/* ring2 - compute only */
 	/* Set ring buffer size */
@@ -1240,7 +1240,8 @@ int cayman_cp_resume(struct radeon_device *rdev)
 
 	/* Initialize the ring buffer's read and write pointers */
 	WREG32(CP_RB2_CNTL, tmp | RB_RPTR_WR_ENA);
-	WREG32(CP_RB2_WPTR, 0);
+	rdev->cp2.wptr = 0;
+	WREG32(CP_RB2_WPTR, rdev->cp2.wptr);
 
 	/* set the wb address wether it's enabled or not */
 	WREG32(CP_RB2_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP2_RPTR_OFFSET) & 0xFFFFFFFC);
@@ -1252,7 +1253,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
 	WREG32(CP_RB2_BASE, rdev->cp2.gpu_addr >> 8);
 
 	rdev->cp2.rptr = RREG32(CP_RB2_RPTR);
-	rdev->cp2.wptr = RREG32(CP_RB2_WPTR);
 
 	/* start the rings */
 	cayman_cp_start(rdev);
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index f2204cb..11e44a3 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -990,7
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
** Tags added: patch
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@a-r-karthick, I'd be happy to build some kernels with that patch applied if someone is willing to test them. I'll do it first thing tomorrow.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
I have changed my setup, and I don't remember seeing this issue on the linux 3.0.x kernel. I will need to check whether I can reproduce the problem. If I can, I am willing to test the patched kernels.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
** Changed in: linux (Ubuntu)
       Status: New => Confirmed
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@a-r-karthick: the patch works fine. I saw some errors in r600 that you might want to check; attaching the log.

** Attachment added: kern.log
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2260727/+files/kern.log
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@Mynk: Thanks for your efforts in testing the patch. I know you spent sleepless nights getting this verified amidst other preemptions.

The error log on resume is a different issue altogether. It comes from the fact that radeon_pcie_gart_enable never really happened (the GART is the graphics address remapping table, an IOMMU-like translation table for PCI Express devices). So on device startup, GPU acceleration was disabled for your hardware, which also releases the GART table (radeon_gart_table_vram_free is invoked on GART finalize, which frees the VRAM object):

Aug 8 05:34:29 mayankr-T400 kernel: [ 10.481258] [drm:radeon_ring_write] *ERROR* radeon: writting more dword to ring than expected !
Aug 8 05:34:29 mayankr-T400 kernel: [ 10.626140] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0x)
Aug 8 05:34:29 mayankr-T400 kernel: [ 10.626146] radeon :01:00.0: disabling GPU acceleration

However, the radeon driver doesn't record this when it disables GPU acceleration in r600_startup (no flag is set). So during resume, when it tries to re-enable the GART table, it finds an empty VRAM object in r600_pcie_gart_enable and fails the resume. But since your ring and GPU were initialized anyway and suspend doesn't touch them, you were not impacted and were just left with a "Resume failed" message.

I think this happens because your hardware returns invalid (~0U) values for the ring buffer write index during initialization, which we now AND with the ring buffer pointer mask (ptr_mask). That effectively leaves us with a single entry of write space at the tail of the ring, i.e. too little ring buffer write space for GPU acceleration to be enabled. So I guess our quirk for your hardware is letting you _live_ with broken/crazy hardware :) Otherwise you would be Oopsing as before.

If you want to enable GPU acceleration, maybe we could retry a finite number of times on receiving invalid ring buffer write index values, as in my original patch, in the expectation that a subsequent retry works. But I guess it's not worth it, and it makes sense to live without GPU acceleration on your seemingly broken graphics chipset.

I also believe we could set a flag in rdev->flags, like rdev->flags |= RADEON_IS_GART_DISABLED, and check this flag when pcie_gart_enable fails on resume. We would then clear it with rdev->flags &= ~RADEON_IS_GART_DISABLED and continue the resume instead of failing it, since the GART VRAM object was freed during r600_startup when acceleration was disabled. But I don't think it's a big deal, and we can treat it as benign for now for the reasons mentioned above.

So, to cut it short, let's now pull the trigger and get the patch pushed upstream :)
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
Adding the objdump referred to in the update. Will try the suggestions and update.

** Attachment added: objdump -d -j .text radeon.ko radeon.ko.out
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2255879/+files/radeon.ko.out
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
Thanks Karthick,

Your suggestion works:

T400:~/linux-2.6.38/drivers/gpu/drm/radeon# diff r600.c.orig r600.c
2221a2222,2234
> 	/*
> 	 * Re-read the read and write if the value returned isn't sane. before calling r600_cp_start
> 	 */
> 	do {
> 		rdev->cp.rptr = RREG32(CP_RB_RPTR);
> 		mdelay(15);
> 	} while((int)rdev->cp.rptr < 0);
> 
> 	do {
> 		rdev->cp.wptr = RREG32(CP_RB_WPTR);
> 		mdelay(15);
> 	} while( (int)rdev->cp.wptr < 0 );
> 

If I boot without this fix I run into the Oops. With the patched module it works fine.

If a patch would help, kindly let me know what steps to use to produce it (which directory to run diff from) and I can upload it. I hope this fix is reviewed and makes it into the next release.

Thanks,
Mayank
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
Great news! Attach your r600.c and I can make a proper patch, which you can retest before we pitch for it to be included in the next release.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
Attaching the r600.c as requested.

** Attachment added: r600.c
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2256717/+files/r600.c
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
a-r-karthick: thanks for your analysis. This invalid register read looks very similar to one that was already worked around in the r100_cp_init function. Take a look at this change:

commit 9e5786bd14cb9ffe29ebe66d41cedf03311b0d30
Author: Dave Airlie airl...@redhat.com
Date:   Wed Mar 31 13:38:56 2010 +1000

    drm/radeon/kms: add sanity check to wptr.

    If we resume in a bad way, we'll get 0xffffffff in wptr, and then oops
    with no console. This just adds a sanity check so that we can avoid the
    oops and hopefully get more details out of people's systems.

    Signed-off-by: Dave Airlie airl...@redhat.com

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 138ddd4..c8f4b03 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -744,6 +744,8 @@ int r100_cp_init(struct radeon_device *rdev, unsigned ring_size)
 	udelay(10);
 	rdev->cp.rptr = RREG32(RADEON_CP_RB_RPTR);
 	rdev->cp.wptr = RREG32(RADEON_CP_RB_WPTR);
+	/* protect against crazy HW on resume */
+	rdev->cp.wptr &= rdev->cp.ptr_mask;
 	/* Set cp mode to bus mastering & enable cp*/
 	WREG32(RADEON_CP_CSQ_MODE,
 	       REG_SET(RADEON_INDIRECT2_START, indirect2_start) |

a-r-karthick: Can you raise this issue upstream, on the dri-de...@lists.freedesktop.org mailing list? You can just test and send a similar patch for review, also using ptr_mask.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
I mean, r600_cp_resume also seems to need the same workaround that is already present in r100_cp_init.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
And this isn't a regression: the update from 2.6.38-8.42 doesn't touch this area. It seems the issue happens by luck; some code shuffle or something else (timing?) may have made the issue more likely to happen on the newer kernel.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@Herton: Good point about the similar fix in r100.c, which I wasn't aware of. I didn't even reproduce this myself; @mynk (reporter and buddy) reproduced it, as I don't have hardware with a radeon chipset :) I debugged this with the objdump disassembly and the Oops information. It seems to exactly match the fix and the comment in r100.c that corrects the write index.

Masking the write pointer with the ring buffer ptr_mask, as in r100.c, makes sense on resume even if it doesn't yield the exact write index (it would write to the last entry before rolling back to 0 in the ring buffer), whereas I was trying to re-read with a delay. The fact that the retry with mdelay worked for him on resume implies that it was indeed fetching the right values on a re-read. But my patch was causing a boot-time lockup because it did the same thing for the ring buffer read index, and it seems that the read index returned by the hardware is always uninitialized (~0U) during boot. So maybe a mask for the read index makes sense too, but the fact that it's acceptable to issue a read to the iommu at an invalid offset before it's corrected in the next read pass, which patches it with the ring ptr_mask, suggests that it's not a deal breaker. So let's play it safe and use the same fix in r600.c as in r100.c.

@Mayank: Please revert the last patch and apply the patch below on top of your original un-patched r600.c (also attached).

@Herton: Once Mayank re-tests with the patch and confirms that it works, I can push for it upstream, quoting this bug as the reference. I am surprised that the bug still exists in 3.0 as well. And I don't believe it is a regression. It could be that we were plain lucky with resume before, since this comes down to the hardware returning an invalid index on resume.

--- r600_orig.c	2011-08-05 13:39:25.833427436 -0700
+++ r600.c	2011-08-05 14:50:32.037670946 -0700
@@ -2218,6 +2218,8 @@
 
 	rdev->cp.rptr = RREG32(CP_RB_RPTR);
 	rdev->cp.wptr = RREG32(CP_RB_WPTR);
+	/* protect against crazy HW on resume */
+	rdev->cp.wptr &= rdev->cp.ptr_mask;
 
 	r600_cp_start(rdev);
 	rdev->cp.ready = true;

** Patch added: Patch for kernel panic in resume in r600 radeon driver module
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2257107/+files/r600.c.resume.patch
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
@a-r-karthick: yes, once you get it tested, please submit it upstream. Thank you.
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
** Tags added: regression-update
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
I'm a curious guy who doesn't know much about radeon/video drivers, but with a can-debug approach I'm taking a stab at this issue based on the radeon.ko objdump disassembly provided by the bug reporter, who happens to be my friend. (Please try the recommendation at the end of this report, in r600_cp_resume in r600.c, to see if it resolves the Oops.)

The kernel panic is the result of an invalid ring write pointer being used while writing a value to the radeon ring buffer. The write pointer read from the radeon control register (r100_mm_rreg function in radeon.h) returns an incorrect (seemingly negative) value on RESUME. It looks like we may have to add a retry in r600_cp_resume to make it work. RCA enclosed below:

--

Mapping the Oops to the disassembly, it's clear that the kernel panic was triggered by the ring store in this function:

static inline void radeon_ring_write(struct radeon_device *rdev, uint32_t v)
{
#if DRM_DEBUG_CODE
	if (rdev->cp.count_dw <= 0) {
		DRM_ERROR("radeon: writting more dword to ring than expected !\n");
	}
#endif
	rdev->cp.ring[rdev->cp.wptr++] = v;	/* PANICs here, as rdev->cp.wptr seems to be negative */
	rdev->cp.wptr &= rdev->cp.ptr_mask;
	rdev->cp.count_dw--;
	rdev->cp.ring_free_dw--;
}

Let's first map the above to the EIP. The EIP at the time of the kernel panic was r600_cp_start+0x48. From the objdump disassembly, it maps to:

r600_cp_start: (0x709e8):
   709e8: c7 02 00 44 05 c0       movl   $0xc0054400,(%edx)

It also exactly matches the Oops hex dump:

Aug 4 09:25:20 mayankr-T400 kernel: [ 10.356006] Code: c6 0f 85 fd 02 00 00 8b bb a4 07 00 00 85 ff 0f 8e d6 02 00 00 8b 83 94 07 00 00 8d 14 85 00 00 00 00 83 c0 01 03 93 8c 07 00 00 <c7> 02 00 44 05 c0 8b 93 a4 07 00 00 23 83 b4 07 00 00 83 ab a0

The byte in angle brackets, <c7>, marks the start of the faulting instruction, and the bytes that follow match the EIP above (c7 02 00 44 05 c0). If you reverse engineer the code to the disassembly, the panic EIP is evident.

From the objdump:

- EBX holds the radeon_device pointer *rdev.
- The remaining ring buffer count, cp.count_dw, is 16 and is held in register EDI.
- The write pointer index for the radeon ring buffer, rdev->cp.wptr, is stored in EAX. From the panic this value is shown as 0, as EAX holds 0.
- The ring buffer pointer that triggered the Oops is in EDX, as also seen from the target of the store above. The EDX value from the Oops is 0xfa501ffc, and this is also the address that took the page fault, as seen from the Oops:

Aug 4 09:25:20 mayankr-T400 kernel: [ 10.354151] BUG: unable to handle kernel paging request at fa501ffc

   movl $0xc0054400, (%edx)

The C call for the above store is radeon_ring_write(rdev, PACKET3(PACKET3_ME_INITIALIZE, 5)); from r600_cp_start. The PACKET3(PACKET3_ME_INITIALIZE, 5) macro evaluates to 0xc0054400.

So now we are dead sure that we had an incorrect radeon ring write pointer read from the register in r600_cp_resume, before calling r600_cp_start:

	rdev->cp.rptr = RREG32(CP_RB_RPTR);
	rdev->cp.wptr = RREG32(CP_RB_WPTR);

Now, from the assembly, the value of the write pointer is stored in EAX at the time of the panic. Taking a few instructions above the faulting instruction:

   709d2: 8b 83 94 07 00 00       mov    0x794(%ebx),%eax
   709d8: 8d 14 85 00 00 00 00    lea    0x0(,%eax,4),%edx
   709df: 83 c0 01                add    $0x1,%eax
   709e2: 03 93 8c 07 00 00       add    0x78c(%ebx),%edx
   709e8: c7 02 00 44 05 c0       movl   $0xc0054400,(%edx)

Offset 0x794 from EBX (the rdev pointer) is rdev->cp.wptr, the write index for the ring buffer. We can see this being moved to EAX. The indexed absolute address is stored in EDX (the address that took the page fault, as mentioned above).

Now we can see that add $0x1,%eax happens BEFORE the movl, the faulting instruction. In other words, this is the rdev->cp.wptr++ after the use of EAX. Unless we missed a speculative execution, there is no chance that this increment was not executed. So for all intents and purposes, EAX cannot have held 0 when it was read, since it was incremented before the fault. This implies that EAX was negative on reading from the radeon register on a resume. As a result, we indexed an invalid location in the ring buffer (rdev->cp.ring), which expectedly triggered the kernel panic.

I am not sure whether we have to re-read the read and write pointers on resume if the value returned from the radeon register is negative. You can try the following changes to see if they work.

In r600_cp_resume, before calling r600_cp_start (r600.c):

	/*
	 * Re-read the read and write if the value returned isn't sane. before calling r600_cp_start
	 */
	do {
		rdev->cp.rptr = RREG32(CP_RB_RPTR);
		mdelay(15);
	} while((int)rdev->cp.rptr < 0);

	do {
		rdev->cp.wptr = RREG32(CP_RB_WPTR);
		mdelay(15);
	} while( (int)rdev->cp.wptr < 0 );

	r600_cp_start(rdev);

If the above still triggers
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
Just to avoid confusion, since rdev->cp.wptr is unsigned: by "negative" I meant that it was ~0, i.e. 0xFFFFFFFFU, when read in r600_cp_resume. The increment (add $0x1,%eax) before the Oops then wrapped it back to 0, which is why EAX, the ring buffer write pointer index, was shown as 0 at the time of the panic. Since the increment happened before the fault, the value had to be ~0 (0xFFFFFFFFU) on resume, which is an invalid write pointer value for the radeon ring buffer.

So we retry until we get a sane value for the read and write pointers of the radeon ring buffer on RESUME in r600_cp_resume. I have a strong hunch that this would fix the panic, as it looks like a timing issue with reading registers on resume (maybe the device isn't ready yet when the resume path fetches the read and write indexes).
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
** Attachment added: Relevant log under kern.log
   https://bugs.launchpad.net/bugs/820746/+attachment/2253248/+files/Aug4_kern.log
[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon
OK, I see that the lspci details are attached. I am able to reproduce this problem at will. The laptop won't boot with the latest kernel; I have to boot from the previous kernel. Kindly note that the bug report was taken when booted from the older version of the kernel.

Hope this helps,
Mayank