https://bugzilla.kernel.org/show_bug.cgi?id=16140
--- Comment #33 from Florian Mickler <flor...@mickler.org> 2010-10-10 23:15:54 ---
Hi!

I did find a rv280 card. On that card, the screen is garbled after resume, but
the ring test doesn't fail. It is using the same code-paths as far as I see.

So we can probably conclude:
1. garbled screen and the ring setup failure are independent failures.
2. the ring setup failure is something specific to your card or chipset.

Do you see differences in lspci -vv output before and after suspend?

(In reply to comment #32)
> (Patch see https://bugzilla.kernel.org/show_bug.cgi?id=16140#c26)
>
> With applying my patch from above, it's this section (Line #1028 and
> following)
> from r100_cp_init() doing the problem:
>
>  940 int r100_cp_init(struct radeon_device *rdev, unsigned ring_size)
> ...
> 1026         radeon_ring_start(rdev);
> 1027         r = radeon_ring_test(rdev);
> 1028         if (r) {
> 1029                 DRM_ERROR("radeon: cp isn't working (%d).\n", r);
> 1030                 return r;
> 1031         }
> 1032         rdev->cp.ready = true;
> 1033         return 0;
> 1034 }
> ...
>
> Replacing "if (r) {" with "if (WARN_ON(r)) {" shows the above Call-trace.

Yes. This is also seen by the "radeon: cp isn't working (-22)." line in your
dmesg. But of course the callstack is handy to verify we are looking at the
right code. I didn't put a WARN there, because we already knew it failed. I
wondered if some tests without error-messages failed and put the WARNs there.
But in retrospect we would have seen that, because the above error message
would not have been preceded by the ring-test error message.

> I looked into r600.c source-code and put "rdev->cp.ready = true;" before Line
> "r = radeon_ring_test(rdev);", not helping.

If you are interested how the driver works, have a look at
http://www.botchco.com/agd5f/?p=50

The "ring" is a buffer where the driver writes commands and the gpu reads
those commands and executes them. It's a ring buffer.
http://en.wikipedia.org/wiki/Circular_buffer

If you set cp.ready and the hardware isn't really ready, that won't help.

The ring test works like so:
The driver writes a value (0xCAFEDEAD) into the scratch-register and instructs
the gpu via the ringbuffer to overwrite it with "0xDEADBEEF". Then the driver
checks whether the gpu did so. If after N udelay(1) calls the gpu has not
written the expected value into that register, the test fails.

But of course, we are left to wonder as to why.

> Again inspired from r600.c, I put Line #966 "r100_cp_load_microcode(rdev);"
> after "r = radeon_ring_init(rdev, ring_size);", this resulted in a
> not-so-garbled screen, after hanging:
> pm-resume in X -> switching to vt-1 -> killing X -> restarting startx

That's interesting. Can you elaborate on the hanging?

>
> This is doing no harm, see my logs.
> - DRM_ERROR("radeon: ring test failed (sracth(0x%04X)=0x%08X)\n",
> + DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X)\n",

True, but it is inconvenient. If you 'grep -r' on that error message you only
get the r100 one. With the typo corrected, you get both, the r100 and the r600
one. I agree, not a big deal, but...

> I am not sure what you mean with "radeon driver": the one in the kernel or the
> DDX (xf86-video-ati).

Always the kernel one, at the moment.

>
> One NOTE:
> In Line #3728 there is a commented "r100_gpu_init(rdev);", it is nowhere
> "defined". I see in r600.c a *_gpu_init() and a *_cp_start() in case of
> resuming. Just a hint, if you wanna compare or dig into it.
>

Yes. I wondered about that too. 'git-blame' shows it is a leftover from:

commit 90aca4d2740255bd130ea71a91530b9920c70abe
Author: Jerome Glisse <jgli...@redhat.com>
Date:   Tue Mar 9 14:45:12 2010 +0000

    drm/radeon/kms: simplify & improve GPU reset V2
    ...

> IIRC it would make sense to interprete correctly the Call-trace, I am not that
> familiar with "the internals".

The call-trace is not complicated.
The topmost function is the function that is currently executing. The second
entry is the function it will return to. The third entry is the function the
second function will return to, and so on.
See: http://en.wikipedia.org/wiki/Call_stack

I don't know about items 1 to 3 in that trace. But I guess they are just
artifacts of the WARN_ON macro.

If you look into the code, you see that the call trace is to be expected. What
has to be considered bad is that the ring-test fails because the gpu doesn't
process the ringbuffer in time.

In comment #12 you said that turning off agp would fix the suspend issue.
Which one was that? The ring-test error message, the garbled screen, or both?

In my setup (rv280) it only worked once out of ten times. The first time, it
came back without a garbled screen, but all subsequent suspend/resumes did
garble the screen.

On that screen garble I have a few thoughts. It is somewhat periodic and
always follows a pattern for me. I can clear the corruption by changing
consoles, for example. Then it always scribbles in a predetermined pattern on
the framebuffer, where it stays (overwriting itself with a high frequency)
till I change consoles. Same for you?
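
For reference, the ring test described earlier in this message (write
0xCAFEDEAD to the scratch register, ask the gpu through the ring to overwrite
it with 0xDEADBEEF, then poll) can be sketched as a small user-space
simulation. This is not the radeon driver code: struct fake_gpu, RING_SIZE,
the latency field and all fake_* functions are hypothetical stand-ins made up
for illustration, and one loop iteration stands in for one udelay(1).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SCRATCH_INIT   0xCAFEDEADu  /* value the driver writes first */
#define SCRATCH_EXPECT 0xDEADBEEFu  /* value the gpu is asked to write */
#define RING_SIZE      8u           /* hypothetical size, power of two */

/* Hypothetical model of the hardware, not a real driver structure. */
struct fake_gpu {
    uint32_t scratch;          /* the scratch register */
    uint32_t ring[RING_SIZE];  /* ring buffer shared by driver and gpu */
    unsigned wptr;             /* driver's write index */
    unsigned rptr;             /* gpu's read index */
    bool     cp_running;       /* false models a CP that never starts */
    unsigned latency;          /* polls before the gpu consumes a command */
};

/* Driver side: queue one "write this value to scratch" command. */
static void fake_ring_write(struct fake_gpu *gpu, uint32_t value)
{
    assert(gpu->wptr < RING_SIZE);
    gpu->ring[gpu->wptr] = value;
    gpu->wptr = (gpu->wptr + 1) & (RING_SIZE - 1);  /* wrap around */
}

/* Hardware side: the progress the gpu makes during one delay step. */
static void fake_gpu_tick(struct fake_gpu *gpu)
{
    if (!gpu->cp_running || gpu->rptr == gpu->wptr)
        return;                    /* stopped, or ring is empty */
    if (gpu->latency > 0) {
        gpu->latency--;            /* still busy, nothing consumed yet */
        return;
    }
    gpu->scratch = gpu->ring[gpu->rptr];
    gpu->rptr = (gpu->rptr + 1) & (RING_SIZE - 1);
}

/* Returns 0 if the gpu answered within `timeout` polls, else -EINVAL. */
static int fake_ring_test(struct fake_gpu *gpu, unsigned timeout)
{
    unsigned i;

    gpu->scratch = SCRATCH_INIT;
    fake_ring_write(gpu, SCRATCH_EXPECT);
    for (i = 0; i < timeout; i++) {
        fake_gpu_tick(gpu);        /* stands in for udelay(1) */
        if (gpu->scratch == SCRATCH_EXPECT)
            return 0;
    }
    return -EINVAL;
}
```

A model with cp_running = false, or with a latency larger than the timeout,
fails the test and leaves 0xCAFEDEAD in the scratch register. The sketch
returns -EINVAL, which is -22 on Linux, the same value as in the "cp isn't
working (-22)" message. It says nothing about why the real hardware does not
process the ring in time.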