Re: X Hangs with RS690 + 2.6.26
On Fri, Jul 25, 2008 at 11:10:07AM +0100, Dave Airlie wrote:
> > I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots.
> >
> > radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.
>
> git log v2.6.25..v2.6.26 drivers/char/drm
>
> ...
>
> not sure if you wanna try reverting some of those and seeing which is the cause maybe..

I never figured out which of these caused the issues, but as a further data point for anyone else suffering from the issue, 2.6.27-rc kernels appear to fix (or at least significantly ease) the problem; I managed a 23 day uptime on 2.6.27-rc5 with I think one X freeze during that period that cleaned up after a Ctrl-Alt-Backspace. Not seen the same thing at all on 2.6.27-rc7 (though only ran it for 14 days before rebooting into 2.6.27 proper).

J.

-- 
Web site: http://www.earth.li/~noodles/
Can I trade this job for what's behind door 2?
Made by HuggieTag 0.0.23

--
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
--
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: X Hangs with RS690 + 2.6.26
On Sat, Aug 09, 2008 at 05:47:42AM -0300, Frédéric L. W. Meunier wrote:
> On Fri, 1 Aug 2008, Jonathan McDowell wrote:
> > On Fri, Jul 25, 2008 at 11:10:07AM +0100, Dave Airlie wrote:
> > > > I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots.
> > > >
> > > > radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.
> > >
> > > git log v2.6.25..v2.6.26 drivers/char/drm
> > >
> > > 5e35eff13f7dd0f5c1d82b3b4708b2f7a5f44113
> > > 5cfb6956073a9e42d44a26790b7800980634d037
> >
> > No joy.
> >
> > > d396db321bcaec54345e7e9e87cea8482d6ae3a8
> >
> > I thought this might be it; nearly 5 days of uptime rather than the usual less than 2. But I got the same symptoms today so I'll continue working down the list.
> >
> > > 259434acccbc823ee8bc00b2d2689d25e1fd
> > > d7463eb41d88a39de2653fd41857c4ccddb8707b
> > > 45e519052e8f583a709edd442a23f59581d3fe42
> > > 2735977b12cb0f113aae24afff04747b6d0f5bf1
> > > 3722bfc607d46275369865c02fe8694486d640b5
> > > fa0d71b967506031f7cb08ced6095d1c4f988594
> > > 9f18409ea3d778a171a9505c0a849d846f352bd0
>
> Any joy ?

259434acccbc823ee8bc00b2d2689d25e1fd
d7463eb41d88a39de2653fd41857c4ccddb8707b
45e519052e8f583a709edd442a23f59581d3fe42

all don't seem to be the problem. It's getting harder to do the reverts and I'm away this week so I haven't got any further yet.

> I apparently have the same problem with my RS690. I noticed it after upgrading from 2.6.25 to 2.6.26, alongside xorg-server (1.4.99.904 to 1.4.99.905) and Mesa (7.1-rc1 to 7.1-rc3). The ATI driver is 6.9.0.
>
> Here it always freezes in a few minutes or less than an hour. When it happens, I'm not running any 3D application and the CPU is idle. I may be just typing something in a shell. But it works disabling DRI.

Likewise, I'm not doing anything 3D related (at least, not consciously).

J.

-- 
http://www.earth.li/~noodles/
PGP/GPG Key @ the.earth.li via keyserver, web or email.
RSA: 4DC4E7FD / DSA: 5B430367
No program done by a hacker will work unless he is on the system.
Re: X Hangs with RS690 + 2.6.26
On Sun, 10 Aug 2008, Jonathan McDowell wrote:
> On Sat, Aug 09, 2008 at 05:47:42AM -0300, Frédéric L. W. Meunier wrote:
> > On Fri, 1 Aug 2008, Jonathan McDowell wrote:
> > > On Fri, Jul 25, 2008 at 11:10:07AM +0100, Dave Airlie wrote:
> > > > > I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots.
> > > > >
> > > > > radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.
> > > >
> > > > git log v2.6.25..v2.6.26 drivers/char/drm
> > > >
> > > > 5e35eff13f7dd0f5c1d82b3b4708b2f7a5f44113
> > > > 5cfb6956073a9e42d44a26790b7800980634d037
> > >
> > > No joy.
> > >
> > > > d396db321bcaec54345e7e9e87cea8482d6ae3a8
> > >
> > > I thought this might be it; nearly 5 days of uptime rather than the usual less than 2. But I got the same symptoms today so I'll continue working down the list.
> > >
> > > > 259434acccbc823ee8bc00b2d2689d25e1fd
> > > > d7463eb41d88a39de2653fd41857c4ccddb8707b
> > > > 45e519052e8f583a709edd442a23f59581d3fe42
> > > > 2735977b12cb0f113aae24afff04747b6d0f5bf1
> > > > 3722bfc607d46275369865c02fe8694486d640b5
> > > > fa0d71b967506031f7cb08ced6095d1c4f988594
> > > > 9f18409ea3d778a171a9505c0a849d846f352bd0
> >
> > Any joy ?
>
> 259434acccbc823ee8bc00b2d2689d25e1fd
> d7463eb41d88a39de2653fd41857c4ccddb8707b
> 45e519052e8f583a709edd442a23f59581d3fe42
>
> all don't seem to be the problem. It's getting harder to do the reverts and I'm away this week so I haven't got any further yet.
>
> > I apparently have the same problem with my RS690. I noticed it after upgrading from 2.6.25 to 2.6.26, alongside xorg-server (1.4.99.904 to 1.4.99.905) and Mesa (7.1-rc1 to 7.1-rc3). The ATI driver is 6.9.0.
> >
> > Here it always freezes in a few minutes or less than an hour. When it happens, I'm not running any 3D application and the CPU is idle. I may be just typing something in a shell. But it works disabling DRI.
>
> Likewise, I'm not doing anything 3D related (at least, not consciously).

BTW, I forgot to mention that. Here the motherboard is a Gigabyte GA-MA69VM-S2. When it happens and I use SysRq to reboot, it doesn't post in the BIOS screen. I have to press reset.
Re: X Hangs with RS690 + 2.6.26
On Sun, Aug 10, 2008 at 05:25:55PM -0300, Frédéric L. W. Meunier wrote:
> On Sun, 10 Aug 2008, Jonathan McDowell wrote:
> > On Sat, Aug 09, 2008 at 05:47:42AM -0300, Frédéric L. W. Meunier wrote:
> > > I apparently have the same problem with my RS690. I noticed it after upgrading from 2.6.25 to 2.6.26, alongside xorg-server (1.4.99.904 to 1.4.99.905) and Mesa (7.1-rc1 to 7.1-rc3). The ATI driver is 6.9.0.
> > >
> > > Here it always freezes in a few minutes or less than an hour. When it happens, I'm not running any 3D application and the CPU is idle. I may be just typing something in a shell. But it works disabling DRI.
> >
> > Likewise, I'm not doing anything 3D related (at least, not consciously).
>
> BTW, I forgot to mention that. Here the motherboard is a Gigabyte GA-MA69VM-S2. When it happens and I use SysRq to reboot, it doesn't post in the BIOS screen. I have to press reset.

My mobo is an ASUS M2A-VM HDMI and a shutdown -r now when X is wedged (done over ssh) results in a clean reboot; no need to hard reset.

J.

-- 
http://www.earth.li/~noodles/
PGP/GPG Key @ the.earth.li via keyserver, web or email.
RSA: 4DC4E7FD / DSA: 5B430367
101 things you can't have too much of : 38 - clean underwear.
Re: X Hangs with RS690 + 2.6.26
On Fri, 1 Aug 2008, Jonathan McDowell wrote:
> On Fri, Jul 25, 2008 at 11:10:07AM +0100, Dave Airlie wrote:
> > > I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots.
> > >
> > > radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.
> >
> > git log v2.6.25..v2.6.26 drivers/char/drm
> >
> > 5e35eff13f7dd0f5c1d82b3b4708b2f7a5f44113
> > 5cfb6956073a9e42d44a26790b7800980634d037
>
> No joy.
>
> > d396db321bcaec54345e7e9e87cea8482d6ae3a8
>
> I thought this might be it; nearly 5 days of uptime rather than the usual less than 2. But I got the same symptoms today so I'll continue working down the list.
>
> > 259434acccbc823ee8bc00b2d2689d25e1fd
> > d7463eb41d88a39de2653fd41857c4ccddb8707b
> > 45e519052e8f583a709edd442a23f59581d3fe42
> > 2735977b12cb0f113aae24afff04747b6d0f5bf1
> > 3722bfc607d46275369865c02fe8694486d640b5
> > fa0d71b967506031f7cb08ced6095d1c4f988594
> > 9f18409ea3d778a171a9505c0a849d846f352bd0

Any joy ?

I apparently have the same problem with my RS690. I noticed it after upgrading from 2.6.25 to 2.6.26, alongside xorg-server (1.4.99.904 to 1.4.99.905) and Mesa (7.1-rc1 to 7.1-rc3). The ATI driver is 6.9.0.

Here it always freezes in a few minutes or less than an hour. When it happens, I'm not running any 3D application and the CPU is idle. I may be just typing something in a shell. But it works disabling DRI.

Alt-SysRq-s/u/b is the only way. Trying with q freezes the mouse cursor.
Re: X Hangs with RS690 + 2.6.26
On Fri, Jul 25, 2008 at 11:10:07AM +0100, Dave Airlie wrote:
> > I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots.
> >
> > radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.
>
> git log v2.6.25..v2.6.26 drivers/char/drm
>
> 5e35eff13f7dd0f5c1d82b3b4708b2f7a5f44113
> 5cfb6956073a9e42d44a26790b7800980634d037

No joy.

> d396db321bcaec54345e7e9e87cea8482d6ae3a8

I thought this might be it; nearly 5 days of uptime rather than the usual less than 2. But I got the same symptoms today so I'll continue working down the list.

> 259434acccbc823ee8bc00b2d2689d25e1fd
> d7463eb41d88a39de2653fd41857c4ccddb8707b
> 45e519052e8f583a709edd442a23f59581d3fe42
> 2735977b12cb0f113aae24afff04747b6d0f5bf1
> 3722bfc607d46275369865c02fe8694486d640b5
> fa0d71b967506031f7cb08ced6095d1c4f988594
> 9f18409ea3d778a171a9505c0a849d846f352bd0

J.

-- 
Friends are God's apology for relations.
Re: X Hangs with RS690 + 2.6.26
On Fri, 25 Jul 2008 19:04:55 +0200 Nicolai Hähnle [EMAIL PROTECTED] wrote:
> On Friday 25 July 2008 12:12:59, Jerome Glisse wrote:
> > This looks like the usual engine lockup followed by a CP lockup, so that the DMA buffer age never gets written and we run out of DMA buffers, thus the freelist failing in an infinite loop. I think we now know all the reasons why we lock up; while a fix could be made for the old ioctl, we believe the best plan is to work on a new ioctl with this fix in mind.
>
> I can't help but feel uneasy with that kind of plan. After all, do we *really* know what's going on? I always had the impression that we only knew things along the lines of "perhaps it's better to submit 3D stuff in indirect buffers".
>
> If you *really* know what causes the lockups, could you please document that? As in, what's the actual command processor sequence that is to blame? I know that running e.g. a Nexuiz demo + glxgears window above it is apparently a 100% guaranteed lockup on my system (R420).
>
> If you could share your progress in tracking down the sources of the lockups, I'd happily try to write a patch against the current system.
>
> cu,
> Nicolai

Here is a brief list from the top of my head, for the record:
 - no RB3D_DSTCACHE twice in a row without a rendering cmd in between
 - initialize all clip registers to default values and wait for engine idle after setting them
 - update wptr every 32 dwords (2 dwords seems enough, but that one is very hard to track)
 - use indirect buffers
 - RB3D_DSTCACHE is not pipelined if the free or sync bit is not set, thus you have to fill the fifo and wait for idle before writing it if none of these bits are set
 - flush wait until 3d before 2d, and flush wait dma 2d idle after 2d; as well, fill the fifo with dummy 2d reg to avoid unpipelined 3d reg getting executed before idle is asserted
 - avoid emitting cliprects too much
 - txinval before changing texture
 - avoid stuff RB3D_DSTCACHE RB2D_DSTCACHE too much
 - set ISYNC properly through CP
 - CP idle is wrong: we should wait for a tag and not try to force the CP to go idle or inject a flush after idle
 - set vertex shader constant input to default safe values

And there are other things to think about scattered in my drm. Basically, things should be set in some order to make sure the engine will not be unhappy in the face of a cmd stream. Some of the above might be wrong, but I use them because somehow each one of them seems to give me a more stable drm. The last drm I have doesn't lock up in the case of a few glxgears on top of another 3d app like celestia, and likely nexuiz (haven't tried that one).

Cheers,
Jerome Glisse [EMAIL PROTECTED]
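As an illustration of the "update wptr every 32 dwords" item in the list above, here is a toy Python model of a command ring whose write pointer is only published in batches. It is a sketch under assumptions, not radeon DRM code: the class name RingWriter, the ring size and the commit counter are made up for the example, and reading the item as "publish the pointer once 32 dwords are pending" is only one interpretation of the list entry.

```python
class RingWriter:
    """Toy command ring: queue dwords, publish the write pointer in batches.

    Hypothetical model for illustration only. It has no read pointer, so it
    does not detect the writer overrunning commands not yet consumed.
    """

    def __init__(self, size_dwords=1024, commit_every=32):
        if not 0 < commit_every < size_dwords:
            raise ValueError("commit_every must be between 1 and size_dwords - 1")
        self.ring = [0] * size_dwords
        self.size = size_dwords
        self.commit_every = commit_every
        self.wptr = 0      # software write position
        self.hw_wptr = 0   # last position made visible to the imaginary CP
        self.commits = 0   # number of times the pointer was published

    def emit(self, dword):
        """Queue one dword; publish the pointer once enough are pending."""
        self.ring[self.wptr] = dword
        self.wptr = (self.wptr + 1) % self.size
        pending = (self.wptr - self.hw_wptr) % self.size
        if pending >= self.commit_every:
            self.commit()

    def commit(self):
        """Publish everything queued so far (stands in for a register write)."""
        if self.hw_wptr != self.wptr:
            self.hw_wptr = self.wptr
            self.commits += 1


if __name__ == "__main__":
    w = RingWriter()
    for i in range(100):
        w.emit(i)
    print(w.commits, w.hw_wptr, w.wptr)
    w.commit()  # flush the tail that did not fill a whole batch
    print(w.commits, w.hw_wptr, w.wptr)
```

In this model the imaginary CP only ever sees the pointer move in steps of 32 dwords, plus one explicit commit for the tail of a command stream. Whether that matches what the hardware actually needs is exactly the open question in this thread.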
Re: X Hangs with RS690 + 2.6.26
> I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots.
>
> radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.

git log v2.6.25..v2.6.26 drivers/char/drm

5e35eff13f7dd0f5c1d82b3b4708b2f7a5f44113
5cfb6956073a9e42d44a26790b7800980634d037
d396db321bcaec54345e7e9e87cea8482d6ae3a8
259434acccbc823ee8bc00b2d2689d25e1fd
d7463eb41d88a39de2653fd41857c4ccddb8707b
45e519052e8f583a709edd442a23f59581d3fe42
2735977b12cb0f113aae24afff04747b6d0f5bf1
3722bfc607d46275369865c02fe8694486d640b5
fa0d71b967506031f7cb08ced6095d1c4f988594
9f18409ea3d778a171a9505c0a849d846f352bd0

not sure if you wanna try reverting some of those and seeing which is the cause maybe..

Dave.
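To put rough numbers on why a bisect could be lengthy, the sketch below estimates the wall-clock cost of bisecting the ten commits listed above against reverting them one at a time. The two-day wait per test comes from the report ("a couple of days"). The function names are hypothetical, and the model assumes exactly one culprit and that a kernel which survives the wait really is good, which an intermittent hang like this one does not guarantee.

```python
def bisect_steps(candidates: int) -> int:
    """Worst-case number of kernels to build and test when bisecting.

    Equals ceil(log2(candidates)) for candidates >= 1, using integer math.
    """
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    return (candidates - 1).bit_length()


def bisect_days(candidates: int, days_per_test: float) -> float:
    """Rough wall-clock estimate for a bisect over the candidate commits."""
    return bisect_steps(candidates) * days_per_test


def linear_days(candidates: int, days_per_test: float) -> float:
    """Upper bound when testing one revert at a time, one test per commit."""
    return candidates * days_per_test


if __name__ == "__main__":
    commits = 10  # commits listed for drivers/char/drm in this message
    for wait in (2, 5):
        print(wait, bisect_steps(commits),
              bisect_days(commits, wait), linear_days(commits, wait))
```

With 10 commits and 2 days per test this comes to 4 steps and about 8 days for a bisect, against up to 20 days for one-at-a-time reverts. If each test has to run for 5 days before it can be trusted (one run elsewhere in this thread lasted nearly 5 days before hanging), the bisect alone takes about 20 days. The candidate count is only this small if the bisect is restricted to the DRM directory, for example with git bisect start v2.6.26 v2.6.25 -- drivers/char/drm.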
Re: X Hangs with RS690 + 2.6.26
On Fri, 25 Jul 2008 10:43:34 +0100 Jonathan McDowell [EMAIL PROTECTED] wrote:
> Hi.
>
> I've started to see hangs with X on an ATI RS690 with a 2.6.26 kernel. The symptoms are that load average goes up, X stops accepting keypresses or mouse clicks, but the cursor still moves around the screen in response to the mouse being moved. I can't switch to a VT but can ssh in remotely to see that things are still running. I don't seem to be able to kill X but shutdown -r now cleanly reboots. gdb fails to attach - complains about an internal error. strace shows lots of ioctls against the DRM device all returning EBUSY. 2.6.25 appears to work fine.
>
> I originally had PAT enabled under 2.6.26 but have seen a patch fixing that go into git, so disabled it for my 2.6.26 kernel to see if that was the issue; no change AFAICT.
>
> Enabling DRM debug (echo 1 > /sys/module/drm/parameters/debug) gives lots of output from radeon_freelist_get, after the following ioctl is received:
>
> Jul 25 10:11:14 meepok kernel: [drm:drm_ioctl] pid=3302, cmd=0xc0406429, nr=0x29 , dev 0xe200, auth=1
>
> and then a "returning NULL" message.
>
> radeon driver is recent git - 1c5858484da4fb1c9bc3ac3b4d7a97863ab99730 but I've seen it with older revisions too. It can take a couple of days for me to hit the problem, so a git bisect could be a lengthy process. If anyone has any suggestions about faster ways to track down the issue I'd like to hear them.
>
> Machine is a dual core AMD64 with 4GB of RAM running Debian unstable, card is:
>
> 01:05.0 VGA compatible controller [0300]: ATI Technologies Inc RS690 [Radeon X1200 Series] [1002:791e]
>
> Kernel configs at:
>
> http://the.earth.li/~noodles/radeon-2.6.26-hang/config-2.6.25
> http://the.earth.li/~noodles/radeon-2.6.26-hang/config-2.6.26
>
> Debug log from enabling drm debug:
>
> http://the.earth.li/~noodles/radeon-2.6.26-hang/debug
>
> Full dmesg (no obvious errors):
>
> http://the.earth.li/~noodles/radeon-2.6.26-hang/meepok.dmesg
>
> Xorg log file (no obvious errors):
>
> http://the.earth.li/~noodles/radeon-2.6.26-hang/Xorg.0.log
>
> J.

This looks like the usual engine lockup followed by a CP lockup, so that the DMA buffer age never gets written and we run out of DMA buffers, thus the freelist failing in an infinite loop. I think we now know all the reasons why we lock up; while a fix could be made for the old ioctl, we believe the best plan is to work on a new ioctl with this fix in mind.

So I don't think a bisect will help. There is certainly something that made this lockup more likely to happen on your config, but the best thing is to fix the lockup. If you really have time you can still do a bisect and find out what makes these lockups more obvious on your config; this could be helpful to check that our theories are good.

Cheers,
Jerome Glisse
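The failure mode described in the reply above (the engine stops, buffer ages never get written back, the freelist runs dry) can be illustrated with a small toy model. This is a sketch, not the radeon driver's code: ToyDmaPool, its fields and the retry count are invented for the example, and only the method name freelist_get echoes the radeon_freelist_get function named in the report.

```python
class Buffer:
    def __init__(self, idx):
        self.idx = idx
        self.age = 0          # stamp assigned when the buffer was last submitted
        self.pending = False  # True while the buffer is owned by the "GPU"


class ToyDmaPool:
    """Hypothetical pool of DMA buffers reclaimed by comparing age stamps."""

    def __init__(self, count):
        self.buffers = [Buffer(i) for i in range(count)]
        self.next_age = 1
        self.done_age = 0     # last age the imaginary GPU reported as finished
        self.gpu_alive = True

    def gpu_tick(self):
        # A healthy GPU retires everything submitted so far and writes the age
        # back. A locked-up one never updates done_age again.
        if self.gpu_alive:
            self.done_age = self.next_age - 1

    def submit(self, buf):
        buf.age = self.next_age
        self.next_age += 1
        buf.pending = True

    def freelist_get(self, retries=3):
        """Return a reusable buffer, or None when every buffer is still pending."""
        for _ in range(retries):
            self.gpu_tick()
            for buf in self.buffers:
                if not buf.pending or buf.age <= self.done_age:
                    buf.pending = False
                    return buf
        return None  # a real caller would report this as an error such as EBUSY


if __name__ == "__main__":
    pool = ToyDmaPool(4)
    for _ in range(10):               # healthy phase: buffers keep recycling
        pool.submit(pool.freelist_get())
    pool.gpu_alive = False            # simulated engine/CP lockup
    while (buf := pool.freelist_get()) is not None:
        pool.submit(buf)              # drains the buffers that were still free
    print(pool.freelist_get())        # None from now on, however often we retry
```

Once gpu_alive is cleared, the remaining free buffers are handed out and every later freelist_get call returns None. A caller that keeps retrying would spin forever, which is consistent with (though no proof of) the EBUSY ioctls seen in strace and the stream of radeon_freelist_get debug output in the report.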
Re: X Hangs with RS690 + 2.6.26
On Friday 25 July 2008 12:12:59, Jerome Glisse wrote:
> This looks like the usual engine lockup followed by a CP lockup, so that the DMA buffer age never gets written and we run out of DMA buffers, thus the freelist failing in an infinite loop. I think we now know all the reasons why we lock up; while a fix could be made for the old ioctl, we believe the best plan is to work on a new ioctl with this fix in mind.

I can't help but feel uneasy with that kind of plan. After all, do we *really* know what's going on? I always had the impression that we only knew things along the lines of "perhaps it's better to submit 3D stuff in indirect buffers".

If you *really* know what causes the lockups, could you please document that? As in, what's the actual command processor sequence that is to blame? I know that running e.g. a Nexuiz demo + glxgears window above it is apparently a 100% guaranteed lockup on my system (R420).

If you could share your progress in tracking down the sources of the lockups, I'd happily try to write a patch against the current system.

cu,
Nicolai