Re: Weird rawhide desktop behavior
On Sat, 2012-03-24 at 11:58 -0600, Jonathan Corbet wrote:
> Here's a strange pathology that just bit me for the first time in a while,
> though I've seen it before. I'm not sure where to file a bug on this
> one...

There are several levels of "X locked up" pathology, so let's see if I can shed some light here. (For bonus points, someone who wanted to add this kind of info to the wiki would be Way Cool.)

> In short: I'll be working away, minding my own business, when the desktop
> goes completely dead - no response to any key or mouse events. That said,
> the X server is still running; the pointer still moves with the mouse. I
> can also switch to another virtual console with alt-ctrl-Fn. Sometimes
> things start working again after some time (measured in minutes);
> sometimes I lose patience and start over. Today I went and made lunch and
> it never came back.

The pointer position (but not image) updates during a SIGIO handler if you have hardware cursors enabled [1]. How do you know if you have hardware cursors? Short answer is, you do, unless you're running a dumb driver like vesa/fbdev/modesetting.

So, class 1 lockup here is "I can't move the cursor", and boy are you in trouble. For KMS drivers this usually means X is waiting on a blocking DRM ioctl; ps will show X in D state, and /proc/$(pidof Xorg)/wchan will show you somewhere in ioctl land. This is always a video driver bug, and you will typically see something in dmesg when this happens. Don't bother trying to get an xserver backtrace here; ptrace can't attach to D-state processes.

Class 2 lockup is "I can move the cursor, but the image never changes", as in, if you mouse over a text entry field it doesn't change to the vertical bar, or over a resize grip it doesn't change to a resize indicator. Here, the X server is stuck somewhere away from the main loop, but at least isn't stuck in the kernel. gdb on X will work, and will probably tell you where you're stuck. This class is usually a userspace bug, which could be in either the driver or the server.

Class 3 lockup is "I can move the cursor and it behaves normally, but I can't type". In this scenario X _is_ successfully going around its main loop. If you can VT switch, this is you; VT switch processing happens while draining the event queue, which is driven off the main loop. This scenario has an outside chance of being an xserver bug, but typically this is the server dutifully doing what clients have told it to do: something takes a grab, and then deadlocks. Sorry about X11, we keep trying to get rid of it for a reason.

Class 3 would be easier to debug if you had some of the debugging key combos wired up in XKB:

http://cgit.freedesktop.org/xorg/xserver/commit/?id=7d2543a3cb3089241982ce4f8984fd723d5312a1

Sadly gnome does not yet have UI for this, and I don't remember how to drive setxkbmap to add them. Note that the Ungrab and CloseGrab combos allow you to defeat screensaver locking - i.e., they are security holes - which is why they're not enabled by default. You don't want to use them anyway if you're debugging; you want PrintGrabs so you can then go inspect the grabbing process to see why it's deadlocked.

> I've tried killing off applications to see if somebody has some sort of
> all-inclusive grab, but I can't find the right one if that's the case. I
> can kill something like Firefox and verify that the process is gone, but
> the Firefox window remains on-screen when I return to X.

This is significant. It means the compositor isn't repainting. So either:

a) the compositor isn't the client with the stuck grab, or
b) the compositor's internal grab logic is broken

[1] - Why position but not image? Because on most hardware position is just one register to poke, but image updates require an image upload, which isn't safe to do if the driver is in the middle of some other accelerated rendering. Why only for hardware cursor? Because software cursor rendering only caches the pixels behind the cursor on motion, which means you could race with normal rendering. Both of these you could fix if you were willing to take much more of a mutex overhead than you're probably okay with.

- ajax

--
test mailing list
test@lists.fedoraproject.org
To unsubscribe: https://admin.fedoraproject.org/mailman/listinfo/test
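As a rough illustration of the class 1 versus class 2 check described above, here is a minimal Python sketch that reads the scheduler state and wait channel of a process from /proc. The function names (parse_state, describe, inspect) are made up for this example, and the wording of the advice it prints is only a paraphrase of the message above, not output from any real tool. Reading another user's wchan may need root.

```python
from pathlib import Path


def parse_state(stat_text: str) -> str:
    """Return the one-letter process state from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def describe(state: str, wchan: str) -> str:
    """Turn a state letter and wait channel into a hint (hypothetical wording)."""
    if state == "D":
        where = wchan or "unknown"
        return (f"uninterruptible sleep in the kernel (wchan: {where}); "
                "look at dmesg, a debugger cannot attach")
    return f"state {state}; not blocked in the kernel, gdb is worth a try"


def inspect(pid: int) -> str:
    """Read /proc/<pid>/stat and /proc/<pid>/wchan and describe the process."""
    proc = Path("/proc") / str(pid)
    state = parse_state((proc / "stat").read_text())
    try:
        wchan = (proc / "wchan").read_text().strip()
    except OSError:
        # wchan may be unreadable without sufficient privileges.
        wchan = ""
    return describe(state, wchan)
```

From another virtual console one would call something like print(inspect(pid)) with the PID reported by pidof Xorg. State D points at the class 1 case; anything else suggests the server is still in userspace, where a debugger can attach.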
Re: Weird rawhide desktop behavior
On Sun, 25 Mar 2012 16:01:26 + "Jóhann B. Guðmundsson" wrote:
> I would also like to point to our own documentation regarding SysRq
> [1] which I created a while back for the QA community to use and
> improve.
>
> Don't hesitate to improve/add anything to that page.
> 1. http://fedoraproject.org/wiki/QA/Sysrq

Thanks for the link! For my future reference. :-)
Re: Weird rawhide desktop behavior
On 03/25/2012 03:46 PM, stan wrote:
> On Sat, 24 Mar 2012 11:58:15 -0600
> Jonathan Corbet wrote:
>
> > Anybody got a clue what's going on, or where I could look to get more
> > information?
>
> Another suggestion is to use the emergency recovery key sequence. I
> think it is compiled by default into the Fedora kernels. Described here:
>
> http://en.wikipedia.org/wiki/Magic_SysRq_key#.E2.80.9CREISUB.E2.80.9D_.E2.80.93_safe_reboot

I would also like to point to our own documentation regarding SysRq [1], which I created a while back for the QA community to use and improve.

Don't hesitate to improve or add anything to that page.

JBG

1. http://fedoraproject.org/wiki/QA/Sysrq
Re: Weird rawhide desktop behavior
On Sat, 24 Mar 2012 11:58:15 -0600 Jonathan Corbet wrote:
> Anybody got a clue what's going on, or where I could look to get more
> information?

Another suggestion is to use the emergency recovery key sequence. I think it is compiled by default into the Fedora kernels. Described here:

http://en.wikipedia.org/wiki/Magic_SysRq_key#.E2.80.9CREISUB.E2.80.9D_.E2.80.93_safe_reboot
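Having SysRq compiled in is not quite the same as having every key enabled: the kernel also consults the kernel.sysrq setting, exposed as /proc/sys/kernel/sysrq, which is 0 (off), 1 (everything) or a bitmask. The Python sketch below decodes that value and checks whether all of the REISUB keys would be honoured. The bit meanings follow the kernel's sysrq documentation as I understand it; the function names are made up for this example, and the default value on any given Fedora release is not assumed here.

```python
from pathlib import Path
from typing import List

# Bit values of kernel.sysrq when it is used as a bitmask (value > 1).
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}

# R (unraw) = 4, E and I (term, kill) = 64, S (sync) = 16,
# U (remount read-only) = 32, B (reboot) = 128.
REISUB_MASK = 4 | 64 | 16 | 32 | 128


def decode_sysrq(value: int) -> List[str]:
    """List the SysRq function groups enabled by a kernel.sysrq value."""
    if value == 0:
        return []
    if value == 1:
        return ["all functions"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def reisub_fully_enabled(value: int) -> bool:
    """True if every key of the REISUB sequence is permitted."""
    return value == 1 or (value & REISUB_MASK) == REISUB_MASK


def read_sysrq() -> int:
    """Read the current setting from /proc/sys/kernel/sysrq."""
    return int(Path("/proc/sys/kernel/sysrq").read_text().split()[0])
```

For example, decode_sysrq(read_sysrq()) shows what the running system allows. If reisub_fully_enabled() returns False, part of the sequence will be silently ignored, which is worth knowing before the desktop locks up rather than after.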
Re: Weird rawhide desktop behavior
On Sat, 2012-03-24 at 19:32 -0700, Chuck Forsberg WA7KGX N2469R wrote:
> X wedged on me again while I was running Gnome.
> I rebooted and started Xfce. It wedged after some minutes.
>
> I then installed the current Nvidia proprietary driver. The computer
> has not wedged for a few hours. This suggests the problem lies
> with the default X server for my GTX 460SE ... or an interaction
> between my karma and the ozone layer.

Oh, another possible cause if you're on GNOME 3.3.90 is a memory leak in Shell. This was fixed in 3.3.92. So, one thing to try is updating to 3.3.92.
--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net
Re: Weird rawhide desktop behavior
On Sat, 2012-03-24 at 21:09 +, "Jóhann B. Guðmundsson" wrote:
> On 03/24/2012 08:34 PM, Jonathan Corbet wrote:
> > On Sat, 24 Mar 2012 12:23:26 -0700
> > Adam Williamson wrote:
> >
> >> Jonathan, Chuck - if you try holding down a key that ought to do
> >> something for half a second instead of just pressing it, does it work?
> > I'll try that next time the problem hits. I don't have any real way to
> > provoke it now, though, so I don't know when that will be...stay tuned.
>
> The best way to deal with this is to follow [1] then file a bug against
> gnome-shell and attach the relevant output along with .xsession-errors

If it's a Shell bug. Mine isn't.
--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net
Re: Weird rawhide desktop behavior
X wedged on me again while I was running Gnome. I rebooted and started Xfce. It wedged after some minutes.

I then installed the current Nvidia proprietary driver. The computer has not wedged for a few hours. This suggests the problem lies with the default X server for my GTX 460SE ... or an interaction between my karma and the ozone layer.

--
Chuck Forsberg WA7KGX N2469R c...@omen.com www.omen.com
Developer of Industrial ZMODEM(Tm) for Embedded Applications
Omen Technology Inc "The High Reliability Software"
10255 NW Old Cornelius Pass Portland OR 97231 503-614-0430
RE: Weird rawhide desktop behavior
> Date: Sat, 24 Mar 2012 19:37:38 -0300
> Subject: Re: Weird rawhide desktop behavior
> From: look...@gmail.com
> To: test@lists.fedoraproject.org
>
> On Sat, Mar 24, 2012 at 2:58 PM, Jonathan Corbet wrote:
> > Here's a strange pathology that just bit me for the first time in a while,
> > though I've seen it before. I'm not sure where to file a bug on this
> > one...
> >
> > In short: I'll be working away, minding my own business, when the desktop
> > goes completely dead - no response to any key or mouse events. That said,
> > the X server is still running; the pointer still moves with the mouse. I
> > can also switch to another virtual console with alt-ctrl-Fn. Sometimes
> > things start working again after some time (measured in minutes);
> > sometimes I lose patience and start over. Today I went and made lunch and
> > it never came back.
>
> I've noticed the same behavior on my box, that was freshly installed
> with Fedora 17 last Friday. I'll see what I can do to gather
> information about the problem as well.
>
> --
> Lucas

I've been seeing similar behaviour in F16 for quite some time. Usually, it happens when I am in the middle of something else and don't have time to try to track it down, and when I do have time for trying to figure it out, it doesn't happen. It does seem to occur more frequently when my box is a little warm, so I assumed that it was hardware related.

John.
Re: Weird rawhide desktop behavior
On Sat, Mar 24, 2012 at 2:58 PM, Jonathan Corbet wrote:
> Here's a strange pathology that just bit me for the first time in a while,
> though I've seen it before. I'm not sure where to file a bug on this
> one...
>
> In short: I'll be working away, minding my own business, when the desktop
> goes completely dead - no response to any key or mouse events. That said,
> the X server is still running; the pointer still moves with the mouse. I
> can also switch to another virtual console with alt-ctrl-Fn. Sometimes
> things start working again after some time (measured in minutes);
> sometimes I lose patience and start over. Today I went and made lunch and
> it never came back.

I've noticed the same behavior on my box, which was freshly installed with Fedora 17 last Friday. I'll see what I can do to gather information about the problem as well.

--
Lucas
Re: Weird rawhide desktop behavior
On 03/24/2012 08:34 PM, Jonathan Corbet wrote:
> On Sat, 24 Mar 2012 12:23:26 -0700
> Adam Williamson wrote:
>
> > Jonathan, Chuck - if you try holding down a key that ought to do
> > something for half a second instead of just pressing it, does it work?
>
> I'll try that next time the problem hits. I don't have any real way to
> provoke it now, though, so I don't know when that will be...stay tuned.

The best way to deal with this is to follow [1] then file a bug against gnome-shell and attach the relevant output along with .xsession-errors

JBG

1. https://live.gnome.org/GnomeShell/Debugging
Re: Weird rawhide desktop behavior
On Sat, 24 Mar 2012 11:58:15 -0600 Jonathan Corbet wrote:
> Here's a strange pathology that just bit me for the first time in a
> while, though I've seen it before. I'm not sure where to file a bug
> on this one...
>
> In short: I'll be working away, minding my own business, when the
> desktop goes completely dead - no response to any key or mouse
> events.
...
> Anybody got a clue what's going on, or where I could look to get more
> information?

This is old, and might be unrelated, but I used to have these lockups starting around F11 or F12. For me, it was almost always Firefox related: I had just clicked on something, and bingo, lock-up. But it was other things often enough that I couldn't say for sure it was Firefox.

So I started compiling my own kernels, customized for my system. And the problem went away, and I haven't seen it again. Maybe it is gone in the generic kernels, maybe not, but it only takes about 10 to 15 minutes of my time now to compile and install a kernel, so I just continue doing it.

Such lockups only happened while X was running for me, and that would seem to absolve the kernel. So it is probably that the recompile removes the troublesome code, or changes it enough that it no longer fails. My best guess at the time was that there was a race condition leading to a deadlock.

One other thing you could try is boosting the priority of your user to -1. It seems counter-intuitive, but for a workstation instead of a server, this makes sense because then your graphical user experience doesn't get impacted by background processes as much, yet they still run whenever there is spare CPU time (which is most of the time). The setting is in /etc/security/limits.conf, and as I said I have my user set to -1. This is especially important to prevent system I/O from affecting your GUI experience as much. Think of it this way: is it more important to your user experience that writing log files gets done, or that the file you want to edit gets loaded? It also might finesse the race condition leading to your lockups, by shifting the priorities of jobs in the interrupt chain.
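For anyone wanting to try the priority suggestion above, a small Python sketch follows. It assumes the setting meant is the pam_limits "priority" item in /etc/security/limits.conf; the message does not spell out the exact line, so treat that as my reading. The user name "alice" and both function names are hypothetical. The script only formats a candidate line and reports the nice value of the current process; it does not edit any file.

```python
import os


def limits_conf_line(user: str, priority: int) -> str:
    """Format a candidate limits.conf entry (assumed 'priority' item).

    The '-' type field applies the value as both soft and hard limit.
    """
    if not -20 <= priority <= 19:
        raise ValueError("priority must be between -20 and 19")
    return f"{user}\t-\tpriority\t{priority}"


def current_nice() -> int:
    """Return the nice value of the calling process."""
    return os.getpriority(os.PRIO_PROCESS, 0)


if __name__ == "__main__":
    # "alice" is a placeholder user name for illustration only.
    print("candidate line:", limits_conf_line("alice", -1))
    print("nice value of this process:", current_nice())
```

As far as I know, limits are applied by PAM when a session starts, so a change should only show up after logging out and back in; running current_nice() from a fresh login is one way to confirm it took effect.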
Re: Weird rawhide desktop behavior
On Sat, 24 Mar 2012 12:23:26 -0700 Adam Williamson wrote:
> Jonathan, Chuck - if you try holding down a key that ought to do
> something for half a second instead of just pressing it, does it work?

I'll try that next time the problem hits. I don't have any real way to provoke it now, though, so I don't know when that will be...stay tuned.

Thanks,

jon
Re: Weird rawhide desktop behavior
On Sat, 2012-03-24 at 11:43 -0700, Chuck Forsberg WA7KGX N2469R wrote:
> On 03/24/2012 11:12 AM, "Jóhann B. Guðmundsson" wrote:
> > On 03/24/2012 05:58 PM, Jonathan Corbet wrote:
> >> Here's a strange pathology that just bit me for the first time in a
> >> while, though I've seen it before. I'm not sure where to file a bug
> >> on this one...
> >>
> >> In short: I'll be working away, minding my own business, when the
> >> desktop goes completely dead - no response to any key or mouse events.
> >> That said, the X server is still running; the pointer still moves with
> >> the mouse. I can also switch to another virtual console with
> >> alt-ctrl-Fn. Sometimes things start working again after some time
> >> (measured in minutes); sometimes I lose patience and start over. Today
> >> I went and made lunch and it never came back.
> >
> > Hmm
> >
> > I think there was a lock screen bug mentioned upstream that fits this
> > description...
> >
> > JBG
> Same thing happened to me with RC1 Gnome on 64 bit. Yum update was running
> in one window and I was surfing in another, each window taking up about
> half the 1080x1920 monitor. All of a sudden Firefox stopped responding and
> Yum ground to a halt. The mouse cursor still followed mouse movements, but
> the keyboard and mouse keys were dead. No Ctrl-Alt-Fn did anything. No LED
> response to CapsLock etc. Some background process accessed the HD from
> time to time. I had to use the hardware reset.
>
> Yum was hopelessly confused so I reinstalled RC1 and ran yum update from
> a console terminal. Hooray for Xfce which works better now. Everyone I
> know of who uses Linux as a tool rather dislikes Gnome 3, so let's make
> sure Xfce works properly.

Josh - to me, the above sounds vaguely like that interrupt problem I have, the one we were talking about the other day, where I have to hold down keys for half a second before they register. Could it be the same?

Jonathan, Chuck - if you try holding down a key that ought to do something for half a second instead of just pressing it, does it work?
--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net
Re: Weird rawhide desktop behavior
On 03/24/2012 11:12 AM, "Jóhann B. Guðmundsson" wrote:
> On 03/24/2012 05:58 PM, Jonathan Corbet wrote:
> > Here's a strange pathology that just bit me for the first time in a
> > while, though I've seen it before. I'm not sure where to file a bug
> > on this one...
> >
> > In short: I'll be working away, minding my own business, when the
> > desktop goes completely dead - no response to any key or mouse events.
> > That said, the X server is still running; the pointer still moves with
> > the mouse. I can also switch to another virtual console with
> > alt-ctrl-Fn. Sometimes things start working again after some time
> > (measured in minutes); sometimes I lose patience and start over. Today
> > I went and made lunch and it never came back.
>
> Hmm
>
> I think there was a lock screen bug mentioned upstream that fits this
> description...
>
> JBG

Same thing happened to me with RC1 Gnome on 64 bit. Yum update was running in one window and I was surfing in another, each window taking up about half the 1080x1920 monitor. All of a sudden Firefox stopped responding and Yum ground to a halt. The mouse cursor still followed mouse movements, but the keyboard and mouse keys were dead. No Ctrl-Alt-Fn did anything. No LED response to CapsLock etc. Some background process accessed the HD from time to time. I had to use the hardware reset.

Yum was hopelessly confused so I reinstalled RC1 and ran yum update from a console terminal. Hooray for Xfce which works better now. Everyone I know of who uses Linux as a tool rather dislikes Gnome 3, so let's make sure Xfce works properly.

--
Chuck Forsberg WA7KGX N2469R c...@omen.com www.omen.com
Developer of Industrial ZMODEM(Tm) for Embedded Applications
Omen Technology Inc "The High Reliability Software"
10255 NW Old Cornelius Pass Portland OR 97231 503-614-0430
Re: Weird rawhide desktop behavior
On 03/24/2012 05:58 PM, Jonathan Corbet wrote:
> Here's a strange pathology that just bit me for the first time in a while,
> though I've seen it before. I'm not sure where to file a bug on this
> one...
>
> In short: I'll be working away, minding my own business, when the desktop
> goes completely dead - no response to any key or mouse events. That said,
> the X server is still running; the pointer still moves with the mouse. I
> can also switch to another virtual console with alt-ctrl-Fn. Sometimes
> things start working again after some time (measured in minutes);
> sometimes I lose patience and start over. Today I went and made lunch and
> it never came back.

Hmm

I think there was a lock screen bug mentioned upstream that fits this description...

JBG