-------- Original Message --------
Subject: nVidia-glx crashing in SID. (BTS #664261 and #667684)
Date: Thu, 07 Jun 2012 14:56:30 +0200
From: Alberto Gabrielli <[email protected]>
Organization: Qualimedia srl
To: [email protected]
Citing http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=664261#62:
> I didn't experience the problem recently, but as always with such
> "random" problems you cannot be sure. I would count it as solved for
> now.
Good. If I don't see any updates on this bug report, I'll close it in
about a week as fixed in 295.53.
-----------
Sorry again. (See the attached message; I got no reply to it, so please
send me an ACK, thanks.)
I still encounter those crashes every day! 7, 8, 10 times a day!!
The system has become unusable...
Same symptoms and behavior as already described in the attachment. Same
entries in syslog.
This is the latest one, the 3rd since this morning:
> Jun 7 14:21:31 PC-4 kernel: [ 395.373671] NVRM: GPU at 0000:04:00.0 has fallen off the bus.
> Jun 7 14:21:32 PC-4 kernel: [ 395.965804] irq 17: nobody cared (try booting with the "irqpoll" option)
> Jun 7 14:21:32 PC-4 kernel: [ 395.965810] Pid: 4853, comm: tracker-store Tainted: P O 3.2.0-2-686-pae #1
> Jun 7 14:21:32 PC-4 kernel: [ 395.965813] Call Trace:
> Jun 7 14:21:32 PC-4 kernel: [ 395.965822] [<c1078a55>] ? __report_bad_irq+0x1c/0x8d
> Jun 7 14:21:32 PC-4 kernel: [ 395.965826] [<c1078c5f>] ? note_interrupt+0x122/0x18f
> Jun 7 14:21:32 PC-4 kernel: [ 395.965829] [<c10775ff>] ? handle_irq_event_percpu+0x142/0x158
> Jun 7 14:21:32 PC-4 kernel: [ 395.965832] [<c107920e>] ? handle_level_irq+0x56/0x56
> Jun 7 14:21:32 PC-4 kernel: [ 395.965835] [<c1077636>] ? handle_irq_event+0x21/0x37
> Jun 7 14:21:32 PC-4 kernel: [ 395.965838] [<c107920e>] ? handle_level_irq+0x56/0x56
> Jun 7 14:21:32 PC-4 kernel: [ 395.965840] [<c107926e>] ? handle_fasteoi_irq+0x60/0x85
> Jun 7 14:21:32 PC-4 kernel: [ 395.965842] <IRQ> [<c100cd43>] ? do_IRQ+0x2e/0x76
> Jun 7 14:21:32 PC-4 kernel: [ 395.965850] [<c12c63b0>] ? common_interrupt+0x30/0x38
> Jun 7 14:21:32 PC-4 kernel: [ 395.965852] handlers:
> Jun 7 14:21:32 PC-4 kernel: [ 395.966022] [<f96c7937>] nv_kern_isr
> Jun 7 14:21:32 PC-4 kernel: [ 395.966024] Disabling IRQ #17
followed by a manual shutdown :-(
> Jun 7 14:25:14 PC-4 kernel: [ 617.594374] SysRq : Emergency Sync
> Jun 7 14:25:20 PC-4 kernel: [ 623.807347] Emergency Sync complete
> Jun 7 14:25:24 PC-4 kernel: [ 627.839524] SysRq : Emergency Sync
> Jun 7 14:25:24 PC-4 kernel: [ 627.840124] Emergency Sync complete
> Jun 7 14:25:26 PC-4 kernel: [ 630.005265] SysRq : Emergency Remount R/O
GeForce 6200
nvidia-glx 295.53-1 from SID
Linux version 3.2.0-2-686-pae (Debian 3.2.19-1)
([email protected]) (gcc version 4.6.3 (Debian 4.6.3-7) )
#1 SMP Fri Jun 1 18:56:14 UTC 2012
If I can be useful, feel free to ask!
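To put a number on "7, 8, 10 times a day", the syslog can be scanned for the NVRM message shown in the excerpts. The following is only a minimal sketch, not a tested diagnostic tool: the function name `count_gpu_drops` and the regular expression are illustrative assumptions, based solely on the line layout of the excerpts in this mail (including the optional "> " quote marker).

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this mail:
#   "Jun 7 14:21:31 PC-4 kernel: [ 395.373671] NVRM: GPU at 0000:04:00.0 has ..."
# An optional leading "> " quote marker is tolerated. Only the first part of
# the message is matched, so mail-wrapped lines are still counted once.
GPU_LOST = re.compile(
    r"^(?:>\s*)?(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*"
    r"NVRM: GPU at (?P<bus>\S+) has"
)


def count_gpu_drops(lines):
    """Return a Counter mapping (month, day) to the number of NVRM events."""
    counts = Counter()
    for line in lines:
        match = GPU_LOST.match(line)
        if match:
            counts[(match.group("month"), int(match.group("day")))] += 1
    return counts
```

It could be used as `count_gpu_drops(open("/var/log/syslog", errors="replace"))`, where the path is an assumption about where the kernel messages are logged; the result lists one count per calendar day found in the file.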
--- Begin Message ---
Hello,
sorry to bother you personally, but I have "problems" (...) with the BTS,
and this could be useful.
I have been encountering the crashes since a "few days" before
29/Feb/2012 (I can't say exactly; maybe about a week before), on a
daily-updated SID.
[Quotes from mails to Mike Hommey, Iceape's maintainer, since at that
time the bug appeared to be Iceape-related, which it isn't.]
----CUT------CUT-------CUT----------
The symptom is a system beep. Nothing else.
No kernel panic LEDs; no strange disk activity.
Immediately after, one of two situations:
a) The entire desktop freezes. The mouse pointer can sometimes be moved,
but is inactive; more often, it disappears completely.
Everything is locked. Even the CTRL-ALT-Fn virtual consoles, or CTRL-ALT-BKSP!
b) The mouse is moving, and "some" stuff on the Nautilus desktop is still
active, selectable and so on. All of Iceape's windows are there, but
totally locked.
After a while, a few clicks around or so, the entire X session locks up
with no other symptoms, as in a).
The only solution is the Magic SysRq keys: emergency Sync, Umount, Boot!
(I get no console on the TTYs; no idea why.)
I'm not _sure_, but I suspect it happens only when using the mouse, not
the keyboard.
In the logs, I only found entries like the following, which AFAIK point
to the video driver (nVidia proprietary) or maybe the kernel (excluding HW).
But it happens ONLY when interacting with Iceape! Mostly (surely?) in
Browser windows.
Sometimes it was a terrible "many times in an hour"; sometimes, with
roughly the SAME windows opened/used, it was "2 days, and everything is
perfect!"
And I constantly run very video-driver-intensive tasks, including video
(mplayer and so on, but YouTube too, with gnash), audio, VirtualBox, Wine,
gaming, tests...
It happens only with Iceape!
So I'm writing to you not to bother you, but hoping all of this rings a
bell about some exotic function that Iceape uses but that is buggy in
[nVidia|xorg|kernel], so that you can warn the more appropriate developers.
Syslog:
Feb 28 18:51:53 MyHostname kernel: [12466.377765] NVRM: GPU at 0000:04:00.0 has fallen off the bus.
Feb 28 18:51:54 MyHostname kernel: [12466.790570] Pid: 3488, comm: Xorg Tainted: P O 3.2.0-1-686-pae #1
Feb 28 18:51:54 MyHostname kernel: [12466.790572] Call Trace:
Feb 28 18:51:54 MyHostname kernel: [12466.790580] [<c10788e5>] ? __report_bad_irq+0x1c/0x8d
Feb 28 18:51:54 MyHostname kernel: [12466.790584] [<c1078aef>] ? note_interrupt+0x122/0x18f
Feb 28 18:51:54 MyHostname kernel: [12466.790588] [<c1077493>] ? handle_irq_event_percpu+0x142/0x158
Feb 28 18:51:54 MyHostname kernel: [12466.790591] [<c107906d>] ? handle_level_irq+0x62/0x62
Feb 28 18:51:54 MyHostname kernel: [12466.790593] [<c10774ca>] ? handle_irq_event+0x21/0x37
Feb 28 18:51:54 MyHostname kernel: [12466.790596] [<c107906d>] ? handle_level_irq+0x62/0x62
Feb 28 18:51:54 MyHostname kernel: [12466.790599] [<c10790cd>] ? handle_fasteoi_irq+0x60/0x78
Feb 28 18:51:54 MyHostname kernel: [12466.790601] <IRQ> [<c100cd5f>] ? do_IRQ+0x2e/0x76
Feb 28 18:51:54 MyHostname kernel: [12466.790609] [<c12be2b0>] ? common_interrupt+0x30/0x38
Feb 28 18:51:54 MyHostname kernel: [12466.790611] [<c12bdcdf>] ? sysenter_past_esp+0x3c/0x6a
Feb 28 18:52:14 MyHostname kernel: [12487.427365] SysRq : Emergency Sync
Feb 28 18:52:14 MyHostname kernel: [12487.427617] Emergency Sync complete
Feb 28 18:52:16 MyHostname kernel: [12488.742709] SysRq : Emergency Remount R/O
Xorg.log:
[ 12469.378] (WW) NVIDIA(0): WAIT (2, 7, 0x8000, 0x00004714, 0x00004820)
[ 12469.378] (WW) NVIDIA(0): WAIT (0, 7, 0x8000, 0x00004820, 0x00004820)
[ 12475.116] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 12475.116]
Backtrace:
[ 12475.148] 0: /usr/bin/X (xorg_backtrace+0x37) [0xb76f0ca7]
[ 12475.148] 1: /usr/bin/X (mieqEnqueue+0x185) [0xb76cf105]
[ 12475.148] 2: /usr/bin/X (0xb756c000+0x4fcc5) [0xb75bbcc5]
[ 12475.148] 3: /usr/bin/X (xf86PostButtonEventM+0x9f) [0xb75fa03f]
[ 12475.149] 4: /usr/bin/X (xf86PostButtonEvent+0xc4) [0xb75fa224]
[ 12475.149] 5: /usr/lib/xorg/modules/input/evdev_drv.so (0xb3af3000+0x3efc) [0xb3af6efc]
[ 12475.149] 6: /usr/bin/X (0xb756c000+0x782f1) [0xb75e42f1]
[ 12475.149] 7: /usr/bin/X (0xb756c000+0x9fab2) [0xb760bab2]
[ 12475.149] 8: (vdso) (__kernel_sigreturn+0x0) [0xb754e400]
[ 12475.149] 9: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0xb480d000+0x61859) [0xb486e859]
-----------------
Just got this in Syslog:
Mar 22 15:22:01 Hostname4 kernel: [ 895.636232] NVRM: GPU at 0000:04:00.0 has fallen off the bus.
Mar 22 15:22:01 Hostname4 kernel: [ 896.138257] irq 17: nobody cared (try booting with the "irqpoll" option)
Mar 22 15:22:01 Hostname4 kernel: [ 896.138264] Pid: 4804, comm: iceape-bin Tainted: P O 3.2.0-2-686-pae #1
Mar 22 15:22:01 Hostname4 kernel: [ 896.138266] Call Trace:
Mar 22 15:22:01 Hostname4 kernel: [ 896.138275] [<c1078a21>] ? __report_bad_irq+0x1c/0x8d
Mar 22 15:22:01 Hostname4 kernel: [ 896.138279] [<c1078c2b>] ? note_interrupt+0x122/0x18f
Mar 22 15:22:01 Hostname4 kernel: [ 896.138282] [<c10775cb>] ? handle_irq_event_percpu+0x142/0x158
Mar 22 15:22:01 Hostname4 kernel: [ 896.138285] [<c10791de>] ? handle_level_irq+0x56/0x56
Mar 22 15:22:01 Hostname4 kernel: [ 896.138287] [<c1077602>] ? handle_irq_event+0x21/0x37
Mar 22 15:22:01 Hostname4 kernel: [ 896.138290] [<c10791de>] ? handle_level_irq+0x56/0x56
Mar 22 15:22:01 Hostname4 kernel: [ 896.138293] [<c107923e>] ? handle_fasteoi_irq+0x60/0x85
Mar 22 15:22:01 Hostname4 kernel: [ 896.138295] <IRQ> [<c100cdcf>] ? do_IRQ+0x2e/0x76
Mar 22 15:22:01 Hostname4 kernel: [ 896.138302] [<c12c57f0>] ? common_interrupt+0x30/0x38
Mar 22 15:22:01 Hostname4 kernel: [ 896.138304] handlers:
Mar 22 15:22:01 Hostname4 kernel: [ 896.138451] [<f9749daf>] nv_kern_isr
Mar 22 15:22:01 Hostname4 kernel: [ 896.138454] Disabling IRQ #17
Then, manually:
Mar 22 15:22:24 Hostname4 kernel: [ 919.058166] SysRq : Emergency Sync
Mar 22 15:22:24 Hostname4 kernel: [ 919.060139] Emergency Sync complete
Mar 22 15:22:25 Hostname4 kernel: [ 920.381559] SysRq : Emergency Sync
Mar 22 15:22:25 Hostname4 kernel: [ 920.381818] Emergency Sync complete
Mar 22 15:22:26 Hostname4 kernel: [ 921.293939] SysRq : Emergency Remount R/O
----CUT------CUT-------CUT----------
A few minutes ago (identical to the older one above...):
Apr 11 15:54:45 Hostname4 kernel: [ 2425.888267] NVRM: GPU at 0000:04:00.0 has fallen off the bus.
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480134] irq 17: nobody cared (try booting with the "irqpoll" option)
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480140] Pid: 2928, comm: Xorg Tainted: P O 3.2.0-2-686-pae #1
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480142] Call Trace:
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480151] [<c1078a29>] ? __report_bad_irq+0x1c/0x8d
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480155] [<c1078c33>] ? note_interrupt+0x122/0x18f
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480158] [<c10775d3>] ? handle_irq_event_percpu+0x142/0x158
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480161] [<c10791e2>] ? handle_level_irq+0x56/0x56
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480164] [<c107760a>] ? handle_irq_event+0x21/0x37
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480166] [<c10791e2>] ? handle_level_irq+0x56/0x56
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480170] [<c1079242>] ? handle_fasteoi_irq+0x60/0x85
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480172] <IRQ> [<c100cdcf>] ? do_IRQ+0x2e/0x76
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480179] [<c12c59f0>] ? common_interrupt+0x30/0x38
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480181] handlers:
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480332] [<f96ddf6f>] nv_kern_isr
Apr 11 15:54:45 Hostname4 kernel: [ 2426.480334] Disabling IRQ #17
Apr 11 15:54:53 Hostname4 kernel: [ 2434.396759] SysRq : Emergency Sync
Apr 11 15:54:53 Hostname4 kernel: [ 2434.397010] Emergency Sync complete
Apr 11 15:54:55 Hostname4 kernel: [ 2435.872184] SysRq : Emergency Remount R/O
The card is a GeForce 6200.
Linux version 3.2.0-2-686-pae (Debian 3.2.14-1)
([email protected]) (gcc version 4.6.3 (Debian 4.6.3-1) )
#1 SMP Fri Apr 6 05:25:56 UTC 2012
I can confirm that the crashes still occur with nvidia-glx 295.33.2.
Terribly so! (6 yesterday, in 5 hours; 2 in the last hour...)
I hope this will be useful.
Obviously, feel free to ask for tests or whatever! :-)
--
Alberto Gabrielli
Qualimedia srl - Roma
--- End Message ---