Re: Screen unresponsive
On Tue, 2012-03-13 at 17:04 +, Darac Marjal wrote:
> On Mon, Mar 05, 2012 at 01:32:17AM -0500, KS wrote:
> > On Mon, Mar 5, 2012, at 12:51 AM, KS wrote:
> > > Hi all,
> > > The last few days I have noticed that when I return to my machine
> > > (always ON), the screen doesn't respond. Keyboard (caps lock, num
> > > lock) works. I can also ssh to the machine and have noticed that
> > > Xorg takes 100% CPU. I couldn't find anything in the Xorg log or
> > > syslog files. Today however, the screen stopped responding after a
> > > beep while I was using the machine. Below is what I found on sys log:
> > >
> > > Mar 5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0 has fallen off the bus.
>
> This doesn't sound particularly good. It would suggest to me that your
> graphics card (the GPU) is no longer attached to the PCI bus. Probably
> the best case scenario is that this is a physical problem: Open up your
> computer, pull out the card and push it back in, making sure it's fully
> seated.
>
> If the problem persists, then it may be that the card is locking up
> completely such that the PCI bus THINKS you've pulled it out. You may
> find monitoring the output of nvclock -T useful.
>
> > Syslog gave the warning again as above! So is this just a kernel issue?
> > Thanks,
> > KS

Hi Darac,

I don't think this is related to a HW issue; indeed, I have been experiencing this for some time on two different machines.

All I have is the following:

root@laptop:~# head -20 /var/log/syslog
May 31 22:28:59 laptop syslog-ng[1860]: Configuration reload request received, reloading configuration;
May 31 22:28:59 laptop syslog-ng[1860]: EOF on control channel, closing connection;
May 31 22:29:00 laptop anacron[11394]: Job `cron.daily' terminated
May 31 22:29:00 laptop anacron[11394]: Normal exit (1 job run)
May 31 22:49:00 laptop -- MARK --
May 31 23:05:40 laptop kernel: [32915.745040] sdc: detected capacity change from 8019509248 to 0
May 31 23:05:52 laptop kernel: [32927.622139] usb 2-1: USB disconnect, device number 8
May 31 23:08:11 laptop kernel: [33066.384097] NVRM: GPU at :01:00.0 has fallen off the bus.
May 31 23:08:11 laptop kernel: [33066.384102] NVRM: GPU at :01:00.0 has fallen off the bus.
May 31 23:08:11 laptop kernel: [33066.384120] NVRM: os_pci_init_handle: invalid context!
May 31 23:08:11 laptop kernel: [33066.384124] NVRM: os_pci_init_handle: invalid context!
May 31 23:08:11 laptop kernel: [33066.384176] NVRM: os_pci_init_handle: invalid context!
May 31 23:08:11 laptop kernel: [33066.384179] NVRM: os_pci_init_handle: invalid context!
May 31 23:13:06 laptop kernel: [0.00] Initializing cgroup subsys cpuset
May 31 23:13:06 laptop kernel: [0.00] Initializing cgroup subsys cpu
May 31 23:13:06 laptop kernel: [0.00] Linux version 3.2.0-2-686-pae (Debian 3.2.18-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-5) ) #1 SMP Mon May 21 18:24:12 UTC 2012
May 31 23:13:06 laptop kernel: [0.00] BIOS-provided physical RAM map:
May 31 23:13:06 laptop kernel: [0.00] BIOS-e820: - 0009f000 (usable)
May 31 23:13:06 laptop kernel: [0.00] BIOS-e820: 0009f000 - 000a (reserved)
May 31 23:13:06 laptop kernel: [0.00] BIOS-e820: 0010 - bfe5a800 (usable)

Then on the Xorg side I have this:

[ 30399.257] (II) config/udev: Adding input device ELECOM ELECOM USB mouse with wheel (/dev/input/mouse2)
[ 30399.257] (II) No input driver specified, ignoring this device.
[ 30399.257] (II) This device may have been added with another device file.
[ 33119.907] [mi] EQ overflowing. Additional events will be discarded until existing events are processed.
[ 33119.907]
[ 33119.907] Backtrace:
[ 33120.497] 0: /usr/bin/Xorg (xorg_backtrace+0x49) [0xb7778099]
[ 33120.497] 1: /usr/bin/Xorg (mieqEnqueue+0x22b) [0xb77569ab]
[ 33120.497] 2: /usr/bin/Xorg (0xb75fb000+0x51265) [0xb764c265]
[ 33120.497] 3: /usr/bin/Xorg (xf86PostMotionEventM+0xf9) [0xb7686119]
[ 33120.497] 4: /usr/lib/xorg/modules/input/evdev_drv.so (0xb4255000+0x35ad) [0xb42585ad]
[ 33120.497] 5: /usr/lib/xorg/modules/input/evdev_drv.so (0xb4255000+0x4a2c) [0xb4259a2c]
[ 33120.497] 6: /usr/bin/Xorg (0xb75fb000+0x7a8e1) [0xb76758e1]
[ 33120.497] 7: /usr/bin/Xorg (0xb75fb000+0xa050a) [0xb769b50a]
[ 33120.497] 8: (vdso) (__kernel_sigreturn+0x0) [0xb75dd400]
[ 33120.497] 9: (vdso) (__kernel_vsyscall+0x10) [0xb75dd424]
[ 33120.497] 10: /lib/i386-linux-gnu/i686/cmov/libc.so.6 (__gettimeofday+0x16) [0xb7309916]
[ 33120.497] 11: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0xb486f000+0x62e0d) [0xb48d1e0d]
[ 33120.497]
[ 33120.497] [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
[ 33120.497] [mi] mieq is *NOT* the cause. It is a victim.
[ 33120.983] (WW) NVIDIA(0): WAIT (0, 7, 0x8000, 0x9354, 0x9354)
[ 33120.983] [mi] Increasing EQ size to 512 to prevent dropped events.
[ 33120.983] [mi] EQ
Re: Screen unresponsive
On Thu, 2012-05-31 at 23:59 +0200, Abou Al Montacir wrote:
> On Tue, 2012-03-13 at 17:04 +, Darac Marjal wrote:
> > It would suggest to me that your graphics card (the GPU) is no longer
> > attached to the PCI bus. Probably the best case scenario is that this
> > is a physical problem: Open up your computer, pull out the card and
> > push it back in, making sure it's fully seated.
>
> I don't think this is related to a HW issue

I also don't think that it's related to the hardware, but assuming it were due to a bad hardware connection: take care, when you screw the card to the case, that the card doesn't change its position inside the socket. The case of a computer is often the weak point of the hardware.

Time to open a new thread ...

Regards,
Ralf

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1338526993.2213.77.camel@precise
Re: Screen unresponsive
On Mon, Mar 05, 2012 at 01:32:17AM -0500, KS wrote:
> On Mon, Mar 5, 2012, at 12:51 AM, KS wrote:
> > Hi all,
> > The last few days I have noticed that when I return to my machine
> > (always ON), the screen doesn't respond. Keyboard (caps lock, num
> > lock) works. I can also ssh to the machine and have noticed that
> > Xorg takes 100% CPU. I couldn't find anything in the Xorg log or
> > syslog files. Today however, the screen stopped responding after a
> > beep while I was using the machine. Below is what I found on sys log:
> >
> > Mar 5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0 has fallen off the bus.

This doesn't sound particularly good. It would suggest to me that your graphics card (the GPU) is no longer attached to the PCI bus. Probably the best case scenario is that this is a physical problem: Open up your computer, pull out the card and push it back in, making sure it's fully seated.

If the problem persists, then it may be that the card is locking up completely such that the PCI bus THINKS you've pulled it out. You may find monitoring the output of nvclock -T useful.

> Syslog gave the warning again as above! So is this just a kernel issue?
> Thanks,
> KS

--
Darac Marjal

Archive: http://lists.debian.org/20120313170447.ga32...@darac.org.uk
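The suggestion above to monitor the output of nvclock -T could be automated roughly as follows. This is only a sketch under assumptions: it assumes nvclock is installed and on the PATH, it does not parse the output (the exact format is not shown in this thread, so the raw text is logged as-is), and the names read_nvclock and log_temperature are made up for illustration. The idea is to keep a timestamped record so that the last readings before a lockup can be compared with the time of the "fallen off the bus" message in syslog.

```python
import subprocess
import time
from datetime import datetime


def read_nvclock():
    """Run `nvclock -T` and return its raw output (stdout, or stderr if stdout is empty)."""
    result = subprocess.run(["nvclock", "-T"], capture_output=True, text=True)
    return (result.stdout or result.stderr).strip()


def log_temperature(out, samples, interval=10.0, read=read_nvclock, sleep=time.sleep):
    """Write `samples` timestamped readings to the file-like object `out`.

    `read` and `sleep` are parameters only so they can be swapped out
    (for example with a fake reader when nvclock is not available).
    """
    for i in range(samples):
        stamp = datetime.now().isoformat(timespec="seconds")
        # Flatten multi-line output so each sample stays on one log line.
        reading = " | ".join(read().splitlines())
        out.write(f"{stamp} {reading}\n")
        # Flush so the reading is not left in Python's buffer if the machine hangs.
        out.flush()
        if i < samples - 1:
            sleep(interval)
```

Called with an open file in append mode and a large sample count, this leaves one line per reading; if nvclock is missing, subprocess.run raises FileNotFoundError rather than logging anything.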
Re: Screen unresponsive
On Mon, Mar 5, 2012, at 12:51 AM, KS wrote:
> Hi all,
> The last few days I have noticed that when I return to my machine
> (always ON), the screen doesn't respond. Keyboard (caps lock, num
> lock) works. I can also ssh to the machine and have noticed that
> Xorg takes 100% CPU. I couldn't find anything in the Xorg log or
> syslog files. Today however, the screen stopped responding after a
> beep while I was using the machine. Below is what I found on sys log:
>
> Mar 5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0 has fallen off the bus.
> Mar 5 00:32:28 gurh kernel: [17901.730487] NVRM: os_pci_init_handle: invalid context!
> Mar 5 00:32:28 gurh kernel: [17901.730497] NVRM: os_pci_init_handle: invalid context!
> Mar 5 00:32:28 gurh kernel: [17901.730545] NVRM: os_pci_init_handle: invalid context!
> Mar 5 00:32:28 gurh kernel: [17901.730551] NVRM: os_pci_init_handle: invalid context!
> Mar 5 00:32:29 gurh kernel: [17902.264522] irq 19: nobody cared (try booting with the irqpoll option)
> Mar 5 00:32:29 gurh kernel: [17902.264527] Pid: 2692, comm: krunner Tainted: P O 3.2.0-1-686-pae #1
> Mar 5 00:32:29 gurh kernel: [17902.264529] Call Trace:
> Mar 5 00:32:29 gurh kernel: [17902.264535] [c10788e5] ? __report_bad_irq+0x1c/0x8d
> Mar 5 00:32:29 gurh kernel: [17902.264538] [c1078aef] ? note_interrupt+0x122/0x18f
> Mar 5 00:32:29 gurh kernel: [17902.264540] [c1077493] ? handle_irq_event_percpu+0x142/0x158
> Mar 5 00:32:29 gurh kernel: [17902.264542] [c107906d] ? handle_level_irq+0x62/0x62
> Mar 5 00:32:29 gurh kernel: [17902.264544] [c10774ca] ? handle_irq_event+0x21/0x37
> Mar 5 00:32:29 gurh kernel: [17902.264547] [c107906d] ? handle_level_irq+0x62/0x62
> Mar 5 00:32:29 gurh kernel: [17902.264549] [c10790cd] ? handle_fasteoi_irq+0x60/0x78
> Mar 5 00:32:29 gurh kernel: [17902.264550] IRQ [c100cd5f] ? do_IRQ+0x2e/0x76
> Mar 5 00:32:29 gurh kernel: [17902.264556] [c12be270] ? common_interrupt+0x30/0x38
> Mar 5 00:32:29 gurh kernel: [17902.264558] [c10743a5] ? audit_socketcall+0x12/0x40
> Mar 5 00:32:29 gurh kernel: [17902.264561] [c120b7fc] ? sys_socketcall+0x61/0x1da
> Mar 5 00:32:29 gurh kernel: [17902.264563] [c12bdcdf] ? sysenter_do_call+0x12/0x28
> Mar 5 00:32:29 gurh kernel: [17902.264565] handlers:
> Mar 5 00:32:29 gurh kernel: [17902.264580] [f8399e76] ata_bmdma_interrupt
> Mar 5 00:32:29 gurh kernel: [17902.264685] [f9799daf] nv_kern_isr
> Mar 5 00:32:29 gurh kernel: [17902.264687] Disabling IRQ #19
>
> and on the Xorg.0.log
>
> [25.418] (**) Option xkb_rules evdev
> [25.418] (**) Option xkb_model pc104
> [25.418] (**) Option xkb_layout us
> [25.418] (II) config/udev: Adding input device PC Speaker (/dev/input/event3)
> [25.418] (II) No input driver/identifier specified (ignoring)
> [ 582.227] (WW) Open ACPI failed (/var/run/acpid.socket) (Connection refused)
> [ 583.227] (II) Open ACPI successful (/var/run/acpid.socket)
>
> Can someone shed some light on what is going on? Is my graphics card
> (NVIDIA GeForce 8600 GTS) dying? Its fan does make a lot of noise
> though.

Apologies. I didn't have the correct version of the nvidia drivers installed for my kernel. I installed it with m-a and reloaded the new module.

ii nvidia-kernel-3.2.0-1-686-pae 295.20-1+3.2.7-1

Syslog gave the warning again as above! So is this just a kernel issue?

Thanks,
KS

Archive: http://lists.debian.org/1330929137.4715.140661044873...@webmail.messagingengine.com
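Since the same kernel messages recur in this thread on different machines, a small script can pull them out of a syslog file so their timing is easier to compare between incidents. The following is a minimal sketch, not a definitive tool: the function name scan_syslog and the event labels are invented for illustration, and the patterns are based only on the two message forms quoted above ("has fallen off the bus" and "Disabling IRQ #19"). The PCI address is captured exactly as it appears in the log.

```python
import re

# Matches kernel syslog lines such as:
#   Mar 5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0 has fallen off the bus.
BUS_ERROR = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] NVRM: GPU at (?P<pci>\S+) has fallen off the bus"
)
# Matches the follow-up message where the kernel gives up on the interrupt line.
IRQ_DISABLED = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] Disabling IRQ #(?P<irq>\d+)"
)


def scan_syslog(lines):
    """Return (uptime_seconds, event, detail) tuples for matching lines, in input order."""
    events = []
    for line in lines:
        m = BUS_ERROR.search(line)
        if m:
            events.append((float(m.group("uptime")), "gpu_off_bus", m.group("pci")))
            continue
        m = IRQ_DISABLED.search(line)
        if m:
            events.append((float(m.group("uptime")), "irq_disabled", m.group("irq")))
    return events
```

The function accepts any iterable of lines, so an open file object for /var/log/syslog can be passed in directly (reading that file usually needs root or membership in the adm group on Debian).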