Re: Screen unresponsive

2012-06-03 Thread Abou Al Montacir
On Tue, 2012-03-13 at 17:04 +, Darac Marjal wrote: 

 On Mon, Mar 05, 2012 at 01:32:17AM -0500, KS wrote:
  On Mon, Mar 5, 2012, at 12:51 AM, KS wrote:
   Hi all,
   
   The last few days I have noticed that when I return to my machine
   (always ON), the screen doesn't respond. Keyboard (caps lock, num lock)
   works. I can also ssh to the machine and have noticed that Xorg takes
   100% CPU. I couldn't find anything in the Xorg log or syslog files.
   
   Today however, the screen stopped responding after a beep while I was
   using the machine. Below is what I found in the syslog:
   
   Mar  5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0
   has fallen off the bus.
 
 This doesn't sound particularly good. It would suggest to me that your
 graphics card (the GPU) is no longer attached to the PCI bus. Probably
 the best case scenario is that this is a physical problem: Open up your
 computer, pull out the card and push it back in, making sure it's fully
 seated.
 
 If the problem persists, then it may be that the card is locking up
 completely such that the PCI bus THINKS you've pulled it out. You may
 find monitoring the output of nvclock -T useful.
 
  
  Syslog gave the warning again as above!
  
  So is this just a kernel issue?
  
  Thanks,
  KS
  

Hi Darac,

I don't think this is a hardware issue: I have been seeing the same
behaviour for some time on two different machines. All I have to go on
is the following:

root@laptop:~# head -20 /var/log/syslog
May 31 22:28:59 laptop syslog-ng[1860]: Configuration reload request received, 
reloading configuration;
May 31 22:28:59 laptop syslog-ng[1860]: EOF on control channel, closing 
connection;
May 31 22:29:00 laptop anacron[11394]: Job `cron.daily' terminated
May 31 22:29:00 laptop anacron[11394]: Normal exit (1 job run)
May 31 22:49:00 laptop -- MARK --
May 31 23:05:40 laptop kernel: [32915.745040] sdc: detected capacity change 
from 8019509248 to 0
May 31 23:05:52 laptop kernel: [32927.622139] usb 2-1: USB disconnect, device 
number 8
May 31 23:08:11 laptop kernel: [33066.384097] NVRM: GPU at :01:00.0 has 
fallen off the bus.
May 31 23:08:11 laptop kernel: [33066.384102] NVRM: GPU at :01:00.0 has 
fallen off the bus.
May 31 23:08:11 laptop kernel: [33066.384120] NVRM: os_pci_init_handle: invalid 
context!
May 31 23:08:11 laptop kernel: [33066.384124] NVRM: os_pci_init_handle: invalid 
context!
May 31 23:08:11 laptop kernel: [33066.384176] NVRM: os_pci_init_handle: invalid 
context!
May 31 23:08:11 laptop kernel: [33066.384179] NVRM: os_pci_init_handle: invalid 
context!
May 31 23:13:06 laptop kernel: [0.00] Initializing cgroup subsys cpuset
May 31 23:13:06 laptop kernel: [0.00] Initializing cgroup subsys cpu
May 31 23:13:06 laptop kernel: [0.00] Linux version 3.2.0-2-686-pae 
(Debian 3.2.18-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 
4.6.3-5) ) #1 SMP Mon May 21 18:24:12 UTC 2012
May 31 23:13:06 laptop kernel: [0.00] BIOS-provided physical RAM map:
May 31 23:13:06 laptop kernel: [0.00]  BIOS-e820:  - 
0009f000 (usable)
May 31 23:13:06 laptop kernel: [0.00]  BIOS-e820: 0009f000 - 
000a (reserved)
May 31 23:13:06 laptop kernel: [0.00]  BIOS-e820: 0010 - 
bfe5a800 (usable)

Then, on the Xorg side, I have this:

[ 30399.257] (II) config/udev: Adding input device ELECOM ELECOM USB mouse with 
wheel  (/dev/input/mouse2)
[ 30399.257] (II) No input driver specified, ignoring this device.
[ 30399.257] (II) This device may have been added with another device file.
[ 33119.907] [mi] EQ overflowing.  Additional events will be discarded until 
existing events are processed.
[ 33119.907] 
[ 33119.907] Backtrace:
[ 33120.497] 0: /usr/bin/Xorg (xorg_backtrace+0x49) [0xb7778099]
[ 33120.497] 1: /usr/bin/Xorg (mieqEnqueue+0x22b) [0xb77569ab]
[ 33120.497] 2: /usr/bin/Xorg (0xb75fb000+0x51265) [0xb764c265]
[ 33120.497] 3: /usr/bin/Xorg (xf86PostMotionEventM+0xf9) [0xb7686119]
[ 33120.497] 4: /usr/lib/xorg/modules/input/evdev_drv.so (0xb4255000+0x35ad) 
[0xb42585ad]
[ 33120.497] 5: /usr/lib/xorg/modules/input/evdev_drv.so (0xb4255000+0x4a2c) 
[0xb4259a2c]
[ 33120.497] 6: /usr/bin/Xorg (0xb75fb000+0x7a8e1) [0xb76758e1]
[ 33120.497] 7: /usr/bin/Xorg (0xb75fb000+0xa050a) [0xb769b50a]
[ 33120.497] 8: (vdso) (__kernel_sigreturn+0x0) [0xb75dd400]
[ 33120.497] 9: (vdso) (__kernel_vsyscall+0x10) [0xb75dd424]
[ 33120.497] 10: /lib/i386-linux-gnu/i686/cmov/libc.so.6 (__gettimeofday+0x16) 
[0xb7309916]
[ 33120.497] 11: /usr/lib/xorg/modules/drivers/nvidia_drv.so 
(0xb486f000+0x62e0d) [0xb48d1e0d]
[ 33120.497] 
[ 33120.497] [mi] These backtraces from mieqEnqueue may point to a culprit 
higher up the stack.
[ 33120.497] [mi] mieq is *NOT* the cause.  It is a victim.
[ 33120.983] (WW) NVIDIA(0): WAIT (0, 7, 0x8000, 0x9354, 0x9354)
[ 33120.983] [mi] Increasing EQ size to 512 to prevent dropped events.
[ 33120.983] [mi] EQ 

Re: Screen unresponsive

2012-05-31 Thread Ralf Mardorf
On Thu, 2012-05-31 at 23:59 +0200, Abou Al Montacir wrote:
 On Tue, 2012-03-13 at 17:04 +, Darac Marjal wrote:
  It would suggest to me that your
  graphics card (the GPU) is no longer attached to the PCI bus. Probably
  the best case scenario is that this is a physical problem: Open up your
  computer, pull out the card and push it back in, making sure it's fully
  seated.
 
 I don't think this is related to HW issue

I also don't think it's a hardware fault, but in case it is caused by a
bad hardware connection: take care, when you screw the card to the case,
that the card doesn't shift its position inside the socket. The case of
a computer is often the weak point of the hardware.

Time to open a new thread ...

Regards,
Ralf


-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1338526993.2213.77.camel@precise



Re: Screen unresponsive

2012-03-13 Thread Darac Marjal
On Mon, Mar 05, 2012 at 01:32:17AM -0500, KS wrote:
 On Mon, Mar 5, 2012, at 12:51 AM, KS wrote:
  Hi all,
  
  The last few days I have noticed that when I return to my machine
  (always ON), the screen doesn't respond. Keyboard (caps lock, num lock)
  works. I can also ssh to the machine and have noticed that Xorg takes
  100% CPU. I couldn't find anything in the Xorg log or syslog files.
  
  Today however, the screen stopped responding after a beep while I was
  using the machine. Below is what I found in the syslog:
  
  Mar  5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0
  has fallen off the bus.

This doesn't sound particularly good. It would suggest to me that your
graphics card (the GPU) is no longer attached to the PCI bus. Probably
the best case scenario is that this is a physical problem: Open up your
computer, pull out the card and push it back in, making sure it's fully
seated.

If the problem persists, then it may be that the card is locking up
completely such that the PCI bus THINKS you've pulled it out. You may
find monitoring the output of nvclock -T useful.
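
Alongside that, it may help to check how often the message recurs in
the syslog. What follows is only a minimal sketch in Python, assuming
the syslog line format quoted in this thread; the function name
find_gpu_drops is made up for illustration, and the PCI address is
matched loosely because it can vary between machines.

```python
import re

# Matches the NVIDIA kernel-module message quoted in this thread.
# The PCI address is captured loosely (any run of non-space characters).
FALLEN_OFF = re.compile(r"NVRM: GPU at (\S*) has\s+fallen off the bus")

# Assumed syslog prefix: "Mon DD HH:MM:SS host kernel: [uptime]".
STAMP = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) kernel: \[\s*([\d.]+)\]")


def find_gpu_drops(lines):
    """Return (timestamp, host, uptime_seconds) for each matching line."""
    hits = []
    for line in lines:
        if not FALLEN_OFF.search(line):
            continue
        m = STAMP.match(line)
        if m:
            hits.append((m.group(1), m.group(2), float(m.group(3))))
    return hits
```

The function accepts any iterable of lines, so an open file handle on
/var/log/syslog (read as root) can be passed in directly. Wrapped lines,
as in the mail quotes here, would need to be re-joined first.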

 
 Syslog gave the warning again as above!
 
 So is this just a kernel issue?
 
 Thanks,
 KS
 

-- 
Darac Marjal





Re: Screen unresponsive

2012-03-04 Thread KS
On Mon, Mar 5, 2012, at 12:51 AM, KS wrote:
 Hi all,
 
 The last few days I have noticed that when I return to my machine
 (always ON), the screen doesn't respond. Keyboard (caps lock, num lock)
 works. I can also ssh to the machine and have noticed that Xorg takes
 100% CPU. I couldn't find anything in the Xorg log or syslog files.
 
 Today however, the screen stopped responding after a beep while I was
 using the machine. Below is what I found in the syslog:
 
 Mar  5 00:32:28 gurh kernel: [17901.730462] NVRM: GPU at :01:00.0
 has fallen off the bus.
 Mar  5 00:32:28 gurh kernel: [17901.730487] NVRM: os_pci_init_handle:
 invalid context!
 Mar  5 00:32:28 gurh kernel: [17901.730497] NVRM: os_pci_init_handle:
 invalid context!
 Mar  5 00:32:28 gurh kernel: [17901.730545] NVRM: os_pci_init_handle:
 invalid context!
 Mar  5 00:32:28 gurh kernel: [17901.730551] NVRM: os_pci_init_handle:
 invalid context!
 Mar  5 00:32:29 gurh kernel: [17902.264522] irq 19: nobody cared (try
 booting with the irqpoll option)
 Mar  5 00:32:29 gurh kernel: [17902.264527] Pid: 2692, comm: krunner
 Tainted: P   O 3.2.0-1-686-pae #1
 Mar  5 00:32:29 gurh kernel: [17902.264529] Call Trace:
 Mar  5 00:32:29 gurh kernel: [17902.264535]  [c10788e5] ?
 __report_bad_irq+0x1c/0x8d
 Mar  5 00:32:29 gurh kernel: [17902.264538]  [c1078aef] ?
 note_interrupt+0x122/0x18f
 Mar  5 00:32:29 gurh kernel: [17902.264540]  [c1077493] ?
 handle_irq_event_percpu+0x142/0x158
 Mar  5 00:32:29 gurh kernel: [17902.264542]  [c107906d] ?
 handle_level_irq+0x62/0x62
 Mar  5 00:32:29 gurh kernel: [17902.264544]  [c10774ca] ?
 handle_irq_event+0x21/0x37
 Mar  5 00:32:29 gurh kernel: [17902.264547]  [c107906d] ?
 handle_level_irq+0x62/0x62
 Mar  5 00:32:29 gurh kernel: [17902.264549]  [c10790cd] ?
 handle_fasteoi_irq+0x60/0x78
 Mar  5 00:32:29 gurh kernel: [17902.264550]  IRQ  [c100cd5f] ?
 do_IRQ+0x2e/0x76
 Mar  5 00:32:29 gurh kernel: [17902.264556]  [c12be270] ?
 common_interrupt+0x30/0x38
 Mar  5 00:32:29 gurh kernel: [17902.264558]  [c10743a5] ?
 audit_socketcall+0x12/0x40
 Mar  5 00:32:29 gurh kernel: [17902.264561]  [c120b7fc] ?
 sys_socketcall+0x61/0x1da
 Mar  5 00:32:29 gurh kernel: [17902.264563]  [c12bdcdf] ?
 sysenter_do_call+0x12/0x28
 Mar  5 00:32:29 gurh kernel: [17902.264565] handlers:
 Mar  5 00:32:29 gurh kernel: [17902.264580] [f8399e76]
 ata_bmdma_interrupt
 Mar  5 00:32:29 gurh kernel: [17902.264685] [f9799daf] nv_kern_isr
 Mar  5 00:32:29 gurh kernel: [17902.264687] Disabling IRQ #19
 
 and on the Xorg.0.log
 [25.418] (**) Option xkb_rules evdev
 [25.418] (**) Option xkb_model pc104
 [25.418] (**) Option xkb_layout us
 [25.418] (II) config/udev: Adding input device PC Speaker
 (/dev/input/event3)
 [25.418] (II) No input driver/identifier specified (ignoring)
 [   582.227] (WW) Open ACPI failed (/var/run/acpid.socket) (Connection
 refused)
 [   583.227] (II) Open ACPI successful (/var/run/acpid.socket)
 
 
 Can someone shed some light on what is going on? Is my graphics card
 (NVIDIA GeForce 8600 GTS) dying? Its fan does make a lot of noise
 though.
 -- 


Apologies. I didn't have the correct version of nvidia drivers installed
for my kernel. 
Installed it with m-a.
Reloaded the new module.
ii  nvidia-kernel-3.2.0-1-686-pae  295.20-1+3.2.7-1

Syslog gave the warning again as above!

So is this just a kernel issue?

Thanks,
KS
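
The "irq 19: nobody cared" trace quoted above lists two handlers on the
same line (ata_bmdma_interrupt and nv_kern_isr), which suggests the
NVIDIA card shares its interrupt with a disk controller. A rough sketch
in Python for listing shared interrupt lines from the text of
/proc/interrupts follows; the function name shared_irqs is made up, and
the parsing assumes the usual layout (one counter column per CPU, with
the device names comma-separated at the end of each row).

```python
def shared_irqs(text):
    """Map IRQ number -> device names, for IRQs used by more than one device."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():           # skip NMI, LOC, ERR and similar rows
            continue
        fields = rest.split(None, ncpu)   # per-CPU counters, then the remainder
        if len(fields) <= ncpu:
            continue
        # Devices are comma-separated at the end of the row; the first entry
        # still has the interrupt-controller column(s) in front of it.
        parts = [p.strip() for p in fields[ncpu].split(",")]
        parts[0] = parts[0].split()[-1]
        if len(parts) > 1:
            shared[int(label)] = parts
    return shared
```

Calling shared_irqs(open("/proc/interrupts").read()) on the affected
machine should show whether IRQ 19 is shared. Device names containing
spaces are not handled by this sketch.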
