Apparent hang in or around pcireg_cfgread

2007-05-22 Thread Stefan Bethke

Hi,

My router/file server just froze.  It's running -stable from about a
month ago.  I hit break on the serial console a couple of times, about
half a minute apart, but since it didn't look like it was going to
recover by itself, I panicked it.


The machine had been running without a hitch for the past half year.
I recently added a 500 GB SATA disk, and I used ataidle to set the
suspend timeout to 10 minutes.  Immediately before the hang, I got
ad4: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=699557839.
Is ataidle evil?  Or is this some other, unrelated hardware trouble?


Typescript from serial console, kgdb bt, dmesg, kernel config below.


Stefan

--
Stefan Bethke [EMAIL PROTECTED]   Fon +49 170 346 0140


KDB: enter: Line break on console
[thread pid 24 tid 100020 ]
Stopped at  kdb_enter+0x30: leave
db> ps
  pid  ppid  pgrp   uid  state wmesg    wchan      cmd
55872  1328 55872     0  S+    ttyin    0xc3424410 less
40665  1313 40665     0  S+    ttyin    0xc3423c10 bash
47558  1041  1041    80  S     accept   0xc37ae5ca httpd
    4     1     4     0  Ss    select   0xc07200c4 ntpd
45531     1 45531     0  Rs                        ppp
44366 44333 44333     0  S+    biord    0xcd7c8db8 afpd
44333     1 44333     0  S+    select   0xc07200c4 afpd
 1769  1041  1041    80  S     accept   0xc37ae5ca httpd
 1328  1327  1328     0  S+    wait     0xc3830860 bash
 1327  1318  1327  1000  S+    wait     0xc3830a78 su
 1318  1317  1318  1000  Ss+   wait     0xc395c000 bash
 1317  1315  1315  1000  S     select   0xc07200c4 sshd
 1315  1240  1315     0  Ss    sbwait   0xc3843bc8 sshd
 1314     1     1     0  S     ttydcd   0xc342e400 getty
 1313     1  1313     0  Ss+   wait     0xc382ac90 login
 1312     1  1312     0  Ss+   ttyin    0xc343ac10 getty
 1311     1  1311     0  Ss+   ttyin    0xc343b010 getty
 1310     1  1310     0  Ss+   ttyin    0xc343b410 getty
 1309     1  1309     0  Ss+   ttyin    0xc342f810 getty
 1308     1  1308     0  Ss+   ttyin    0xc3432810 getty
 1307     1  1307     0  Ss+   ttyin    0xc3433c10 getty
 1306     1  1306     0  Ss+   ttyin    0xc3433810 getty
 1305     1  1305     0  Ss+   ttyin    0xc3433010 getty
 1252     1  1252     0  Ss    nanslp   0xc071b64c cron
 1240     1  1240     0  Ss    select   0xc07200c4 sshd
 1219     1  1218  1011  S+    select   0xc07200c4 boinc_client
 1193     1  1193     0  Ss    select   0xc07200c4 cupsd
 1186     1  1186   900  Ss    select   0xc07200c4 cvsupd
 1167     1  1167     0  Ss    select   0xc07200c4 openvpn
 1089     1  1089     0  Ss    select   0xc07200c4 openvpn
 1080     1  1080     0  Rs                        openvpn
 1079  1041  1041    80  S     accept   0xc37ae5ca httpd
 1078  1041  1041    80  S     accept   0xc37ae5ca httpd
 1077  1041  1041    80  S     accept   0xc37ae5ca httpd
 1076  1041  1041    80  S     accept   0xc37ae5ca httpd
 1075  1041  1041    80  S     accept   0xc37ae5ca httpd
 1058     1  1058   561  Ss    select   0xc07200c4 dhcpd
 1041     1  1041     0  Ss    select   0xc07200c4 httpd
 1027     1  1027     0  Ss    select   0xc07200c4 usbd
 1009     1  1009     0  Ss    nanslp   0xc071b64c powerd
  975     1   974     0  S     select   0xc07200c4 snmpd
  956   951   951     0  S     -        0xc3688e00 nfsd
  955   951   951     0  S     -        0xc3739600 nfsd
  954   951   951     0  S     -        0xc3739800 nfsd
  953   951   951     0  S     -        0xc3739a00 nfsd
  951     1   951     0  Ss    accept   0xc378be22 nfsd
  943     1   943     0  Ss    select   0xc07200c4 mountd
  910     0     0     0  SL    mdwait   0xc3759800 [md0]
  896     1   896     0  Ss    select   0xc07200c4 rpcbind
  886     1   886    53  Ss    select   0xc07200c4 named
  811     1   811     0  Ss    select   0xc07200c4 syslogd
  690     1   690     0  Ss    select   0xc07200c4 devd
   46     0     0     0  SL    -        0xd56d4cf8 [schedcpu]
   45     0     0     0  SL    sdflush  0xc072aed4 [softdepflush]
   44     0     0     0  SL    syncer   0xc071b3bc [syncer]
   43     0     0     0  SL    vlruwt   0xc34fb648 [vnlru]
   42     0     0     0  SL    psleep   0xc072054c [bufdaemon]
   41     0     0     0  SL    pgzero   0xc072be44 [pagezero]
   40     0     0     0  SL    psleep   0xc072b994 [vmdaemon]
   39     0     0     0  SL    psleep   0xc072b950 [pagedaemon]
   38     0     0     0  SL    -        0xc34cc400 [dummynet]
   37     0     0     0  WL                        [irq1: atkbd0]
   36     0     0     0  RL                        [swi0: sio]
   35     0     0     0  SL    cooling  0xc33a4cd4 [acpi_cooling0]
   34     0     0     0  SL    tzpoll   0xc08775b8 [acpi_thermal]
   33     0     0     0  WL                        [irq15: ata1]
   32     0     0     0  WL                        [irq14: ata0]
   31     0     0     0  WL                        [irq17: em0]
   30 0 0 0  
