My 7-stable/amd64 server crashes nearly every night while my backup
routine is in progress.  There's no backtrace and no crash dump is saved,
but the console reads:
> ohci1: 1 scheduling overruns
> ohci1: WARNING: addr 0x01cf0000 not found
> ohci1: WARNING: addr 0x012c0000 not found
> ohci1: WARNING: addr 0x01cf0000 not found
> ohci1: WARNING: addr 0x01d50000 not found
> ohci1: WARNING: addr 0x01d70000 not found
> ohci1: WARNING: addr 0x01d60000 not found
> ohci1: WARNING: addr 0x012c0000 not found
> ohci1: WARNING: addr 0x01cf0000 not found
> ohci1: 44 scheduling overruns
> [more of these]
> ohci1: 46 scheduling overruns
before it reboots (or sometimes hangs instead).

These messages are probably /not/ a trace of the root cause, since the
machine also crashes with a kernel that has no ohci support compiled in
at all - those crashes are just silent.

It happens while dump(8)ing an in-filesystem fss(4) snapshot of a mostly
empty FFSv1 (fslevel 4) filesystem sitting on a two-component raid(4)
level 1 set.

There should not, conceptually, be a problem with dumping a fss device,
right?

The command my script runs to create the snapshot is
# fssconfig -cx fss0 /stor /stor/snapshot
and the dump
# dump -$lvl -uant -h 0 -L "$nam" -f - /dev/rfss0 >/tmp/dumpfifo
where /tmp/dumpfifo is a fifo from which
# gzip -1 </tmp/dumpfifo >/var/tmp/dump.gz
reads.  (I don't remember the reason for going via a fifo, but there
was one...)
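For reference, the plumbing boils down to the following.  This is a
stand-alone sketch with dump(8) replaced by a printf stand-in, so only
the fifo/gzip side is exercised; the paths and the data are obviously
made up:

```shell
set -eu

tmp=$(mktemp -d)
fifo="$tmp/dumpfifo"
mkfifo "$fifo"

# Reader: gzip drains the fifo into the archive, as in the script.
gzip -1 <"$fifo" >"$tmp/dump.gz" &

# Writer: stands in for
#   dump -$lvl -uant -h 0 -L "$nam" -f - /dev/rfss0 >/tmp/dumpfifo
printf 'fake dump stream\n' >"$fifo"

wait                          # let gzip see EOF and finish
out=$(gunzip -c "$tmp/dump.gz")   # round-trip check
echo "$out"
rm -rf "$tmp"
```

(The fifo isn't strictly necessary - a plain pipe into gzip would do the
same thing - but it matches what the script does.)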

Any suggestions as to where I could start looking?  So far, I've tried
running a DEBUG kernel, but that didn't provide any additional
information.  The filesystem is clean as far as fsck_ffs is concerned,
too.
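Since no dump gets saved, I'm also double-checking the crash-dump
plumbing itself.  As far as I understand it (the device path below is
just a placeholder for this machine), savecore(8) needs roughly:

```
# set / verify the dump device (defaults to the first swap partition):
swapctl -D /dev/wd0b

# dump and reboot instead of dropping into ddb on panic:
sysctl -w ddb.onpanic=0

# and in /etc/rc.conf, so savecore(8) picks the dump up at boot:
savecore=YES
```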


Here's some information on the filesystem and the RAID set underneath:

# mount -v | grep /stor
/dev/raid0g on /stor type ffs (log, noatime, local, fsid: 0x1206/0x78b, reads: sync 8489 async 0, writes: sync 0 async 1791)


# df -h /stor
Filesystem         Size       Used      Avail %Cap Mounted on
/dev/raid0g        416G        19G       376G   4% /stor


# dumpfs -s /stor
file system: /dev/rraid0g
format  FFSv1
endian  little-endian
magic   11954           time    Fri Mar 11 05:55:54 2016
superblock location     8192    id      [ 564b7b58 793a9223 ]
cylgrp  dynamic inodes  4.4BSD  sblock  FFSv2   fslevel 4
nbfree  12996876        ndir    56002   nifree  26708703        nffree  4718
ncg     580     size    109891568       blocks  109028517
bsize   32768   shift   15      mask    0xffff8000
fsize   4096    shift   12      mask    0xfffff000
frag    8       shift   3       fsbtodb 3
bpg     23684   fpg     189472  ipg     47104
minfree 5%      optim   time    maxcontig 2     maxbpg  8192
symlinklen 60   contigsumsize 2
maxfilesize 0x004002001005ffff
nindir  8192    inopb   256
avgfilesize 16384       avgfpdir 64
sblkno  8       cblkno  16      iblkno  24      dblkno  1496
sbsize  4096    cgsize  32768
csaddr  1496    cssize  12288
cgrotor 0       fmod    0       ronly   0       clean   0x02
wapbl version 0x1       location 2      flags 0x0
wapbl loc0 439587072    loc1 131072     loc2 512        loc3 3
flags   wapbl 
fsmnt   /stor
volname         swuid   0


# raidctl -sv raid0
Components:
           /dev/wd0a: optimal
           /dev/wd1a: optimal
No spares.
Component label for /dev/wd0a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 2015111701, Mod Counter: 1213
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 913211264
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Force
   Last configured as: raid0
Component label for /dev/wd1a:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 2015111701, Mod Counter: 1213
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 913211264
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Force
   Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.