On 08/14/2013 9:43 pm, Shane Ambler wrote:
On 14/08/2013 22:57, dweimer wrote:
I have a few systems running on ZFS with a backup script that creates
snapshots, then backs up the .zfs/snapshot/name directory to make sure
open files are not missed.  This has been working great, but all of a
sudden one of my systems has stopped working.  It takes the snapshots
fine, and zfs list -t snapshot shows the snapshots, but if you run an ls
command on the .zfs/snapshot/ directory it returns "Not a directory".
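
For anyone trying to reproduce this, here is a minimal sketch of the kind of pre-backup check described above. It is not the script from the original post. The function names snapshot_dir and snapshot_accessible are hypothetical, and it assumes the snapshot is exposed under the dataset's mountpoint at .zfs/snapshot/<name>, as in the ls examples in this thread.

```shell
#!/bin/sh
# Hypothetical helper: build the path where a snapshot is exposed
# for a mounted dataset.
#   $1 = mountpoint of the dataset (e.g. /home or /)
#   $2 = snapshot name (the part after the @, e.g. home--bsnap)
snapshot_dir() {
    mp=${1%/}    # drop a trailing slash so "/" does not produce "//"
    printf '%s/.zfs/snapshot/%s\n' "$mp" "$2"
}

# Hypothetical helper: succeed only if the snapshot directory can be
# listed. A failing ls (e.g. "Not a directory") makes this return non-zero.
snapshot_accessible() {
    ls "$(snapshot_dir "$1" "$2")/" >/dev/null 2>&1
}
```

A backup job could call snapshot_accessible /home home--bsnap before reading from the snapshot and log a warning instead of silently backing up nothing when the check fails.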

part of the zfs list output:

NAME                        USED  AVAIL  REFER  MOUNTPOINT
zroot                      4.48G  29.7G    31K  none
zroot/ROOT                 2.92G  29.7G    31K  none
zroot/ROOT/91p5-20130812   2.92G  29.7G  2.92G  legacy
zroot/home                  144K  29.7G   122K  /home

part of the zfs list -t snapshot output:

NAME                                            USED  AVAIL  REFER  MOUNTPOINT
zroot/ROOT/91p5-20130812@91p5-20130812--bsnap   340K      -  2.92G  -
zroot/home@home--bsnap                           22K      -   122K  -

ls /.zfs/snapshot/91p5-20130812--bsnap/
Does work right now, since the last reboot, but wasn't always
working; this is my boot environment.

if I do ls /home/.zfs/snapshot/, result is:
ls: /home/.zfs/snapshot/: Not a directory

if I do ls /home/.zfs, result is:
ls: snapshot: Bad file descriptor
shares

I have tried zpool scrub zroot, and no errors were found. If I reboot the
system I can get one good backup, then I start having problems. Has anyone
else ever run into this? Any suggestions as to a fix?

System is running FreeBSD 9.1-RELEASE-p5 #1 r253764: Mon Jul 29 15:07:35
CDT 2013, zpool is running version 28, zfs is running version 5



I can say I've had this problem. Not certain what fixed it. I do
remember I decided to stop snapshotting if I couldn't access them and
deleted existing snapshots. I later restarted the machine before I
went back for another look and they were working.

So my guess is a restart without existing snapshots may be the key.

Now if only we could find out what started the issue so we can stop it
happening again.

I had actually rebooted it last night, prior to seeing this message, and I do know it didn't have any snapshots this time. As I am booting from ZFS using boot environments, I may have had an older boot environment still on the system the last time it was rebooted. Backups ran great last night after the reboot, and I was able to kick off my pre-backup job and access all the snapshots today. Hopefully it doesn't come back, but if it does I will see if I can find anything else wrong.

FYI,
It didn't shut down cleanly, so if this helps anyone find the issue, this is from my system logs:
Aug 14 22:08:04 cblproxy1 kernel:
Aug 14 22:08:04 cblproxy1 kernel: Fatal trap 12: page fault while in kernel mode
Aug 14 22:08:04 cblproxy1 kernel: cpuid = 0; apic id = 00
Aug 14 22:08:04 cblproxy1 kernel: fault virtual address = 0xa8
Aug 14 22:08:04 cblproxy1 kernel: fault code = supervisor write data, page not present
Aug 14 22:08:04 cblproxy1 kernel: instruction pointer = 0x20:0xffffffff808b0562
Aug 14 22:08:04 cblproxy1 kernel: stack pointer = 0x28:0xffffff80002238f0
Aug 14 22:08:04 cblproxy1 kernel: frame pointer = 0x28:0xffffff8000223910
Aug 14 22:08:04 cblproxy1 kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
Aug 14 22:08:04 cblproxy1 kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Aug 14 22:08:04 cblproxy1 kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Aug 14 22:08:04 cblproxy1 kernel: current process = 1 (init)
Aug 14 22:08:04 cblproxy1 kernel: trap number           = 12
Aug 14 22:08:04 cblproxy1 kernel: panic: page fault
Aug 14 22:08:04 cblproxy1 kernel: cpuid = 0
Aug 14 22:08:04 cblproxy1 kernel: KDB: stack backtrace:
Aug 14 22:08:04 cblproxy1 kernel: #0 0xffffffff808ddaf0 at kdb_backtrace+0x60
Aug 14 22:08:04 cblproxy1 kernel: #1 0xffffffff808a951d at panic+0x1fd
Aug 14 22:08:04 cblproxy1 kernel: #2 0xffffffff80b81578 at trap_fatal+0x388
Aug 14 22:08:04 cblproxy1 kernel: #3 0xffffffff80b81836 at trap_pfault+0x2a6
Aug 14 22:08:04 cblproxy1 kernel: #4 0xffffffff80b80ea1 at trap+0x2a1
Aug 14 22:08:04 cblproxy1 kernel: #5 0xffffffff80b6c7b3 at calltrap+0x8
Aug 14 22:08:04 cblproxy1 kernel: #6 0xffffffff815276da at zfsctl_umount_snapshots+0x8a
Aug 14 22:08:04 cblproxy1 kernel: #7 0xffffffff81536766 at zfs_umount+0x76
Aug 14 22:08:04 cblproxy1 kernel: #8 0xffffffff809340bc at dounmount+0x3cc
Aug 14 22:08:04 cblproxy1 kernel: #9 0xffffffff8093c101 at vfs_unmountall+0x71
Aug 14 22:08:04 cblproxy1 kernel: #10 0xffffffff808a8eae at kern_reboot+0x4ee
Aug 14 22:08:04 cblproxy1 kernel: #11 0xffffffff808a89c0 at kern_reboot+0
Aug 14 22:08:04 cblproxy1 kernel: #12 0xffffffff80b81dab at amd64_syscall+0x29b
Aug 14 22:08:04 cblproxy1 kernel: #13 0xffffffff80b6ca9b at Xfast_syscall+0xfb

--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
