Re: Kernel crash during heavy disk access
> Date: Tue, 9 Jul 2013 18:29:01 -0700
> Subject: Re: Kernel crash during heavy disk access
> From: Adrian Chadd adr...@freebsd.org
> To: Benjamin Kaduk b...@freebsd.org, Jeff Roberson j...@freebsd.org,
>     Kirk McKusick mckus...@mckusick.com
> Cc: Eric Camachat eric.camac...@gmail.com, curr...@freebsd.org
>
> Well, best to tell kirk and jeffr. Jeffr wrote the journaling stuff.
>
> .. but I thought they knew there's still problems?
>
> -adrian

Jeff has fixed all the journaling issues that we have some way of reproducing. We do still have some reports of problems, but they come with only a vague description and nothing that we can use to reproduce them on our systems.

One of the inherent characteristics of any type of journaling is that once it thinks that it has fixed something, it never goes back and checks it again later. So, if some inconsistency gets into your filesystem through a media error or an earlier journaling bug, it will stay there and continue to plague you until a full fsck is run to clean it up.

So, if you are getting filesystem-related crashes, the first thing you should do is a full check (fsck -f) to make sure that you are starting from a clean state. After that, if you find that the journaling is not keeping it consistent, please send Jeff and me a report of what you are doing, what problems it creates, and most importantly a transcript of a run of `fsck_ffs -d', first using the journal and then a second time with a full check (fsck_ffs -f -d), so that we can try to analyse what is going wrong. Note that you need to run fsck_ffs explicitly because the fsck front end will not pass the -d (debug output) flag through to fsck_ffs.

Kirk McKusick

> On 9 July 2013 17:48, Benjamin Kaduk b...@freebsd.org wrote:
>> On Tue, 9 Jul 2013, Adrian Chadd wrote:
>>> On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:
>>>> On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
>>>>> Hi,
>>>>>
>>>>> Try doing a full, non-journal fsck.
>>>>>
>>>>> -adrian
>>>>
>>>> Thank you, it fixed the problem!
>>>> Does it mean journal didn't work?
>>>
>>> Yup :(
>>
>> So, you are going to tell Kirk about it?
>>
>> -Ben

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
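The two debug runs requested above could be captured along the following lines. This is only a sketch under stated assumptions: the function names, the DRY_RUN switch, the log file names, and the example device are all hypothetical, and it assumes FreeBSD's script(1), which accepts a command after the output file. It also assumes the filesystem is not mounted read-write while fsck_ffs runs.

```shell
#!/bin/sh
# Sketch only. Hypothetical names: run_logged, collect_fsck_transcripts,
# DRY_RUN, the log file names. None of these come from the thread.

# Run a command under script(1) so that prompts and answers end up in the log.
# Assumes the FreeBSD form: script [-q] file command ...
run_logged() {
    log=$1
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        # Dry run: print what would be executed and touch nothing.
        echo "script -q $log $*"
    else
        script -q "$log" "$@"
    fi
}

collect_fsck_transcripts() {
    dev=$1
    outdir=${2:-/var/tmp}

    # First pass: debug output, letting fsck_ffs use the journal.
    run_logged "$outdir/fsck-journal.log" fsck_ffs -d "$dev"

    # Second pass: debug output with a forced full check.
    run_logged "$outdir/fsck-full.log" fsck_ffs -f -d "$dev"
}
```

Setting DRY_RUN=1 before calling collect_fsck_transcripts with a device name prints the two commands without running them, which is a reasonable way to check the invocation first. fsck_ffs is called directly, not through the fsck front end, because the front end does not pass -d through.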
Re: Kernel crash during heavy disk access
I still get issues with latest stable/9 and panics during or just after a bunch of disk IO. I can try to reproduce this if you'd like.

-adrian
Re: Kernel crash during heavy disk access
Hi,

Try doing a full, non-journal fsck.

-adrian

On 8 July 2013 21:41, Eric Camachat eric.camac...@gmail.com wrote:
> I experienced kernel crashes while running make world or building ports.
> For example:
> # cd /usr/ports/lang/mono
> # make
>
> Will cause the crash, from /var/crash/core.txt:
>
> eb8460p dumped core - see /var/crash/vmcore.5
>
> Mon Jul 8 21:22:58 PDT 2013
>
> FreeBSD eb8460p 10.0-CURRENT FreeBSD 10.0-CURRENT #5 r253048: Mon Jul 8 19:07:18 PDT 2013 root@eb8460p:/usr/obj/usr/src/sys/EB8460p amd64
>
> panic: ffs_valloc: dup alloc
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type show copying to see the conditions.
> There is absolutely no warranty for GDB. Type show warranty for details.
> This GDB was configured as amd64-marcel-freebsd...
>
> Unread portion of the kernel message buffer:
> mode = 0100600, inum = 52969060, fs = /
> panic: ffs_valloc: dup alloc
> cpuid = 0
> KDB: stack backtrace:
> #0 0x805fd6d0 at kdb_backtrace+0x60
> #1 0x805c5b65 at panic+0x155
> #2 0x807dda6a at ffs_valloc+0x88a
> #3 0x8081a34c at ufs_makeinode+0x7c
> #4 0x808d2872 at VOP_CREATE_APV+0x92
> #5 0x80670c49 at vn_open_cred+0x2c9
> #6 0x8066a22f at kern_openat+0x1ef
> #7 0x8085db47 at amd64_syscall+0x357
> #8 0x808475db at Xfast_syscall+0xfb
> Uptime: 6m57s
> Dumping 599 out of 3972 MB:..3%..11%..22%..33%..41%..51%..62%..73%..81%..91%
>
> Reading symbols from /boot/modules/cuse4bsd.ko...done.
> Loaded symbols for /boot/modules/cuse4bsd.ko
> Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/fdescfs.ko.symbols
> Reading symbols from /boot/kernel/ng_ubt.ko.symbols...done.
> Loaded symbols for /boot/kernel/ng_ubt.ko.symbols
> Reading symbols from /boot/kernel/netgraph.ko.symbols...done.
> Loaded symbols for /boot/kernel/netgraph.ko.symbols
> Reading symbols from /boot/kernel/ng_hci.ko.symbols...done.
> Loaded symbols for /boot/kernel/ng_hci.ko.symbols
> Reading symbols from /boot/kernel/ng_bluetooth.ko.symbols...done.
> Loaded symbols for /boot/kernel/ng_bluetooth.ko.symbols
> Reading symbols from /boot/kernel/ums.ko.symbols...done.
> Loaded symbols for /boot/kernel/ums.ko.symbols
> Reading symbols from /boot/kernel/ng_l2cap.ko.symbols...done.
> Loaded symbols for /boot/kernel/ng_l2cap.ko.symbols
> Reading symbols from /boot/kernel/ng_btsocket.ko.symbols...done.
> Loaded symbols for /boot/kernel/ng_btsocket.ko.symbols
> Reading symbols from /boot/kernel/ng_socket.ko.symbols...done.
> Loaded symbols for /boot/kernel/ng_socket.ko.symbols
> #0 doadump (textdump=<value optimized out>) at pcpu.h:236
> 236 pcpu.h: No such file or directory.
>     in pcpu.h
> (kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:236
> #1 0x805c57e0 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447
> #2 0x805c5ba4 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:754
> #3 0x807dda6a in ffs_valloc (pvp=<value optimized out>, mode=<value optimized out>, cred=<value optimized out>, vpp=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1022
> #4 0x8081a34c in ufs_makeinode (mode=<value optimized out>, dvp=0xfe011bf44ce8, vpp=0xff811ba058d8, cnp=0xff811ba05900) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2620
> #5 0x808d2872 in VOP_CREATE_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:265
> #6 0x80670c49 in vn_open_cred (ndp=0xff811ba05880, flagp=0xff811ba0595c, cmode=420, vn_open_flags=<value optimized out>, cred=0xfe0011fcee00, fp=0xfe00110925a0) at vnode_if.h:109
> #7 0x8066a22f in kern_openat (td=0xfe011960f920, fd=<value optimized out>, path=0x801dbd580 <Address 0x801dbd580 out of bounds>, pathseg=UIO_USERSPACE, flags=1538, mode=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1093
> #8 0x8085db47 in amd64_syscall (td=0xfe011960f920, traced=0) at subr_syscall.c:134
> #9 0x808475db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:391
> #10 0x0008013a5f2a in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language: auto; currently minimal
> (kgdb)
>
> --
> Eric Camachat
Re: Kernel crash during heavy disk access
On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
> Hi,
>
> Try doing a full, non-journal fsck.
>
> -adrian

Thank you, it fixed the problem!
Does it mean journal didn't work?

--
Eric Camachat
Re: Kernel crash during heavy disk access
On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:
> On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
>> Hi,
>>
>> Try doing a full, non-journal fsck.
>>
>> -adrian
>
> Thank you, it fixed the problem!
> Does it mean journal didn't work?

Yup :(

-adrian
Re: Kernel crash during heavy disk access
On Tue, 9 Jul 2013, Adrian Chadd wrote:
> On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:
>> On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
>>> Hi,
>>>
>>> Try doing a full, non-journal fsck.
>>>
>>> -adrian
>>
>> Thank you, it fixed the problem!
>> Does it mean journal didn't work?
>
> Yup :(

So, you are going to tell Kirk about it?

-Ben
Re: Kernel crash during heavy disk access
Well, best to tell kirk and jeffr. Jeffr wrote the journaling stuff.

.. but I thought they knew there's still problems?

-adrian

On 9 July 2013 17:48, Benjamin Kaduk b...@freebsd.org wrote:
> On Tue, 9 Jul 2013, Adrian Chadd wrote:
>> On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:
>>> On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
>>>> Hi,
>>>>
>>>> Try doing a full, non-journal fsck.
>>>>
>>>> -adrian
>>>
>>> Thank you, it fixed the problem!
>>> Does it mean journal didn't work?
>>
>> Yup :(
>
> So, you are going to tell Kirk about it?
>
> -Ben
Kernel crash during heavy disk access
I experienced kernel crashes while running make world or building ports.
For example:
# cd /usr/ports/lang/mono
# make

Will cause the crash, from /var/crash/core.txt:

eb8460p dumped core - see /var/crash/vmcore.5

Mon Jul 8 21:22:58 PDT 2013

FreeBSD eb8460p 10.0-CURRENT FreeBSD 10.0-CURRENT #5 r253048: Mon Jul 8 19:07:18 PDT 2013 root@eb8460p:/usr/obj/usr/src/sys/EB8460p amd64

panic: ffs_valloc: dup alloc

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
mode = 0100600, inum = 52969060, fs = /
panic: ffs_valloc: dup alloc
cpuid = 0
KDB: stack backtrace:
#0 0x805fd6d0 at kdb_backtrace+0x60
#1 0x805c5b65 at panic+0x155
#2 0x807dda6a at ffs_valloc+0x88a
#3 0x8081a34c at ufs_makeinode+0x7c
#4 0x808d2872 at VOP_CREATE_APV+0x92
#5 0x80670c49 at vn_open_cred+0x2c9
#6 0x8066a22f at kern_openat+0x1ef
#7 0x8085db47 at amd64_syscall+0x357
#8 0x808475db at Xfast_syscall+0xfb
Uptime: 6m57s
Dumping 599 out of 3972 MB:..3%..11%..22%..33%..41%..51%..62%..73%..81%..91%

Reading symbols from /boot/modules/cuse4bsd.ko...done.
Loaded symbols for /boot/modules/cuse4bsd.ko
Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
Loaded symbols for /boot/kernel/fdescfs.ko.symbols
Reading symbols from /boot/kernel/ng_ubt.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_ubt.ko.symbols
Reading symbols from /boot/kernel/netgraph.ko.symbols...done.
Loaded symbols for /boot/kernel/netgraph.ko.symbols
Reading symbols from /boot/kernel/ng_hci.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_hci.ko.symbols
Reading symbols from /boot/kernel/ng_bluetooth.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_bluetooth.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/ng_l2cap.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_l2cap.ko.symbols
Reading symbols from /boot/kernel/ng_btsocket.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_btsocket.ko.symbols
Reading symbols from /boot/kernel/ng_socket.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_socket.ko.symbols
#0 doadump (textdump=<value optimized out>) at pcpu.h:236
236 pcpu.h: No such file or directory.
    in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:236
#1 0x805c57e0 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447
#2 0x805c5ba4 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:754
#3 0x807dda6a in ffs_valloc (pvp=<value optimized out>, mode=<value optimized out>, cred=<value optimized out>, vpp=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1022
#4 0x8081a34c in ufs_makeinode (mode=<value optimized out>, dvp=0xfe011bf44ce8, vpp=0xff811ba058d8, cnp=0xff811ba05900) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2620
#5 0x808d2872 in VOP_CREATE_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:265
#6 0x80670c49 in vn_open_cred (ndp=0xff811ba05880, flagp=0xff811ba0595c, cmode=420, vn_open_flags=<value optimized out>, cred=0xfe0011fcee00, fp=0xfe00110925a0) at vnode_if.h:109
#7 0x8066a22f in kern_openat (td=0xfe011960f920, fd=<value optimized out>, path=0x801dbd580 <Address 0x801dbd580 out of bounds>, pathseg=UIO_USERSPACE, flags=1538, mode=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1093
#8 0x8085db47 in amd64_syscall (td=0xfe011960f920, traced=0) at subr_syscall.c:134
#9 0x808475db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:391
#10 0x0008013a5f2a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

--
Eric Camachat
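When triaging a report like this one, the panic string and the KDB frame names are usually the first things worth pulling out of the crash summary. The following is a minimal sketch, assuming a core.txt-style file with one frame per line in the "#N 0xADDR at func+0xOFF" form shown in the kernel message buffer; the function name summarize_panic is hypothetical and not part of any FreeBSD tool.

```shell
#!/bin/sh
# Sketch only: summarize_panic is a hypothetical helper, not a FreeBSD utility.

# Print the first panic message and the function names of the KDB backtrace
# found in the given file.
summarize_panic() {
    file=$1

    # The panic string can appear more than once; the first copy is enough.
    grep -m 1 '^panic: ' "$file"

    # KDB frames look like "#2 0x807dda6a at ffs_valloc+0x88a".
    # kgdb frames use "in" instead of "at" and are deliberately not matched.
    sed -n 's/^#[0-9][0-9]* 0x[0-9a-f]* at \([A-Za-z_][A-Za-z0-9_]*\)+0x[0-9a-f]*$/\1/p' "$file"
}
```

Calling summarize_panic /var/crash/core.txt on a report of this kind would list the panic line followed by the call chain, innermost frame first, which is often enough to tell whether two reports describe the same failure.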