Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
> On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
> > Each day at 5:15 we are generating snapshots on various machines.
> > This used to work perfectly under 7-STABLE for years but since
> > we started to use 9.1-STABLE the machine reboots in about 10%
> > of all cases.
> >
> > After rebooting we find a new snapshot file which is a bit
> > smaller than the good ones and with different permissions.
> > It does not pass fsck. In this example it is the one
> > whose name begins with s3:
> >
> > -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
> > -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
> > -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
> > -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
> > -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03
> >
> > After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
> > I see the following LORs (mksnap_ffs starts exactly at 5:15):
> >
> > May 29 05:15:00 palveli kernel: lock order reversal:
> > May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
> > May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
> > May 29 05:15:04 palveli kernel: lock order reversal:
> > May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
> > May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626
> >
> > Unfortunately no corefiles are being generated ;-(.
> >
> > I have checked and even rebuilt the (UFS1) fs in question
> > from scratch. I have also seen this happen on a UFS2 on
> > another machine and on a third one when running "dump -L"
> > on a root fs.
> >
> > Any hints on how to proceed?
>
> Would it be possible to set up a serial console that is logged on this machine
> to see if it is panicking but failing to write out a crashdump?

I'll try to arrange that. It'll take a bit since this box is
200 km away... Maybe I'll find another one nearby to reproduce
it...

	-Andre

--
This email has been checked as virus-free.
It may still be full of nonsense however.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
> Each day at 5:15 we are generating snapshots on various machines.
> This used to work perfectly under 7-STABLE for years but since
> we started to use 9.1-STABLE the machine reboots in about 10%
> of all cases.
>
> After rebooting we find a new snapshot file which is a bit
> smaller than the good ones and with different permissions.
> It does not pass fsck. In this example it is the one
> whose name begins with s3:
>
> -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
> -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
> -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
> -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
> -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03
>
> After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
> I see the following LORs (mksnap_ffs starts exactly at 5:15):
>
> May 29 05:15:00 palveli kernel: lock order reversal:
> May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
> May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
> May 29 05:15:04 palveli kernel: lock order reversal:
> May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
> May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626
>
> Unfortunately no corefiles are being generated ;-(.
>
> I have checked and even rebuilt the (UFS1) fs in question
> from scratch. I have also seen this happen on a UFS2 on
> another machine and on a third one when running "dump -L"
> on a root fs.
>
> Any hints on how to proceed?

Would it be possible to set up a serial console that is logged on this machine
to see if it is panicking but failing to write out a crashdump?

--
John Baldwin
pf losing (v6) TCP states much too early, "no-route" not working with IPv6
Hello,

my default pf config blocks everything and allows specific connections.
One of them is "in from x to self port ssh", which expands to
"port ssh keep state flags S/SA" by default.

After ssh login, I see the corresponding entry in the state table:

all tcp 2001:db8:f0bb:1::1[22] <- 2001:db8:f0bb:1::3:1[42730] ESTABLISHED:ESTABLISHED

pfctl -s info claims:

TIMEOUTS:
...
tcp.established 86400s
...

After a couple of hours of inactivity, the ssh session silently stalls.
Here's what I have in the log:

rule 3/0(match): block in on rl1: 2001:db8:f0bb:1::3:1.42730 > 2001:db8:f0bb:1::1.22: Flags [P.], ack 1444009640, win 65535, length 48

The rule evaluation by itself is correct: the packet is not a TCP SYN, so it
gets blocked. But this packet should not have to traverse the ruleset at all,
because it should still match the existing state, at least until the
connection has been idle for 86400s. In my case, it happened after ~3 hours.
The port numbers are exactly the same as in the state table entry from some
hours before, so the state table entry seems to have been lost!

My question: Is such a problem known? Did I miss anything else?
The system runs 8.1-STABLE/x86.

Another issue was that "no-route" doesn't work for IPv6 connections. I had
to replace it with "any".

Thanks in advance for any hints,

-Harry

P.S.: It's an embedded box where upgrading is overdue, but not that easy...
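One way to narrow this down is to watch the per-state timers while the session is still alive. The following is only a rough sketch, not a diagnosis: it assumes that `pfctl -vv -s states` prints a detail line of the form "age HH:MM:SS, expires in HH:MM:SS, ..." below each state, and that no adaptive timeout scaling is in effect, so that the remaining lifetime equals the configured timeout minus the idle time. The function names and the sample line are hypothetical and do not come from the box described above.

```python
import re

# tcp.established value quoted in the TIMEOUTS output above (seconds).
TCP_ESTABLISHED_TIMEOUT = 86400

# Assumed layout of the verbose per-state detail line, for example:
#   "   age 03:00:00, expires in 21:00:00, 10:12 pkts, 900:1200 bytes, rule 3"
STATE_TIMES = re.compile(
    r"age (\d+):(\d\d):(\d\d), expires in (\d+):(\d\d):(\d\d)"
)


def to_seconds(hours, minutes, seconds):
    """Convert HH, MM, SS strings to a number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def state_times(detail_line):
    """Return (age, expires_in) in seconds, or None if the line has no timers."""
    match = STATE_TIMES.search(detail_line)
    if match is None:
        return None
    groups = match.groups()
    return to_seconds(*groups[:3]), to_seconds(*groups[3:])


def implied_idle_seconds(expires_in, timeout=TCP_ESTABLISHED_TIMEOUT):
    """Idle time implied by the remaining lifetime.

    Assumes the state really uses `timeout` and that adaptive timeouts
    are not shrinking it. A large value right after traffic was sent would
    suggest a different timeout is being applied to the state.
    """
    return timeout - expires_in


if __name__ == "__main__":
    # Hypothetical sample line, not captured from the affected machine.
    sample = "   age 03:00:00, expires in 21:00:00, 10:12 pkts, 900:1200 bytes, rule 3"
    age, expires_in = state_times(sample)
    print(age, expires_in, implied_idle_seconds(expires_in))
```

If the remaining lifetime sampled right after typing into the ssh session is far below 86400s, the state is probably not governed by tcp.established, which would be worth reporting along with the ruleset.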
FreeBSD-9.1: machine reboots during snapshot creation, LORs found
Each day at 5:15 we are generating snapshots on various machines.
This used to work perfectly under 7-STABLE for years but since
we started to use 9.1-STABLE the machine reboots in about 10%
of all cases.

After rebooting we find a new snapshot file which is a bit
smaller than the good ones and with different permissions.
It does not pass fsck. In this example it is the one
whose name begins with s3:

-r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
-r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
-r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
-r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
-r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03

After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
I see the following LORs (mksnap_ffs starts exactly at 5:15):

May 29 05:15:00 palveli kernel: lock order reversal:
May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
May 29 05:15:04 palveli kernel: lock order reversal:
May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626

Unfortunately no corefiles are being generated ;-(.

I have checked and even rebuilt the (UFS1) fs in question
from scratch. I have also seen this happen on a UFS2 on
another machine and on a third one when running "dump -L"
on a root fs.

Any hints on how to proceed?

	-Andre
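When comparing several machines, it can help to reduce the WITNESS output to distinct lock pairs so that recurring reversals stand out. The following is a minimal sketch under stated assumptions: it expects each "1st"/"2nd" entry on a single syslog line (the file path is not wrapped onto a separate line as in the mail), and the names `lor_pairs` and `summarize` are made up for illustration. It only groups the reports; it says nothing about whether a given LOR is harmless or related to the reboots.

```python
import re
from collections import Counter

# Matches the "1st"/"2nd" lines of a WITNESS lock order reversal report, e.g.
#   kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
LOCK_LINE = re.compile(
    r"kernel: (?P<ord>1st|2nd) 0x[0-9a-f]+ (?P<name>.+?) \(.+?\) @ (?P<where>\S+)"
)


def lor_pairs(lines):
    """Yield ((name1, where1), (name2, where2)) for each reversal report."""
    first = None
    for line in lines:
        match = LOCK_LINE.search(line)
        if match is None:
            continue  # header lines and unrelated messages are skipped
        entry = (match.group("name"), match.group("where"))
        if match.group("ord") == "1st":
            first = entry
        elif first is not None:
            yield first, entry
            first = None


def summarize(lines):
    """Count how often each (first lock, second lock) pair was reported."""
    return Counter(lor_pairs(lines))


# The four lock lines from the report above, joined onto single lines.
SAMPLE = [
    "May 29 05:15:00 palveli kernel: lock order reversal:",
    "May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240",
    "May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414",
    "May 29 05:15:04 palveli kernel: lock order reversal:",
    "May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976",
    "May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626",
]

if __name__ == "__main__":
    for (first, second), count in summarize(SAMPLE).items():
        print(f"{count}x {first[0]} @ {first[1]} -> {second[0]} @ {second[1]}")
```

Feeding it the relevant part of /var/log/messages from each affected machine would show whether the same two reversals appear everywhere or only on this box.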
[releng_9 tinderbox] failure on i386/i386
TB --- 2013-05-31 07:10:25 - tinderbox 2.10 running on freebsd-stable.sentex.ca
TB --- 2013-05-31 07:10:25 - FreeBSD freebsd-stable.sentex.ca 8.3-STABLE FreeBSD 8.3-STABLE #0: Tue Oct 16 17:37:58 UTC 2012 mdtan...@freebsd-stable.sentex.ca:/usr/obj/usr/src/sys/server amd64
TB --- 2013-05-31 07:10:25 - starting RELENG_9 tinderbox run for i386/i386
TB --- 2013-05-31 07:10:25 - cleaning the object tree
TB --- 2013-05-31 07:10:45 - /usr/local/bin/svn stat /src
TB --- 2013-05-31 07:10:50 - At svn revision 251176
TB --- 2013-05-31 07:10:51 - building world
TB --- 2013-05-31 07:10:51 - CROSS_BUILD_TESTING=YES
TB --- 2013-05-31 07:10:51 - MAKEOBJDIRPREFIX=/obj
TB --- 2013-05-31 07:10:51 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2013-05-31 07:10:51 - SRCCONF=/dev/null
TB --- 2013-05-31 07:10:51 - TARGET=i386
TB --- 2013-05-31 07:10:51 - TARGET_ARCH=i386
TB --- 2013-05-31 07:10:51 - TZ=UTC
TB --- 2013-05-31 07:10:51 - __MAKE_CONF=/dev/null
TB --- 2013-05-31 07:10:51 - cd /src
TB --- 2013-05-31 07:10:51 - /usr/bin/make -B buildworld
>>> World build started on Fri May 31 07:10:51 UTC 2013
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Fri May 31 10:05:30 UTC 2013
TB --- 2013-05-31 10:05:30 - generating LINT kernel config
TB --- 2013-05-31 10:05:30 - cd /src/sys/i386/conf
TB --- 2013-05-31 10:05:30 - /usr/bin/make -B LINT
TB --- 2013-05-31 10:05:30 - cd /src/sys/i386/conf
TB --- 2013-05-31 10:05:30 - /usr/sbin/config -m LINT
TB --- 2013-05-31 10:05:30 - building LINT kernel
TB --- 2013-05-31 10:05:30 - CROSS_BUILD_TESTING=YES
TB --- 2013-05-31 10:05:30 - MAKEOBJDIRPREFIX=/obj
TB --- 2013-05-31 10:05:30 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2013-05-31 10:05:30 - SRCCONF=/dev/null
TB --- 2013-05-31 10:05:30 - TARGET=i386
TB --- 2013-05-31 10:05:30 - TARGET_ARCH=i386
TB --- 2013-05-31 10:05:30 - TZ=UTC
TB --- 2013-05-31 10:05:30 - __MAKE_CONF=/dev/null
TB --- 2013-05-31 10:05:30 - cd /src
TB --- 2013-05-31 10:05:30 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri May 31 10:05:30 UTC 2013
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
>>> stage 3.2: building everything
[...]
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/aha/aha_isa.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/aha/aha_mca.c
In file included from /src/sys/dev/aha/aha_mca.c:49:
/src/sys/dev/aha/ahareg.h:300: error: field 'timer' has incomplete type
/src/sys/dev/aha/aha_mca.c: In function 'aha_mca_attach':
/src/sys/dev/aha/aha_mca.c:194: error: 'aha' undeclared (first use in this function)
/src/sys/dev/aha/aha_mca.c:194: error: (Each undeclared identifier is reported only once
/src/sys/dev/aha/aha_mca.c:194: error: for each function it appears in.)
*** Error code 1
Stop in /obj/i386.i386/src/sys/LINT.
*** Error code 1
Stop in /src.
*** Error code 1
Stop in /src.
TB --- 2013-05-31 10:12:04 - WARNING: /usr/bin/make returned exit code 1
TB --- 2013-05-31 10:12:04 - ERROR: failed to build LINT kernel
TB --- 2013-05-31 10:12:04 - 8357.76 user 914.91 system 10899.77 real

http://tinderbox.freebsd.org/tinderbox-freebsd9-build-RELENG_9-i386-i386.full