call for 2015Q4 quarterly status reports
Dear FreeBSD Community,

The deadline for the next FreeBSD Quarterly Status update is January 7, 2016, for work done in October through December.

Status report submissions do not have to be very long. They may be about anything happening in the FreeBSD project and community, and provide a great way to inform FreeBSD users and developers about what you're working on. Submission of reports is not restricted to committers. Anyone doing anything interesting and FreeBSD-related can -- and should -- write one!

The preferred and easiest submission method is to use the XML generator [1] with the results emailed to the status report team at monthly at freebsd.org. There is also an XML template [2] which can be filled out manually and attached if preferred. For the expected content and style, please study our guidelines on how to write a good status report [3]. You can also review previous issues [4][5] for ideas on the style and format.

We are looking forward to all of your 2015Q4 reports!

Thanks,
Ben (on behalf of monthly@)

[1] http://www.freebsd.org/cgi/monthly.cgi
[2] http://www.freebsd.org/news/status/report-sample.xml
[3] http://www.freebsd.org/news/status/howto.html
[4] http://www.freebsd.org/news/status/report-2015-04-2015-06.html
[5] http://www.freebsd.org/news/status/report-2015-07-2015-09.html
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
[RESOLVED] LOR On AMD64 hosted by KVM hypervisor
Closing the loop on this for the archives - I am no longer seeing this LOR as of r291998.

-pete

On 12/08/15 11:36, Pete Wright wrote:
> Hey All,
> I am seeing a repeated LOR on r291495 that is pretty reproducible. This
> happens right after the system boots:
>
> lock order reversal:
> 1st 0xfe03e37fa920 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:3476
> 2nd 0xf80024c72200 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:281
> stack backtrace:
> #0 0x80a7b2e0 at witness_debugger+0x70
> #1 0x80a7b1e1 at witness_checkorder+0xe71
> #2 0x80a28ab2 at _sx_xlock+0x72
> #3 0x80cc0a5d at ufsdirhash_add+0x3d
> #4 0x80cc390f at ufs_direnter+0x62f
> #5 0x80d3 at ufs_makeinode+0x5f3
> #6 0x80cc881d at ufs_create+0x2d
> #7 0x80fb2ed1 at VOP_CREATE_APV+0xf1
> #8 0x80ae3568 at vn_open_cred+0x2f8
> #9 0x80adc8ec at kern_openat+0x25c
> #10 0x80e6a4fe at amd64_syscall+0x2de
> #11 0x80e4972b at Xfast_syscall+0xfb
>
> Here is the system info:
> % uname -ar
> FreeBSD bsd-current.trp-srd.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r291495: Mon Nov 30 23:14:34 UTC 2015
> r...@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
>
> The server itself is a VM running under the KVM hypervisor. I am
> currently rebuilding world+kernel now. The base OS image is from the
> latest 11-CURRENT snapshot, and I have been able to reproduce this on
> several hypervisors.
>
> Has anyone else seen this?
>
> Cheers,
> -pete

--
Pete Wright
p...@nomadlogic.org
twitter => @nomadlogicLA
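For readers unfamiliar with the report format above: a "lock order reversal" means two locks were taken in one order at some point and in the opposite order later (here bufwait before dirhash). The following is only a toy, single-threaded sketch of that bookkeeping idea, not the kernel's WITNESS code; the struct and function names are hypothetical and lock ids are small integers chosen by the caller.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* order_seen[a][b] is true once lock b was acquired while a was held. */
struct order_tracker {
	bool order_seen[MAX_LOCKS][MAX_LOCKS];
	bool held[MAX_LOCKS];
};

static void
tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/*
 * Record acquisition of lock `id`. Returns true if this acquisition
 * reverses an order recorded earlier, false otherwise (including for
 * an out-of-range id, which is ignored in this sketch).
 */
static bool
tracker_acquire(struct order_tracker *t, int id)
{
	bool reversal = false;

	if (id < 0 || id >= MAX_LOCKS)
		return (false);
	for (int h = 0; h < MAX_LOCKS; h++) {
		if (!t->held[h] || h == id)
			continue;
		if (t->order_seen[id][h])
			reversal = true;	/* id-before-h was seen earlier */
		t->order_seen[h][id] = true;
	}
	t->held[id] = true;
	return (reversal);
}

static void
tracker_release(struct order_tracker *t, int id)
{
	if (id >= 0 && id < MAX_LOCKS)
		t->held[id] = false;
}
```

Acquiring lock 1 and then lock 0, releasing both, and later acquiring lock 0 and then lock 1 makes the last tracker_acquire() call return true, which is the situation the backtrace above reports.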
Re: panic: sbuf_vprintf called with a NULL sbuf pointer
On Tuesday, December 08, 2015 11:39:56 AM Don Lewis wrote:
> On 8 Dec, John Baldwin wrote:
> > On Monday, December 07, 2015 10:10:51 PM Don Lewis wrote:
> >> On 2 Dec, John Baldwin wrote:
> >> > On Wednesday, December 02, 2015 01:25:56 PM Don Lewis wrote:
> >> >> > If you want to look at this further, try going to frame 16 and
> >> >> > disassembling the instructions before the call to see if you can
> >> >> > spot which register the first parameter (saved in %rdi IIRC)
> >> >> > comes from.
> >> >>
> >> >> Dump of assembler code for function sbuf_printf:
> >> >>    0x80a673e0 <+0>:   push  %rbp
> >> >>    0x80a673e1 <+1>:   mov   %rsp,%rbp
> >> >>    0x80a673e4 <+4>:   push  %r14
> >> >>    0x80a673e6 <+6>:   push  %rbx
> >> >>    0x80a673e7 <+7>:   sub   $0x50,%rsp
> >> >>    0x80a673eb <+11>:  mov   %rsi,%r14
> >> >>    0x80a673ee <+14>:  mov   %rdi,%rbx
> >> >>    0x80a673f1 <+17>:  mov   %r9,-0x38(%rbp)
> >> >>    0x80a673f5 <+21>:  mov   %r8,-0x40(%rbp)
> >> >>    0x80a673f9 <+25>:  mov   %rcx,-0x48(%rbp)
> >> >>    0x80a673fd <+29>:  mov   %rdx,-0x50(%rbp)
> >> >>    0x80a67401 <+33>:  lea   -0x60(%rbp),%rax
> >> >>    0x80a67405 <+37>:  mov   %rax,-0x20(%rbp)
> >> >>    0x80a67409 <+41>:  lea   0x10(%rbp),%rax
> >> >>    0x80a6740d <+45>:  mov   %rax,-0x28(%rbp)
> >> >>    0x80a67411 <+49>:  movl  $0x30,-0x2c(%rbp)
> >> >>    0x80a67418 <+56>:  movl  $0x10,-0x30(%rbp)
> >> >>    0x80a6741f <+63>:  mov   $0x8137bdf8,%rdi
> >> >>    0x80a67426 <+70>:  mov   %rbx,%rsi
> >> >>    0x80a67429 <+73>:  callq 0x80a66c00 <_assert_sbuf_integrity>
> >> >>
> >> >>
> >> >>    0x80a237b9 <+825>: jmpq  0x80a236fd
> >> >>    0x80a237be <+830>: mov   $0x80fd8ad3,%rsi
> >> >>    0x80a237c5 <+837>: xor   %eax,%eax
> >> >>    0x80a237c7 <+839>: mov   %r12,%rdi
> >> >>    0x80a237ca <+842>: mov   -0x228(%rbp),%rdx
> >> >>    0x80a237d1 <+849>: callq 0x80a673e0
> >> >> => 0x80a237d6 <+854>: inc   %r14d
> >> >>    0x80a237d9 <+857>: jmpq  0x80a236fd
> >> >
> >> > So maybe try 'p $r12' in the corefile_open() frame.
> >>
> >> #17 0x80a237d6 in corefile_open (compress=0, comm=,
> >> uid=, pid=, td=,
> >> vpp=, namep=)
> >> at /usr/src/sys/kern/kern_sig.c:3188
> >> 3188 sbuf_printf(&sb, "%s", comm);
> >> (kgdb) p $r12
> >> $1 = 0
> >
> > So it's definitely zero. :( The next step is probably to disassemble the
> > corefile_open function (ugh) and walk backwards to find where %r12 is read
> > from. It might be from a local variable on the stack, so then you would
> > want to examine that memory in the stack and the surrounding memory to see
> > if there is memory corruption and perhaps if there is anything recognizable
> > about it (e.g. if the corruption contains some sort of data you recognize,
> > or if the corruption is bounded by a certain length, etc.). It's a bit of
> > a shot in the dark though.
> >
> > Is this reproducible?
>
> No it's not. The only time it happened was when there was a swap
> timeout, probably because of a lengthy deep recovery on one of the
> mirrored swap devices.
>
> The code in question is:
>     struct sbuf sb;
>     [snip]
>     (void)sbuf_new(&sb, name, MAXPATHLEN, SBUF_FIXEDLEN);
>     [snip]
>     for (i = 0; format[i] != '\0'; i++) {
>         switch (format[i]) {
>         case '%': /* Format character */
>             i++;
>             switch (format[i]) {
>             [snip]
>             case 'N': /* process name */
>                 sbuf_printf(&sb, "%s", comm);
>                 break;
>
>
> &sb is used in a bunch of places, so the compiler probably computes its
> value once by adding the proper offset to the stack pointer and stashing
> the result in a register. Since kern.corefile is "%N.core", the failure
> is happening on the first iteration of the loop, so there isn't much
> opportunity for things to get corrupted. Also, the control flow in this
> function only depends on the format, so there shouldn't be anything
> special about a swap timeout vs. a segfault generated core.

Yes, r12 is callee-saved (IIRC), so I expect it only computes it once as well, but I've sometimes seen the compiler spill local vars onto the stack due to register pressure. That said, I think it is unlikely it would have to spill &sb during the early part of the function. :(

> How is gdb able to print the register contents for an arbitrary stack
> frame? It's not like this is a SPARC with register windows. Aren't
> only the final register values when the core dump wa
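The format walk quoted in this thread (a "%N.core" pattern expanded character by character into a fixed-length buffer) can be tried in userland. The following is only a sketch under stated assumptions: expand_corefile_format() is a hypothetical helper, a bounded char array with snprintf() stands in for the fixed-length sbuf, and the 'P' (process id) and '%%' cases are added for illustration since the quoted excerpt shows only 'N'.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Userland sketch of the quoted format loop, not the kernel function.
 * Returns 0 on success, -1 if out is NULL/empty or the result does not fit.
 */
static int
expand_corefile_format(const char *format, const char *comm, long pid,
    char *out, size_t outlen)
{
	size_t used = 0;
	int n;

	if (out == NULL || outlen == 0)
		return (-1);
	out[0] = '\0';

	for (size_t i = 0; format[i] != '\0'; i++) {
		if (format[i] != '%') {
			/* Ordinary character: copy it through. */
			n = snprintf(out + used, outlen - used, "%c", format[i]);
		} else {
			i++;	/* Look at the format character. */
			switch (format[i]) {
			case 'N':	/* process name */
				n = snprintf(out + used, outlen - used, "%s", comm);
				break;
			case 'P':	/* process id (assumed for illustration) */
				n = snprintf(out + used, outlen - used, "%ld", pid);
				break;
			case '%':	/* literal percent sign */
				n = snprintf(out + used, outlen - used, "%%");
				break;
			case '\0':	/* trailing '%': nothing left to expand */
				return (0);
			default:	/* unknown specifier: skipped in this sketch */
				n = 0;
				break;
			}
		}
		/* snprintf reports the length it wanted; treat overflow as failure. */
		if (n < 0 || (size_t)n >= outlen - used)
			return (-1);
		used += (size_t)n;
	}
	return (0);
}
```

With format "%N.core" and comm "sh", the buffer ends up holding "sh.core". The 'N' case is reached on the first pass through the loop, which matches the observation in the thread that the failing call happens on the first iteration.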
Re: powerpc64-gcc 5.2 vintages get L". . ." type wrong compared to Char for FreeBSD for lib32 compiling
[This time I'm explicit about a patch-gcc-freebsd-powerpc64 update and I report that with the change the PowerMac G5 hosted powerpc64-gcc based buildworld attempt completed, including WITH_LIB32= and WITH_BOOT= being involved. (But it may not be until tomorrow or later until I test if such a build actually works for installworld and reboot.)]

> On 2015-Dec-6, at 2:44 PM, Andreas Tobler wrote:
>
> On 06.12.15 22:34, Mark Millard wrote:
>> [I picked the lists that I did because powerpc64-gcc is the external
>> toolchain created to allow modern powerpc64 builds.]
>>
>> For the powerpc64-gcc 5.2 vintages. . . (using my environment's path
>> as an example)
>>
>> /usr/obj/portswork/usr/ports/devel/powerpc64-gcc/work/gcc-5.2.0/gcc/config/rs6000/freebsd64.h
>> has:
>>
>>> /* rs6000.h gets this wrong for FreeBSD. We use the GCC defaults instead. */
>>> #undef WCHAR_TYPE
>>> #define WCHAR_TYPE (TARGET_64BIT ? "int" : "long int")
>>> #undef WCHAR_TYPE_SIZE
>>> #define WCHAR_TYPE_SIZE 32
>>
>> That type in quotes ends up being the base type for L". . ."
>> notation, for example. Probably the char notation as well (L'?').
>>
>> For FreeBSD Char compatibility in a powerpc64 lib32 context that
>> "long int" should effectively instead be "int", making the
>> conditional above technically unnecessary.
>>
>> This blocks compiling lib32 source code that uses such notations as
>> L". . .": "long int" is not compatible with FreeBSD's Char in the
>> powerpc64 environment's 32 bit environment. Some compiler messages are
>> explicit about the base types of pointers that result for the
>> mismatches: that is how I know that "long int" is in use for L". . ."
>> and "int" is the other base type involved.
>>
>> (Yes, freebsd64.h appears to be used for lib32, at least on
>> powerpc64. By contrast freebsd.h agrees for the WCHAR_TYPE_SIZE but
>> only undef's WCHAR_TYPE, presuming gcc defaults are correct for
>> FreeBSD as far as the type goes. It might need a more explicit type
>> to be sure of a Char match for that freebsd.h file's context.)
>>
>> The 4.9 vintages of powerpc64-gcc were messed up the same way, as was
>> noted at the time.
>
> I'll take care.
>
> Andreas

(I make no claim that this note manages to preserve tabs and such in the diff -u text.)

To turn my earlier note into an actual updated devel/powerpc64-gcc/files/patch-gcc-freebsd-powerpc64 instead of the more vague words would involve adding what would look something like:

> @@ -304,7 +317,7 @@
>
> /* rs6000.h gets this wrong for FreeBSD. We use the GCC defaults instead. */
> #undef WCHAR_TYPE
> -#define WCHAR_TYPE (TARGET_64BIT ? "int" : "long int")
> +#define WCHAR_TYPE "int"
> #undef WCHAR_TYPE_SIZE
> #define WCHAR_TYPE_SIZE 32
>

(It is what I actually tested.)

The full patch-gcc-freebsd-powerpc64 would then look something like:

> --- gcc/config/rs6000/freebsd64.h.orig 2015-01-05 04:33:28.0 -0800
> +++ gcc/config/rs6000/freebsd64.h 2015-12-09 00:14:28.520684000 -0800
> @@ -65,6 +65,13 @@
> #define INVALID_64BIT "-m%s not supported in this configuration"
> #define INVALID_32BIT INVALID_64BIT
>
> +/* Use LINUX64 instead of FREEBSD64 for compat with e.g. sysv4le.h */
> +#ifdef LINUX64_DEFAULT_ABI_ELFv2
> +#define ELFv2_ABI_CHECK (rs6000_elf_abi != 1)
> +#else
> +#define ELFv2_ABI_CHECK (rs6000_elf_abi == 2)
> +#endif
> +
> #undef SUBSUBTARGET_OVERRIDE_OPTIONS
> #define SUBSUBTARGET_OVERRIDE_OPTIONS \
> do \
> @@ -84,6 +91,12 @@
> rs6000_isa_flags &= ~OPTION_MASK_RELOCATABLE; \
> error (INVALID_64BIT, "relocatable"); \
> } \
> + if (ELFv2_ABI_CHECK) \
> + { \
> + rs6000_current_abi = ABI_ELFv2; \
> + if (dot_symbols) \
> + error ("-mcall-aixdesc incompatible with -mabi=elfv2"); \
> + } \
> if (rs6000_isa_flags & OPTION_MASK_EABI) \
> { \
> rs6000_isa_flags &= ~OPTION_MASK_EABI; \
> @@ -304,7 +317,7 @@
>
> /* rs6000.h gets this wrong for FreeBSD. We use the GCC defaults instead. */
> #undef WCHAR_TYPE
> -#define WCHAR_TYPE (TARGET_64BIT ? "int" : "long int")
> +#define WCHAR_TYPE "int"
> #undef WCHAR_TYPE_SIZE
> #define WCHAR_TYPE_SIZE 32
>

I can report that a make buildworld with the following WITH/WITHOUT src.conf type options completed based on the rebuilt powerpc64-gcc. (WITHOUT_CLANG= and WITHOUT_LLDB= were just to save time. The context is already libc++ based and so WITHOUT_GCC= and WITHOUT_GNUXX=.)
Re: Something in r291926 (and earlier) causes reboots during periodic daily (14 jails on system)
On Wed, 09 Dec 2015 09:38:07 +0100 Miroslav Lachman <000.f...@quip.cz> wrote:

> Alexander Leidinger wrote on 12/09/2015 09:00:
> > Does this ring a bell for someone or any ideas before I try to hunt
> > this down?
>
> The same problem was reported yesterday on Stable
>
> Periodic jobs triggering panics in 10.1 and 10.2
> https://lists.freebsd.org/pipermail/freebsd-stable/2015-December/083807.html
>
> Also ZFS with jails.

The above mail has a backtrace involving ZFS.

I'm running the periodic scripts serially now, 8 out of 14 already finished without issues. Smells like a concurrency issue. I would assume it's something introduced between 5 and 1 month ago and MFCed back to 10.

Does this ring a bell for someone on fs@?

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0xC773696B3BAC17DC
http://www.FreeBSD.org netch...@freebsd.org : PGP 0xC773696B3BAC17DC
Re: Something in r291926 (and earlier) causes reboots during periodic daily (14 jails on system)
Alexander Leidinger wrote on 12/09/2015 09:00:
> Hi,
>
> with r291381 a system with 14 jails survives about 1-2 days.
> with r291926 this system survives 1 day.
>
> In both cases it reboots during periodic daily (the jails run periodic
> too, at the usual time).
>
> This is a ZFS-only (+nullfs) system.
>
> There is no coredump. Watchdogd is currently not enabled on this system.
> In the logs I don't find any traces.
>
> The system is not really low on resources:
> last pid: 18031; load averages: 0.25, 0.23, 0.80 up 0+03:59:05 08:57:12
> 189 processes: 1 running, 188 sleeping
> CPU: 0.3% user, 0.0% nice, 0.4% system, 0.1% interrupt, 99.1% idle
> Mem: 579M Active, 709M Inact, 2311M Wired, 8253M Free
> ARC: 1460M Total, 418M MFU, 868M MRU, 1946K Anon, 17M Header, 155M Other
> Swap: 4096M Total, 4096M Free
>
> Does this ring a bell for someone or any ideas before I try to hunt this
> down?

The same problem was reported yesterday on Stable

Periodic jobs triggering panics in 10.1 and 10.2
https://lists.freebsd.org/pipermail/freebsd-stable/2015-December/083807.html

Also ZFS with jails.

Miroslav Lachman
Something in r291926 (and earlier) causes reboots during periodic daily (14 jails on system)
Hi,

with r291381 a system with 14 jails survives about 1-2 days.
with r291926 this system survives 1 day.

In both cases it reboots during periodic daily (the jails run periodic too, at the usual time).

This is a ZFS-only (+nullfs) system.

There is no coredump. Watchdogd is currently not enabled on this system. In the logs I don't find any traces.

The system is not really low on resources:
last pid: 18031; load averages: 0.25, 0.23, 0.80 up 0+03:59:05 08:57:12
189 processes: 1 running, 188 sleeping
CPU: 0.3% user, 0.0% nice, 0.4% system, 0.1% interrupt, 99.1% idle
Mem: 579M Active, 709M Inact, 2311M Wired, 8253M Free
ARC: 1460M Total, 418M MFU, 868M MRU, 1946K Anon, 17M Header, 155M Other
Swap: 4096M Total, 4096M Free

Does this ring a bell for someone or any ideas before I try to hunt this down?

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0xC773696B3BAC17DC
http://www.FreeBSD.org netch...@freebsd.org : PGP 0xC773696B3BAC17DC