Re: after trivial update, 15.0 ARM64 system no longer boots
Are you sure that the system isn't actually booting or are you saying that because you get no console output after going to userland? It's possible that /etc/ttys is not in sync with your console configuration in loader.conf. If that happened, nothing from /etc/rc would print to console during boot and you wouldn't get a login prompt, but networking and the like would still work. If you need the console to access it (e.g. no sshd is configured) you can try booting into single-user and look at /etc/ttys from there. If you can't actually reach single-user mode then the problem is something else. On Fri, Mar 15, 2024 at 11:57 AM Lexi Winter wrote: > > hi lists, > > i have a FreeBSD 15.0/arm64 system, an RPi4, which was previously > running 15.0 with pkgbase. i rebuilt main on my pkg server and updated > the RPi with 'pkg update', which only included ~2 commits neither of > which seemed like they had anything to do with booting, but after the > update, the system no longer boots. > > the problem seems to be a hang during kernel initialisation: > > https://www.le-fay.org/tmp/30d/9fE0NG.jpeg > > i am not really an expert on either ARM64 in general or on the RPi > hardware in particular. could anyone suggest how i could debug this > problem, e.g. to get more information about why the system won't finish > booting? > > thanks, lexi.
Re: buildkernel is broken
You could "git cherry-pick -n 37f604b49d4a; git restore --unstaged sys/net/vnet.h" to apply the fix to your local tree without committing it or leaving it staged for commit. On Thu, Jul 7, 2022 at 10:50 AM Steve Kargl wrote: > > On Thu, Jul 07, 2022 at 10:38:43AM -0400, Ryan Stone wrote: > > Okay, update your tree and it should be fixed then. > > Is it possible to pull just that fix? I spent part of > yesterday building world, and contrary to popular belief, > not all hardware contain a 32-core uber-fast ryzen cpu. > > Can people please test their simple changes prior to > committing? > > -- > Steve
Re: buildkernel is broken
Okay, update your tree and it should be fixed then.
Re: buildkernel is broken
Do you have VNET disabled in your kernel config? I believe that this was fixed by 37f604b49d4a.

On Thu, Jul 7, 2022 at 1:07 AM Steve Kargl wrote:
>
> -std=iso9899:1999 -Werror /usr/src/sys/netinet/tcp_input.c
> --- modules-all ---
> /usr/src/sys/netpfil/ipfw/ip_dn_io.c:674:4: error: 'continue' statement not in loop statement
>                         continue;
>                         ^
> 1 error generated.
> *** [ip_dn_io.o] Error code 1
>
> make[4]: stopped in /usr/src/sys/modules/dummynet
> 1 error
> *** [modules-all] Error code 6
>
> make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/SPEW
> 1 error
>
> make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/SPEW
> 5.75 real 20.45 user 2.30 sys
>
> make[1]: stopped in /usr/src
>
>
> Please fix.
>
> --
> Steve
>
/usr/share/locale/nn_NO.ISO8859-15/LC_MESSAGES is a symbolic link to itself on head
I happened to trip over this after doing a source upgrade on a machine running -HEAD. A fresh VM build from -HEAD that was done this morning shows the same issue.

root@test:~ # uname -a
FreeBSD test 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n253973-8f1543785f77: Sat Mar 26 02:40:01 EDT 2022 build@rstone-build:/usr/obj/srcpool/src/build/freebsd-build/amd64.amd64/sys/GENERIC amd64
root@test:~ # realpath /usr/share/locale/nn_NO.ISO8859-15/LC_MESSAGES
realpath: /usr/share/locale/nn_NO.ISO8859-15/LC_MESSAGES: Too many levels of symbolic links
root@test:~ # readlink /usr/share/locale/nn_NO.ISO8859-15/LC_MESSAGES
../nn_NO.ISO8859-15/LC_MESSAGES
root@test:~ # ls -l /usr/share/locale/nn_NO.ISO8859-15
total 28
-r--r--r--  1 root  wheel  16464 Mar 26 02:40 LC_COLLATE
lrwxr-xr-x  1 root  wheel     28 Mar 26 02:40 LC_CTYPE -> ../en_US.ISO8859-15/LC_CTYPE
lrwxr-xr-x  1 root  wheel     31 Mar 26 02:40 LC_MESSAGES -> ../nn_NO.ISO8859-15/LC_MESSAGES
-r--r--r--  1 root  wheel     33 Mar 26 02:40 LC_MONETARY
lrwxr-xr-x  1 root  wheel     29 Mar 26 02:40 LC_NUMERIC -> ../uk_UA.ISO8859-5/LC_NUMERIC
-r--r--r--  1 root  wheel    392 Mar 26 02:40 LC_TIME
Re: schedgraph.d experience, per-CPU buffers, pipes
I've definitely experienced the issue of some buffers rolling over faster than others and producing confusing schedgraph data. I'm away from home this week, but I believe that I have a script that tries to chop off the schedgraph data at the point where the most recent CPU to roll over has no more data. If I think of it I'll try to pass it along.

Another issue that I remember encountering is that there is a limit on the total amount of space that you can allocate to dtrace buffers, and it does not scale with the number of CPUs in the system, so the more CPUs you have, the less buffer you can allocate per CPU. I seem to recall the limit being very small compared to the amount of memory on a modern system with a large core count.

On Fri, Dec 24, 2021 at 8:08 AM Andriy Gapon wrote:
>
>
> I would like to share some experience or maybe rather a warning about using
> DTrace for tracing scheduling events. Unlike KTR which has a global circular
> buffer, DTrace with bufpolicy=ring uses per-CPU circular buffers. So, if there
> is an asymmetry in processor load, the buffers will fill and wrap-around at
> different speeds. In the end, they might have approximately equal numbers of
> events but those may cover very different time intervals. So, some additional
> post-processing is required to find the latest event among first ones of each
> per-CPU buffer. Any traces from before that would have information gaps
> ("missing" processors) and would be very confusing.
>
> Also, I noticed that processes passing a lot of data through pipes produce a lot
> of scheduling events as they seem to get blocked and unlocked every few
> microseconds (on a modern performant system with the default pipe sizing
> configuration). That contributes to a quick wrap-around of circular buffers.
>
> --
> Andriy Gapon
>
Re: recent head having significantly less "avail memory"
On Mon, Sep 13, 2021 at 2:13 PM Guido Falsi via freebsd-current wrote: > I'm not sure how to get the verbose data for the old boot, since I've > been unable to revert the machine to the old state. I'll try anyway though. Do you have physical access to the machine? It might be easiest to grab a snapshot image, stick it on a USB drive and boot from that.
Re: FreeBSD 13.0-RC1 ethertype IPv6 (0x86dd), length 2942 on 1500 MTU
Hi Lars, Do you see the TSO6 option enabled on your vtnet interface? Do you see normal packet sizes if you disable it with "ifconfig vtnet0 -tso6"? Does it actually fix your IPv6 issue? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: HEADS UP: FreeBSD src repo transitioning to git this weekend
On Mon, Jan 4, 2021 at 3:44 PM Poul-Henning Kamp wrote:
> Shattered is less impressive when you take into account that you
> can stuff as much much garbage into a PDF file as you need, without
> affecting the files normal function.
>
> Compact data formats, formats which leave no wiggle-room and do not
> offer extension-space for "attic-junk", are much harder to produce
> *meaningful* collisions for.
>
> (I take no opinion in where git is on that spectrum.)

FWIW, a coworker of mine had a little hobby of introducing commits into our internal repo whose hashes all started with deadc0de. As I understand it, they were able to do this by adding a bogus attribute with the right value to the commit object. Now, brute-forcing 8 digits of the hash is one thing and doing it for all 40 is quite another, but I suspect that this demonstrates that it's *possible* to do it for a git hash, given enough computing resources.
Re: Laptop exhibits erratic responsiveness
On Sun, Nov 29, 2020 at 9:12 AM David Wolfskill wrote:
> OK, and demonstrated some long RTTs about every 11 packets or so, but we
> see thing come to a screeching halt with:
>
> ...
> 64 bytes from 172.16.8.13: icmp_seq=534 ttl=63 time=0.664 ms
> lockstat: dtrace_status(): Abort due to systemic unresponsiveness
> 64 bytes from 172.16.8.13: icmp_seq=535 ttl=63 time=9404.383 ms
>
> and we get no lockstat output. :-/

I believe that if you run lockstat with the additional "-x destructive" option, it will disable the responsiveness test (the option does sound scary, but it will not have any other potentially destructive effect).
Re: buildworld: "cp: /dev/null: Invalid argument"
I'm curious: does this give a similar issue?

touch /tmp/foo
cp /tmp/foo /tmp/foo2

I'm wondering if the issue is that copy_file_range isn't handling empty files, or if it's a devfs issue.

On Thu, Sep 10, 2020 at 11:45 AM Michael Butler wrote:
>
> It seems that SVN r365549 broke "cp /dev/null ..."
>
> imb
>
> On 9/10/20 10:35 AM, Michael Butler wrote:
> > Is anyone else seeing failures like this in building world and, in my
> > case, cron jobs as well?
> >
> >
> > Building /usr/obj/usr/src/amd64.amd64/stand/i386/zfsboot/zfsboot.ldr
> > --- all_subdir_sbin ---
> > Building /usr/obj/usr/src/amd64.amd64/sbin/bsdlabel/bsdlabel
> > --- all_subdir_stand ---
> > --- zfsboot.ldr ---
> > cp: /dev/null: Invalid argument
> > *** [zfsboot.ldr] Error code 1
> > make[5]: *** zfsboot.ldr removed
> > --- all_subdir_kerberos5 ---
> > Building /usr/obj/usr/src/amd64.amd64/kerberos5/usr.sbin/iprop-log/iprop-log
> > --- all_subdir_stand ---
> >
> > make[5]: stopped in /usr/src/stand/i386/zfsboot
> > .ERROR_TARGET='zfsboot.ldr'
> > .ERROR_META_FILE='/usr/obj/usr/src/amd64.amd64/stand/i386/zfsboot/zfsboot.ldr.meta'
> > .MAKE.LEVEL='5'
> > MAKEFILE=''
> > .MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose'
> > _ERROR_CMD='cp /dev/null zfsboot.ldr;'
> > .CURDIR='/usr/src/stand/i386/zfsboot'
> > .MAKE='make'
> > .OBJDIR='/usr/obj/usr/src/amd64.amd64/stand/i386/zfsboot'
> > .TARGETS='all'
> > DESTDIR='/usr/obj/usr/src/amd64.amd64/tmp'
> > LD_LIBRARY_PATH=''
> > MACHINE='amd64'
> > MACHINE_ARCH='amd64'
> > MAKEOBJDIRPREFIX=''
> > MAKESYSPATH='/usr/src/share/mk'
> > MAKE_VERSION='20200902'
> >
AT_EXECPATH aux_info vector contains path of interpreter when directly exec'ing rtld
I've noticed that on head, if I directly execute rtld to run an executable, AT_EXECPATH contains the path to rtld (on 12.0-RELEASE it contains nothing). This is causing me a problem because clang uses AT_EXECPATH to preferentially locate where it's installed, which it uses to locate its driver programs. The end result is that clang can no longer be executed successfully from a process in capability mode, whereas before I could fexecve rtld and give it a pre-opened file descriptor to /usr/bin/clang.

I've put together a quick test program demonstrating the problem:

https://people.freebsd.org/~rstone/getprogname.c

On 12.0-RELEASE, directly executing rtld to run this program gives this output:

$ /libexec/ld-elf.so.1 -- ./progname
progname: progname
argv[0]: ./progname
elf_aux_info failed: No such file or directory

On head, I get this instead:

$ /libexec/ld-elf.so.1 -- ./progname
progname: progname
argv[0]: ./progname
AT_EXECPATH: /libexec/ld-elf.so.1
WITH_LLVM_TARGET_BPF=yes broken on head
If I do a "make toolchain" with WITH_LLVM_TARGET_BPF=yes set in /etc/src.conf on the latest head, I get the following errors when it tries to link clang. I believe that this was broken by the recent'ish llvm update; it worked as of r351363 back in August.

ld: error: undefined symbol: llvm::initializeBPFAbstractMemberAccessPass(llvm::PassRegistry&)
>>> referenced by BPFTargetMachine.cpp:37 (/srcpool/src/rstone/freebsd/contrib/llvm/lib/Target/BPF/BPFTargetMachine.cpp:37)
>>> BPFTargetMachine.o:(LLVMInitializeBPFTarget) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::createBPFAbstractMemberAccess()
>>> referenced by BPFTargetMachine.cpp:97 (/srcpool/src/rstone/freebsd/contrib/llvm/lib/Target/BPF/BPFTargetMachine.cpp:97)
>>> BPFTargetMachine.o:((anonymous namespace)::BPFPassConfig::addIRPasses()) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::createBPFMISimplifyPatchablePass()
>>> referenced by BPFTargetMachine.cpp:111 (/srcpool/src/rstone/freebsd/contrib/llvm/lib/Target/BPF/BPFTargetMachine.cpp:111)
>>> BPFTargetMachine.o:((anonymous namespace)::BPFPassConfig::addMachineSSAOptimization()) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:1427 (/usr/include/c++/v1/string:1427)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::PatchableExtSecName
>>> referenced by string:1427 (/usr/include/c++/v1/string:1427)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::PatchableExtSecName
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::PatchableExtSecName
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::PatchableExtSecName
>>> referenced by StringRef.h:0 (/srcpool/src/rstone/freebsd/contrib/llvm/include/llvm/ADT/StringRef.h:0)
>>> BTFDebug.o:(llvm::BTFDebug::processLDimm64(llvm::MachineInstr const*)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:1427 (/usr/include/c++/v1/string:1427)
>>> BTFDebug.o:(llvm::BTFDebug::InstLower(llvm::MachineInstr const*, llvm::MCInst&)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::InstLower(llvm::MachineInstr const*, llvm::MCInst&)) in archive /usr/obj/srcpool/src/rstone/freebsd/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/libllvm.a
ld: error: undefined symbol: llvm::BPFCoreSharedInfo::AmaAttr
>>> referenced by string:0 (/usr/include/c++/v1/string:0)
>>> BTFDebug.o:(llvm::BTFDebug::InstLower(llvm::MachineInstr const*, llvm::MCInst&)) in
dtrace not working on bhyve VM without invariant_tsc
I have a bhyve VM guest on my laptop where dtrace just constantly aborts whenever I try to use it:

[rstone@ebpf dtrace]sudo dtrace -s fdcopy.d
Assertion failed: (buf->dtbd_timestamp >= first_timestamp), file /usr/home/rstone/git/bsd-worktree/ebpf-import/cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c, line 3026.
Abort trap

I believe that the problem is caused by dtrace unconditionally using rdtsc() to implement dtrace_gethrtime(), assuming that the values will be stable for a given CPU. The VM's vcpus seem to be getting migrated frequently. Should dtrace instead be using the system timecounter? That should stand a much better chance of being monotonically increasing.
Re: panic: Assertion in_epoch(net_epoch_preempt) failed at ... src/sys/net/if.c:3694
I haven't found any good references on the subject, but here's my understanding:

- epoch_enter() and epoch_exit() are very inexpensive operations (cheaper than mtx, rw_lock or rm_lock operations) that are used to mark read-only critical sections
- epoch_wait() guarantees that no threads that were in the critical section when it was first called are still in the critical section when it completes

With this guarantee, you can safely destroy an object with the following procedure:

1. Atomically remove all global pointers to the object (e.g. remove it from any lists that the critical sections might look it up in). This must be done atomically because read-only threads can be running concurrently in the critical section. This guarantees that no more threads can get a pointer to it.
2. Call epoch_wait() to drain all threads that already held pointers to it before step 1.
3. You now hold the only pointer to the object, so you are free to destroy it as you please.
Files with multiple entries in sys/conf/files
I notice that the following files have multiple entries in sys/conf/files:

$ cut -f 1 -w files | sort | uniq -c | egrep -v '^[[:space:]]+1[[:space:]]'
1373
 100 #
   2 crypto/chacha20/chacha.c
   2 dev/iicbus/rtc8583.c
   2 dev/uart/uart_dev_sab82532.c
   2 dev/uart/uart_dev_z8530.c

The following patch should correct this. Should I just commit it?

diff --git a/sys/conf/files b/sys/conf/files
index 44c23e8cc01d..39304264f606 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -679,7 +679,8 @@ crypto/blowfish/bf_ecb.c optional ipsec | ipsec_support
 crypto/blowfish/bf_skey.c optional crypto | ipsec | ipsec_support
 crypto/camellia/camellia.c optional crypto | ipsec | ipsec_support
 crypto/camellia/camellia-api.c optional crypto | ipsec | ipsec_support
-crypto/chacha20/chacha.c optional crypto | ipsec | ipsec_support
+# Required by libkern
+crypto/chacha20/chacha.c standard
 crypto/chacha20/chacha-sw.c optional crypto | ipsec | ipsec_support
 crypto/des/des_ecb.c optional crypto | ipsec | ipsec_support | netsmb
 crypto/des/des_setkey.c optional crypto | ipsec | ipsec_support | netsmb
@@ -1777,7 +1778,6 @@ dev/iicbus/ds1307.c optional ds1307
 dev/iicbus/ds13rtc.c optional ds13rtc | ds133x | ds1374
 dev/iicbus/ds1672.c optional ds1672
 dev/iicbus/ds3231.c optional ds3231
-dev/iicbus/rtc8583.c optional rtc8583
 dev/iicbus/syr827.c optional syr827 ext_resources fdt
 dev/iicbus/icee.c optional icee
 dev/iicbus/if_ic.c optional ic
@@ -3173,11 +3173,9 @@ dev/uart/uart_dev_mvebu.c optional uart uart_mvebu
 dev/uart/uart_dev_ns8250.c optional uart uart_ns8250 | uart uart_snps
 dev/uart/uart_dev_pl011.c optional uart pl011
 dev/uart/uart_dev_quicc.c optional uart quicc
-dev/uart/uart_dev_sab82532.c optional uart uart_sab82532
-dev/uart/uart_dev_sab82532.c optional uart scc
+dev/uart/uart_dev_sab82532.c optional uart uart_sab82532 | uart scc
 dev/uart/uart_dev_snps.c optional uart uart_snps fdt
-dev/uart/uart_dev_z8530.c optional uart uart_z8530
-dev/uart/uart_dev_z8530.c optional uart scc
+dev/uart/uart_dev_z8530.c optional uart uart_z8530 | uart scc
 dev/uart/uart_if.m optional uart
 dev/uart/uart_subr.c optional uart
 dev/uart/uart_tty.c optional uart
@@ -3950,7 +3948,6 @@ kgssapi/gsstest.c optional kgssapi_debug
 # the file should be moved to conf/files. from here.
 #
 libkern/arc4random.c standard
-crypto/chacha20/chacha.c standard
 libkern/asprintf.c standard
 libkern/bcd.c standard
 libkern/bsearch.c standard
ktrace/kdump give incorrect message on unlinkat() failure due to capabilities
I have written a short test program that runs unlinkat(2) in capability mode and fails due to not having the right capabilities:

https://people.freebsd.org/~rstone/src/unlink.c

If I run the binary under ktrace and look at the kdump output, it gives the following incorrect output:

43775 unlink CALL unlinkat(0x3,0x7fffe995,0)
43775 unlink NAMI "from.QAUlAA0"
43775 unlink CAP operation requires CAP_LOOKUP, descriptor holds CAP_LOOKUP
43775 unlink RET unlinkat -1 errno 93 Capabilities insufficient

The message should instead say that the operation requires CAP_UNLINKAT. Looking at sys/capsicum.h, I suspect that the problem is related to the strange definition of CAP_UNLINKAT:

#define CAP_UNLINKAT (CAP_LOOKUP | 0x1000ULL)

I have observed the same problem with renameat(2) and CAP_RENAMEAT_SOURCE and CAP_RENAMEAT_TARGET:

https://people.freebsd.org/~rstone/src/rename.c
Re: Direct exec of /usr/bin/ld fails with "mmap of entire address space failed: Cannot allocate memory"
Applying the patch and playing with increasing the value does allow direct exec of ld via rtld to work, so thank you. The patch probably shouldn't be applied as-is, though, as executables will just segfault right away if the sysctl is tuned to a non-page-aligned value.
Direct exec of /usr/bin/ld fails with "mmap of entire address space failed: Cannot allocate memory"
If I try to directly exec ld via rtld, I get the following failure:

# /libexec/ld-elf.so.1 /usr/bin/ld
ld-elf.so.1: /usr/bin/ld: mmap of entire address space failed: Cannot allocate memory

Is there some sysctl limit I need to bump up to get around this? I presume that the binary is just too large, given that I can direct exec smaller binaries like /bin/cat.
Re: Deadlock involving truss -f, pdfork() and wait4()
As Conrad has pointed out, it's an explicit PID. The test completes successfully when not run under truss -f.

On Fri, Sep 13, 2019 at 2:37 PM Mark Johnston wrote:
>
> On Fri, Sep 13, 2019 at 02:12:56PM -0400, Ryan Stone wrote:
> > This gets me a little further but now the wait4 call by the parent
> > never reaps the child and instead blocks forever:
>
> Does it perform a wildcarded wait(), or does it explicitly specify the
> PID of the child? By design, the former will not return children
> created by pdfork().
Re: Deadlock involving truss -f, pdfork() and wait4()
|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
708: sigprocmask(SIG_SETMASK,{ },0x0) = 0 (0x0)
708: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
708: sigprocmask(SIG_SETMASK,{ },0x0) = 0 (0x0)
load: 0.27 cmd: pdfork 706 [wait] 18.20r 0.00u 0.00s 0% 2072k

# ps
PID TT STAT TIME COMMAND
698 u0 Is 0:00.01 login [pam] (login)
700 u0 I 0:00.04 -sh (sh)
705 u0 I+ 0:00.10 truss -f ./pdfork -p
706 u0 IX+ 0:00.01 ./pdfork -p
708 u0 Z+ 0:00.00
714 0 S 0:00.01 su
715 0 S 0:00.01 su (sh)
716 0 R+ 0:00.00 ps
# procstat -kk 708
PID TID COMM TDNAME KSTACK
# procstat -kk 706
PID TID COMM TDNAME KSTACK
706 100095 pdfork - mi_switch+0x174 sleepq_switch+0x110 sleepq_catch_signals+0x417 sleepq_wait_sig+0xf _sleep+0x2d0 kern_wait6+0x48f sys_wait4+0x78 amd64_syscall+0x337 fast_syscall_common+0x101
# procstat -kk 705
PID TID COMM TDNAME KSTACK
705 100077 truss - mi_switch+0x174 sleepq_switch+0x110 sleepq_catch_signals+0x417 sleepq_wait_sig+0xf _sleep+0x2d0 kern_wait6+0x48f sys_wait6+0x9f amd64_syscall+0x337 fast_syscall_common+0x101

On Fri, Sep 13, 2019 at 10:05 AM Mariusz Zaborski wrote:
>
> Hello Ryan,
>
> Can you verify is this patch fix your issue:
> https://reviews.freebsd.org/D20362
>
> Thanks,
> Mariusz
>
> On Thu, 12 Sep 2019 at 21:37, Ryan Stone wrote:
> >
> > I've hit an issue with a simple use of pdfork(). I have a process
> > that calls pdfork() and the parent immediately does a wait4() on the
> > child pid. This works fine under normal conditions, but if the parent
> > is run under truss -f, the three processes deadlock. If I switch out
> > pdfork() for fork(), the deadlock does not occur.
> >
> > This C file demonstrates the issue:
> >
> > https://people.freebsd.org/~rstone/pdfork.c
> >
> > If I run "truss -f ./pdfork", which uses fork(), it completes within a
> > second. If I run "truss -f ./pdfork -p", which uses pdfork(), the
> > processes deadlock. If I run "./pdfork -p" without truss, it
> > completes normally.
> >
> > procstat reports the following kernel stacks:
> >
> > 27572 102043 truss - mi_switch+0xe2
> > sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
> > kern_wait6+0x695 sys_wait6+0x9f amd64_syscall+0x36e
> > fast_syscall_common+0x101
> > 27573 102469 pdfork - mi_switch+0xe2
> > sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
> > kern_wait6+0x695 sys_wait4+0x78 amd64_syscall+0x36e
> > fast_syscall_common+0x101
> > 27574 102053 pdfork - mi_switch+0xe2
> > thread_suspend_switch+0xd4 ptracestop+0x13b fork_return+0x14e
> > fork_exit+0x83 fork_trampoline+0xe
> >
> > As near as I can tell, truss is blocked waiting for ptrace events, the
> > parent process is blocked in wait4, and the child process is perhaps
> > waiting for its parent to exit the kernel so it can send the ptrace
> > event?
> >
> > I really don't see anything obvious in the pdfork() code path that
> > would cause this to happen when fork() doesn't have the problem. It
> > may be that pdfork() just changes the timing enough to expose a latent
> > bug.
> >
> > I'm seeing this on a recentish current (r351363).
Deadlock involving truss -f, pdfork() and wait4()
I've hit an issue with a simple use of pdfork(). I have a process that calls pdfork() and the parent immediately does a wait4() on the child pid. This works fine under normal conditions, but if the parent is run under truss -f, the three processes deadlock. If I switch out pdfork() for fork(), the deadlock does not occur.

This C file demonstrates the issue:

https://people.freebsd.org/~rstone/pdfork.c

If I run "truss -f ./pdfork", which uses fork(), it completes within a second. If I run "truss -f ./pdfork -p", which uses pdfork(), the processes deadlock. If I run "./pdfork -p" without truss, it completes normally.

procstat reports the following kernel stacks:

27572 102043 truss - mi_switch+0xe2 sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf kern_wait6+0x695 sys_wait6+0x9f amd64_syscall+0x36e fast_syscall_common+0x101
27573 102469 pdfork - mi_switch+0xe2 sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf kern_wait6+0x695 sys_wait4+0x78 amd64_syscall+0x36e fast_syscall_common+0x101
27574 102053 pdfork - mi_switch+0xe2 thread_suspend_switch+0xd4 ptracestop+0x13b fork_return+0x14e fork_exit+0x83 fork_trampoline+0xe

As near as I can tell, truss is blocked waiting for ptrace events, the parent process is blocked in wait4, and the child process is perhaps waiting for its parent to exit the kernel so it can send the ptrace event?

I really don't see anything obvious in the pdfork() code path that would cause this to happen when fork() doesn't have the problem. It may be that pdfork() just changes the timing enough to expose a latent bug.

I'm seeing this on a recentish current (r351363).
Re: Lost user database after bungled upgrade
On Wed, Aug 28, 2019 at 6:40 PM Ian Lepore wrote:
> Or maybe in your case the files are fine and it really is a uid
> problem. But a "pkg check -s -a" as suggested in the PR couldn't hurt.
> :)

I did have some problems here, but unfortunately re-installing the affected packages (and confirming that a subsequent run of pkg check showed no more problems) didn't resolve my issue. I'm also seeing errors like this:

pkg: sqlite error while executing UPDATE packages SET name=?1 WHERE name=?2; in file pkg_jobs.c:1731: UNIQUE constraint failed: packages.name
pkg: sqlite error while executing UPDATE packages SET name=?1 WHERE name=?2; in file pkg_jobs.c:1731: UNIQUE constraint failed: packages.name
pkg: sqlite error while executing UPDATE packages SET name=?1 WHERE name=?2; in file pkg_jobs.c:1731: UNIQUE constraint failed: packages.name
pkg: sqlite error while executing UPDATE packages SET name=?1 WHERE name=?2; in file pkg_jobs.c:1731: UNIQUE constraint failed: packages.name
pkg: sqlite error while executing UPDATE packages SET name=?1 WHERE name=?2; in file pkg_jobs.c:1731: UNIQUE constraint failed: packages.name

So unfortunately it looks like something is corrupted somewhere.
Re: Lost user database after bungled upgrade
Thanks for the hint; I wasn't aware of /var/backups. Unfortunately fixing my user database at this point hasn't fixed pkg. I'm worried that it has some bad data cached somewhere now. I tried restoring the pkg database from /var/backups but that hasn't helped.

On Wed, Aug 28, 2019 at 5:20 PM Gary Palmer wrote:
>
> On Wed, Aug 28, 2019 at 05:09:35PM -0400, Ryan Stone wrote:
> > Hi everybody,
> >
> > I lost /etc/master.passwd and friends while trying to recover from an
> > src upgrade gone wrong. I'm trying to run "pkg upgrade -f" to get all
> > of the users and groups created by packages recreating, but pkg is
> > hitting an assert related to uids:
> >
> > Checking integrity...Assertion failed: (strcmp(uid, p->uid) != 0),
> > function pkg_conflicts_check_local_path, file pkg_jobs_conflicts.c,
> > line 386.
> >
> > Is there any way to get past this, or is the system toast?
>
> Did you try restoring from the backups under /var/backups? There should
> be master.passwd in there which can be restored and /etc/passwd and
> the DB files regenerated with pwd_mkdb (I think, never tried)
>
> Regards,
>
> Gary
Lost user database after bungled upgrade
Hi everybody, I lost /etc/master.passwd and friends while trying to recover from an src upgrade gone wrong. I'm trying to run "pkg upgrade -f" to get all of the users and groups created by packages recreating, but pkg is hitting an assert related to uids: Checking integrity...Assertion failed: (strcmp(uid, p->uid) != 0), function pkg_conflicts_check_local_path, file pkg_jobs_conflicts.c, line 386. Is there any way to get past this, or is the system toast? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
hwpmc: events don't seem to distinguish kernel and userland callchains anymore
Historically, when processing callchain events from hwpmc, you could use PMC_CALLCHAIN_CPUFLAGS_TO_USERMODE(ev.pl_u.pl_cc.pl_cpuflags) to distinguish callchains that were captured in user mode from those captured in kernel mode. However, on 12.0-RELEASE and a month-old head, I have noticed that this macro never returns true anymore. Is there any way to make this distinction now, beyond an architecture-specific hack based on the sample address?
Re: buildworld failure: truncated or malformed archive
Does this mean that it's currently impossible to build a world with debug symbols?
buildworld failure: truncated or malformed archive
I'm trying to update an old (~May 2018) -head system to the latest, but I'm getting a persistent error during buildworld: ld: error: /usr/obj/usr/src/amd64.amd64/lib/clang/libclang/libclang.a: could not get the member for symbol _ZN5clang17MultiplexConsumerC1ENSt3__16vectorINS1_10unique_ptrINS_11ASTConsumerENS1_14default_deleteIS4_NS1_9allocatorIS7_: truncated or malformed archive (terminator characters in archive member "dC" not the correct "`\n" values for the archive member header for tOutputExprEj I seem to recall something about libarchive or ar having a bug creating archives > 4GB, but I tried doing a "make install" from lib/libarchive and usr.bin/ar and doing a rebuild, and that doesn't seem to have resolved the issue. I also made sure to try a build with a clean /usr/obj with no success. Any ideas how I can get past this? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ix SR-IOV working
How many VFs are you trying to create? Getting ENOSPC either indicates that you tried to allocate more VFs than the hardware supports, or the system could not allocate enough MMIO space for the VFs. On Thu, Aug 9, 2018 at 10:41 PM Pete Wright wrote: > > hello, > > i have a newly provisioned VPS system from Vultr which comes stock with > a 10Gbe ix interface: > > ix0@pci0:1:0:0: class=0x02 card=0x082315d9 chip=0x15578086 rev=0x01 > hdr=0x00 > vendor = 'Intel Corporation' > device = '82599 10 Gigabit Network Connection' > class = network > subclass = ethernet > > > it is currently running 11-STABLE but was curious if there are any > reports of people successfully running SR-IOV under CURRENT with this > hardware and driver? On both 11.2-RELEASE and 11-STABLE, after running > iovctl to bring up the interface results in the NIC hanging - for > example like so: > > $ sudo iovctl -C -f /etc/iovctl.conf > iovctl: Failed to configure SR-IOV: No space left on device > > > > so if its working on CURRENT i'll go through the upgrade process, but if > no one is testing this I'll forgo SR-IOV for now. > > > thanks! > > -pete > > > > > -- > Pete Wright > p...@nomadlogic.org > @nomadlogicLA > > ___ > freebsd-current@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: programs like gdb core dump
On Sat, Aug 4, 2018 at 9:17 PM Erich Dollansky wrote:
>
> Hi,
>
> I compiled me yesterday this system:
>
> 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r337285:
>
> When restarting fortune core dumps. When trying to load the core dump,
> gdb core dumps.
>
> The message is always:
>
> Bad system call (core dumped)
>
> Trying to install ports results in the same effect.
>
> Erich

Try "kldload sem"
Re: TSC calibration in virtual machines
I would guess that the calibration can fail because when running under the hypervisor, the FreeBSD guest code can be descheduled at the wrong time. As I recall, the current algorithm looks like:

1. Sample rdtsc
2. Use a fixed-frequency timer to busy-wait for exactly 1 second
3. Sample rdtsc again
4. tsc_freq = sample2 - sample1;

If we are descheduled between 2 and 3, the time we spend off-cpu will not be accounted for at step 4. On bare-metal this is not possible as neither the scheduler nor interrupts are running yet. Although, come to think of it, I seem to recall something about SMI interrupts mucking this up long in the past, for exactly the same reason.
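To make the failure mode concrete, here is a minimal sketch of the four steps above as a pure model, not the kernel's calibration code. The function name and its parameters are hypothetical, and it assumes the TSC keeps counting while the guest vCPU is off-CPU.

```c
#include <stdint.h>

/*
 * Hypothetical model of the calibration loop; not the kernel's code.
 *
 * true_hz:        the real TSC frequency, in ticks per second
 * descheduled_ms: time the vCPU spent off-CPU between the end of the
 *                 busy-wait (step 2) and the second rdtsc (step 3)
 *
 * Assumption: the TSC keeps counting while the guest is descheduled.
 */
uint64_t
calibrated_tsc_freq(uint64_t true_hz, uint64_t descheduled_ms)
{
	uint64_t sample1 = 0;				/* step 1 */
	uint64_t ticks_in_wait = true_hz;		/* step 2: exactly 1 second */
	uint64_t ticks_off_cpu = (true_hz / 1000) * descheduled_ms;
	uint64_t sample2 = sample1 + ticks_in_wait + ticks_off_cpu; /* step 3 */

	/* step 4: the off-CPU ticks are silently counted as part of the second */
	return (sample2 - sample1);
}
```

With a 2 GHz TSC, 50 ms of descheduling in this model yields 2.1 GHz, a 5% overestimate, while no descheduling yields the true frequency.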
Re: cd /sys/amd64/compile/GENERIC;make cleandepend; make cleandepend
Are you building with WITH_LD_IS_LLD=no? -CURRENT can no longer be built with a GPLv2 ld. You either have to use lld or install a newer (GPLv3) binutils package.
Re: mlx5(4) jumbo receive
On Tue, Apr 24, 2018 at 4:55 AM, Konstantin Belousov wrote:
> +#ifndef MLX5E_MAX_RX_BYTES
> +#define MLX5E_MAX_RX_BYTES MCLBYTES
> +#endif

Why do you use a 2KB buffer rather than a PAGE_SIZE'd buffer? MJUMPAGESIZE should offer significantly better performance for jumbo frames without increasing the risk of memory fragmentation.
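As a rough illustration of the buffer-size question, the sketch below counts how many receive buffers a single frame would span at each size. The SKETCH_* constants are assumptions (2 KB clusters, 4 KB pages as on amd64), the helper name is made up, and whether the driver chains buffers exactly this way is also an assumption.

```c
#include <stddef.h>

/* Assumed sizes: a 2 KB mbuf cluster and a page-sized (4 KB) cluster. */
#define SKETCH_MCLBYTES		2048
#define SKETCH_MJUMPAGESIZE	4096

/* Number of fixed-size receive buffers needed to hold one frame. */
size_t
rx_buffers_needed(size_t frame_bytes, size_t buf_bytes)
{
	/* Round up: a partially filled buffer still costs a whole buffer. */
	return ((frame_bytes + buf_bytes - 1) / buf_bytes);
}
```

Under these assumptions a 9000-byte jumbo frame spans five 2 KB buffers but only three page-sized ones, so there are fewer buffers to allocate, chain and free per packet.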
Re: CloverEFIBoot (was Re: Call for Testing: UEFI Changes)
On Wed, Apr 11, 2018 at 11:14 AM, Pedro Giffuni wrote: > Hi; > > FWIW, I use a very old PC of the type where the processor will not be fixed > by Intel and that still needs support for the traditional BIOS. I also > bought a 3TB HD (they were easier to find that 2T). > > If I leave the disk dedicated to FreeBSD it recognizes the complete 3TB and > will happily use ZFS for everything, however I want to dual boot so after > lots of testing I ended up ignoring 1 TB of HD :(. > > It does happen that there is a really nice boot loader that could have saved > the day but it is very difficult to install standalone: > > https://sourceforge.net/projects/cloverefiboot > > Just in case someone has the time and inclination to play with it :) > > Pedro. Is the issue due to using MBR partitioning? FreeBSD supports booting from a GPT partition from a traditional BIOS; you don't need EFI. Is this machine so old that its BIOS doesn't support booting from GPT? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Panic in prison_alloc() on boot
To close the loop on this, the root cause ended up being a mistake on my end. This system had a rather convoluted boot process, and as a result of that was loading a nullfs.ko built for a months-old kernel. This setup accidentally worked for some time, but I guess some recent change to struct thread changed the ABI, causing the old nullfs.ko to be incompatible and fail to boot.

Sorry for the noise,
Ryan
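For anyone reading the archive later, here is a minimal sketch of how a stale module ends up reading the wrong field. The two structs are hypothetical stand-ins, not the real struct thread; the padding is chosen only so the offsets match the 0x150 and 0x158 seen in the debugger session in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout the stale module was compiled against. */
struct old_thread {
	char	 td_pad[0x150];		/* stand-in for the earlier fields */
	void	*td_ucred;
};

/* Hypothetical layout of the running kernel: one field was added. */
struct new_thread {
	char	 td_pad[0x150];
	uint64_t td_new_field;		/* made-up name for the added field */
	void	*td_ucred;
};

/* Offset baked into the old module's machine code. */
size_t
old_ucred_offset(void)
{
	return (offsetof(struct old_thread, td_ucred));
}

/* Offset the running kernel (and its debug info) actually uses. */
size_t
new_ucred_offset(void)
{
	return (offsetof(struct new_thread, td_ucred));
}
```

A module compiled against the old layout keeps loading from 0x150, which in the new layout is a different field entirely, so it can pass a NULL or garbage pointer even though td_ucred itself is valid.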
Re: Panic in prison_alloc() on boot
Sorry for the late reply. Panicking this system is a bit painful, but I found some time to do it today. Strangely, it's actually cred that is NULL, not cred->cr_prison:

(kgdb) p cred
$7 = (struct ucred *) 0x0
(kgdb) disassemble
Dump of assembler code for function prison_allow:
   0x80ac33e0 <+0>: push %rbp
   0x80ac33e1 <+1>: mov %rsp,%rbp
=> 0x80ac33e4 <+4>: mov 0x30(%rdi),%rax
   0x80ac33e8 <+8>: and 0xf8(%rax),%esi
   0x80ac33ee <+14>: mov %esi,%eax
   0x80ac33f0 <+16>: pop %rbp
   0x80ac33f1 <+17>: retq
End of assembler dump.
(kgdb) info reg $rdi
rdi 0x0 0

However, if I go up a frame, things look fine?

(kgdb) up
#13 0x82c22531 in nullfs_mount (mp=0xf801a483d000) at /usr/src/sys/fs/nullfs/null_vfsops.c:88
88 if (!prison_allow(td->td_ucred, PR_ALLOW_MOUNT_NULLFS))
(kgdb) p td->td_ucred
$8 = (struct ucred *) 0xf801854c1700

This appears to be a miscompilation, but I've blown away /usr/obj/usr/src multiple times and rebuilt and got this same error every time. But looking at the disassembly, something is definitely wrong:

   0x82c22517 <+23>: mov %gs:0x0,%r14
   0x82c22520 <+32>: mov 0x150(%r14),%rdi
   0x82c22527 <+39>: mov $0x100,%esi
   0x82c2252c <+44>: callq 0x80ac33e0
=> 0x82c22531 <+49>: test %eax,%eax
(kgdb) p &((struct thread*)0)->td_ucred
$10 = (struct ucred **) 0x158

It uses offset 0x150 to get the cred, but the debug info claims that td_ucred is at offset 0x158. If I print out the pointer at that offset, it looks reasonable:

(kgdb) p *td->td_ucred
$11 = {cr_ref = 107, cr_uid = 0, cr_ruid = 0, cr_svuid = 0, cr_ngroups = 1, cr_rgid = 0, cr_svgid = 0, cr_uidinfo = 0xf80106617000, cr_ruidinfo = 0xf80106617000, cr_prison = 0x8187cb70, cr_loginclass = 0xf8019fa43b00, cr_flags = 0, cr_pspare2 = {0x0, 0x0}, cr_label = 0x0, cr_audit = {ai_auid = 4294967295, ai_mask = {am_success = 0, am_failure = 0}, ai_termid = {at_port = 0, at_type = 4, at_addr = {0, 0, 0, 0}}, ai_asid = 0, ai_flags = 0}, cr_groups = 0xf801854c179c, cr_agroups = 16, cr_smallgroups = {0 }}

I'm really at a loss as to what to try next. Build with MAKEOBJDIRPREFIX set to something else to get rid of any lingering possibility of an issue in my objdir, I guess?
Panic in prison_alloc() on boot
I'm getting a persistent panic on boot in prison_allow(). The first case that I hit is this:

Fatal trap 12: page fault while in kernel mode
cpuid = 10; apic id = 22
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0x80ab9674
stack pointer = 0x28:0xfe00f8bb16f0
frame pointer = 0x28:0xfe00f8bb16f0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 90 (mount_nullfs)
[ thread pid 90 tid 100913 ]
Stopped at prison_allow+0x4: movqll+0xf(%rdi),%rax
db> bt
Tracing pid 90 tid 100913 td 0xf801a5aab000
prison_allow() at prison_allow+0x4/frame 0xfe00f8bb16f0
nullfs_mount() at nullfs_mount+0x31/frame 0xfe00f8bb1830
vfs_donmount() at vfs_donmount+0x1415/frame 0xfe00f8bb1a80
sys_nmount() at sys_nmount+0x72/frame 0xfe00f8bb1ac0
amd64_syscall() at amd64_syscall+0xa48/frame 0xfe00f8bb1bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0x7fffecb0

However, if I comment out my nullfs mounts in /etc/fstab, I just get a panic in prison_allow() called from elsewhere.

I've seen this panic with r328936 (Feb 6), r329091 (Feb 9) and r329142 (Feb 11).
Re: [self base packages] pkg: packages for wrong OS version: FreeBSD:12:amd64
On Wed, Jan 10, 2018 at 1:53 PM, Baptiste Daroussin wrote:
> One has to specify pkg -o OSVERSION=1200055 to allow packages built on 1200055
> to install on 1200054.

This workaround doesn't appear to work for pkg bootstrap:

# pkg -o OSVERSION=1200055 bootstrap
The package management tool is not yet installed on your system.
Do you want to fetch and install it now? [y/N]: y
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:12:amd64/latest, please wait...
Verifying signature with trusted certificate pkg.freebsd.org.2013102301... done
Installing pkg-1.10.4...
pkg-static: Newer FreeBSD version for package pkg:
- package: 1200055
- running kernel: 1200054
Failed to install the following 1 package(s): /tmp//pkg.txz.ngJJEM
Re: Build failure in stand
So I don't fully understand why this build failure happened, but I did manage to find the root cause. It turns out that there was a bug in make that caused our build infrastructure to write objects and other build output to the srcdir rather than the objdir in certain cases when using make -C. I have a workaround in place for now and bdrewery@ is working on a fix for the build infrastructure.
Build failure in stand
I'm seeing the following build failure when doing a buildworld of head: In file included from /repos/users/rstone/bsd-worktree/route-change/stand/ficl/i386/sysdep.c:18: /repos/users/rstone/bsd-worktree/route-change/stand/libsa/machine/cpufunc.h:491:13: error: shift count >= width of type [-Werror,-Wshift-count-overflow] high = val >> 32; ^ ~~ 1 error generated. My make.conf looks like: PERL_VERSION=5.14.2 DEBUG_FLAGS=-g CFLAGS+=-fno-omit-frame-pointer #BTX_SERIAL=yes #BOOT_PXELDR_ALWAYS_SERIAL=yes # src/sys/boot/i386/libi386/Makefile (loader) #BOOT_COMCONSOLE_SPEED=115200 ## src/sys/boot/i386/libi386/comconsole.c (loader) COMSPEED=115200 # src/sys/dev/sio/sioreg.h (kernel) #CONSPEED=115200 WITHOUT_DEBUG_FILES=yes and my src.conf is: WITH_TESTS=yes WITHOUT_DEBUG_FLAGS=yes WITHOUT_INFO=yes ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: panic: Lock PFil shared rmlock not read locked @ /usr/src/sys/net/pfil.c:116 prior to r326363
Please update to r326376 or later. That will likely fix your issue.

On Sun, Dec 3, 2017 at 6:49 PM, Dave Cottlehuber wrote:
> I'm seeing this repeatedly in the last ~ 2 weeks, present in r326363
> and typically panics 1-15m post boot since this weekend's update. I'm
> currently bisecting my way back; it looks like its not in r325755, but I
> can't be sure for a few more hours.
>
> https://s3.amazonaws.com/uploads.hipchat.com/8784/2508819/MGim1y8pQt1o1xK/IMG_2698.JPG
>
> # uname -a
> FreeBSD wintermute.skunkwerks.at 12.0-CURRENT FreeBSD 12.0-CURRENT #2
> r326363+fcb3a5a8a12d(master): Wed Nov 29 11:37:17 UTC 2017
> root@wintermute:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
>
> # dmesg
>
> https://gist.github.com/dch/68dd6f8d44dd76c558908e71dc3e1f87#file-dmesg-log
> + other logs in same gist
>
> Suggestions how to get from X to debugger would help, as would setting
> up serial console over ipmi - this is a supermicro motherboard.
>
> A+
> Dave
Re: NFS client performance degradation when SMP enabled
What type of network interface do you have? The Intel 1G (em and igb) were switched over to the "iflib" framework a few months ago and that could be the cause.
Re: how to SVN regenerate [ man awk ]
On Thu, Mar 9, 2017 at 10:00 AM, Jeffrey Bouquet wrote: > For $giggles$ I svn up /usr/src/usr.bin/awk or wherever, then > man awk displays not the newer import per a recent SVN but > the older 2015 [ it says ] one. Stale file, or not all parts of > the man page updated to include latest revision dat, or some > other command to [g]unzip or whatever, besides 320.whatis > in periodic--weekly, update the compressed latest installed > files from /usr/obj to what one expects when one has just > recompiled the man page? > Any chance that there is an obsolete copy of the manpage in /usr/share/man/cat1? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
[PATCH] Check for preemption after lowering a thread's priority
I've just put up a phabricator review for a fix to ULE and 4BSD for a priority inversion problem. The issue is that after a running thread's priority is lowered (usually due to releasing a mutex and having a priority loan revoked), the scheduler does not check whether the thread should now be preempted by a thread waiting in the runq. See this old ML post for a detailed description of one instance that I saw:

https://lists.freebsd.org/pipermail/freebsd-current/2013-January/039261.html

The review can be found here:

https://reviews.freebsd.org/D9518
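The missing check can be sketched in a few lines. This is only an illustration of the idea, not the code in the review: the function name is hypothetical, and it assumes the FreeBSD convention that a numerically lower value means a higher priority.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: after the running thread's priority has been set
 * to cur_pri, decide whether the best thread waiting in the run queue
 * (best_waiting_pri) should now preempt it.
 *
 * Assumption: lower numeric value means higher priority.
 */
bool
should_preempt(int cur_pri, int best_waiting_pri)
{
	return (best_waiting_pri < cur_pri);
}
```

While the running thread holds a lent priority of 80, a waiter at 100 must not preempt it. Once the loan is revoked and the thread drops back to 120, the same waiter should win, and that is the point at which the check has to be made.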
Buildworld error with read-only source tree
I just got the following error attempting to build r308430 with my source tree on a read-only NFS mount. I can work around it for now, but shouldn't the source tree be untouched during a buildworld? $ make -j4 buildworld buildkernel --- buildworld --- make[1]: "/repos/users/rstone/freebsd/Makefile.inc1" line 146: SYSTEM_COMPILER: Determined that CC=cc matches the source tree. Not bootstrapping a cross-compiler. --- buildworld_prologue --- -- >>> World build started on Tue Nov 8 01:50:50 EST 2016 -- --- _worldtmp --- -- >>> Rebuilding the temporary build tree -- rm -rf /repos/users/rstone/freebsd/tmp rm -rf /repos/users/rstone/freebsd/lib32 mkdir -p /repos/users/rstone/freebsd/tmp/lib mkdir: /repos/users/rstone/freebsd/tmp: Read-only file system ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Callout subsystem doesn't cancel interrupts for canceled callouts
At $WORK, we're working on adding support for high-precision RTT calculations in TCP. The goal is to reduce the retransmission timeout significantly to help mitigate the impact of TCP incast. This means that the retransmit callout for TCP sockets gets scheduled significantly more often with a shorter timeout period, but in the normal case it is expected to be canceled or rescheduled before it times out.

What I have noticed is that when the retransmit callout is canceled or rescheduled, the callout subsystem will not reschedule its currently pending interrupt. The result is that my system takes a significant number of "spurious" timer interrupts where there are no callouts to service, which is having a significant performance impact.

Unfortunately, neither the callout subsystem nor the eventtimers subsystem really seem to be designed for canceling interrupts. It's not easy to find the "next" event in the callout wheel and the current code doesn't even try when handling an interrupt; the next interrupt is scheduled at a seemingly arbitrary point in the future.

I know that when the callout system was reworked the callout wheel data structure was maintained to keep insertion and deletion O(1). However, I question whether that was the right decision given the fact that if callouts are frequently deleted, as in my case, we incur the significant overhead of a spurious timer interrupt. Does anybody know if actual performance measurements were taken to justify this decision?
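To show the trade-off I mean, here is a toy hashed timing wheel. It is a sketch under stated assumptions, not the kernel's callout implementation: every name (sketch_wheel, wheel_insert and so on) is made up, and the wheel is far smaller and simpler than the real one.

```c
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 8			/* hypothetical; tiny for illustration */

struct sketch_callout {
	struct sketch_callout	 *next;
	struct sketch_callout	**prevp;	/* address of the pointer to us */
	uint64_t		  expire;	/* absolute expiry tick */
};

struct sketch_wheel {
	struct sketch_callout *slot[WHEEL_SLOTS];
};

/* O(1): hash the expiry tick to a slot and push onto that slot's list. */
void
wheel_insert(struct sketch_wheel *w, struct sketch_callout *c, uint64_t expire)
{
	struct sketch_callout **head = &w->slot[expire % WHEEL_SLOTS];

	c->expire = expire;
	c->next = *head;
	c->prevp = head;
	if (*head != NULL)
		(*head)->prevp = &c->next;
	*head = c;
}

/* O(1): unlink without knowing (or fixing up) what the next event is. */
void
wheel_remove(struct sketch_callout *c)
{
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* Not O(1): finding the earliest expiry means visiting every entry. */
uint64_t
wheel_next_expire(const struct sketch_wheel *w)
{
	uint64_t best = UINT64_MAX;	/* UINT64_MAX means "wheel is empty" */

	for (size_t i = 0; i < WHEEL_SLOTS; i++)
		for (const struct sketch_callout *c = w->slot[i]; c != NULL;
		    c = c->next)
			if (c->expire < best)
				best = c->expire;
	return (best);
}

/* Schedule two callouts, cancel the earlier one, report the next expiry. */
uint64_t
wheel_demo(void)
{
	struct sketch_wheel w = { { NULL } };
	struct sketch_callout rexmt, keepalive;

	wheel_insert(&w, &rexmt, 5);
	wheel_insert(&w, &keepalive, 42);
	wheel_remove(&rexmt);		/* an interrupt armed for tick 5 is now spurious */
	return (wheel_next_expire(&w));
}
```

In this model, canceling the callout at tick 5 is cheap, but a timer already armed for tick 5 stays armed unless something pays for the full scan in wheel_next_expire() to learn that the next real event is at tick 42.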
Re: Oversight in /etc/defaults/rc.conf
On Tue, Jul 12, 2016 at 10:50 AM, RW wrote:
> Unless I'm misunderstanding the situation. rc.d/iovctl isn't actually
> doing anything by default because of iovctl_files="".
>
> There is an analogy with rc.d/sysctl which runs by default, with
> an empty sysctl.conf file. This also has no explicit enable entry in
> rc.conf.
>

That is how it is intended to work, and rc.d/sysctl was the inspiration for that script if memory serves. I'm not entirely opposed to an iovctl_enable variable but it seems redundant.
Re: panic: DI already started in pmap_delayed_invl_started
Whoops. I just reported the uname output without giving any thought as to whether it made any sense. I don't have r300863, so I'll update and hopefully that fixes it. Thanks.
panic: DI already started in pmap_delayed_invl_started
I updated my system to r254461 on Thursday, and got this panic overnight. I have a full core and debug symbols if anybody wants me to look at something.

panic: DI already started
cpuid = 10
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe1838bbc380
vpanic() at vpanic+0x182/frame 0xfe1838bbc400
kassert_panic() at kassert_panic+0x126/frame 0xfe1838bbc470
pmap_delayed_invl_started() at pmap_delayed_invl_started+0xe1/frame 0xfe1838bbc490
pmap_advise() at pmap_advise+0x31/frame 0xfe1838bbc540
vm_fault_dontneed() at vm_fault_dontneed+0x12f/frame 0xfe1838bbc590
vm_fault_hold() at vm_fault_hold+0x919/frame 0xfe1838bbc6a0
vm_fault() at vm_fault+0x78/frame 0xfe1838bbc6e0
vm_run() at vm_run+0x5b5/frame 0xfe1838bbc880
vmmdev_ioctl() at vmmdev_ioctl+0x8c2/frame 0xfe1838bbc940
devfs_ioctl_f() at devfs_ioctl_f+0x156/frame 0xfe1838bbc9a0
kern_ioctl() at kern_ioctl+0x246/frame 0xfe1838bbca00
sys_ioctl() at sys_ioctl+0x171/frame 0xfe1838bbcae0
amd64_syscall() at amd64_syscall+0x2db/frame 0xfe1838bbcbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe1838bbcbf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800ff60fa, rsp = 0x7fffdedf4e28, rbp = 0x7fffdedf4ee0 ---
Re: Kernel panic from recent build
Do we need this debug output? It's quite clear from the acpidump output that there is an entry for APIC ID 0 in memory domain 0 and memory domain 1. Not sure if that's legal by the spec.

On Mon, May 2, 2016 at 6:17 PM, Eric van Gyzen wrote:
> On 05/02/2016 16:14, Bill O'Hanlon wrote:
> > On Mon, May 2, 2016 at 3:55 PM, John Baldwin wrote:
> >
> >> On Monday, May 02, 2016 01:35:54 PM Bill O'Hanlon wrote:
> >>>
> >>> IMG_20160502_130335.jpg
> >>> <https://drive.google.com/file/d/1dtJxTwWXfhXVUUtn1Vvpzh3laJt7AILyCg/view?usp=drive_web>
> >>>
> >>> I'm getting the following panic from a recent (May 2, 2016) build.
> >>> panic: Duplicate local APIC ID 0
> >>>
> >>> The system is a Dell Precision T5500 with generic factory BIOS settings.
> >>> It has run previous builds without event for several years.
> >>>
> >>> I'm attaching a link to a photo of the screen for added details.
> >> Try setting 'hint.srat.0.disabled=1' at the loader prompt and then grab
> >> the output of 'acpidump -t' on your next boot. The SRAT table used by
> >> the NUMA code appears to be corrupted by your BIOS.
> >>
> >> --
> >> John Baldwin
> >>
> >
> > That allowed me to boot. I'm attaching the output of 'acpidump -t'.
> > Thanks!
>
> Bill,
>
> Do you have the time and interest to test this patch? If so, remove the
> line that you added to /boot/loader.conf so the patch actually gets
> exercised.
>
> Eric
>
>
> diff --git a/sys/x86/acpica/srat.c b/sys/x86/acpica/srat.c
> index 85f1922..1d0f73d 100644
> --- a/sys/x86/acpica/srat.c
> +++ b/sys/x86/acpica/srat.c
> @@ -201,8 +201,12 @@ srat_parse_entry(ACPI_SUBTABLE_HEADER *entry, void *arg)
>              "enabled" : "disabled");
>          if (!(cpu->Flags & ACPI_SRAT_CPU_ENABLED))
>              break;
> -        KASSERT(!cpus[cpu->ApicId].enabled,
> -            ("Duplicate local APIC ID %u", cpu->ApicId));
> +        if (cpus[cpu->ApicId].enabled) {
> +            printf("SRAT: Duplicate local APIC ID %u\n",
> +                cpu->ApicId);
> +            *(int *)arg = ENXIO;
> +            break;
> +        }
>          cpus[cpu->ApicId].domain = domain;
>          cpus[cpu->ApicId].enabled = 1;
>          break;
Re: qsort() documentation
On Mon, Apr 18, 2016 at 12:13 PM, Hans Petter Selasky wrote:
> Did anyone try to generate such a fiendish set of data, and see how
> quadratic the FreeBSD's qsort() becomes?
>

Not me, but it has been done:

http://calmerthanyouare.org/2014/06/11/algorithmic-complexity-attacks-and-libc-qsort.html
Re: qsort() documentation
On Mon, Apr 18, 2016 at 9:09 AM, Hans Petter Selasky wrote:
> I think the algorithm is switching to mergesort. I'll look up the paper
> and add that correctly before commit.
>

No, it switches to insertion sort, assuming that it's acting on an already sorted array. If that assumption is wrong you still get O(n**2) complexity.
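A small sketch of why a wrong "already sorted" guess is costly. This is a plain textbook insertion sort with a move counter, not libc's qsort() code; both function names are hypothetical, and the 1024-element cap only keeps the demo buffer on the stack.

```c
#include <stddef.h>

/* Sort a[0..n) ascending by insertion; return how many elements were moved. */
size_t
insertion_sort_moves(int *a, size_t n)
{
	size_t moves = 0;

	for (size_t i = 1; i < n; i++) {
		int key = a[i];
		size_t j = i;

		/* Shift larger elements right until key's position is found. */
		while (j > 0 && a[j - 1] > key) {
			a[j] = a[j - 1];
			j--;
			moves++;
		}
		a[j] = key;
	}
	return (moves);
}

/* Worst case: feed a strictly descending array of n elements (n <= 1024). */
size_t
moves_for_reversed_input(size_t n)
{
	int a[1024];

	if (n > 1024)
		n = 1024;
	for (size_t i = 0; i < n; i++)
		a[i] = (int)(n - i);
	return (insertion_sort_moves(a, n));
}
```

On sorted input the inner loop never runs, which is why the fallback is attractive. On reversed input the move count is n(n-1)/2, so doubling n roughly quadruples the work.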
Re: accessing a PCIe register from userspace through kmem or other ways ?
On Tue, Apr 5, 2016 at 2:10 AM, Konstantin Belousov wrote:
> The mmap(2) interface to /dev/mem did not have the issue ever. The problem
> was only with the read(2)/write(2) accesses.
>

I mis-remembered my situation. I was performing a read on /dev/mem rather than reading through a mmap. Thanks for clearing this up.
Re: accessing a PCIe register from userspace through kmem or other ways ?
On Mon, Apr 4, 2016 at 6:45 PM, John Baldwin wrote:
> I suspect Ryan might be referring to BARs outside of the DMAP which we
> only populate to Maxmem IIRC. /dev/mem should work for those.
>

Unfortunately I no longer have access to the systems so I can't really confirm. I had a debug tool that attempted to read PCI device registers through /dev/mem, and on these systems (which were running an 8.2 derivative) the reads from /dev/mem failed with some kind of error. The one detail that I do remember is that the errors started happening after we enabled the use of 64-bit BARs in the BIOS and the addresses assigned to the BARs were quite large -- I believe well beyond the bounds of real memory.
Re: accessing a PCIe register from userspace through kmem or other ways ?
On Fri, Apr 1, 2016 at 12:31 PM, John Baldwin wrote:
> Sorry, I mapped PCIe registers to the PCI-e config space register set. I am
> not sure exactly how libpciaccess handles register access (perhaps it reads
> raw bars and maps them via /dev/mem)? However, it would not be hard to add a
> new ioctl to /dev/pci to allow one to mmap a specific BAR of a given
> device.
>

That is actually a really good question. I know that with some recent BIOSes if I enabled allocating 64-bit addresses to BARs, the BAR would actually be mapped outside of the range of /dev/mem. From a quick glance at libpciaccess, I don't think that it handles this case.

A specific mechanism for allowing mmaping of BARs would be useful, I think.
Re: accessing a PCIe register from userspace through kmem or other ways ?
On Thu, Mar 31, 2016 at 4:39 PM, John Baldwin wrote: > On Wednesday, March 30, 2016 11:20:51 AM Jim Harris wrote: > > On Wed, Mar 30, 2016 at 10:47 AM, Luigi Rizzo > wrote: > > > > > Hi, > > > I'd like to test the rate at which I can access device registers > > > on a PCIe card, and was wondering whether I need to patch a device > > > driver, or perhaps I can use /dev/kmem once I figure out where > > > the registers are mapped ? > > > > > > > You do not need to patch a device driver. Have you looked at > > libpciaccess? This should give you everything you need. > > You can also look at what pciconf uses. (It has a read_config() method > that uses an ioctl on an fd of /dev/pci). > pciconf can only access the configuration space, right? I believe that Luigi is more interested in measuring the latency to a register mapped from a BAR. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: environment corrupt; missing value for QT_IM_MO
On Tue, Jan 5, 2016 at 3:54 AM, Andriy Gapon wrote: > Is there a limit on the environment's size? > If memory serves, this is bounded by ARG_MAX in sys/syslimits.h. The value is not tunable as far as I know, so if you want to experiment with changing it you will have to change syslimits.h and recompile your kernel.
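To see how close a given environment is to that bound without rebuilding anything, something like the following could be used. It is only a sketch: the helper names arg_max_bytes() and env_bytes() are made up for illustration, sysconf(_SC_ARG_MAX) is the standard POSIX query for the limit, and the byte accounting (string, NUL and one pointer per entry) is an approximation, since the exact bookkeeping is kernel-specific.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Runtime limit on the combined size of exec() arguments and environment,
 * or -1 if the system reports no fixed limit.
 */
long
arg_max_bytes(void)
{
	return (sysconf(_SC_ARG_MAX));
}

/*
 * Approximate space an environment vector needs at exec() time: every
 * "NAME=value" string with its NUL, one pointer per entry, and the
 * terminating NULL pointer.  (Assumed accounting; kernels differ.)
 */
size_t
env_bytes(char *const envp[])
{
	size_t total = sizeof(char *);	/* terminating NULL pointer */

	for (size_t i = 0; envp[i] != NULL; i++)
		total += strlen(envp[i]) + 1 + sizeof(char *);
	return (total);
}
```

Calling env_bytes(environ) (with environ declared as extern char **environ) and comparing the result against arg_max_bytes() gives a rough idea of the remaining headroom.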
Re: Panic from vesa_configure()
On Sun, Dec 20, 2015 at 4:02 PM, Jeremie Le Hen wrote: > For some reason, "call doadump" > didn't work (any idea why?), so I took a picture. > I have no idea about the panic, but doadump failed because the crash happened too early for the dump device to be configured yet.
Re: nanoBSD boot problem (on USB stick or as a HD)
On Tue, Sep 15, 2015 at 9:53 PM, Julian Elischer wrote: > one possibility is to use gpart label to describe the device. > possibly it would have the same result in both cases, but I don't know for > sure that > it works for root device.. you'd have to test. > > I would recommend a UFS label instead. gpart labels are kind of fragile and easy to mess up. My previous employer has been shipping systems where the root fs is specified in fstab via a UFS label for years and it never gave us any problems.
Re: kernel dtrace and current
On Mon, Sep 14, 2015 at 6:08 AM, Alexander V. Chernikov < melif...@freebsd.org> wrote: > (So I suppose that for some reason I got old ctfmerge) > I believe that ctfmerge is only built as a bootstrap tool when the host system's FreeBSD version is obsolete. Mark, you probably should update Makefile.inc1.
Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?
I have seen similar behaviour before. The problem is that every CPU receives an NMI concurrently. As I recall, one of them gets some kind of pseudo-spinlock and tries to stop the other CPUs with an NMI. However, because they are already in an NMI handler, they don't get the second NMI and don't stop properly. The case that I saw actually had to do with a panic triggered by an NMI, not entering the debugger, but I believe that both cases use stop_cpus_hard() under the hood and have a similar issue. (I also recall seeing the exact situation that you describe while originally developing SR-IOV on an alpha version of the Fortville hardware and firmware with a very buggy SR-IOV implementation. I've never seen it on ixgbe before, although I haven't used SR-IOV there very much at all) On Thu, Aug 20, 2015 at 6:15 PM, Adrian Chadd wrote: > Hi! > > This has started happening on -HEAD recently. No, I don't have any > more details yet than "recently." > > Whenever I get an NMI panic (and getting an NMI is a separate issue, > sigh) I get a slew of "failed to stop cpu" messages, and all CPUs > enter ddb. This is .. sub-optimal. Has anyone seen this? Does anyone > have any ideas? > > > -adrian
Re: How should a driver shutdown a taskqueue on detach?
On Thu, Jul 2, 2015 at 3:08 AM, Konstantin Belousov wrote: > Having taskqueue_enqueue() which could silently (?) not enqueue the given > task is a huge and IMO risky change to the KPI. If doing it, I think > that there should be a new function to enqueue, which is allowed to > fail, unlike taskqueue_enqueue(). > > BTW, the man page for taskqueue(9) is wrong, taskqueue_enqueue(9) > always succeeds now and always returns 0 (ignoring the ushort overflow). > That's fair, but I feel that a new enqueue function would be rather intrusive for existing drivers. Maybe we should attack this from a different angle. How about a taskqueue_quiesce() function, which must be called on a taskqueue that has already been blocked with taskqueue_block()? taskqueue_quiesce() would block until the taskqueue's thread has stopped running. Then I can do:
taskqueue_block()
taskqueue_quiesce()
taskqueue_cancel()
//...
taskqueue_free()
Re: How should a driver shutdown a taskqueue on detach?
On Wed, Jul 1, 2015 at 5:32 PM, Konstantin Belousov wrote: > Do you mean, you want some KPI like > boolean taskqueue_is_draining(struct taskqueue *p); > so that e.g. executed task could see if it is executing in the > shutdown state ? > I'd prefer a KPI that stops a taskqueue from accepting new tasks (and drops attempts to enqueue on the floor). Then I could do something like:
taskqueue_stop()
disable_interrupts()
taskqueue_drain_all()
taskqueue_free()
Re: How should a driver shutdown a taskqueue on detach?
On Wed, Jul 1, 2015 at 4:58 PM, Jack Vogel wrote: > But if you've disabled interrupts why would an "interrupt-handling task" > even run?? > > Jack > There's a race. The task could already have been scheduled by a previous interrupt and could be running while I am trying to disable interrupts and drain the taskqueue.
How should a driver shutdown a taskqueue on detach?
I'm trying to figure out how a driver is supposed to shut down its interrupt-handling taskqueue when it detaches. taskqueue(9) recommends disabling interrupts, draining each task and then freeing the taskqueue. The problem that I have is the interrupt-handling tasks will sometimes re-enable interrupts on the device. Is there a better way than using some kind of flag internally in the driver to note that a detach is in progress that the interrupt handlers will have to check before enabling interrupts?
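For reference, the "flag internally in the driver" approach mentioned above might look roughly like the user-space model below. This is not kernel code and not taken from any driver: struct model_softc, model_intr_task() and model_begin_detach() are hypothetical names, and the interrupt-enable register is reduced to a plain bool. The model is also deliberately incomplete: a task that has already passed the flag check can still re-enable interrupts, so a real driver would presumably have to drain the task and mask interrupts once more afterwards (or serialise the check with a lock).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state; the names do not come from a real driver. */
struct model_softc {
	atomic_bool	detaching;	/* set once detach has begun */
	bool		intr_enabled;	/* stands in for the device's interrupt-enable bit */
};

/*
 * Body of the interrupt-handling task: do the work, then re-arm the
 * interrupt only if no detach is in progress.
 */
void
model_intr_task(struct model_softc *sc)
{
	/* ... process completed work here ... */
	if (!atomic_load(&sc->detaching))
		sc->intr_enabled = true;
}

/*
 * First step of detach: raise the flag, then mask interrupts.  Draining
 * the task and freeing the taskqueue would follow in a real driver.
 */
void
model_begin_detach(struct model_softc *sc)
{
	atomic_store(&sc->detaching, true);
	sc->intr_enabled = false;
}
```

Once model_begin_detach() has run, any later invocation of model_intr_task() leaves interrupts masked, which is the property the detach path relies on.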
dtrace failing with "no struct pkt_node definition is available"
I'm getting this error when I try to run dtrace on a recent head: dtrace: failed to compile script error.d: "/usr/lib/dtrace/tcp.d", line 292: operator -> cannot be applied to a forward declaration: no struct pkt_node definition is available. I confirmed that I do have ctf data in my kernel. I was able to get the dtrace script to run by removing tcp.d. Has something changed in the kernel that tcp.d needs to catch up with?
Re: ixl and BOOTP
Oh, I bet that you have a bunch of CPUs and ixl is consuming all of your interrupt vectors. Does setting this tunable fix the issue? hw.ixl.max_queues=1
Re: ixl and BOOTP
Hm, I'm unable to reproduce this on the latest -CURRENT (r283059). My hardware is a little different from yours -- my CPU is a Haswell Xeon, and I have only 1 igb port and no ixgbe. Also, I was just booting GENERIC. I didn't have Xen or anything running.
Re: ixl and BOOTP
On Mon, May 18, 2015 at 12:30 PM, Slawa Olhovchenkov wrote: > On Mon, May 18, 2015 at 02:42:51PM +, Eggert, Lars wrote: > > > > > Legacy mode, and it hangs in the kernel. > > > > Without if_ixl in loader.conf, it does the usual BOOTP logic: > ^^^ ^^ > > Sending DHCP Discover packet from interface ix0 (90:e2:ba:77:d4:9c) > ^^ > > Sending DHCP Discover packet from interface ix1 (90:e2:ba:77:d4:9d) > ^^ > how is this possible? > > ix0 interfaces are created by the ixgbe driver, not ixl.
Re: ixl and BOOTP
On Mon, May 18, 2015 at 10:08 AM, Ryan Stone wrote: > I can't remember when the last time I did this but it was probably within > the last couple of weeks. > Pardon me, I meant months, not weeks.
Re: ixl and BOOTP
On Mon, May 18, 2015 at 8:40 AM, Eggert, Lars wrote: > Hi, > > when I have the ixl driver compiled into my -CURRENT kernel (or loaded as > a module via loader.conf), the boot seems to hang (or silently crash) when > BOOTP starts bringing up interfaces to send out probes. (I'm not netbooting > over an ixl, the boot interface is an igb.) > > What works is building the kernel without the ixl driver and then loading > it manually once the system is up. That way, BOOTP via igb succeeds. > > Any ideas what could be causing this? > > Lars > This is very strange. I have successfully netbooted -CURRENT in a very similar environment (ixl compiled into kernel and booting over igb). I can't remember when I last did this, but it was probably within the last couple of weeks. I routinely netboot an 8.2 derivative in this kind of environment and I've never seen this kind of problem. Could it be related to the size of the kernel, and not ixl specifically? Also, do you have any indication as to where the hang happens? Is it still in the BIOS, or in pxeloader, or in the kernel itself? Are you booting in legacy mode or EFI?
Re: PCI PF memory decode disable when sizing VF BARs
On Wed, May 6, 2015 at 11:45 AM, John Baldwin wrote: > There are some devices with BARs in non-standard locations. :( If there is > a flag to just disable the VF bar decoding, then ideally we should just be > doing that and leaving the global decoding flag alone while sizing the VF > BAR. > Disabling SR-IOV BAR decoding in this function is currently redundant, as it's already done in pci_iov.c, but I guess to keep the interface sane it makes sense to do it here too. Something like this then?

diff --git a/sys/dev/pci/pci.c b/sys/dev/pci/pci.c
index b4c6151..c9d7541 100644
--- a/sys/dev/pci/pci.c
+++ b/sys/dev/pci/pci.c
@@ -37,6 +37,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -62,6 +63,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include

@@ -75,6 +77,11 @@ __FBSDID("$FreeBSD$");
     (((cfg)->hdrtype == PCIM_HDRTYPE_NORMAL && reg == PCIR_BIOS) || \
      ((cfg)->hdrtype == PCIM_HDRTYPE_BRIDGE && reg == PCIR_BIOS_1))

+#define PCIR_IS_IOV(cfg, reg) \
+        (((cfg)->iov != NULL) && \
+        ((reg) >= (cfg)->iov->iov_pos + PCIR_SRIOV_BAR(0)) && \
+        ((reg) <= (cfg)->iov->iov_pos + PCIR_SRIOV_BAR(PCIR_MAX_BAR_0)))
+
 static int pci_has_quirk(uint32_t devid, int quirk);
 static pci_addr_t pci_mapbase(uint64_t mapreg);
 static const char *pci_maptype(uint64_t mapreg);
@@ -2647,7 +2654,8 @@ pci_read_bar(device_t dev, int reg, pci_addr_t *mapp, pci_addr_t *testvalp,
         struct pci_devinfo *dinfo;
         pci_addr_t map, testval;
         int ln2range;
-        uint16_t cmd;
+        uint32_t restore_reg;
+        uint16_t cmd, mask;

         /*
          * The device ROM BAR is special. It is always a 32-bit
@@ -2677,9 +2685,21 @@ pci_read_bar(device_t dev, int reg, pci_addr_t *mapp, pci_addr_t *testvalp,
          * determining the BAR's length since we will be placing it in
          * a weird state.
          */
-        cmd = pci_read_config(dev, PCIR_COMMAND, 2);
-        pci_write_config(dev, PCIR_COMMAND,
-            cmd & ~(PCI_BAR_MEM(map) ? PCIM_CMD_MEMEN : PCIM_CMD_PORTEN), 2);
+#ifdef PCI_IOV
+        if (PCIR_IS_IOV(&dinfo->cfg, reg)) {
+                restore_reg = dinfo->cfg.iov->iov_pos + PCIR_SRIOV_CTL;
+                cmd = pci_read_config(dev, restore_reg, 2);
+                pci_write_config(dev, restore_reg, cmd & ~PCIM_SRIOV_VF_MSE, 2);
+        } else
+#endif
+        {
+                cmd = pci_read_config(dev, PCIR_COMMAND, 2);
+                mask = PCI_BAR_MEM(map) ? PCIM_CMD_MEMEN : PCIM_CMD_PORTEN;
+                pci_write_config(dev, PCIR_COMMAND, cmd & ~mask, 2);
+                restore_reg = PCIR_COMMAND;
+        }

         /*
          * Determine the BAR's length by writing all 1's. The bottom
@@ -2701,7 +2721,7 @@ pci_read_bar(device_t dev, int reg, pci_addr_t *mapp, pci_addr_t *testvalp,
         pci_write_config(dev, reg, map, 4);
         if (ln2range == 64)
                 pci_write_config(dev, reg + 4, map >> 32, 4);
-        pci_write_config(dev, PCIR_COMMAND, cmd, 2);
+        pci_write_config(dev, restore_reg, cmd, 2);

         *mapp = map;
         *testvalp = testval;
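As background for why pci_read_bar() puts the BAR "in a weird state": the length is found by writing all 1's and seeing which address bits stick. A stand-alone sketch of the decoding step is below. bar_size_from_testval() is a made-up helper, not the function the kernel uses; it only assumes the usual PCI layout of 4 flag bits in a memory BAR and 2 in an I/O BAR.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode the size of a BAR from the value read back after writing all 1's.
 * 'is_io' selects the I/O layout (2 low flag bits) over the memory layout
 * (4 low flag bits).  Returns 0 for an unimplemented BAR.
 */
uint64_t
bar_size_from_testval(uint64_t testval, bool is_io)
{
	uint64_t addr_bits = testval & ~(uint64_t)(is_io ? 0x3 : 0xf);

	/* The lowest address bit that stayed set equals the BAR's size. */
	return (addr_bits & (~addr_bits + 1));
}
```

For example, a 32-bit memory BAR that reads back 0xFFF00000 decodes to 0x100000 (1 MB). While the BAR holds the all-1's pattern it would claim a bogus address range, which is why decoding has to be switched off around the probe.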
Re: PCI PF memory decode disable when sizing VF BARs
On Tue, May 5, 2015 at 9:17 AM, Eric Badger wrote: > Hi Ryan and -current, > > During IOV config, when setting up VF bars, several calls are made to > 'pci_read_bar' (in sys/dev/pci/pci.c) in order to size VF BARs, which > causes memory decoding to be turned off temporarily for the PF associated > with those VFs. I'm finding that this can interfere with an already running > PF. > Ouch. That's a nasty bug. Thanks for tracking this down. > 1. Check the value of the 'reg' arg to 'pci_read_bar' and, if it is > outside a standard BAR range, don't disable memory decoding. This is > simple, but feels a little hackish and may have consequences I'm missing. > 2. Pass some flag/context through such that pci_read_bar knows it is > configuring VF BARs (we might instead disable VF MSE in this case, if it is > enabled). It would be necessary to carry this flag/context through several > function calls before reaching pci_read_bar, which might end up being ugly. > 3. Rearrange the calls so that VF BARs are sized when the PF is not yet > running, and that info saved until VFs are created. Probably it would be > done when the PF BARs are sized for any device supporting IOV, even if that > device never creates VFs. > I don't think that #3 is possible. Unfortunately the BAR is sized again when the BAR is allocated, so it wouldn't be enough to just size the BAR during attach. I would have to reserve the memory space during attach, but that might reserve physical address space that other devices need to function. Actually, that problem will prevent #2 from being easy to implement too. We'd have to add an extra flag to pci_alloc_multi_resource. I think that #1 is the best option. There's already a precedent for something similar, as it has a special case for the device ROM BAR. I haven't had a chance to test this yet, but I believe that this patch will solve the problem: https://people.freebsd.org/~rstone/patches/iov_bar.diff
Dual booting FreeBSD and Win95
No, this isn't a late April Fools joke. :( I find myself in a situation where I need to integrate my employer's manufacturing process with a third-party OEM's process. My employer's hardware tests are all FreeBSD-based while the OEM is Windows 95 based. I need to come up with a way to integrate them together. We're looking at dual-booting FreeBSD and Win95. We're thinking of booting into Win95, the OEM can do their thing, switch to booting FreeBSD, run our tests and produce a .csv file with the results, and then boot back into Win95 for them to finish up. Ideally we would like to switch the boot slice without human interaction. I've been playing around with trying to set only one slice as active to make the loader boot it, but it appears that doesn't actually work. boot0cfg would cover half of the use case (switching from FreeBSD back to Win95), but I'm not sure how I could do the original switch from Win95 to FreeBSD. We've discussed just switching hard drives, but we really want to shoot for a 100% automated process. Anybody have any ideas?
Re: What's the official method to test the build now?
I still see the compile errors that I reported here: https://lists.freebsd.org/pipermail/freebsd-current/2015-February/054803.html It affects these builds:
sparc64 LINT kernel failed, check _.sparc64.LINT for details
powerpc LINT kernel failed, check _.powerpc.LINT for details
powerpc LINT64 kernel failed, check _.powerpc.LINT64 for details
i386 LINT-NOINET kernel failed, check _.i386.LINT-NOINET for details
i386 LINT-NOINET6 kernel failed, check _.i386.LINT-NOINET6 for details
i386 LINT kernel failed, check _.i386.LINT for details
i386 LINT-NOIP kernel failed, check _.i386.LINT-NOIP for details
i386 LINT-VIMAGE kernel failed, check _.i386.LINT-VIMAGE for details
Re: What parts of UMA are part of the stable ABI?
I've put the full patch to convert uma_malloc and uma_free to accept a vm_size_t up for review[1]. It ended up being more extensive than expected as a fair number of places do use uma_set_allocf(). I do plan on MFC'ing this patch. This survives a make tinderbox (ignoring some vt-related LINT kernel build failures). I haven't attempted converting the uma ctors, etc over to vm_size_t yet, but I do plan on doing that still. [1] https://reviews.freebsd.org/D2106
What's the official method to test the build now?
"make tinderbox" has been broken for weeks, so I presume it's not what I'm supposed to be using to test my commits anymore. What's the officially supported make target that I'm supposed to use? ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: What parts of UMA are part of the stable ABI?
On Wed, Mar 18, 2015 at 10:24 AM, John Baldwin wrote: > I do think the normal zone callbacks passed to uma_zcreate() are too public > to change. Or at least, you would need to do some crazy ABI shim where you > have a uma_zcreate_new() that you map to uma_zcreate() via a #define for > the API, but include a legacy uma_zcreate() symbol that older modules can > call (and then somehow tag the old function pointers via an internal flag > in the zone and patch UMA to cast to the old function signatures for zones > with that flag). > I really wasn't clear here. I definitely don't think that changing the ctor, etc to accept a size_t is MFC'able, and I don't think that the problem (which is really only theoretical at this point) warrants an MFC to -stable. I was talking about potentially doing it in a separate commit to head, but that does leave -stable and head with a different API. This can be painful for downstream consumers to deal with, which is why I wanted comments.
Re: sbuf-related panic
On Tue, Mar 17, 2015 at 5:08 PM, Ian Lepore wrote: > I had commented-out INVARIANTS in my kernel config and forgotten that > (doh!) so none of the assertions were actually being tested. > FYI: there is now a GENERIC-NODEBUG kernel config checked into head so you can just buildkernel/installkernel with KERNCONF=GENERIC-NODEBUG when you want to run without invariants/witness/etc.
What parts of UMA are part of the stable ABI?
In this freebsd-hackers thread[1], a user reported that 10.1-RELEASE crashes during boot on a system with 3TB of RAM. As it turns out, when you have that much RAM ZFS autotunes itself to allocate a 6GB hash table. This triggers a nasty 32-bit integer truncation bug in malloc(9). malloc() calls uma_large_malloc(), but uma_large_malloc() accepts an int instead of a size_t and all kinds of hilarity can ensue from there. The user has confirmed that the patch in [2] stopped the kernel from instantly panicking once zfs.ko was loaded. I'm a bit concerned about whether the patch as written is an MFC candidate though. uma_large_malloc() calls page_alloc() to actually allocate the memory, and page_alloc() also accepts an int size parameter. This is where things get tricky. The signature for page_alloc() is governed by the uma_alloc() typedef, as uma also uses it internally for allocating memory for uma_zones. There is even a uma_zone_set_allocf() API for overriding the default allocation function. So there's definitely an argument to be made that the signature of page_alloc() is a part of the stable ABI. I have no hesitation in saying that uma_large_malloc() is not a stable API and changing it is fair game. If uma_alloc() is a part of the stable API, then it's simple enough to commit a 64-bit safe allocation function for uma_large_malloc() to call and change page_alloc() to call it instead. That commit can be MFC'ed, and a follow-up commit could convert the UMA APIs to use size_t everywhere. While I am at this, I'd like to also change the uma init/fini/ctor/dtor to also use size_t. I'm a little torn on this because this will definitely cause a lot of churn, both in the tree and for downstream consumers, and there's not necessarily going to be a big benefit to it. However, I suppose that the existence of machines where 4GB is less than 1% of system memory may mean that allocating 4GB at a time may not be that outlandish. I can definitely be talked out of this though. [1] https://lists.freebsd.org/pipermail/freebsd-hardware/2015-March/007602.html [2] http://people.freebsd.org/~rstone/patches/vm_64bit_malloc.diff
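To make the truncation concrete, here is a small user-space illustration of what happens to a 6GB request when it is squeezed through a 32-bit parameter. The helper names are invented for the example and have nothing to do with the UMA code itself; uint32_t is used so that the wraparound is well defined in C, and a signed int parameter would end up holding the same bit pattern.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * What a 64-bit byte count looks like after passing through a 32-bit
 * parameter: only the low 32 bits survive.
 */
uint32_t
truncate_to_32(uint64_t size)
{
	return ((uint32_t)size);
}

/* True when 'size' cannot be represented in a signed 32-bit int. */
int
overflows_int32(uint64_t size)
{
	return (size > (uint64_t)INT32_MAX);
}
```

6GB is 0x180000000, so truncate_to_32(6ULL << 30) yields 0x80000000: the callee sees either 2GB or, read as a signed int, a negative size, and an exact 4GB request would collapse to 0.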
Re: [PATCH] Convert the VFS cache lock to an rmlock
On Thu, Mar 12, 2015 at 1:36 PM, Mateusz Guzik wrote: > Workloads like buildworld and the like (i.e. a lot of forks + execs) run > into very severe contention in vm, which is orders of magnitude bigger > than anything else. > > As such your result seems quite suspicious. >
You're right, I did mess up the testing somewhere (I have no idea how). As you suggested, I switched to using a separate partition for the objdir, and ran each build with a freshly newfsed filesystem. I scripted it to be sure that I was following the same procedure with each run:

# Build known-working commit from head
git checkout 09be0092bd3285dd33e99bcab593981060e99058 || exit 1
for i in `jot 5`
do
    # Create a fresh fs for objdir
    sudo umount -f /usr/obj 2> /dev/null
    sudo newfs -U -j -L OBJ $objdev || exit 1
    sudo mount $objdev /usr/obj || exit 1
    sudo chmod a+rwx /usr/obj || exit 1
    # Ensure disk cache contains all source files
    git status > /dev/null
    /usr/bin/time -a -o $logfile make -s -j$(sysctl -n hw.ncpu) buildworld buildkernel
done

I tested on the original 12-core machine, as well as a 2 package x 8 core x 2 HTT (32 logical cores) machine that a co-worker was able to lend me. Unfortunately, the results show a performance decrease now. It's almost 5% on the 32 core machine:

$ ministat -w 74 -C 1 12core/*
x 12core/orig.log
+ 12core/rmlock.log
[ministat distribution plot]
    N        Min        Max     Median        Avg     Stddev
x   5    2478.81    2487.74    2483.45   2483.652  3.2495646
+   5    2489.64    2501.67    2498.26   2496.832  4.7394694
Difference at 95.0% confidence
    13.18 +/- 5.92622
    0.53067% +/- 0.238609%
    (Student's t, pooled s = 4.06339)

$ ministat -w 74 -C 1 32core/*
x 32core/orig.log
+ 32core/rmlock.log
[ministat distribution plot]
    N        Min        Max     Median        Avg     Stddev
x   5    1067.97    1072.86    1071.29   1070.314  2.2238997
+   5        .22    1129.05     1122.3   1121.324  6.4046569
Difference at 95.0% confidence
    51.01 +/- 6.99181
    4.76589% +/- 0.653249%
    (Student's t, pooled s = 4.79403)

The difference is due to a significant increase in system time. Write locks on an rmlock are extremely expensive (they involve an smp_rendezvous), and the cost likely scales with the number of cores:

x 32core/orig.log
+ 32core/rmlock.log
[ministat distribution plot]
    N        Min        Max     Median        Avg     Stddev
x   5    5616.63     5715.7     5641.5    5661.72  48.511545
+   5    6502.51    6781.84     6596.5    6612.39  103.06568
Difference at 95.0% confidence
    950.67 +/- 117.474
    16.7912% +/- 2.07489%
    (Student's t, pooled s = 80.5478)

At this point I'm pretty much at an impasse. The real-time behaviour is critical to me, but a 5% performance degradation isn't likely to be acceptable to many people. I'll see what I can do with this.
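The percentage figures in the ministat output above are simply the difference of the two averages relative to the baseline average. A tiny sketch of that arithmetic follows; relative_change_pct() is a made-up helper, and it reproduces only the point estimate, not the confidence interval that ministat derives from Student's t.

```c
/* Relative change of 'candidate' against 'baseline', in percent. */
double
relative_change_pct(double baseline, double candidate)
{
	return ((candidate - baseline) / baseline * 100.0);
}
```

Feeding it the 32-core wall-clock averages (1070.314 and 1121.324) gives roughly 4.77%, and the system-time averages (5661.72 and 6612.39) give roughly 16.8%, in line with the reported figures.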
Re: [PATCH] Convert the VFS cache lock to an rmlock
On Thu, Mar 12, 2015 at 12:37 PM, Adrian Chadd wrote: > Do you have access to any boxes that have more than 12 cores? > I have a 14-core hyperthreaded machine (so 28 logical cores), but it has no disk (long story). I could do a build out of a memory disk though. > Also, to ask a stupid question - why wasn't the reader gifted a > temporary priority boost because you were trying to acquire the write > lock? > rwlocks don't have any metadata tracking what threads hold a read lock, so it's impossible to propagate priority to them. rwlocks only keep a counter of the number of readers.
[PATCH] Convert the VFS cache lock to an rmlock
I've just submitted a patch to Differential[1] for review that converts the VFS cache to use an rmlock in place of the current rwlock. My main motivation for the change is to fix a priority inversion problem that I saw recently. A real-time priority thread attempted to acquire a write lock on the VFS cache lock, but there was already a reader holding it. The reader was preempted by a normal priority thread, and my real-time thread was starved. [1] https://reviews.freebsd.org/D2051

I was worried about the performance implications of the change, as I wasn't sure how common write operations on the VFS cache would be. I did a -j12 buildworld/buildkernel test on a 12-core Haswell Xeon system, as I figured that would be a reasonable stress test that simultaneously creates lots of small files and reads a lot of files as well. This actually wound up being about a 10% performance *increase* (the units below are seconds of elapsed time as measured by /usr/bin/time, so smaller is better):

$ ministat -C 1 orig.log rmlock.log
x orig.log
+ rmlock.log
[ministat distribution plot]
    N        Min        Max     Median        Avg     Stddev
x   6    2710.31    2821.35    2816.75  2798.0617  43.324817
+   5    2488.25    2500.25    2498.04   2495.756  5.0494782
Difference at 95.0% confidence
    -302.306 +/- 44.4709
    -10.8041% +/- 1.58935%
    (Student's t, pooled s = 32.4674)

The one outlier in the rwlock case does confuse me a bit. What I did was boot a freshly-built image with the rmlock patch applied, do a git checkout of head, and then do 5 builds in a row. The git checkout should have had the effect of priming the disk cache with the source files. Then I installed the stock head kernel, rebooted, and ran 5 more builds (and then 1 more when I noticed the outlier). The fast outlier was the *first* run, which should have been running with a cold disk cache, so I really don't know why it would be 90 seconds faster. I do see that this run also had about 500-600 fewer seconds spent in system time:

x orig.log
[ministat distribution plot]
    N        Min        Max     Median        Avg     Stddev
x   6    3515.23    4121.84    4105.57    4001.71  239.61362

I'm not sure how much I care, given that the rmlock is universally faster (but maybe I should try the "cold boot" case anyway). If anybody has any comments or further testing that they would like to see, please let me know.
Re: sysutils/lsof does not build (maybe) after r279433
Would:
#undef __BSD_VISIBLE
#include
#define __BSD_VISIBLE 1
work in lsof?
Re: r279514: buildworld failure: /usr/src/lib/libnv/tests/dnv_tests.cc:453:2: error: use of overloaded operator '<<' is ambiguous
I don't think that libnv is the problem. It looks like a problem with atf-c++ to me.
Re: r279514: buildworld failure: /usr/src/lib/libnv/tests/dnv_tests.cc:453:2: error: use of overloaded operator '<<' is ambiguous
Can you post the contents of your make.conf and src.conf? I didn't see this in any of my "make tinderbox" runs
Re: Build failed in Jenkins: FreeBSD_HEAD #2473
*sigh*, this was bad timing by Jenkins. This is already fixed in r279440 (it's good to see that Jenkins will catch this kind of thing so quickly though)
HEADS UP: PCI SR-IOV infrastructure has been committed to head
I've just finished committing support for PCI Single Root I/O Virtualization in the pci subsystem to head. This should be a no-op for everyone right now, but there were some minor refactorings in the pci code that could have a lingering bug. I did make sure to test that it boots on a variety of systems (but only i386/amd64, as that's all that I have access to). What's been committed to head is only the pci subsystem side of things, along with the userland tools to configure SR-IOV (along with, I'm happy to say, a full set of man pages). What's not in head yet are any drivers making use of the infrastructure. Full support for ixl(4) is complete and I've sent the patch to jfv@; I hope to see the driver support committed soon. I don't have any word on timelines for getting support in other drivers. Unfortunately adding SR-IOV support to a driver is not trivial as the standard leaves a lot of the details up to particular implementations (in the same way the the PCIe standard does not define how to send a packet from a NIC; instead defining how the PCIe device will expose its registers and whatnot, and its up to the PCIe device and driver to understand how to poke at the registers to send a packet). I have heard anecdotally that a number of driver maintainers have been very interested in this work so I hope that to see more drivers supported SR-IOV in the near future. I encourage all driver maintainers to read over the new manpages and contact me if they have any questions about the new infrastructure. Anybody interested in using SR-IOV should try to attend BSDCan 2015, as I will be giving a talk on the subject. I intend to focus more on the system administration side of configuring and using SR-IOV rather than the details of implementing an SR-IOV driver. If anybody did an "svn up" half-way through my muddled series of commits, sorry about the temporary breakage. 
My buildworld/buildkernel on r279466 just completed successfully so please make sure that you have at least that revision. If you still have problems, please let me know. I do want to thank John Baldwin for advice about the PCI Subsystem and newbus and Jack Vogel for his help with the Fortville NIC, including getting me early access to the VF driver for testing purposes. Thanks to everybody who reviewed the changes. Special thanks to Mark Johnston and Sean Mahood, who literally spent hours with me in a meeting room reviewing the entire patch series last summer (thankfully, those hours at least weren't consecutive). Above all, thanks to Sandvine Inc. for sponsoring this work. This is definitely the biggest contribution we've ever made to FreeBSD and I hope to see this kind of thing continue.
Re: atkbd.c not compiling?
On Sat, Feb 28, 2015 at 4:51 PM, Garrett Cooper wrote: > I’m not sure about key_map - are you building with syscons or vt? I have no idea. I'm just running make tinderbox. So far _.sparc64.LINT, _.i386.LINT-NOINET and _.i386.LINT-VIMAGE have failed, among others. i386.LINT and sparc64.LINT both have "device sc" and "device vt" from what I can see.
Re: atkbd.c not compiling?
And now this: /repos/users/rstone/freebsd-2/sys/modules/sfxge/../../dev/sfxge/sfxge_rx.h:39:2: error: LRO #error LRO Seriously?
atkbd.c not compiling?
I updated my source tree this morning and now I'm seeing this compile error in "make tinderbox": /repos/users/rstone/freebsd/sys/dev/atkbdc/atkbd.c:382:26: error: use of undeclared identifier 'key_map'; did you mean 'keymap'? keymap = malloc(sizeof(key_map), M_DEVBUF, M_NOWAIT); ^~~ keymap /repos/users/rstone/freebsd/sys/dev/atkbdc/atkbd.c:358:12: note: 'keymap' declared here keymap_t *keymap; /repos/users/rstone/freebsd/sys/dev/atkbdc/atkbd.c:383:26: error: use of undeclared identifier 'accent_map'; did you mean 'accentmap_t'? accmap = malloc(sizeof(accent_map), M_DEVBUF, M_NOWAIT); ^ /repos/users/rstone/freebsd/sys/sys/kbio.h:210:26: note: 'accentmap_t' declared here typedef struct accentmap accentmap_t; (By the way, this is the second time in two days that "make tinderbox" has been broken for me. It's extremely frustrating that I can't test my pending commits because others haven't done me the same courtesy.)
Re: netmap support for the Intel 40G card in head
This is great! Thanks to both you and Intel. I'm planning on getting SR-IOV support into head this week, which would allow you to create ixlv instances (on the same hardware). Any chance that you'd have the time to look into supporting SR-IOV for that driver too?
make regression -- -q doesn't work?
On 10.1-RELEASE, make -q doesn't seem to work anymore: [rstone@wtllab-bsd10-build-64 rstone]cat Makefile foo: bar cp bar foo bar: touch bar clean: rm -f foo bar [rstone@wtllab-bsd10-build-64 rstone]make -q foo; echo $? 1 [rstone@wtllab-bsd10-build-64 rstone]make foo touch bar cp bar foo [rstone@wtllab-bsd10-build-64 rstone]make -q foo; echo $? `foo' is up to date. 1 [rstone@wtllab-bsd10-build-64 rstone]make foo `foo' is up to date. [rstone@wtllab-bsd10-build-64 rstone]echo $? 0 This worked correctly in 8.1-RELEASE. I suspect that this is a bmake-induced regression?
Re: Has the counter/tic been resolved?
Is this the issue that you're referring to? https://lists.freebsd.org/pipermail/freebsd-current/2015-February/054295.html If so, it was fixed in r278229: https://svnweb.freebsd.org/changeset/base/278229
Re: PSA: If you run -current, beware!
On Wed, Feb 4, 2015 at 6:15 PM, Peter Wemm wrote: > --- kern/kern_clock.c 2014-12-01 15:42:21.707911656 -0800 > +++ kern/kern_clock.c 2014-12-01 15:42:21.707911656 -0800 > @@ -410,6 +415,11 @@ > #ifdef SW_WATCHDOG > EVENTHANDLER_REGISTER(watchdog_list, watchdog_config, NULL, 0); > #endif > + /* > +* Arrange for ticks to go negative just 5 minutes after boot > +* to help catch sign problems sooner. > +*/ > + ticks = INT_MAX - (hz * 5 * 60); > } Should we just commit this under #ifdef INVARIANTS?
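A minimal sketch of the kind of sign problem that starting ticks near INT_MAX is meant to flush out. The helper names (tick_before, tick_before_naive, tick_advance) are hypothetical and are not kernel APIs. The sketch assumes a 32-bit two's-complement int and that the two tick values being compared are less than INT_MAX ticks apart.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel API: returns true if tick value 'a'
 * is earlier than 'b'. The subtraction is done in unsigned arithmetic,
 * where wraparound is well defined, and only the resulting difference is
 * interpreted as signed (assumes two's complement, as on all FreeBSD
 * platforms).
 */
static bool
tick_before(int a, int b)
{
	return ((int)((unsigned int)a - (unsigned int)b) < 0);
}

/* Naive comparison: gives the wrong answer once the counter has wrapped. */
static bool
tick_before_naive(int a, int b)
{
	return (a < b);
}

/* Advance a tick counter by 'n' ticks, wrapping past INT_MAX to negative. */
static int
tick_advance(int t, unsigned int n)
{
	return ((int)((unsigned int)t + n));
}
```

With hz = 1000, a counter started at INT_MAX - (hz * 5 * 60) and advanced by 300001 ticks wraps to a negative value. tick_before() still orders the start value before the later one, while the naive comparison reports the opposite.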
Re: Is anyone using the schedgraph.d script?
Hm, there was one bug in that script. I uploaded a fixed version. The fix was: - printf("%d %d KTRGRAPH group:\"thread\", id:\"%s/%s tid %d\", state:\"runq add\", attributes: prio:%d, linkedto:\"%s/%s tid %d\"\n", cpu, timestamp, args[0]->td_proc->p_comm, args[0]->td_name, args[0]->td_tid, args[0]->td_priority, curthread->td_proc->p_comm, curthread->td_name, args[0]->td_tid); + printf("%d %d KTRGRAPH group:\"thread\", id:\"%s/%s tid %d\", state:\"runq add\", attributes: prio:%d, linkedto:\"%s/%s tid %d\"\n", cpu, timestamp, args[0]->td_proc->p_comm, args[0]->td_name, args[0]->td_tid, args[0]->td_priority, curthread->td_proc->p_comm, curthread->td_name, curthread->td_tid); Note that the last printf argument used args[0] instead of curthread as intended. One other thing that I have noticed with the schedgraph data gathering is that unlike KTR, in dtrace every CPU gathers its data into a CPU-local buffer. This can mean that a CPU that sees a large number of scheduler events will roll over its ring buffer much more quickly than a lightly loaded CPU. This can lead to confusing or misleading schedgraph output at the beginning of the time period. You can mitigate this problem by allowing dtrace to allocate a larger ring buffer with: #pragma D option bufsize=32m (You can potentially tune it even higher than that, but that's a good place to start.) Finally, I've noticed that schedgraph seems to have problems auto-detecting the clock frequency, so I tend to forcibly specify 1GHz (dtrace always outputs time in units of ns, so this is always correct to do with dtrace-gathered data).
Re: General Protection Fault in prelist_remove()
On Tue, Nov 18, 2014 at 2:21 AM, Mark Johnston wrote: > https://people.freebsd.org/~markj/patches/nd6_dad_races.diff I haven't had the chance to study the patch in detail, but I can confirm that it at least fixes the crashes that I was seeing earlier.
Re: General Protection Fault in prelist_remove()
On Mon, Sep 16, 2013 at 1:10 PM, Mark Johnston wrote: > I've partially fixed this at work by adding a rw lock to protect access > to the prefix, default router, and DAD lists. The patch is here: > http://people.freebsd.org/~markj/patches/ndp-locking.diff Hi Mark, I've hit a bug in this patch today. The problem is in the locking of the DAD list. Many functions (e.g. nd6_dad_duplicated) call nd6_dad_find() to look up a dadq structure, and then manipulate the structure with no lock held. The problem is that once nd6_dad_find() releases the ND lock there is nothing preventing another thread from going in and freeing the structure. This causes a use-after-free in nd6_dad_duplicated. I have a setup which is somehow triggering DAD on link-local addresses (I don't understand why; I don't have duplicate MAC addresses on the network as best I can tell) and with INVARIANTS on I very frequently get a crash in nd6_dad_duplicated. It looks to me that the only way to fix it is either to introduce reference counting into the structure, or to push the locking out of nd6_dad_find() and into the callers. Any opinions?
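A userland sketch of the first option (reference counting), using a pthread mutex as a stand-in for the ND lock. Every name here (struct dad_entry, dad_insert, dad_find_ref, dad_rele, dad_remove) is hypothetical; this is not the code from either patch, only an illustration of taking a reference before the lookup drops the lock so that a concurrent removal cannot free the entry underneath the caller.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DAD queue entry; not the kernel's struct dadq. */
struct dad_entry {
	struct dad_entry *next;
	int key;        /* stands in for the address being probed */
	int refcnt;     /* protected by dad_lock */
	int dup_count;  /* example field that a caller updates */
};

static pthread_mutex_t dad_lock = PTHREAD_MUTEX_INITIALIZER;
static struct dad_entry *dad_list;

/* Add an entry; the list itself holds the initial reference. */
static int
dad_insert(int key)
{
	struct dad_entry *e = calloc(1, sizeof(*e));

	if (e == NULL)
		return (-1);
	e->key = key;
	e->refcnt = 1;
	pthread_mutex_lock(&dad_lock);
	e->next = dad_list;
	dad_list = e;
	pthread_mutex_unlock(&dad_lock);
	return (0);
}

/* Look up an entry and take a reference BEFORE releasing the lock. */
static struct dad_entry *
dad_find_ref(int key)
{
	struct dad_entry *e;

	pthread_mutex_lock(&dad_lock);
	for (e = dad_list; e != NULL; e = e->next) {
		if (e->key == key) {
			e->refcnt++;
			break;
		}
	}
	pthread_mutex_unlock(&dad_lock);
	return (e);
}

/* Drop one reference; frees the entry and returns 1 if it was the last. */
static int
dad_rele(struct dad_entry *e)
{
	int last;

	pthread_mutex_lock(&dad_lock);
	last = (--e->refcnt == 0);
	pthread_mutex_unlock(&dad_lock);
	if (last)
		free(e);
	return (last);
}

/* Unlink an entry and drop the list's reference; -1 if not found. */
static int
dad_remove(int key)
{
	struct dad_entry **pp, *e = NULL;

	pthread_mutex_lock(&dad_lock);
	for (pp = &dad_list; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->key == key) {
			e = *pp;
			*pp = e->next;
			break;
		}
	}
	pthread_mutex_unlock(&dad_lock);
	return (e == NULL ? -1 : dad_rele(e));
}
```

In this sketch a caller that obtained an entry from dad_find_ref() can keep using it after another thread calls dad_remove(); the memory is only freed when the caller's own dad_rele() drops the last reference. The reference only keeps the memory alive, so concurrent updates to fields such as dup_count would still need a lock. The second option from the message, having callers hold the lock across both the lookup and the use, avoids the counter at the cost of longer lock hold times.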
Re: systat -ifstat on high-speed links
http://svnweb.freebsd.org/changeset/base/272284
Re: Mounting ZFS with error 5 failed, since r271963 callout convert
On Mon, Oct 27, 2014 at 4:21 PM, Michael Schmiedgen wrote: > Hi List, > > my ZFS does not mount. I bifurcated to r271963 that > does not work anymore. The commit seems not directly > related to ZFS, but is rather a conversion from timeout(9) > to callout(9). > > After booting the kernel it drops to the mount prompt, > stating that ZFS cannot be mounted because of 'error 5'. > > Any hints? Can I provide some more information? > > Thanks, > Michael The changes to the 3 files there look to be independent, so can you narrow this down further by applying the patch to only a single file? Of those three only ACPI looks like it could affect ZFS or disks, so this will be the patch to try first: https://svnweb.freebsd.org/base/head/sys/dev/acpica/acpi.c?view=patch&r1=271889&r2=271963&pathrev=271963 If you can get a verbose boot log from the machine that would be helpful, but you'd need a serial console to capture that.