build problem: external/bsd/ntp/dist/ntpd/
Hi, recently external/bsd/ntp/dist/ntpd/ntp_parser.[ch] have been built in the source directories, presumably from ntp_parser.y. Obviously this is not the right spot for them. I haven't seen a fix in the last couple of days but may have missed it. Regards, Geoff
amd failing in -current
Updating from a couple of weeks ago, amd is now failing on startup. Logs say:

Nov 13 22:38:50 computername amd[345]/fatal: cannot create rpc/udp service
Nov 13 22:38:50 computername amd[345]/info: Finishing with status 2

I'm guessing this is from dist/conf/transp/transp_sockets.c:create_nfs_service() and the svcudp_create() call. That doesn't look like it was updated recently, so some RPC code or something changed. I cannot look at this over the weekend, so if no one has any idea, I'll look at it next week.

Regards,
Geoff
Re: amd failing in -current
On Friday 2015-11-13 22:47 +1100, Geoff Wing output: : Nov 13 22:38:50 computername amd[345]/fatal: cannot create rpc/udp service : Nov 13 22:38:50 computername amd[345]/info: Finishing with status 2 As soon as I post this, I see that Matthias Scheler posted a fix, so hopefully can ignore. Thanks, Geoff
/etc/exports is now being read incorrectly
Hi,
for a long time, the parser reading /etc/exports would treat the following example from exports(5):

/u -maproot=bin: -network 131.104.48 -mask 255.255.255.0

as 131.104.48/24. Currently it's being treated as 131.104.0.48 (seen via ``showmount -e''). I'm guessing it changed in the last month or so.

Regards,
Geoff
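The two readings reported above differ only in how the three-part form "131.104.48" is expanded to 32 bits. The following is a minimal Python sketch of both expansions, not the actual exports parser; the function names are made up for illustration and each part is assumed to be plain decimal.

```python
def _parts(text):
    """Split a dotted string into integer parts (decimal only, an assumption)."""
    parts = [int(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected 1 to 4 dotted parts")
    return parts


def parse_as_host(text):
    """Host-style expansion: the last part fills all remaining low-order bytes."""
    *head, last = _parts(text)
    value = 0
    for octet in head:
        if not 0 <= octet <= 255:
            raise ValueError("octet out of range")
        value = (value << 8) | octet
    tail_bits = 8 * (4 - len(head))
    if not 0 <= last < (1 << tail_bits):
        raise ValueError("last part out of range")
    return (value << tail_bits) | last


def parse_as_network(text):
    """Network-style expansion: every part is one octet, missing octets are zero."""
    parts = _parts(text)
    value = 0
    for octet in parts:
        if not 0 <= octet <= 255:
            raise ValueError("octet out of range")
        value = (value << 8) | octet
    return value << (8 * (4 - len(parts)))


def dotted(value):
    """Format a 32-bit value as a dotted quad."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    print(dotted(parse_as_host("131.104.48")))     # the reading seen via showmount -e
    print(dotted(parse_as_network("131.104.48")))  # the reading exports(5) implies
```

Under the host-style reading the address no longer survives the 255.255.255.0 mask unchanged, which is one way to spot that the wrong expansion was used.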
Crash during rc bootup (amd64) with new networking stuff
Hi,
with the new networking setup, I'm getting a crash using clean amd64 build (GENERIC kernel) during rc script processing.

After getting past netstart.local, I'll get

interface address is missing from cache = 0x0 in delete
arp: writing to routing socket: No such file or directory
Building databases:
Starting syslogd.
Starting named.
Setting date via ntp.

then

panic: kernel "(la->la_flags & LLE_STATIC) == 0 failed: .. if_arp", line 1220

My netstart.local adds static routes and blackhole routes. It also deletes and adds in a static arp address:
"arp -d 1.2.3.4; arp -s 1.2.3.4 xx:xx:xx:xx:xx:xx"

It's a bit hard to diagnose on my computer, but I can try if others cannot reproduce.

Regards,
Geoff
Re: Crash during rc bootup (amd64) with new networking stuff
On Friday 2016-04-15 13:20 +1000, Geoff Wing output: :panic: kernel "(la->la_flags & LLE_STATIC) == 0 failed: .. if_arp", line 1220 :It also deletes and adds in a static arp address: : "arp -d 1.2.3.4; arp -s 1.2.3.4 xx:xx:xx:xx:xx:xx" Taking out the static arp commands and it boots up OK. Regards, Geoff
Re: Crash during rc bootup (amd64) with new networking stuff
On Friday 2016-04-15 16:20 +1000, Ryota Ozaki output: :> panic: kernel "(la->la_flags & LLE_STATIC) == 0 failed: .. if_arp", line 1220 :The source code of your kernel looks a bit old: : http://nxr.netbsd.org/xref/src/sys/netinet/if_arp.c#1220 : :You can see the version of the file by: : $ ident /netbsd |grep if_arp.c : $NetBSD: if_arp.c,v 1.205 2016/04/07 03:22:15 christos Exp $ :And this is the latest version. Mine says: $NetBSD: if_arp.c,v 1.206 2016/04/13 00:47:01 ozaki-r Exp $ I'll try the patch in the other post in a couple of hours. Regards, Geoff
Re: Crash during rc bootup (amd64) with new networking stuff
On Friday 2016-04-15 16:48 +1000, Ryota Ozaki output: :> :panic: kernel "(la->la_flags & LLE_STATIC) == 0 failed: .. if_arp", line 1220 :> :It also deletes and adds in a static arp address: :> : "arp -d 1.2.3.4; arp -s 1.2.3.4 xx:xx:xx:xx:xx:xx" :> Taking out the static arp commands and it boots up OK. :Thanks. I could reproduce the panic on my machine with the latest kernel. : :A quick fix is like this: [...] :Does this patch help you? :If so, I'll commit it (with tweaks maybe) after more validations. Great, boots up OK. Thanks, Geoff
libgnumalloc (i386) problems? Seen with squid
Hi,
with May builds of i386 (cross-compiled from amd64), when I run squid (pkgsrc build), I get a seg fault, with a backtrace showing over 20k calls to calloc() in libgnumalloc. e.g.
.
#22690 0xbb97b58c in calloc () from /usr/lib/libgnumalloc.so.1
#22691 0xbb97b58c in calloc () from /usr/lib/libgnumalloc.so.1
#22692 0xbb97b58c in calloc () from /usr/lib/libgnumalloc.so.1
#22693 0xbb97b58c in calloc () from /usr/lib/libgnumalloc.so.1
.

% objdump -S calloc.o

calloc.o: file format elf32-i386

Disassembly of section .text:

:
0: 8b 44 24 08 mov 0x8(%esp),%eax
4: c7 44 24 08 01 00 00 movl $0x1,0x8(%esp)
b: 00
c: 0f af 44 24 04 imul 0x4(%esp),%eax
11: 89 44 24 04 mov %eax,0x4(%esp)
15: eb e9 jmp 0

Looks like some header may be causing trouble. Or some weird optimisation. Anyone else see this?

Regards,
Geoff
NPF table bug (Was: Why so many packet filters?)
On Monday 2016-08-15 17:14 +1000, Joerg Sonnenberger output: :[] There are still quite a few issues with NPF, primarily :documentation issues, but also some functional ones. It seems a bit [...] I recently ran into an NPF bug (PR 50511 from December last year). A cursory scan suggested it was running into an arbitrary limit in proplib and not handling it gracefully. I haven't had time to look into it more deeply. Regards, Geoff
ssh 8.2 option "compression" error
Hi,
I have an error with the new openssh 8.2. The option "compression" is not being handled properly. I blew away my objdir for crypto/external/bsd/openssh before rebuilding but I'm still getting this error. Is anyone else able to reproduce it, or maybe something else caused a build error for me?

% ssh -F /dev/null -o "compression yes" localhost
command-line line 0: unsupported option "yes".
%

I didn't check all the options but several others seem to be working OK.

Regards,
Geoff
Re: libuv.so
On Saturday 2020-06-06 01:16 +0100, ci4...@gmail.com output:
:On my system from the 4th of June I get unresolved libuv for host,
:nslookup, dig.
:# ldd /usr/bin/dig | grep uv
:-luv.1 => not found
:Is this some local problem? Should I do a clean rebuild?

Hi,
libuv was being installed for a couple of days for a BIND "named" upgrade - however it is now only used internally for the build. IIRC, libuv is too fast moving with API changes and is unnecessary to keep around. I believe Christos added them to the obsolete lists so the library and links should be removed when you run "postinstall" obsolete.

DNS resolvers seem to be moving to lightweight "unbound". The "nsd" DNS provider is available in src build with "MKNSD=yes" in your /etc/mk.conf when building (or use pkgsrc). Documentation for both is on: www.nlnetlabs.nl

Regards,
Geoff
missing etc/rc.d file? (Was: blacklist -> blocklist in current)
On Sunday 2020-06-14 22:01 -0400, Christos output: :I've renamed blacklist to blocklist, so if you are currently using it, :you should rename things accordingly: : : - rc.conf variable : - /var/db/blacklist.db file : - npf table name : :Apologies for the inconvenience, :christos Hi, I cannot see (in CVS or FTP): src/etc/rc.d/blocklistd Regards, Geoff
Re: missing etc/rc.d file? (Was: blacklist -> blocklist in current)
On Wednesday 2020-06-17 08:03 +0100, Iain Hibbert output: :On Wed, 17 Jun 2020, Geoff Wing wrote: :> Hi, :> I cannot see (in CVS or FTP): :> src/etc/rc.d/blocklistd :src/external/bsd/blocklist/etc/rc.d/blocklistd Thanks. I guess I had a botched build during the changes since my version of postinstall doesn't know about it. Regards, Geoff
panic enabling ipfilter (Dec 27)
Hi,
with -current today, when I "/etc/rc.d/ipfilter start" on amd64 it panics in sys/net/pfil.c:

pfil_add_hook(pfil_func_t func, void *arg, int flags, pfil_head_t *ph)
...
KASSERT((flags & ~PFIL_ALL) == 0);
...

Unfortunately my machine is mostly headless and I can't get dmesg saved after reboot. I have it on another machine which normally does keep some dmesg though nothing saved with this:

/etc/sysctl.conf:
ddb.onpanic=1
ddb.commandonenter=bt;reboot

Is "bt;sync" better? I'll try to get more info if no one can reproduce.

Regards,
Geoff
Issues with sshd segv (Dec 27)
Hi,
on an amd64 machine, I was getting the child sshd process seg-faulting (I believe after dropping privileges but I wasn't getting a coredump) when trying to accept a connection (``sshd -d -d -d'' wasn't really helpful).

It had three legacy lines in its sshd_config:

HostKey /etc/ssh/ssh_host_key
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_dsa_key

It didn't have lines for

HostKey /etc/ssh/ssh_host_ecdsa_key
HostKey /etc/ssh/ssh_host_ed25519_key

nor did it specify a Protocol in the config.

Commenting out the former three lines allows it to accept connections properly. Not investigated yet. Maybe someone has an idea about it.

Regards,
Geoff
Starting NPF crashes amd64 -current (23 Jan 2017)
Hi,
starting -current on amd64, I get a crash during (presumably) /etc/rc.d/npf

I have some dynamic tables in /etc/npf.conf, e.g.
table type tree dynamic
though maybe not relevant.

Panics are copied from phone video. I can't get a crash dump, nor does my computer keep system message logs over reboot.

One panic had

panic: kernel diagnostic assertion "elements > 0" failed: sys/kern/subr_hash.c: line 93
vpanic()
ch_voltag_convert_in()
hashinit()+0x1b4
npf_table_create()+0xc7
npf_mk_tables.isra.0()+0x20f
npfctl_load()+0x1d3
VOP_IOCTL()
vn_ioctl()
sys_ioctl()
syscall()
-- syscall (54)
dumping to dev 168,10 not possible
rebooting

After several rebuilds and changing dump device, I get

db{0}> sync
dumping to dev 168,10 (offset=27281471, size=3143454)
dump uvm_fault(0xfe833bacd5c8, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 ...blah...
Stopped in pid 206.1 (npfctl) at netbsd:sparse_dump_mark+0x10c: cmpq $0,38(%rax)
Re: Starting NPF crashes amd64 -current (23 Jan 2017)
On Tuesday 2017-01-24 15:59 +1100, Geoff Wing output: :starting -current on amd64, I get a crash during (presumably) /etc/rc.d/npf Panics from previous message were when I had pseudo-device npf in my kernel config. Removing that I get panics at the same place (npfctl) as mutex_vector_error: locking against myself address: ..f81327488 cpu:0 lwp: blah...5420 field:blah...5420 wait/spin: 0/0 Stopped in pid 206.1 (npfctl) at netbsd:breakpoint+0x5: leave Regards, Geoff
npf crashes on empty hash table load (amd64)
Hi,
using the following /etc/npf.conf and an empty file "/etc/npf_blacklist" I get a crash in hashinit(): "KASSERT(elements > 0);"

#10 0x80cf6365 in kern_assert (fmt=fmt@entry=0x810db938 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ") at /usr/netbsd/src/sys/lib/libkern/kern_assert.c:51
#11 0x809950a4 in hashinit (elements=, htype=HASH_LIST, waitok=, hashmask=0xfe82dabaf920) at /usr/netbsd/src/sys/kern/subr_hash.c:93
#12 0x819494c9 in ?? ()
#13 0xfe82f2b094f0 in ?? ()
#14 0x in ?? ()

/etc/npf.conf:

$ext1_if = inet4(msk0)
alg "icmp"
table type hash file "/etc/npf_blacklist"
group "external1" on $ext1_if {
  block in final from
}
group default {
  pass final on lo0 all
  block all
}

Regards,
Geoff
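The assertion quoted above states a precondition: the caller must ask for at least one element. A small Python sketch of that precondition and of one possible caller-side guard follows. Both function names are hypothetical, the power-of-two rounding is an assumption about how such a size is typically chosen, and this is not necessarily the fix that was committed.

```python
def hash_table_size(elements):
    """Round the requested element count up to a power of two.

    Mirrors the quoted precondition by refusing a count of zero.
    """
    if elements <= 0:
        raise ValueError("elements > 0")
    size = 1
    while size < elements:
        size <<= 1
    return size


def table_size_for_entries(entries):
    """Hypothetical caller-side guard: an empty table still gets one bucket."""
    return hash_table_size(max(1, len(entries)))


if __name__ == "__main__":
    print(table_size_for_entries([]))           # empty table file
    print(table_size_for_entries(["a", "b", "c"]))
```

With the guard in place, loading an empty table file yields a minimal table instead of tripping the precondition.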
Re: npf crashes on empty hash table load (amd64)
Christos Zoulas typed: : In article <20170310020544.ga...@primenet.com.au>, : Geoff Wing wrote: :>using the following /etc/npf.conf and an empty file "/etc/npf_blacklist" :>I get a crash in hashinit(): "KASSERT(elements > 0);" : Fixed. : christos Fantastic. Even though I had a braino and only installed the kernel and not the modules to test initially, I can now use npf again. Large table loading is working too (my table text files were 67k and 75k, though it used to load over 1m via proplib) Thanks, Geoff
Re: current danger?
On Friday 2017-03-31 17:47 +0100, Patrick Welche output: :netbooted :fsck'd :copied vintage libc.so.12.206 -> /lib :booted netbsd.old (which should go with said libc) :try to compile hello world: :/usr/lib/libc.so: error adding symbols: File format not recognized :which file? Maybe recopy the /stand/ tree (modules for that netbsd and netbsd.old) to make sure the exec_elf* stuff is right. Regards, Geoff
dk wedges vs netbsd -current
Hi,
I reinstalled an old i386 hardware server with 7.1 a couple of weeks ago. Still going strong with a 19 year old Quantum Fireball SE4.3A HD.

The system installed as an old MBR (+63 sector) system.
It did not add any dk stuff for /etc/fstab.
It did NOT leave room to add the dk stuff at the start of the hard drive, so you cannot do "dkscan_bsdlabel" stuff on it and add bootloaders.
-current does not have lots of the DKWEDGE_METHOD_* stuff in it so you cannot boot a new kernel on it.

Where is the philosophy of least-surprise going with this? Hopefully we can have more DKWEDGE_METHOD* compatibility shims in, given we also compile with things like COMPAT_* for whatever old binaries lie around.

Regards,
Geoff
Re: dk wedges vs netbsd -current
On Saturday 2017-04-01 08:35 +0200, Martin Husemann output: :On Sat, Apr 01, 2017 at 05:31:20PM +1100, Geoff Wing wrote: :> The system installed as an old MBR (+63 sector) system. :> It did not add any dk stuff for /etc/fstab. :> It did NOT leave room to add the dk stuff at the start of the hard drive, so :> you cannot do "dkscan_bsdlabel" stuff on it and add bootloaders. :> -current does not have lots of the DKWEDGE_METHOD_* stuff in it so you cannot :> boot a new kernel on it. :Sorry, but I fail to parse most of this. :You are talking about GPT, not dk(4) or something? :You can use dk(4) with MBR or any other on-disk format. You can. Boot any -current GENERIC kernel without those options. Look at them fail to find all the dk(4) stuff (because it's installed with pre-dk(4) stuff) and fail to boot. :Not sure what you mean with new kernel or doing dkscan_bsdlabel/add :bootloaders. I'm not sure what you mean here. I presume you understand the purpose of dkscan_bsdlabel(8). I also presume you understand the commented out DKWEDGE_METHOD_* stuff. If you've never added dk wedge info onto the disk (e.g. during a 7.1 install) where do you think it will come from? :Maybe you can rephrase your question, there might be some assumptions in it :that better be spelled out explicitly. OK. Install NetBSD 7.1 (i386) from CD/USB. Put in -current. Fail. That's it.
Re: dk wedges vs netbsd -current
Organization: PrimeNet Computer Consultancy

On Saturday 2017-04-01 12:54 +, mlel...@serpens.de output:
:No need to scan for disklabels if the kernel already uses the disklabel.
:But this has nothing to do with "adding bootloaders". I can only imagine
:that you want to add boot managers in UEFI style, which is something
:that your old i386 system doesn't support.

I believe I have misrepresented GPT and DK stuff earlier. I've been fighting with them because of lack of space with GPT/old-MBR and other "in-use/busy" stuff.

With a fresh 7.1 I have wd[0-9] (and /etc/fstab entries for them) (on i386). Plonk in a -current kernel without DKWEDGE_METHOD_* and it hangs because it cannot find anything (no /dev/dk* or /dev/wd*). I really would be unhappy if I was completely headless (and it is a pain for me to plug in a monitor there, which I had to do with the NPF woes (couldn't load large tables)).

Maybe I've missed a step over the last couple of years so that it should automatically find old wd(*) BSD partitions with a -current kernel.
Re: dk wedges vs netbsd -current
On Saturday 2017-04-01 11:19 -0700, John Nemeth output:
:On Apr 1, 11:18pm, Geoff Wing wrote:
:} With a fresh 7.1 I have wd[0-9] (and /etc/fstab entries for them) (on i386).

I meant wd0[abefgh] ...

: How old is this system? What is the partition type of the
:NetBSD partition? Can you show the output of "fdisk wd0"?

It's a ~16 y/o Pentium-4 with ~19 y/o HD.
7.1 created and recognises wd0[abefgh]
-current doesn't see wd0* partitions and won't build dk structures for them.

% fdisk wd0
Disk: /dev/rwd0d
NetBSD disklabel disk geometry:
cylinders: 14848, heads: 9, sectors/track: 63 (567 sectors/cylinder)
total sectors: 8418816, bytes/sector: 512

BIOS disk geometry:
cylinders: 524, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
total sectors: 8418816

Partitions aligned to 16065 sector boundaries, offset 63

Partition table:
0: NetBSD (sysid 169)
    start 63, size 8418753 (4111 MB, Cyls 0-524/11/63), Active
1:
2:
3:
Bootselector disabled.
First active partition: 0
Drive serial number: 0 (0x)
Re: dk wedges vs netbsd -current
On Saturday 2017-04-01 18:14 -0700, John Nemeth output:
: As others have noted, you are totally concentrating on the
:wrong thing. The fact that it "won't build dk structures" is of
:no relevance. The first thing that has to happen is detecting the
:drive. Anyway, can you capture a dmesg?

Can't get a -current one until tomorrow (30 hours or so).

7.1 sees wd0a, wd0b, etc
-current will see wd0 but won't find any of the partitions on it (unless I have DKWEDGE_METHOD* and then I'll get dk() partitions/wedges)

NetBSD 7.1 (GENERIC.201703111743Z)
total memory = 511 MB
avail memory = 486 MB
kern.module.path=/stand/i386/7.1/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
System Manufacturer System Name (System Version)
mainbus0 (root)
ACPI: RSDP 0xf52b0 14 (v00 ASUS )
ACPI: RSDT 0x1ffec000 30 (v01 ASUS P4B533 42302E31 MSFT 31313031)
ACPI: FACP 0x1ffec0c0 74 (v01 ASUS P4B533 42302E31 MSFT 31313031)
ACPI: DSDT 0x1ffec134 002742 (v01 ASUS P4B533 1000 MSFT 010B)
ACPI: FACS 0x1000 40
ACPI: BOOT 0x1ffec030 28 (v01 ASUS P4B533 42302E31 MSFT 31313031)
ACPI: APIC 0x1ffec058 5A (v01 ASUS P4B533 42302E31 MSFT 31313031)
ACPI: All ACPI Tables successfully acquired
ioapic0 at mainbus0 apid 2: pa 0xfec0, version 0x20, 24 pins
cpu0 at mainbus0 apid 0: Intel(R) Pentium(R) 4 CPU 2.00GHz, id 0xf24
acpi0 at mainbus0: Intel ACPICA 20131218
acpi0: X/RSDT: OemId , AslId
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
acpibut0 at acpi0 (PWRB, PNP0C0C): ACPI Power Button
MEM1 (PNP0C01) at acpi0 not configured
SBIO (PNP0C02) at acpi0 not configured
SYS1 (PNP0C02) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
COPR (PNP0C04) at acpi0 not configured
ECP (PNP0401) at acpi0 not configured
UAR1 (PNP0501) at acpi0 not configured
UAR2 (PNP0501) at acpi0 not configured
pckbc1 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
SYS2 (PNP0C02) at acpi0 not configured
apm0 at acpi0: Power Management spec V1.2
ACPI: Enabled 1 GPEs in block 00 to 1F
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S3_] (20131218/hwxface-646)
attimer1: attached to pcppi1
pckbd0 at pckbc1 (kbd slot)
pckbc1: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x1a30 (rev. 0x11)
agp0 at pchb0: aperture at 0xf800, size 0x400
ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x1a31 (rev. 0x11)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga0 at pci1 dev 0 function 0: vendor 0x10de product 0x0110 (rev. 0xb2)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
uhci0 at pci0 dev 29 function 0: vendor 0x8086 product 0x24c2 (rev. 0x01)
uhci0: interrupting at ioapic0 pin 16
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1: vendor 0x8086 product 0x24c4 (rev. 0x01)
uhci1: interrupting at ioapic0 pin 19
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2: vendor 0x8086 product 0x24c7 (rev. 0x01)
uhci2: interrupting at ioapic0 pin 18
usb2 at uhci2: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x24cd (rev. 0x01)
ehci0: interrupting at ioapic0 pin 23
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2
usb3 at ehci0: USB revision 2.0
ppb1 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0x81)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled
cmpci0 at pci2 dev 3 function 0: vendor 0x13f6 product 0x0111 (rev. 0x10)
cmpci0: interrupting at ioapic0 pin 21
audio0 at cmpci0: full duplex, playback, capture, mmap, independent
opl0 at cmpci0: model OPL3: LR swapped
midi1 at opl0: CMPCI Yamaha OPL3
mpu0 at cmpci0
midi2 at mpu0: CMPCI MPU-401 MIDI UART
ex0 at pci2 dev 12 function 0: 3Com 3c905C-TX 10/100 Ethernet with mngmt (rev. 0x78)
ex0: interrupting at ioapic0 pin 20
ex0: MAC address [...]
exphy0 at ex0 phy 24: 3Com internal media interface
exphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ex1 at pci2 dev 14 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x30)
ex1: interrupting at ioapic0 pin 18
ex1: MAC address [...]
exphy1 at ex1 phy 24: 3Com internal media interface
exphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x24c0 (rev. 0x01)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
piixide0 at pci0 dev 31 function 1:
su hanging
Hi, I have an issue where "su" is hanging after getting the password/key. It happens with both Kerberos and normal password authentication. CTRL-C will give an "Interrupted system call". Kernel is -current amd64 GENERIC plus DEBUG/LOCKDEBUG stuff. My /etc/pam.d/su is the same as base src. openpam was updated a month ago. Anyone else have an issue? root "su" to root works OK and I obviously cannot ktrace non-root su. Regards, Geoff
-current vs MKINET6=NO
Hi,
the following files need changes to build a full tree with MKINET6=NO

external/apache2/mDNSResponder/dist/mDNSPosix/mDNSUNP.c
external/bsd/dhcpcd/dist/src/dhcpcd.c
external/bsd/dhcpcd/dist/src/if-bsd.c
external/bsd/tcpdump/bin/Makefile

mDNSUNP.c needs #include for some IFF_* definitions.
dhcpcd stuff needs quite a few changes to remove calls to ip6 stuff.
tcpdump might be better off just including ip6 stuff unconditionally (since it's parsing packets) rather than trying to detangle all the ip6 stuff. A quick eyeball didn't see any ip6 headers/libs but I didn't check fully.

I've only done quick-and-dirty patches so I'm not including them here, especially given that they're all external source stuff.

Regards,
Geoff
openssl fallout(?)
Hi,
anyone else seeing this sort of stuff?

% dig www.netbsd.org.
14-Feb-2018 16:05:39.436 ENGINE_by_id failed (crypto failure)
14-Feb-2018 16:05:39.436 error:25070067:DSO support routines:DSO_load:could not load the shared library:/usr/src/crypto/external/bsd/openssl/dist/crypto/dso/dso_lib.c:161:
14-Feb-2018 16:05:39.436 error:260B6084:engine routines:dynamic_load:dso not found:/usr/src/crypto/external/bsd/openssl/dist/crypto/engine/eng_dyn.c:414:
14-Feb-2018 16:05:39.436 error:2606A074:engine routines:ENGINE_by_id:no such engine:/usr/src/crypto/external/bsd/openssl/dist/crypto/engine/eng_list.c:339:id=gost
dig: dst_lib_init: crypto failure
%

ktrace shows near the end it opening /etc/openssl/openssl.cnf (successfully) then trying /usr/lib/openssl/gost.so (failing). ktrace of dig with previous openssl didn't look for gost.

Is this something which should be disabled in the openssl(?) config if we don't distribute it?

Regards,
Geoff
npf in -current amd64 (7 Mar 2018) now cannot use a "ruleset" multiple times
Hi,
npf previously had no issues using a "ruleset" in multiple groups, however it now has a problem and fails with

npfctl: (re)load failed: some table has a duplicate entry?

The following is a minimal npf.conf to illustrate with it failing due to the second ``ruleset "blacklistd"'' causing the issue:
-
$if1_if = inet4(vmx0)
$if2_if = inet4(vmx1)
alg "icmp"
group "foo" on $if1_if {
  ruleset "blacklistd"
}
group "bar" on $if2_if {
  ruleset "blacklistd"
}
group default {
  pass final on lo0 all
  block all
}
-

I haven't investigated further yet. Ring any bells with anyone? System is amd64 -current.

Regards,
Geoff
Re: Automated report: NetBSD-current/i386 build failure
On Saturday 2018-04-07 23:37 +1000, Andreas Gustafsson output: :The build is now failing in a different place, and the new failure did :not get reported automatically because it was hidden by the one above: Christos just fixed this (though his fix is missing an update for an unused #define around line 71 of pam_ssh.c) Regards, Geoff
panic with NPF tables and debug/lockdebug on amd64
Hi,
I'm seeing a kassert panic with NPF tables and kernel options DEBUG/LOCKDEBUG

config:
include "arch/amd64/conf/GENERIC"
options DEBUG
options LOCKDEBUG

My /etc/npf.conf has tables with type hash and type tree

Does anyone else have this configuration working currently? I'll try it on something where I can debug remotely in a couple of days.

The kassert is at the start of rw_vector_enter()

panic: kernel debugging assertion "pserialize_not_in_read_section()" failed: file "/usr/netbsd/src/sys/kern/kern_rwlock.c", line 293
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x16f
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
rw_enter() at netbsd:rw_enter+0x403
npf_table_lookup() at netbsd:npf_table_lookup+0xfb
npf_cop_table() at netbsd:npf_cop_table+0x63
?() at 81bc5b89
npf_packet_handler() at netbsd:npf_packet_handler+0x231
pfil_run_hooks() at netbsd:pfil_run_hooks+0x12a
ipintr() at netbsd:ipintr+0x4ac
softint_dispatch() at netbsd:softint_dispatch+0xee
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xc300675410f0
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---

Regards,
Geoff
Re: panic with NPF tables and debug/lockdebug on amd64
On Friday 2018-09-28 19:05 +1000, Geoff Wing output:
:Hi,
:I'm seeing a kassert panic with NPF tables and kernel options DEBUG/LOCKDEBUG
:
:config:
: include "arch/amd64/conf/GENERIC"
: options DEBUG
: options LOCKDEBUG
:

Hi,
this is with the nv changes to npf (sources 2018-10-02 06:00 UTC).

(gdb) bt
#0 0x80222da5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at /usr/netbsd/src/sys/arch/amd64/amd64/machdep.c:726
#1 0x809e1949 in vpanic (fmt=fmt@entry=0x813f4710 "LOCKDEBUG: %s error: %s,%zu: %s", ap=ap@entry=0xab8069c78948) at /usr/netbsd/src/sys/kern/subr_prf.c:335
#2 0x809e19e0 in panic (fmt=fmt@entry=0x813f4710 "LOCKDEBUG: %s error: %s,%zu: %s") at /usr/netbsd/src/sys/kern/subr_prf.c:254
#3 0x809d82d5 in lockdebug_abort1 (func=0x81281c20 <__func__.6114> "assert_sleepable", line=70, ld=0xab80092246b8, s=6, msg=0x813f4556 "spin lock held", dopanic=) at /usr/netbsd/src/sys/kern/subr_lockdebug.c:807
#4 0x80993df2 in assert_sleepable () at /usr/netbsd/src/sys/kern/kern_lock.c:70
#5 0x809df98f in pool_cache_get_paddr (pc=0x896889ddb500, flags=flags@entry=1, pap=pap@entry=0x0) at /usr/netbsd/src/sys/kern/subr_pool.c:2283
#6 0x809d4ff0 in kmem_intr_alloc (requested_size=requested_size@entry=64, kmflags=kmflags@entry=1) at /usr/netbsd/src/sys/kern/subr_kmem.c:268
#7 0x809d5257 in kmem_intr_zalloc (size=size@entry=64, kmflags=kmflags@entry=1) at /usr/netbsd/src/sys/kern/subr_kmem.c:289
#8 0x809d55d3 in kmem_zalloc (size=64, kmflags=kmflags@entry=1) at /usr/netbsd/src/sys/kern/subr_kmem.c:375
#9 0x8076dbd7 in hashmap_rehash (size=, hmap=) at /usr/netbsd/src/sys/net/npf/lpm.c:175
#10 hashmap_insert (len=4, key=0xab8069c78b30, hmap=0x8968800c91a8) at /usr/netbsd/src/sys/net/npf/lpm.c:204
#11 lpm_insert (lpm=0x8968800c9008, addr=addr@entry=0x8968828d84d8, len=len@entry=4, preflen=preflen@entry=24, val=val@entry=0x8968805daf80) at /usr/netbsd/src/sys/net/npf/lpm.c:329
#12 0x807654e9 in npf_table_insert (t=t@entry=0x89687bd74318, alen=, addr=addr@entry=0x8968828d84d8, mask=24 '\030') at /usr/netbsd/src/sys/net/npf/npf_tableset.c:536
#13 0x8076085b in npf_mk_table_entries (t=t@entry=0x89687bd74318, table=table@entry=0x896883a31150, errdict=errdict@entry=0x89687bc7b3d0) at /usr/netbsd/src/sys/net/npf/npf_ctl.c:130
#14 0x80760c14 in npf_mk_tables (npf_dict=npf_dict@entry=0x896883a31250, errdict=errdict@entry=0x89687bc7b3d0, tblsetp=tblsetp@entry=0xab8069c78cf8, npf=0x896851df8f50) at /usr/netbsd/src/sys/net/npf/npf_ctl.c:201
#15 0x80760fc2 in npfctl_load_nvlist (errdict=0x89687bc7b3d0, npf_dict=0x896883a31250, npf=0x896851df8f50) at /usr/netbsd/src/sys/net/npf/npf_ctl.c:535
#16 npfctl_load (npf=0x896851df8f50, cmd=, data=0xab8069c78ee0) at /usr/netbsd/src/sys/net/npf/npf_ctl.c:599
#17 0x80a4bf15 in VOP_IOCTL (vp=vp@entry=0x896887d43d28, command=command@entry=3222818406, data=data@entry=0xab8069c78ee0, fflag=, cred=) at /usr/netbsd/src/sys/kern/vnode_if.c:610
#18 0x80a43144 in vn_ioctl (fp=0x8968816d3480, com=3222818406, data=0xab8069c78ee0) at /usr/netbsd/src/sys/kern/vfs_vnops.c:769
#19 0x809ee05b in sys_ioctl (l=, uap=0xab8069c79000, retval=) at /usr/netbsd/src/sys/kern/sys_generic.c:671
#20 0x8024cdd5 in sy_call (rval=0xab8069c78fb0, uap=0xab8069c79000, l=0x896884386980, sy=0x81653a90 ) at /usr/netbsd/src/sys/sys/syscallvar.h:65
#21 sy_invoke (code=54, rval=0xab8069c78fb0, uap=0xab8069c79000, l=0x896884386980, sy=0x81653a90 ) at /usr/netbsd/src/sys/sys/syscallvar.h:94
#22 syscall (frame=0xab8069c79000) at /usr/netbsd/src/sys/arch/x86/x86/syscall.c:140
#23 0x802096dd in handle_syscall ()

Regards,
Geoff
"dmesg -T" date doesn't match "date" output
Hi,
dates output by "dmesg -T" are not matching real time. Using a program to generate a segfault, dmesg is showing times in the future:

# sysctl -w kern.logsigexit=1
kern.logsigexit: 0 -> 1
# ./segfault; date
[1] 18445 segmentation fault ./segfault
Sat Oct 27 17:33:56 AEDT 2018
# dmesg -T | tail -1
[Sat Oct 27 17:34:02 AEDT 2018] pid 18445 (segfault), uid 0: exited on signal 11 (core not dumped, err = 13)
Re: "dmesg -T" date doesn't match "date" output
On Saturday 2018-10-27 19:03 +0700, Robert Elz output:
:Date: Sat, 27 Oct 2018 17:39:16 +1100
:From: Geoff Wing
:Message-ID: <20181027063916.ga2...@primenet.com.au>
:
: | dates output by "dmesg -T" are not matching real time. Using a program
: | to generate a segfault dmesg is showing times in the future:
:
:dmesg times come from "seconds since boot" which is what is actually
:logged, added to boottime. The seconds since boot is, I believe, a monotonic
:counter which counts at timer rate, unadjusted for clock errors.
:
:I'd assume you're running NTP or similar to sync your clock, and that it
:is constantly slowing down your timer - apparently by 6 seconds since
:boot, which suggests that either your clock is wildly inaccurate, or that
:your system has been up for a fairly lengthy time (at least a week probably).
:
:What does dmesg say without the -T, or perhaps using -TT
:
:The dmesg man page should perhaps explain all of this a little better.

I've tried on two -current amd64 machines and both were showing similar issues. The dmesg time matches what appears in kern.boottime but I don't see a 5-6 second step in rc.log when ntpdate is run. Something fishy is going on.

# ./segfault; date; dmesg -T | tail -1; dmesg -TT | tail -1
[1] 871 segmentation fault ./segfault
Sun Oct 28 09:37:49 AEDT 2018
[Sun Oct 28 09:37:55 AEDT 2018] pid 871 (segfault), uid 0: exited on signal 11 (core not dumped, err = 1)
[PT1M13.006S] pid 871 (segfault), uid 0: exited on signal 11 (core not dumped, err = 1)
# sysctl kern.boottime
kern.boottime = Sun Oct 28 09:36:42 2018
# ntpq -p
remote refid st t when poll reach delay offset jitter
==
XX 3 s 3 128 7 0.588 -4.308 3.546
XX 3 s 7 128 3 0.228 -3.728 0.066
XX 4 s 62 128 7 0.100 -4.151 2.945
+XX 3 u 28 64 3 0.812 -2.945 2.411
*XX 3 u 80 128 1 12.870 -6.306 0.032

( from /var/run/rc.log )
[running /etc/rc.d/ntpdate]
Setting date via ntp.
28 Oct 09:37:01 ntpdate[229]: step time server XX offset 0.304666 sec

Regards,
Geoff
Re: "dmesg -T" date doesn't match "date" output
On Sunday 2018-10-28 07:19 +0700, Robert Elz output: : | The dmesg time matches what appears in kern.boottime but I don't see a 5-6 : | second step in rc.log when ntpdate is run. :You wouldn't now. The system from which you showed that output has :been up for a month. During that month, either in one jump, or more :likely, continuously, I suspect that the time has been slowed from what :your system clock source would have generated, to match NTP time. : :So, while there has been a month and 13 seconds of clock ticks, the :actual time has actually advanced just a month and 7 seconds. : :Or that is what I'd assume - but I'm no expert on NetBSD timekeeping. Ah, I just rebooted it so that's 1 min and 13 seconds, not 1 month and 13 seconds.
Re: "dmesg -T" date doesn't match "date" output
On Sunday 2018-10-28 08:32 +0700, Robert Elz output: :I don't suppose that your ToD clock is 6 seconds incorrect, and :ntpdate run from /etc/rc is fixing that (but the ToD clock isn't being :updated) ? : :if it is not that, then you're right, something weird is happening. Hi, I'm running the same -current build on two x64 machines. One is a VM and the other is bare-metal. I'm rebuilding in case something funny happened in the build and no one else can reproduce anything similar. The ntpdate in /var/run/rc.log said it updated 0.3 secs and both machines had time differences of 5-6 secs between dmesg / date commands. Regards, Geoff
Re: "dmesg -T" date doesn't match "date" output
On Sunday 2018-10-28 13:16 +1100, Geoff Wing output: :Hi, :I'm running the same -current build on two x64 machines. One is a VM :and the other is bare-metal. I'm rebuilding in case something funny :happened in the build and no one else can reproduce anything similar. : :The ntpdate in /var/run/rc.log said it updated 0.3 secs and both machines :had time differences of 5-6 secs between dmesg / date commands. I suspect it may be that boottime is being set late. My dmesg has: [ 6.730563] root on sd0a dumps on sd0b [ 6.730563] root file system type: ffs [ 6.730563] kern.module.path=/stand/amd64/8.99.25/modules From my quick look, sys/kern/init_main.c:666 initialises boottime after mounting the root file system, so "dmesg -T" is using a bad value. Regards, Geoff
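The suspicion above can be expressed as a toy model. This is only a sketch of the hypothesis, not of the kernel code: it assumes the since-boot counter starts at kernel start while boottime is captured some seconds later (here the 6.730563 stamp on the root-mount messages), and all names are hypothetical.

```python
def shown_time(real_boot, boottime_delay, since_boot):
    """Time "dmesg -T" would print if boottime were captured boottime_delay
    seconds after the since-boot counter started (hypothetical model)."""
    recorded_boottime = real_boot + boottime_delay
    return recorded_boottime + since_boot


real_boot = 0.0       # arbitrary origin, in seconds
since_boot = 73.006   # PT1M13.006S from the earlier "dmesg -TT" output
delay = 6.730563      # dmesg stamp on the root file system messages

# Difference between what "dmesg -T" would show and the true time of the event.
skew = shown_time(real_boot, delay, since_boot) - (real_boot + since_boot)
print(f"{skew:.3f}")
```

In this model the skew equals the delay before boottime is recorded, which is of the same order as the 5-6 seconds seen on both machines, though not an exact match.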
MSI/MSI-X implementation and interrupt handling on i386/amd64
Hi, brief background: on an amd64 VM (1 CPU on VMware ESXi) I had a network interface (vmx) failing because it could not get an interrupt slot. The vmx wants 3 interrupts per interface (tx/rx/link-state). I had a few on an admin machine and one started failing when ahcisata was changed to use MSI (not ahcisata's fault, obviously).

On i386/amd64 each CPU has a 32-bit mask for interrupts (1 bit per interrupt) - but 16 of the 32 are reserved for legacy IRQs (on the first CPU). MSI-X allows for 2048 interrupts. On a physical machine with many CPUs the MSI interrupts are farmed out across the different CPUs so the limit would not be apparent to most. (and no problem for those 65+ core machines).

For my personal use, I've hacked around by ignoring the reserved legacy IRQ region because it's not relevant to me in my VM with MSI/MSI-X. Other people using single CPU VMs may start bumping into this issue so I'm just making people aware. I haven't looked into changing how interrupts are handled or whether there would be a significant performance penalty.

Regards, Geoff

FYI (pin 17 is mpt0):
% intrctl list
interrupt id          CPU0  device name(s)
ioapic0 pin 9            0* acpi SCI
ioapic0 pin 1            0* pckbc1 kbd
ioapic0 pin 12           0* pckbc2 aux
ioapic0 pin 14           0* piixide0 primary
ioapic0 pin 15           0* piixide0 secondary
ioapic0 pin 17     3481843* unknown
ioapic0 pin 18          54* uhci0
ioapic0 pin 19           0* ehci0
msi0 vec 0               0* ahcisata0
msix1 vec 0          16215* vmx0: tx 0
msix1 vec 1         406335* vmx0: rx 0
msix1 vec 2              0* vmx0: link
msix2 vec 0         100571* vmx1: tx 0
msix2 vec 1         178436* vmx1: rx 0
msix2 vec 2              0* vmx1: link
msix3 vec 0         327583* vmx2: tx 0
msix3 vec 1        3141480* vmx2: rx 0
msix3 vec 2              0* vmx2: link
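To illustrate why a single-CPU VM runs out of slots, here is a toy allocator. It is a sketch only: it models one CPU with a 32-bit mask and a reserved legacy region as described above, and it is not NetBSD's actual allocation code. The function names and the count of 4 other vectors are assumptions for illustration, and a real kernel may claim further slots that are not modelled here.

```python
SLOTS_PER_CPU = 32    # one bit per interrupt source, per the description above
LEGACY_RESERVED = 16  # low slots reserved for legacy IRQs on the first CPU


def allocate(mask, count, first_free):
    """Claim `count` free bits at or above `first_free`.

    Returns (new_mask, slots), or (mask, None) if too few bits are free.
    """
    free = [b for b in range(first_free, SLOTS_PER_CPU) if not mask & (1 << b)]
    if len(free) < count:
        return mask, None
    taken = free[:count]
    for b in taken:
        mask |= 1 << b
    return mask, taken


def vmx_interfaces_that_fit(other_vectors, first_free=LEGACY_RESERVED):
    """Count how many 3-vector interfaces (tx, rx, link) fit on one CPU."""
    mask, _ = allocate(0, other_vectors, first_free)
    fitted = 0
    while True:
        mask, slots = allocate(mask, 3, first_free)
        if slots is None:
            return fitted
        fitted += 1


print(vmx_interfaces_that_fit(4))                # legacy region kept reserved
print(vmx_interfaces_that_fit(4, first_free=0))  # legacy region ignored, as in the hack
```

In this toy model, ignoring the reserved region roughly doubles the number of interfaces that fit, which is the effect of the workaround described above.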
Re: zsh crash in recent -current
On Thursday 2019-03-14 10:57 +, ci4...@gmail.com output: :Well, after installing the unstripped zsh+modules and ncurses, I no :longer get zsh any crashes. Plus, as I mentioned, there was some :jemalloc updates a couple of days ago. Hence, no idea. Hi, if you ended up configuring with --enable-zsh-mem (as you mentioned trying in a previous message) then you should not be using jemalloc at all but zsh's own memory routines, in which case you may be avoiding any use-after-free or similar issues which jemalloc may expose. Regards, Geoff
date/strftime() returning wrong timezone name
Hi, running /sbin/dmesg and /bin/date I am seeing a timezone name of "LMT" instead of my normal "AEST". Copying "date" and my zoneinfo file from a working computer, I still see bad info.

From -current (compiled myself and from nyftp snapshot):
% TZ=Australia/Melbourne date; TZ=NZ date
Wed Apr 17 16:04:16 LMT 2019
Wed Apr 17 16:04:16 LMT 2019

From a month ago:
% TZ=Australia/Melbourne date; TZ=NZ date
Wed Apr 17 16:03:54 AEST 2019
Wed Apr 17 16:03:54 NZST 2019

"LMT" was the correct timezone name 125-150 years ago for those zones. Is this reproducible for anyone else or something unique to me? I see there have been some recent changes to strftime() so perhaps those are the cause of my problem. Regards, Geoff
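One way to separate a zoneinfo data problem from a libc problem is to read the same zone files without going through libc's strftime(). The sketch below assumes Python 3.9 or later is available; its zoneinfo module parses the zone files itself. The helper name zone_abbrev is made up for illustration.

```python
from datetime import datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def zone_abbrev(zone, when):
    """Abbreviation the zone data gives for a naive local time,
    or None if the zone file cannot be found on this system."""
    try:
        return when.replace(tzinfo=ZoneInfo(zone)).tzname()
    except ZoneInfoNotFoundError:
        return None


# Same local time as in the failing "date" output above.
when = datetime(2019, 4, 17, 16, 4, 16)
for zone in ("Australia/Melbourne", "NZ"):
    print(zone, zone_abbrev(zone, when))
```

If this prints AEST and NZST while date still prints LMT, the zone files are probably fine and the library code is the more likely suspect.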
crashes in amd64 8.99.51/9.99.2 with panic: pr_find_pagehead: [npfcn4pl]
Hi, I'm getting quite a few crashes in 8.99.51/9.99.2 on amd64:
  8.99.51 + modules I built
  9.99.2 + modules from nyftp (NetBSD 9.99.2 (GENERIC) #0: Wed Jul 31 16:40:25 UTC 2019)

It seems to be related to npf, although when I booted with npf started during rc processing and then issued "/etc/rc.d/npf stop" I was still getting crashes after some time. Any ideas? Something funny going on when npf is obtaining pages? Regards, Geoff

8.99.51 crash:
panic: pr_find_pagehead: [npfcn4pl] item 0x98a0b89491b8 poolid 182 != 181
cpu1: Begin traceback...
vpanic() at netbsd:vpanic+0x160
snprintf() at netbsd:snprintf
pool_put() at netbsd:pool_put+0x6b9
pool_cache_invalidate_groups() at netbsd:pool_cache_invalidate_groups+0x71
pool_cache_invalidate() at netbsd:pool_cache_invalidate+0xd5
pool_reclaim() at netbsd:pool_reclaim+0xa7
pool_drain() at netbsd:pool_drain+0x85
uvmpd_pool_drain_thread() at netbsd:uvmpd_pool_drain_thread+0x74
cpu1: End traceback...

9.99.2 crash (doesn't have the pool_cache_invalidate() call):
panic: pr_find_pagehead: [npfcn4pl] item 0x95f81443e038 poolid 175 != 174
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x160
snprintf() at netbsd:snprintf
pool_put() at netbsd:pool_put+0x6b9
pool_cache_invalidate_groups() at netbsd:pool_cache_invalidate_groups+0x71
pool_reclaim() at netbsd:pool_reclaim+0xa7
pool_drain() at netbsd:pool_drain+0x85
uvmpd_pool_drain_thread() at netbsd:uvmpd_pool_drain_thread+0x74
cpu0: End traceback...
doc/CHANGES typo (2018)
Hi, there is a wrong year in doc/CHANGES. Regards, Geoff

Index: doc/CHANGES
===
RCS file: /cvsroot/src/doc/CHANGES,v
retrieving revision 1.2587
diff -u -r1.2587 CHANGES
--- doc/CHANGES 2 Oct 2019 11:18:55 - 1.2587
+++ doc/CHANGES 3 Oct 2019 03:37:04 -
@@ -24,7 +24,7 @@
 #
 
 Changes from NetBSD 9.0 to NetBSD 10.0:
-	openldap: Import 2.4.48. [christos 20180808]
+	openldap: Import 2.4.48. [christos 20190808]
 	usbnet(9): Add common framework for USB network devices. Port the
 		axe(4), axen(4), cdce(4), cue(4), mue(4), smsc(4), udav(4),
 		ure(4), url(4), and urndis(4) drivers fixing many bugs and
Re: httpd ssl failures
On Monday 2019-12-16 19:56 -0600, ed...@pettijohn-web.com output: :> > Certificate/key created like so: :> > openssl req -x509 -nodes -days 365 -sha256 -newkey rsa:2048 -keyout :> > mycert.pem -out mycert.pem [...] :> > Is this a problem with my setup? :> Think it may be an httpd issue. Used the cert/key with postfix and tested :> with openssl s_client and didn't see any issues. :Just tried my letsencrypt cert and key with the same results. Hi, I tried this on amd64 -current yesterday with a letsencrypt cert/key and also with a self-signed cert/key using, e.g. /usr/libexec/httpd -df -Z /tmp/test.pem /tmp/testkey.pem /www and had no problems. Maybe there was a miscompile or other issue with your httpd binary (or libs). Do you have mozilla-rootcerts installed to test the letsencrypt cert? Regards, Geoff