macppc: add ld.script for kernel, ofwboot
Hello tech list, I want macppc to switch from ld.bfd to ld.lld, but there is a problem when lld links ofwboot or the kernel. I propose to fix it by adding ld.script for both. These scripts also work with ld.bfd, so I want to commit my diff at the end of this mail, ok? lld sets the symbol "etext" to a nonsense value like 0x1034. In ofwboot, wrong "etext" causes freeze, failure to boot kernel. (Wrong "etext" in kernel doesn't cause an obvious problem.) Other lld arches use an ld.script to set a correct "etext" in kernel. I copied the ld.script from powerpc64's kernel and made these changes for macppc's kernel: - change "_start" to "start" to match macppc/locore0.S - remove PT_DYNAMIC and sections (don't exist in macppc kernel) - put .text at 0x00100114 to match Makefile - remove symbols like "_erodata" (not used by macppc kernel) - add ".rodata.*" and such, so sections and segments look correct | --- arch/powerpc64/conf/ld.script Sat Jul 18 09:16:32 2020 | +++ arch/macppc/conf/ld.scriptSat Apr 24 11:52:34 2021 | @@ -16,18 +16,17 @@ | * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. | */ | | -ENTRY(_start) | +ENTRY(start) | | PHDRS | { | text PT_LOAD; | - dynamic PT_DYNAMIC; | openbsd_randomize PT_OPENBSD_RANDOMIZE; | } | | SECTIONS | { | - . = 0x0010; | + . = 0x00100114; | .text : | { | *(.text) | @@ -35,49 +34,35 @@ | PROVIDE (etext = .); | PROVIDE (_etext = .); | | - . = ALIGN(4096); | - .rela.dyn : { *(.rela.dyn) } | - | - .dynamic : | + .rodata : | { | - *(.dynamic) | - } :dynamic :text | + *(.rodata .rodata.*) | + } :text | | - .rodata : | + .data.rel.ro : | { | - *(.rodata) | *(.data.rel.ro) | } :text | | .openbsd.randomdata : | { | - *(.openbsd.randomdata) | + *(.openbsd.randomdata .openbsd.randomdata.*) | } :openbsd_randomize :text | - PROVIDE (_erodata = .); | | - . = ALIGN(4096); | .data : | { | *(.data) | } :text | | - . = ALIGN(4096); | - .got : { *(.got) } | - .toc : { *(.toc) } | + .sbss : | + { | + *(.sbss) | + } | | - PROVIDE (__bss_start = .); | .bss : | { | *(.bss) | } | PROVIDE (end = .); | PROVIDE (_end = .); | - | - /DISCARD/ : | - { | - *(.dynsym) | - *(.dynstr) | - *(.gnu.hash) | - *(.hash) | - } | } Then I made these changes for ofwboot: - use "_start" and 0x0002 to match Makefile - remove randomdata (doesn't exist in ofwboot) | --- arch/macppc/conf/ld.scriptSat Apr 24 11:52:34 2021 | +++ arch/macppc/stand/ofwboot/ld.script Sat Apr 24 11:52:34 2021 | @@ -16,17 +16,17 @@ | * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. | */ | | -ENTRY(start) | +ENTRY(_start) | | PHDRS | { | text PT_LOAD; | - openbsd_randomize PT_OPENBSD_RANDOMIZE; | } | | SECTIONS | { | - . = 0x00100114; | + /* Must match RELOC in Makefile */ | + . = 0x0002; | .text : | { | *(.text) | @@ -43,11 +43,6 @@ | { | *(.data.rel.ro) | } :text | - | - .openbsd.randomdata : | - { | - *(.openbsd.randomdata .openbsd.randomdata.*) | - } :openbsd_randomize :text | | .data : | { Index: arch/macppc/conf/Makefile.macppc === RCS file: /cvs/src/sys/arch/macppc/conf/Makefile.macppc,v retrieving revision 1.101 diff -u -p -r1.101 Makefile.macppc --- arch/macppc/conf/Makefile.macppc28 Jan 2021 17:39:03 - 1.101 +++ arch/macppc/conf/Makefile.macppc6 May 2021 20:01:08 - @@ -51,7 +51,7 @@ DEBUG?= -g COPTIMIZE?=-O2 CFLAGS=${DEBUG} ${CWARNFLAGS} ${CMACHFLAGS} ${COPTIMIZE} ${COPTS} ${PIPE} AFLAGS=-D_LOCORE ${CMACHFLAGS} -LINKFLAGS= -N -Ttext 100114 -e start --warn-common -nopie +LINKFLAGS= -N -T ld.script --warn-common -nopie .if ${MACHINE} == "powerpc64" CFLAGS+= -m32 Index: arch/macppc/conf/ld.script === RCS file: /cvs/src/sys/arch/macppc/conf/ld.script,v retrieving revision 1.1 diff -u -p -r1.1 ld.script --- arch/macppc/conf/ld.script 13 Jun 2017 01:42:52 - 1.1 +++ arch/macppc/conf/ld.script 6 May 2021 20:01:08 - @@ -0,0 +1,68 @@ +/* $OpenBSD: ld.script,v 1.4 2020/07/18 13:16:32 kettenis Exp $*/ + +/* + * Copyright (c) 2013 Mark Kettenis + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS
macppc bsd.mp pmap's hash lock
Hello tech list, If you have a macppc with more than one cpu, I would like you to try this diff in the GENERIC.MP kernel. I am running it on a dual G5 (without radeondrm and not running X). I don't know whether I want to commit this diff. In late April, my G5's kernel froze very early during boot, while trying to map the framebuffer. The problem went away after reordering and relinking the kernel. I kept a copy of the bad kernel. I found the problem on Tuesday: __ppc_lock() crossed a page boundary. $ nm -n /bsd.crash | grep __ppc_lock $ objdump -dlr --start-ad=0x27bf8c /bsd.crash|less The disassembly had 0x27fbf8c <= __ppc_lock < 0x27c058, so it crossed pages at 0x27c000; page size = 0x1000 = 4096. On a G5, the kernel lazily faults in its own pages. The page fault at 0x27c000 inside __ppc_lock caused a recursive call to __ppc_lock. I believe that the fault happened here in __ppc_lock: s = ppc_intr_disable(); if (atomic_cas_ulong(&mpl->mpl_count, 0, 1) == 0) { membar_enter(); mpl->mpl_cpu = curcpu(); } if (mpl->mpl_cpu == curcpu()) { //--> // page fault! recursive call to __ppc_lock mpl->mpl_count++; ppc_intr_enable(s); This is bad, because the lock is not in a valid state when the page fault happens. The code tries ppc_intr_disable to protect the lock, but this doesn't disable page faults. The lock is a recursive spinlock that protects the pmap's hash table. My bad kernel tried to grab the lock to insert the 1st page of the framebuffer into the hash, and then tried to recursively grab the lock to insert page 0x27c000 into the hash. I debugged the problem by copying the bad kernel and overwriting some asm to insert some extra printf()s. The problem went away when my asm caused an earlier access to page 0x27c000. When I reorder the kernel, __ppc_lock gets a different address, and probably doesn't cross pages with a not-valid lock; but I can force a page fault this way: __asm volatile("b 1f; . = . + 4096; 1:"); If I insert this asm at my above "//-->" and compile GENERIC.MP, then it reproduces the freezing problem on my G5. The problem doesn't happen on a G3 or G4, where the kernel uses block address translation (!ppc_nobat); I copied my bad kernel to a G4 and it didn't freeze there. The problem doesn't happen in bsd.sp, where the lock doesn't exist. Our powerpc64 kernel doesn't fault in its own pages this way. I have observed the problem only with macppc on G5. In this diff, I try to fix the problem by shrinking the lock to 32 bits, and using 32-bit atomic ops to keep the lock in a valid state. This __ppc_lock is no longer the __mp_lock, but is only the pmap's hash lock, so I also delete some unused functions. --George Index: arch/powerpc/include/mplock.h === RCS file: /cvs/src/sys/arch/powerpc/include/mplock.h,v retrieving revision 1.4 diff -u -p -r1.4 mplock.h --- arch/powerpc/include/mplock.h 15 Apr 2020 08:09:00 - 1.4 +++ arch/powerpc/include/mplock.h 6 May 2021 20:01:08 - @@ -30,13 +30,13 @@ #define __USE_MI_MPLOCK /* + * __ppc_lock exists because pte_spill_r() can't use __mp_lock. * Really simple spinlock implementation with recursive capabilities. * Correctness is paramount, no fancyness allowed. */ struct __ppc_lock { - volatile struct cpu_info *mpl_cpu; - volatile long mpl_count; + volatile unsigned int mpl_bolt; }; #ifndef _LOCORE @@ -44,10 +44,6 @@ struct __ppc_lock { void __ppc_lock_init(struct __ppc_lock *); void __ppc_lock(struct __ppc_lock *); void __ppc_unlock(struct __ppc_lock *); -int __ppc_release_all(struct __ppc_lock *); -int __ppc_release_all_but_one(struct __ppc_lock *); -void __ppc_acquire_count(struct __ppc_lock *, int); -int __ppc_lock_held(struct __ppc_lock *, struct cpu_info *); #endif Index: arch/powerpc/powerpc/lock_machdep.c === RCS file: /cvs/src/sys/arch/powerpc/powerpc/lock_machdep.c,v retrieving revision 1.9 diff -u -p -r1.9 lock_machdep.c --- arch/powerpc/powerpc/lock_machdep.c 15 Apr 2020 08:09:00 - 1.9 +++ arch/powerpc/powerpc/lock_machdep.c 6 May 2021 20:01:08 - @@ -1,6 +1,7 @@ /* $OpenBSD: lock_machdep.c,v 1.9 2020/04/15 08:09:00 mpi Exp $*/ /* + * Copyright (c) 2021 George Koehler * Copyright (c) 2007 Artur Grabowski * * Permission to use, copy, modify, and distribute this software for any @@ -22,15 +23,29 @@ #include #include -#include #include +/* + * If __ppc_lock() crosses a page boundary in the kernel text, then it + * may cause a page fault (on G5 with ppc_nobat), and pte_spill_r() + * would recursively call __ppc_lock(). The lock must be in a valid + * state when the page fault happe
Re: patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit
On Thu, May 06, 2021 at 08:00:56AM -0600, Todd C. Miller wrote: > On Thu, 06 May 2021 09:32:28 +0200, Sebastien Marie wrote: > > > We already take care of such situation with __cxa_thread_atexit_impl > > (in libc/stdlib/thread_atexit.c), by keeping an additionnal reference > > on object loaded (it makes ld.so aware that it is still used and so > > dlclose() doesn't unload it). > > > > I used the same idiom for pthread_key_create() and used dlctl(3) in > > the same way with the destructor address. > > This will set STAT_NODELETE so the DSO will never really get unloaded. > That's not a problem for atexit() since the process is headed for > the exit. > > I'm less sure about using it here since we don't have a way to > unreference the DSO upon pthread_key_delete(). > > - todd I did a quick investigation on my Linux machine and there mpv seems to be using libEGL_mesa.so instead of iris_dri.so. In this case I am not seeing a call to pthread_key_create at the start of video playback (there are some other places where pthread_key_create is called from but they don't cause a problem). So, not sure what happens in Linux when iris_dri.so is used. However, the Linux implementation of pthread_key_create seems to also not increment the refcount when the destructor is set so I don't yet see how it's solved there, assuming iris_dri.so behaves identically. Regards, Anindya
Re: [Patch] Add support for Realtek RTL8111FP-CG Ethernet Controller
On Thu, May 06, 2021 at 12:38:17PM -0400, Stephen Taylor wrote: > This is my first post to the mailing list and my first time customizing > the kernel. I apologize in advance for mistakes I make. > > I have a Lenovo ThinkCenter M75n Nano IoT computer. It uses the Realtek > RTL8111FP-CG Ethernet controller, as do other similar small form factor > computers. OpenBSD 6.9 does not recognize the controller. dmesg shows: > > re0 at pci2 dev 0 function 1 "Realtek 8168" rev 0x1a: unknown ASIC > (0x5480), msi, address 00:00:00:00 > > Studying similar posts in the mailing lists, I made the changes shown > in the diffs below, compiled the kernel, and now OpenBSD recognizes > and can use the Ethernet. > > Would someone be willing to review and/or commit this? Thank you! Thanks, committed. This id is also used by RTL8117. > > > Index: re.c > === > RCS file: /cvs/src/sys/dev/ic/re.c,v > retrieving revision 1.208 > diff -u -p -u -r1.208 re.c > --- re.c 12 Dec 2020 11:48:52 - 1.208 > +++ re.c 6 May 2021 16:13:43 - > @@ -248,6 +248,7 @@ static const struct re_revision { > { RL_HWREV_8168E, "RTL8168E/8111E" }, > { RL_HWREV_8168E_VL,"RTL8168E/8111E-VL" }, > { RL_HWREV_8168EP, "RTL8168EP/8111EP" }, > + { RL_HWREV_8168FP, "RTL8168FP/8111FP" }, > { RL_HWREV_8169,"RTL8169" }, > { RL_HWREV_8169_8110SB, "RTL8169/8110SB" }, > { RL_HWREV_8169_8110SBL, "RTL8169SBL" }, > @@ -754,6 +755,7 @@ re_attach(struct rl_softc *sc, const cha > sc->rl_max_mtu = RL_JUMBO_MTU_9K; > break; > case RL_HWREV_8168EP: > + case RL_HWREV_8168FP: > case RL_HWREV_8168G: > case RL_HWREV_8168GU: > case RL_HWREV_8168H: > Index: rtl81x9reg.h > === > RCS file: /cvs/src/sys/dev/ic/rtl81x9reg.h,v > retrieving revision 1.101 > diff -u -p -u -r1.101 rtl81x9reg.h > --- rtl81x9reg.h 11 Apr 2018 08:02:18 - 1.101 > +++ rtl81x9reg.h 6 May 2021 16:13:43 - > @@ -210,6 +210,7 @@ > #define RL_HWREV_8168EP 0x5000 > #define RL_HWREV_8168GU 0x5080 > #define RL_HWREV_8168H 0x5400 > +#define RL_HWREV_8168FP 0x5480 > #define RL_HWREV_8411B 0x5c80 > #define RL_HWREV_81390x6000 > #define RL_HWREV_8139A 0x7000 > >
[Patch] Add support for Realtek RTL8111FP-CG Ethernet Controller
This is my first post to the mailing list and my first time customizing the kernel. I apologize in advance for mistakes I make. I have a Lenovo ThinkCenter M75n Nano IoT computer. It uses the Realtek RTL8111FP-CG Ethernet controller, as do other similar small form factor computers. OpenBSD 6.9 does not recognize the controller. dmesg shows: re0 at pci2 dev 0 function 1 "Realtek 8168" rev 0x1a: unknown ASIC (0x5480), msi, address 00:00:00:00 Studying similar posts in the mailing lists, I made the changes shown in the diffs below, compiled the kernel, and now OpenBSD recognizes and can use the Ethernet. Would someone be willing to review and/or commit this? Thank you! Index: re.c === RCS file: /cvs/src/sys/dev/ic/re.c,v retrieving revision 1.208 diff -u -p -u -r1.208 re.c --- re.c12 Dec 2020 11:48:52 - 1.208 +++ re.c6 May 2021 16:13:43 - @@ -248,6 +248,7 @@ static const struct re_revision { { RL_HWREV_8168E, "RTL8168E/8111E" }, { RL_HWREV_8168E_VL,"RTL8168E/8111E-VL" }, { RL_HWREV_8168EP, "RTL8168EP/8111EP" }, + { RL_HWREV_8168FP, "RTL8168FP/8111FP" }, { RL_HWREV_8169,"RTL8169" }, { RL_HWREV_8169_8110SB, "RTL8169/8110SB" }, { RL_HWREV_8169_8110SBL, "RTL8169SBL" }, @@ -754,6 +755,7 @@ re_attach(struct rl_softc *sc, const cha sc->rl_max_mtu = RL_JUMBO_MTU_9K; break; case RL_HWREV_8168EP: + case RL_HWREV_8168FP: case RL_HWREV_8168G: case RL_HWREV_8168GU: case RL_HWREV_8168H: Index: rtl81x9reg.h === RCS file: /cvs/src/sys/dev/ic/rtl81x9reg.h,v retrieving revision 1.101 diff -u -p -u -r1.101 rtl81x9reg.h --- rtl81x9reg.h11 Apr 2018 08:02:18 - 1.101 +++ rtl81x9reg.h6 May 2021 16:13:43 - @@ -210,6 +210,7 @@ #define RL_HWREV_8168EP0x5000 #define RL_HWREV_8168GU0x5080 #define RL_HWREV_8168H 0x5400 +#define RL_HWREV_8168FP0x5480 #define RL_HWREV_8411B 0x5c80 #define RL_HWREV_8139 0x6000 #define RL_HWREV_8139A 0x7000
Re: services(5): add default ftps ports
On Thu, May 06, 2021 at 06:36:52PM +0200, Mark Kettenis wrote: > > From: "Theo de Raadt" > > Date: Thu, 06 May 2021 10:26:31 -0600 > > > > Jan Klemkow wrote: > > > > > On Wed, May 05, 2021 at 12:18:43PM -0600, Theo de Raadt wrote: > > > > I would like a further justification for removing these ports from > > > > the very limited dynamic reserved space used by bindresvport. > > > > > > > > (but not by rresvport, which appears still stomp over them) > > > > > > > > For tcp, 32 of the 512 are locked out. > > > > For udp, 19. > > > > > > > > What software is actually using these ports? > > > > > > > > Is that software irrelevant these days? > > > > > > I'm working on a diff to bring ftps with libtls into our ftpd(8). There > > > is a "getaddrinfo(NULL, "ftps", &hints, &res0)" call, which uses this > > > port. Thus, I made this change. > > > > Hang on -- does the world want ftps support? I don't know, what "the world" wants. But, I want ftps. As far as I can see, ftps is the only way to bring our ftpd(8) into the 21st century. I use ftp in my private local setup. I also want to use over public internet in the future, like I did in the past. Thats why I'm working on it. > I was going to ask the same thing. I mean even with encryption the > FTP protocol still is a bad idea given all the problems with NAT > traversal and such. In don't use NAT or packet filters in my setup. With IPv6 there is no active FTP problem.
Re: remove gccism from com(4)
> Date: Thu, 6 May 2021 19:56:57 + > From: Miod Vallat > > `return f()' when f is a void function is not allowed by the C standard > but is a gcc extension. Thanks. I probably inadvertedly copied that from the read code. Committed. > Index: com.c > === > RCS file: /OpenBSD/src/sys/dev/ic/com.c,v > retrieving revision 1.173 > diff -u -p -r1.173 com.c > --- com.c 14 Aug 2020 18:14:11 - 1.173 > +++ com.c 6 May 2021 19:55:13 - > @@ -1609,9 +1609,9 @@ com_write_reg(struct com_softc *sc, bus_ > reg <<= sc->sc_reg_shift; > > if (sc->sc_reg_width == 4) > - return bus_space_write_4(sc->sc_iot, sc->sc_ioh, reg, value); > + bus_space_write_4(sc->sc_iot, sc->sc_ioh, reg, value); > else > - return bus_space_write_1(sc->sc_iot, sc->sc_ioh, reg, value); > + bus_space_write_1(sc->sc_iot, sc->sc_ioh, reg, value); > } > > #ifdef COM_CONSOLE > @@ -1636,9 +1636,9 @@ comcn_write_reg(bus_size_t reg, uint8_t > reg <<= comcons_reg_shift; > > if (comcons_reg_width == 4) > - return bus_space_write_4(comconsiot, comconsioh, reg, value); > + bus_space_write_4(comconsiot, comconsioh, reg, value); > else > - return bus_space_write_1(comconsiot, comconsioh, reg, value); > + bus_space_write_1(comconsiot, comconsioh, reg, value); > } > > #endif > >
Re: services(5): add default ftps ports
On Wed, May 05, 2021 at 12:18:43PM -0600, Theo de Raadt wrote: > I would like a further justification for removing these ports from > the very limited dynamic reserved space used by bindresvport. > > (but not by rresvport, which appears still stomp over them) > > For tcp, 32 of the 512 are locked out. > For udp, 19. > > What software is actually using these ports? > > Is that software irrelevant these days? I'm working on a diff to bring ftps with libtls into our ftpd(8). There is a "getaddrinfo(NULL, "ftps", &hints, &res0)" call, which uses this port. Thus, I made this change. > Jan Klemkow wrote: > > On Wed, May 05, 2021 at 11:09:12AM +0100, Stuart Henderson wrote: > > > On 2021/05/04 12:07, Jan Klemkow wrote: > > > > Add missing ftps defaults ports to servies(5). > > > > > > > > Index: services > > > > === > > > > RCS file: /cvs/src/etc/services,v > > > > retrieving revision 1.99 > > > > diff -u -p -r1.99 services > > > > --- services18 Feb 2021 02:30:29 - 1.99 > > > > +++ services4 May 2021 10:01:35 - > > > > @@ -318,6 +318,10 @@ krb_prop 754/tcp hprop # > > > > Kerberos slav > > > > krbupdate 760/tcp kreg# BSD Kerberos > > > > registration > > > > supfilesrv 871/tcp # SUP server > > > > swat 901/tcp # Samba Web > > > > Administration Tool > > > > +ftps-data 989/tcp # ftp data over TLS/SSL > > > > +ftps-data 989/udp # ftp data over TLS/SSL > > > > +ftps 990/tcp # ftp control over > > > > TLS/SSL > > > > +ftps 990/udp # ftp control over > > > > TLS/SSL > > > > > > I'm OK with adding the TCP ones (though ftp-over-tls always makes me > > > want to rant...). It's not going to run on UDP though so I think those > > > should not be added. > > > > OK? > > > > Index: services > > === > > RCS file: /cvs/src/etc/services,v > > retrieving revision 1.99 > > diff -u -p -r1.99 services > > --- services18 Feb 2021 02:30:29 - 1.99 > > +++ services5 May 2021 12:24:29 - > > @@ -318,6 +318,8 @@ krb_prop754/tcp hprop # > > Kerberos slav > > krbupdate 760/tcp kreg# BSD Kerberos registration > > supfilesrv 871/tcp # SUP server > > swat 901/tcp # Samba Web > > Administration Tool > > +ftps-data 989/tcp # ftp data over TLS > > +ftps 990/tcp # ftp control over TLS > > supfiledbg 1127/tcp# SUP debugging > > support1529/tcp# GNATS, cygnus bug > > tracker > > datametrics1645/udp > > >
remove gccism from com(4)
`return f()' when f is a void function is not allowed by the C standard but is a gcc extension. Index: com.c === RCS file: /OpenBSD/src/sys/dev/ic/com.c,v retrieving revision 1.173 diff -u -p -r1.173 com.c --- com.c 14 Aug 2020 18:14:11 - 1.173 +++ com.c 6 May 2021 19:55:13 - @@ -1609,9 +1609,9 @@ com_write_reg(struct com_softc *sc, bus_ reg <<= sc->sc_reg_shift; if (sc->sc_reg_width == 4) - return bus_space_write_4(sc->sc_iot, sc->sc_ioh, reg, value); + bus_space_write_4(sc->sc_iot, sc->sc_ioh, reg, value); else - return bus_space_write_1(sc->sc_iot, sc->sc_ioh, reg, value); + bus_space_write_1(sc->sc_iot, sc->sc_ioh, reg, value); } #ifdef COM_CONSOLE @@ -1636,9 +1636,9 @@ comcn_write_reg(bus_size_t reg, uint8_t reg <<= comcons_reg_shift; if (comcons_reg_width == 4) - return bus_space_write_4(comconsiot, comconsioh, reg, value); + bus_space_write_4(comconsiot, comconsioh, reg, value); else - return bus_space_write_1(comconsiot, comconsioh, reg, value); + bus_space_write_1(comconsiot, comconsioh, reg, value); } #endif
Re: services(5): add default ftps ports
Jan Klemkow wrote: > > > > I'm working on a diff to bring ftps with libtls into our ftpd(8). There > > > > is a "getaddrinfo(NULL, "ftps", &hints, &res0)" call, which uses this > > > > port. Thus, I made this change. > > > > > > Hang on -- does the world want ftps support? > > I don't know, what "the world" wants. But, I want ftps. As far as I > can see, ftps is the only way to bring our ftpd(8) into the 21st > century. I have a really hard time with that. The protocol is completely broken, and in a way that adding TLS makes it even worse.
Re: rpki-client don't clobber errno in mkpath
I'm confused. Who has a free() that clobbers errno? Claudio Jeker wrote: > Noticed while looking at the same version in rsync. free() may clobber > errno so better save the value before calling free(). > Also update the comment, remove all those arguments I removed :) > > -- > :wq Claudio > > Index: mkdir.c > === > RCS file: /cvs/src/usr.sbin/rpki-client/mkdir.c,v > retrieving revision 1.6 > diff -u -p -r1.6 mkdir.c > --- mkdir.c 29 Mar 2021 04:01:17 - 1.6 > +++ mkdir.c 6 May 2021 16:41:16 - > @@ -39,15 +39,13 @@ > > /* > * mkpath -- create directories. > - * path - path > - * mode - file mode of terminal directory > - * dir_mode - file mode of intermediate directories > + * dir - path to create directories for > */ > int > mkpath(const char *dir) > { > char *path, *slash; > - int done; > + int done, save_errno; > > if ((path = strdup(dir)) == NULL) > return -1; > @@ -61,7 +59,9 @@ mkpath(const char *dir) > *slash = '\0'; > > if (mkdir(path, 0755) == -1 && errno != EEXIST) { > + save_errno = errno; > free(path); > + errno = save_errno; > return -1; > } > >
rpki-client don't clobber errno in mkpath
Noticed while looking at the same version in rsync. free() may clobber errno so better save the value before calling free(). Also update the comment, remove all those arguments I removed :) -- :wq Claudio Index: mkdir.c === RCS file: /cvs/src/usr.sbin/rpki-client/mkdir.c,v retrieving revision 1.6 diff -u -p -r1.6 mkdir.c --- mkdir.c 29 Mar 2021 04:01:17 - 1.6 +++ mkdir.c 6 May 2021 16:41:16 - @@ -39,15 +39,13 @@ /* * mkpath -- create directories. - * path - path - * mode - file mode of terminal directory - * dir_mode - file mode of intermediate directories + * dir - path to create directories for */ int mkpath(const char *dir) { char *path, *slash; - int done; + int done, save_errno; if ((path = strdup(dir)) == NULL) return -1; @@ -61,7 +59,9 @@ mkpath(const char *dir) *slash = '\0'; if (mkdir(path, 0755) == -1 && errno != EEXIST) { + save_errno = errno; free(path); + errno = save_errno; return -1; }
Re: timeout_del_barrier(9): remove kernel lock
On Wed, May 05, 2021 at 11:05:06AM +1000, David Gwynne wrote: > On Tue, May 04, 2021 at 11:54:31AM -0500, Scott Cheloha wrote: > > > > [...] > > > > Here is where I get confused. > > > > Why do I want to wait for *all* timeouts in the queue to finish > > running? > > You don't, you just want to know that whatever timeout was running > has finished cos you don't know if that currently running timeout is > yours or not. > > > My understanding of the timeout_del_barrier(9) use case was as > > follows: > > > > 1. I have a dynamically allocated timeout struct. The timeout is > >scheduled to execute at some point in the future. > > > > 2. I want to free the memory allocated to for the timeout. > > > > 3. To safely free the memory I need to ensure the timeout > >is not executing. Until the timeout function, i.e. > >to->to_func(), has returned it isn't necessarily safe to > >free that memory. > > > > 4. If I call timeout_del_barrier(9), it will only return if the > >timeout in question is not running. Assuming the timeout itself > >is not a periodic timeout that reschedules itself we can then > >safely free the memory. > > Barriers aren't about references to timeouts, they're about references > to the work associated with a timeout. > > If you only cared about the timeout struct itself, then you can get > all the information about whether it's referenced by the timeout > queues from the return values from timeout_add and timeout_del. > timeout_add() returns 1 if the timeout subsystem takes the reference > to the timeout. If the timeout is already scheduled it returns 0 > because it's already got a reference. timeout_del returns 1 if the > timeout itself was scheduled and it removed it, therefore giving > up it's reference. If you timeout_del and it returns 0, then it > wasn't on a queue inside timeouts. Easy. > > What timeout_add and timeout_del don't tell you is whether the work > referenced by the timeout is currently running. The timeout runners > copy the function and argument onto the stack when they dequeue a > timeout to run, and give up the reference to the timeout when the > mutex around the timeout wheels/cirqs is released. However, the argument > is still referenced on the stack and the function to process it may be > about to run or is running. You can't tell that from the timeout_add/del > return values though. > > We provide two ways to deal with that. One is you have reference > counters (or similar) on the thing you're deferring to a timeout. > The other is you use a barrier so you know the work you deferred > isn't on the timeout runners stack anymore because it's moved on > to run the barrier work. > > This is consistent with tasks and interrupts. > > > Given this understanding, my approach was to focus on the timeout in > > question. So my code just spins until it the timeout is no longer > > running. > > > > But now I think I am mistaken. > > > > IIUC you're telling me (and showing me, in code) that the goal of > > timeout_barrier() is to wait for the *whole* softclock() to return, > > not just the one timeout. > > No, just the currently running timeout. > > Putting the barrier timeout on the end of the proc and todo cirqs > is because there's no CIRCQ_INSERT_HEAD I can use right now. > > > Why do I want to wait for the whole softclock() or softclock_thread()? > > Why not just wait for the one timeout to finish? > > Cos I was too lazy to write CIRCQ_INSERT_HEAD last night :D Okay. After thinking this over I'm pretty sure we are skinning the same cat in two different ways. Your cond_wait(9) approach is fine for both timeout types because timeout_del_barrier(9) is only safe to call from process context. When I wrote my first diff I was under the impression it was safe to call timeout_del_barrier(9) from interrupt context. But I reread the manpage and now I see that this is not the case, my bad. > [...] > > This discussion has some relevance to taskqs too. I was also lazy when I > implemented taskq_barrier and used task_add to put the barrier onto the > list of work, which meant it got added to the end. Now DRM relies on > this behaviour. Maybe I should have pushed to commit the diff below. > > This diff lets taskq barriers run as soon as possible, but adds compat > to the drm workqueue stuff so they can wait for all pending work to > finish as they expect. > > P.S. don't call timeout_del from inside a timeout handler. I still think > timeout_set_proc was a mistake. > > [...] Let's pick this whole train of thought (taskq, etc.) up in a different thread. One thing at a time. -- Updated patch: - Keep timeout_barrier(9). - In timeout_barrier(9) use the barrier timeout for both normal and process-context timeouts. All callers use cond_wait(9) now. - Remove the kernel lock from the non-process path. I assume I don't need to splx(splschedclock()) anymore? - Rename "timeout_proc_barrier()" to "timeout_barrier_timeout()" because
Re: services(5): add default ftps ports
> From: "Theo de Raadt" > Date: Thu, 06 May 2021 10:26:31 -0600 > > Jan Klemkow wrote: > > > On Wed, May 05, 2021 at 12:18:43PM -0600, Theo de Raadt wrote: > > > I would like a further justification for removing these ports from > > > the very limited dynamic reserved space used by bindresvport. > > > > > > (but not by rresvport, which appears still stomp over them) > > > > > > For tcp, 32 of the 512 are locked out. > > > For udp, 19. > > > > > > What software is actually using these ports? > > > > > > Is that software irrelevant these days? > > > > I'm working on a diff to bring ftps with libtls into our ftpd(8). There > > is a "getaddrinfo(NULL, "ftps", &hints, &res0)" call, which uses this > > port. Thus, I made this change. > > Hang on -- does the world want ftps support? I was going to ask the same thing. I mean even with encryption the FTP protocol still is a bad idea given all the problems with NAT traversal and such.
Re: services(5): add default ftps ports
Jan Klemkow wrote: > On Wed, May 05, 2021 at 12:18:43PM -0600, Theo de Raadt wrote: > > I would like a further justification for removing these ports from > > the very limited dynamic reserved space used by bindresvport. > > > > (but not by rresvport, which appears still stomp over them) > > > > For tcp, 32 of the 512 are locked out. > > For udp, 19. > > > > What software is actually using these ports? > > > > Is that software irrelevant these days? > > I'm working on a diff to bring ftps with libtls into our ftpd(8). There > is a "getaddrinfo(NULL, "ftps", &hints, &res0)" call, which uses this > port. Thus, I made this change. Hang on -- does the world want ftps support?
more rsync cleanup
As noticed by benno@ the blk.blks buffer is leaked in some cases. Fix those and cleanup up the pre_* functions a bit more. I increased the diff context a bit to make the diff easier to read. -- :wq Claudio Index: uploader.c === RCS file: /cvs/src/usr.bin/rsync/uploader.c,v retrieving revision 1.25 diff -u -p -U6 -r1.25 uploader.c --- uploader.c 6 May 2021 07:35:22 - 1.25 +++ uploader.c 6 May 2021 15:34:36 - @@ -191,22 +191,24 @@ pre_link(struct upload *p, struct sess * * to overwriting with a symbolic link. * If it's a non-directory, we just overwrite it. */ assert(p->rootfd != -1); rc = fstatat(p->rootfd, f->path, &st, AT_SYMLINK_NOFOLLOW); + + if (rc == -1 && errno != ENOENT) { + ERR("%s: fstatat", f->path); + return -1; + } if (rc != -1 && !S_ISLNK(st.st_mode)) { if (S_ISDIR(st.st_mode) && unlinkat(p->rootfd, f->path, AT_REMOVEDIR) == -1) { ERR("%s: unlinkat", f->path); return -1; } rc = -1; - } else if (rc == -1 && errno != ENOENT) { - ERR("%s: fstatat", f->path); - return -1; } /* * If the symbolic link already exists, then make sure that it * points to the correct place. */ @@ -294,22 +296,23 @@ pre_dev(struct upload *p, struct sess *s * If it replaces a directory, remove the directory first. */ assert(p->rootfd != -1); rc = fstatat(p->rootfd, f->path, &st, AT_SYMLINK_NOFOLLOW); + if (rc == -1 && errno != ENOENT) { + ERR("%s: fstatat", f->path); + return -1; + } if (rc != -1 && !(S_ISBLK(st.st_mode) || S_ISCHR(st.st_mode))) { if (S_ISDIR(st.st_mode) && unlinkat(p->rootfd, f->path, AT_REMOVEDIR) == -1) { ERR("%s: unlinkat", f->path); return -1; } rc = -1; - } else if (rc == -1 && errno != ENOENT) { - ERR("%s: fstatat", f->path); - return -1; } /* Make sure existing device is of the correct type. */ if (rc != -1) { if ((f->st.mode & (S_IFCHR|S_IFBLK)) != @@ -382,22 +385,23 @@ pre_fifo(struct upload *p, struct sess * * mark it from replacement. */ assert(p->rootfd != -1); rc = fstatat(p->rootfd, f->path, &st, AT_SYMLINK_NOFOLLOW); + if (rc == -1 && errno != ENOENT) { + ERR("%s: fstatat", f->path); + return -1; + } if (rc != -1 && !S_ISFIFO(st.st_mode)) { if (S_ISDIR(st.st_mode) && unlinkat(p->rootfd, f->path, AT_REMOVEDIR) == -1) { ERR("%s: unlinkat", f->path); return -1; } rc = -1; - } else if (rc == -1 && errno != ENOENT) { - ERR("%s: fstatat", f->path); - return -1; } if (rc == -1) { newfifo = 1; if (mktemplate(&temp, f->path, sess->opts->recursive) == -1) { ERRX1("mktemplate"); @@ -458,22 +462,23 @@ pre_sock(struct upload *p, struct sess * * mark it from replacement. */ assert(p->rootfd != -1); rc = fstatat(p->rootfd, f->path, &st, AT_SYMLINK_NOFOLLOW); + if (rc == -1 && errno != ENOENT) { + ERR("%s: fstatat", f->path); + return -1; + } if (rc != -1 && !S_ISSOCK(st.st_mode)) { if (S_ISDIR(st.st_mode) && unlinkat(p->rootfd, f->path, AT_REMOVEDIR) == -1) { ERR("%s: unlinkat", f->path); return -1; } rc = -1; - } else if (rc == -1 && errno != ENOENT) { - ERR("%s: fstatat", f->path); - return -1; } if (rc == -1) { newsock = 1; if (mktemplate(&temp, f->path, sess->opts->recursive) == -1) { ERRX1("mktemplate"); @@ -530,13 +535,14 @@ pre_dir(const struct upload *p, struct s assert(p->rootfd != -1); rc = fstatat(p->rootfd, f->path, &st, AT_SYMLINK_NOFOLLOW); if (rc == -1 && errno != ENOENT) { ERR("%s: fstatat", f->path); return -1; - } else if (rc != -1 && !S_ISDIR(st.st_mode)) { + } + if (rc != -1 && !S_ISDIR(st.st_mode)) { ERRX("%s: not a directory", f->path); return -1; } else if (rc != -1) { /* * FIXME: we should fchmod the permissions here as well, * as we may locally have shut down writing int
Re: patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit
On Thu, 06 May 2021 09:32:28 +0200, Sebastien Marie wrote: > We already take care of such situation with __cxa_thread_atexit_impl > (in libc/stdlib/thread_atexit.c), by keeping an additionnal reference > on object loaded (it makes ld.so aware that it is still used and so > dlclose() doesn't unload it). > > I used the same idiom for pthread_key_create() and used dlctl(3) in > the same way with the destructor address. This will set STAT_NODELETE so the DSO will never really get unloaded. That's not a problem for atexit() since the process is headed for the exit. I'm less sure about using it here since we don't have a way to unreference the DSO upon pthread_key_delete(). - todd
Re: patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit
On Thu, May 06, 2021 at 09:32:28AM +0200, Sebastien Marie wrote: > Hi, > > Anindya, did a good analysis of the problem with mpv using gpu video > output backend (it is using EGL and mesa if I correctly followed). > > > For people not reading ports@ here a resume: the destructor function > used in pthread_key_create() needs to be present in memory until > _rthread_tls_destructors() is called. > > in the case of mesa, eglInitialize() function could load, via > dlopen(), code which will use pthread_key_create() with destructor. > > once dlclose() is called, the object is unloaded from memory, but a > reference to destructor is kept, leading to segfault when > _rthread_tls_destructors() run and use the destructor (because > pointing to unloaded code). > > We already take care of such situation with __cxa_thread_atexit_impl > (in libc/stdlib/thread_atexit.c), by keeping an additionnal reference > on object loaded (it makes ld.so aware that it is still used and so > dlclose() doesn't unload it). > > I used the same idiom for pthread_key_create() and used dlctl(3) in > the same way with the destructor address. > > With the following diff, I am not able to reproduce the segfault > anymore with mpv. > > diff 393e7b397988bb6abe46729de1794883d2b9d5cf /home/semarie/repos/openbsd/src > blob - 58b39bb7df70f54c708d2e2b11a3978806a86005 > file + lib/libc/thread/rthread_tls.c > --- lib/libc/thread/rthread_tls.c > +++ lib/libc/thread/rthread_tls.c > @@ -22,10 +22,8 @@ > #include > #include > #include > +#include > > -#include > -#include > - > #include "rthread.h" > > > @@ -58,6 +56,9 @@ pthread_key_create(pthread_key_t *key, void (*destruct > rkeys[hint].used = 1; > rkeys[hint].destructor = destructor; > > + if (destructor != NULL) > + dlctl(NULL, DL_REFERENCE, destructor); > + > *key = hint++; > if (hint >= PTHREAD_KEYS_MAX) > hint = 0; > > > I am still unsure if we want to solve this problem this way, and if > this diff would introduce more problems than it tries to solve. > > At first place, it might be better to not register a destructor when > using dlopen()... > > Comments would be appreciated. > > Thanks. > -- > Sebastien Marie > > > On Wed, May 05, 2021 at 01:20:02AM -0700, Anindya Mukherjee wrote: > > Hi, > > > > I have been investigating the crash on exit problem with mpv in ports > > with vo=gpu. I think I made a little bit of progress and thought I'd > > share my findings. > > > > The crash (SIGSEGV) happens when thread local destructors > > are called from /usr/src/lib/libc/thread/rthread_tls.c:182 in > > _rthread_tls_destructors after the gpu thread exits: vo_thread in > > video/out/vo.c:1067. The crashing call stack looks like this: > > > > #0 0x0176ffdc9680 in ?? () > > #1 0x017748d347b5 in _rthread_tls_destructors (thread=0x17798917840) > > at /usr/src/lib/libc/thread/rthread_tls.c:182 > > #2 0x017748d98623 in _libc_pthread_exit (retval= > variable: Unhandled dwarf expression opcode 0xa3>) at > > /usr/src/lib/libc/thread/rthread.c:150 > > #3 0x017795b22189 in _rthread_start (v= > Unhandled dwarf expression opcode 0xa3>) at > > /usr/src/lib/librthread/rthread.c:97 > > #4 0x017748d0c5ba in __tfork_thread () at > > /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84 > > > > Note that some of the traces were taken from different runs so there > > might be some mismatch between the handles/addresses. > > > > It crashes because the destructor is dangling. This mystified me because > > if I look at the mpv source, there is no thread local data for the gpu > > thread. Indeed, right after the gpu thread starts running if we look > > inside the thread structure, local_storage is null. However, if we look > > at the same thread at the point of the crash, its local_storage is > > populated: > > > > (gdb) p *(*thread).local_storage > > $3 = { > > keyid = 7, > > next = 0x177353442e0, > > data = 0x1770c276000 > > } > > > > The keys are indexed by the keyid in the rkeys array, from where the > > destructor is fetched in _rthread_tls_destructors: > > > > (gdb) p rkeys[7] > > $6 = { > > used = 1, > > destructor = 0x43dd33e2680 > > } > > > > This destructor now points to invalid memory. It turns out the thread > > local storage is being initialised here: > > > > #0 _libc_pthread_key_create (key=0x43dd380da08, destructor=0x43dd33e2680) > > at /usr/src/lib/libc/thread/rthread_tls.c:42 > > #1 0x043dd33e2667 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > > #2 0x043e793f82f7 in pthread_once (once_control=0x43dd380d9f8, > > init_routine=0x43e793db3c0 <_libc_pthread_key_create>) at > > /usr/src/lib/libc/thread/rthread_once.c:26 > > #3 0x043dd33e24bd in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > > #4 0x043dd305475f in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > > #5 0x043dd3036c70 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > > #6
patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit
Hi, Anindya, did a good analysis of the problem with mpv using gpu video output backend (it is using EGL and mesa if I correctly followed). For people not reading ports@ here a resume: the destructor function used in pthread_key_create() needs to be present in memory until _rthread_tls_destructors() is called. in the case of mesa, eglInitialize() function could load, via dlopen(), code which will use pthread_key_create() with destructor. once dlclose() is called, the object is unloaded from memory, but a reference to destructor is kept, leading to segfault when _rthread_tls_destructors() run and use the destructor (because pointing to unloaded code). We already take care of such situation with __cxa_thread_atexit_impl (in libc/stdlib/thread_atexit.c), by keeping an additionnal reference on object loaded (it makes ld.so aware that it is still used and so dlclose() doesn't unload it). I used the same idiom for pthread_key_create() and used dlctl(3) in the same way with the destructor address. With the following diff, I am not able to reproduce the segfault anymore with mpv. diff 393e7b397988bb6abe46729de1794883d2b9d5cf /home/semarie/repos/openbsd/src blob - 58b39bb7df70f54c708d2e2b11a3978806a86005 file + lib/libc/thread/rthread_tls.c --- lib/libc/thread/rthread_tls.c +++ lib/libc/thread/rthread_tls.c @@ -22,10 +22,8 @@ #include #include #include +#include -#include -#include - #include "rthread.h" @@ -58,6 +56,9 @@ pthread_key_create(pthread_key_t *key, void (*destruct rkeys[hint].used = 1; rkeys[hint].destructor = destructor; + if (destructor != NULL) + dlctl(NULL, DL_REFERENCE, destructor); + *key = hint++; if (hint >= PTHREAD_KEYS_MAX) hint = 0; I am still unsure if we want to solve this problem this way, and if this diff would introduce more problems than it tries to solve. At first place, it might be better to not register a destructor when using dlopen()... Comments would be appreciated. Thanks. -- Sebastien Marie On Wed, May 05, 2021 at 01:20:02AM -0700, Anindya Mukherjee wrote: > Hi, > > I have been investigating the crash on exit problem with mpv in ports > with vo=gpu. I think I made a little bit of progress and thought I'd > share my findings. > > The crash (SIGSEGV) happens when thread local destructors > are called from /usr/src/lib/libc/thread/rthread_tls.c:182 in > _rthread_tls_destructors after the gpu thread exits: vo_thread in > video/out/vo.c:1067. The crashing call stack looks like this: > > #0 0x0176ffdc9680 in ?? () > #1 0x017748d347b5 in _rthread_tls_destructors (thread=0x17798917840) at > /usr/src/lib/libc/thread/rthread_tls.c:182 > #2 0x017748d98623 in _libc_pthread_exit (retval= Unhandled dwarf expression opcode 0xa3>) at > /usr/src/lib/libc/thread/rthread.c:150 > #3 0x017795b22189 in _rthread_start (v= Unhandled dwarf expression opcode 0xa3>) at > /usr/src/lib/librthread/rthread.c:97 > #4 0x017748d0c5ba in __tfork_thread () at > /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84 > > Note that some of the traces were taken from different runs so there > might be some mismatch between the handles/addresses. > > It crashes because the destructor is dangling. This mystified me because > if I look at the mpv source, there is no thread local data for the gpu > thread. Indeed, right after the gpu thread starts running if we look > inside the thread structure, local_storage is null. However, if we look > at the same thread at the point of the crash, its local_storage is > populated: > > (gdb) p *(*thread).local_storage > $3 = { > keyid = 7, > next = 0x177353442e0, > data = 0x1770c276000 > } > > The keys are indexed by the keyid in the rkeys array, from where the > destructor is fetched in _rthread_tls_destructors: > > (gdb) p rkeys[7] > $6 = { > used = 1, > destructor = 0x43dd33e2680 > } > > This destructor now points to invalid memory. It turns out the thread > local storage is being initialised here: > > #0 _libc_pthread_key_create (key=0x43dd380da08, destructor=0x43dd33e2680) at > /usr/src/lib/libc/thread/rthread_tls.c:42 > #1 0x043dd33e2667 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #2 0x043e793f82f7 in pthread_once (once_control=0x43dd380d9f8, > init_routine=0x43e793db3c0 <_libc_pthread_key_create>) at > /usr/src/lib/libc/thread/rthread_once.c:26 > #3 0x043dd33e24bd in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #4 0x043dd305475f in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #5 0x043dd3036c70 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #6 0x043dd30e3ca3 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #7 0x043dd30e4b96 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #8 0x043dd302d89a in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #9 0x043dd3031162 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so > #10 0x043dd3031