Re: libc/libsys split coming soon
On Sat, Feb 03, 2024 at 11:10:14AM -0700, Warner Losh wrote:
> On Sat, Feb 3, 2024, 11:02 AM Konstantin Belousov wrote:
> > On Sat, Feb 03, 2024 at 11:05:10AM -0500, Daniel Eischen wrote:
> > > Will this break binary compatibility with older programs expecting those
> > > symbols in libc and not linked to libsys?
> >
> > As was mentioned, libc filters libsys. This means that libc exports all
> > the same symbols as before, but forwards the implementation to libsys.
> > For apps nothing changes, the introduction of libsys is (should be) ABI
> > compatible.
> >
> > More, I would state that no binaries wanting to stay binary-compatible
> > with future FreeBSD should link to libsys directly, at least for now.
>
> How do you view Golang or Rust run times using this then? They try to avoid
> libc today.

The Go runtime issues syscalls directly (using CPU instructions). I believe
this is true even when a Go-compiled binary is linked against libc to provide
C FFI.

Rust does use libc to get system services.

No changes there.

Or is your question about switching Go and Rust to using libsys directly?
Then Rust does not need that. For Go, indeed, using either libc or libsys
instead of static linking and directly issuing syscalls would be better.
Right now a Go binary needs to understand all the small details of the
difference between the C wrappers' ABI and real syscalls. They also rewrote,
e.g., the gettime() code from the vdso in their runtime.

> Warner
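To make the distinction above concrete, here is a minimal sketch contrasting a call made by syscall number with a call through the named libc wrapper. It is an illustration only: Python cannot issue the trap instruction itself, so the "by number" path still goes through libc's generic syscall() function, and the small SYS_GETPID table is an assumption that covers just a few platforms.

```python
import ctypes
import os
import platform

# getpid syscall numbers for a few platforms (assumed table, not exhaustive).
# Keeping such tables correct is the kind of detail a runtime takes on once
# it stops using the libc wrappers.
SYS_GETPID = {
    ("FreeBSD", "amd64"): 20,
    ("Linux", "x86_64"): 39,
    ("Linux", "aarch64"): 172,
}


def getpid_by_number():
    """Call getpid by number via libc's generic syscall() function.

    Returns None on platforms missing from SYS_GETPID.
    """
    number = SYS_GETPID.get((platform.system(), platform.machine()))
    if number is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.syscall(ctypes.c_long(number))


def getpid_by_wrapper():
    """Call the named wrapper, which hides the number and calling convention."""
    return os.getpid()


if __name__ == "__main__":
    print(getpid_by_number(), getpid_by_wrapper())
```

On a platform listed in the table both functions should report the same process ID; elsewhere the first returns None rather than guessing a number.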
Re: libc/libsys split coming soon
On Sat, Feb 3, 2024, 11:02 AM Konstantin Belousov wrote:
> On Sat, Feb 03, 2024 at 11:05:10AM -0500, Daniel Eischen wrote:
> > Will this break binary compatibility with older programs expecting those
> > symbols in libc and not linked to libsys?
>
> As was mentioned, libc filters libsys. This means that libc exports all
> the same symbols as before, but forwards the implementation to libsys.
> For apps nothing changes, the introduction of libsys is (should be) ABI
> compatible.
>
> More, I would state that no binaries wanting to stay binary-compatible
> with future FreeBSD should link to libsys directly, at least for now.

How do you view Golang or Rust run times using this then? They try to avoid
libc today.

Warner
Re: libc/libsys split coming soon
On Sat, Feb 03, 2024 at 11:05:10AM -0500, Daniel Eischen wrote:
> Will this break binary compatibility with older programs expecting those
> symbols in libc and not linked to libsys?

As was mentioned, libc filters libsys. This means that libc exports all
the same symbols as before, but forwards the implementation to libsys.
For apps nothing changes, the introduction of libsys is (should be) ABI
compatible.

More, I would state that no binaries wanting to stay binary-compatible
with future FreeBSD should link to libsys directly, at least for now.
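The forwarding described above can be pictured with a toy model. This is only a sketch under the assumption of auxiliary-filter semantics (a symbol defined in both libraries resolves to the filtee, anything else falls back to the filter); the dictionaries and their contents are hypothetical stand-ins, not the real export lists of libc or libsys.

```python
def resolve(symbol, filter_lib, filtee):
    """Resolve `symbol` as an auxiliary filter would, in this toy model.

    `filter_lib` and `filtee` are plain dicts mapping symbol names to
    implementations; they stand in for libc and libsys respectively.
    """
    if symbol not in filter_lib:
        # Programs only ever see the filter's export list.
        raise KeyError(f"{symbol} is not exported by the filter")
    if symbol in filtee:
        # Defined by both: the filtee's implementation wins.
        return filtee[symbol]
    # Defined only by the filter: use its own implementation.
    return filter_lib[symbol]


# Hypothetical contents, for illustration only.
libsys = {"getpid": "libsys:getpid"}
libc = {"getpid": "libc:stub", "printf": "libc:printf"}

if __name__ == "__main__":
    print(resolve("getpid", libc, libsys))
    print(resolve("printf", libc, libsys))
```

The point of the model is that a program linked only against the filter keeps seeing the same set of names, which is why existing binaries are expected to keep working.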
Re: libc/libsys split coming soon
On Sat, Feb 03, 2024 at 12:12:35PM +0100, Mateusz Guzik wrote:
> On 2/3/24, David Chisnall wrote:
> > On 3 Feb 2024, at 09:15, Mateusz Guzik wrote:
> >>
> >> Binary startup is very slow, for example execve of a hello world
> >> binary in a Linux-based chroot on FreeBSD is faster by a factor of 2
> >> compared to a native one. As such perf-wise this looks like a step in
> >> the wrong direction.

It is the right change to improve modularity and the structure of the code.

> > Have you profiled this? Is the Linux version using BIND_NOW (which comes
> > with a load of problems, but is often the default for Linux systems and
> > reduces the number of slow-path entries into rtld)? Do they trigger the
> > same number of CoW faults? Is there a path in rtld that’s slower than the
> > equivalent ld-linux.so path?

The Linux version probably benefits from pre-linking, which might have the
side effect of changing the semantics as if BIND_NOW were activated.

> I only profiled FreeBSD, it was 4 years ago. I have neither time nor
> interest in working on this.
>
> Relevant excerpts from profiling an fexecve loop:
>
> Sampling what syscalls were being executed when in kernel mode
> (or trap):
>
> syscalls:
>     pread          108
>     fstat          162
>     issetugid      250
>     sigprocmask    303
>     read           310
>     mprotect       341
>     open           380
>     close         1547
>     mmap          2787
>     trap          5421
>
> [snip]
>
> In userspace most of the time is spent here:
>     ld-elf.so.1`memset            406
>     ld-elf.so.1`matched_symbol    431
>     ld-elf.so.1`strcmp           1078
>     ld-elf.so.1`reloc_non_plt    1102
>     ld-elf.so.1`symlook_obj      1102
>     ld-elf.so.1`find_symdef      1439
>
> find_symdef iterates a linked list, which I presume induces strcmp calls
> due to unwanted entries.
>
> [snip]

So strcmp() is almost 1:1 with reloc_non_plt and/or symlook_obj. It
demonstrates that the ELF hash (perhaps GNU hash, but I do not remember how
long we have had it) provides a very good distribution.
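The reasoning about strcmp() and hash distribution can be illustrated with a small sketch. It implements the classic System V ELF hash and a bucket/chain table, then counts how many name comparisons a lookup needs. The table class, the symbol names and the bucket count are hypothetical; this is not rtld's code, and the sample counts in the profile above are profiler hits rather than call counts, so the comparison is only indicative.

```python
def elf_hash(name: bytes) -> int:
    """Classic System V ELF symbol hash (32-bit result)."""
    h = 0
    for byte in name:
        h = ((h << 4) + byte) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        h &= ~g & 0xFFFFFFFF
    return h


class SymbolTable:
    """Bucket/chain table in the spirit of an ELF .hash section."""

    def __init__(self, names, nbuckets):
        self.nbuckets = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]
        self.comparisons = 0
        for name in names:
            self.buckets[elf_hash(name) % nbuckets].append(name)

    def lookup(self, name):
        for candidate in self.buckets[elf_hash(name) % self.nbuckets]:
            self.comparisons += 1  # stands in for one strcmp() call
            if candidate == name:
                return True
        return False


# Hypothetical symbol names and bucket count, for illustration only.
names = [b"sym_%d" % i for i in range(1000)]
table = SymbolTable(names, nbuckets=1021)
found = sum(table.lookup(n) for n in names)
ratio = table.comparisons / len(names)

if __name__ == "__main__":
    print(f"found {found} symbols, {ratio:.2f} comparisons per lookup")
```

A ratio close to 1 means most chains hold a single entry, which is the sense in which a near 1:1 relation between lookups and string comparisons suggests a well-distributed hash.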
Re: libc/libsys split coming soon
On 02/02/2024 23:31, Brooks Davis wrote:
> TL;DR: The implementation of system calls is moving to a separate
> library (libsys). No changes are required to existing software (except
> to ensure that libsys is present when building custom disk images).
>
> Code: https://github.com/freebsd/freebsd-src/pull/908
>
> After nearly a decade of intermittent work, I'm about to land a series
> of patches which moves system calls, vdso support, and libc's parsing of
> the ELF auxiliary argument vector into a separate library (libsys). I
> plan to do this early next week (February 5th).
>
> This change serves three primary purposes:
> 1. It's easier to completely replace system call implementations for
>    tracing or compartmentalization purposes.

Will that affect code that makes syscalls without currently linking to libc?

> 2. It simplifies the implementation of restrictions on system calls such
>    as those implemented by OpenBSD's msyscall(2)
>    (https://man.openbsd.org/msyscall.2).

That's one to ignore for tools that make syscalls outside of the libc
memory mapping.

> 3. It allows language runtimes to link with libsys for system call
>    implementations without requiring libc.

I see that pagesize is on the list of functions that are moving. There are a
couple of other functions that might cause me problems if libc isn't linked.

Could you do a quick test with an exe linked to libsys but not libc running
under Valgrind memcheck, please?

A+
Paul
Re: libc/libsys split coming soon
Will this break binary compatibility with older programs expecting those
symbols in libc and not linked to libsys?
Re: libc/libsys split coming soon
On Sat, Feb 03, 2024 at 10:15:09AM +0100, Mateusz Guzik wrote:
> Do I read it correctly that everything dynamically linked will also be
> linked to libsys, as in executing such a binary will now require
> loading one extra .so?
>
> Binary startup is very slow, for example execve of a hello world
> binary in a Linux-based chroot on FreeBSD is faster by a factor of 2
> compared to a native one. As such perf-wise this looks like a step in
> the wrong direction.

I wonder if we could follow the steps of musl libc and just integrate
libsys/libc into rtld, as basically all dynamically linked programs
link these libraries anyway.

Yours,
Robert Clausecker

--
()  ascii ribbon campaign - for an encoding-agnostic world
/\  - against html email  - against proprietary attachments
Re: libc/libsys split coming soon
On 2/3/24, David Chisnall wrote:
> On 3 Feb 2024, at 09:15, Mateusz Guzik wrote:
>>
>> Binary startup is very slow, for example execve of a hello world
>> binary in a Linux-based chroot on FreeBSD is faster by a factor of 2
>> compared to a native one. As such perf-wise this looks like a step in
>> the wrong direction.
>
> Have you profiled this? Is the Linux version using BIND_NOW (which comes
> with a load of problems, but is often the default for Linux systems and
> reduces the number of slow-path entries into rtld)? Do they trigger the
> same number of CoW faults? Is there a path in rtld that’s slower than the
> equivalent ld-linux.so path?

I only profiled FreeBSD, it was 4 years ago. I have neither time nor
interest in working on this.

Relevant excerpts from profiling an fexecve loop:

Sampling what syscalls were being executed when in kernel mode
(or trap):

syscalls:
    pread          108
    fstat          162
    issetugid      250
    sigprocmask    303
    read           310
    mprotect       341
    open           380
    close         1547
    mmap          2787
    trap          5421

[snip]

In userspace most of the time is spent here:
    ld-elf.so.1`memset            406
    ld-elf.so.1`matched_symbol    431
    ld-elf.so.1`strcmp           1078
    ld-elf.so.1`reloc_non_plt    1102
    ld-elf.so.1`symlook_obj      1102
    ld-elf.so.1`find_symdef      1439

find_symdef iterates a linked list, which I presume induces strcmp calls
due to unwanted entries.

[snip]

Full profile

user:
    libc.so.7`__je_extent_heap_new       71
    libc.so.7`__vdso_clock_gettime       73
    libc.so.7`memset                     75
    ld-elf.so.1`_rtld                    83
    ld-elf.so.1`getenv                   85
    libc.so.7`__je_malloc_mutex_boot    132
    ld-elf.so.1`reloc_plt               148
    ld-elf.so.1`__crt_malloc            163
    ld-elf.so.1`symlook_default         166
    ld-elf.so.1`digest_dynamic1         184
    libc.so.7`__je_malloc_mutex_init    205
    ld-elf.so.1`symlook_global          281
    ld-elf.so.1`memset                  406
    ld-elf.so.1`matched_symbol          431
    ld-elf.so.1`strcmp                 1078
    ld-elf.so.1`reloc_non_plt          1102
    ld-elf.so.1`symlook_obj            1102
    ld-elf.so.1`find_symdef            1439

kernel:
    kernel`vm_reserv_alloc_page          89
    kernel`amd64_syscall                 95
    kernel`0x80                         102
    kernel`vm_page_alloc_domain_after   114
    kernel`vm_object_deallocate         117
    kernel`vm_map_pmap_enter            122
    kernel`pmap_enter_object            140
    kernel`uma_zalloc_arg               148
    kernel`vm_map_lookup_entry          148
    kernel`pmap_try_insert_pv_entry     156
    kernel`vm_fault_dirty               168
    kernel`pagecopy                     177
    kernel`vm_fault                     260
    kernel`get_pv_entry                 265
    kernel`pagezero_erms                367
    kernel`pmap_enter_quick_locked      380
    kernel`pmap_enter                   432
    kernel`0x80                        1126
    kernel`0x80                        2017
    kernel`trap                        2097

syscalls:
    pread          108
    fstat          162
    issetugid      250
    sigprocmask    303
    read           310
    mprotect       341
    open           380
    close         1547
    mmap          2787
    trap          5421

Counting fexecve:
    dtrace -n 'fbt::sys_fexecve:entry { @[stack()] = count(); } tick-30s { exit(0); }'

dtrace script, can be run as:
    dtrace -w -x aggsize=128M -s script.d
assumes binary name is a.out

    syscall::fexecve:entry
    {
            self->inexec = 1;
    }

    syscall::fexecve:return
    {
            self->inexec = 0;
    }

    fbt::trap:entry
    {
            self->trap = 1;
    }

    fbt::trap:return
    {
            self->trap = 0;
    }
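For readers who want a rough wall-clock number before reaching for DTrace, the loop being profiled can be approximated from userland. The sketch below is an assumption-laden stand-in rather than the fexecve loop from the message: it goes through fork and exec via Python's subprocess module, so the figure includes process-creation and interpreter overhead, and it assumes a trivial `true` binary is on PATH.

```python
import shutil
import subprocess
import time


def mean_spawn_seconds(argv, iterations=50):
    """Average wall-clock time to spawn `argv` and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(iterations):
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return (time.perf_counter() - start) / iterations


# Assumes `true` is on PATH; substitute the hello-world binary under test.
binary = shutil.which("true")

if __name__ == "__main__":
    if binary is not None:
        print(f"{mean_spawn_seconds([binary]) * 1e6:.0f} us per spawn")
```

Running the same loop against a native binary and one inside a Linux chroot gives a first-order comparison; attributing the difference to rtld, the kernel, or page faults still needs a profiler.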
Re: libc/libsys split coming soon
On 3 Feb 2024, at 09:15, Mateusz Guzik wrote:
>
> Binary startup is very slow, for example execve of a hello world
> binary in a Linux-based chroot on FreeBSD is faster by a factor of 2
> compared to a native one. As such perf-wise this looks like a step in
> the wrong direction.

Have you profiled this? Is the Linux version using BIND_NOW (which comes
with a load of problems, but is often the default for Linux systems and
reduces the number of slow-path entries into rtld)? Do they trigger the
same number of CoW faults? Is there a path in rtld that’s slower than the
equivalent ld-linux.so path?

David
Re: libc/libsys split coming soon
On 2/2/24, Brooks Davis wrote:
> TL;DR: The implementation of system calls is moving to a separate
> library (libsys). No changes are required to existing software (except
> to ensure that libsys is present when building custom disk images).
>
> Code: https://github.com/freebsd/freebsd-src/pull/908
>
> After nearly a decade of intermittent work, I'm about to land a series
> of patches which moves system calls, vdso support, and libc's parsing of
> the ELF auxiliary argument vector into a separate library (libsys). I
> plan to do this early next week (February 5th).
>
> This change serves three primary purposes:
> 1. It's easier to completely replace system call implementations for
>    tracing or compartmentalization purposes.
> 2. It simplifies the implementation of restrictions on system calls such
>    as those implemented by OpenBSD's msyscall(2)
>    (https://man.openbsd.org/msyscall.2).
> 3. It allows language runtimes to link with libsys for system call
>    implementations without requiring libc.
>
> libsys is an auxiliary filter for libc. This means that for any symbol
> defined by both, the libsys version takes precedence at runtime. For
> system call implementations, libc contains empty stubs. For others it
> contains copies of the functions (this could be further refined at a
> later date). The statically linked libc contains the full
> implementations so linking libsys is not required.

Do I read it correctly that everything dynamically linked will also be
linked to libsys, as in executing such a binary will now require
loading one extra .so?

Binary startup is very slow, for example execve of a hello world
binary in a Linux-based chroot on FreeBSD is faster by a factor of 2
compared to a native one. As such perf-wise this looks like a step in
the wrong direction.

Is there a problem making it so that libc ends up unchanged, but all
the bits are available separately in libsys if one does not want libc?

--
Mateusz Guzik
Re: libc/libsys split coming soon
On Fri, 2 Feb 2024, at 23:31, Brooks Davis wrote:
> TL;DR: The implementation of system calls is moving to a separate
> library (libsys). No changes are required to existing software (except
> to ensure that libsys is present when building custom disk images).
>
> Code: https://github.com/freebsd/freebsd-src/pull/908
>
> After nearly a decade of intermittent work, I'm about to land a series
> of patches which moves system calls, vdso support, and libc's parsing of
> the ELF auxiliary argument vector into a separate library (libsys). I
> plan to do this early next week (February 5th).
>
> This change serves three primary purposes:
> 1. It's easier to completely replace system call implementations for
>    tracing or compartmentalization purposes.
> 2. It simplifies the implementation of restrictions on system calls such
>    as those implemented by OpenBSD's msyscall(2)
>    (https://man.openbsd.org/msyscall.2).
> 3. It allows language runtimes to link with libsys for system call
>    implementations without requiring libc.

Awesome! So (3) is generally considered ideal for languages like zig[1],
rust or go, to use directly?

What’s the appropriate mechanism for such a language to know which version
of FreeBSD it’s talking to, to ensure the syscall table matches the
language's expectations?

It would be nice to hear about any experiments in (2) and how that compares
to things such as Capsicum.

[1]: https://github.com/ziglang/zig/issues/165

A+
Dave
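The thread does not settle which mechanism a language runtime should use for the version question. One mechanism that does exist on FreeBSD is the kern.osreldate sysctl, which reports the running kernel's release date number. The sketch below reads it through sysctlbyname(3) via ctypes; the helper name freebsd_osreldate is hypothetical, and whether this is the recommended check for runtimes is an open assumption here.

```python
import ctypes
import platform


def freebsd_osreldate():
    """Return the running kernel's kern.osreldate, or None off FreeBSD."""
    if platform.system() != "FreeBSD":
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    value = ctypes.c_int(0)
    size = ctypes.c_size_t(ctypes.sizeof(value))
    # int sysctlbyname(const char *name, void *oldp, size_t *oldlenp,
    #                  const void *newp, size_t newlen);
    rc = libc.sysctlbyname(b"kern.osreldate", ctypes.byref(value),
                           ctypes.byref(size), None, ctypes.c_size_t(0))
    if rc != 0:
        raise OSError(ctypes.get_errno(), "sysctlbyname failed")
    return value.value


if __name__ == "__main__":
    print(freebsd_osreldate())
```

A runtime could compare the returned number against the value it was built for before relying on a particular syscall table; on other systems the helper returns None instead of guessing.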