Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On 2020-03-02, Lauri Tirkkonen wrote: > Thanks for the input, and ping - is there still something about this > diff that I should fix? I'm kinda busy, but I should be able to look into it eventually.
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On Mon, Feb 24 2020 15:33:35 -0500, Ted Unangst wrote: > Martin Pieuchot wrote: > > On 24/02/20(Mon) 11:29, Lauri Tirkkonen wrote: > > > On Mon, Feb 24 2020 10:24:53 +0100, Martin Pieuchot wrote: > > > > On 23/02/20(Sun) 14:48, Lauri Tirkkonen wrote: > > > > > I was working on a make jobserver implementation that uses POSIX > > > > > semaphores as job tokens instead of a complicated socket-based > > > > > approach. > > > > > Initially I used named semaphores, which work fine, except if child > > > > > processes with less privileges need to also open the named semaphore > > > > > (eg. 'make build' as root executing 'su build -c make'). For that > > > > > reason > > > > > I wanted to use an unnamed semaphore (sem_init()) which is stored in > > > > > shm > > > > > -- that way I could leave the shm fd open and pass it to children. > > > > > > > > > > But unfortunately, sem_t is currently just a pointer to the opaque > > > > > struct __sem, and sem_int() always calloc()s the storage for the > > > > > struct. > > > > > > > > That's by design. > > > > > > Ok - could you elaborate what the design is? > > > > If the size of a descriptor change, because some fields are added and/or > > removed, it doesn't matter for the application because it only manipulates > > pointers. That means we can change the data types without creating an ABI > > break. > > I think we are approaching the point where we can settle on fixed sized types > now. If we want to be cautious, we can add a reserved padding field, too. But > there are some edge cases which I think can be removed by eliminating the > dynamic allocation paths. > > > > See the followup patch -- sharing the semaphore between processes does > > > work with it. > > > > Well ignoring the `pshared' argument is questionable. Why don't you > > remove the "#if notyet" and start playing with the existing code and > > try to figure out if something is missing for your use case? > > I'm not sure the code in notyet will work. It was based on a misunderstanding > I had of the requirements. Returning control of the sem_t placement to the > application is the right approach. Thanks for the input, and ping - is there still something about this diff that I should fix? -- Lauri Tirkkonen | lotheac @ IRCnet
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
Martin Pieuchot wrote: > On 24/02/20(Mon) 11:29, Lauri Tirkkonen wrote: > > On Mon, Feb 24 2020 10:24:53 +0100, Martin Pieuchot wrote: > > > On 23/02/20(Sun) 14:48, Lauri Tirkkonen wrote: > > > > I was working on a make jobserver implementation that uses POSIX > > > > semaphores as job tokens instead of a complicated socket-based approach. > > > > Initially I used named semaphores, which work fine, except if child > > > > processes with less privileges need to also open the named semaphore > > > > (eg. 'make build' as root executing 'su build -c make'). For that reason > > > > I wanted to use an unnamed semaphore (sem_init()) which is stored in shm > > > > -- that way I could leave the shm fd open and pass it to children. > > > > > > > > But unfortunately, sem_t is currently just a pointer to the opaque > > > > struct __sem, and sem_int() always calloc()s the storage for the struct. > > > > > > That's by design. > > > > Ok - could you elaborate what the design is? > > If the size of a descriptor change, because some fields are added and/or > removed, it doesn't matter for the application because it only manipulates > pointers. That means we can change the data types without creating an ABI > break. I think we are approaching the point where we can settle on fixed sized types now. If we want to be cautious, we can add a reserved padding field, too. But there are some edge cases which I think can be removed by eliminating the dynamic allocation paths. > > See the followup patch -- sharing the semaphore between processes does > > work with it. > > Well ignoring the `pshared' argument is questionable. Why don't you > remove the "#if notyet" and start playing with the existing code and > try to figure out if something is missing for your use case? I'm not sure the code in notyet will work. It was based on a misunderstanding I had of the requirements. Returning control of the sem_t placement to the application is the right approach.
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On Mon, Feb 24 2020 10:42:22 +0100, Martin Pieuchot wrote: > > Yes, that's what I'm trying to do. Yes, I've seen the current > > implementation -- that's why I started this thread, in an attempt to > > make them supported. :) > > > > See the followup patch -- sharing the semaphore between processes does > > work with it. > > Well ignoring the `pshared' argument is questionable. It does feel strange, but a fully userspace implementation of semaphores (ie. using only mmap MAP_SHARED files for communication with other thread/processes) doesn't really have a meaningful distinction whether sharing happens between threads or processes. The other semaphore implementations I glanced at all had some kind of 'kernel semaphore'. I suppose a userspace implementation like this could store the pid of the process allowed to access the semaphore in struct __sem if pshared=0, and check that on every access, but I'm honestly not sure what value it brings for a userspace library to essentially prevent access to userspace memory if the application went through the effort to share that memory between processes anyway. -- Lauri Tirkkonen | lotheac @ IRCnet
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On Mon, Feb 24 2020 10:42:22 +0100, Martin Pieuchot wrote: > On 24/02/20(Mon) 11:29, Lauri Tirkkonen wrote: > > On Mon, Feb 24 2020 10:24:53 +0100, Martin Pieuchot wrote: > > > On 23/02/20(Sun) 14:48, Lauri Tirkkonen wrote: > > > > I was working on a make jobserver implementation that uses POSIX > > > > semaphores as job tokens instead of a complicated socket-based approach. > > > > Initially I used named semaphores, which work fine, except if child > > > > processes with less privileges need to also open the named semaphore > > > > (eg. 'make build' as root executing 'su build -c make'). For that reason > > > > I wanted to use an unnamed semaphore (sem_init()) which is stored in shm > > > > -- that way I could leave the shm fd open and pass it to children. > > > > > > > > But unfortunately, sem_t is currently just a pointer to the opaque > > > > struct __sem, and sem_int() always calloc()s the storage for the struct. > > > > > > That's by design. > > > > Ok - could you elaborate what the design is? > > If the size of a descriptor change, because some fields are added and/or > removed, it doesn't matter for the application because it only manipulates > pointers. That means we can change the data types without creating an ABI > break. I understand this, that's why I bumped the major and build-tested it. What I don't understand is how an application could ever share a semaphore with another process without sharing the userspace structure that contains the important bits. As I mentioned, NetBSD works around that in the pshared case by stuffing a kernel semaphore identifier into the sem_t* returned by sem_init -- essentially, that identifier becomes the only important bit and no storage is allocated, but from the application point of view, sem_t must still be shared with another process through shm otherwise (only now it's only a pointer that is used to hold an integer value). > > > > This means the application cannot control where unnamed semaphores are > > > > stored, so I can't put it in shm. > > > > > > Are you trying to use semaphore shared between process? Did you called > > > sem_init() with pshared=1? Have you seen that the current implementation > > > doesn't support them? > > > > Yes, that's what I'm trying to do. Yes, I've seen the current > > implementation -- that's why I started this thread, in an attempt to > > make them supported. :) > > > > See the followup patch -- sharing the semaphore between processes does > > work with it. > > Well ignoring the `pshared' argument is questionable. Why don't you > remove the "#if notyet" and start playing with the existing code and > try to figure out if something is missing for your use case? I did read the existing code behind the #ifdef notyet, and even found a commit from 2013 disabling the code (by adding the #ifdef) "until it really works". It seems to me that time has not come. I don't understand how the pshared case under #ifdef notyet was even supposed to work. It seems to just turn an unnamed semaphore into a randomly-named semaphore, but provides no way for the application to share it with another process; if the application puts sem_t in shm, only a pointer is now in shm -- the struct __sem is not. Named semaphores at least can be opened by other processes if they know the name, but randomly-named semaphores that are unlinked after creation can certainly not. -- Lauri Tirkkonen | lotheac @ IRCnet
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On 24/02/20(Mon) 11:29, Lauri Tirkkonen wrote: > On Mon, Feb 24 2020 10:24:53 +0100, Martin Pieuchot wrote: > > On 23/02/20(Sun) 14:48, Lauri Tirkkonen wrote: > > > I was working on a make jobserver implementation that uses POSIX > > > semaphores as job tokens instead of a complicated socket-based approach. > > > Initially I used named semaphores, which work fine, except if child > > > processes with less privileges need to also open the named semaphore > > > (eg. 'make build' as root executing 'su build -c make'). For that reason > > > I wanted to use an unnamed semaphore (sem_init()) which is stored in shm > > > -- that way I could leave the shm fd open and pass it to children. > > > > > > But unfortunately, sem_t is currently just a pointer to the opaque > > > struct __sem, and sem_int() always calloc()s the storage for the struct. > > > > That's by design. > > Ok - could you elaborate what the design is? If the size of a descriptor change, because some fields are added and/or removed, it doesn't matter for the application because it only manipulates pointers. That means we can change the data types without creating an ABI break. > > > This means the application cannot control where unnamed semaphores are > > > stored, so I can't put it in shm. > > > > Are you trying to use semaphore shared between process? Did you called > > sem_init() with pshared=1? Have you seen that the current implementation > > doesn't support them? > > Yes, that's what I'm trying to do. Yes, I've seen the current > implementation -- that's why I started this thread, in an attempt to > make them supported. :) > > See the followup patch -- sharing the semaphore between processes does > work with it. Well ignoring the `pshared' argument is questionable. Why don't you remove the "#if notyet" and start playing with the existing code and try to figure out if something is missing for your use case?
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On Mon, Feb 24 2020 10:24:53 +0100, Martin Pieuchot wrote: > On 23/02/20(Sun) 14:48, Lauri Tirkkonen wrote: > > I was working on a make jobserver implementation that uses POSIX > > semaphores as job tokens instead of a complicated socket-based approach. > > Initially I used named semaphores, which work fine, except if child > > processes with less privileges need to also open the named semaphore > > (eg. 'make build' as root executing 'su build -c make'). For that reason > > I wanted to use an unnamed semaphore (sem_init()) which is stored in shm > > -- that way I could leave the shm fd open and pass it to children. > > > > But unfortunately, sem_t is currently just a pointer to the opaque > > struct __sem, and sem_int() always calloc()s the storage for the struct. > > That's by design. Ok - could you elaborate what the design is? > > This means the application cannot control where unnamed semaphores are > > stored, so I can't put it in shm. > > Are you trying to use semaphore shared between process? Did you called > sem_init() with pshared=1? Have you seen that the current implementation > doesn't support them? Yes, that's what I'm trying to do. Yes, I've seen the current implementation -- that's why I started this thread, in an attempt to make them supported. :) See the followup patch -- sharing the semaphore between processes does work with it. -- Lauri Tirkkonen | lotheac @ IRCnet
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On 23/02/20(Sun) 14:48, Lauri Tirkkonen wrote: > I was working on a make jobserver implementation that uses POSIX > semaphores as job tokens instead of a complicated socket-based approach. > Initially I used named semaphores, which work fine, except if child > processes with less privileges need to also open the named semaphore > (eg. 'make build' as root executing 'su build -c make'). For that reason > I wanted to use an unnamed semaphore (sem_init()) which is stored in shm > -- that way I could leave the shm fd open and pass it to children. > > But unfortunately, sem_t is currently just a pointer to the opaque > struct __sem, and sem_int() always calloc()s the storage for the struct. That's by design. > This means the application cannot control where unnamed semaphores are > stored, so I can't put it in shm. Are you trying to use semaphore shared between process? Did you called sem_init() with pshared=1? Have you seen that the current implementation doesn't support them?
Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing
On Sun, Feb 23 2020 14:48:36 +0200, Lauri Tirkkonen wrote: > So, diff below makes struct __sem non-opaque and removes the indirect > allocations, so that the application is required to provide storage and > can therefore control where it's stored (which could be eg. shm). followup diff that makes sem_init() ignore pshared arg -- it doesn't need to care, if the application puts the semaphore in shm. I tested an unnamed shm_init() semaphore in shm_mkstemp() -created shm with this, passing the fd to a child process and mmap()ing in the child process; sharing a semaphore this way does seem to work. diff --git a/lib/librthread/rthread_sem.c b/lib/librthread/rthread_sem.c index 984e5fb0047..441843d87ca 100644 --- a/lib/librthread/rthread_sem.c +++ b/lib/librthread/rthread_sem.c @@ -105,41 +105,6 @@ sem_init(sem_t *sem, int pshared, unsigned int value) return (-1); } - if (pshared) { - errno = EPERM; - return (-1); -#ifdef notyet - char name[SEM_RANDOM_NAME_LEN]; - sem_t *sempshared; - int i; - - for (;;) { - for (i = 0; i < SEM_RANDOM_NAME_LEN - 1; i++) - name[i] = arc4random_uniform(255) + 1; - name[SEM_RANDOM_NAME_LEN - 1] = '\0'; - sempshared = sem_open(name, O_CREAT | O_EXCL, 0, value); - if (sempshared != SEM_FAILED) - break; - if (errno == EEXIST) - continue; - if (errno != EPERM) - errno = ENOSPC; - return (-1); - } - - /* unnamed semaphore should not be opened twice */ - if (sem_unlink(name) == -1) { - sem_close(sempshared); - errno = ENOSPC; - return (-1); - } - - *semp = *sempshared; - free(sempshared); - return (0); -#endif - } - bzero(sem, sizeof(*sem)); sem->value = value; diff --git a/lib/librthread/rthread_sem_compat.c b/lib/librthread/rthread_sem_compat.c index 91c888cd49b..8dc863a04d3 100644 --- a/lib/librthread/rthread_sem_compat.c +++ b/lib/librthread/rthread_sem_compat.c @@ -116,41 +116,6 @@ sem_init(sem_t *sem, int pshared, unsigned int value) return (-1); } - if (pshared) { - errno = EPERM; - return (-1); -#ifdef notyet - char name[SEM_RANDOM_NAME_LEN]; - sem_t *sempshared; - int i; - - for (;;) { - for (i = 0; i < SEM_RANDOM_NAME_LEN - 1; i++) - name[i] = arc4random_uniform(255) + 1; - name[SEM_RANDOM_NAME_LEN - 1] = '\0'; - sempshared = sem_open(name, O_CREAT | O_EXCL, 0, value); - if (sempshared != SEM_FAILED) - break; - if (errno == EEXIST) - continue; - if (errno != EPERM) - errno = ENOSPC; - return (-1); - } - - /* unnamed semaphore should not be opened twice */ - if (sem_unlink(name) == -1) { - sem_close(sempshared); - errno = ENOSPC; - return (-1); - } - - *semp = *sempshared; - free(sempshared); - return (0); -#endif - } - bzero(sem, sizeof(*sem)); sem->lock = _SPINLOCK_UNLOCKED; sem->value = value; -- Lauri Tirkkonen | lotheac @ IRCnet
librthread sem_t opaqueness, storage & unnamed semaphore sharing
I was working on a make jobserver implementation that uses POSIX semaphores as job tokens instead of a complicated socket-based approach. Initially I used named semaphores, which work fine, except if child processes with less privileges need to also open the named semaphore (eg. 'make build' as root executing 'su build -c make'). For that reason I wanted to use an unnamed semaphore (sem_init()) which is stored in shm -- that way I could leave the shm fd open and pass it to children. But unfortunately, sem_t is currently just a pointer to the opaque struct __sem, and sem_int() always calloc()s the storage for the struct. This means the application cannot control where unnamed semaphores are stored, so I can't put it in shm. I might be missing something here, but I don't see why sem_t is not just typedef'd to a non-opaque struct. The equivalent structs have definitions in semaphore.h on FreeBSD and glibc (though glibc doesn't expose the members in this definition, it seems to be provided just for sizing). NetBSD uses an opaque struct which is allocated, *except* in the sem_init() pshared==1 case, where it instead returns a kernel semaphore identifier cast into a sem_t*; I find that to be somewhat convoluted. So, diff below makes struct __sem non-opaque and removes the indirect allocations, so that the application is required to provide storage and can therefore control where it's stored (which could be eg. shm). After bootstrap (installing new header, building and installing the new library) this survived base and xenocara 'make build' on amd64. --- include/semaphore.h | 11 +++-- lib/libc/include/thread_private.h | 7 --- lib/librthread/rthread.h| 4 +- lib/librthread/rthread_sem.c| 70 + lib/librthread/rthread_sem_compat.c | 68 +--- lib/librthread/shlib_version| 4 +- 6 files changed, 53 insertions(+), 111 deletions(-) diff --git a/include/semaphore.h b/include/semaphore.h index 21dc1a25bed..98471644357 100644 --- a/include/semaphore.h +++ b/include/semaphore.h @@ -38,10 +38,15 @@ #define _SEMAPHORE_H_ #include +#include -/* Opaque type definition. */ -struct __sem; -typedef struct __sem *sem_t; +struct __sem { + _atomic_lock_t lock; + volatile int waitcount; + volatile int value; + int shared; +}; +typedef struct __sem sem_t; struct timespec; #define SEM_FAILED ((sem_t *)0) diff --git a/lib/libc/include/thread_private.h b/lib/libc/include/thread_private.h index 98dfaa63223..f39094e0274 100644 --- a/lib/libc/include/thread_private.h +++ b/lib/libc/include/thread_private.h @@ -273,13 +273,6 @@ __END_HIDDEN_DECLS #define_SPINLOCK_UNLOCKED _ATOMIC_LOCK_UNLOCKED -struct __sem { - _atomic_lock_t lock; - volatile int waitcount; - volatile int value; - int shared; -}; - TAILQ_HEAD(pthread_queue, pthread); #ifdef FUTEX diff --git a/lib/librthread/rthread.h b/lib/librthread/rthread.h index 1af4509a9a0..89e39c9e0b6 100644 --- a/lib/librthread/rthread.h +++ b/lib/librthread/rthread.h @@ -79,8 +79,8 @@ struct pthread_spinlock { (((size) + (_thread_pagesize - 1)) & ~(_thread_pagesize - 1)) __BEGIN_HIDDEN_DECLS -int_sem_wait(sem_t, int, const struct timespec *, int *); -int_sem_post(sem_t); +int_sem_wait(sem_t *, int, const struct timespec *, int *); +int_sem_post(sem_t *); void _rthread_init(void); struct stack *_rthread_alloc_stack(pthread_t); diff --git a/lib/librthread/rthread_sem.c b/lib/librthread/rthread_sem.c index bd96769dc39..984e5fb0047 100644 --- a/lib/librthread/rthread_sem.c +++ b/lib/librthread/rthread_sem.c @@ -55,7 +55,7 @@ * Internal implementation of semaphores */ int -_sem_wait(sem_t sem, int can_eintr, const struct timespec *abstime, +_sem_wait(sem_t *sem, int can_eintr, const struct timespec *abstime, int *delayed_cancel) { unsigned int val; @@ -86,7 +86,7 @@ _sem_wait(sem_t sem, int can_eintr, const struct timespec *abstime, /* always increment count */ int -_sem_post(sem_t sem) +_sem_post(sem_t *sem) { membar_exit_before_atomic(); atomic_inc_int(&sem->value); @@ -98,10 +98,8 @@ _sem_post(sem_t sem) * exported semaphores */ int -sem_init(sem_t *semp, int pshared, unsigned int value) +sem_init(sem_t *sem, int pshared, unsigned int value) { - sem_t sem; - if (value > SEM_VALUE_MAX) { errno = EINVAL; return (-1); @@ -142,26 +140,19 @@ sem_init(sem_t *semp, int pshared, unsigned int value) #endif } - sem = calloc(1, sizeof(*sem)); - if (!sem) { - errno = ENOSPC; - return (-1); - } + bzero(sem, sizeof(*sem)); sem->value = value; - *semp = sem; return (0); } int -sem_destroy(sem_t *semp) +sem_destroy(sem_t *sem) { - sem_t sem; - if (!_threads_ready) /* for SEM_MMAP_SIZE */