Re: Pls sanity check my semtimedop(2) implementation
On Fri, Jul 18, 2008 at 11:58 AM, Jilles Tjoelker [EMAIL PROTECTED] wrote: On Sat, Jul 12, 2008 at 07:11:26PM -0400, Michael B Allen wrote: Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. [snip semtimedop implementation that uses SIGALRM and relies on EINTR] The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. In your implementation, the SIGALRM signal may happen before you even call semop(2). If so, most likely the semop(2) will hang arbitrarily long. Indeed. And I reconnoiter that condition is likely to happen if called with a sufficiently small timeout value. Another dirty fix is to try non-blocking semop(2) several times with sleeps in between. Actually that seems to work pretty well. If I force the nanosleep codepath to be used always, the application actually works pretty well under load. I wonder if the overhead of managing the signal and timer is worth the trouble. For posterity, below is the current implementation of semtimedop(2) for FreeBSD. Further ideas are welcome. void _timeval_diff(struct timeval *tv1, struct timeval *tv2, struct timeval *tvd) { tvd-tv_sec = tv1-tv_sec - tv2-tv_sec; tvd-tv_usec = tv1-tv_usec - tv2-tv_usec; if (tvd-tv_usec 0) { tvd-tv_usec = 100 - tvd-tv_usec; tvd-tv_sec--; } } void signal_ignore(int s, siginfo_t *si, void *ctx) { } int _semtimedop(int semid, struct sembuf *array, size_t nops, struct timespec *_timeout) { struct timeval timeout, before, after; struct itimerval value, ovalue; struct sigaction sa, osa; int ret; if (_timeout) { timeout.tv_sec = _timeout-tv_sec; timeout.tv_usec = _timeout-tv_nsec / 1000; if (gettimeofday(before, NULL) == -1) { errno = EFAULT; return -1; } if (timeout.tv_sec == 0 timeout.tv_usec 5000) { struct timeval tsleep, tend; struct sembuf wait; wait = *array; wait.sem_flg |= IPC_NOWAIT; tsleep.tv_sec = 0; tsleep.tv_usec = 1; timeradd(before, timeout, tend); for ( ;; ) { struct timeval tnow, tleft; struct timespec ts; ret = semop(semid, wait, nops); if (ret == 0) { return 0; } else if (errno != EAGAIN) { break; } if (gettimeofday(tnow, NULL) == -1) { errno = EFAULT; break; } if (timercmp(tnow, tend, =)) { errno = EAGAIN; break; } timersub(tend, tnow, tleft); if (tsleep.tv_usec tleft.tv_usec) tsleep.tv_usec = tleft.tv_usec; ts.tv_sec = 0; ts.tv_nsec = tsleep.tv_usec * 1000; nanosleep(ts, NULL); tsleep.tv_usec *= 10; } return -1; } memset(value, 0, sizeof value); value.it_value = timeout; memset(sa, 0, sizeof sa); sa.sa_sigaction = signal_ignore; sa.sa_flags = SA_SIGINFO; sigemptyset(sa.sa_mask); sigaction(SIGALRM, sa, osa); if (setitimer(ITIMER_REAL, value, ovalue) == -1) { sigaction(SIGALRM, osa, NULL); return -1; } } ret = semop(semid, array, nops); if (_timeout) { ret = setitimer(ITIMER_REAL, ovalue, NULL); if (ret 0) errno = EFAULT; sigaction(SIGALRM, osa, NULL); } if (ret == -1) { if (_timeout) { struct timeval elapsed; if (gettimeofday(after, NULL) == -1) { } else { _timeval_diff(after, before, elapsed); if (timercmp(elapsed, timeout, =)) errno = EAGAIN; } } return -1; } return 0; } ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Friday 18 July 2008 12:27:05 pm Sean C. Farley wrote: On Thu, 17 Jul 2008, Michael B Allen wrote: *snip* But I'll keep it in mind for the future. I don't recall why I chose System V semaphores originally. I think process-shared semantics in the POSIX implementations where not mature at the time. I would love to move away from System V semaphores. It's all too easy to leak them and trying to clean up on restart is dangerous. It is my understanding that process-shared is not currently supported at least in 7. Does anyone know if there is any intention of this being eventually supported? I have needed this in the past but do not need it at the moment. It would be nice to have someday. There aren't currently plans, no. -- John Baldwin ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Sat, Jul 12, 2008 at 07:11:26PM -0400, Michael B Allen wrote: Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. [snip semtimedop implementation that uses SIGALRM and relies on EINTR] The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. In your implementation, the SIGALRM signal may happen before you even call semop(2). If so, most likely the semop(2) will hang arbitrarily long. A somewhat dirty fix is to put a nonzero value into it_interval. This will ensure that if a signal is missed another one will be generated quickly. You might be able to fix this using setjmp/longjmp, but this is neither pretty nor efficient. Another dirty fix is to try non-blocking semop(2) several times with sleeps in between. Do you think it would make any difference if I used ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? This does not fix the inherent problem. Also, given your use of signals in the first place, your application must be single threaded. This means the ITIMER_VIRTUAL timer does not run while semop(2) is waiting. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Thu, 17 Jul 2008, Michael B Allen wrote: *snip* But I'll keep it in mind for the future. I don't recall why I chose System V semaphores originally. I think process-shared semantics in the POSIX implementations where not mature at the time. I would love to move away from System V semaphores. It's all too easy to leak them and trying to clean up on restart is dangerous. It is my understanding that process-shared is not currently supported at least in 7. Does anyone know if there is any intention of this being eventually supported? I have needed this in the past but do not need it at the moment. It would be nice to have someday. Sean -- [EMAIL PROTECTED] ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote: Hi, Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. Do you think it would make any difference if I used ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? Or perhaps I should be using a different implementation entirely? What specific races are you seeing? The timer is firing too early, too late? -- John Baldwin ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Thu, Jul 17, 2008 at 10:05 AM, John Baldwin [EMAIL PROTECTED] wrote: On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote: Hi, Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. Do you think it would make any difference if I used ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? Or perhaps I should be using a different implementation entirely? What specific races are you seeing? The timer is firing too early, too late? It's very difficult to tell. I can only trigger the issue very occasionally running my torture test such that any diagnostic logging changes the results. And at this point I'm not sure my semtimedop implementation is responsible. I have not seen the issue since fixing the race pointed out by Mikko (although I have not tried very hard to provoke it). For now, I'm satisfied since I do not think the issue will be triggered in the wild. I hate to use signals for anything but as much as I try, there's just no other way to implement semtimedop within a single largely self-contained function. In the future I will likely use another process in the application that uses select(2) as an event service to post on semaphores after a certain time period. Unfortunately, right now, that service ultimately calls semtimedop so I'll save it for a rainy day. Although if you implemented semtimedop(2) into the FreeBSD API that would work too :-) Thanks, Mike -- Michael B Allen PHP Active Directory SPNEGO SSO http://www.ioplex.com/ ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Thursday 17 July 2008 01:42:31 pm Michael B Allen wrote: On Thu, Jul 17, 2008 at 10:05 AM, John Baldwin [EMAIL PROTECTED] wrote: On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote: Hi, Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. Do you think it would make any difference if I used ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? Or perhaps I should be using a different implementation entirely? What specific races are you seeing? The timer is firing too early, too late? It's very difficult to tell. I can only trigger the issue very occasionally running my torture test such that any diagnostic logging changes the results. And at this point I'm not sure my semtimedop implementation is responsible. I have not seen the issue since fixing the race pointed out by Mikko (although I have not tried very hard to provoke it). For now, I'm satisfied since I do not think the issue will be triggered in the wild. I hate to use signals for anything but as much as I try, there's just no other way to implement semtimedop within a single largely self-contained function. In the future I will likely use another process in the application that uses select(2) as an event service to post on semaphores after a certain time period. Unfortunately, right now, that service ultimately calls semtimedop so I'll save it for a rainy day. Although if you implemented semtimedop(2) into the FreeBSD API that would work too :-) POSIX semaphores (sem_open(3), sem_create(3), etc.) do have a sem_timedwait(3). However, POSIX semaphores have several bugs in 6.x and 7.x (they should work a lot better in 8). If you want I can give you a patch for 6.x or 7.x that backports the 8.x POSIX semaphores. -- John Baldwin ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Thu, Jul 17, 2008 at 8:15 PM, John Baldwin [EMAIL PROTECTED] wrote: On Thursday 17 July 2008 01:42:31 pm Michael B Allen wrote: On Thu, Jul 17, 2008 at 10:05 AM, John Baldwin [EMAIL PROTECTED] wrote: On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote: Hi, Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. Do you think it would make any difference if I used ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? Or perhaps I should be using a different implementation entirely? What specific races are you seeing? The timer is firing too early, too late? It's very difficult to tell. I can only trigger the issue very occasionally running my torture test such that any diagnostic logging changes the results. And at this point I'm not sure my semtimedop implementation is responsible. I have not seen the issue since fixing the race pointed out by Mikko (although I have not tried very hard to provoke it). For now, I'm satisfied since I do not think the issue will be triggered in the wild. I hate to use signals for anything but as much as I try, there's just no other way to implement semtimedop within a single largely self-contained function. In the future I will likely use another process in the application that uses select(2) as an event service to post on semaphores after a certain time period. Unfortunately, right now, that service ultimately calls semtimedop so I'll save it for a rainy day. Although if you implemented semtimedop(2) into the FreeBSD API that would work too :-) POSIX semaphores (sem_open(3), sem_create(3), etc.) do have a sem_timedwait(3). However, POSIX semaphores have several bugs in 6.x and 7.x (they should work a lot better in 8). If you want I can give you a patch for 6.x or 7.x that backports the 8.x POSIX semaphores. I can't ask my customers to patch their systems. But I'll keep it in mind for the future. I don't recall why I chose System V semaphores originally. I think process-shared semantics in the POSIX implementations where not mature at the time. I would love to move away from System V semaphores. It's all too easy to leak them and trying to clean up on restart is dangerous. Mike ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On Sun, 13 Jul 2008, Michael B Allen wrote: Hi, Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. At the very least you need to check errno when semop() returns -1. Unless it is EINTR, you have other problems. Also, if there is any other code using the timer across this function call, you have race conditions between changing the signal handler and setting the timer. Even if there is no other use of the timer across this function, resetting the signal handler before disarming the timer leaves you open to the signal being handled by the default handler which will make the process exit. $.02, /Mikko Is there any reason why I would want to use ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? Or perhaps I should be using a different implementation entirely? Mike int _semtimedop(int semid, struct sembuf *array, size_t nops, struct timespec *_timeout) { struct timeval timeout, before, after; struct itimerval value, ovalue; struct sigaction sa, osa; int ret; if (_timeout) { timeout.tv_sec = _timeout-tv_sec; timeout.tv_usec = _timeout-tv_nsec / 1000; if (gettimeofday(before, NULL) == -1) { return -1; } memset(value, 0, sizeof value); value.it_value = timeout; memset(sa, 0, sizeof sa); /* signal_print writes the signal info to a log file */ sa.sa_sigaction = signal_print; sa.sa_flags = SA_SIGINFO; sigemptyset(sa.sa_mask); sigaction(SIGALRM, sa, osa); if (setitimer(ITIMER_REAL, value, ovalue) == -1) { sigaction(SIGALRM, osa, NULL); return -1; } } ret = semop(semid, array, nops); if (_timeout) { sigaction(SIGALRM, osa, NULL); if (setitimer(ITIMER_REAL, ovalue, NULL) == -1) { return -1; } } if (ret == -1) { if (_timeout) { struct timeval elapsed; if (gettimeofday(after, NULL) == -1) { return -1; } _timeval_diff(after, before, elapsed); if (timercmp(elapsed, timeout, =)) errno = EAGAIN; } return -1; } return 0; } ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED] ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Pls sanity check my semtimedop(2) implementation
On 7/13/08, Mikko Työläjärvi [EMAIL PROTECTED] wrote: On Sun, 13 Jul 2008, Michael B Allen wrote: Hi, Below is a semtimedop(2) implementation that I'm using for FreeBSD. I was hoping someone could look it over and tell me if they think the implementation is sound. The code seems to work ok but when stressing the FreeBSD build of my app I have managed to provoke errors related to concurrency (usually when a SIGALRM goes off). The Linux build works flawlessesly so I'm wondering about this one critical function that is different. At the very least you need to check errno when semop() returns -1. Unless it is EINTR, you have other problems. Also, if there is any other code using the timer across this function call, you have race conditions between changing the signal handler and setting the timer. Even if there is no other use of the timer across this function, resetting the signal handler before disarming the timer leaves you open to the signal being handled by the default handler which will make the process exit. Hi Mikko, So if some other code uses setitimer(2) for whatever reason, then I have the potential for a race. I'm not aware of any other such instances of setitimer but my app is actually a plugin for a larger application so I can't entirely rule out the possibility. Is there any facility for creating a stateful timer so that I don't run into this problem? Can anyone provide the basis for an alternative implementation? Should I use select(2) instead? Mike Is there any reason why I would want to use ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM? Or perhaps I should be using a different implementation entirely? Mike int _semtimedop(int semid, struct sembuf *array, size_t nops, struct timespec *_timeout) { struct timeval timeout, before, after; struct itimerval value, ovalue; struct sigaction sa, osa; int ret; if (_timeout) { timeout.tv_sec = _timeout-tv_sec; timeout.tv_usec = _timeout-tv_nsec / 1000; if (gettimeofday(before, NULL) == -1) { return -1; } memset(value, 0, sizeof value); value.it_value = timeout; memset(sa, 0, sizeof sa); /* signal_print writes the signal info to a log file */ sa.sa_sigaction = signal_print; sa.sa_flags = SA_SIGINFO; sigemptyset(sa.sa_mask); sigaction(SIGALRM, sa, osa); if (setitimer(ITIMER_REAL, value, ovalue) == -1) { sigaction(SIGALRM, osa, NULL); return -1; } } ret = semop(semid, array, nops); if (_timeout) { sigaction(SIGALRM, osa, NULL); if (setitimer(ITIMER_REAL, ovalue, NULL) == -1) { return -1; } } if (ret == -1) { if (_timeout) { struct timeval elapsed; if (gettimeofday(after, NULL) == -1) { return -1; } _timeval_diff(after, before, elapsed); if (timercmp(elapsed, timeout, =)) errno = EAGAIN; } return -1; } return 0; } ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]