Re: Pls sanity check my semtimedop(2) implementation

2008-07-30 Thread Michael B Allen
On Fri, Jul 18, 2008 at 11:58 AM, Jilles Tjoelker [EMAIL PROTECTED] wrote:
 On Sat, Jul 12, 2008 at 07:11:26PM -0400, Michael B Allen wrote:
 Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
 was hoping someone could look it over and tell me if they think the
 implementation is sound.

 [snip semtimedop implementation that uses SIGALRM and relies on EINTR]

 The code seems to work ok but when stressing the FreeBSD build of my app
 I have managed to provoke errors related to concurrency (usually when a
 SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
 about this one critical function that is different.

 In your implementation, the SIGALRM signal may happen before you even
 call semop(2). If so, most likely the semop(2) will hang arbitrarily
 long.

Indeed. And I reconnoiter that condition is likely to happen if called
with a sufficiently small timeout value.

 Another dirty fix is to try non-blocking semop(2) several times with
 sleeps in between.

Actually that seems to work pretty well.

If I force the nanosleep codepath to be used always, the application
actually works pretty well under load. I wonder if the overhead of
managing the signal and timer is worth the trouble.

For posterity, below is the current implementation of semtimedop(2)
for FreeBSD. Further ideas are welcome.

void
_timeval_diff(struct timeval *tv1, struct timeval *tv2, struct timeval *tvd)
{
tvd-tv_sec = tv1-tv_sec - tv2-tv_sec;
tvd-tv_usec = tv1-tv_usec - tv2-tv_usec;
if (tvd-tv_usec  0) {
tvd-tv_usec = 100 - tvd-tv_usec;
tvd-tv_sec--;
}
}
void
signal_ignore(int s, siginfo_t *si, void *ctx)
{
}
int
_semtimedop(int semid, struct sembuf *array, size_t nops, struct
timespec *_timeout)
{
struct timeval timeout, before, after;
struct itimerval value, ovalue;
struct sigaction sa, osa;
int ret;

if (_timeout) {
timeout.tv_sec = _timeout-tv_sec;
timeout.tv_usec = _timeout-tv_nsec / 1000;

if (gettimeofday(before, NULL) == -1) {
errno = EFAULT;
return -1;
}

if (timeout.tv_sec == 0  timeout.tv_usec  5000) {
struct timeval tsleep, tend;
struct sembuf wait;

wait = *array;
wait.sem_flg |= IPC_NOWAIT;

tsleep.tv_sec = 0;
tsleep.tv_usec = 1;

timeradd(before, timeout, tend);

for ( ;; ) {
struct timeval tnow, tleft;
struct timespec ts;

ret = semop(semid, wait, nops);
if (ret == 0) {
return 0;
} else if (errno != EAGAIN) {
break;
}

if (gettimeofday(tnow, NULL) == -1) {
errno = EFAULT;
break;
}

if (timercmp(tnow, tend, =)) {
errno = EAGAIN;
break;
}

timersub(tend, tnow, tleft);

if (tsleep.tv_usec  tleft.tv_usec)
tsleep.tv_usec = tleft.tv_usec;

ts.tv_sec = 0;
ts.tv_nsec = tsleep.tv_usec * 1000;
nanosleep(ts, NULL);

tsleep.tv_usec *= 10;
}

return -1;
}

memset(value, 0, sizeof value);
value.it_value = timeout;

memset(sa, 0, sizeof sa);
sa.sa_sigaction = signal_ignore;
sa.sa_flags = SA_SIGINFO;
sigemptyset(sa.sa_mask);
sigaction(SIGALRM, sa, osa);

if (setitimer(ITIMER_REAL, value, ovalue) == -1) {
sigaction(SIGALRM, osa, NULL);
return -1;
}
}

ret = semop(semid, array, nops);

if (_timeout) {
ret = setitimer(ITIMER_REAL, ovalue, NULL);
if (ret  0)
errno = EFAULT;

sigaction(SIGALRM, osa, NULL);
}

if (ret == -1) {
if (_timeout) {
struct timeval elapsed;

if (gettimeofday(after, NULL) == -1) {
} else {

_timeval_diff(after, before, elapsed);

if (timercmp(elapsed, timeout, =))
errno = EAGAIN;
}
}

return -1;
}

return 0;
}
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-21 Thread John Baldwin
On Friday 18 July 2008 12:27:05 pm Sean C. Farley wrote:
 On Thu, 17 Jul 2008, Michael B Allen wrote:
 
 *snip*
 
  But I'll keep it in mind for the future. I don't recall why I chose
  System V semaphores originally. I think process-shared semantics in
  the POSIX implementations where not mature at the time. I would love
  to move away from System V semaphores. It's all too easy to leak them
  and trying to clean up on restart is dangerous.
 
 It is my understanding that process-shared is not currently supported at
 least in 7.
 
 Does anyone know if there is any intention of this being eventually
 supported?  I have needed this in the past but do not need it at the
 moment.  It would be nice to have someday.

There aren't currently plans, no.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-18 Thread Jilles Tjoelker
On Sat, Jul 12, 2008 at 07:11:26PM -0400, Michael B Allen wrote:
 Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
 was hoping someone could look it over and tell me if they think the
 implementation is sound.

 [snip semtimedop implementation that uses SIGALRM and relies on EINTR]

 The code seems to work ok but when stressing the FreeBSD build of my app
 I have managed to provoke errors related to concurrency (usually when a
 SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
 about this one critical function that is different.

In your implementation, the SIGALRM signal may happen before you even
call semop(2). If so, most likely the semop(2) will hang arbitrarily
long.

A somewhat dirty fix is to put a nonzero value into it_interval. This
will ensure that if a signal is missed another one will be generated
quickly.

You might be able to fix this using setjmp/longjmp, but this is neither
pretty nor efficient.

Another dirty fix is to try non-blocking semop(2) several times with
sleeps in between.

 Do you think it would make any difference if I used
 ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM?

This does not fix the inherent problem.

Also, given your use of signals in the first place, your application
must be single threaded. This means the ITIMER_VIRTUAL timer does not
run while semop(2) is waiting.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-18 Thread Sean C. Farley

On Thu, 17 Jul 2008, Michael B Allen wrote:

*snip*


But I'll keep it in mind for the future. I don't recall why I chose
System V semaphores originally. I think process-shared semantics in
the POSIX implementations where not mature at the time. I would love
to move away from System V semaphores. It's all too easy to leak them
and trying to clean up on restart is dangerous.


It is my understanding that process-shared is not currently supported at
least in 7.

Does anyone know if there is any intention of this being eventually
supported?  I have needed this in the past but do not need it at the
moment.  It would be nice to have someday.

Sean
--
[EMAIL PROTECTED]
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-17 Thread John Baldwin
On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote:
 Hi,
 
 Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
 was hoping someone could look it over and tell me if they think the
 implementation is sound.
 
 The code seems to work ok but when stressing the FreeBSD build of my app
 I have managed to provoke errors related to concurrency (usually when a
 SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
 about this one critical function that is different.
 
 Do you think it would make any difference if I used
 ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM?
 
 Or perhaps I should be using a different implementation entirely?

What specific races are you seeing?  The timer is firing too early, too late?

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-17 Thread Michael B Allen
On Thu, Jul 17, 2008 at 10:05 AM, John Baldwin [EMAIL PROTECTED] wrote:
 On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote:
 Hi,

 Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
 was hoping someone could look it over and tell me if they think the
 implementation is sound.

 The code seems to work ok but when stressing the FreeBSD build of my app
 I have managed to provoke errors related to concurrency (usually when a
 SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
 about this one critical function that is different.

 Do you think it would make any difference if I used
 ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM?

 Or perhaps I should be using a different implementation entirely?

 What specific races are you seeing?  The timer is firing too early, too late?

It's very difficult to tell. I can only trigger the issue very
occasionally running my torture test such that any diagnostic logging
changes the results.

And at this point I'm not sure my semtimedop implementation is
responsible. I have not seen the issue since fixing the race pointed
out by Mikko (although I have not tried very hard to provoke it).

For now, I'm satisfied since I do not think the issue will be
triggered in the wild. I hate to use signals for anything but as much
as I try, there's just no other way to implement semtimedop within a
single largely self-contained function. In the future I will likely
use another process in the application that uses select(2) as an
event service to post on semaphores after a certain time period.
Unfortunately, right now, that service ultimately calls semtimedop so
I'll save it for a rainy day.

Although if you implemented semtimedop(2) into the FreeBSD API that
would work too :-)

Thanks,
Mike

-- 
Michael B Allen
PHP Active Directory SPNEGO SSO
http://www.ioplex.com/
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-17 Thread John Baldwin
On Thursday 17 July 2008 01:42:31 pm Michael B Allen wrote:
 On Thu, Jul 17, 2008 at 10:05 AM, John Baldwin [EMAIL PROTECTED] wrote:
  On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote:
  Hi,
 
  Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
  was hoping someone could look it over and tell me if they think the
  implementation is sound.
 
  The code seems to work ok but when stressing the FreeBSD build of my app
  I have managed to provoke errors related to concurrency (usually when a
  SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
  about this one critical function that is different.
 
  Do you think it would make any difference if I used
  ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM?
 
  Or perhaps I should be using a different implementation entirely?
 
  What specific races are you seeing?  The timer is firing too early, too
  late?

 It's very difficult to tell. I can only trigger the issue very
 occasionally running my torture test such that any diagnostic logging
 changes the results.

 And at this point I'm not sure my semtimedop implementation is
 responsible. I have not seen the issue since fixing the race pointed
 out by Mikko (although I have not tried very hard to provoke it).

 For now, I'm satisfied since I do not think the issue will be
 triggered in the wild. I hate to use signals for anything but as much
 as I try, there's just no other way to implement semtimedop within a
 single largely self-contained function. In the future I will likely
 use another process in the application that uses select(2) as an
 event service to post on semaphores after a certain time period.
 Unfortunately, right now, that service ultimately calls semtimedop so
 I'll save it for a rainy day.

 Although if you implemented semtimedop(2) into the FreeBSD API that
 would work too :-)

POSIX semaphores (sem_open(3), sem_create(3), etc.) do have a 
sem_timedwait(3).  However, POSIX semaphores have several bugs in 6.x and 7.x 
(they should work a lot better in 8).  If you want I can give you a patch for 
6.x or 7.x that backports the 8.x POSIX semaphores.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-17 Thread Michael B Allen
On Thu, Jul 17, 2008 at 8:15 PM, John Baldwin [EMAIL PROTECTED] wrote:
 On Thursday 17 July 2008 01:42:31 pm Michael B Allen wrote:
 On Thu, Jul 17, 2008 at 10:05 AM, John Baldwin [EMAIL PROTECTED] wrote:
  On Saturday 12 July 2008 07:11:26 pm Michael B Allen wrote:
  Hi,
 
  Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
  was hoping someone could look it over and tell me if they think the
  implementation is sound.
 
  The code seems to work ok but when stressing the FreeBSD build of my app
  I have managed to provoke errors related to concurrency (usually when a
  SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
  about this one critical function that is different.
 
  Do you think it would make any difference if I used
  ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM?
 
  Or perhaps I should be using a different implementation entirely?
 
  What specific races are you seeing?  The timer is firing too early, too
  late?

 It's very difficult to tell. I can only trigger the issue very
 occasionally running my torture test such that any diagnostic logging
 changes the results.

 And at this point I'm not sure my semtimedop implementation is
 responsible. I have not seen the issue since fixing the race pointed
 out by Mikko (although I have not tried very hard to provoke it).

 For now, I'm satisfied since I do not think the issue will be
 triggered in the wild. I hate to use signals for anything but as much
 as I try, there's just no other way to implement semtimedop within a
 single largely self-contained function. In the future I will likely
 use another process in the application that uses select(2) as an
 event service to post on semaphores after a certain time period.
 Unfortunately, right now, that service ultimately calls semtimedop so
 I'll save it for a rainy day.

 Although if you implemented semtimedop(2) into the FreeBSD API that
 would work too :-)

 POSIX semaphores (sem_open(3), sem_create(3), etc.) do have a
 sem_timedwait(3).  However, POSIX semaphores have several bugs in 6.x and 7.x
 (they should work a lot better in 8).  If you want I can give you a patch for
 6.x or 7.x that backports the 8.x POSIX semaphores.

I can't ask my customers to patch their systems.

But I'll keep it in mind for the future. I don't recall why I chose
System V semaphores originally. I think process-shared semantics in
the POSIX implementations where not mature at the time. I would love
to move away from System V semaphores. It's all too easy to leak them
and trying to clean up on restart is dangerous.

Mike
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-13 Thread Mikko Työläjärvi

On Sun, 13 Jul 2008, Michael B Allen wrote:


Hi,

Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
was hoping someone could look it over and tell me if they think the
implementation is sound.

The code seems to work ok but when stressing the FreeBSD build of my app
I have managed to provoke errors related to concurrency (usually when a
SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
about this one critical function that is different.


At the very least you need to check errno when semop() returns -1.
Unless it is EINTR, you have other problems.

Also, if there is any other code using the timer across this function
call, you have race conditions between changing the signal handler and
setting the timer.  Even if there is no other use of the timer across
this function, resetting the signal handler before disarming the timer
leaves you open to the signal being handled by the default handler
which will make the process exit.

  $.02,
  /Mikko


Is there any reason why I would want to use ITIMER_VIRTUAL /
SIGVTALRM instead of ITIMER_REAL / SIGALRM?

Or perhaps I should be using a different implementation entirely?

Mike

int
_semtimedop(int semid,
  struct sembuf *array,
  size_t nops,
  struct timespec *_timeout)
{
  struct timeval timeout, before, after;
  struct itimerval value, ovalue;
  struct sigaction sa, osa;
  int ret;

  if (_timeout) {
  timeout.tv_sec = _timeout-tv_sec;
  timeout.tv_usec = _timeout-tv_nsec / 1000;

  if (gettimeofday(before, NULL) == -1) {
  return -1;
  }

  memset(value, 0, sizeof value);
  value.it_value = timeout;

  memset(sa, 0, sizeof sa);
  /* signal_print writes the signal info to a log file
   */
  sa.sa_sigaction = signal_print;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(sa.sa_mask);
  sigaction(SIGALRM, sa, osa);

  if (setitimer(ITIMER_REAL, value, ovalue) == -1) {
  sigaction(SIGALRM, osa, NULL);
  return -1;
  }
  }

  ret = semop(semid, array, nops);

  if (_timeout) {
  sigaction(SIGALRM, osa, NULL);

  if (setitimer(ITIMER_REAL, ovalue, NULL) == -1) {
  return -1;
  }
  }

  if (ret == -1) {
  if (_timeout) {
  struct timeval elapsed;

  if (gettimeofday(after, NULL) == -1) {
  return -1;
  }

  _timeval_diff(after, before, elapsed);

  if (timercmp(elapsed, timeout, =))
  errno = EAGAIN;
  }

  return -1;
  }

  return 0;
}
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Pls sanity check my semtimedop(2) implementation

2008-07-13 Thread Michael B Allen
On 7/13/08, Mikko Työläjärvi [EMAIL PROTECTED] wrote:
 On Sun, 13 Jul 2008, Michael B Allen wrote:


  Hi,
 
  Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
  was hoping someone could look it over and tell me if they think the
  implementation is sound.
 
  The code seems to work ok but when stressing the FreeBSD build of my app
  I have managed to provoke errors related to concurrency (usually when a
  SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
  about this one critical function that is different.
 

  At the very least you need to check errno when semop() returns -1.
  Unless it is EINTR, you have other problems.

  Also, if there is any other code using the timer across this function
  call, you have race conditions between changing the signal handler and
  setting the timer.  Even if there is no other use of the timer across
  this function, resetting the signal handler before disarming the timer
  leaves you open to the signal being handled by the default handler
  which will make the process exit.

Hi Mikko,

So if some other code uses setitimer(2) for whatever reason, then I
have the potential for a race. I'm not aware of any other such
instances of setitimer but my app is actually a plugin for a larger
application so I can't entirely rule out the possibility.

Is there any facility for creating a stateful timer so that I don't
run into this problem?

Can anyone provide the basis for an alternative implementation?

Should I use select(2) instead?

Mike

  Is there any reason why I would want to use ITIMER_VIRTUAL /
  SIGVTALRM instead of ITIMER_REAL / SIGALRM?
 
  Or perhaps I should be using a different implementation entirely?
 
  Mike
 
  int
  _semtimedop(int semid,
   struct sembuf *array,
   size_t nops,
   struct timespec *_timeout)
  {
   struct timeval timeout, before, after;
   struct itimerval value, ovalue;
   struct sigaction sa, osa;
   int ret;
 
   if (_timeout) {
   timeout.tv_sec = _timeout-tv_sec;
   timeout.tv_usec = _timeout-tv_nsec / 1000;
 
   if (gettimeofday(before, NULL) == -1) {
   return -1;
   }
 
   memset(value, 0, sizeof value);
   value.it_value = timeout;
 
   memset(sa, 0, sizeof sa);
   /* signal_print writes the signal info to a log file
*/
   sa.sa_sigaction = signal_print;
   sa.sa_flags = SA_SIGINFO;
   sigemptyset(sa.sa_mask);
   sigaction(SIGALRM, sa, osa);
 
   if (setitimer(ITIMER_REAL, value, ovalue) == -1) {
   sigaction(SIGALRM, osa, NULL);
   return -1;
   }
   }
 
   ret = semop(semid, array, nops);
 
   if (_timeout) {
   sigaction(SIGALRM, osa, NULL);
 
   if (setitimer(ITIMER_REAL, ovalue, NULL) == -1) {
   return -1;
   }
   }
 
   if (ret == -1) {
   if (_timeout) {
   struct timeval elapsed;
 
   if (gettimeofday(after, NULL) == -1) {
   return -1;
   }
 
   _timeval_diff(after, before, elapsed);
 
   if (timercmp(elapsed, timeout, =))
   errno = EAGAIN;
   }
 
   return -1;
   }
 
   return 0;
  }
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]