Re: IPC, shared memory, syncronization AND threads...

2000-08-16 Thread Peter Jeremy

On Tue, 15 Aug 2000 10:30:25 -0600 (MDT), Ronald G Minnich [EMAIL PROTECTED] wrote:
 The idea is simple: tset is the fastest, but you only want to spin so
 long. Then you want to drop into the kernel, and wait for someone to wake
 you up.

Agreed.

 Here's a simple test-and-set function for the 386 (tested and works):

 int
 tset(int *i, int lockval, int unlockval)
 {
   int j = 0;
 asm("movl 16(%ebp), %eax");
 asm("movl 8(%ebp),%ecx");
 asm("movl 12(%ebp),%edx");
 asm("cmpxchg %edx, (%ecx)");
 asm("jne failed");
 asm("movl %eax, -4(%ebp)");
 asm("jmp done");
 asm("failed: movl %eax, -4(%ebp)");
 asm("done:");
   return j;
 }

Actually, this isn't particularly good coding.  It isn't SMP-safe.  If
you compile it with -fomit-frame-pointer or -fomit-leaf-frame-pointer,
it won't work (and will corrupt some innocent, probably stack,
memory).  When the code is optimised, it works as much by accident as
design.  And the documentation for gcc indicates that sequences of asm
statements can be re-ordered.

Something like the following should be somewhat safer.  It returns
unlockval if the semaphore was not locked, otherwise it returns the
current contents of the semaphore (which seems to be the same as
your code).

int
tset(int *i, int lockval, int unlockval)
{
    int j;

    __asm__("lock cmpxchg %2, (%3)"
        : "=a" (j)
        : "0" (unlockval), "r" (lockval), "r" (i)
        : "memory", "cc");

    return j;
}
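
For comparison, here is a minimal sketch of the same contract written with
C11 <stdatomic.h>, which did not exist when this was posted.  The name
tset_c11 and the use of atomic_int are assumptions made for illustration;
the compiler chooses the instruction (a locked cmpxchg on x86), so no
frame-pointer or register assumptions are needed.

```c
#include <stdatomic.h>

/*
 * Same contract as tset() above: if *i equals unlockval, atomically
 * store lockval and return unlockval; otherwise leave *i alone and
 * return the value that was found there.
 * (tset_c11 is a hypothetical name, not part of any library.)
 */
int
tset_c11(atomic_int *i, int lockval, int unlockval)
{
    int expected = unlockval;

    /* On failure, expected is overwritten with the current contents of *i. */
    atomic_compare_exchange_strong(i, &expected, lockval);
    return expected;
}
```

A caller treats a return value equal to unlockval as "lock acquired" and
anything else as "someone else holds it".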

Peter


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: IPC, shared memory, syncronization AND threads...

2000-08-16 Thread Peter Dufault

I've looked at the POSIX spec to find the right way to portably
implement low overhead process synchronization.

I think the right way is to add _POSIX_THREAD_PROCESS_SHARED support
so that mutexes can be shared between processes.

The spec is vague on a couple of points.  I don't see that you can
reject "pthread_mutexattr_setpshared()" with EINVAL, and I don't
clearly see that the mutexes in existence after a fork are defined
to be a new set of mutexes with identical existing values.  I have
to assume they are, since mutexes can be statically allocated and the
only way to ensure they are in a shared region would be for the new
process to share the pages that happen to contain mutexes with that
attribute with the old one.

Assuming unique mutexes after a fork unless they happen to be in
a shared region, you could create a mutex in shared memory, apply
pthread_mutexattr_setpshared() to it, and then have the usual
pthread_mutex_lock()/pthread_mutex_unlock() interface support the
high performance synchronization.

Peter

--
Peter Dufault ([EMAIL PROTECTED])   Realtime development, Machine control,
HD Associates, Inc.   Fail-Safe systems, Agency approval





Re: IPC, shared memory, syncronization AND threads...

2000-08-16 Thread Peter Dufault

Here's the kind of thing I have in mind, wrapped around the pthreads
mutexes.  This replaces default pthread mutexes (those with no special
attributes) with possibly fast ones.  I haven't done any real timing but
I've verified that a program I have works and runs a lot faster with
these wrappers.  Obviously you have to use -include mutex.h and various
-D flags to try this out.

Header:

#ifndef _MUTEX_H_
#define _MUTEX_H_

#include <pthread.h>

struct _pthread_mutex;
typedef struct _pthread_mutex *_pthread_mutex_t;

int _pthread_mutex_init(_pthread_mutex_t *, const pthread_mutexattr_t *);
int _pthread_mutex_lock(_pthread_mutex_t *);
int _pthread_mutex_unlock(_pthread_mutex_t *);

#endif /* _MUTEX_H_ */

Wrappers:

#include <stdio.h>
#include <stdlib.h>

#include <pthread.h>

typedef unsigned long castype;  /* Data type you can CAS */
extern int cas(volatile castype *, const castype, const castype);

struct _pthread_mutex {
castype lock;
struct pthread_mutex *strong_lock;
struct pthread_cond *done_unlocking;
struct pthread_mutex *unlock_lock;
int unlocking;
};

#include "mutex.h"

/*
 * _PTHREAD_MUTEX_INITIALIZER would have to be something like:
 */
#define _PTHREAD_MUTEX_INITIALIZER { \
0,\
PTHREAD_MUTEX_INITIALIZER,\
PTHREAD_COND_INITIALIZER,\
PTHREAD_MUTEX_INITIALIZER,\
0}

/* cas: Compare and swap.  Return 1 if it succeeds, zero if it
 * doesn't.
 */
int
cas(volatile castype *ptr, const castype o, const castype n)
{
volatile int result = 0;

__asm   __volatile(
"cmpxchg%L3 %3,%1; jne 0f; inc%L0 %0; 0:"
:   "=m"(result)
:   "m"(*ptr), "a"(o), "r"(n)
);

return result;
}

static void
oops(const char *s)
{
fprintf(stderr, "_pthread_mutex: %s.\n", s);
}

/* Here's the init.  If there are any non-standard attributes use a default.
 */
int
_pthread_mutex_init(_pthread_mutex_t *mp, const pthread_mutexattr_t *attr)
{
    struct _pthread_mutex *m = (struct _pthread_mutex *)calloc(1, sizeof(*m));

    *mp = m;    /* hand the mutex back on both paths, not just the default one */

    if (attr) {
        m->lock = 4;
        return pthread_mutex_init(&m->strong_lock, attr);
    }

    return 0;
}

/* mutex_lock: 
 * First try to go from 0 to 1.  If that works, we have the lock.
 *
 * Then see if it is 4.  That means it is a non-default lock,
 * just call the standard locker.  These two steps could be published
 * in the header along with the structure to make things inlineable.
 *
 * Finally go into a loop, doing the first step again, then:
 *
 * If it fails, set it from 1 to 2 to let the unlock know someone
 * is waiting and then call the existing lock.
 * If it is already at 2 just call the existing lock.
 *
 * The only thing left is that it is being unlocked, that is, it must be
 * 3.  Wait for the unlocker to do its thing then repeat.
 */

int _pthread_mutex_lock(_pthread_mutex_t *mp)
{
    struct _pthread_mutex *m = *mp;

    if (cas(&m->lock, 0, 1))    /* Try for the fast lock. */
        return 0;

    if (m->lock == 4)
        return pthread_mutex_lock(&m->strong_lock); /* Not default lock */

    while (1) {
        if (cas(&m->lock, 0, 1))    /* Try for the fast lock. */
            break;

        if (cas(&m->lock, 1, 2) || cas(&m->lock, 2, 2))
            return pthread_mutex_lock(&m->strong_lock);

        pthread_mutex_lock(&m->unlock_lock);

        /* It is being unlocked, which can take a long time.
         * Wait for the unlocker
         */

        while (m->unlocking) {
            pthread_cond_wait(&m->done_unlocking, &m->unlock_lock);
        }

        pthread_mutex_unlock(&m->unlock_lock);
    }

    return 0;
}

/* unlock: First try to do a fast unlock and also handle the special
 * attributes case.
 * Then we must be the unlocker (there should only be one), so set the
 * flag so people know we're unlocking.
 *
 * Do the unlock and then signal anyone waiting.
 */
int _pthread_mutex_unlock(_pthread_mutex_t *mp)
{
    struct _pthread_mutex *m = *mp;
    if (cas(&m->lock, 1, 0))    /* Try for the fast unlock */
        return 0;

    if (m->lock == 4)
        return pthread_mutex_unlock(&m->strong_lock);   /* Not default lock */

    /* It has to be 2 or there are multiple unlocks going on.
     */
    if (!cas(&m->lock, 2, 3))
        oops("multiple unlocks?");

    pthread_mutex_lock(&m->unlock_lock);
    m->unlocking = 1;
    pthread_mutex_unlock(&m->strong_lock);
    m->unlocking = 0;
    pthread_mutex_unlock(&m->unlock_lock);

    pthread_cond_signal(&m->done_unlocking);

    return 0;
}





Re: IPC, shared memory, syncronization AND threads...

2000-08-16 Thread Ronald G Minnich

On Wed, 16 Aug 2000, Peter Jeremy wrote:

 Here's a simple test-and-set function for the 386 (tested and works):
 Actually, this isn't particularly good coding.  It isn't SMP-safe.  

you caught me! I'm a lousy assembly programmer! 

Actually, that code is so old it predates SMP by a bit ... 

I like your improved version ... that one goes in my archives. 

Thanks!

ron






Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread John Polstra

In article [EMAIL PROTECTED], Jonas Bulow
[EMAIL PROTECTED] wrote:
 John Polstra wrote:

  I think the ideal solution would first try to lock the
  test-and-set lock, maybe spinning on it just a few times.  If that
  failed it would fall back to using a system-call lock such as
  flock() which would allow the process to block without spinning.
  But I don't have any code to do that.  (If you write some, could I
  have a copy?)

 I am about to.

Actually I thought about this some more, and I'm not all that sure
it's possible.  I haven't actually _tried_ it, but I think you'd end
up needing a low-level mutex around parts of the code.  That would
have to be implemented as a spinlock, which is exactly what we're
trying to avoid in this exercise.

 I don't know if it's bad design to have rtld.c export
 lockdflt_init in the same way as dlopen, what do you think?

Right, bad design. :-)

John

-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread Ronald G Minnich

OK, here's a note from long ago, when this came up before. 
Dated:  Tue Jul  2 10:48:16 1996

The idea is simple: tset is the fastest, but you only want to spin so
long. Then you want to drop into the kernel, and wait for someone to wake
you up. This thing was quite fast on freebsd, even four years ago. In fact
I have yet to see anything faster, but I'm willing to be corrected.

--

Here's a simple test-and-set function for the 386 (tested and works):

int
tset(int *i, int lockval, int unlockval)
{
  int j = 0;
asm("movl 16(%ebp), %eax");
asm("movl 8(%ebp),%ecx");
asm("movl 12(%ebp),%edx");
asm("cmpxchg %edx, (%ecx)");
asm("jne failed");
asm("movl %eax, -4(%ebp)");
asm("jmp done");
asm("failed: movl %eax, -4(%ebp)");
asm("done:");
  return j;
}

This will run a bit faster than a file lock system call :-). But what about 
contention? notice that this function, if it fails, fails. so we need a 
retry strategy. IF you fail, should you spin forever? NO! Do that, and 
you eat the processor doing nothing. You ought to have a reasonable way 
to, say, spin one or two more times, then go to a more heavyweight sleep. 
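
As a rough illustration of that retry strategy only (this is not the
fastlock code that follows): the sketch below spins a bounded number of
times and then sleeps.  It assumes C11 atomics, and it uses nanosleep() as
a stand-in for a real kernel wait, so a sleeper is not woken by the
unlocker; it simply polls again after its nap.  SPIN_LIMIT, the 1 ms nap
and the function names are arbitrary choices for the example.

```c
#include <stdatomic.h>
#include <time.h>

#define SPIN_LIMIT 3    /* assumed: how many tries before giving up the CPU */

/*
 * Take the lock (0 = free, 1 = held).  Returns how many times the caller
 * had to sleep before it got the lock; 0 means the fast path succeeded.
 */
int
spin_then_sleep_lock(atomic_int *lock)
{
    struct timespec nap = { 0, 1000000 };   /* 1 ms, arbitrary */
    int sleeps = 0;

    for (;;) {
        for (int tries = 0; tries < SPIN_LIMIT; tries++) {
            int expected = 0;

            if (atomic_compare_exchange_strong(lock, &expected, 1))
                return sleeps;
        }
        nanosleep(&nap, NULL);  /* stand-in for sleeping in the kernel */
        sleeps++;
    }
}

void
spin_then_sleep_unlock(atomic_int *lock)
{
    atomic_store(lock, 0);
}
```

The weakness of polling is latency: a waiter can oversleep the unlock by up
to one nap, which is what an explicit sleep/wakeup call in the kernel
avoids.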

SO: here's the fastlock library call (#ifdef USEMOD is for LKM )

void
fastlock(int *address, int lockval, int unlockval)
{
int testval;
#ifdef USEMOD
  static int syscallnum = -1;
  if (syscallnum < 0)
    syscallnum = syscallfind("fastlock");
  if (syscallnum < 0) {
    perror("fastlock syscallfind");
    return;
  }
#endif

  testval = tset(address, lockval, unlockval);
  if (testval == unlockval) {
#ifdef FASTLOCKDEBUG
printf("(%d)fastlock quickout\n", getpid());
#endif
return;
  }
  /* attempt to lock failed. test then wait in kernel sleep() */
  while (1)
{
  /* set the high-order bit. This tells the unlocker to do the system
   * call and free up the lock.
   */
  (void) tset(address, testval|0x8000,testval);
#ifdef FASTLOCKDEBUG
  printf("(%d)hang in there\n", getpid());
#endif
  /* we should be checking here to make sure that high-order bit is 
   * set. But this second tset fails only 
   * in the event of contention, in which case 
   * someone else has set the high-order
   * bit too ... seems pointless, esp. given that fastlock has a timeout
   */
  syscall(syscallnum, 1, address, unlockval);
  testval = tset(address, lockval, unlockval);
  if (testval == unlockval)
return;
}
  
}

So what are we doing? We're doing the tset. If it fails, then we do one 
more tset, to set the high order bit, then drop into the fastlock system 
call. Once we return from that, we try to tset the variable again. If 
that fails, we drop once again into the system call. 

Here's fastunlock: 
void
fastunlock(int *address, int unlockval)
{
  int dosyscall = 0;
  static int syscallnum = -1; /* this is really in the file */
#ifdef USEMOD
  if (syscallnum < 0)
    syscallnum = syscallfind("fastlock");
  if (syscallnum < 0) {
    perror("fastunlock syscallfind");
    return;
  }
#endif
  if (*address & 0x8000)
    dosyscall = 1;
  *address = unlockval;
#ifdef FASTLOCKDEBUG
  printf("(%d)fastunlock dosyscall is %d\n", getpid(), dosyscall);
  if (dosyscall) printf("conflict %d\n", getpid());
  fflush(stdout);
#endif
  if (dosyscall)
syscall(syscallnum, 0, address, unlockval);
}

Ok, this one tests to see if it needs to wake any sleepers, clears the 
memory variable, then drops into the kernel if needed (if (dosyscall) ...)

Here's the system call. Note several things: 
1) the definition of 'unlocked' is passed in to the system call for the 
final test, not assumed to be zero. 
2) The 'address' argument does NOT NEED TO BE AN ADDRESS. it's a number
   that all the procs have to agree on, that is all.
3) if you accidentally wake more than one sleeper, the loop in fastlock
   handles that properly
4) This system call handles both waking up and sleeping
5) For my measurements, this thing is a whole lot faster than anything
   else available on (e.g.) freebsd. Questions to me. 

int
fastlock(p, uap, retval)
struct proc *p;
struct flu *uap;
int retval[];
{
    extern int hz;
    retval[0] = 0;
    /*
    printf("fastlockunlock: com %d address 0x%x unlocked %d\n",
        uap->com, uap->address, uap->unlocked);
     */
    if (uap->com == 0) /* unlock */
        wakeup((void *) uap->address);
    else
    {
        /* last chance */
        /* try one last time to see if it is unlocked */
        int curval = fuword((void *) uap->address);
        if (curval == uap->unlocked)
            return 0;
        tsleep((void *) uap->address, PUSER, NULL, 10*hz);

Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread Jonas Bulow

John Polstra wrote:
 
 In article [EMAIL PROTECTED], Jonas Bulow
 [EMAIL PROTECTED] wrote:
  John Polstra wrote:
 Actually I thought about this some more, and I'm not all that sure
 it's possible.  I haven't actually _tried_ it, but I think you'd end
 up needing a low-level mutex around parts of the code.  That would
 have to be implemented as a spinlock, which is exactly what we're
 trying to avoid in this exercise.

What do you mean with low-level mutex? I mean, how low is low? :-)

After doing some more thinking about the cmpxchgl-lock, it's quite hard
to use it together with a technique involving the kernel. It will be a
contradiction in many ways. It would be nice if kqueue had an EVFILT_MEM
filter that waits for the contents of a memory address to reach a specific
value (or another condition like a threshold, or entering/leaving a range).
Then it could be used to wait on the address used with cmpxchgl. Well, this
was just thinking for this very moment.

 
  I don't know if it's bad design to have rtld.c export
  lockdflt_init in the same way as dlopen, what do you think?
 
 Right, bad design. :-)

Just checking. :-)





Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread Ronald G Minnich

On Tue, 15 Aug 2000, Jonas Bulow wrote:

 After doing some more thinking about the cmpxchgl-lock, it's quite hard
 to use it together with a technique involving the kernel. 

well, no I don't think it is. I used to use it a lot, see my earlier post
from today. 

ron






Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread Peter Dufault

 On Tue, 15 Aug 2000, Jonas Bulow wrote:
 
  After doing some more thinking about the cmpxchgl-lock, it's quite hard
  to use it together with a technique involving the kernel. 
 
 well, no I don't think it is. I used to use it a lot, see my earlier post
 from today. 

One point to keep in mind is that you will get a big win from these approaches
in low-contention situations.

I've got something that I use in bus simulations. When a client
process doesn't get the compare-and-swap lock in a few instructions
it has to request it from a master process via a single master-request-FIFO
which eventually answers it back through a per-client-response FIFO,
continuing the blocked clients in the order that they missed the lock,
and using up system calls and context switches to ensure that.

Thus when I lose, I lose big time, but it gives me the ordering I want.
When I win I wind up with no system calls.  Sort of like
a cache.

If I were losing a lot I'd have to rethink the approach and use something
to divvy up the work properly.  If ordering isn't required and you
aren't worried about starvation, it simplifies things a lot because
you can have a Linda-like pool of work requests to hand out to a swarm
of worker bees.

Peter

--
Peter Dufault ([EMAIL PROTECTED])   Realtime development, Machine control,
HD Associates, Inc.   Fail-Safe systems, Agency approval





Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread Gary T. Corcoran


Peter Dufault wrote:
 
 you can have a Linda-like pool of work requests to hand out to a swarm of worker bees.
 ^^^

Could you please decode this for me?  :)

Thanks,
Gary





Re: IPC, shared memory, syncronization AND threads...

2000-08-15 Thread Lithix

| you can have a Linda-like pool of work requests to hand out to a swarm of
| worker bees.
|  ^^^
| Could you please decode this for me?  :)

This page talks about Linda; check out the "Linda Basics" section and read
about tuples.
http://www.sca.com/ltutorial.html





Re: IPC, shared memory, syncronization AND threads...

2000-08-14 Thread Jonas Bulow

John Polstra wrote:
 Jonas Bulow wrote
  Maybe I haven't been thinking enough but wouldn't this lock mechanism
  be a good choice to use for mmap:ed memory accessed by multiple
  processes?
 
 It depends on the amount of contention you expect.  The code in
 lockdflt.c was designed for a very low-contention situation (usually
 no contention at all).  It also had to work in a very restrictive
 environment where the threads package was unknown and could be
 practically anything.  Also it was designed to do locking between two
 threads running in the same process, which is not the problem you're
 trying to solve.  Your environment is much more controlled, so you can
 probably do better.

I think I'm trying to solve the threading-problem too. The overall
architecture is a preforked server where there is a need to share
information between all threads in all preforked processes. The solution
below seems to be good if flock doesn't block all threads in a process,
that is.

 
 I think the ideal solution would first try to lock the test-and-set
 lock, maybe spinning on it just a few times.  If that failed it would
 fall back to using a system-call lock such as flock() which would
 allow the process to block without spinning.  But I don't have any
 code to do that.  (If you write some, could I have a copy?)

I am about to. I don't know if it's bad design to have rtld.c export
lockdflt_init in the same way as dlopen, what do you think?

/jonas





Re: IPC, shared memory, syncronization

2000-08-13 Thread Nate Williams

   I don't know about the "bsd" or whatever way. If you're doing real
   parallel programming and want real performance, you'll use a test-and-set
   like function that uses the low-level machine instructions for same.
  
  That is exactly what I'm looking for! I found it to be overkill to
  involve the kernel just because I wanted to have a context switch during
  the "test-and-set".
 
 Precisely how do you expect to "have a context switch" without "involving
 the kernel"?

If your threads are implemented wholly in userland, you can easily do a
context switch w/out involving the kernel.  Our current pthreads library
does this now, and the JDK's internal (green) threads implementation
does it as well.



Nate





Re: IPC, shared memory, syncronization

2000-08-13 Thread Jonas Bulow

Wes Peters wrote:
 
 Jonas Bulow wrote:
 
  Ronald G Minnich wrote:
  
   I don't know about the "bsd" or whatever way. If you're doing real
   parallel programming and want real performance, you'll use a test-and-set
   like function that uses the low-level machine instructions for same.
 
  That is exactly what I'm looking for! I found it to be overkill to
  involve the kernel just because I wanted to have a context switch during
  the "test-and-set".
 
 Precisely how do you expect to "have a context switch" without "involving
 the kernel"?

Sorry, I missed an important word in the sentence above, namely "not". I
don't want to have a context switch during the test-and-set operation.
Now that I have found the code in lockdflt.c (rtld-elf), that doesn't seem
to be a problem.

/j





Re: IPC, shared memory, syncronization

2000-08-12 Thread Jonas Bulow

John Polstra wrote:
 If you want the "BSD way" you should probably create a 0-length
 temporary file somewhere and use the flock(2) system call on it.  The
 file itself isn't important; it's just something to lock.

I don't see any reason to do system calls just because I want to do an
atomic operation (i.e., acquire/release a lock).

I found some really good code (:-)) in
/usr/src/libexec/rtld-elf/i386/lockdflt.c that seems to do what I want.
It's more the "i386"-way than the BSD-way. Maybe I haven't been thinking
enough but wouldn't this lock mechanism be a good choice to use for
mmap:ed memory accessed by multiple processes? 

In lock_create the lock is aligned to CACHE_LINE_SIZE. Why is that
important?





Re: IPC, shared memory, syncronization

2000-08-12 Thread Brian Fundakowski Feldman

On Sat, 12 Aug 2000, Jonas Bulow wrote:

 John Polstra wrote:
  If you want the "BSD way" you should probably create a 0-length
  temporary file somewhere and use the flock(2) system call on it.  The
  file itself isn't important; it's just something to lock.
 
 I don't see any reason to do system calls just because I want to do an
 atomic operation (i.e., acquire/release a lock).
 
 I found some really good code (:-)) in
 /usr/src/libexec/rtld-elf/i386/lockdflt.c that seems to do what I want.
 It's more the "i386"-way than the BSD-way. Maybe I haven't been thinking
 enough but wouldn't this lock mechanism be a good choice to use for
 mmap:ed memory accessed by multiple processes? 

I was just going to suggest this =) The best way to go about this
method is, IMHO, to map a range of memory you'll get "locks" from and
use that as a zone-type allocator.  For the most part, you can reuse
lockdflt.c for the i386 and alpha archs and it will probably work well
:)

The caveats are that the locks themselves need to live in mmap()-shared
memory, and if you're not threaded you'll probably want to take all the
signal-related stuff out.  If you don't need shared locks,
you can simplify things by just using the machdep cmpxchgl() and
cmp0_and_store_int() routines, probably along with a
nanosleep() like the rtld-elf code does.

I assume if you were doing things with threads, you'd be using the
pthread_mutex_t routines, of course ;)

 In lock_create the lock is aligned to CACHE_LINE_SIZE. Why is that
 important?

I'm thinking it's to keep things in one line of the data cache so as
to not impact performance more than necessary.  I didn't really pay
attention to this part of the implementation, but it makes sense to me
:)

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'






Re: IPC, shared memory, syncronization

2000-08-12 Thread Wes Peters

Jonas Bulow wrote:
 
 Ronald G Minnich wrote:
 
  I don't know about the "bsd" or whatever way. If you're doing real
  parallel programming and want real performance, you'll use a test-and-set
  like function that uses the low-level machine instructions for same.
 
 That is exactly what I'm looking for! I found it to be overkill to
 involve the kernel just because I wanted to have a context switch during
 the "test-and-set".

Precisely how do you expect to "have a context switch" without "involving
the kernel"?

-- 
"Where am I, and what am I doing in this handbasket?"

Wes Peters Softweyr LLC
[EMAIL PROTECTED]   http://softweyr.com/





Re: IPC, shared memory, syncronization

2000-08-12 Thread John Polstra

In article [EMAIL PROTECTED],
Jonas Bulow  [EMAIL PROTECTED] wrote:
 John Polstra wrote:
  If you want the "BSD way" you should probably create a 0-length
  temporary file somewhere and use the flock(2) system call on it.  The
  file itself isn't important; it's just something to lock.
 
 I don't see any reason to do system calls just because I want to do an
 atomic operation (i.e., acquire/release a lock).

Fair enough.  It's a trade-off between simplicity and overhead and it
depends also on the amount of contention you expect for the locks.

 I found some really good code (:-)) in
 /usr/src/libexec/rtld-elf/i386/lockdflt.c

:-)

 that seems to do what I want.  It's more the "i386"-way than the
 BSD-way.

Well, there is alpha code there too. :-)

 Maybe I haven't been thinking enough but wouldn't this lock mechanism
 be a good choice to use for mmap:ed memory accessed by multiple
 processes?

It depends on the amount of contention you expect.  The code in
lockdflt.c was designed for a very low-contention situation (usually
no contention at all).  It also had to work in a very restrictive
environment where the threads package was unknown and could be
practically anything.  Also it was designed to do locking between two
threads running in the same process, which is not the problem you're
trying to solve.  Your environment is much more controlled, so you can
probably do better.

I think the ideal solution would first try to lock the test-and-set
lock, maybe spinning on it just a few times.  If that failed it would
fall back to using a system-call lock such as flock() which would
allow the process to block without spinning.  But I don't have any
code to do that.  (If you write some, could I have a copy?)

 In lock_create the lock is aligned to CACHE_LINE_SIZE. Why is that
 important?

It's just more efficient that way.  The spinlock tends to hammer the
cache line containing the lock (i.e., invalidate the whole line over
and over).  If anything else is also accessing the same cache line,
there is a big performance penalty.
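
A small sketch of the idea, assuming C11 and a 64-byte line (the real
CACHE_LINE_SIZE is CPU-specific, and lockdflt.c does its own alignment by
hand).  The struct and function names here are made up for illustration.

```c
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed value for the example */

/*
 * Aligning the lock word to the line size also rounds the struct size up
 * to a full line, so nothing else can land in the same cache line.
 */
struct padded_lock {
    alignas(CACHE_LINE_SIZE) atomic_int locked;
};

/* Two adjacent locks: each one starts on its own cache line. */
struct padded_lock lock_table[2];

/* Returns 1 if the two locks fall in the same cache line, 0 otherwise. */
int
locks_share_cache_line(void)
{
    uintptr_t a = (uintptr_t)&lock_table[0] / CACHE_LINE_SIZE;
    uintptr_t b = (uintptr_t)&lock_table[1] / CACHE_LINE_SIZE;

    return a == b;
}
```

Without the alignment, two 4-byte locks in an array would sit side by side
in one line, and spinning on one would keep invalidating the other.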

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






IPC, shared memory, syncronization

2000-08-11 Thread Jonas Bulow

What is the "BSD way" of making access to shared (mmap:ed) memory safe
(avoiding race conditions, etc.)? Right now I'm using posix semaphores but
I would like to know if there is a substitute like the way kqueue is for
select/poll.





Re: IPC, shared memory, syncronization

2000-08-11 Thread Jonas Bulow

Jonas Bulow wrote:
 
 What is the "BSD-way" of access to shared memory (mmap:ed) secure (avoid
 race conditions, etc)? Right now I'm using posix semaphores but I would
 like to know if there is a substitute like the way kqueue is for
 select/poll.

Hmm, I think I lost some word and deeper thought in my previous mail.
:-)

The problem is as follows:

I have a couple of processes using a mmap:ed file as common data area.
What I want to do is to make it safe for all processes to update data in
this common memory area. I was thinking about using some part of the
common data area for semaphores in some way. I just want a simple
"test-and-set" operation I can use to make sure there is only one
process writing to the common data area.

To take the "test-and-set" further, I would like to make the process
wait for the lock to be released.

Can anyone give me hint how this is best implemented with FreeBSD as OS?

I apologize if this is not a purely FreeBSD-related question but I could not
find a better forum for it. I could only find solutions
based on posix semaphores.





Re: IPC, shared memory, syncronization

2000-08-11 Thread John Polstra

In article [EMAIL PROTECTED],
Jonas Bulow  [EMAIL PROTECTED] wrote:
 Jonas Bulow wrote:
  
  What is the "BSD-way" of access to shared memory (mmap:ed) secure (avoid
  race conditions, etc)? Right now I'm using posix semaphores but I would
  like to know if there is a substitute like the way kqueue is for
  select/poll.
 
 Hmm, I think I lost some word and deeper thought in my previous mail.
 :-)
 
 The problem is as follows:
 
 I have a couple of processes using a mmap:ed file as common data area.
 What I want to do is to make it safe for all processes to update data in
 this common memory area. I was thinking about using some part of the
 common data area for semaphores in some way. I just want a simple
 "test-and-set" operation I can use to make sure there is only one
 process writing to the common data area.

If you want the "BSD way" you should probably create a 0-length
temporary file somewhere and use the flock(2) system call on it.  The
file itself isn't important; it's just something to lock.
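
A minimal sketch of that approach.  The lockfile_* wrapper names are
hypothetical; the calls underneath are plain open(2) and flock(2).  Each
cooperating process opens the same path and holds the lock while it
touches the shared data.

```c
#include <sys/file.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open (creating if needed) the empty file that exists only to be locked. */
int
lockfile_open(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Block until the exclusive lock is held.  Returns 0, or -1 with errno set. */
int
lockfile_acquire(int fd)
{
    return flock(fd, LOCK_EX);
}

/* Non-blocking variant: returns -1 with errno EWOULDBLOCK if already held. */
int
lockfile_try_acquire(int fd)
{
    return flock(fd, LOCK_EX | LOCK_NB);
}

int
lockfile_release(int fd)
{
    return flock(fd, LOCK_UN);
}
```

Every acquire and release here is a system call, which is the overhead the
rest of the thread is trying to avoid; in exchange, a waiter blocks in the
kernel instead of spinning, and the lock is dropped automatically if the
holder exits.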

Or you could use semop(2) on semaphores.  But that's the SYSV way, not
the BSD way.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa


