Re: memory allocation (was Re: mod_include performance numbers)

2001-04-22 Thread dean gaudet

can you give a short description of this allocator?

-dean

On Sat, 21 Apr 2001, Cliff Woolley wrote:

 On Sat, 21 Apr 2001, Greg Ames wrote:

  are you thinking about an atomic push/pop block allocator?  I'll be
  happy to help out if so, especially with the machine instruction level
  stuff.

 Yes, I am, and I definitely *will* need help from various people getting
 the appropriate machine language magic for their platforms working.  I've
 already done the generic fallback locking implementation of the stack;
 that was simple.  And I have what you and Jeff gave me for S390.  But...

  yeah, but that stuff can go away.  compare-and-exchange (or compare and
  swap, or load and reserve, or...) is our friend in multithreaded systems,
  especially on multiprocessors.  The piece I haven't figured out is how
  to set up CPU architecture dependent directories, or macros, or
  whatever, in APR.  sigh

 THAT's the problem.  No other piece of APR is *architecture* dependent.
 There's an "arch" include directory, but it's really a misnamed "os"
 directory.  I've got a scheme implemented that gives us an APR_ARCH_IS_foo
 macro, but it's a hack.  It disobeys the typical APR rule of "all macros
 are always defined and have a value of 0 or 1" and is just defined or not.
 I've also been afraid that some of the possible ways to implement these
 machine-language tricks will also be *compiler* dependent, not just
 architecture dependent.  If so, then that makes this that much harder.
 For example, I've taken your jstack.h and pulled out the __cds() "calls"
 and wrapped a macro around them so that different architectures can insert
 their equivalent instruction.  But are "__cds()" and "cds_t" available on
 all S390 platforms?  I'm guessing no.  Ugh.  If anybody has a clean way to
 do this, I'm all ears.

 --Cliff

 --
Cliff Woolley
[EMAIL PROTECTED]
Charlottesville, VA




Re: memory allocation (was Re: mod_include performance numbers)

2001-04-21 Thread Greg Ames

Cliff Woolley wrote:
 
 On Sat, 21 Apr 2001, Greg Ames wrote:
 
  are you thinking about an atomic push/pop block allocator?  I'll be
  happy to help out if so, especially with the machine instruction level
  stuff.
 
 Yes, I am, and I definitely *will* need help from various people getting
 the appropriate machine language magic for their platforms working.  I've
 already done the generic fallback locking implementation of the stack;
 that was simple.  

Cool! that's got to be the first piece.

 
  yeah, but that stuff can go away.  compare-and-exchange (or compare and
  swap, or load and reserve, or...) is our friend in multithreaded systems,
  especially on multiprocessors.  The piece I haven't figured out is how
  to set up CPU architecture dependent directories, or macros, or
  whatever, in APR.  sigh
 
 THAT's the problem.  No other piece of APR is *architecture* dependent.
 There's an "arch" include directory, but it's really a misnamed "os"
 directory.  I've got a scheme implemented that gives us an APR_ARCH_IS_foo
 macro, but it's a hack.  It disobeys the typical APR rule of "all macros
 are always defined and have a value of 0 or 1" and is just defined or not.
 I've also been afraid that some of the possible ways to implement these
 machine-language tricks will also be *compiler* dependent, not just
 architecture dependent.  If so, then that makes this that much harder.

Yessir...on platforms that have multiple compilers, there could be
multiple ways of coding the same machine instruction, or no support at
all.  This sounds like something autoconf tests could figure out.

On the other hand, if gcc is running on i486 or above, it shouldn't
matter if it's Linux or FreeBSD or whatever.  So if we figure out one of
those platforms, we get a lot of bang for the buck.
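
To make the macro question concrete, here is a minimal sketch of a feature
macro that follows the "always defined, value 0 or 1" rule and selects
either a compiler builtin or a locking fallback.  The names
SKETCH_HAS_BUILTIN_CAS and sketch_cas() are hypothetical and not part of
APR.  The preprocessor check stands in for what a configure-time test would
decide, and __sync_bool_compare_and_swap is a GCC builtin that did not exist
when this thread was written, so treat this as an illustration only.

```c
/* Hypothetical feature macro: always defined, with a value of 0 or 1.
 * In a real build an autoconf test would set it instead of this check. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SKETCH_HAS_BUILTIN_CAS 1
#else
#define SKETCH_HAS_BUILTIN_CAS 0
#endif

#if SKETCH_HAS_BUILTIN_CAS

/* Atomically replace *addr with desired if it still equals expected.
 * Returns 1 if the swap happened, 0 otherwise. */
static int sketch_cas(void **addr, void *expected, void *desired)
{
    return __sync_bool_compare_and_swap(addr, expected, desired) ? 1 : 0;
}

#else

#include <pthread.h>

/* Generic fallback: one global mutex serializes every compare-and-swap. */
static pthread_mutex_t sketch_cas_lock = PTHREAD_MUTEX_INITIALIZER;

static int sketch_cas(void **addr, void *expected, void *desired)
{
    int swapped = 0;

    pthread_mutex_lock(&sketch_cas_lock);
    if (*addr == expected) {
        *addr = desired;
        swapped = 1;
    }
    pthread_mutex_unlock(&sketch_cas_lock);
    return swapped;
}

#endif
```

Callers see one sketch_cas() either way, so the compiler- and
architecture-specific part stays in a single place.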

 For example, I've taken your jstack.h 

jstack == Jeff's stack.  I can only take credit for teaching him some
things about atomic updates on multiprocessors.

 and pulled out the __cds() "calls"
 and wrapped a macro around them so that different architectures can insert
 their equivalent instruction.  But are "__cds()" and "cds_t" available on
 all S390 platforms?  I'm guessing no.  Ugh.  

I'm guessing you're right; I doubt gcc supports it (Linux390).  But
autoconf is our friend.  (sheesh...did I really say that?)

 If anybody has a clean way to do this, I'm all ears.

Me too.  More ideas are greatly appreciated.

Greg



Re: memory allocation (was Re: mod_include performance numbers)

2001-04-21 Thread Cliff Woolley

On Sat, 21 Apr 2001, dean gaudet wrote:

 can you give a short description of this allocator?

FirstBill wrote the beginnings of it.  It's basically a drop-in
replacement for malloc/calloc/free (really a wrapper around them) that,
when initialized, pre-allocates blocks of various sizes (in FirstBill's,
IIRC, it does as many blocks of a given power-of-two size as will fit in
8KB).
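
As a rough illustration of the sizing arithmetic described above (not
FirstBill's actual code), the sketch below rounds a request up to a
power-of-two size class and computes how many blocks of that class fit in
8KB.  The function names and the 16-byte minimum class are assumptions made
for the example.

```c
#include <stddef.h>

#define SKETCH_CHUNK_BYTES 8192u   /* the 8KB figure from the description */
#define SKETCH_MIN_BLOCK   16u     /* assumed smallest size class */

/* Round a request up to its power-of-two size class.
 * Returns 0 when the request is larger than one chunk, meaning the
 * caller would hand it straight to malloc instead. */
static size_t sketch_size_class(size_t n)
{
    size_t size = SKETCH_MIN_BLOCK;

    if (n > SKETCH_CHUNK_BYTES)
        return 0;
    while (size < n)
        size <<= 1;
    return size;
}

/* Number of blocks of one size class that fit in a single chunk. */
static size_t sketch_blocks_per_chunk(size_t block_size)
{
    return SKETCH_CHUNK_BYTES / block_size;
}
```

For example, a 100-byte request lands in the 128-byte class, and 64 blocks
of that class fit in one 8KB chunk.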

It uses a simple stack to keep its free lists.  The stack, while simple in
concept, is the tricky-in-implementation part.  The idea is that the stack
API just has three operations: init/push/pop.  That's it.  On many
platforms, a stack like this can be implemented without locks, using
architecture-specific instructions like Compare-Double-and-Swap.
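
A minimal sketch of the init/push/pop idea follows.  It uses C11
<stdatomic.h>, which postdates this thread, rather than any
platform-specific instruction.  A double-width compare-and-swap is commonly
used to update the top-of-stack together with a counter, so that a stale
pop cannot succeed after the stack has changed underneath it.  This sketch
gets a similar effect by packing a 32-bit node index and a 32-bit counter
into one 64-bit word.  All names here are hypothetical, and nodes are
identified by index into a caller-supplied link array rather than by
pointer.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NIL 0xFFFFFFFFu   /* index meaning "no node" */

typedef struct {
    _Atomic uint64_t head;   /* high 32 bits: update counter; low 32: top index */
    _Atomic uint32_t *next;  /* next[i] is the index of the node below node i */
} sketch_stack;

/* init: start empty; links must have one slot per node index in use. */
static void sketch_stack_init(sketch_stack *s, _Atomic uint32_t *links)
{
    s->next = links;
    atomic_init(&s->head, (uint64_t)SKETCH_NIL);
}

/* push: link the node to the current top, then try to swing head to it. */
static void sketch_stack_push(sketch_stack *s, uint32_t idx)
{
    uint64_t old = atomic_load(&s->head);
    uint64_t new_head;

    do {
        atomic_store(&s->next[idx], (uint32_t)old);
        new_head = (((old >> 32) + 1) << 32) | (uint64_t)idx;
        /* on failure, old is refreshed with the current head and we retry */
    } while (!atomic_compare_exchange_weak(&s->head, &old, new_head));
}

/* pop: returns the popped index, or SKETCH_NIL if the stack is empty. */
static uint32_t sketch_stack_pop(sketch_stack *s)
{
    uint64_t old = atomic_load(&s->head);
    uint64_t new_head;
    uint32_t top;

    do {
        top = (uint32_t)old;
        if (top == SKETCH_NIL)
            return SKETCH_NIL;
        new_head = (((old >> 32) + 1) << 32)
                 | (uint64_t)atomic_load(&s->next[top]);
        /* the counter in old makes this fail if head changed in between */
    } while (!atomic_compare_exchange_weak(&s->head, &old, new_head));
    return top;
}
```

Every successful push or pop bumps the counter, so a compare-exchange that
started from an older view of the stack fails and retries even if the same
index happens to be on top again.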

So it's really just a wrapper around malloc that keeps stacks of blocks
that can be very efficiently re-allocated.
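
The "wrapper around malloc" shape might look roughly like the sketch below.
It is deliberately simplified: it has a single 64-byte size class, no
locking or atomics, and a free function that takes the size from the
caller, which a true drop-in replacement for free() could not do.  The
names sketch_alloc and sketch_free are hypothetical.

```c
#include <stdlib.h>

#define SKETCH_BLOCK_SIZE 64   /* one size class only, chosen arbitrarily */

/* A recycled block stores the free-list link in its own storage. */
typedef union sketch_block {
    union sketch_block *next;                  /* valid only while free */
    unsigned char payload[SKETCH_BLOCK_SIZE];
} sketch_block;

static sketch_block *sketch_free_list = NULL;  /* stack of free blocks */

static void *sketch_alloc(size_t n)
{
    if (n <= SKETCH_BLOCK_SIZE) {
        if (sketch_free_list != NULL) {
            /* pop a recycled block: no call into malloc needed */
            sketch_block *b = sketch_free_list;
            sketch_free_list = b->next;
            return b;
        }
        /* allocate a full block so it can be recycled later */
        return malloc(sizeof(sketch_block));
    }
    return malloc(n);   /* large requests pass straight through */
}

static void sketch_free(void *p, size_t n)
{
    if (p == NULL)
        return;
    if (n <= SKETCH_BLOCK_SIZE) {
        /* push the block back on the free list instead of freeing it */
        sketch_block *b = p;
        b->next = sketch_free_list;
        sketch_free_list = b;
    } else {
        free(p);
    }
}
```

In a threaded server the two free-list operations are exactly the push and
pop that would need either a lock or an atomic instruction.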

That's it.

--Cliff

--
   Cliff Woolley
   [EMAIL PROTECTED]
   Charlottesville, VA