Re: memory allocation (was Re: mod_include performance numbers)
can you give a short description of this allocator?

-dean

On Sat, 21 Apr 2001, Cliff Woolley wrote:

> On Sat, 21 Apr 2001, Greg Ames wrote:
>
> > are you thinking about an atomic push/pop block allocator? I'll be happy to help out if so, especially with the machine instruction level stuff.
>
> Yes, I am, and I definitely *will* need help from various people getting the appropriate machine language magic for their platforms working. I've already done the generic fallback locking implementation of the stack; that was simple. And I have what you and Jeff gave me for S390. But...
>
> > yeah, but that stuff can go away. compare-and-exchange (or compare swap, or load reserve, or...) is our friend in multithreaded systems, especially on multiprocessors. The piece I haven't figured out is how to set up CPU architecture dependent directories, or macros, or whatever, in APR.
>
> sigh THAT's the problem. No other piece of APR is *architecture* dependent. There's an "arch" include directory, but it's really a misnamed "os" directory. I've got a scheme implemented that gives us an APR_ARCH_IS_foo macro, but it's a hack. It disobeys the typical APR rule of "all macros are always defined and have a value of 0 or 1" and is just defined or not.
>
> I've also been afraid that some of the possible ways to implement these machine-language tricks will also be *compiler* dependent, not just architecture dependent. If so, then that makes this that much harder.
>
> For example, I've taken your jstack.h and pulled out the __cds() "calls" and wrapped a macro around them so that different architectures can insert their equivalent instruction. But are "__cds()" and "cds_t" available on all S390 platforms? I'm guessing no. Ugh.
>
> If anybody has a clean way to do this, I'm all ears.
>
> --Cliff
>
> --
> Cliff Woolley [EMAIL PROTECTED]
> Charlottesville, VA
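
A minimal sketch of the macro convention under discussion, assuming the CPU can be detected from compiler-predefined macros alone. The DEMO_* names are hypothetical and are not APR's. The only point is that each macro is always defined, with a value of 0 or 1, so callers can write #if instead of #ifdef:

```c
#include <assert.h>

/* Hypothetical names throughout; this is not APR's actual header.
 * Each macro is ALWAYS defined, with a value of 0 or 1. */
#if defined(__i386__) || defined(__x86_64__) || \
    defined(_M_IX86) || defined(_M_X64)
#define DEMO_ARCH_IS_X86 1
#else
#define DEMO_ARCH_IS_X86 0
#endif

#if defined(__s390__) || defined(__s390x__)
#define DEMO_ARCH_IS_S390 1
#else
#define DEMO_ARCH_IS_S390 0
#endif

/* 1 when this sketch knows of a lock-free primitive for the CPU
 * (assumption: only the two architectures above are covered). */
#define DEMO_HAS_ATOMIC_CAS (DEMO_ARCH_IS_X86 || DEMO_ARCH_IS_S390)

/* Callers test the value, never mere definedness. */
static const char *demo_stack_strategy(void)
{
#if DEMO_HAS_ATOMIC_CAS
    return "atomic";
#else
    return "locked";
#endif
}
```

A typo such as `#if DEMO_ARCH_IS_X68` can then be caught with a compiler warning for undefined macros (for example GCC's -Wundef), which is not possible with the defined-or-not style.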
Re: memory allocation (was Re: mod_include performance numbers)
Cliff Woolley wrote:

> On Sat, 21 Apr 2001, Greg Ames wrote:
>
> > are you thinking about an atomic push/pop block allocator? I'll be happy to help out if so, especially with the machine instruction level stuff.
>
> Yes, I am, and I definitely *will* need help from various people getting the appropriate machine language magic for their platforms working. I've already done the generic fallback locking implementation of the stack; that was simple.

Cool! that's got to be the first piece.

> > yeah, but that stuff can go away. compare-and-exchange (or compare swap, or load reserve, or...) is our friend in multithreaded systems, especially on multiprocessors. The piece I haven't figured out is how to set up CPU architecture dependent directories, or macros, or whatever, in APR.
>
> sigh THAT's the problem. No other piece of APR is *architecture* dependent. There's an "arch" include directory, but it's really a misnamed "os" directory. I've got a scheme implemented that gives us an APR_ARCH_IS_foo macro, but it's a hack. It disobeys the typical APR rule of "all macros are always defined and have a value of 0 or 1" and is just defined or not.
>
> I've also been afraid that some of the possible ways to implement these machine-language tricks will also be *compiler* dependent, not just architecture dependent. If so, then that makes this that much harder.

Yessir...on platforms that have multiple compilers, there could be multiple ways of coding the same machine instruction, or no support at all. This sounds like something autoconf tests could figure out.

On the other hand, if gcc is running on i486 or above, it shouldn't matter if it's Linux or FreeBSD or whatever. So if we figure out one of those platforms, we get a lot of bang for the buck.

> For example, I've taken your jstack.h

jstack == Jeff's stack. I can only take credit for teaching him some things about atomic updates on multiprocessors.

> and pulled out the __cds() "calls" and wrapped a macro around them so that different architectures can insert their equivalent instruction. But are "__cds()" and "cds_t" available on all S390 platforms? I'm guessing no. Ugh.

I'm guessing you're right, I doubt if gcc supports it (Linux390). But autoconf is our friend. (sheesh...did I really say that?)

> If anybody has a clean way to do this, I'm all ears.

Me too. More ideas are greatly appreciated.

Greg
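
One way the "wrap a macro around the instruction" idea might look, as a sketch only. It assumes a compiler with C11 <stdatomic.h>; DEMO_CAS64 and the demo_stack_* names are hypothetical and come from neither jstack.h nor APR. Instead of a true Compare-Double-and-Swap on a pointer pair, it packs a 32-bit node index and a 32-bit update counter into one 64-bit word, so a single 64-bit compare-and-swap can detect a head that was popped and pushed again in between (the ABA case):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical wrapper: each platform would supply its own definition.
 * Here it maps to C11 atomics; another build might map it to an
 * architecture-specific instruction instead. */
#define DEMO_CAS64(ptr, expected, desired) \
    atomic_compare_exchange_weak((ptr), (expected), (desired))

#define DEMO_NODES 8
#define DEMO_EMPTY 0u            /* index 0 is reserved for "no node" */

typedef struct {
    _Atomic uint64_t head;       /* high 32 bits: counter, low 32: top index */
    _Atomic uint32_t next[DEMO_NODES + 1]; /* next[i]: node below node i */
} demo_stack_t;

static void demo_stack_init(demo_stack_t *s)
{
    atomic_init(&s->head, 0);
    for (int i = 0; i <= DEMO_NODES; i++)
        atomic_init(&s->next[i], DEMO_EMPTY);
}

/* Push node idx (1..DEMO_NODES); the caller must own that node. */
static void demo_stack_push(demo_stack_t *s, uint32_t idx)
{
    uint64_t old = atomic_load(&s->head);
    uint64_t new_head;
    do {
        atomic_store(&s->next[idx], (uint32_t)old);   /* link to old top */
        new_head = (((old >> 32) + 1) << 32) | idx;   /* bump the counter */
    } while (!DEMO_CAS64(&s->head, &old, new_head));  /* failure reloads old */
}

/* Pop the top node; returns its index, or DEMO_EMPTY if the stack is empty. */
static uint32_t demo_stack_pop(demo_stack_t *s)
{
    uint64_t old = atomic_load(&s->head);
    uint64_t new_head;
    uint32_t idx;
    do {
        idx = (uint32_t)old;
        if (idx == DEMO_EMPTY)
            return DEMO_EMPTY;
        new_head = (((old >> 32) + 1) << 32) | atomic_load(&s->next[idx]);
    } while (!DEMO_CAS64(&s->head, &old, new_head));
    return idx;
}
```

Whether a given compiler and CPU actually provide a lock-free 64-bit compare-and-swap is the kind of question a configure-time test would have to answer; where it does not, the same three operations can fall back to a lock.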
Re: memory allocation (was Re: mod_include performance numbers)
On Sat, 21 Apr 2001, dean gaudet wrote:

> can you give a short description of this allocator?

FirstBill wrote the beginnings of it. It's basically a drop-in replacement for malloc/calloc/free (really a wrapper around them) that, when initialized, pre-allocates blocks of various sizes (in FirstBill's, IIRC, it does as many blocks of a given power-of-two size as will fit in 8KB). It uses a simple stack to keep its free lists.

The stack, while simple in concept, is the tricky-in-implementation part. The idea is that the stack API just has three operations: init/push/pop. That's it. On many platforms, a stack like this can be implemented without locks, using architecture-specific instructions like Compare-Double-and-Swap.

So it's really just a wrapper around malloc that keeps stacks of blocks that can be very efficiently re-allocated. That's it.

--Cliff

--
Cliff Woolley [EMAIL PROTECTED]
Charlottesville, VA
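
A rough sketch of the allocator as described above, under several assumptions that are not in the original: size classes of 16 to 1024 bytes, a small header in front of every block to remember its class, and a mutex-guarded stack standing in for the generic locking fallback. All demo_* names are hypothetical; this is not FirstBill's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NUM_CLASSES 7         /* assumed classes: 16, 32, ..., 1024 bytes */
#define DEMO_PREALLOC_BYTES 8192   /* blocks per class: as many as fit in 8KB */
#define DEMO_FROM_MALLOC (-1)      /* marks blocks that bypassed the pools */

/* Header in front of every block; its link field is used while the block is free. */
typedef union demo_hdr {
    struct { union demo_hdr *next; int cls; } h;
    max_align_t align;             /* keeps the payload suitably aligned */
} demo_hdr_t;

/* The whole stack API: init/push/pop (locking fallback version). */
typedef struct { demo_hdr_t *top; pthread_mutex_t lock; } demo_stack_t;

static void demo_stack_init(demo_stack_t *s)
{
    s->top = NULL;
    pthread_mutex_init(&s->lock, NULL);
}

static void demo_stack_push(demo_stack_t *s, demo_hdr_t *b)
{
    pthread_mutex_lock(&s->lock);
    b->h.next = s->top;
    s->top = b;
    pthread_mutex_unlock(&s->lock);
}

static demo_hdr_t *demo_stack_pop(demo_stack_t *s)
{
    pthread_mutex_lock(&s->lock);
    demo_hdr_t *b = s->top;
    if (b != NULL)
        s->top = b->h.next;
    pthread_mutex_unlock(&s->lock);
    return b;
}

static demo_stack_t demo_pools[DEMO_NUM_CLASSES];

/* Smallest class whose block size holds n bytes, or DEMO_FROM_MALLOC. */
static int demo_class_for(size_t n)
{
    for (int c = 0; c < DEMO_NUM_CLASSES; c++)
        if (((size_t)16 << c) >= n)
            return c;
    return DEMO_FROM_MALLOC;
}

/* Pre-allocate one slab per class and push its blocks onto that class's stack. */
static void demo_alloc_init(void)
{
    for (int c = 0; c < DEMO_NUM_CLASSES; c++) {
        size_t payload = (size_t)16 << c;
        size_t stride = sizeof(demo_hdr_t) + payload;
        size_t count = DEMO_PREALLOC_BYTES / payload;
        char *slab = malloc(count * stride);   /* kept for the process lifetime */
        demo_stack_init(&demo_pools[c]);
        for (size_t i = 0; slab != NULL && i < count; i++) {
            demo_hdr_t *b = (demo_hdr_t *)(slab + i * stride);
            b->h.cls = c;
            demo_stack_push(&demo_pools[c], b);
        }
    }
}

static void *demo_malloc(size_t n)
{
    int c = demo_class_for(n);
    demo_hdr_t *b = (c >= 0) ? demo_stack_pop(&demo_pools[c]) : NULL;
    if (b == NULL) {               /* too large, or pool empty: plain malloc */
        b = malloc(sizeof(demo_hdr_t) + n);
        if (b == NULL)
            return NULL;
        b->h.cls = DEMO_FROM_MALLOC;
    }
    return b + 1;                  /* payload starts right after the header */
}

static void demo_free(void *p)
{
    if (p == NULL)
        return;
    demo_hdr_t *b = (demo_hdr_t *)p - 1;
    if (b->h.cls == DEMO_FROM_MALLOC)
        free(b);
    else
        demo_stack_push(&demo_pools[b->h.cls], b);
}
```

Because the free lists are stacks, a block freed and then requested again at the same size class comes straight back off the top. Swapping the mutex for a lock-free push/pop would leave demo_malloc and demo_free unchanged.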