Hash tables and request headers Re: observations on fragmentation in SMS pools

2001-07-11 Thread Brian Pane
Roy T. Fielding wrote:
Another reason why apr_hash_t isn't a good match for HTTP headers
is that the same field name may appear multiple times in the request
or response headers (section 4.2 of RFC 2616), but the hash table
implementation is designed around unique keys.
HTTP headers were designed to be processed as a hash table, hence the
rules regarding combining multiple fields with the same name.  The only
exception is Set-Cookie, because the folks at Netscape didn't follow
the rules.
Ah, I was wondering why RFC 2109 seemed at odds with 2616 on the
subject of field-combining syntax.  :-(
If Set-Cookie is the only exception, it might not be a bad idea to
use apr_hash_t for the request and response headers.  I'm even willing
to contribute a patch for this, but it would impact both APR (slightly)
and Apache (a lot).  Here's my proposed approach:
 1. Add a 2nd create function for apr_hash_t that lets the caller
supply three callback functions:
- a hash function
- an equality-check function
- a concatenation function (if non-NULL, setting x=y followed by
  x=z results in concat_fn(y,z) being stored in the table as the
  value for x); see the sketch after this list
 2. Change the headers_in and headers_out fields of the request_rec
from tables to hash tables.  Supply case-insensitive hash and
comparison functions for these hash tables.  For headers_out, also
supply a concatenation function that builds comma-separated lists
(and semicolon-separated lists when the key is Set-Cookie)
 3. Change all of the references to headers_in and headers_out in
the core and standard modules to use the hash API instead of
the table API.
 4. Document the change so that maintainers of 3rd party modules
know how to make their code compile once again.
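For concreteness, here is a minimal sketch of what the step-1 create
function might look like (apr_hash_make_custom and the typedef names
are made up for illustration; this is not an existing APR API):

typedef unsigned int (*hash_fn_t)(const char *key, apr_ssize_t *klen);
typedef int (*equals_fn_t)(const void *key1, const void *key2);
typedef void *(*concat_fn_t)(apr_pool_t *p, void *old_val, void *new_val);

/* Hypothetical second create function: the caller supplies all three
 * callbacks.  A NULL concat_fn keeps today's behavior, where a second
 * set on the same key simply overwrites the first value. */
APR_DECLARE(apr_hash_t *) apr_hash_make_custom(apr_pool_t *pool,
                                               hash_fn_t hash,
                                               equals_fn_t equals,
                                               concat_fn_t concat);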
Thoughts?  Is a change like this worth doing?
--Brian



Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Jon Travis
On Mon, Jul 09, 2001 at 04:50:31PM -0700, [EMAIL PROTECTED] wrote:
 
  cool, huh?  [and it's only 1024 LOC. yes, i know it's not
  portable like APR. i was v.impressed that someone actually
  looked at it when i first mentioned it here, btw]
 
  so *grin*.
 
  can you guarantee thread-safety on apr_hash_t using
  apr_hash_first()/next()/this()?
 
  can you guarantee complete traversal with multiple
  simultaneous adders, deleters, readers and writers?
 
  and does anyone want a challenge of porting tdb to apr?
  *grin*
 
 Challenge, did somebody say challenge?  I'm always up for a challenge.
 :-)
 
   BTW: Why are tables in APR at all?  The only thing I see used
   is headers in apr-util hooks code, and in the xml, but that of
   course can be fixed.  Step 1, remove tables from APR, Step 2,
   remove tables from Apache.
 
  agh!  tables are kinda... entrenched into xvl.  okay, maybe
  not: only 10 instances of apr_table_make.  10 of apr_table_do
  [which is why i was advocating it: i really like it :)]
 
 Tables are in APR because they were originally moved from Apache to APR before
 APR-util existed.  They should really move to apr-util.  They should never
 be removed from Apache.  Tables are useful because they guarantee a
 general order to the data, namely, the order you insert information into
 the table.  Hashes have a different use case.

Tables should definitely be moved to APR-util if they are to remain.  As
for Apache, there are better structures that dictate general order than
the table.  IMNSHO, the only reason tables are still in Apache is inertia.
Nobody wants to go through and change everything to a more sane data
structure.  Case insensitive searches -- linear search time over the
table ... ugh.  

-- Jon



table inefficiencies Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Brian Pane
Jon Travis wrote:
[...]
Tables should definitely be moved to APR-util if they are to remain.  As
for Apache, there are better structures that dictate general order than
the table.  IMNSHO, the only reason tables are still in Apache is inertia.
Nobody wants to go through and change everything to a more sane data
structure.  Case insensitive searches -- linear search time over the
table ... ugh.  

It's worth noting that half of the apr_table_get calls in
Apache are from mod_mime.  I posted a patch to new-httpd
a couple of weeks ago that replaces mod_mime's tables with
hash tables.
Other frequent callers of linear-time table lookups:
ap_set_byterange, ap_setup_client_block.
--Brian



Re: table inefficiencies Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Cliff Woolley

[Is it just me or is it nearly impossible to have a conversation about
Apache or APR that doesn't in some way belong on BOTH lists? sigh]


On Mon, 9 Jul 2001, Brian Pane wrote:

 It's worth noting that half of the apr_table_get calls in
 Apache are from mod_mime.  I posted a patch to new-httpd
 a couple of weeks ago that replaces mod_mime's tables with
 hash tables.

Can you repost that to new-httpd?  I'd forgotten about it but am
definitely interested.

 Other frequent callers of linear-time table lookups:
 ap_set_byterange, ap_setup_client_block.

Patches for those would be fine by me, as well.

--Cliff


--
   Cliff Woolley
   [EMAIL PROTECTED]
   Charlottesville, VA




Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Justin Erenkrantz
On Mon, Jul 09, 2001 at 08:14:26PM -0700, Jon Travis wrote:
 Tables should definitely be moved to APR-util if they are to remain.  As
 for Apache, there are better structures that dictate general order than
 the table.  IMNSHO, the only reason tables are still in Apache is inertia.
 Nobody wants to go through and change everything to a more sane data
 structure.  Case insensitive searches -- linear search time over the
 table ... ugh.  

One problem with hash tables is that if you want to retrieve an HTTP
header, you now need to have it be the same case as it came in (unless
you are normalizing every call).  RFC 2616 states that (Section 4.2):

Field names are case-insensitive.

Therefore, if the client sends:

CoNnEcTiOn: Keep-Alive

it should respect that.  And, this data structure is primarily used for 
storing HTTP header info (I'm not sure about mod_mime).  I'm not aware 
of any browsers that do this, but I'd bet someone will because the RFC 
says you can.  (It's possible I'm misinterpreting the RFC, wouldn't be 
the first time...)

AFAICT, this doesn't call for a true hash table or a true array.  It's 
something slightly different.  Any data structure which replaces tables 
needs to be able to handle the case-insensitivity gracefully.
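As an illustration, a case-insensitive hash only has to fold case while
it walks the key (hypothetical code, not an existing APR function; the
matching equality callback would compare with strcasecmp for the same
reason):

#include <ctype.h>

/* Fold each byte with tolower() so "CoNnEcTiOn" and "connection"
 * land in the same bucket. */
static unsigned int header_name_hash(const char *key)
{
    const unsigned char *p = (const unsigned char *)key;
    unsigned int hash = 0;

    while (*p) {
        hash = hash * 33 + tolower(*p++);
    }
    return hash;
}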

Just want to make sure that any solution keeps us correct to the RFC.  
-- justin



Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Brian Pane
Justin Erenkrantz wrote:
On Mon, Jul 09, 2001 at 08:14:26PM -0700, Jon Travis wrote:
Tables should definitely be moved to APR-util if they are to remain.  As
for Apache, there are better structures that dictate general order than
the table.  IMNSHO, the only reason tables are still in Apache is inertia.
Nobody wants to go through and change everything to a more sane data
structure.  Case insensitive searches -- linear search time over the
table ... ugh.  

One problem with hash tables is that if you want to retrieve an HTTP
header, you now need to have it be the same case as it came in (unless
you are normalizing every call).  RFC 2616 states that (Section 4.2):
Field names are case-insensitive.
Therefore, if the client sends:
CoNnEcTiOn: Keep-Alive
it should respect that.  And, this data structure is primarily used for 
storing HTTP header info (I'm not sure about mod_mime).  I'm not aware 
of any browsers that do this, but I'd bet someone will because the RFC 
says you can.  (It's possible I'm misinterpreting the RFC, wouldn't be 
the first time...)

AFAICT, this doesn't call for a true hash table or a true array.  It's 
something slightly different.  Any data structure which replaces tables 
needs to be able to handle the case-insensitivity gracefully.

Just want to make sure that any solution keeps us correct to the RFC.  
-- justin

Another reason why apr_hash_t isn't a good match for HTTP headers
is that the same field name may appear multiple times in the request
or response headers (section 4.2 of RFC 2616), but the hash table
implementation is designed around unique keys.
--Brian



data types on type of shm (Re: Observations on fragmentation in SMS pools)

2001-07-10 Thread dean gaudet
On Mon, 9 Jul 2001, Luke Kenneth Casson Leighton wrote:

 HOWEVER!  supporting the data types that apr_pool_xxx() USES
 is a different matter.

shm can be at different memory locations in all processes.  pointers don't
work.  you'd need to radically change the basic data types, which would
affect performance and make the code ugly.  so i think this is a bad idea.

btw, if you want a shm database/hash table done properly, take a look
at sleepycat db.

-dean



tables - hash (Re: Observations on fragmentation in SMS pools)

2001-07-10 Thread dean gaudet
On Mon, 9 Jul 2001, Roy T. Fielding wrote:

  Tables are in APR because they were originally moved from Apache to APR before
  APR-util existed.  They should really move to apr-util.  They should never
  be removed from Apache.  Tables are useful because they guarantee a
  general order to the data, namely, the order you insert information into
  the table.  Hashes have a different use case.

 Ummm, no, tables do not guarantee that -- arrays do.  Tables were specifically
 created to be an abstract hash table, but the implementation remained
 simple because we never used them for large tables.

the r->headers_{in,out,err} tables are referenced so frequently that
making them hash tables is a win... there was a russian fellow, dimitri? i
forget his last name -- works on freebsd as well i think.  he tried this
and said it was a win.

i did a hash implementation for tables, but it was a loss because the hash
overhead on the zillion other small tables kills you.

but if you just fix mod_mime (Brian's patch is a start :), and the
headers_{in,out,err} then you get the biggest bang.  (btw,
headers_{in,out,err} should be turned into a perfect hash using gperf...
and all the compile-time string constants such as Connection,
Accept-Types, ... should become integers with #defines.)
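purely illustrative -- none of these names exist anywhere -- but the
token idea might look like this, with gperf generating the perfect-hash
lookup (in_word_set is gperf's conventional output name):

struct header_tok { const char *name; int token; };
extern const struct header_tok *in_word_set(const char *str, unsigned int len);

/* hypothetical compile-time tokens for well-known header names */
#define H_UNKNOWN        (-1)
#define H_CONNECTION       0
#define H_CONTENT_LENGTH   1
#define H_ACCEPT           2

/* map a header name to its token once at parse time; every later
 * comparison is then an integer compare instead of a strcasecmp() */
static int header_token(const char *name, unsigned int len)
{
    const struct header_tok *t = in_word_set(name, len);
    return t ? t->token : H_UNKNOWN;
}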

-dean



Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Luke Kenneth Casson Leighton
  ... errr.. should i be using apr_hash_t or something? :)
 
 Yes.

okay.  will take a look at it, to see where i could use it.

i have a suspicion that a lot of instances i really _need_
the case-insensitive stuff, and also need the dual
set and add capability.

not least because HTTP POST args can have this:
key1=value1&key1=value2 and, correct me if i'm wrong,
here, that results in a table:
(key1, value1)
(key1, value2)
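as far as i can tell the existing table API gives exactly that, since
apr_table_add() appends without overwriting (unlike apr_table_set());
a small sketch:

apr_table_t *args = apr_table_make(pool, 4);

/* apr_table_add() never checks for an existing key, so repeated
 * POST parameters pile up in insertion order. */
apr_table_add(args, "key1", "value1");
apr_table_add(args, "key1", "value2");

/* apr_table_get(args, "key1") returns only the first match,
 * "value1"; iterating with apr_table_do() visits both entries. */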

luke


Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Luke Kenneth Casson Leighton
On Sun, Jul 08, 2001 at 10:19:33AM -0700, Justin Erenkrantz wrote:
 On Sun, Jul 08, 2001 at 10:14:16AM -0700, dean gaudet wrote:
  an ideal situation for free-lists (blocks of freed, but not free()d
  memory) is one per cpu.
  
  a less ideal situation is one per thread.
  
  an even less ideal situation is one per process (which requires locking).
  
  it's insane to have one per pool :)
 
 I think we're shooting for #2 (less ideal).  Unless someone can come up
 with a way to dynamically tell how many CPUs we are running on and bind
 a free-list to a specific CPU.
 
 We're currently doing #3 (even less ideal) in apr_pools.c.  
 
 And, yeah, the current trivial SMS is doing #4.  =)  But, don't expect it 
 to stay like that.  And, if we implement the apr_sms_child_malloc, it
 gets to somewhere between #2 and #3.  -- justin

#5: the approach taken by tdb.  have a lock per hash-chain.

then you get the best of all worlds.  simultaneous access
only requires that you lock *one* hash chain, which allows
a theoretical limit of number-of-hash-bucket simultaneous
accesses.
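
a minimal sketch of the idea with pthreads (illustrative only: tdb
itself, iirc, locks byte ranges of the database file with fcntl
rather than using in-memory mutexes):

#include <pthread.h>

#define NBUCKETS 256

struct entry {
    struct entry *next;
    /* key/value omitted */
};

static pthread_mutex_t chain_lock[NBUCKETS]; /* pthread_mutex_init() at startup */
static struct entry   *chain[NBUCKETS];

/* one lock per hash chain: writers contend only when they hash to the
 * same chain, so up to NBUCKETS operations can proceed in parallel */
static void insert(struct entry *e, unsigned int hash)
{
    unsigned int b = hash % NBUCKETS;

    pthread_mutex_lock(&chain_lock[b]);
    e->next  = chain[b];
    chain[b] = e;
    pthread_mutex_unlock(&chain_lock[b]);
}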

luke

p.s. i'm hoping that people will not be put off by tdb's GPL
license because i think that there are some really good techniques
used in there that could be applied to APR code, too.


Re: Observations on fragmentation in SMS pools

2001-07-10 Thread Jon Travis
On Tue, Jul 10, 2001 at 04:52:25PM +0200, Luke Kenneth Casson Leighton wrote:
   ... errr.. should i be using apr_hash_t or something? :)
  
  Yes.
 
 okay.  will take a look at it, to see where i could use it.
 
 i have a suspicion that a lot of instances i really _need_
 the case-insensitive stuff, and also need the dual
 set and add capability.
 
 not least because HTTP POST args can have this:
 key1=value1&key1=value2 and, correct me if i'm wrong,
 here, that results in a table:
 (key1, value1)
 (key1, value2)

I'll reply to this one, but it also goes towards the other
guys pointing to the spec.  There isn't anything wrong with
storing the case-sensitive original and then keying off of a toupper
of the string, and chaining it off the end if the entry
already exists in the table (for multiple headers of the
same name).  One could even take the existing table code,
add a field for the toupper'd string, add a small adler32 (or
whatever cksum is to your liking) field, and get a lot 
better performance than we see now.  Of course I still
am no big fan of the API.
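One illustrative shape for such an entry (hypothetical -- this is not
the existing apr_table_entry_t):

struct header_entry {
    const char   *key;        /* case preserved as it arrived        */
    const char   *key_upper;  /* toupper'd copy, computed on insert  */
    apr_uint32_t  cksum;      /* adler32 (or whatever) of key_upper  */
    const char   *val;
    struct header_entry *next_same_key;  /* chained duplicate names  */
};

/* Lookup compares the cheap checksum first and only falls back to a
 * string compare on a checksum match, instead of strcasecmp per entry. */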

-- Jon


Re: Observations on fragmentation in SMS pools

2001-07-09 Thread Brian Pane
Sander Striker wrote:
I'll think out loud now:
Me too :-)
The solution might be adding specific allocation functions for SMS
implementations.  These functions could look something like this:
APR_DECLARE(void *) apr_sms_child_malloc(apr_sms_t *sms,
apr_sms_t *child,
apr_size_t size);
APR_DECLARE(apr_status_t) apr_sms_child_free(apr_sms_t *sms,
apr_sms_t *child,
void *mem);
Internally the framework will use the child_malloc_fn if present.  If
not, it will fall back to malloc_fn.
This is something that overthrows the KIS policy though...
I think this approach (or a variant thereof) has a lot of promise.
One way to look at the root cause of the sms-pool fragmentation
and performance issues is that the notion of parent pool implies
two things:
 1. If P is the parent of S, S must cease to exist when
P is destroyed.
 2. If S needs more memory, it obtains it from P
by calling the malloc method of P.
It often doesn't make sense for these two roles to be
combined in the same object.  Corollary: It doesn't make
sense for apr_sms_malloc(parent) to be the mechanism by
which a child allocates additional space.
For example, a pool created for a subrequest needs to go out
of scope when its parent does, but it doesn't derive any real
benefit from routing requests for additional blocks through its
parent.  We could get better performance if the role of the
parent SMS were to supply a block source to the child.  This
block source would be a pointer to an SMS that the child should
call whenever it needed more memory.  The block source could be
the parent itself, or it could be any ancestor of the parent.
In the case of a succession of sms_trivials stacked on top of
an sms_std, each sms_trivial would supply the pointer to the sms_std
as the block source for its child(ren).  (The resulting setup
would then look like the original pools implementation, in the
sense that children would bypass their parents to get additional
blocks.  The difference is that the parent gets to decide what
type of block source its children use--so if a stack of SMSs are
supposed to use memory from a shared memory segment, each SMS
in the stack can ensure that its descendants are using the shared
mem block source rather than some malloc(3)-based block source.)
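A rough sketch of the shape this could take (the struct and field names
here are invented for illustration, not the actual SMS structures):

struct my_sms {
    struct my_sms *parent;        /* lifetime: destroyed along with it    */
    struct my_sms *block_source;  /* allocation: often a distant ancestor */
    void *(*malloc_fn)(struct my_sms *sms, apr_size_t size);
    /* ... */
};

static void *get_block(struct my_sms *sms, apr_size_t size)
{
    /* Bypass intermediate parents and go straight to the block source,
     * much as the original pools code went straight to its free list. */
    return sms->block_source->malloc_fn(sms->block_source, size);
}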
Thoughts?
Thanks,
--Brian



Re: Observations on fragmentation in SMS pools

2001-07-09 Thread David Reid
To be honest changing sms_trivial to use malloc instead of apr_sms_malloc is
an easy move (just a few lines to change) so it's probably worth trying and
then seeing what we get...

BTW, I'm impressed by the amount of traffic this has generated.  Exactly
what I hoped it would do!

I guess also I'm wondering if we should apply my patch.  For all the reasons
given it's not ideal, but it's a starting point, and if in 2 weeks it looks
totally different, well, no bother, but at least it lets everyone work on it
with a minimum of fuss...  Opinions?  3 +1's and I'll commit while I'm in
Phoenix (got a nice connection)

david

- Original Message -
From: Brian Pane [EMAIL PROTECTED]
To: Sander Striker [EMAIL PROTECTED]
Cc: APR Development List dev@apr.apache.org
Sent: Monday, July 09, 2001 1:46 AM
Subject: Re: Observations on fragmentation in SMS pools


 Sander Striker wrote:

 I'll think out loud now:
 
 Me too :-)

 
 The solution might be adding specific allocation functions for SMS
 implementations.  These functions could look something like this:
 
 APR_DECLARE(void *) apr_sms_child_malloc(apr_sms_t *sms,
  apr_sms_t *child,
  apr_size_t size);
 
 APR_DECLARE(apr_status_t) apr_sms_child_free(apr_sms_t *sms,
  apr_sms_t *child,
  void *mem);
 
 Internally the framework will use the child_malloc_fn if present.  If
 not, it will fall back to malloc_fn.
 
 This is something that overthrows the KIS policy though...
 

 I think this approach (or a variant thereof) has a lot of promise.

 One way to look at the root cause of the sms-pool fragmentation
 and performance issues is that the notion of parent pool implies
 two things:
   1. If P is the parent of S, S must cease to exist when
  P is destroyed.
   2. If S needs more memory, it obtains it from P
  by calling the malloc method of P.

 It often doesn't make sense for these two roles to be
 combined in the same object.  Corollary: It doesn't make
 sense for apr_sms_malloc(parent) to be the mechanism by
 which a child allocates additional space.

 For example, a pool created for a subrequest needs to go out
 of scope when its parent does, but it doesn't derive any real
 benefit from routing requests for additional blocks through its
 parent.  We could get better performance if the role of the
 parent SMS were to supply a block source to the child.  This
 block source would be a pointer to an SMS that the child should
 call whenever it needed more memory.  The block source could be
 the parent itself, or it could be any ancestor of the parent.
 In the case of a succession of sms_trivials stacked on top of
 an sms_std, each sms_trivial would supply the pointer to the sms_std
 as the block source for its child(ren).  (The resulting setup
 would then look like the original pools implementation, in the
 sense that children would bypass their parents to get additional
 blocks.  The difference is that the parent gets to decide what
 type of block source its children use--so if a stack of SMSs are
 supposed to use memory from a shared memory segment, each SMS
 in the stack can ensure that its descendants are using the shared
 mem block source rather than some malloc(3)-based block source.)

 Thoughts?

 Thanks,
 --Brian







RE: Observations on fragmentation in SMS pools

2001-07-09 Thread Sander Striker
 To be honest changing sms_trivial to use malloc instead of 
 apr_sms_malloc is
 an easy move (just a few lines to change) so it's probably worth 
 trying and then seeing what we get...

Massive leakage :)

Remember how the apr_sms_reset code works; it does not have to
traverse and destroy all children since they all use the memory
from that sms.

 BTW, I'm impressed by the amount of traffic this has generated.  Exactly
 what I hoped it would do!
 
 I guess also I'm wondering if we should apply my patch.  For all 
 the reasons
 given it's not ideal, but it's a starting point, and if in 2 
 weeks it looks
 totally different, well, no bother, but at least it lets everyone 
 work on it
 with a minimum of fuss...  Opinions?  3 +1's and I'll commit while I'm in
 Phoenix (got a nice connection)

Don't know if it counts, but +1 :)

 david

Sander



Re: Observations on fragmentation in SMS pools

2001-07-09 Thread Brian Pane
Sander Striker wrote:
To be honest changing sms_trivial to use malloc instead of 
apr_sms_malloc is
an easy move (just a few lines to change) so it's probably worth 
trying and then seeing what we get...

Massive leakage :)
Remember how the apr_sms_reset code works; it does not have to
traverse and destroy all children since they all use the memory
from that sms.
I propose that it's better to not optimize for reset.
With all the fragmentation I'm seeing in the httpd, the
number of blocks managed by an SMS in the middle of a stack
can be larger than the number of blocks that its descendants
would have to manage (and free upon destruction) if they
got memory from a block source themselves.  Thus we'd likely
have _less_ work to do upon reset if the SMS being reclaimed
had to recurse through its descendants and ask them to return
their blocks by themselves (because the total number of blocks
being returned to SMSs further down the stack would be smaller).
This problem would go away, of course, if the memory management
strategy employed by the parent SMS were one that didn't cause
so much fragmentation.  But I don't see a way to accomplish
that without slowing down allocation.
--Brian


Re: Observations on fragmentation in SMS pools

2001-07-09 Thread Luke Kenneth Casson Leighton
regarding fragmentation:

is it reasonable to assume that pre-allocating larger chunks of
memory and sub-dividing them yourself will prevent memory
fragmentation, with the side-effect of having More Code?

use hash tables to optimise the list-walking?

write a better sms instance, one that isn't 'trivial'?

luke


Re: Observations on fragmentation in SMS pools

2001-07-09 Thread Luke Kenneth Casson Leighton
On Sun, Jul 08, 2001 at 10:56:59AM -0700, Jon Travis wrote:

 On Sun, Jul 08, 2001 at 10:41:12AM -0700, dean gaudet wrote:

  On Sun, 8 Jul 2001, Jon Travis wrote:
  
   There is still all this tomfoolery with locking, though, which I think
   would be nice to fix with different sized buckets in the freelist.
   Stuff that the malloc in glibc does.  I cringe at the thought of how
   much overhead due to abstraction this whole project is causing.
  
  haha, the abstraction problems started AGES ago :)  i think the first real
  sign of the doom ahead was the argument of changing ap_pool_t to
  ap_context_t because there was this arbitrary hash table attached to it.
  
  i'd suggest that before any decisions are made regarding global free list
  or per thread free list someone should study what the non-SMS, pools-only
  server behaves like.  the problems you're looking at now in the SMS server
  are a result of SMS's feature of allocating all the way up through the
  ancestry adding bytes each step of the way.

this _can_ be mitigated by doing your own memory-management on top
of a more normal one.

which is what sander's 'smart memory allocator' does.

sma is targeted at workloads that do massive numbers of reallocs and
mallocs and then free large numbers of them in one blast, repeat.

luke


Re: Observations on fragmentation in SMS pools

2001-07-09 Thread Jon Travis
On Mon, Jul 09, 2001 at 05:49:24PM +0200, Luke Kenneth Casson Leighton wrote:
 On Sun, Jul 08, 2001 at 11:06:33AM -0700, Justin Erenkrantz wrote:
  On Sun, Jul 08, 2001 at 11:04:09AM -0700, Jon Travis wrote:
   As for the ability to use shared memory for this ... yeah that is
   an interesting idea.  What are your actual use cases for that?
  
  Ian posted a patch to do something like this - I think he wanted a hash
  table that was shared across all processes.  The problem gets to be 
  growing the memory space, but I think his use case was with fixed 
  memory.  He could probably tell more about how he wanted to do it.  
  
  Both rbb and I suggested that this is what SMS was designed for.
  Have a SMS that is backed by shared memory and then the hashtable
  immediately becomes shared when you create it with this shared memory
  SMS.  No change needed to the hashtable code.  
 
 okay, had an idea here that i thought i'd share with you about
 shmem/pools/sms.
 
 lessons from tdb's design.
 
 accessing shmem, you really have to treat it as a
 very-fast-file-in-memory.
 
 therefore, best thing to do to be able to easily access
 sections of it: treat the shmem as an in-memory database.
 
 the simplest way to fit that into the existing APR code is
 to have direct support for apr_table_t in a pool-based
 or sms-based shared-memory-segment.
 
 if you examine tdb.c's design, you will notice that apr_table_do()
 becomes identical to [but more powerful than] tdb_traverse().
 
 
 apr_array_header_t?  again, i haven't thought about it, but
 you could likely get away with the same thing.
 

The apr_table_t is just an apr_array_header_t with a specific
key/val entry type.

 in other words, i think that supporting apr_pool_t on top of
 shared memory (via sms) is going to be tricky: in my opinion,
 the apr_pool_xxx() API _and_ the apr_sms_xxx() API can't Deal With
 shared memory at a convenient user-acceptable level.
 
 HOWEVER!  supporting the data types that apr_pool_xxx() USES
 is a different matter.
 
 by supporting apr_array_header_t (if possible) and apr_table_t
 the complexity of messing about will be hidden from users of
 these data types, but they will work quite happily as expected,
 even if it's shared memory behind them.
 
 what people think?

I think you are in for a ride on this one.  The API for dealing
with the tables is currently rather poor, since typically it
is cast to an array and then iterated over.  The strcasecmp's
of the table code are a whole separate issue.  I would think that
the (nearly analogous) apr_hash_t would better suit what you
are talking about here.

BTW: Why are tables in APR at all?  The only thing I see used
is headers in apr-util hooks code, and in the xml, but that of
course can be fixed.  Step 1, remove tables from APR, Step 2, 
remove tables from Apache.

-- Jon



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Justin Erenkrantz
On Sun, Jul 08, 2001 at 12:47:32AM -0700, Brian Pane wrote:
 I added some instrumentation to apr_sms_trivial_malloc
 to study in more detail where its bottlenecks were.
 
 As a result, I found a few interesting phenomena:
 
 1. The good news is that allocations of small amounts
of memory are very efficient.  They almost always
take the fastest path through the code, in which
some available space is reserved from the
   sms->used_sentinel.prev block with a handful of
pointer arithmetic operations.
 
 2. The bad news is that allocations for larger blocks
   (in the >=8KB range) typically require a call to the
parent SMS to get data.  On my test machine, I'm seeing
elapsed times in the 30 microsecond range when this
happens, compared to less than 1 microsecond for small
allocations that don't require more memory from the
parent SMS.  And when an allocation falls through to
the parent, it often seems to fall all the way through
to the root SMS (I suspect that 30us includes a malloc).
The problem seems to be particularly bad for things that
create subrequests, like mod_include.
 
 3. The worse news is that there seems to be a lot of
fragmentation.  For example, I saw this pattern
during a server-parsed request:
  - the application code requests 12296 bytes
from a pool
  - not enough memory is available in the SMS, so it
requests 16400 bytes from its parent SMS.
  - the parent SMS doesn't have enough free space
either, so it requests 20504 bytes from the
grandparent SMS.
  - the grandparent SMS doesn't have enough space
either, but it has to iterate through 15 blocks
in its free list to figure that out.  Each of these
blocks has between 8176 and 12272 bytes available.
  - the grandparent calls through to the great-grandparent
to get 24608 bytes.  The great-grandparent doesn't
have a block with that much free space, but it
iterates through 9 blocks in its free list in
search of one; all of these blocks had 16376 bytes
free.
  - the great-grandparent thus requests 28712 bytes from
the great-great grandparent.  The great-great-grandparent
doesn't have any blocks in its free list, so it calls
through to its parent, which at last is an SMS that
does a real malloc.
This type of pattern may explain the reported higher memory
use of the SMS-based httpd compared with the original pools;
there's a lot of memory in those free lists that can't be
used in this example.
 
 For an SMS that's going to be a parent of other SMSs, we'll
 need something with more sophisticated policies for reassigning
 freed space than the current trivial-SMS.

Yup.  I've brought this up to Sander and David before, but this is how 
pools work EXCEPT for the fact that we get a little more pathological in 
SMS's addition of memory because each parent tacks on memory (as you
described).  The optimization here is for the 4KB allocation.  I'm not
sure that is what we need to be optimizing for.  More analysis is
probably needed to see what our typical usage pattern is.  This is where
Ian's pool replay code is *very* helpful (although I don't like adding
more #ifdef to the pool code - yuck - I'd like to see a SMS that just
prints that info and passes it up to the parent to do what it needs 
to).

Before turning SMS on in the mainline code by default, I think we need 
to come up with a more robust allocation algorithm.  And, the whole 
point of SMS is that we can play with memory allocation algorithms 
until we are blue in the face and nothing else changes.  You can't do 
that with the current pools code.  We need more people playing with it
and suggesting new allocation algorithms.  =)  The trivial SMS tries to
keep the memory allocation strategy as close as possible to what we 
currently have.

Personally, I'd set MIN_FREE to 0 and bite the small allocations (now 
they'd be slow, but really, how many allocations are under 4KB - data 
structures, perhaps?).  I'd also like to see it smarter about reclaiming
free space, but that gets really tricky.

It's a tradeoff (and is purposeful for lots of small allocations), but 
until someone can write a better memory allocation algorithm, this is 
what we got.  Trivial SMS has its Achilles heel here - this is just an
aggravation of how the original pool code works (the pool code would add
4KB, while each level of the SMS adds 4KB).

Aaron pointed out that CLR (Cormen/Leiserson/Rivest for those that never 
took algorithms classes) has many memory allocation algorithms.  I'd 
bet we could find one that would work better.  -- justin



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Ben Laurie
Justin Erenkrantz wrote:
 Aaron pointed out that CLR (Cormen/Leiserson/Rivest for those that never
 took algorithms classes) has many memory allocation algorithms.  I'd
 bet we could find one that would work better.  -- justin

At USENIX there was a talk about extending the slab allocator to
multiple CPUs. One of the things in the talk was a thing called vmem,
which is a general purpose resource allocator with low fragmentation and
constant-time allocations (optionally - for more time you can get a
better fit). Sounds pretty good, eh?

http://www.usenix.org/publications/library/proceedings/usenix01/bonwick.html

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff


Re: Observations on fragmentation in SMS pools

2001-07-08 Thread David Reid
  1. The good news is that allocations of small amounts
 of memory are very efficient.  They almost always
 take the fastest path through the code, in which
 some available space is reserved from the
  sms->used_sentinel.prev block with a handful of
 pointer arithmetic operations.

Cool.

 
  2. The bad news is that allocations for larger blocks
  (in the >=8KB range) typically require a call to the
 parent SMS to get data.  On my test machine, I'm seeing
 elapsed times in the 30 microsecond range when this
 happens, compared to less than 1 microsecond for small
 allocations that don't require more memory from the
 parent SMS.  And when an allocation falls through to
 the parent, it often seems to fall all the way through
 to the root SMS (I suspect that 30us includes a malloc).
 The problem seems to be particularly bad for things that
 create subrequests, like mod_include.

OK, not so cool.

 
  3. The worse news is that there seems to be a lot of
 fragmentation.  For example, I saw this pattern
 during a server-parsed request:
snip.

 
  For an SMS that's going to be a parent of other SMSs, we'll
  need something with more sophisticated policies for reassigning
  freed space than the current trivial-SMS.

Definitely.


 Yup.  I've brought this up to Sander and David before, but this is how

Really - first i've heard of it :(

 pools work EXCEPT for the fact that we get a little more pathological in
 SMS's addition of memory because each parent tacks on memory (as you
 described).  The optimization here is for the 4KB allocation.  I'm not
 sure that is what we need to be optimizing for.  More analysis is
 probably needed to see what our typical usage pattern is.  This is where
 Ian's pool replay code is *very* helpful (although I don't like adding
 more #ifdef to the pool code - yuck - I'd like to see a SMS that just
 prints that info and passes it up to the parent to do what it needs
 to).

Yep, i had a very simple debug sms that simply printed out what it was
doing, then did it.  Maybe it's time to revive it, but it generates a LOT of
output.


 Before turning SMS on in the mainline code by default, I think we need
 to come up with a more robust allocation algorithm.  And, the whole
 point of SMS is that we can play with memory allocation algorithms
 until we are blue in the face and nothing else changes.  You can't do
 that with the current pools code.  We need more people playing with it
 and suggesting new allocation algorithms.  =)

Absolutely.  This was my motivation in doing the work in the first place,
and it seems to be working.

 The trivial SMS tries to
 keep the memory allocation strategy as close as possible to what we
 currently have.

Oh, we're nowhere near ready to turn it on by default.  I never thought we
were, but you need a test case to prove/disprove what's going on, and that's
what this whole pools-as-sms code gives us: a complex example that we can
use to improve our code :)


 Personally, I'd set MIN_FREE to 0 and bite the small allocations (now
 they'd be slow, but really, how many allocations are under 4KB - data
 structures, perhaps?).  I'd also like to see it smarter about reclaiming
 free space, but that gets really tricky.

Oh, if you look at it we do a LOT of allocations below 4K from pools.  Turn
on the checking of every allocation and look at the sizes being requested.


 It's a tradeoff (and is purposeful for lots of small allocations), but
 until someone can write a better memory allocation algorithm, this is
 what we got.  Trivial SMS has its Achilles heel here - this is just an
 aggrevation of how the original pool code works (the pool code would add
 4KB, while each level of the SMS adds 4KB).

Well, we have what we have.  My main concern about trivial is that the code
gets complex, and I'm worried that we may be over-complicating things for
ourselves.  KISS is a guiding principle that it never does any harm to follow
:)


 Aaron pointed out that CLR (Cormen/Leiserson/Rivest for those that never
 took algorithms classes) has many memory allocation algorithms.  I'd
 bet we could find one that would work better.  -- justin

The sms code was added on May 9th, which looks like less than 2 months ago,
so to be honest we're in pretty damn good shape.  Let's take some time, do
the testing and define the problems then start looking at fixes.  Rushing in
with a lot of quick solutions won't get us anywhere but a world of hurt.

david



RE: Observations on fragmentation in SMS pools

2001-07-08 Thread Sander Striker
 1. The good news is that allocations of small amounts
of memory are very efficient.  They almost always
take the fastest path through the code, in which
some available space is reserved from the
   sms->used_sentinel.prev block with a handful of
pointer arithmetic operations.

 Cool.

This is what the pools code did.  I only used a different way
to store the required accounting information.

 2. The bad news is that allocations for larger blocks
   (in the >=8KB range) typically require a call to the
parent SMS to get data.  On my test machine, I'm seeing
elapsed times in the 30 microsecond range when this
happens, compared to less than 1 microsecond for small
allocations that don't require more memory from the
parent SMS.  And when an allocation falls through to
the parent, it often seems to fall all the way through
to the root SMS (I suspect that 30us includes a malloc).
The problem seems to be particularly bad for things that
create subrequests, like mod_include.

 OK, not so cool.

No, a first optimization might be doing some simple defines in
the sms_private.h file which eliminates the overhead of the
framework code when apr_sms_malloc is called from inside an
sms module. The defines would look something like this:

#define apr_sms_malloc(sms, size)  sms->malloc_fn(sms, size)
#define apr_sms_calloc(sms, size)  sms->calloc_fn(sms, size)

I'm afraid we can't do the free_fn because it might be NULL.

This would get rid of a bit of overhead, but it isn't going to
make an enormous amount of difference.

 3. The worse news is that there seems to be a lot of
fragmentation.  For example, I saw this pattern
during a server-parsed request:
 snip.

 For an SMS that's going to be a parent of other SMSs, we'll
 need something with more sophisticated policies for reassigning
 freed space than the current trivial-SMS.

 Definitely.

 Yup.  I've brought this up to Sander and David before, but this is how

 Really - first i've heard of it :(

Yes, Justin brought something like this up with me when he reviewed the
trivial code.  We didn't discuss this usage pattern though.
Of course it is very logical that it happens...

I'll think out loud now:
The solution might be adding specific allocation functions for SMS
implementations.  These functions could look something like this:

APR_DECLARE(void *) apr_sms_child_malloc(apr_sms_t *sms,
 apr_sms_t *child,
 apr_size_t size);

APR_DECLARE(apr_status_t) apr_sms_child_free(apr_sms_t *sms,
 apr_sms_t *child,
 void *mem);

Internally the framework will use the child_malloc_fn if present.  If
not, it will fall back to malloc_fn.

This is something that overthrows the KIS policy though...

 pools work EXCEPT for the fact that we get a little more pathological in
 SMS's addition of memory because each parent tacks on memory (as you
 described).  The optimization here is for the 4KB allocation.  I'm not
 sure that is what we need to be optimizing for.  More analysis is
 probably needed to see what our typical usage pattern is.  This is where
 Ian's pool replay code is *very* helpful (although I don't like adding
 more #ifdef to the pool code - yuck - I'd like to see a SMS that just
 prints that info and passes it up to the parent to do what it needs
 to).

I want to state here that POOLS_ARE_SMS effectively makes
each pool a trivial sms. To take full advantage of the SMS
code the stack of SMSs should be optimized. Right now we
have this (in httpd with POOLS_ARE_SMS):

std <- trivial <- trivial [<- trivial ...]
 ^        ^
APR     httpd

Justin already hinted in a private mail that he was thinking
about something like this:

  ... <- trivial <- tracking [<- tracking ...]

In the buckets code, blocks could be a very good choice.

So, the first pass at allowing httpd to run with pools
as sms is not optimal, but it gets things working.

 Yep, i had a very simple debug sms that simply printed out what it was
 doing, then did it.  Maybe it's time to revive it, but it
 generates a LOT of output.

Another option is just using the debug code you put in and tag the
relevant SMSs. This way you can see the usage patterns in each part
of the application.

 Before turning SMS on in the mainline code by default, I think we need
 to come up with a more robust allocation algorithm.  And, the whole
 point of SMS is that we can play with memory allocation algorithms
 until we are blue in the face and nothing else changes.  You can't do
 that with the current pools code.  We need more people playing with it
 and suggesting new allocation algorithms.  =)

 Absolutely.  This was my motivation in doing the work in the first place,
 and it seems to be working.

Yes :)

I've been working on a power of 2 allocator.  I still haven't got that up
to speed yet, but it is worth a look.  I'll play with it this week and post
it to the list

Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Bill Stoddard

 I've been working on a power of 2 allocator.  I still haven't got that up
 to speed yet, but it is worth a look.  I'll play with it this week and post
 it to the list

I hacked a simple power of two allocator together as a proof of concept
that replacing malloc/free with apr_malloc/apr_free could give us a
performance boost (it does).  Find it here (it is for Windows).

http://wstoddard.com/patches/apr_mem_alloc.tar

There are several simple changes to this code that could make it much
closer to something that could be committed.
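
For anyone who doesn't want to pull the tarball, a minimal sketch of the
general idea (illustrative only, not the code above: single-threaded, and
requests bigger than the largest bucket are not handled):

#include <stdlib.h>

#define MIN_SHIFT 5   /* smallest bucket: 32 bytes      */
#define NBUCKETS  12  /* largest bucket: 32 << 11 = 64K */

static void *freelist[NBUCKETS];  /* one LIFO free list per size class */

/* smallest power-of-two bucket that holds the request plus a one-word
 * header used to remember the bucket on free */
static int bucket_for(size_t size)
{
    int b = 0;
    while (((size_t)1 << (b + MIN_SHIFT)) < size + sizeof(void *))
        b++;
    return b;  /* caller must ensure b < NBUCKETS */
}

static void *p2_malloc(size_t size)
{
    int b = bucket_for(size);
    void *mem = freelist[b];

    if (mem)
        freelist[b] = *(void **)mem;              /* pop from free list */
    else if ((mem = malloc((size_t)1 << (b + MIN_SHIFT))) == NULL)
        return NULL;

    *(void **)mem = (void *)(size_t)b;            /* stash bucket index */
    return (char *)mem + sizeof(void *);
}

static void p2_free(void *ptr)
{
    void *mem = (char *)ptr - sizeof(void *);
    int b = (int)(size_t)*(void **)mem;

    *(void **)mem = freelist[b];                  /* push onto free list */
    freelist[b] = mem;
}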

Bill



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread rbb
On Sun, 8 Jul 2001, Ian Holsman wrote:

 Bill Stoddard wrote:

 I've been working on a power of 2 allocator.  I still haven't got that up
 to speed yet, but it is worth a look.  I'll play with it this week and post
 it to the list
 
 
  I hacked a simple power of two allocator together as a proof of concept
  that replacing malloc/free with apr_malloc/apr_free could give us a
  performance boost (it does).  Find it here (it is for Windows).
 

 Hi Bill,
 I was wondering why the brigade code doesn't make use of pools/sms to
 handle its memory.
 It has a pool passed to it from what I can see...

Pools aren't used for brigades because when a pool is cleaned up, the
memory goes away.  This means that whenever we are doing a pipelined
request, and we allocate the brigade out of the request_rec, we would have
to copy all of the data into the conn_rec before we could destroy the
request.  If we allocate everything out of the conn_rec, then we have a
huge resource leak until the end of the connection.

For a complete description of why we don't use pools for brigades, please
take a look at the mailing list archives from the time period around when
the brigades were introduced.  This question was brought up MANY times,
and it has been answered in great detail.

Ryan

_
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
-



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Cliff Woolley
On Sun, 8 Jul 2001, Ian Holsman wrote:

 I was wondering why the brigade code doesn't make use of pools/sms to
 handle its memory.
 It has a pool passed to it from what I can see...

The brigade is allocated out of a pool, and that pool is used to register
a cleanup on the brigade to make sure all of the buckets in that brigade
get destroyed.

The buckets themselves use malloc(). The buckets do not and cannot use
pools because the lifetimes could get to be all wrong pretty easily, and
it could be a _major_ resource leak on big requests or pipelined
connections. I'm in the process of switching them to use apr_sms_malloc().
I've got a 3-phase plan for making that switchover and have written phase
1, which I just need to test and commit.

The three phases go something like this (I might have mentioned this on
the list before... I can't remember whether I did or not, so here it is
again):

Phase 1: s/malloc/apr_sms_malloc/g; s/free/apr_sms_free/g; Add in a global
 apr_sms_std instance in the buckets code that all buckets will
 use to get their memory from.  Add an apr_sms_t* to the
 apr_bucket struct so that the apr_bucket_foo_make() functions
 know which SMS to use when allocating memory for private
 structures.  Set that pointer to reference the global
 apr_sms_std instance in all of the apr_bucket_foo_create()
 functions for now.

Phase 2: Add an apr_sms_t parameter to all apr_bucket_foo_create()
 functions that lets the caller determine which SMS to use.
 For now, they can all just pass in a pointer to the global
 apr_sms_std instance.

Phase 3: Create a per-thread blocks SMS in the Apache MPM's and
 stash a pointer to it in the conn_rec for each request.
 Use that instead of the global apr_sms_std when creating
 all buckets and get rid of the global apr_sms_std.
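A sketch of the phase-1 data-structure change (abbreviated -- the real
apr_bucket has more members; the sms field is the addition described
above):

struct apr_bucket {
    /* ... ring link, type, length, offset, data pointer, etc. ... */
    apr_sms_t *sms;  /* which SMS allocated this bucket and its private
                      * structures; in phase 1 every bucket points at
                      * the one global apr_sms_std instance */
};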

--Cliff

--
   Cliff Woolley
   [EMAIL PROTECTED]
   Charlottesville, VA





Re: Observations on fragmentation in SMS pools

2001-07-08 Thread dean gaudet


On Sun, 8 Jul 2001, Justin Erenkrantz wrote:

 Yup.  I've brought this up to Sander and David before, but this is how
 pools

woah.  no way really?

that's not at all how it was in 1.3 or in early 2.0 ...

in 2.0 as of uh a year ago say, there was one free list per process,
and locks were used to access it.

no matter where in the tree of pools you tried an allocation, if it
didn't fit into the current block, the allocator would lock and go to
the global free block list and pick up the first fit block.  none of this
going up through a chain of pools or anything, that's insane.

 It's a tradeoff (and is purposeful for lots of small allocations), but
 until someone can write a better memory allocation algorithm, this is
 what we got.

1.3's memory allocation algorithm kicks ass, to put it mildly.  i've not read
the apr-dev archives as to the Why and the Goals of SMS ...

-dean



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Justin Erenkrantz
On Sun, Jul 08, 2001 at 10:14:16AM -0700, dean gaudet wrote:
 an ideal situation for free-lists (blocks of freed, but not free()d
 memory) is one per cpu.
 
 a less ideal situation is one per thread.
 
 an even less ideal situation is one per process (which requires locking).
 
 it's insane to have one per pool :)

I think we're shooting for #2 (less ideal).  Unless someone can come up
with a way to dynamically tell how many CPUs we are running on and bind
a free-list to a specific CPU.

We're currently doing #3 (even less ideal) in apr_pools.c.  

And, yeah, the current trivial SMS is doing #4.  =)  But, don't expect it 
to stay like that.  And, if we implement the apr_sms_child_malloc, it
gets to somewhere between #2 and #3.  -- justin



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Justin Erenkrantz
On Sun, Jul 08, 2001 at 10:18:12AM -0700, dean gaudet wrote:
 
 
 On Sun, 8 Jul 2001, Justin Erenkrantz wrote:
 
  Yup.  I've brought this up to Sander and David before, but this is how
  pools
 
 woah.  no way really?
 
 that's not at all how it was in 1.3 or in early 2.0 ...
 
 in 2.0 as of uh a year ago say, there was one free list per process,
 and locks were used to access it.
 
 no matter where in the tree of pools you tried an allocation, if it
 didn't fit into the current block, the allocator would lock and go to
 the global free block list and pick up the first fit block.  none of this
 going up through a chain of pools or anything, that's insane.

No, nothing changed.  We just added the chain of pools.  =)  It's a
work in progress.  You may call it a feature right now.  We're not 
saying it is ready to be activated - just hammered/examined by people
on this list.  That definitely needs to be cleaned up.  We know that.

If we switch that global free block list to per-thread (where we are
heading), is this allocation algorithm the best one for httpd-style 
memory usage?  If so, then I guess I won't look at other allocation
algorithms.  I'll defer to people who've studied this a lot longer 
than I have.  -- justin



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Justin Erenkrantz
On Sun, Jul 08, 2001 at 10:34:22AM -0700, Jon Travis wrote:
 There is still all this tomfoolery with locking, though, which I think
 would be nice to fix with different sized buckets in the freelist.  
 Stuff that the malloc in glibc does.  I cringe at the thought of how
 much overhead due to abstraction this whole project is causing.

Well, most of the tomfoolery goes away when the free-list is per thread.
No need for locking then.

I think that this gets us better performance in a threaded environment
than the current apr_pools.c - which has a per-process global list.
The key is the allocation algorithm needs to take that into account
(at this point, trivial isn't).  In order to truly support a threaded 
environment, the free-lists need to become per-thread.  With this 
abstraction, that *is* much easier.  We're not there yet, but we're
laying the groundwork to do so.  -- justin



RE: Observations on fragmentation in SMS pools

2001-07-08 Thread Sander Striker
 Stuff that the malloc in glibc does.  I cringe at the thought of how
 much overhead due to abstraction this whole project is causing.

The other thing this abstraction buys us is the ability to pass in
different memory to standard APR operations. Hopefully we can get
shared memory to work in this fashion. I know for sure that we can
do mlock()'d memory (which might be useful for security-type calculation
stuff you don't want to go to swap space).
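A minimal sketch of the mlock()'d variant (illustrative POSIX code, not
an existing SMS module; the matching free would munlock() before
releasing the memory):

#include <stdlib.h>
#include <sys/mman.h>

/* malloc_fn for a hypothetical "locked" SMS: pages handed out are
 * pinned in RAM so key material never gets written to swap */
static void *locked_malloc(size_t size)
{
    void *mem = malloc(size);

    if (mem != NULL && mlock(mem, size) != 0) {
        free(mem);
        return NULL;
    }
    return mem;
}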

Sander



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Justin Erenkrantz
On Sun, Jul 08, 2001 at 10:41:12AM -0700, dean gaudet wrote:
 also -- it would be worthwhile for someone to try a version of pools which
 keeps no freelist, and uses malloc()/free() on each block.  i'm guessing
 this is faster in a multithreaded server than us doing a per-process
 global list because malloc() itself already has to deal with
 multithreading, and has all sorts of other optimisations (i.e. it's not
 just a first-fit allocator).  and as an added bonus it'd be worth trying
 www.hoard.org's malloc replacement.

Switch the apr_sms_pools.c to call apr_sms_tracking_create instead of
apr_sms_trivial_create.  (You need the tracking because you will have to
free the memory, so you have to at least keep track of what you have
allocated.)  

Also, I did try having the pools use malloc/free directly 
(ALLOC_USE_MALLOC) and the performance was dreadful.  At least on 
Solaris.  -- justin



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Justin Erenkrantz
On Sun, Jul 08, 2001 at 11:04:09AM -0700, Jon Travis wrote:
 As for the ability to use shared memory for this ... yeah that is
 an interesting idea.  What are your actual use cases for that?

Ian posted a patch to do something like this - I think he wanted a hash
table that was shared across all processes.  The problem gets to be 
growing the memory space, but I think his use case was with fixed 
memory.  He could probably tell more about how he wanted to do it.  

Both rbb and I suggested that this is what SMS was designed for.
Have a SMS that is backed by shared memory and then the hashtable
immediately becomes shared when you create it with this shared memory
SMS.  No change needed to the hashtable code.  

And, potentially, the scoreboard could (??) get switched to something
using a shared memory SMS.  But, I don't know enough about the 
scoreboard to say more about that.  -- justin



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread dean gaudet
On Sun, 8 Jul 2001, Jon Travis wrote:

 I talked to rbb about this not too long ago, and he told me that you
 did a lot of performance work, and that is why pools use their own
 allocation scheme (8k blocks, freelists, etc.) instead of using a
 total malloc() oriented scheme.

yes and no.  i can't say that i ever compared an implementation of apache
using malloc/free to one just using palloc ... it would have been insane
to try to go about adding free() calls everywhere they would have been
required to do that.  (and you can sit back and do some math and figure
out that it would be slower by virtue of having to track many more
pointers, and double the function call overhead of each allocation.)

so given the restriction that i was implementing palloc -- an allocator
with only a group free operation of all allocated elements, then what i
did was to tune the apache-1.3 (maybe it was 1.2 at the time) code.  the
palloc stuff goes back to before apache 1.0, and i'm pretty sure it's
rst's code (but you should ask the real old timers, not me, i came on
board around 1.2).  there were definitely a few bugs in both the alloc.c
code, and in the way apache used it.  i don't remember the details (but
i'm responsible for the mess o' ifdefs of debugging code).

another thing you need to consider is that this was a
process-per-connection server -- and the global free list made a lot of
sense.  but i don't remember actually timing it against using malloc/free
directly for the blocks.  i think that would have sucked on many of the
unixes at the time -- on some of which it was standard practice to use
gnumalloc instead of libc malloc to get a perf improvement.  it's
important to remember that malloc/free haven't stood still, and the
technology in there has improved.

 You are right that libc malloc()
 implementations are very fast, can deal with threading issues
 more elegantly than we could in APR, and have other niceties
 (different sized buckets, etc.).  There is currently a ton of locking
 in the pool implementation to deal with the threads (and we are
 actually missing a lock in one needed location, I believe).

you know, i'd be tempted to say get rid of the free list even in the
process-per-connection servers.  it simplifies the code, which is way
better for maintenance.  and it's probably a toss up today as to which is
faster.  (it might be slower on ancient unixes, but, um, i fail to see the
problem ;)

-dean



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread dean gaudet


On Sun, 8 Jul 2001, Justin Erenkrantz wrote:

 Also, I did try having the pools use malloc/free directly
 (ALLOC_USE_MALLOC) and the performance was dreadful.  At least on
 Solaris.  -- justin

yes, ALLOC_USE_MALLOC is dreadful -- that's not what i meant.

what i mean is something like the below patch, which i haven't even tried
to compile or test ;)  (the patch needs work to rescue some of the
debugging code in new_block).

it removes block_freelist and uses malloc/free for blocks.  this is much
less expensive than malloc/free on each allocation (amortization).

-dean

Index: apr_pools.c
===================================================================
RCS file: /home/cvs/apr/memory/unix/apr_pools.c,v
retrieving revision 1.100
diff -u -r1.100 apr_pools.c
--- apr_pools.c 2001/07/07 22:23:54 1.100
+++ apr_pools.c 2001/07/08 18:17:36
@@ -248,7 +248,6 @@
 /*
  * Static cells for managing our internal synchronisation.
  */
-static union block_hdr *block_freelist = NULL;
 
 #if APR_HAS_THREADS
 static apr_lock_t *alloc_mutex;
@@ -417,120 +416,20 @@
 
 static void free_blocks(union block_hdr *blok)
 {
-#ifdef ALLOC_USE_MALLOC
     union block_hdr *next;
 
     for ( ; blok; blok = next) {
         next = blok->h.next;
         DO_FREE(blok);
     }
-#else /* ALLOC_USE_MALLOC */
-
-#ifdef ALLOC_STATS
-    unsigned num_blocks;
-#endif /* ALLOC_STATS */
-
-    /*
-     * First, put new blocks at the head of the free list ---
-     * we'll eventually bash the 'next' pointer of the last block
-     * in the chain to point to the free blocks we already had.
-     */
-
-    union block_hdr *old_free_list;
-
-    if (blok == NULL) {
-        return; /* Sanity check --- freeing empty pool? */
-    }
-
-#if APR_HAS_THREADS
-    if (alloc_mutex) {
-        apr_lock_acquire(alloc_mutex);
-    }
-#endif
-    old_free_list = block_freelist;
-    block_freelist = blok;
-
-    /*
-     * Next, adjust first_avail pointers of each block --- have to do it
-     * sooner or later, and it simplifies the search in new_block to do it
-     * now.
-     */
-
-#ifdef ALLOC_STATS
-    num_blocks = 1;
-#endif /* ALLOC_STATS */
-
-    while (blok->h.next != NULL) {
-
-#ifdef ALLOC_STATS
-        ++num_blocks;
-#endif /* ALLOC_STATS */
-
-        chk_on_blk_list(blok, old_free_list);
-        blok->h.first_avail = (char *) (blok + 1);
-        debug_fill(blok->h.first_avail, blok->h.endp - blok->h.first_avail);
-#ifdef APR_POOL_DEBUG
-        blok->h.owning_pool = FREE_POOL;
-#endif /* APR_POOL_DEBUG */
-        blok = blok->h.next;
-    }
-
-    chk_on_blk_list(blok, old_free_list);
-    blok->h.first_avail = (char *) (blok + 1);
-    debug_fill(blok->h.first_avail, blok->h.endp - blok->h.first_avail);
-#ifdef APR_POOL_DEBUG
-    blok->h.owning_pool = FREE_POOL;
-#endif /* APR_POOL_DEBUG */
-
-    /* Finally, reset next pointer to get the old free blocks back */
-
-    blok->h.next = old_free_list;
-
-#ifdef ALLOC_STATS
-    if (num_blocks > max_blocks_in_one_free) {
-        max_blocks_in_one_free = num_blocks;
-    }
-    ++num_free_blocks_calls;
-    num_blocks_freed += num_blocks;
-#endif /* ALLOC_STATS */
-
-#if APR_HAS_THREADS
-    if (alloc_mutex) {
-        apr_lock_release(alloc_mutex);
-    }
-#endif /* APR_HAS_THREADS */
-#endif /* ALLOC_USE_MALLOC */
 }
 
 /*
- * Get a new block, from our own free list if possible, from the system
- * if necessary.  Must be called with alarms blocked.
+ * get a new block from the system
  */
 static union block_hdr *new_block(apr_size_t min_size, apr_abortfunc_t abortfunc)
 {
-    union block_hdr **lastptr = &block_freelist;
-    union block_hdr *blok = block_freelist;
-
-    /* First, see if we have anything of the required size
-     * on the free list...
-     */
-
-    while (blok != NULL) {
-        if ((apr_ssize_t)min_size + BLOCK_MINFREE <= blok->h.endp - blok->h.first_avail) {
-            *lastptr = blok->h.next;
-            blok->h.next = NULL;
-            debug_verify_filled(blok->h.first_avail, blok->h.endp,
-                                "[new_block] Ouch!  Someone trounced a block "
-                                "on the free list!\n");
-            return blok;
-        }
-        else {
-            lastptr = &blok->h.next;
-            blok = blok->h.next;
-        }
-    }
-
-    /* Nope. */
+    union block_hdr *blok;
 
     min_size += BLOCK_MINFREE;
     blok = malloc_block((min_size > BLOCK_MINALLOC)
@@ -586,7 +485,6 @@
     union block_hdr *blok;
     apr_pool_t *new_pool;
 
-
 #if APR_HAS_THREADS
     if (alloc_mutex) {
         apr_lock_acquire(alloc_mutex);
@@ -960,7 +858,7 @@
 
 APR_DECLARE(apr_size_t) apr_pool_free_blocks_num_bytes(void)
 {
-    return bytes_in_block_list(block_freelist);
+    return 0;
 }
 
 /* the unix linker defines this symbol as the last byte + 1 of
@@ -1255,13 +1153,7 @@
     cur_len = strp - blok->h.first_avail;
 
     /* must try another blok */
-#if APR_HAS_THREADS
-    apr_lock_acquire(alloc_mutex);
-#endif
     nblok = new_block(2 * cur_len, NULL);
-#if APR_HAS_THREADS
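
(To make the amortization concrete: below is a minimal sketch, not the
APR code, of the scheme the patch leaves in place -- large blocks come
from malloc(), allocations inside a block are pointer bumps, and whole
blocks are freed at once when the pool dies.  All names here are
invented, and alignment is ignored for brevity.)

    #include <stdlib.h>
    #include <stddef.h>

    #define BLOCK_SIZE 8192              /* stand-in for BLOCK_MINALLOC */

    struct block {
        struct block *next;
        char *first_avail;               /* next free byte in this block */
        char *endp;                      /* one past the end of the block */
    };

    struct pool {
        struct block *blocks;            /* chain of malloc'd blocks */
    };

    static void *pool_alloc(struct pool *p, size_t size)
    {
        struct block *b = p->blocks;
        if (b == NULL || (size_t)(b->endp - b->first_avail) < size) {
            size_t bsize = size > BLOCK_SIZE ? size : BLOCK_SIZE;
            b = malloc(sizeof(*b) + bsize);   /* one malloc per block... */
            if (b == NULL) {
                return NULL;
            }
            b->first_avail = (char *)(b + 1);
            b->endp = b->first_avail + bsize;
            b->next = p->blocks;
            p->blocks = b;
        }
        void *mem = b->first_avail;           /* ...then cheap pointer bumps */
        b->first_avail += size;
        return mem;
    }

    static void pool_destroy(struct pool *p)
    {
        struct block *b = p->blocks;
        struct block *next;
        for ( ; b; b = next) {               /* free whole blocks at once,
                                                as in the patched free_blocks() */
            next = b->next;
            free(b);
        }
        p->blocks = NULL;
    }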

Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Ian Holsman
dean gaudet wrote:
On Sun, 8 Jul 2001, Justin Erenkrantz wrote:
Also, I did try having the pools use malloc/free directly
(ALLOC_USE_MALLOC) and the performance was dreadful.  At least on
Solaris.  -- justin
yes, ALLOC_USE_MALLOC is dreadful -- that's not what i meant.
what i mean is something like the below patch, which i haven't even tried
to compile or test ;)  (the patch needs work to rescue some of the
debugging code in new_block).
it removes block_freelist and uses malloc/free for blocks.  this is much
less expensive than malloc/free on each allocation (amortization).
-dean
If you can wait till Monday, and you are interested, I'll put it through 
the wringer on an 8-CPU Sun4500
..Ian



Re: Observations on fragmentation in SMS pools

2001-07-08 Thread dean gaudet
On Sun, 8 Jul 2001, Ian Holsman wrote:

 dean gaudet wrote:

 it removes block_freelist and uses malloc/free for blocks.  this is much
 less expensive than malloc/free on each allocation (amortization).

 If you can wait till Monday, and you are interested, I'll put it through
 the wringer on an 8-CPU Sun4500

oh i'm in no rush, i'm being a lazy ass and not even testing these things
i'm posting, so thanks for helping :)  also, you would definitely want to
try the www.hoard.org allocator on a (solaris) machine with that many
cpus.

-dean





Re: Observations on fragmentation in SMS pools

2001-07-08 Thread dean gaudet
On Sun, 8 Jul 2001, Brian Pane wrote:

 dean gaudet wrote:

 On Sun, 8 Jul 2001, Ian Holsman wrote:
 
 dean gaudet wrote:
 
 it removes block_freelist and uses malloc/free for blocks.  this is much
 less expensive than malloc/free on each allocation (amortization).
 
 If you can wait till Monday, and you are interested, I'll put it through
 the wringer on an 8-CPU Sun4500
 
 
 oh i'm in no rush, i'm being a lazy ass and not even testing these things
 i'm posting, so thanks for helping :)  also, you would definitely want to
 try the www.hoard.org allocator on a (solaris) machine with that many
 cpus.
 
 If I remember correctly, he's tried the Hoard allocator on
 a server that size with ALLOC_USE_MALLOC, and it performed
 much better than the global free list.  But Hoard is LGPL. :-(

i would totally believe that the ALLOC_USE_MALLOC code performs better
with hoard on a server that size... because the ALLOC_USE_MALLOC code
doesn't need to use a mutex ever.  the global freelist stuff is just a bad
idea on a box with 8 cpus, the mutex kills ya.

but with the elimination of the mutex we should scale better even just
using regular malloc.

i wasn't advocating bundling hoard, it's firmly my opinion that each libc
vendor should be doing their damnedest to provide the best malloc possible.
(especially because each libc vendor tends to also be a libpthread vendor
and they know the dirty details of making the two work well together).
i've no qualms with putting "try hoard" into the "how do i perf tune
apache?" page.  it wouldn't be a requirement...

-dean
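
(A sketch of the contention dean is describing, with invented names: a
freelist shared by every thread serializes all CPUs on one mutex, while
calling malloc() directly pushes the concurrency problem down into the
allocator, which hoard or a good libc malloc can solve with per-CPU
heaps.)

    #include <pthread.h>
    #include <stdlib.h>

    struct blk {
        struct blk *next;
    };

    static struct blk *global_freelist;
    static pthread_mutex_t freelist_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* the old scheme: every allocation from every thread takes one lock */
    static struct blk *get_block_locked(void)
    {
        struct blk *b;
        pthread_mutex_lock(&freelist_mutex);     /* serializes all CPUs */
        b = global_freelist;
        if (b != NULL) {
            global_freelist = b->next;
        }
        pthread_mutex_unlock(&freelist_mutex);
        return b != NULL ? b : malloc(4096);
    }

    /* the patched scheme: no shared state of our own, no lock of our own */
    static struct blk *get_block_malloc(void)
    {
        return malloc(4096);   /* contention becomes the allocator's problem */
    }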




Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Ian Holsman
Jon Travis wrote:
On Sun, Jul 08, 2001 at 11:27:08AM -0700, Ian Holsman wrote:
Justin Erenkrantz wrote:
On Sun, Jul 08, 2001 at 11:04:09AM -0700, Jon Travis wrote:
[snip]
I'm still curious about the use case.  I'm sure there is one, if you
could just enlighten me.  I already know it's 'cool'.
for every page request which comes in, we need to figure out what app is 
serving it, and which category it belongs to (e.g. it is a price 
comparison page for a mobile phone); this information is used by other 
modules to determine different things (e.g. we store info on the 
category tree so we know that a mobile phone category is part of 
electronics).

so we plug parts of the URI into a hash table and store info about it 
for each different category.
now.. the reason we are going with shared memory on this is that there 
are ~10,000 (soon to be 60,000) categories on our site, and we also have 
other info we store, so it blows out to ~8M-64M of memory being used 
just to store this stuff.

we 'could' have used a gdbm/db, but I think that direct shared memory 
would be faster than a file-based DB shoved into a memory cache.

The other unique application of shared memory that we plan to do soon is 
having a state table stored in shared mem to implement a faster setenvIf 
module.

..Ian
(ps.. if you can suggest other methods of doing the above, I would be 
grateful for the ideas)
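
(A rough illustration of the lookup described above, using the real
apr_hash_t API but invented record and function names; the actual
system keeps the table in shared memory rather than a private pool.)

    #include "apr_pools.h"
    #include "apr_hash.h"

    typedef struct {
        const char *name;          /* e.g. "mobile phones" */
        const char *parent;        /* e.g. "electronics" */
    } category_info;               /* hypothetical record */

    static apr_hash_t *build_category_table(apr_pool_t *p)
    {
        apr_hash_t *cats = apr_hash_make(p);
        category_info *ci = apr_palloc(p, sizeof(*ci));
        ci->name = "mobile phones";
        ci->parent = "electronics";
        /* key on the URI segment that names the category */
        apr_hash_set(cats, "mobile-phones", APR_HASH_KEY_STRING, ci);
        return cats;
    }

    static const category_info *lookup_category(apr_hash_t *cats,
                                                const char *uri_segment)
    {
        return apr_hash_get(cats, uri_segment, APR_HASH_KEY_STRING);
    }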





Re: Observations on fragmentation in SMS pools

2001-07-08 Thread Ben Laurie
dean gaudet wrote:
 
 On Sun, 8 Jul 2001, Ben Laurie wrote:
 
  Justin Erenkrantz wrote:
   Aaron pointed out that CLR (Cormen/Leiserson/Rivest for those that never
   took algorithms classes) has many memory allocation algorithms.  I'd
   bet we could find one that would work better.  -- justin
 
  At USENIX there was a talk about extending the slab allocator to
  multiple CPUs. One of the things in the talk was a thing called vmem,
  which is a general purpose resource allocator with low fragmentation and
  constant-time allocations (optionally - for more time you can get a
  better fit). Sounds pretty good, eh?
 
  http://www.usenix.org/publications/library/proceedings/usenix01/bonwick.html
 
 linux 2.4 uses a slab allocator for many critical data structures -- in
 particular the networking code uses it.  this is one of the tricks they
 used between 2.2 and 2.4 to increase the scalability to large numbers of
 CPUs.  their allocator does have per-cpu caching, but i don't remember all
 the details.

I should be clear: this gadget goes underneath the slab allocator. That
is, the slab allocator uses memory allocated by vmem. Apparently, this
is because slab allocators always defer the question of how to get big
lumps of memory, and vmem is what it is proposed they should defer to.
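
(A sketch of that layering with invented names: the slab cache never
decides where its big lumps come from; it is handed a backing allocator
-- vmem in the talk, plain malloc here -- and defers to it.  Per-object
freeing is omitted for brevity.)

    #include <stdlib.h>
    #include <stddef.h>

    typedef void *(*lump_alloc_fn)(size_t);  /* vmem, malloc, ... */

    struct slab {
        struct slab *next;
        char *free_ptr;
        char *end;
    };

    struct slab_cache {
        size_t obj_size;             /* all objects in a cache are one size */
        size_t slab_size;            /* size of each big lump */
        lump_alloc_fn get_lump;      /* the deferred "big lump" question */
        struct slab *slabs;
    };

    static void *slab_cache_alloc(struct slab_cache *c)
    {
        struct slab *s = c->slabs;
        if (s == NULL || (size_t)(s->end - s->free_ptr) < c->obj_size) {
            s = c->get_lump(sizeof(*s) + c->slab_size);  /* ask the backer */
            if (s == NULL) {
                return NULL;
            }
            s->free_ptr = (char *)(s + 1);
            s->end = s->free_ptr + c->slab_size;
            s->next = c->slabs;
            c->slabs = s;
        }
        void *obj = s->free_ptr;
        s->free_ptr += c->obj_size;
        return obj;
    }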

BTW, the obvious thing to do to cure the problem described in the first
place is to not add extra memory after the first step up the tree.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff