The cause of the problem is that every time apr_hash_set(..., non-NULL) adds a new entry, it allocates some new pool memory for that entry and links it into one of its linked lists. When apr_hash_set(..., NULL) deletes an entry, it unlinks it but never re-uses that memory.
So this is an inefficiency in APR's implementation of hash tables. It is not exactly a "leak", since it is contained with a pool, but it is unbounded memory use caused by repeated simple actions - not a good property for a container to have.
It would be possible to fix this in APR if people thought it worthwhile, e.g. by using a "free-list". (Maintain, in the hash table, a list of free entries, initially empty. When deleting an entry, move it to the free-list. When a new entry is wanted, take one from the free-list if available, else allocate it from the pool.) This would make the memory used by a hash table proportional to the maximum number of entries that it had ever held simultaneously, rather than the number of insertions that had ever been performed.
In fact, using a free-list seems to be quite easy, with virtually no speed penalty. A patch is attached, prepared in the manner of a Subversion patch.
(Please copy replies to [EMAIL PROTECTED], as I am not subscribed to [email protected])
- Julian
Prevent unbounded memory use during repeated operations on a hash table.
The hash table was allocating new memory for each new insertion of an entry,
and not reclaiming it on deletions, so repetition of (insert, delete) caused
the memory use to keep growing. This fix causes the memory freed by a deletion
to be reused by a subsequent insertion, so that the memory used by the hash
table is proportional to the maximum number of entries that it has ever held
simultaneously, rather than the number of insertions that have been performed.
* apr/tables/apr_hash.c:
(apr_hash_t): Add new field "free", a free-list.
(apr_hash_make, apr_hash_copy, apr_hash_merge): Initialise the free-list.
(find_entry): Use an entry from the free-list if there is one.
(apr_hash_set): Return a deleted entry to the free-list.
Index: tables/apr_hash.c
===================================================================
RCS file: /home/cvspublic/apr/tables/apr_hash.c,v
retrieving revision 1.40
diff -u -3 -p -d -r1.40 apr_hash.c
--- tables/apr_hash.c 19 Apr 2004 11:44:37 -0000 1.40
+++ tables/apr_hash.c 2 Nov 2004 22:14:34 -0000
@@ -73,6 +73,7 @@ struct apr_hash_t {
apr_hash_index_t iterator; /* For apr_hash_first(NULL, ...) */
unsigned int count, max;
apr_hashfunc_t hash_func;
+ apr_hash_entry_t *free; /* List of recycled entries */
};
#define INITIAL_MAX 15 /* tunable == 2^n - 1 */
@@ -92,6 +93,7 @@ APR_DECLARE(apr_hash_t *) apr_hash_make(
apr_hash_t *ht;
ht = apr_palloc(pool, sizeof(apr_hash_t));
ht->pool = pool;
+ ht->free = NULL;
ht->count = 0;
ht->max = INITIAL_MAX;
ht->array = alloc_array(ht, ht->max);
@@ -264,7 +266,10 @@ static apr_hash_entry_t **find_entry(apr
return hep;
/* add a new entry for non-NULL values */
- he = apr_palloc(ht->pool, sizeof(*he));
+ if (he = ht->free)
+ ht->free = he->next;
+ else
+ he = apr_palloc(ht->pool, sizeof(*he));
he->next = NULL;
he->hash = hash;
he->key = key;
@@ -286,6 +291,7 @@ APR_DECLARE(apr_hash_t *) apr_hash_copy(
sizeof(*ht->array) * (orig->max + 1) +
sizeof(apr_hash_entry_t) * orig->count);
ht->pool = pool;
+ ht->free = NULL;
ht->count = orig->count;
ht->max = orig->max;
ht->hash_func = orig->hash_func;
@@ -333,7 +339,10 @@ APR_DECLARE(void) apr_hash_set(apr_hash_
if (*hep) {
if (!val) {
/* delete entry */
+ apr_hash_entry_t *old = *hep;
*hep = (*hep)->next;
+ old->next = ht->free;
+ ht->free = old;
--ht->count;
}
else {
@@ -396,6 +405,7 @@ APR_DECLARE(apr_hash_t *) apr_hash_merge
res = apr_palloc(p, sizeof(apr_hash_t));
res->pool = p;
+ res->free = NULL;
res->hash_func = base->hash_func;
res->count = base->count;
res->max = (overlay->max > base->max) ? overlay->max : base->max;
