Hi All,
I am running jabber 1.4.2, not the latest from cvs
but fairly new. I have a test application which logs into a jabber server in an
async manner. Async meaning that if I give this app 200 users to log in, it
doesn't do the logins in a linear manner; it is a state machine, so any user can
be in any state of login at any time during the login process. During the
login I get a segfault in xdb_thump because it is trying to remove an entry from
the linked list that is already gone (I assume due to the state of the login).
Problem code I believe:
    result xdb_thump(void *arg)
    {
        xdbcache xc = (xdbcache)arg;
        xdbcache cur, next;
        int now = time(NULL);

        /* spin through the cache looking for stale requests */
        cur = xc->next;
        while(cur != xc)
        {
            next = cur->next;

            ++++++ 30 seconds old
            /* really old ones get wacked */
            if((now - cur->sent) > 30)
            {
                /* remove from ring */
                cur->prev->next = cur->next;
                cur->next->prev = cur->prev;

                /* make sure it's null as a flag for xdb_set's */
                cur->data = NULL;

                /* free the thread! */
                cur->preblock = 0;
                if(cur->cond != NULL)
                    pth_cond_notify(cur->cond, FALSE);

                cur = next;
                continue;
            }

            /* resend the waiting ones every so often */
            if((now - cur->sent) > 10)
                xdb_deliver(xc->i, cur);

            /* cur could have been free'd already on it's thread */
            cur = next;
        }
        return r_DONE;
    }

Gets invoked by the following code:
    xdbcache xdb_cache(instance id)
    {
        xdbcache newx;

        if(id == NULL)
        {
            fprintf(stderr, "Programming Error: xdb_cache() called with NULL\n");
            return NULL;
        }

        newx = pmalloco(id->p, sizeof(_xdbcache));
        newx->i = id; /* flags it as the top of the ring too */
        newx->next = newx->prev = newx; /* init ring */

        ++++++ We register a handler here to handle requests, which in the
        ++++++ correct case removes the cache from the list
        /* register the handler in the instance to filter out xdb results */
        register_phandler(id, o_PRECOND, xdb_results, (void *)newx);

        ++++++ We register a beat here to check the cache for entries that
        ++++++ are over 30 seconds old and remove them
        /* heartbeat to keep a watchful eye on xdb_cache */
        register_beat(10,xdb_thump,(void *)newx);

        return newx;
    }

NOTE: ++++++ are my comments
I believe that there is one thread handling the beat and a different thread
handling the phandler; if that is not true then my theory is shot. If it is
true, then what is stopping the handler thread from removing the same entry
that the beat already removed?
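
To make the question concrete, here is a minimal standalone sketch of the
failure I suspect. It is not jabberd code: the node type and the ring_*
helpers are hypothetical stand-ins for the xdbcache ring, and the guard in
ring_unlink_once() is only one possible approach, assuming the two removals
do not interleave in the middle of the pointer updates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an xdbcache entry: only the ring links. */
typedef struct node {
    struct node *prev, *next;
} node;

/* An empty ring is a head that points at itself, as in xdb_cache(). */
static void ring_init(node *head)
{
    head->next = head->prev = head;
}

/* Insert n directly after head. */
static void ring_insert(node *head, node *n)
{
    n->prev = head;
    n->next = head->next;
    head->next->prev = n;
    head->next = n;
}

/* Unguarded unlink: the same two assignments xdb_thump() performs.
 * n keeps its old prev/next, so a second call writes through stale links. */
static void ring_unlink(node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Guarded unlink (an assumption, not existing jabberd behaviour):
 * clear the links after removal so a repeated call is a no-op.
 * Returns 1 if the entry was removed, 0 if it was already gone. */
static int ring_unlink_once(node *n)
{
    if (n->next == NULL || n->prev == NULL)
        return 0;
    ring_unlink(n);
    n->next = n->prev = NULL;
    return 1;
}

/* Count the entries reachable from head, walking forward. */
static int ring_count(const node *head)
{
    int count = 0;
    const node *p;
    for (p = head->next; p != head; p = p->next)
        count++;
    return count;
}
```

With a ring head -> a -> b, calling ring_unlink(&a), then ring_unlink(&b),
then ring_unlink(&a) a second time leaves head.next pointing at b again, even
though b was already removed (and in the real code could have been freed).
With ring_unlink_once() the second removal of a returns 0 and the ring stays
empty.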
Any insight would be great.
Thanks
Glenn