Hi All,
 
I am running jabber 1.4.2, not the latest from CVS but fairly new. I have a test application that logs into a jabber server asynchronously. By async I mean that if I give the app 200 users to log in, it does not do the logins one after another; it is a state machine, so any user can be in any state of the login process at any time. During the login I get a segfault in xdb_thump because it is trying to remove an entry from the linked list that is already gone (I assume due to the state of the login). The code I believe is the problem:
 
result xdb_thump(void *arg)
{
    xdbcache xc = (xdbcache)arg;
    xdbcache cur, next;
    int now = time(NULL);
 
    /* spin through the cache looking for stale requests */
    cur = xc->next;
    while(cur != xc)
    {
        next = cur->next;
 
++++30 seconds old
        /* really old ones get wacked */
        if((now - cur->sent) > 30)
        {
            /* remove from ring */
            cur->prev->next = cur->next;
            cur->next->prev = cur->prev;
 
            /* make sure it's null as a flag for xdb_set's */
            cur->data = NULL;
 
            /* free the thread! */
            cur->preblock = 0;
            if(cur->cond != NULL)
                pth_cond_notify(cur->cond, FALSE);
 
            cur = next;
            continue;
        }
 
        /* resend the waiting ones every so often */
        if((now - cur->sent) > 10)
            xdb_deliver(xc->i, cur);
 
        /* cur could have been free'd already on it's thread */
        cur = next;
    }
 
    return r_DONE;
}
 
 
Gets invoked by the following code:
 
xdbcache xdb_cache(instance id)
{
    xdbcache newx;
 
    if(id == NULL)
    {
        fprintf(stderr, "Programming Error: xdb_cache() called with NULL\n");
        return NULL;
    }
 
    newx = pmalloco(id->p, sizeof(_xdbcache));
    newx->i = id; /* flags it as the top of the ring too */
    newx->next = newx->prev = newx; /* init ring */
 
++++++ We register a handler here to handle requests, which in the correct case removes the cache from the list
    /* register the handler in the instance to filter out xdb results */
    register_phandler(id, o_PRECOND, xdb_results, (void *)newx);
 
+++++We register a beat here to check the cache for entries that are over 30 seconds old and remove them
    /* heartbeat to keep a watchful eye on xdb_cache */
    register_beat(10,xdb_thump,(void *)newx);
 
    return newx;
}
 
NOTE: lines starting with ++++ are my comments
 
I believe that one thread handles the beat and a different thread handles the phandler; if that is not true, my theory is shot. If it is true, what stops the handler thread from removing the same entry that the beat has already removed?
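
To make the suspected failure concrete, here is a minimal sketch of a ring where a second removal of the same entry becomes a no-op. Everything in it is hypothetical: the node struct, the linked flag, and the ring_* helpers are my own illustration, not the jabberd xdbcache structure or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring entry; NOT the jabberd _xdbcache struct. */
typedef struct node {
    struct node *prev, *next;
    int linked;                 /* 1 while the node is part of the ring */
} node;

/* Initialise an empty ring: the head points at itself. */
static void ring_init(node *head)
{
    head->prev = head->next = head;
    head->linked = 1;
}

/* Insert n directly after head. */
static void ring_add(node *head, node *n)
{
    assert(head != NULL && n != NULL);
    n->prev = head;
    n->next = head->next;
    head->next->prev = n;
    head->next = n;
    n->linked = 1;
}

/* Unlink n only if it is still in the ring.
 * Returns 1 if this call removed it, 0 if it was already gone. */
static int ring_remove_once(node *n)
{
    if (!n->linked)
        return 0;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;   /* stale pointers can no longer be followed */
    n->linked = 0;
    return 1;
}

/* Count the entries in the ring, excluding the head itself. */
static int ring_count(const node *head)
{
    int count = 0;
    const node *cur;
    for (cur = head->next; cur != head; cur = cur->next)
        count++;
    return count;
}
```

If both the timeout path and the result path went through something like ring_remove_once, whichever ran second would just return 0 instead of touching pointers that are no longer valid. The flag alone is not a lock, though: it only helps if the two paths cannot interleave in the middle of the pointer updates.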
 
 
Any insight would be great.
 
    Thanks
 
            Glenn
 
