ok, from reading through the syslog, Jun 2 07:27:01 was the point where the last activity from a housekeeping thread was logged.
Jun 2 07:37:05 was the last timeout from all jobs scheduled from there, so I really don't think anything was blocked inside the eventqueue thread.
searching the log for 'processing outbound queue', which is output by the smtp-queue, shows that.
since my attempts to attach to citserver were probably after the restart, we can't say whether the thread acting as housekeeper was blocked, or whether do_housekeeping() was simply no longer being called.
whatever the reason was, it's related to either one of the housekeeping sub-jobs being blocked, or the housekeeper facility itself not working.
so it comes down to 'everything much cleaner' introducing some race condition, or some parts of the housekeeper itself having a race condition.
Feeding jobs into the event-queue is signalled through a non-blocking pipe with the libev function ev_async_send(), so the signal itself can't block.
it is, however, protected by mutexes, which could be involved in a race condition.
alternatively some other part of the housekeeping / queue / indexer could be blocking.
e.g. the citadel networker holds a mutex on the list of active server internetworking sessions, or one for access to the netconfig.
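to make the feed path above concrete, here is a minimal sketch in plain POSIX C of the general pattern (mutex-protected job list plus a non-blocking wakeup). it is not the citserver code and does not use libev; job_queue, queue_feed() and queue_take_all() are hypothetical names, and a self-made pipe stands in for what ev_async_send() does. the point is that the mutex is held only while linking the job, and the wakeup write can never block:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical job queue; none of these names come from citserver. */
typedef struct job {
    int id;
    struct job *next;
} job;

typedef struct {
    pthread_mutex_t lock;   /* protects head/tail only */
    job *head, *tail;
    int wake_fd[2];         /* [0] read end for the loop, [1] write end */
} job_queue;

static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

int queue_init(job_queue *q) {
    assert(q != NULL);
    q->head = q->tail = NULL;
    if (pthread_mutex_init(&q->lock, NULL) != 0) return -1;
    if (pipe(q->wake_fd) != 0) return -1;
    if (set_nonblocking(q->wake_fd[0]) != 0) return -1;
    if (set_nonblocking(q->wake_fd[1]) != 0) return -1;
    return 0;
}

/* Producer side: hold the lock only while linking the job, then signal. */
int queue_feed(job_queue *q, int id) {
    assert(q != NULL);
    job *j = malloc(sizeof *j);
    if (j == NULL) return -1;
    j->id = id;
    j->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = j; else q->head = j;
    q->tail = j;
    pthread_mutex_unlock(&q->lock);

    /* The wakeup must not block; a full pipe (EAGAIN) just means
     * a wakeup is already pending, so it is not an error. */
    char byte = 1;
    if (write(q->wake_fd[1], &byte, 1) < 0 &&
        errno != EAGAIN && errno != EWOULDBLOCK)
        return -1;
    return 0;
}

/* Consumer side: drain pending wakeups, then detach the whole list
 * under the lock and process it without holding the lock. */
job *queue_take_all(job_queue *q) {
    assert(q != NULL);
    char buf[64];
    while (read(q->wake_fd[0], buf, sizeof buf) > 0) { /* drain */ }

    pthread_mutex_lock(&q->lock);
    job *list = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return list;
}
```

if a job callback run by the consumer ever takes another mutex (netconfig, the networker list) while some other thread holds that one and waits on the queue lock, that is where a deadlock could come from; keeping the queue lock this short rules out at least that half.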
since my last commits fixed a bug in read_network_map() (which was basically always returning a NULL pointer up front), I think we are seeing a follow-up of that here.
the first follow-up was a crash caused by the pointer still being in use after it was freed.
maybe the second is a possible deadlock.
the networking code was written in a non-thread-safe manner with global variables, which sometimes caused crashes on uncensored when citadel client sessions and the networker/housekeeping thread accessed or freed these vars concurrently. I changed that in 7.8x, and in doing so introduced the above-mentioned NULL pointer bug.
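for illustration only, this is roughly the kind of pattern that avoids the concurrent access / free problem with a shared global. it is a sketch under my own assumptions, not what the 7.8x code actually does: shared_map, map_replace() and map_snapshot() are made-up names, and the map is reduced to a plain string. readers get a private copy under the lock, so the writer can free the old buffer without anyone still pointing into it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared state; not the actual citserver structures. */
static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static char *shared_map = NULL;   /* only touched while holding map_lock */

/* Writer: swap in new contents, then free the old copy once it is
 * no longer reachable through shared_map. */
int map_replace(const char *contents) {
    assert(contents != NULL);
    char *fresh = strdup(contents);
    if (fresh == NULL) return -1;

    pthread_mutex_lock(&map_lock);
    char *old = shared_map;
    shared_map = fresh;
    pthread_mutex_unlock(&map_lock);

    free(old);  /* free(NULL) is a no-op on the first call */
    return 0;
}

/* Reader: returns a private copy the caller must free(); NULL if unset. */
char *map_snapshot(void) {
    pthread_mutex_lock(&map_lock);
    char *copy = shared_map ? strdup(shared_map) : NULL;
    pthread_mutex_unlock(&map_lock);
    return copy;
}
```

a reader calls map_snapshot(), works on its copy, and frees it when done; a later map_replace() from the housekeeping side cannot invalidate it. the cost is one copy per read, which only makes sense if the data is small.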