We're still collecting more trace data on the scheduler to determine why this is happening, but the engineer looking into it has provided the following so far:
----------
The scheduler uses select() to multiplex I/O to its children, and it does a good job of handling that I/O asynchronously. Where it falls down is when it communicates with the postgres server. It makes postgres server calls synchronously, and one specific query to the server can take multiple minutes to perform. (It's the status clean-up code in DBCheckStatus(), if you're following this in the source code. :-)

The scheduler has a 10-second timeout when waiting for communication with the child processes, but all of the watchdog timeouts for that I/O get blown off when the scheduler waits multiple minutes for a database query. I'm trying to figure out why this particular query should take so long, since the scheduler doesn't seem to wait very long at all to update its status information in the DB.
----------

I was wondering whether this seems like a reasonable hypothesis about the scheduler's current activity (again, running on the older version). What are your thoughts on this, Bob?

-FuRoSh...
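
P.S. To make the hypothesis concrete, here is a minimal Python sketch of the pattern described above. It is not the scheduler's actual C code. The names run_loop and slow_db_query, the pipe standing in for a child process, and the shortened timings are all assumptions made for illustration, and it assumes a POSIX system where select() works on pipes. It only shows that a synchronous call inside a select() loop stretches the gap between polls past the watchdog timeout.

```python
import os
import select
import time

# Shortened stand-ins (assumptions) for the 10-second timeout and the
# multi-minute query described in the message.
WATCHDOG_TIMEOUT = 0.1
SLOW_QUERY_TIME = 0.3


def slow_db_query():
    """Hypothetical stand-in for a synchronous database call."""
    time.sleep(SLOW_QUERY_TIME)


def run_loop(iterations, blocking_call=None):
    """Poll a pipe with select() and return the longest gap between polls."""
    read_fd, write_fd = os.pipe()  # the pipe plays the role of a child process
    worst_gap = 0.0
    last_poll = time.monotonic()
    try:
        for _ in range(iterations):
            os.write(write_fd, b"x")  # pretend the child produced output
            readable, _, _ = select.select([read_fd], [], [], WATCHDOG_TIMEOUT)
            now = time.monotonic()
            worst_gap = max(worst_gap, now - last_poll)
            last_poll = now
            if readable:
                os.read(read_fd, 1)
            if blocking_call is not None:
                # Nothing is polled while this call is in progress.
                blocking_call()
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return worst_gap


if __name__ == "__main__":
    print("worst gap, no blocking call:  ", run_loop(3))
    print("worst gap, with blocking call:", run_loop(3, slow_db_query))
```

Without the blocking call, the gap between polls stays well under the timeout. With it, every gap is at least as long as the call itself, which is the starvation the engineer describes.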

