I'm thinking the work generator code would look something like:

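int transitioner_backlog_minutes(double &backlog_minutes)
{
    MYSQL_RES* rp;
    MYSQL_ROW row;
    int retval;

    backlog_minutes = 0.0;
    // transition_time is stored as a Unix timestamp, so take the
    // difference in seconds and convert it to minutes
    retval = boinc_db.do_query(
        "select (UNIX_TIMESTAMP() - MIN(transition_time))/60 "
        "as backlog_minutes from workunit"
    );
    if (retval) {
        log_messages.printf(MSG_CRITICAL,
            "Error %d in call to transitioner_backlog_minutes\n", retval
        );
        return retval;
    }
    rp = mysql_store_result(boinc_db.mysql);
    if (!rp) return ERR_DB_NOT_FOUND;
    row = mysql_fetch_row(rp);
    // MIN() is NULL when the workunit table is empty; treat as no backlog
    if (row && row[0]) {
        sscanf(row[0], "%lf", &backlog_minutes);
    }
    mysql_free_result(rp);
    return 0;
}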


void main_loop() {
    int retval;
    while (1) {
        check_stop_daemons();
        int n = 0;
        double backlog = 0.0;
        retval = count_unsent_results(n, app.id);
        if (!retval) {
            retval = transitioner_backlog_minutes(backlog);
        }
        if (retval == 0) {
            // do not create work if we have enough WUs or
            // the transition backlog exceeds 5 minutes
            if ((n > CUSHION) || (backlog > 5.0)) {
                sleep(30);
            }
            else {


I think it would also have to check to make sure the transitioner daemon is
running.  If it isn't, then the work generator will keep making WUs because
the transitioner isn't making result records.

That, or change the code so the first result record is made when the work
is generated rather than waiting for the transitioner to do it.  I don't
know if that would be a simpler solution or not.
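
For the first option, here is a rough sketch of what the extra check might
look like.  It is not BOINC code: process_is_alive() and
should_generate_work() are made-up names, and it assumes the work generator
can get the transitioner's pid from somewhere (for example the daemon's pid
file), which isn't shown here.

```cpp
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns true if a process with this pid exists.
// Signal 0 runs the existence and permission checks without actually
// delivering a signal.
bool process_is_alive(pid_t pid) {
    if (pid <= 0) return false;
    if (kill(pid, 0) == 0) return true;
    // EPERM means the process exists but belongs to another user
    return errno == EPERM;
}

// Hypothetical helper: decides whether the work generator should create
// more workunits on this pass of the main loop.
bool should_generate_work(
    int unsent, int cushion,
    double backlog_minutes, double max_backlog_minutes,
    bool transitioner_running
) {
    // no transitioner means no result records, so unsent never rises
    if (!transitioner_running) return false;
    // enough unsent results are already queued
    if (unsent > cushion) return false;
    // the transitioner is too far behind to absorb more workunits
    if (backlog_minutes > max_backlog_minutes) return false;
    return true;
}
```

In main_loop() the test would then become something like
should_generate_work(n, CUSHION, backlog, 5.0, process_is_alive(pid)),
sleeping whenever it returns false.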

Jon Sonntag



On Tue, Dec 10, 2013 at 10:17 AM, Jon Sonntag <[email protected]> wrote:

> If the transitioner is backlogged, the work generator is called to create
> more work.  But, since the transitioner creates the results, the server
> thinks that there isn't any work so it creates more.  That causes the
> transitioner to get even more bogged down trying to create the result
> records.  Collatz is only supposed to have 1K workunits per app.  As a
> result of the "catch-22" described, there are almost 500,000 WUs now.  That
> causes the database to grow even larger and when it exceeds the physical
> RAM, MySQL slows down considerably which makes the transitioner even slower
> causing even more workunits to be generated.
>
> So... is there any way that the work generator can check not only that the
> count of WUs per appid is below the threshold but also that the
> transitioner is not backlogged prior to generating more work?
>
> Jon Sonntag
>
