So my validator is running into a weird issue where output files are frequently 
missing from results when it attempts to validate them.  Sometimes they 
seem to show up later, but when I set the retry flag to true in the validator, 
it seems to enter an infinite loop where it repeatedly tries to validate 
the results.

Any reason why a not-insignificant number of results would reach the validator 
without an output file? Checking in the database, these are results that 
completed successfully (everything looks fine in the stderr output, which is 
written at the same time the output files are written).


Basically, I'm reading the result file into a string:

string get_file_as_string(string file_path) throw (int) {
    //read the entire contents of the file into a string
    ifstream sites_file(file_path.c_str());

    if (!sites_file.is_open()) {
        throw 1;
    }

    std::string fc;

    sites_file.seekg(0, std::ios::end);   
    fc.reserve(sites_file.tellg());
    sites_file.seekg(0, std::ios::beg);

    fc.assign((std::istreambuf_iterator<char>(sites_file)),
              std::istreambuf_iterator<char>());

    return fc;
}
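
For diagnosing why the open fails, a variant that reports the reason can help. The following is only a minimal sketch, not the code in the repository: `read_file_or_error` and `FileReadError` are hypothetical names, and it assumes `errno` is set by the underlying open call, which is typical on POSIX systems but not guaranteed by the C++ standard.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical exception type carrying the errno seen after a failed open.
struct FileReadError : std::runtime_error {
    int saved_errno;
    FileReadError(const std::string &msg, int e)
        : std::runtime_error(msg), saved_errno(e) {}
};

// Read the whole file into a string; throw FileReadError if it cannot be opened.
std::string read_file_or_error(const std::string &file_path) {
    errno = 0;
    std::ifstream in(file_path.c_str(), std::ios::in | std::ios::binary);
    if (!in.is_open()) {
        int e = errno;  // assumption: typically ENOENT if the file is not there (yet)
        throw FileReadError("could not open " + file_path + ": " + std::strerror(e), e);
    }

    std::string contents;
    in.seekg(0, std::ios::end);
    std::streamoff size = in.tellg();  // -1 if the position cannot be determined
    in.seekg(0, std::ios::beg);
    if (size > 0) {
        contents.reserve(static_cast<std::string::size_type>(size));
    }

    contents.assign(std::istreambuf_iterator<char>(in),
                    std::istreambuf_iterator<char>());
    return contents;
}
```

Logging `saved_errno` next to the file path would show whether the file is really absent or present but unreadable (permissions, for example), which are different problems.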

This throws an exception if it can't open the file.  If that happens, the file 
parsing function returns the error:

    string fc;
    try {
        fc = get_file_as_string(file_path);
    } catch (int err) {
        log_messages.printf(MSG_CRITICAL,
            "[RESULT#%d %s] get_data_from_result: could not open file for result\n",
            result.id, result.name);
        log_messages.printf(MSG_CRITICAL, "     file path: %s\n",
            file_path.c_str());
        //retry this result?
        return err;
    }

If that happens, I set the retry flag to true and return an error from 
check_set:

            retval = get_data_from_result(uint32_max, checksum, failed_sets,
                                          results[i]);
            if (retval) {
                log_messages.printf(MSG_CRITICAL,
                    "result[%2d] - id: %10d, error getting data from result: %d, retrying.\n",
                    i, results[i].id, retval);
                retry = true;
                return retval;
            }
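
One thing that may be worth checking is whether a nonzero return value is handled differently from the retry flag. The sketch below is purely illustrative and uses no BOINC code: `MockResult`, `load_result`, and `check_set_sketch` are hypothetical names. It assumes a transient failure should be signalled with `retry = true` plus a zero return, with nonzero reserved for unrecoverable errors. That assumption needs confirming against the validator framework source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a result record; not the BOINC RESULT type.
struct MockResult {
    int id;
    bool file_present;  // simulates whether the output file can be opened
};

// Hypothetical loader: returns 0 on success, nonzero if the file is missing.
int load_result(const MockResult &r, std::string &data) {
    if (!r.file_present) return 1;
    data = "payload";
    return 0;
}

// Sketch of a check function that treats a missing file as transient:
// it sets retry and returns 0 instead of propagating the error code.
int check_set_sketch(const std::vector<MockResult> &results, bool &retry) {
    retry = false;
    for (std::size_t i = 0; i < results.size(); i++) {
        std::string data;
        int retval = load_result(results[i], data);
        if (retval) {
            retry = true;  // ask the caller to try this set again later
            return 0;      // assumption: nonzero is reserved for fatal errors
        }
    }
    return 0;
}
```

If the framework does treat a nonzero return as fatal, returning `retval` while also setting `retry` could explain repeated immediate retries. That is a guess, not something verified here.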

(All the validation code is here:
https://github.com/travisdesell/Subset-Sum/blob/master/server/sss_validation_policy.cpp )

So is there any reason this would enter an infinite loop, with the validator 
repeatedly retrying these results? Am I returning the wrong error value, or is 
there something in the results I need to set so that it backs off from these 
results for a while before retrying them?  It seems like in some of the cases 
(maybe all of them?) the file does eventually show up.


Thanks,
--Travis

---------------------------------------------------------------------------
Travis Desell,  Assistant Professor
University of North Dakota - Dept. of Computer Science 
[email protected] - cell: 518-867-1054
Streibel Hall Room 220 - office: 777-701-3477
3950 Campus Road Stop 9015
Grand Forks, North Dakota 52802-9015

Homepage ( http://people.cs.und.edu/~tdesell/ )
MilkyWay@Home ( http://milkyway.cs.rpi.edu/ )
DNA@Home ( http://volunteer.cs.und.edu/dna )
Worldwide Computing Laboratory ( http://wcl.cs.rpi.edu/ )
----------------------------------------------------------------------------
