I'll look into that one and come up with a fix.

On Thu, Jun 4, 2015 at 7:31 AM, Teemu Mannermaa <[email protected]> wrote:
> Hello,
>
> I've been tracking down a strange scheduler problem where it fails to
> find a valid app version for a host at random times, only to succeed on
> the next round. There is clearly a valid app version for the host.
>
> I added some additional logging (attached, as it helps in understanding
> all the failure conditions of version selection) and found that the `if` at
> https://github.com/BOINC/boinc/blob/master/sched/sched_version.cpp#L845
> seems to be the one that fails even though there's no BAVP yet. Somehow
> r ends up negative here, as shown in the log:
> "[version] Not selected, AV#36 r*45.66 GFLOP <= Best AV 0.00 GFLOP
> (r=-1.391884, n=1)"
> so the app version is never selected, even if it's the only one
> otherwise valid for the host. :(
>
> This relates to the <version_select_random_factor> option, as
> explained in
> https://boinc.berkeley.edu/trac/wiki/ProjectOptions#Appversionselection.
> We had no value set for it before, so the default at
> https://github.com/BOINC/boinc/blob/master/sched/sched_config.cpp#L94
> gets used (1.0, which already seems to contradict the documented default
> of 0.1). I'm not sure what range of values rand_normal() at
> https://github.com/BOINC/boinc/blob/master/lib/util.cpp#L571 can return,
> but my guess is that they can be negative (and less than -1).
>
> I'm not sure how to fix this, as I don't quite understand the logic
> behind this code. I've worked around it for now by setting
> <version_select_random_factor> explicitly to 0.1, and since then there
> have been no negative r incidents. We might as well disable it, as the
> estimation discrepancies (especially between a new OpenCL app and an old
> CUDA/Stream app) can be on the order of 10x or 100x, not a mere 1x.
> --
> Teemu Mannermaa
> System Specialist
>
> "Anything is possible but probabilities vary."
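
To make the failure mode in the quoted report concrete, here is a minimal sketch in C++. It is not copied from sched_version.cpp: the formula r = 1 + factor * z / n (with z a standard normal sample) is only an assumption inferred from the log line, and the function names selection_multiplier, would_select and fraction_nonpositive are hypothetical. std::normal_distribution stands in for rand_normal().

```cpp
#include <cassert>
#include <random>

// Assumed form of the multiplier (inferred from the log, not from the source):
// r = 1 + random_factor * z / n, where z is a standard normal sample.
double selection_multiplier(double random_factor, long n, double z) {
    assert(n >= 1);
    return 1.0 + random_factor * z / static_cast<double>(n);
}

// Comparison implied by the log line: the candidate is selected only if
// r * candidate_flops is strictly greater than the best version's flops.
bool would_select(double r, double candidate_flops, double best_flops) {
    return r * candidate_flops > best_flops;
}

// Fraction of random draws for which r <= 0, i.e. draws where even a lone
// valid app version loses against an empty "best" of 0 flops.
double fraction_nonpositive(double random_factor, long n,
                            int samples, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> normal(0.0, 1.0);
    int bad = 0;
    for (int i = 0; i < samples; ++i) {
        if (selection_multiplier(random_factor, n, normal(gen)) <= 0.0) {
            ++bad;
        }
    }
    return static_cast<double>(bad) / samples;
}
```

Under that assumption, would_select(-1.391884, 45.66e9, 0.0) is false, which matches the "Not selected" log entry. With a factor of 1.0 and n=1, r <= 0 requires z <= -1, which a standard normal produces in roughly 16% of draws; with a factor of 0.1 it requires z <= -10, which is effectively never. That would be consistent with the workaround making the negative r incidents disappear.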
