The code is the way it is because some apps have wildly wrong
fraction done estimates toward the beginning.
Let's focus on why the static estimates are wrong.
The static estimates are based on the statistics of the host's actual
elapsed times for previous jobs; that is, avp->flops is set so that
wup->rsc_fpops_est / avp->flops is the average elapsed time.
1) Is this mechanism not working as intended?
2) Has this particular host not completed enough jobs to
get a meaningful average?
-- David
On 09-Feb-2014 5:46 AM, William wrote:
As Jon pointed out, the estimate is embarrassingly inaccurate early in the run.
The quadratic weighting of the dynamic estimate is the problem.
Suggest weighting the dynamic estimate as:
MIN (1, (3 * the FRACTION done))
Then at 11% done the weight of the dynamic estimate is 33% (vs. 1.21%).
On Saturday, February 8, 2014 8:44 PM, David Anderson <[email protected]>
wrote:
The estimate is a weighted combination of
static (based on wup->rsc_fpops_est and avp->flops)
and dynamic (based on fraction done and elapsed time)
estimates; see below.
The weight of the dynamic estimate is the square of the fraction done;
e.g. when 50% done, the weight is 0.25.
So at 11% done the estimate is based almost entirely on the static estimate.
-- David
double ACTIVE_TASK::est_dur() {
    if (fraction_done >= 1) return elapsed_time;
    double wu_est = result->estimated_runtime();
    if (fraction_done <= 0) return wu_est;
    if (wu_est < elapsed_time) wu_est = elapsed_time;
    double frac_est = fraction_done_elapsed_time / fraction_done;
    double fd_weight = fraction_done * fraction_done;
    double wu_weight = 1 - fd_weight;
    double x = fd_weight*frac_est + wu_weight*wu_est;
    return x;
}

double RESULT::estimated_runtime_uncorrected() {
    return wup->rsc_fpops_est/avp->flops;
}

// estimate how long a result will take on this host
//
double RESULT::estimated_runtime() {
    double x = estimated_runtime_uncorrected();
    if (!project->dont_use_dcf) {
        x *= project->duration_correction_factor;
    }
    return x;
}
On 08-Feb-2014 11:54 AM, Jon Sonntag wrote:
> Why would 11% complete in 1.5 hours have an estimated 72 hours remaining
> when it should be closer to 14 hours remaining? Does BOINC need a math
> tutor? ;-)
>
> I find it interesting that the estimates on a Q6600 are correct but on both
> of my i7 hosts they are way too high.
>
> All hosts have been running the app for several weeks so any learning
> curve by the smart estimate algorithm should have adjusted the numbers
> already, right? How long should it take BOINC to get the estimates
> correct? I would think less than an hour when percent complete is totally
> linear. Or is the problem that the benchmarks do not take into account
> hyper-threading, which skews the estimates?
>
> Jon Sonntag
> _______________________________________________
> boinc_dev mailing list
> [email protected] <mailto:[email protected]>
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
>