That is not at all what I am saying.  In the case I am seeing, when the
task reports 50% done, it is 50% done.

When a task enters EDF it is because the round robin scheduler indicated
that a miss might occur.  The rr scheduler uses the resource fraction as a
part of determining whether the task is in deadline danger.  (all well and
good).

When running on the CPU, the resource fraction is not relevant: a single
CPU task gets 100% of a CPU.  So, if the project has a resource fraction of
0.1 and the task is running on a 2 CPU system, the rr sim is expecting the
task to get a run fraction of 0.1, but it is actually getting a run
fraction of 0.5.  After a while, the deadline miss calculation in rr sim,
which is based on the time remaining on the task and the resource share,
will show that the task is no longer in deadline danger, and the task will
be available for preemption before its time slot is done.  This can lead to
fairly rapid cycling through identical tasks.
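
A minimal sketch of this effect, under stated assumptions: the function
names and the simple linear danger check are hypothetical and are not taken
from rr_sim.cpp.  The check assumes the task advances at
resource_fraction * ncpus CPU seconds per wall second, while the running
task actually advances at 1 CPU second per wall second.

```python
def in_deadline_danger(remaining, time_to_deadline, assumed_rate):
    """Hypothetical RR-sim style check: at assumed_rate CPU seconds per
    wall second, would the task finish after its deadline?"""
    return remaining > assumed_rate * time_to_deadline


def seconds_until_danger_clears(remaining, time_to_deadline,
                                resource_fraction, ncpus):
    """Run a single-CPU task at full speed, one second at a time, and
    return (elapsed seconds, CPU seconds still remaining) at the moment
    the check stops reporting deadline danger."""
    # Rate the simulation assumes for this task (an assumption of this
    # sketch): its project's share of the whole machine, capped at 1 CPU.
    assumed_rate = min(1.0, resource_fraction * ncpus)
    elapsed = 0
    while remaining > 0 and in_deadline_danger(
            remaining, time_to_deadline, assumed_rate):
        # The task really gets a whole CPU, so one wall second removes
        # one CPU second of work and one second of slack to the deadline.
        remaining -= 1
        time_to_deadline -= 1
        elapsed += 1
    return elapsed, remaining
```

For example, with 3600 CPU seconds remaining, a deadline 10000 seconds
away, a resource fraction of 0.1 and 2 CPUs, the check stops reporting
danger after about 2000 seconds, with roughly 1600 seconds of work still
to do.  From that point on the task is preemptible even though it has not
finished.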

Running an EDF sim in the case that RR sim indicates a deadline problem,
and using the EDF sim results to limit preemption as opposed to a
re-ordered normal start, solves both the problem you are describing and the
problem I am describing.

jm7


                                                                           
From: David Anderson <[email protected]>
To: [email protected]
Cc: [email protected]
Date: 01/15/2010 01:28 PM
Subject: Re: CPU scheduling.

I think what you're saying is that the job's fraction-done reporting is
wrong;
e.g. it reports 50% done when it's actually only 25% done.
This causes the RR sim to think it will meet its deadline,
potentially causing it to get preempted by unstarted identical jobs.

The change should handle this case correctly.

-- David

[email protected] wrote:
> ===================================================================
> --- branches/boinc_core_release_6_10/checkin_notes  2010-01-14 21:37:15 UTC (rev 20165)
> +++ branches/boinc_core_release_6_10/checkin_notes  2010-01-14 21:40:31 UTC (rev 20166)
> @@ -188,3 +188,26 @@
>      client/
>          scheduler_op.cpp
>          work_fetch.cpp
> +
> +David  5 Jan 2010
> +    - client: scheduling problem:
> +        - a project overestimates job FLOP counts
> +        - the client starts jobs in EDF mode
> +        - as job progresses and fraction done increases,
> +            its completion time estimate decreases until
> +            it's no longer a deadline miss.
> +        - job gets preempted by other job from that project;
> +            you end up with lots of partly completed jobs.
> +        Solution (I hope): if an app version has running jobs,
> +            compute a "temp DCF" for the app version,
> +            which is the min of dynamic/static estimates for its jobs.
> +            Apply this scaling factor to completion time estimates
> +            for unstarted jobs in RR simulation
> +    - client: the estimation of remaining time of running jobs was wrong
> +        (how did this bug survive so long?)
> +
> +    client/
> +        app.h
> +        client_types.h
> +        rr_sim.cpp
> +        work_fetch.cpp
>
>
> The problem described is not the actual problem.  The basic problem can
> happen even if the estimate is exact.
>
> The real problem is that a task when running may be using CPU time faster
> than its resource fraction.  This is true if the resource fraction for the
> project < number of CPUs for the task / ncpus.  In this case, at some point
> during the run of the task, the task will no longer be in deadline trouble.
> If there are other tasks in deadline trouble with later deadlines, these
> tasks will instantly preempt the running task.
>
> The fix for this is an EDF simulation that has the ability to turn off
> preemption.  The fix checked in will have no effect on this problem.
>
> jm7
>




_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
