Re: [boinc_dev] 6.6.20 and work scheduling

Paul D. Buck Mon, 27 Apr 2009 11:30:49 -0700

On Apr 27, 2009, at 10:14 AM, [email protected] wrote:

> As long as you insist on talking about the frequency of the test,
> people are going to be ignoring you.  Please start talking about what
> is wrong with the test itself.  Fixing the test will fix the problem.
> No amount of tinkering with the frequency of the test is going to fix
> the problem.
>
> The project came and went already.  It was doing document indexing.
> The runtimes were very short, but the transfer times were killing it.


Which proves my point.  The deadlines were unrealistic.

Yes, most of the real issue is that the rules make the test bad.  But
that is not the sole problem here.

The frequency means that I cannot help you troubleshoot the rules
because I have hundreds to thousands of calls to the routines that are
just so much wasted time.  This buries the bad calls in so much
garbage that I cannot find the needle that is needed to fix the tests.

And just because you, or even Dr. Anderson, do not think the frequency
of the tests is a problem does not mean that it is not a problem.

Which *IS* also one of the problems in the BOINC world.  We ignore  
people that ask questions we don't want asked.  We avoid opinions that  
don't comport with ours ...

So, we had a project that had an unrealistic deadline, we put this
rule into place because that one project had a mythical need, and that
means we now cannot change BOINC for the better?

What is wrong with the test is that we do it too often.  We also use
the wrong driving parameters.  Because we do it so often, and keep no
history, we have instability in the scheduling system, and pretending
that the frequency does not matter is not going to make it more
stable.  Even if you fix the rules, the fact that the client is
recalculating the deadlines every 10 seconds (or less) means that
BOINC is going to keep changing its mind as to what to run, because we
also don't enforce TSI ...  This is not a case of fixing one minor
buglet and we are done ...
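
To make the frequency point concrete, here is a minimal sketch.  It is
not BOINC client code.  It assumes a made-up trace of which task the
test would pick at each tick (`trace`, `count_switches` and the idea of
a fixed tick are all hypothetical) and counts how often the running
task changes when the test is re-run every tick versus every sixth
tick.

```python
def count_switches(best_per_tick, period):
    """Count changes of the running task when the "what should run" test
    is only re-run every `period` ticks.

    best_per_tick[i] is the task the test would pick if it ran at tick i.
    """
    running = None
    switches = 0
    for tick, best in enumerate(best_per_tick):
        if tick % period != 0:
            continue  # test not run at this tick; keep the current task
        if running is not None and best != running:
            switches += 1
        running = best
    return switches


# Hypothetical trace: two tasks with nearly identical deadlines, so the
# pick flips whenever the estimates wobble a little.
trace = ["A", "B", "A", "A", "B", "A", "B", "B", "A", "B", "A", "A"]

print(count_switches(trace, 1))  # test run at every tick
print(count_switches(trace, 6))  # test run at every sixth tick
```

With this invented trace the per-tick test switches tasks eight times
and the every-sixth-tick test switches once.  Running the test less
often only bounds how many times the client can change its mind; it
does not fix bad rules.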

You cannot, or will not, see the frequency-caused instability unless
you have a system that is both fast and wide.  As best as I can tell  
you have neither, nor does UCB, though they are welcome to drive over  
anytime to look at mine (2 hours or so from UCB, and I will buy lunch  
and pay for the gas).



Ok, say we fix all the other problems but still check every 10 seconds
which tasks to run.  Suppose we also do not enforce TSI (enforcing it
would mean that you cannot switch a task out until it has completed
its TSI or ended, a rule you also say should not be enforced).  Now
assume I have a batch of tasks from one project, all with roughly the
same deadlines.  Are we not going to enforce "keep work mix
interesting"?  Then I am going to run that batch as one big block,
which will cause task abandonment ... oh, and because we are keeping
the event-driven basis, the task I started because a task ended is
still going to be superseded by another task when the upload
ends ... leading to more tasks abandoned partly done ...
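
The TSI rule as described above (a task cannot be switched out until
it has completed its TSI or ended) can be sketched roughly as follows.
This is an illustration only, not how the client implements it; the
`Task` fields and the `may_switch_out` and `choose` helpers are
hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    started_at: float      # seconds; when the task was last switched in
    finished: bool = False


def may_switch_out(task, now, tsi_seconds):
    """True only if the task has ended or has run for a full TSI."""
    return task.finished or (now - task.started_at) >= tsi_seconds


def choose(running, preferred, now, tsi_seconds):
    """Keep the running task unless the rule allows switching it out."""
    if running is None or may_switch_out(running, now, tsi_seconds):
        return preferred
    return running


running = Task("A", started_at=0.0)
preferred = Task("B", started_at=0.0)

# Ten minutes into a (hypothetical) one-hour TSI: A keeps running even
# though the test would now prefer B.
print(choose(running, preferred, now=600.0, tsi_seconds=3600.0).name)
```

Under this rule a task that was just started is not superseded the
moment some unrelated event, such as an upload finishing, triggers the
test again.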

Essentially you want to fix the problem without changing any of the
drivers of the problem ... one of which is the event-based triggers ...
which happen far too often ... and pretending that they don't won't
make it less of a problem.

So, ignore me some more if you want, why not, everybody else does ...
that still does not mean that I am wrong ... I first reported this
problem in 2005 or thereabouts ... it is still a problem ... and it
will continue to be a problem unless you stop clinging to the "I don't
think doing it once a second is a problem, so it cannot be a problem"
mindset.  Even if we change the rules to better ones, fast systems
that run the tests so often are still going to be unstable.  BOINC
ignores history ... that, plus fast repeats of any test, is a recipe
for instability ...

I agree that changing the frequency is not going to solve this, but  
maybe it will allow me to help provide the data so we can solve the  
rest of the problems.  And changing the frequency of the tests will  
make the system a little less unstable.  Oh, and save compute time.   
Oh, one more thing, running the test every 60 seconds with a 2 minute  
task means I would still make the deadlines... no need to run the test  
RIGHT NOW ... 30 seconds later would not be a killer ... even for a  
mythical need ...
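
The arithmetic behind that last point, as a small sketch.  It assumes
the only delay added by a periodic test is waiting for the next run,
and it ignores transfer time entirely.  The deadline values and the
function names are hypothetical; the 60 second period and 2 minute
task are the figures from the paragraph above.

```python
def worst_case_finish(test_period_s, runtime_s):
    """Latest finish time, measured from when the task became runnable,
    if it has to wait a full test period before it is started."""
    return test_period_s + runtime_s


def meets_deadline(test_period_s, runtime_s, deadline_s):
    """True if the task finishes in time even in the worst case."""
    return worst_case_finish(test_period_s, runtime_s) <= deadline_s


# Test every 60 seconds, task runs for 2 minutes.
print(worst_case_finish(60, 120))    # seconds after becoming runnable
print(meets_deadline(60, 120, 300))  # hypothetical 5 minute deadline
```

So the periodic test costs at most one period of extra latency, and
only a deadline tighter than the runtime plus one period would be
missed because of it.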