I just checked in the change you describe. Actually it was added 4 years ago, but was backed out because there were reports that it wasn't working right.
This time I added some <file_xfer_debug>-enabled messages, so we should be able to track down any problems.

-- David

Lynn W. Taylor wrote:
> Hi All,
>
> I've been watching s...@home during their current challenges, and I
> think I see something that can be optimized a bit.
>
> The problem:
>
> Take a very fast machine, something with lots of RAM, a couple of i7
> processors and several high-end CUDA cards -- a machine that can chew
> through work units at an amazing rate.
>
> It has a big cache.
>
> As work is completed, each work unit goes into the transfer queue.
>
> BOINC sends each one, and if the upload server is unreachable, each work
> unit is retried based on the back-off algorithm.
>
> If an upload fails, that information does not affect the other running
> upload timers.
>
> In other words, this mega-fast machine could have a lot (hundreds) of
> pending uploads, and tries every one every few hours.
>
> I see two issues:
>
> 1) The most important work (the one with the earliest deadline) may be
> one of the ones that tries the least (longest interval).
>
> 2) Retrying hundreds of units adds load to the servers. 180,000-odd
> clients trying to reach one or two machines at SETI.
>
> Optimization:
>
> On a failed upload, BOINC could basically treat that as if every upload
> timed out. That would reduce the number of attempted uploads from all
> clients, reducing the load on the servers.
>
> Of course, since the odds of a successful upload are just about zero for
> a work unit that isn't retried, by itself this is a bad idea.
>
> So, when any retry timer runs out, instead of retrying that WU, retry
> the one with the earliest deadline -- the one at the highest risk.
>
> As the load drops, work would continue to be uploaded in deadline order
> until everything is caught up.
>
> I know a project can have different upload servers for different
> applications, or for load balancing, or whatever, so this would only
> apply to work going to the same server.
>
> The same idea could apply to downloads as well. Does the BOINC client
> get the deadline from the scheduler??
>
> Now, if I can figure out how to get a BOINC development environment
> going, and unless it's just a stupid idea, I'll be glad to take a shot
> at the code.
>
> Comments?
>
> -- Lynn
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
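
For anyone following along, here is a rough sketch of the scheme Lynn describes: one back-off timer shared by all pending uploads to the same server, and when that timer expires, the upload with the earliest deadline is tried first. This is not the BOINC client code. The names (ServerUploadQueue, PendingUpload, poll, try_upload) are hypothetical, and the plain doubling back-off with min_delay/max_delay limits is an assumption made only to keep the example small.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingUpload:
    # Only the deadline takes part in ordering, so the heap is deadline-sorted.
    deadline: float
    name: str = field(compare=False)


class ServerUploadQueue:
    """Pending uploads for ONE upload server, sharing a single back-off timer."""

    def __init__(self, min_delay=60.0, max_delay=4 * 3600.0):
        self.pending = []          # min-heap keyed on deadline
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.failures = 0          # consecutive failures against this server
        self.next_attempt = 0.0    # earliest time any upload may be tried

    def add(self, name, deadline):
        heapq.heappush(self.pending, PendingUpload(deadline, name))

    def backoff_delay(self):
        # Assumed policy: double the delay per consecutive failure, capped.
        return min(self.max_delay, self.min_delay * 2 ** (self.failures - 1))

    def poll(self, now, try_upload):
        """Try uploads in deadline order until one fails or none are left.

        try_upload(name) is a caller-supplied function returning True on
        success. Returns the names uploaded during this call.
        """
        done = []
        while self.pending and now >= self.next_attempt:
            item = self.pending[0]            # earliest deadline first
            if try_upload(item.name):
                heapq.heappop(self.pending)
                self.failures = 0
                done.append(item.name)
            else:
                # One failure backs off EVERY upload to this server.
                self.failures += 1
                self.next_attempt = now + self.backoff_delay()
        return done
```

With a queue per upload server, a client holding hundreds of finished results makes one attempt per back-off interval while the server is down, and once it comes back the results drain in deadline order.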
