We need to:
- Be able to provide effective FUQID functionality, on some level, using
  some combination of third-party and FPI-provided software.
- Be able to handle largish numbers of high-level downloads and uploads
  in Fred 0.7 without using hundreds to thousands of real threads.

There are a number of problems:
- At present there is no real co-operative allocation of threads. Each
  high-level fetch process uses one thread (plus 1 low-level thread) if
  it is fetching a single file, or 20 threads (plus 20 low-level) if it
  is fetching a splitfile. A PUTDIR operation can therefore use anywhere
  from 10 to 200 threads at a time, depending on... chance, essentially.
- Most of these threads will be waiting for clearance to send their
  requests, rather than waiting for the low-level request thread to
  finish, or waiting for a response from the network.
- Continuations will not entirely solve the problem. The low-level
  threads could be eliminated with continuations, but the Fetcher
  processes could not, because they recurse across 3 functions (a
  continuation can only be suspended at the top level).
- It will be difficult to efficiently provide FUQID-style functionality
  purely in third-party software, with the node only providing simple
  high-level fetches over FCP. Client authors may resort to extreme
  measures such as implementing their own metadata parsing.
- Therefore we need to provide some means for long term, restartable
  downloads in the node. This will be difficult with the current code.
- It would be preferable for several reasons to move the RetryTracker
  download scheduling code from per-segment/per-insert-file to a global
  tracker/scheduler.

The solution:
- Refactor Fetcher:
-- fetches a key
-- if it fails, returns an error
-- if it succeeds (data, not metadata), returns the data
-- parses the metadata
-- if it is a simple redirect, returns a new Fetcher (possibly with more
meta strings, flags for multi-level metadata etc)
-- if it is a splitfile, returns a SplitFetcher, with a
SplitFetcherCallback (which will do archive processing, multi-level
metadata or whatever)
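As a rough illustration, here is a minimal Java sketch of that single-step,
non-blocking Fetcher: each step either yields data, an error, or the key of a
new Fetcher to run next, so no thread ever sleeps between steps. Everything
here (the toy STORE, the DATA:/REDIRECT: encoding, fetch()) is an illustrative
assumption, not the actual Fred API:

```java
import java.util.Map;

// Sketch of the refactored Fetcher: fetch one key, then either return
// data, return an error, or hand back a *new* fetch to run (a simple
// redirect). Names are illustrative, not Fred's real classes.
public class FetcherSketch {

    // Toy "network": key -> either raw data ("DATA:...") or a
    // simple-redirect metadata document ("REDIRECT:<key>").
    static final Map<String, String> STORE = Map.of(
        "KSK@a", "REDIRECT:KSK@b",
        "KSK@b", "DATA:hello");

    static String fetch(String startKey) {
        String key = startKey;
        while (true) {
            String block = STORE.get(key);
            if (block == null)
                return "ERROR:not found"; // failure: return an error
            if (block.startsWith("DATA:"))
                return block.substring("DATA:".length()); // success: data
            // Simple redirect: the refactored Fetcher would return a new
            // Fetcher object here; this loop plays the scheduler's role
            // of re-running it, so nothing blocks in between.
            key = block.substring("REDIRECT:".length());
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("KSK@a")); // prints "hello"
    }
}
```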

SplitFetcher:
-- queues the keys to be fetched onto the global fetch queue
-- when finished successfully, calls the SplitFetcherCallback

SplitFetcherCallback:
-- depends on what the Fetcher wanted
-- for example: multi-level metadata.
--- parse the metadata, see what has to be done.
--- does exactly the same calculation as Fetcher would; in fact it calls
the same function (it's an inner class)
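The SplitFetcher/SplitFetcherCallback split might look roughly like this in
Java. The queue, field names and callback signature are assumptions for
illustration, not Fred's real interfaces:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch: the SplitFetcher only queues keys on the global fetch queue
// and reacts to completions; the SplitFetcherCallback decides what to
// do with the assembled data (e.g. parse multi-level metadata).
public class SplitSketch {

    interface SplitFetcherCallback {
        void onSuccess(List<String> blocks); // e.g. parse metadata here
    }

    static class SplitFetcher {
        final List<String> keys;
        final List<String> fetched = new ArrayList<>();
        final SplitFetcherCallback cb;

        SplitFetcher(List<String> keys, SplitFetcherCallback cb,
                     Queue<String> globalQueue) {
            this.keys = keys;
            this.cb = cb;
            globalQueue.addAll(keys); // queue the keys; spawn no threads
        }

        // Called by the global scheduler when one block arrives.
        void onBlock(String data) {
            fetched.add(data);
            if (fetched.size() == keys.size())
                cb.onSuccess(fetched); // finished: hand off to callback
        }
    }

    public static void main(String[] args) {
        Queue<String> globalQueue = new ArrayDeque<>();
        List<String> out = new ArrayList<>();
        SplitFetcher sf = new SplitFetcher(
            List.of("k1", "k2"), out::addAll, globalQueue);
        // Pretend the global scheduler fetched each queued key:
        while (!globalQueue.isEmpty())
            sf.onBlock("data-for-" + globalQueue.poll());
        System.out.println(out); // prints [data-for-k1, data-for-k2]
    }
}
```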


The point here? We never wait for anything except an actually running
request, and we can suspend - to disk, if necessary. Two sides of the
same coin.

Combined with a global request/insert tracker (derived from
RequestStarter and RetryTracker), we can efficiently handle
hundreds of gigabytes of background downloads at the same time as
foreground fproxy browsing. We can control the policy on what gets
fetched (e.g. priority then retries), we can save partially completed
fetches to disk, and we use no more simultaneous inserts, requests,
and threads than is necessary.
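A minimal sketch of the "priority then retries" ordering such a global
scheduler could use, assuming an illustrative QueuedRequest class (not the
actual RequestStarter/RetryTracker code):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch of a global scheduler policy: lower priority class runs
// first, and among equals, blocks with fewer retries run first.
public class SchedulerSketch {

    static class QueuedRequest {
        final String key;
        final int priority; // 0 = most urgent (e.g. fproxy browsing)
        final int retries;  // how often this block has failed so far

        QueuedRequest(String key, int priority, int retries) {
            this.key = key;
            this.priority = priority;
            this.retries = retries;
        }
    }

    public static void main(String[] args) {
        PriorityQueue<QueuedRequest> queue = new PriorityQueue<>(
            Comparator.comparingInt((QueuedRequest r) -> r.priority)
                      .thenComparingInt(r -> r.retries));
        queue.add(new QueuedRequest("background-block", 3, 0));
        queue.add(new QueuedRequest("fproxy-page", 0, 2));
        queue.add(new QueuedRequest("fproxy-image", 0, 0));
        // fproxy beats background; fresh blocks beat retried ones:
        while (!queue.isEmpty())
            System.out.println(queue.poll().key);
    }
}
```

With such a queue, background bulk downloads and foreground fproxy requests
share one thread pool without the bulk work starving the browser.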
-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.