On Mon, 05 Dec 2005 22:31:11 +0000, Matthew Toseland wrote:

> On Tue, Dec 06, 2005 at 12:01:29AM +0200, Jusa Saari wrote:
>> From what I've understood, Freenet 0.7 is supposed to handle splitfiles
>> transparently, so that the inserting node fragments and the retrieving
>> node reassembles files automatically, without client programs needing to
>> know or care about the block dis/reassembly. Am I correct ?
> 
> Pretty much. However, clients (e.g. fproxy) can specify a maximum file
> size.
>> 
>> Now, suppose that you have a large file; say, a Linux DVD image. Suppose
>> that you have inserted it with a program like Frost, which only inserts
>> the file when it receives a request for it (a must for sharing a large
>> amount of data). Suppose that just enough blocks have fallen to bitrot
>> that the file cannot be reassembled anymore; getting just a single block
>> reinserted might be enough.
> 
> I dispute that inserting the file when you receive a request is "a must".

It is a must, because otherwise the only way to keep a not-very-popular
file available is to keep periodically reinserting it. This wastes network
resources and local bandwidth, and might very well help push other content
out of the network, causing a vicious cycle, since the authors of that
content will then need to reinsert more often to keep their content
available.

Freenet is a combination of cache and transport system. In the long term,
a file can be reached through Freenet only if it is either extremely
popular or periodically reinserted from the backing store (hard drive
space outside the datastore). Insert-on-demand is absolutely vital for
such a scheme to work well; without it the reinserts will end up flushing
each other out of the network, which leads to shorter reinsert intervals,
which leads to more bitrot, which leads to still shorter reinsert
intervals, and so on. And of course freesites and other content that can't
really be reinserted on request, due to the latencies involved, get
flushed out too.

So, basically, insert-on-request is vital for Freenet for it to function
under any significant load.

>> In the current system such a situation is easy to handle. You can simply
>> ask the inserter for the specific blocks. In the new Freenet, however,
>> blocks are hidden, so the retriever doesn't know which blocks failed,
>> and the inserter has no way of inserting just them. This means that he
>> has to reinsert the entire multi-gigabyte DVD image, which is a huge
>> waste of resources.
>> 
>> Now, this could be solved by simply allowing access to the underlying
>> block system, but that is needlessly complex and might lead to problems
>> if the block size or some other aspect of the system ever changes.
>> Instead, I'm suggesting that the insert request can specify the range of
>> bytes to insert; that is, when inserting the multi-gigabyte file, I can
>> specify that I only want to insert bytes from offset to offset2 (and of
>> course I should be able to specify multiple ranges). The retriever
>> should similarly get information of what byte ranges failed. Checkblocks
>> could be assigned to a logical range after the actual file data.
> 
> Hmmm. You would of course end up reinserting an entire segment.

Why ? If the splitfile code is deterministic, you get the exact same
blocks from the same file. You know that blocks n, m and o failed, and the
rest presumably succeeded, so why should you reinsert the rest of
the blocks in the segment ? I can understand that some checkblock
algorithm might need to recalculate each block at once, but you still
don't need to *insert* them all.
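
To make this concrete, here is a minimal sketch in Python of the mapping
between byte ranges and blocks. It assumes fixed-size data blocks and a
deterministic splitter; the function names and the 32 KiB block size are
hypothetical and are not Freenet's actual API. Checkblocks are left out
for brevity.

```python
BLOCK_SIZE = 32 * 1024  # assumed block size, for illustration only


def ranges_for_failed_blocks(file_size, failed, block_size=BLOCK_SIZE):
    """Retriever side: turn failed block indices into half-open byte
    ranges [start, end), merging ranges that touch each other."""
    ranges = []
    for idx in sorted(set(failed)):
        start = idx * block_size
        if idx < 0 or start >= file_size:
            raise ValueError("block index %d is outside the file" % idx)
        end = min(start + block_size, file_size)  # last block may be short
        if ranges and ranges[-1][1] == start:
            ranges[-1] = (ranges[-1][0], end)  # extend the previous range
        else:
            ranges.append((start, end))
    return ranges


def blocks_for_ranges(file_size, ranges, block_size=BLOCK_SIZE):
    """Inserter side: return the sorted indices of the blocks that overlap
    any requested range. Blocks completely outside all ranges are skipped."""
    selected = set()
    for start, end in ranges:
        if start < 0 or end > file_size or start >= end:
            raise ValueError("bad range (%d, %d)" % (start, end))
        first = start // block_size
        last = (end - 1) // block_size  # block holding the last byte
        selected.update(range(first, last + 1))
    return sorted(selected)


if __name__ == "__main__":
    size = 10 * BLOCK_SIZE + 100           # 11 blocks, the last one short
    missing = ranges_for_failed_blocks(size, [2, 3, 7, 10])
    print(missing)                          # byte ranges the retriever reports
    print(blocks_for_ranges(size, missing)) # blocks the inserter re-inserts
```

The retriever only ever reports byte ranges and the inserter only ever
receives byte ranges, so neither tool needs to know the block size. Since
the splitting is deterministic, the inserter ends up with exactly the
blocks that failed.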

>> The good sides of this idea are that it should be trivial to implement
>> (just don't insert the blocks that are completely outside all ranges)
>> and would allow inserting just the missing blocks without programs even
>> needing to know that Freenet uses blocks, much less any details of the
>> implementation.
>> 
>> Comments ?
> 
> Not vital at present IMHO. We'll see in future.

For the reasons stated above, I disagree with you.

Please also note that this kind of thing is not something that can simply
be added later. It needs support from insert and request tools; failing to
include it until Freenet becomes popular and load goes up means that the
popular tools will lack support for this feature - they can't support a
feature that didn't exist at the time they were made, and it will take a
long time to get everyone to upgrade to new versions.

Better to add this feature now, so that the support will be in the tools
when it is needed.

It should also be noted that including this support now will stop any tool
from implementing its own segmentation. Filesizes are growing all the
time, so tool authors will need to include this feature, whether it is
officially supported or not. Everyone's life will be easier if it is.

