Ian Clarke <I.Clarke at dynamicblue.com> wrote: > Rather than simple moving data straight to the top of the datastore when > it is re-requested, it is moved up the queue an ammount inversely > proportional to its size. More accurately, if the datastore is of size > n (and here I refer to the portion of the datastore containing actual > data, not just references) a piece of data of size p should be moved up > by nq/p where q is the size of the smallest piece of data. If doing > this results in the data being moved up beyond the top of the queue then > it is placed at the top of the queue. This means that the smallest > piece of data (unless I have fscked up the math) will always be moved > straight to the top of the datastore, but other data will have to work > harder in proportion to its size (so a piece of data twice the size of > the smallest piece of data will have to get 2 hits to get to the top of > the queue if it starts at the bottom).
I don't think items should zip to the top of the datastore that quickly, though -- rather, the ordering in the datastore should reflect the cumulative history of requests. In the ideal world, items would be kept always sorted by score = requests/size. (So a set of 1k items would be ranked by # of requests, while a 2k item with 20 hits is ranked the same as a 1k item with 10 hits, since it takes up the space that could otherwise be used to satisfy 20 hits divided among two 1k items). If we don't want to keep actual records of the number of times each item has been requested, we should use their position in the list to approximate their past popularity. The problem is that an unpopular (small) item at the back of the list can erase its whole history of unpopularity by zipping to the top with a single hit, passing many more popular items, and then take a long time to sink back to its true position. Instead, items should gradually work their way up the list. Also, newly-inserted items would be in time order, not size order. Here's a modification: when an item is requested, it moves up by x places, where x = 1 + log2(r/p), and p is the size of the item and r is the size of the -largest- item in the datastore. So every item moves up at least one place when hit, but smaller items move up more places. When an item is inserted, it gets an automatic one hit. This means that all the items which have only been inserted but never requested stay at the back of the queue, sorted more or less by size, while items which are actually being used stay ahead of them. theo -- PGP: http://www.doc.ic.ac.uk/~twh1/ D5E5 0237 0592 CAF6 E4C4 967F 457E 9583 6AB4 876B _______________________________________________ Freenet-dev mailing list Freenet-dev at lists.sourceforge.net http://lists.sourceforge.net/mailman/listinfo/freenet-dev
