On Mon, Apr 02, 2001 at 08:32:32PM -0700, Mr.Bad wrote:
> So, let's say that I'm doing something like the key indexer, that uses
> incremented keys. EOF does this, too.
>
> Frequently, the incremented keys start at zero and go up. This is
> probably the slowest way to do incremented keys, since you have to
> test each value as you go along.
That is one way to do it. Another way is to use incrementation
combined with date-stamps (this is what is used by the EOF news).
This approach is probably better suited for news than for indices,
because news is something that is generally quite current and is
directly tied to the present time, while an item in an index can stay
current for a long period of time.
The main advantage of using a date (without a time) combined with an
increment is that it avoids the "find the first item" problem. This
is one problem that I figured out based on experience with fnindex and
fnnews back in the ancient Freenet 0.2 days. The people who started
using fnindex and fnnews right after it came into existance could
normally use the indices and newsgroups which had just been created,
but later people who tried to use fnindex and fnnews found that they
didn't work (because that many of the posts between the current posts
and the first posts were missing, causing the fnindex/fnnews client to
think that it ran out of items to read (due to tolerance values, which
are explained later - but this problem would probably also be present
under what you propose) before it got the the current posts (which
were newer, so they hadn't yet dropped out of Freenet)). The people
who originally used the indices and newsgroups were at or close to the
newest items, so none of the items between their current locations
(the items that they read last - items were stored locally on one's
computer) and the newest items were missing.
> Another way to find an "open" key would be to do a kind of binary
> search. For example, start with value 1. If it's taken, double it to
> value 2. If that's taken, double it to value 4. If -that- is taken,
> double it to value 8. Again, if that's taken, double to value 16.
>
> If that -IS- taken, split the difference between this key and the last
> taken key, and check that one. So, now we're at 12. Is that taken? If so,
> split the difference between this one and the last checked open key,
> so we're at 14. If that one isn't taken, split the difference down
> again, to 13. If that's taken, then 14 is the next open key, and
> that's where to insert to.
>
> In order, we checked 8 keys, instead of 14:
>
> number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
>
> check order 1 2 3 4 6 8 7 5
>
> This is kind of a trivial example, but if you think about having a key
> index with thousands of incremented values, you've got a real time
> savings here.
Yeah, that is quite a problem. The thing is that you are going to
have significant waits from the start, due to the inherent slowness of
Freenet. In addition, while this is much better from an O(x) point of
view, for small key indices it will be much *worse* from a time point
of view. On the other hand, for large key indices (such as ones with
thousands of keys), this will result in a considerably (relative)
lower time.
> Also note that after the number of inc values goes above a power of 2,
> you don't have to check the keys between that power of two and the
> next lower one again. So those will fall off the network, thankfully,
> without the incrementer "losing its place."
>
> Anyways, I was just thinking that this is probably a smarter way to do
> incrementing.
When I implemented fnindex and fnnews a *long* time ago, I used a
different approach, which was to just do a linear search with
tolerance values (a certain number of nonexistant items in a row would
be required to make the client stop requesting items in sequence).
Originally, I used a tolerance value of about eight, but I found in
practice that the tolerance value had to be made much smaller (about
two) for practical speed purposes (with a tolerance of eight, the time
for trying to request all eight posts to end the scan was a vast
majority of all the time taken by the requests). Note that this was
without multithreading (back when there was freenetlib, Itamar and
myself tried to make it thread-safe, but for whatever reason we
weren't able to get it work reliably with threads).
--
Yes, I know my enemies.
They're the teachers who tell me to fight me.
Compromise, conformity, assimilation, submission, ignorance,
hypocrisy, brutality, the elite.
All of which are American dreams.
- Rage Against The Machine
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 3388 bytes
Desc: not available
URL:
<https://emu.freenetproject.org/pipermail/devl/attachments/20010403/4718e989/attachment.pgp>