awlydick wrote: [Searching for metadata]
> Also, we only need to generate three different searches, one for each
> keyword. As some wise soul suggested earlier, we really need to include
> the other keywords that are being searched for within each of the
> broken-up searches. But only route the searches with a single keyword.

We can include the other keywords in the search request, but if the nodes subsequently don't find enough matches then we should also return partial matches. Partial matches have both:

a) matched one or more metadata criteria, and
b) have the potential for matching the rest.

However, if we return metadata as part of the search result, we should *only* store the data that is relevant to the routing path we have chosen (which could have been specified in the search request by a special tag). Otherwise we reduce the nodes' potential for clustering data by wasting disk space on data that shouldn't be routed there anyway. This way, the user gets all the data, but clustering efficiency stays high.

> So. We break a search into multiple searches (one per keyword) and
> route them as we would normal Freenet keys. They are smart-routed until
> their HTL expires, and are not executed in parallel. The user stops
> sending

In addition to the HTL expiring, we can specify maximum_matches, and can expire requests once this has been reached or exceeded. We decrease maximum_matches by the number of results we matched.

> Well. There you have it. Long as hell. Boring to read. Fun to write
> though :-)

I think a definitive paper on metadata and searching could be longer than Ian's paper.

> I think that it could work, and I don't see anything glaringly wrong
> with it given the debate that I've read so far. But poke lots of holes
> in it and I'll try and scramble to fix it. Have fun.

I am 100% confident that metadata searching and clustering work. Freenet should scale beautifully, unlike other search mechanisms such as web spiders, which cannot hope to keep up with the growth in data creation.

What you describe is very similar to the method I am trying to write up. There is a lot to consider in finding the best way to match keys, because there are two reasons for doing so: we want to route requests to the best node to enable clustering, and we also want to match keys in a human-sensible way. For example, "The Bible" and "Bible" could be considered close, since we need only four deletions/insertions to move between them. We could also know that "The " is pretty meaningless when it comes to searches.

And imagine the case where someone receives 100 matches, but none are suitable. We should be able to start another request, but specify an 'ignore' list: a list of CHKs we are not interested in. That way we can successively get an extra 100 matches at a time.

Keep brainstorming guys :)
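
P.S. To make some of this concrete, here are a few rough Java sketches. All class, field, and method names below are illustrative assumptions, not real Freenet code. First, breaking a search into per-keyword requests that are each routed on a single keyword but still carry the full keyword set (the example keywords are made up):

    import java.util.ArrayList;
    import java.util.List;

    public class SearchSplitter {

        // One single-keyword request that still knows about its siblings.
        public static class SearchRequest {
            final String routingKeyword;    // the keyword used for smart routing
            final List<String> allKeywords; // everything the user asked for
            int htl;                        // hops-to-live, as for a normal key request

            SearchRequest(String routingKeyword, List<String> allKeywords, int htl) {
                this.routingKeyword = routingKeyword;
                this.allKeywords = allKeywords;
                this.htl = htl;
            }
        }

        // "king james bible" -> three requests, each routed on one keyword.
        public static List<SearchRequest> split(List<String> keywords, int htl) {
            List<SearchRequest> requests = new ArrayList<>();
            for (String kw : keywords)
                requests.add(new SearchRequest(kw, keywords, htl));
            return requests;
        }

        public static void main(String[] args) {
            for (SearchRequest r : split(List.of("king", "james", "bible"), 10))
                System.out.println("route on '" + r.routingKeyword
                        + "', carrying " + r.allKeywords);
        }
    }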
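
The partial-match rule could then look something like this. I am assuming here that "has the potential for matching the rest" means the item's metadata is incomplete, so the unmatched keywords are not yet ruled out:

    import java.util.List;
    import java.util.Set;

    public class MatchClassifier {

        public enum Match { FULL, PARTIAL, NONE }

        public static Match classify(Set<String> itemKeywords,
                                     boolean metadataComplete,
                                     List<String> queryKeywords) {
            long hits = queryKeywords.stream()
                    .filter(itemKeywords::contains).count();
            if (hits == queryKeywords.size())
                return Match.FULL;
            // a) matched one or more criteria, and
            // b) could still match the rest (metadata incomplete)
            if (hits > 0 && !metadataComplete)
                return Match.PARTIAL;
            return Match.NONE;
        }
    }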
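
The maximum_matches idea is simple enough to sketch directly: each answering node decrements the budget by the results it contributed, and the request expires once the budget is reached or exceeded rather than travelling until its HTL runs out:

    public class MatchBudget {
        private int maximumMatches;

        public MatchBudget(int maximumMatches) {
            this.maximumMatches = maximumMatches;
        }

        // Called after a node contributes `resultsFound` matches.
        // Returns true if the request should keep travelling.
        public boolean consume(int resultsFound) {
            maximumMatches -= resultsFound;
            return maximumMatches > 0; // expire once reached/exceeded
        }
    }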
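
For human-sensible key matching, plain edit distance already gives the "The Bible"/"Bible" example its four deletions, and stripping a meaningless leading "The " makes the two keys identical (the stopword list here is an illustrative guess):

    public class KeyDistance {

        // Classic Levenshtein distance (insertions/deletions/substitutions).
        public static int editDistance(String a, String b) {
            int[] prev = new int[b.length() + 1];
            int[] curr = new int[b.length() + 1];
            for (int j = 0; j <= b.length(); j++)
                prev[j] = j;
            for (int i = 1; i <= a.length(); i++) {
                curr[0] = i;
                for (int j = 1; j <= b.length(); j++) {
                    int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                    curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                       prev[j - 1] + cost);
                }
                int[] tmp = prev; prev = curr; curr = tmp;
            }
            return prev[b.length()];
        }

        // Drop a leading word that carries no search meaning.
        public static String normalize(String key) {
            return key.replaceFirst("(?i)^(the|a|an)\\s+", "");
        }

        public static void main(String[] args) {
            System.out.println(editDistance("The Bible", "Bible"));            // 4
            System.out.println(editDistance(normalize("The Bible"), "Bible")); // 0
        }
    }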
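
And finally the 'ignore' list for fetching successive batches of matches: nodes would skip CHKs the requester has already seen, so each round yields up to maximum_matches fresh results:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import java.util.stream.Collectors;

    public class IgnoreListSearch {

        // CHKs the requester has already seen and rejected.
        private final Set<String> ignoredChks = new HashSet<>();

        // Filter a node's candidates against the ignore list,
        // returning at most `maximumMatches` fresh CHKs.
        public List<String> nextBatch(List<String> candidateChks, int maximumMatches) {
            List<String> fresh = candidateChks.stream()
                    .filter(chk -> !ignoredChks.contains(chk))
                    .limit(maximumMatches)
                    .collect(Collectors.toList());
            ignoredChks.addAll(fresh); // don't return these again next round
            return fresh;
        }
    }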
