On Sat, 2003-08-16 at 21:23, Jeremy Caleb Heffner wrote:
> I wasn't really referring to inserting the index as a typical key. The
> reason why not is because I think this idea is based upon combining indexes
> so that they slowly grow and 'centralize' in a way, which I don't think is a
> good idea in a distributed system like this.
>
> What I was referring to was instituting a new Search message type in the
> Freenet protocol (think somewhat like how Gnutella does this, but our own
> version of course).
>
> The way this message would work is very similar to the way a chain is formed
> to pass key data back to the requester, which preserves the anonymity of the
> searcher. Caching the search results along the chain also protects the
> privacy of the local index because it would contain both locally indexed
> data and indexes it relayed. Each time a keyword was searched for its
> results would be distributed to more nodes, expanding the ability to search
> for that keyword and lessen the bandwidth consumed for X number of results.
>
> Would this work? (I am not claiming to be an expert of any kind, just
> throwing out an idea for a searching system that doesn't rely on
> conglomerating indexes.)
>
> Jeremy

You should look up FASD. FASD is a search mechanism that someone designed a few years ago so that searching could be done in Freenet, but the design has lain dormant for a while. It used a cosine correlation function to determine "closeness" to certain metadata.

The problem I see with a metadata-key system is that it suffers from the same problem as the META tags that search engines used to rely on to index sites. How are you going to prove that the indexes are honest and correct? FASD wanted to make the metadata used for querying decentralized from an insertion standpoint. That is, publishers were responsible for inserting metadata into Freenet. This means that you have to trust the metadata keys that were inserted. FASD does have a culling mechanism so that metadata could be validated and deleted, but this system seems like it would be expensive to execute on a large-scale network.

The idea I have for a search function is to have different search engine sites in Freenet. The search engine maintainers would gather data from Freenet by spidering, hand-picking, or doing whatever they feel like to generate their index. When a user goes to one of these sites, a submit form sends a command to a client program (probably integrated with FProxy) to execute a search across the indexes.

Indexes are stored in the following format:

    [EMAIL PROTECTED]

where Keyword would be a listing of certain sites that correlate to that keyword, along with "weights" for each site. A large search index would have many of these keyword pages. So a search for "movies" would look for

    [EMAIL PROTECTED]

which might contain the following:

    17 [EMAIL PROTECTED]
    15 [EMAIL PROTECTED]
    7 [EMAIL PROTECTED]
    5 [EMAIL PROTECTED]
    3 [EMAIL PROTECTED]

and the search result page returned through FProxy would contain those pages in that order. It is up to the author of the search engine to determine how to weight the keys.
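
To make the keyword-page idea concrete, here is a minimal sketch of what the client side might do with one such page. It assumes each line of the page is a weight followed by a site key, as in the "movies" example above. The function names and the `site-a`-style keys are hypothetical placeholders (the real keys are not shown in the archived message), and nothing here is an existing FProxy API.

```python
from typing import List, Tuple


def parse_keyword_page(text: str) -> List[Tuple[int, str]]:
    """Parse lines of the assumed form '<weight> <site key>' into (weight, key) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        weight_str, key = line.split(maxsplit=1)
        entries.append((int(weight_str), key))
    return entries


def rank(entries: List[Tuple[int, str]]) -> List[str]:
    """Return site keys ordered by descending weight."""
    ordered = sorted(entries, key=lambda entry: entry[0], reverse=True)
    return [key for _, key in ordered]


# Hypothetical page contents; the keys stand in for real Freenet site keys.
page = """\
7 site-c
17 site-a
15 site-b
"""

results = rank(parse_keyword_page(page))
print(results)
```

Sorting on the client means an index author does not have to store the page in any particular order, although a page that is already sorted would work just as well.
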
A search engine author could do a Google-like PageRank that weights a site based on the links between sites after spidering through Freenet. On the other hand, an author could generate a searchable directory site that stores specific content (for example, a site containing all of the books from Project Gutenberg). Yet a third search engine author could take several of these indexes from different sites to create a bigger, better index. A filesharing app running on Freenet could also be designed to publish its own indexes and combine them with others. So the general idea here is to provide a search system to fit everyone's needs.

Multiple-keyword searches could be done by requesting the index page for each of the keywords and taking the intersection of those indexes. The weights for a specific site across all of those pages could be combined by some simple formula (such as addition).

Thoughts?

Scott Young
_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
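
As an illustration of the multiple-keyword combination described above, here is a minimal sketch that intersects several keyword pages and adds up the weights. It assumes each page has already been fetched and parsed into a mapping from site key to weight. The `combine` function, the sample pages, and the `site-a`-style keys are all hypothetical, and addition is only one of the simple formulas that could be used.

```python
from typing import Dict, List, Tuple


def combine(pages: List[Dict[str, int]]) -> List[Tuple[str, int]]:
    """Keep only sites listed on every page and sum their weights."""
    if not pages:
        return []
    common = set(pages[0])
    for page in pages[1:]:
        common &= set(page)  # intersection of the site keys
    totals = {key: sum(page[key] for page in pages) for key in common}
    # Highest combined weight first; ties are broken by key for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical keyword pages for a two-keyword search.
movies = {"site-a": 17, "site-b": 15, "site-c": 7}
free = {"site-b": 4, "site-c": 20, "site-d": 9}

ranked = combine([movies, free])
print(ranked)
```

Because a site must appear on every page to survive the intersection, a site that is strong for one keyword but missing from another drops out entirely. An author who prefers softer matching could take the union of the pages instead and treat a missing weight as zero.
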
