----- Original Message -----
From: "Niklas Bergh" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, August 17, 2003 6:04 PM
Subject: Re: [freenet-dev] Search Indexing Round 2

> > > On Sunday 17 August 2003 06:48 am, Niklas Bergh wrote:
> > > > particular word where 'weight' is defined as Number of
> > > > "occurrences of 'wordX'"/"Total number of words in the index" (word
> > > > rareness). This would allow the search engine to know that, for
> >
> > Or something along that line but possibly somewhat smarter (maybe a cap of
> > how much a single hit could add to the final result) to avoid things like:
> >
> > > <font size=1 color=white>
> > > movies download movies download movies download movies download movies
> > > download movies download (... x 100)
> > > </font>
> >
> > :)
> >
> > /N
>
> But I see your point. It might be a mighty good idea to (at least
> optionally) have some kind of additional indexer-generated fact to merge
> into the score, something like Google's 'So and so many pages (or whatever
> else that can link) link to this resource'. The problem with this is that
> some of the things that I envision should use the search/indexing
> functionality might not have that kind of structure; words are probably
> always present, though.

I should add another thing, though: linking information is *not* a property of the index words. Linking information is the kind of information that would go into the '[EMAIL PROTECTED] resources' file (or possibly a '[EMAIL PROTECTED] [EMAIL PROTECTED]' file).

If the indexer only provides information about how many other resources link to a specific one, that count could be used to modify the resulting score for that resource. If the indexer provided full linking information (i.e. information about exactly which resources link to a specific one), one could build a 'Search from here' functionality. That is not very useful for freesites, but it might be useful for Frost or FMB or some other kind of more tree-oriented application.

/N
_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
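
To make the scoring ideas in this thread concrete, here is a rough Python sketch, not anything taken from the Freenet code. It computes the quoted 'weight' (occurrences of a word divided by the total number of words in the index), caps how much a single word can add for one resource, and then scales the result by the number of inbound links. The function names, the use of -log(weight) to favour rare words, and the per_word_cap and link_factor parameters are all assumptions made for illustration; the thread does not specify how the pieces should be combined.

```python
import math
from collections import Counter


def word_weight(word, index_counts):
    """Occurrences of `word` divided by the total number of words in the index."""
    total = sum(index_counts.values())
    return index_counts[word] / total if total else 0.0


def score_resource(query_words, resource_words, index_counts,
                   inbound_links=0, per_word_cap=3, link_factor=0.1):
    """Score one resource against a query.

    Assumptions (not from the thread): rare words contribute -log(weight),
    repeated hits are capped at `per_word_cap`, and the inbound link count
    scales the score by 1 + link_factor * log(1 + inbound_links).
    """
    hits = Counter(resource_words)
    score = 0.0
    for word in set(query_words):
        weight = word_weight(word, index_counts)
        if weight == 0.0:
            continue  # word is not in the index at all
        # The cap keeps 'movies download' x 100 from scoring 100 times higher.
        capped_hits = min(hits[word], per_word_cap)
        score += capped_hits * -math.log(weight)
    # Only a link *count* is needed here, not full linking information.
    return score * (1.0 + link_factor * math.log1p(inbound_links))


# Hypothetical index statistics, purely for illustration.
index_counts = Counter({"movies": 50, "download": 50, "freenet": 10, "other": 890})
query = ["movies", "download"]
spam_page = ["movies", "download"] * 100
modest_page = ["movies", "download"] * 3

spam_score = score_resource(query, spam_page, index_counts)
modest_score = score_resource(query, modest_page, index_counts)
linked_score = score_resource(query, modest_page, index_counts, inbound_links=10)
```

With the cap in place, the page that repeats the query words a hundred times scores no higher than one that mentions them three times, and the link count acts only as a modifier on a score that is still driven by words, which matters for content that has no link structure.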
