YourSoft wrote:
Hi,
I have a suggestion to improve Nutch search results.
The "big" search engines (like Google) measure the distance between
the query words in each document.
E.g.:
query string: lucene in action
When you search for it on Google, Google boosts the documents
in which the words "lucene in action" appear in that same sequence.
I think this is possible in Nutch/Lucene (e.g. if your search string is:
"lucene in action"), but Nutch doesn't do it.
Any ideas how to do it?
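To make the idea concrete, here is a minimal sketch in plain Python (not Nutch or Lucene code). The names tokenize, phrase_positions and boosted_score, and the boost factor of 2.0, are only assumptions for illustration. A document gets the boost when the query words occur next to each other, in the same order.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_positions(query_terms, doc_terms):
    """Return the start indexes where query_terms occur contiguously, in order."""
    n = len(query_terms)
    if n == 0:
        return []
    return [
        i
        for i in range(len(doc_terms) - n + 1)
        if doc_terms[i:i + n] == query_terms
    ]


def boosted_score(base_score, query, document, phrase_boost=2.0):
    """Multiply base_score by phrase_boost (hypothetical factor) if the
    whole query appears in the document as an exact phrase."""
    hits = phrase_positions(tokenize(query), tokenize(document))
    return base_score * phrase_boost if hits else base_score
```

For example, boosted_score(1.0, "lucene in action", "I read Lucene in Action") would be boosted, while a document that only contains the three words scattered in another order would keep its base score.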
Regards,
Ferenc
I'm sorry, something was missing from my previous mail:
When searching for the keywords, other things should also affect the boost:
- How many times the full query ("lucene in action") is found in the document.
(Compare the full query count with the total length of the document: if it is
more than 10-20%, that is BAD.)
- How many times the individual query words ("lucene", "in", "action") are found in the document.
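The two counts above could be computed roughly like this. It is only a sketch in plain Python, not Nutch or Lucene code, and the names query_statistics and looks_stuffed are made up for illustration. The density formula (phrase occurrences times query length, divided by document length) is one possible reading of the 10-20% rule, and the 0.2 threshold is an assumption as well.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_statistics(query, document):
    """Return (phrase_count, term_counts, density) for a query and a document.

    phrase_count: how many times the full query occurs as an exact phrase
    term_counts:  how many times each single query word occurs
    density:      phrase_count * query length / document length
                  (assumed reading of the 10-20% rule)
    """
    q = tokenize(query)
    d = tokenize(document)
    n = len(q)
    phrase_count = 0
    if n > 0:
        phrase_count = sum(
            1 for i in range(len(d) - n + 1) if d[i:i + n] == q
        )
    counts = Counter(d)
    term_counts = {term: counts[term] for term in q}
    density = phrase_count * n / len(d) if d else 0.0
    return phrase_count, term_counts, density


def looks_stuffed(density, threshold=0.2):
    """Flag a document whose phrase density is above the (assumed) threshold."""
    return density > threshold
```

A ranking function could then raise the boost with phrase_count and the per-word counts, and lower it again when looks_stuffed returns True.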
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers