Doug,
One of my recent mails asked about document ids when there is more than one segment index.
If you can answer that question and suggest a way to get a unique document id, we can improve the speed considerably.
Stefan
On 26.07.2004 at 21:51, Doug Cutting wrote:
Michael Rosset wrote:
Attached is a patch for search.jsp adding support for grouping by host.
I just tried this on a test index with 160k pages. It gets really slow when there are lots of duplicates. I haven't looked too closely, but I assume this is because it has to look at lots of hit details.
I think we can accelerate this. We index the hostname in the "site" field. When re-querying we could add a clause to the query which prohibits sites we don't want to see any more hits from. This could be done with something like:
query.addProhibitedTerm("site", host);
The query should be cloned first, which means that Query needs to be made cloneable.
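The clone-then-prohibit idea above might look roughly like the following. This is only a sketch: SketchQuery and HostGrouping are hypothetical stand-ins written for illustration, not Nutch's actual Query class, and the (field, value) argument order simply mirrors the call shown above. The point is that the original query stays untouched while the copy gains one prohibited "site" clause per host that should yield no further hits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Nutch's Query; the real class may differ. */
class SketchQuery implements Cloneable {
    private List<String> requiredTerms = new ArrayList<>();
    private List<String> prohibitedClauses = new ArrayList<>();

    void addRequiredTerm(String term) {
        requiredTerms.add(term);
    }

    /** Assumed signature: field name first, then the value to exclude. */
    void addProhibitedTerm(String field, String value) {
        prohibitedClauses.add(field + ":" + value);
    }

    /** Deep-copies both clause lists so the copy can change independently. */
    @Override
    public SketchQuery clone() {
        try {
            SketchQuery copy = (SketchQuery) super.clone();
            copy.requiredTerms = new ArrayList<>(requiredTerms);
            copy.prohibitedClauses = new ArrayList<>(prohibitedClauses);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(String.join(" ", requiredTerms));
        for (String clause : prohibitedClauses) {
            sb.append(" -").append(clause);
        }
        return sb.toString();
    }
}

class HostGrouping {
    /** Returns a copy of the query that prohibits every host in the list. */
    static SketchQuery excludeHosts(SketchQuery original, List<String> hosts) {
        SketchQuery requery = original.clone();
        for (String host : hosts) {
            requery.addProhibitedTerm("site", host);
        }
        return requery;
    }
}
```

Re-querying with the copy would let the search skip the excluded hosts at query time, instead of fetching hit details only to discard them.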
Does this sound like a good approach to accelerating this? If so, Stefan or Andrzej, do you want to look into implementing this?
Doug
------------------------------------------------------- This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java Enterprise J2EE developer tools! Get your free copy of BEA WebLogic Workshop 8.1 today. http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click _______________________________________________ Nutch-developers mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/nutch-developers
--------------------------------------------------------------- enterprise information technology consulting open technology: http://www.media-style.com open discussion: http://www.text-mining.org open thoughts: http://www.find23.net
