In your bug report you suggested avoiding recalculation of the duplicate hosts by saving a value as a hidden form value, so that the calculation continues from that value and covers only the next page.
My answer was that this is not possible, since saving just the offset causes trouble when calculating the page numbers.
The only option I see is to save the actual page number and a unique identifier.
Since your suggested unique identifier is available without loading the details of a hit, this will be much faster, because we only need to query the details for the next page.
Does that make sense? How do I get the segment name of a hit in the SearchBean?
Stefan
On 26.07.2004 at 22:38, Doug Cutting wrote:
Stefan Groschupf wrote: There was a question in one of my last mails about document ids and more than one segment index.
In case you can answer this question and suggest a solution to get a unique document id, then we can heavily improve the speed.
Is this the question?
Stefan Groschupf wrote:
> So a question: as far as I understand Lucene, the document number (==
> hit.getIndexDocNo()) is unique per index.
> If that is true, then hit.getIndexDocNo() is not unique, since hits
> can be found in different segment indexes and on different servers,
> right?
> Is there any chance to get a unique id of the document that is not
> stored in the details?
There are several ways to uniquely identify a page. The combination of indexDocNo and segment name is unique. If you've run "dedup" on the segments, then both the URL and the MD5 digest are also unique.
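As a rough illustration of the first option, the sketch below combines a segment name and an indexDocNo into one key that can be compared, hashed, and round-tripped through a string such as a hidden form value. HitKey, encode() and decode() are hypothetical names made up for this example and are not part of the Nutch API; how the segment name is obtained from a hit is left open here.

```java
import java.util.Objects;

/** Hypothetical composite key: segment name plus per-index document number. */
final class HitKey {
    private final String segment;   // name of the segment the hit came from
    private final int indexDocNo;   // document number, unique only within one index

    HitKey(String segment, int indexDocNo) {
        this.segment = Objects.requireNonNull(segment, "segment");
        this.indexDocNo = indexDocNo;
    }

    /** Compact string form, e.g. for a hidden form value. */
    String encode() {
        // The number comes first so that a segment name containing '@' still parses.
        return indexDocNo + "@" + segment;
    }

    /** Inverse of encode(). */
    static HitKey decode(String s) {
        int at = s.indexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("not an encoded HitKey: " + s);
        }
        return new HitKey(s.substring(at + 1), Integer.parseInt(s.substring(0, at)));
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof HitKey)) return false;
        HitKey other = (HitKey) o;
        return indexDocNo == other.indexDocNo && segment.equals(other.segment);
    }

    @Override
    public int hashCode() {
        return Objects.hash(segment, indexDocNo);
    }
}
```

With a key like this, two hits that share the same indexDocNo but come from different segments compare as different, while the same hit seen on two requests compares as equal.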
What do you need a unique id for? Each page should only occur once in a hit list. Search-time duplicate detection is just done to reduce the number of hits per site, no?
Doug
------------------------------------------------------- This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java Enterprise J2EE developer tools! Get your free copy of BEA WebLogic Workshop 8.1 today. http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click _______________________________________________ Nutch-developers mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/nutch-developers
--------------------------------------------------------------- enterprise information technology consulting open technology: http://www.media-style.com open discussion: http://www.text-mining.org open thoughts: http://www.find23.net
