Re: cache persistent Hits

Gaston Tue, 26 Sep 2006 12:14:52 -0700

hi,

first thank you for the fast reply.

I use MultiSearcher that opens 3 indexes, so this makes the wholeoperation surly slower, but 20seconds for 5260 results out of an 212MBindex is much too slow.

Another reason can of course be my ISP.

Here is my code:

       IndexSearcher[] searchers;
       searchers=new IndexSearcher[3];
       String path="/home/sn/public_html/";
       searchers[0]=new IndexSearcher(path+"index1");
       searchers[1]=new IndexSearcher(path+"index2");
       searchers[2]=new IndexSearcher(path+"index3");
       MultiSearcher saercher=new MultiSearcher(searchers);

QueryParser parser=new QueryParser("content",newStandardAnalyzer());

           parser.setOperator(QueryParser.DEFAULT_OPERATOR_AND);

Query query=parser.parse("urlName:"+userInput+" OR"+"content:"+userInput);Hits hits=searcher.search(query);for(int i=0;i<hits.length();i++)

Document doc=hits.doc(i);}// Outprint only 10 results per page


   for(int i=startPoint;i<startPoint+10;i++)
           {

Document doc=hits.doc(i);out.println(escapeHTML(doc.get("description"))+"<p>");out.println("<ahref="+doc.get("url")+">"+doc.get("url").substring(7)+"</a>");

                   out.println("<p><p><p>");

}


Perhaps somebody see the reason why it is so slow.

Thank you in advance

Greetings Gaston




Erick Erickson schrieb:

Well, my index is over 1.4G, and others are reporting very largeindexes inthe 10s of gigabytes. So I suspect your index size isn't the issue.I'd be

very, very, very surprised if it was.

Three things spring immediately to mind.

First, opening an IndexSearcher is a slow operation. Are you opening anew

IndexSearcher for each query? If so, don't <G>. You can re-use the same
searcher across threads without fear and you should *definitely* keep it
open between queries.

Second, your query could just be very, very interesting. It would be more
helpful if you posted an example of the code where you take your timings
(including opening the IndexSearcher).

Third, if you're using a Hits object to iterate over many documents, be

aware that it re-executes the query every hundred results or so. Youwant to

use one of the  HitCollector/TopDocs/TopDocsCollector classes if you are

iterating over all the returned documents. And you really *don't* wantto do

an IndexReader.doc(doc#) or Searcher.doc(doc#) on every document.

If none of this helps, please post some code fragments and I'm sureothers

will chime in.

Best
Erick

On 9/26/06, Gaston <[EMAIL PROTECTED]> wrote:


Hi,

Lucene has itself  volatile caching mechanism provided by a weak
HashMap. Is there a possibilty to serialize the Hits Object? I think of
a HashMap that for each found result, caches the first 100 results. Is
it possible to implement such a feature or is there such an extension?
My problem is that the searching of my application with an index with
the size of 212MB takes to much time, despite I set the BooleanOperator
from OR to AND

I am happy about every suggestion.

Greetings

Gaston.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: cache persistent Hits

Reply via email to