You may not be able to store all the documents, but what about just storing the document IDs in a list?
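For concreteness, a rough, untested sketch against the Lucene 2.x API of what "just record the doc IDs" could look like (the class name and the searcher/query arguments are only placeholders for whatever you already have):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;

public class DocIdCollectorSketch {
    // Run the query once and keep only the matching Lucene doc IDs.
    public static List collectDocIds(Searcher searcher, Query query) throws IOException {
        final List docIds = new ArrayList();
        searcher.search(query, new HitCollector() {
            public void collect(int doc, float score) {
                // Holding 50,000 Integers is cheap compared to 50,000 Documents.
                docIds.add(new Integer(doc));
            }
        });
        return docIds;
    }
}

When a page is displayed, you then call searcher.doc(id) only for the handful of IDs on that page. (The cached IDs are only meaningful for as long as the same reader/searcher stays open, of course.)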
And remember that a Hits object re-queries the index every 100 documents or so as you iterate through it, so if you're really using a Hits object, you're re-executing the query anyway. You might want to think about using a HitCollector, or TopDocs, or... and recording the document IDs (which is very fast, and holding 50,000 integers isn't very expensive).

If you're paging through the hits with a Hits object, I believe you'll find no effective difference between keeping the searcher open and walking through the Hits, and re-executing the query each time. If that's true, you'll see a dramatic difference by using, say, TopDocs when you're paging into the latter parts of the list. And you'll see an *enormous* difference if you cache the doc IDs for your paging <G>...

Of course, if you're not using a Hits object, most of this doesn't apply.

Best
Erick

On 3/7/07, Mohammad Norouzi <[EMAIL PROTECTED]> wrote:
Yes, I am very concerned about this because we have a big project with many users and I am responsible for it. The thing that preoccupies my mind is application performance, because there are more than 500 thousand records (documents). A single search may return about 50 thousand documents, and it is not possible to put all of them into, say, a java.util.List. So I have to keep the searcher open and move forward or backward through the Hits, and when the user clicks the "Finish" button, or a specific time limit is exceeded (or the session is destroyed), I set a flag to true so that another session can access that searcher, without closing any searcher or reader.

Anyway, your comments are useful for me. Thanks.

On 3/7/07, Mark Miller <[EMAIL PROTECTED]> wrote:
>
> To address your Hits question: I wouldn't keep Hits around, but would re-search instead. It is often more of a headache than a time savings to keep all of the Hits objects around and have to manage them. I made my own Hits object that does no caching because of this. Pagination is often best done by re-querying.
>
> Also, keep in mind that you probably won't have 1000 Similarities... you will probably have much closer to 1 <g>, maybe a couple if you have created a custom one. The biggest chance of having more than one Searcher cached for an index is if you have a MultiSearcher cached that searches over it. Out of the box, indexAccessor does not handle MultiSearchers perfectly, though... it does not check out a Searcher for each of the underlying indexes, so you will have to do that yourself... and then remember to release them all when you release the MultiSearcher.
>
> I think, in general, you are over-concerned. IndexAccessor will handle most of this for you without much intervention on your part.
>
> - Mark
>
> Mohammad Norouzi wrote:
> > Hello Mark,
> > There is something vague for me about the Lucene indexAccessor you created and how it relates to my problem.
> > As I see from your code, you create an IndexSearcher and put it into a Map, and the only thing that separates them is the Similarity they have. So if, say, 1000 users with different Similarities connect to my application, there will be 1000 IndexSearchers, each with its own internal Reader.
> > Now, in my case, I have an IndexResultSet, just like java.sql.ResultSet, which contains a Hits. A user may go forward or backward through the Hits' documents, and actually every user is doing this.
> >
> > To do so, I have to find the Similarity a user is working with and then find the right IndexSearcher in order to support pagination for her. Is this right? I mean, can I trust the Similarity to find the right IndexReader that a user has used before?
> >
> > Another question: how about having one IndexReader for all my IndexSearchers, and managing them simultaneously so they access that single Reader?
> >
> > Thank you very much in advance
> >
> > On 2/22/07, Mark Miller <[EMAIL PROTECTED]> wrote:
> >>
> >> I would not do this from scratch... if you are interested in Solr, go that route; otherwise I would build off
> >> http://issues.apache.org/jira/browse/LUCENE-390
> >>
> >> - Mark
> >>
> >> Mohammad Norouzi wrote:
> >> > Hi all,
> >> > I am going to build a Searcher pool. If anyone has experience with this, I would be glad to hear his/her recommendations and suggestions. I want to know what issues I should pay attention to, considering I am going to use this on a web application with many user sessions.
> >> >
> >> > Thank you very much in advance.

--
Regards,
Mohammad
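To make the "pagination is best done by re-querying" suggestion from Erick and Mark concrete, here is a rough sketch against the Lucene 2.x TopDocs API. The class and parameter names are only illustrative, and it assumes a Searcher you have checked out for the current request (for example from indexAccessor) and will release afterwards:

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TopDocs;

public class RequeryPagerSketch {
    // Re-run the query for every page request and pull back only that page.
    public static Document[] getPage(Searcher searcher, Query query,
                                     int pageIndex, int pageSize) throws IOException {
        // Ask for just enough top hits to cover the requested page.
        TopDocs top = searcher.search(query, null, (pageIndex + 1) * pageSize);
        ScoreDoc[] hits = top.scoreDocs;

        int start = pageIndex * pageSize;
        int end = Math.min(start + pageSize, hits.length);
        if (start >= end) {
            return new Document[0]; // requested page is past the end of the results
        }

        Document[] page = new Document[end - start];
        for (int i = start; i < end; i++) {
            page[i - start] = searcher.doc(hits[i].doc); // fetch only this page's documents
        }
        return page;
    }
}

Because the query is re-executed for each page, nothing large has to be parked in the user's session, and the Searcher can be checked out and released per request instead of being held open across requests.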
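And on the question above about one IndexReader behind several IndexSearchers: an IndexSearcher can be constructed on an already open IndexReader, and each searcher can be given its own Similarity. A minimal sketch (the index path is made up, and DefaultSimilarity stands in for whatever custom Similarity you would actually plug in):

import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.DefaultSimilarity;
import org.apache.lucene.search.IndexSearcher;

public class SharedReaderSketch {
    public static void main(String[] args) throws IOException {
        // One open reader shared by several searchers.
        IndexReader reader = IndexReader.open("/path/to/index"); // illustrative path

        IndexSearcher searcherA = new IndexSearcher(reader);
        searcherA.setSimilarity(new DefaultSimilarity());  // give each searcher its own Similarity

        IndexSearcher searcherB = new IndexSearcher(reader);
        searcherB.setSimilarity(new DefaultSimilarity());  // e.g. a custom Similarity per user group

        // ... run searches with searcherA / searcherB ...

        // Closing a searcher that was built on an externally supplied reader
        // does not close that reader, so close the reader itself, last.
        searcherA.close();
        searcherB.close();
        reader.close();
    }
}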