Re: how to define a pool for Searcher?

Mark Miller Wed, 07 Mar 2007 07:17:23 -0800

You should not be returning 50 thousand documents. Since you areimplementing paging, you should only return enough to cover your pagesize. If a user is viewing page 1 with documents 1-10, you send backinformation for 10 of the docs. On page 2, 10-20, you send backinformation for 10 of the docs, after executing a new search andskimming off the correct docs.

Also, unless your documents are *very* big, 500,000 documents on anysort of modern hardware is a cakewalk for Lucene.

Using Hits as it was meant to be used is generally a poor decision in amulti-user environment in my opinion. You will be faster and scalebetter using a stateless paradigm instead of taking advantage of Hitscaching. Besides that, if you are currently loading up 50 thousanddocuments from Hits, you are doing something that is extremelyinefficient. As Erick said, Hits will continuously re-query to fill itups cache. After grabbing info for a 100 docs it will re query, thenafter 200 it will require, etc.


Stateless man. Trust me.

- Mark

Mohammad Norouzi wrote:

yes I am very concerned about this because we have a big project withmanyusers and I am responsible for this. the thing that preoccupied mymind is

application performance because there is more than 500 thousands records
(documents).

a single search may returns about 50 thousand documents and it is not

possible to put all of them into a say java.util.List so I have tokeep the

searcher open and move forward or backward through the Hits and when the
user clicked on "Finish" button or the time exceeds over than a specific
time,(or the session destroyed)  so I set a flag to true, then the other
session can access that searcher without closing any searcher or reader.

any way, your comments are useful for me.
thanks

On 3/7/07, Mark Miller <[EMAIL PROTECTED]> wrote:

To address your hits question: I wouldn't keep hits around, but would
re-search instead. It is often more of a headache than a time savings to
keep around all of the Hits objects and to have to manage them. I made
my own Hits object that does no caching because of this. Pagination is
often best done by re-querying.

Also, keep in mind that you prob won't have 1000 Similarities...you will
prob have much closer to 1 <g>, maybe a couple if you have created a
custom one. The biggest chance you have more than one Searcher cached
for an Index is if you have a MultiSearcher cached that searches over
it. Out of the box, indexAccessor does not handle MultiSearchers
perfectly though...it does not check out a Searcher for each of the
underlying Indexes, so you will have to do that your self...then
remember to release them all when you release the MultiSearcher.

I think in general, you are over concerned. IndexAccessor will handle
most of this for you without much intervention on your part.

- Mark

Mohammad Norouzi wrote:
> Hello Mark,
> there is something vague for me about the Lucene-indexAccessor you
> created
> and my problem.
> as I see your codes, you create IndexSearcher and put it into a Map
> and the

> only thing that separate them is the Similarity the have. so if say1000

> users with different Similarity connect to my application there will
> be 1000
> IndexSearcher with their own internal Reader.
> now, in my case, I have an IndexResultSet just like java.sql.ResultSet
> which
> it contains a Hits. so a user may go forward or backward through the
> Hits'
> documents and actually every user are doing this job.
>
> to do so, I have to find the Similarity that a user working with it
> and find

> the right IndexSearcher in order to support pagination for her. isthis

> right? I mean can I trust to Similarity to find the right IndexReader
> that a
> user have used it before?
>
> another question is, how about I have one IndexReader for all my
> IndexSearcher and manage them simultaneously to access that single
> Reader.?
>
> thank you very much in advance
>
>
> On 2/22/07, Mark Miller <[EMAIL PROTECTED]> wrote:
>>
>> I would not do this from scratch...if you are interested in Solr go
that
>> route else I would build off
>> http://issues.apache.org/jira/browse/LUCENE-390
>>
>> - Mark
>>
>> Mohammad Norouzi wrote:
>> > Hi all,

>> > I am going to build a Searcher pooling. if any one hasexperience on

>> > this, I
>> > would be glad to hear his/her recommendation and suggestion. I want
to
>> > know

>> > what issues I should be apply. considering I am going to usethis on

a
>> > web
>> > application with many user sessions.
>> >
>> > thank you very much in advance.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: how to define a pool for Searcher?

Reply via email to