You should not be returning 50 thousand documents. Since you are implementing paging, you should only return enough to cover your page size. If a user is viewing page 1 with documents 1-10, you send back information for 10 of the docs. On page 2, 10-20, you send back information for 10 of the docs, after executing a new search and skimming off the correct docs.

Also, unless your documents are *very* big, 500,000 documents on any sort of modern hardware is a cakewalk for Lucene.

Using Hits as it was meant to be used is generally a poor decision in a multi-user environment in my opinion. You will be faster and scale better using a stateless paradigm instead of taking advantage of Hits caching. Besides that, if you are currently loading up 50 thousand documents from Hits, you are doing something that is extremely inefficient. As Erick said, Hits will continuously re-query to fill it ups cache. After grabbing info for a 100 docs it will re query, then after 200 it will require, etc.

Stateless man. Trust me.

- Mark

Mohammad Norouzi wrote:
yes I am very concerned about this because we have a big project with many users and I am responsible for this. the thing that preoccupied my mind is
application performance because there is more than 500 thousands records
(documents).

a single search may returns about 50 thousand documents and it is not
possible to put all of them into a say java.util.List so I have to keep the
searcher open and move forward or backward through the Hits and when the
user clicked on "Finish" button or the time exceeds over than a specific
time,(or the session destroyed)  so I set a flag to true, then the other
session can access that searcher without closing any searcher or reader.

any way, your comments are useful for me.
thanks

On 3/7/07, Mark Miller <[EMAIL PROTECTED]> wrote:

To address your hits question: I wouldn't keep hits around, but would
re-search instead. It is often more of a headache than a time savings to
keep around all of the Hits objects and to have to manage them. I made
my own Hits object that does no caching because of this. Pagination is
often best done by re-querying.

Also, keep in mind that you prob won't have 1000 Similarities...you will
prob have much closer to 1 <g>, maybe a couple if you have created a
custom one. The biggest chance you have more than one Searcher cached
for an Index is if you have a MultiSearcher cached that searches over
it. Out of the box, indexAccessor does not handle MultiSearchers
perfectly though...it does not check out a Searcher for each of the
underlying Indexes, so you will have to do that your self...then
remember to release them all when you release the MultiSearcher.

I think in general, you are over concerned. IndexAccessor will handle
most of this for you without much intervention on your part.

- Mark

Mohammad Norouzi wrote:
> Hello Mark,
> there is something vague for me about the Lucene-indexAccessor you
> created
> and my problem.
> as I see your codes, you create IndexSearcher and put it into a Map
> and the
> only thing that separate them is the Similarity the have. so if say 1000
> users with different Similarity connect to my application there will
> be 1000
> IndexSearcher with their own internal Reader.
> now, in my case, I have an IndexResultSet just like java.sql.ResultSet
> which
> it contains a Hits. so a user may go forward or backward through the
> Hits'
> documents and actually every user are doing this job.
>
> to do so, I have to find the Similarity that a user working with it
> and find
> the right IndexSearcher in order to support pagination for her. is this
> right? I mean can I trust to Similarity to find the right IndexReader
> that a
> user have used it before?
>
> another question is, how about I have one IndexReader for all my
> IndexSearcher and manage them simultaneously to access that single
> Reader.?
>
> thank you very much in advance
>
>
> On 2/22/07, Mark Miller <[EMAIL PROTECTED]> wrote:
>>
>> I would not do this from scratch...if you are interested in Solr go
that
>> route else I would build off
>> http://issues.apache.org/jira/browse/LUCENE-390
>>
>> - Mark
>>
>> Mohammad Norouzi wrote:
>> > Hi all,
>> > I am going to build a Searcher pooling. if any one has experience on
>> > this, I
>> > would be glad to hear his/her recommendation and suggestion. I want
to
>> > know
>> > what issues I should be apply. considering I am going to use this on
a
>> > web
>> > application with many user sessions.
>> >
>> > thank you very much in advance.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to