Hi,

> Well, the first question is always "are you opening/closing your
> IndexSearchers for each request on your remote machines?". This is always a
> no-no. This is also a question for your single-searcher version.

Yes, I know. Each search slave (RMI server) has a single instance
 of IndexSearcher, and it is opened once when the RMI server starts.

> What is your performance if you only go to one server? I'd start by finding

Performance on one server with the FULL index (not divided by 10)
 is about 2500 ms.
On one server with the split index (divided by 10), it is about 50 ms.

With ParallelMultiSearcher over 10 remote searchables,
 each RemoteSearchable returns in about 50 - 100 ms,
 and ParallelMultiSearcher also returns in about 50 - 100 ms, because of
 threading.
But Hits Searcher.search(Query, Sort) responds in about 500 - 1000 ms.

I think that Searcher.search with Sort reads all of the SortFields from
 the IndexReader, and that this is the bottleneck.
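
One way to test this guess is to time each layer separately (local
IndexSearcher, RemoteSearchable over RMI, ParallelMultiSearcher, then the
sorted search) and compare medians. Below is a minimal sketch in plain Java
with no Lucene dependency. StageTimer and medianMillis are hypothetical
names, and the stage passed in stands for whatever search call is being
measured.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical helper, not part of Lucene.
class StageTimer {
    // Runs the stage `runs` times (runs must be >= 1) and returns the
    // median wall-clock time of a single run, in milliseconds.
    static <T> long medianMillis(Supplier<T> stage, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            stage.get(); // the search call under test
            samples[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }
}
```

For example, StageTimer.medianMillis(() -> runSortedSearch(), 20), where
runSortedSearch is a hypothetical wrapper around the sorted search call.
Running it once per layer should show where the extra 500 - 1000 ms appears.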

Are there any results showing high-performance distributed Lucene with ParallelMultiSearcher?
Or do I need Hadoop?

Erick Erickson wrote:
Well, the first question is always "are you opening/closing your
IndexSearchers for each request on your remote machines?". This is always a
no-no. This is also a question for your single-searcher version.

What is your performance if you only go to one server? I'd start by finding
out what happens when you forget all the ParallelMultiSearcher stuff, all
the RMI stuff etc, and just see what your performance is on one of your
index parts locally. Once that is answered, extend to RMI, then the
Parallel...., at each step seeing if your performance degrades unacceptably.
That'll at least give you a clue what part of the process is the biggest
problem.

And without knowing a LOT more about your searches, and your index, it's
kind of hard to come up with solutions <G>....

Best
Erick

On 10/3/06, Scott <[EMAIL PROTECTED]> wrote:

Hi,

I have a question about ParallelMultiSearcher performance.

I want to search documents in an index of about 10 gigabytes.
(The index has 10,000,000 documents.)

Normally I get very slow performance using IndexSearcher with ONE index.
Then I tried to use ParallelMultiSearcher with 10 remote
searchable servers.

Index:
Each search slave has 1/10 of the index.
(ONE index divided across 10 servers.)

Search slave:
Each search slave starts a remote searchable RMI server
and waits for a connection from the search master.

Search master:
The search master uses Naming.lookup() to get each remote searchable.
It gets 10 remote searchables, one from each search slave, and builds a
ParallelMultiSearcher.
Then it searches.
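
The master's fan-out can be pictured with the following sketch. It is plain
Java and only an analogy: Shard, FanOutSketch and topN are hypothetical
names, not Lucene classes, and this is not a copy of how
ParallelMultiSearcher is implemented. It shows why the total time should be
bounded by the slowest slave plus the merge step.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for one search slave; not a Lucene interface.
interface Shard {
    // Returns this shard's own best n sort keys.
    List<Integer> topN(int n);
}

class FanOutSketch {
    // Queries every shard concurrently, then merges the partial results
    // and keeps the n smallest keys. Assumes at least one shard.
    static List<Integer> search(List<Shard> shards, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            List<Future<List<Integer>>> pending = new ArrayList<>();
            for (Shard shard : shards) {
                pending.add(pool.submit(() -> shard.topN(n)));
            }
            List<Integer> merged = new ArrayList<>();
            for (Future<List<Integer>> f : pending) {
                merged.addAll(f.get()); // blocks until that shard answers
            }
            merged.sort(Comparator.naturalOrder());
            return new ArrayList<>(merged.subList(0, Math.min(n, merged.size())));
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("shard query failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each shard only has to return its own top n, so the master merges at most
10 * n keys no matter how large the full index is.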

Any solution?

--
Scott

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--
Scott
