OK, you're now officially beyond my competence, so I'll have to wait for
people who actually know <G>....

Although if I read your stats right, you're getting approximately 1 sec
response time over 10M documents on a 10G index. That's not bad at all. What
kind of response time do you need?

On 10/3/06, Scott <[EMAIL PROTECTED]> wrote:

Hi,

> Well, the first question is always "are you opening/closing your
> IndexSearchers for each request on your remote machines?". This is
always a
> no-no. This is also a question for your single-searcher version.

Yes I know, each search slave (RMI server) have single instance
  of IndexSearcher and it's open once when RMI server starts.

> What is your performance if you only go to one server? I'd start by
finding

A performance on one server with FULL index (not divided by 10)
  is about 2500 ms.
On one server with splitted index (divided by 10) is about 50 ms.

And on ParallelMultiSearcher with 10 of remote searchable,
  each RemoteSearchable returns in about 50 - 100 ms,
  and ParallelMultiSearcher returns also 50 - 100 ms, because of
  threading.
but Hits Searcher.search(Query, Sort) responds in about 500 - 1000 ms.

I think that Searcher.search with Sort reads all of SortFields from
  IndexReader and it's bottleneck.

Are there results of high performance distributed Lucene with
ParallelMultiSearcher?
Or need hadoop?

Erick Erickson wrote:
> Well, the first question is always "are you opening/closing your
> IndexSearchers for each request on your remote machines?". This is
always a
> no-no. This is also a question for your single-searcher version.
>
> What is your performance if you only go to one server? I'd start by
finding
> out what happens when you forget all the ParallelMultiSearcher stuff,
all
> the RMI stuff etc, and just see what your performance is on one of your
> index parts locally. Once that is answered, extend to RMI, then the
> Parallel...., at each step seeing if your performance degrades
> unacceptably.
> That'll at least give you a clue what part of the process is the biggest
> problem.
>
> And without knowing a LOT more about your searches, and your index, it's
> kind of hard to come up with solutions <G>....
>
> Best
> Erick
>
> On 10/3/06, Scott <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> I have a question about ParallelMultiSearcher performance.
>>
>> I want to search documents on about 10 gigabytes of index.
>> (The index has 10,000,000 documents.)
>>
>> I get very slow performance using IndexSearcher with ONE index
normally.
>> Then I tried to use ParallelMultiSearcher with 10 servers of remote
>> searchable.
>>
>> Index:
>> Each search slaves have 1/10 of index.
>> (ONE index divided to 10 servers.)
>>
>> Search slave:
>> Each search slaves start remote searchable RMI server,
>> and wait connecting from search master.
>>
>> Search master:
>> The search master use Naming.lookup() to get remote searchable.
>> Get 10 remote searchables from each search slaves and build
>> ParallelMultiSearcher.
>> Then search.
>>
>> Any solution?
>>
>> --
>> Scott
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>

--
Scott

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to