Re: Performance of never optimizing

Justus Pendleton Sun, 02 Nov 2008 21:50:09 -0800

On 03/11/2008, at 4:27 PM, Otis Gospodnetic wrote:

Why are you optimizing? Trying to make the search faster? I would try to avoid optimizing during high usage periods.

I assume that the original, long-ago decision to optimize was made to improve searching performance.

One thing that you might not have tried is the constant re-opening of the IndexReader, which you'll need to do if you want to see index changes instantly.

We do keep track of when the index has been updated and re-open IndexReaders so that they see the updates instantly.
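
To make the "track updates, then re-open" idea concrete, here is a minimal sketch of the pattern. It does not use Lucene classes: FakeIndex, FakeReader and ReaderHolder are hypothetical stand-ins invented for illustration, and the version counter is an assumption about how "the index has been updated" might be detected. In a real application the reader would be a Lucene IndexReader.

```java
import java.util.concurrent.atomic.AtomicLong;

class ReopenSketch {
    /** Hypothetical stand-in for an index whose version bumps on every commit. */
    static class FakeIndex {
        private final AtomicLong version = new AtomicLong();
        void commit() { version.incrementAndGet(); }
        long currentVersion() { return version.get(); }
    }

    /** Hypothetical stand-in for a point-in-time reader over the index. */
    static class FakeReader {
        final long version;
        FakeReader(long version) { this.version = version; }
    }

    /** Hands out the current reader, re-opening only when the index has changed. */
    static class ReaderHolder {
        private final FakeIndex index;
        private FakeReader reader;
        int reopenCount = 0;

        ReaderHolder(FakeIndex index) {
            this.index = index;
            this.reader = new FakeReader(index.currentVersion());
        }

        synchronized FakeReader acquire() {
            long latest = index.currentVersion();
            if (reader.version != latest) {
                // The index changed since this reader was opened: replace it.
                reader = new FakeReader(latest);
                reopenCount++;
            }
            return reader;
        }
    }

    public static void main(String[] args) {
        FakeIndex index = new FakeIndex();
        ReaderHolder holder = new ReaderHolder(index);
        holder.acquire();   // no change yet, reader is reused
        index.commit();     // simulate an index update
        holder.acquire();   // stale reader is replaced
        System.out.println("reopens: " + holder.reopenCount);
    }
}
```

The point of the sketch is only that searches between updates share one reader, so the cost of re-opening is paid once per index change rather than once per search.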

So you indexed once and then measured search performance? Or did you measure indexing performance? I can't quite tell from your email. And in one case you optimized before searching and in the other you did not optimize?

Yes, I indexed once and then measured search performance. (The actual algorithm used can be seen at http://confluence.atlassian.com/display/JIRACOM/Lucene+graphs) For my current purposes I don't care about indexing performance.

1. Why does the merge factor of 4 appear to be faster than the merge factor of 2?
Faster for indexing or searching? If indexing, then it's because 4 means fewer segment merges than 2. If searching, then I don't know, unless you had indexing and searching happening in parallel, which then means less IO for 4.

For searching. The index and search should not have been happening in parallel. However, multiple searches are occurring in parallel.
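
As an aside on the indexing half of that answer, the claim that a merge factor of 4 means fewer segment merges than 2 can be checked with a small simulation. This is a simplified model, not Lucene's actual merge policy: it assumes every flush creates one level-0 segment and that a merge happens exactly when mergeFactor segments accumulate at the same level. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MergeFactorSketch {
    /**
     * Simulates a simplified level-based merge scheme.
     * Returns {number of merges performed, segments left at the end}.
     */
    static int[] simulate(int flushes, int mergeFactor) {
        // levelCounts.get(i) = number of segments currently at level i
        List<Integer> levelCounts = new ArrayList<>();
        int merges = 0;
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (levelCounts.size() <= level) {
                    levelCounts.add(0);
                }
                int count = levelCounts.get(level) + 1;
                if (count < mergeFactor) {
                    levelCounts.set(level, count);
                    break;
                }
                // mergeFactor segments at this level: merge them into one
                // segment at the next level and carry on upward.
                levelCounts.set(level, 0);
                merges++;
                level++;
            }
        }
        int remaining = 0;
        for (int c : levelCounts) {
            remaining += c;
        }
        return new int[] {merges, remaining};
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 4}) {
            int[] result = simulate(64, mergeFactor);
            System.out.println("mergeFactor=" + mergeFactor
                    + " merges=" + result[0]
                    + " segments=" + result[1]);
        }
    }
}
```

Under this model, 64 flushes cause 63 merges with a merge factor of 2 and 21 with a merge factor of 4. The model says nothing about search speed, which is the part I am asking about.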

Did your index fit in RAM, by the way?

The machine has, I believe, 4 GB of RAM and the benchmark suite reports that 700 MB were used, so it does appear to have fit into RAM.

2. Why does non-optimized searching appear to be faster than optimized searching once the index hits ~500,000 documents?
Not sure without seeing the index/machine.

The machine is an 8-core Mac Pro. If you'd like, I can provide the indexes online somewhere. Or if you can provide pointers on what to look for, I'm more than happy to investigate this myself.

It sounds like you were measuring search performance while at the same time increasing the index size by incrementally adding more docs?

No documents were being added to the index while the searching was being performed. I was trying to measure only the search performance.

20 reqs/sec sounds very low. How large is your index, how much RAM, and how about heap size?
What were your queries like? Random? From a log?

The queries were generated by the ReutersQueryMaker. I am not sure what the heap sizes used at the various stages were. (I ran the benchmarks over the weekend; they took several days.)

I'm confused by what exactly you did and measured, but it could just be that I'm tired.

My apologies for not being clearer in my initial email. I appreciate the help.


Cheers,
Justus

