Hi Samar, Have you looked at top or iostat or other monitoring utilities to see if you are cpu bound vs I/O bound?
With 225 term queries, it's possible that you are I/O bound. I suspect you need to think about seek time and caching. For each unique field:term combination lucene has to look up the postings for that term in the index. Additionally for any phrase, lucene has to additionally look up the positions data for each term in the phrase. (In our case phrase searches are very expensive as our positions (*prx) index is about 8 times as large as our frq index) So for 225 terms including some number of phrases, that is a lot of disk seeks. To the extent that the terms are close together in the index and various buffer caches contain adjacent terms, you might not actually have 225 seeks, but I suspect there will still be a lot. Although Lucene implements a number of caches (and you should take a look at your cache hit ratios), Lucene depends on the OS disk cache to cache postings data for individual terms. Most unix/linux OS's use free memory for disk caching. How much memory is available on the machine after the JVM gets it allocation? Have you considered running cache warming queries of your most frequent terms/phrases so that the data is in the OS disk cache? Tom >> When queries (without two fields mentioned above) have a lot of >>words/phrases search time is high. E.g I took a query with around 80 unique >>terms (not words) in 5 fields. These terms occur repeatedly and become total >>225 terms (non-unique). This particular query took 4.2 seconds. the 15 >>indexes used for this query were of total size 5 G. >>Are 225 terms (80 unique) is a very big number? -----Original Message----- --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org