Hello all,

I'm looking for advice on how to improve scalability. We have a fairly large Lucene index of 35M documents, with a maximum document size of 1k (most are much smaller) and 14 fields. We combine the descriptive text into a single "contents" field and search on that, and we have been very pleased with it: it handles almost 100 queries/sec at about 8-12ms for the average search.
Before that, we had a common attribute for which about 50% of the docs had one value and the rest had the other, and adding it to the boolean query slowed response times very significantly. We handled this by splitting our indexes so that each index contains only one value of the attribute, which eliminated the need for the boolean clause. This was a 7-8x improvement.

Now we're back to wanting to add another attribute to the documents, for which most of the docs will have one value and far fewer will have the other. Although it sounds simple, my limited testing with an 85/15 ratio is showing another big hit on performance with the boolean. A two-term boolean search without the attribute takes about 7-8ms; adding the attribute to the boolean search increases the elapsed time to about 4x the original for the 85% value and 2x for the 15% value.

I had some hope that a QueryFilter would really help, but it turns out to be much slower: the 85% term takes a whopping 336ms and the 15% term around 65ms, which is roughly 40x and 8x slower than the original 8ms query without the additional attribute.

Is there a better way to handle adding a common attribute with a few possible values across the index? Any other recommended approaches?

Thanks in advance,
Vince
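
P.S. To make the ratios above easy to check, here is a minimal Java sketch. It only redoes the arithmetic on the timings quoted above, assuming a baseline of 8ms for the plain two-term query. The class and method names are made up for illustration, and it does not touch Lucene at all.

```java
public class SlowdownCheck {
    // Assumed baseline: 8 ms, the figure quoted for the two-term query
    // without the extra attribute (the measured range was 7-8 ms).
    static final double BASELINE_MS = 8.0;

    // How many times slower a measured time is than the baseline.
    static double slowdown(double measuredMs) {
        return measuredMs / BASELINE_MS;
    }

    // Elapsed time implied by a quoted slowdown factor.
    static double elapsedMs(double factor) {
        return factor * BASELINE_MS;
    }

    public static void main(String[] args) {
        // Boolean query with the attribute: quoted as 4x and 2x of the original.
        System.out.printf("boolean, 85%% value: about %.0f ms%n", elapsedMs(4.0));
        System.out.printf("boolean, 15%% value: about %.0f ms%n", elapsedMs(2.0));

        // QueryFilter: quoted as 336 ms and 65 ms.
        System.out.printf("filter, 85%% value: %.1fx slower%n", slowdown(336.0));
        System.out.printf("filter, 15%% value: %.1fx slower%n", slowdown(65.0));
    }
}
```

With the 8ms baseline, the boolean variants work out to about 32ms and 16ms, and the filter variants to about 42x and 8x, in line with the rounded "40x and 8x" figures.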
