So, I have an oddball question I have been battling with for the last day or two.
I have an 8 million document Solr index, divided roughly down the middle by an identifying "product" field that takes one of two distinct values. The documents in both "sides" are very similar, with stored text fields, etc. I have two nearly identical request handlers, one for each "side".

When I run very similar queries against either "side" for random phrases, requesting 500 rows with highlighting on titles and summaries, I get very different response times. One "side" consistently returns in around 1-2 seconds, whereas the other consistently takes 6-10 seconds. I don't see any reason why it is worse; each run of queries is deliberately randomized to keep caches from getting in the way. In most cases each test query returns the full first 500 rows.

My filter query cache configuration looks like this:

<filterCache class="solr.FastLRUCache" size="750000" initialSize="10000" autowarmCount="0"/>

(I have been desperately increasing it, hoping this would help.) The other caches are quite small; the customer's use cases don't involve much paging, just returning a large initial result set with highlighting in the shortest possible time.

I'm trying to optimize this so the disparity between the two "halves" is not so dramatic. Are there any optimizations or settings I should be looking at to tune? Or is it just "the way it is"? I've argued for decreasing the return set size, turning off highlighting, etc., but these seem to be out of the question. I would at least like a concrete reason why one filter query is so out of whack relative to the other, given that the documents are split very nearly in half (3.8 million vs. 4.0 million on the slower side).

Any pointers or suggestions would be appreciated. Thanks in advance.

Neal Ensor
nen...@gmail.com