On Sat, 2011-01-01 at 03:06 +0100, Tri Nguyen wrote:
> I remember going through some page that had graphs of response times based on
> index size for solr.
>
> Anyone know of such pages?

Sorry, no. Some small-scale tests with our corpus showed that, for the raw searches, response times grew less than proportionally with index size: doubling the index size did not double the response time. Faceting time, on the other hand, was proportional to the index size. As always, your mileage will vary.

> Internally, we have some requirements for response times and I'm trying to
> figure out when to shard the index.

If you discover that your searches are primarily IO-bound, which is often the case, and if you're still using spinning disks, I highly recommend that you upgrade to SSDs. They are very cheap compared to RAM, you don't need to change your code or workflow, and they work beautifully with Lucene/Solr: they gave us a 2-4x speedup compared to two 15,000 RPM hard disks in RAID 1. Compared to holding the index fully in RAM (with a 14GB index), they gave us 80% of the performance on a dual-core machine. Machines with more CPU cores might benefit more from the RAM solution.
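
If you want to check how response time scales with index size on your own corpus, one rough way is to fit a power law to a handful of measurements and look at the exponent: below 1 means less than proportional, around 1 means proportional. The following is only a sketch of that idea, not what we ran. The function name is my own, and the numbers are made-up placeholders that you would replace with your own timings.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 means time grows proportionally with size;
    a slope below 1 means it grows less than proportionally.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Placeholder values for illustration only, NOT measurements from our tests.
index_sizes_gb = [1, 2, 4, 8]
search_ms = [10.0, 14.0, 20.0, 28.0]    # hypothetical raw search times
facet_ms = [50.0, 100.0, 200.0, 400.0]  # hypothetical faceting times

print("search exponent: %.2f" % scaling_exponent(index_sizes_gb, search_ms))
print("facet exponent:  %.2f" % scaling_exponent(index_sizes_gb, facet_ms))
```

Measure with a warmed index and realistic queries, and time raw searches and faceting separately, since they may scale differently.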