On Sat, 2011-01-01 at 03:06 +0100, Tri Nguyen wrote:
> I remember going through some page that had graphs of response times based on 
> index size for solr.
>  
> Anyone know of such pages?

Sorry, no. Some small scale tests with our corpus showed that, for raw
searches, response times grew less than proportionally to index size:
Doubling the index size did not double the response time. On the other
hand, faceting time was proportional to the index size. As always, your
mileage will vary.
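
If you want to check this against your own index, one rough way is to
time the same queries at a few index sizes and fit the slope on a
log-log scale. The sketch below assumes you have already collected the
timings yourself; the function name and the numbers are hypothetical
placeholders, not our measurements. A slope clearly below 1 means
response time grows less than proportionally to index size, a slope
near 1 means it grows proportionally.

```python
import math


def scaling_exponent(index_sizes, response_times):
    """Least-squares slope of log(response time) against log(index size)."""
    if len(index_sizes) != len(response_times) or len(index_sizes) < 2:
        raise ValueError("need at least two (size, time) pairs")
    xs = [math.log(s) for s in index_sizes]
    ys = [math.log(t) for t in response_times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical placeholder numbers: replace with your own measurements.
sizes_gb = [1, 2, 4, 8]
search_ms = [10.0, 14.0, 20.0, 28.0]    # made-up sub-linear growth
facet_ms = [50.0, 100.0, 200.0, 400.0]  # made-up proportional growth

print("search exponent:", round(scaling_exponent(sizes_gb, search_ms), 2))
print("facet exponent:", round(scaling_exponent(sizes_gb, facet_ms), 2))
```

Extrapolating such a fit to the index size you expect gives a crude
estimate of when a response time requirement will be missed, which is
one input for deciding when to shard.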

> Internally, we have some requirements for response times and I'm trying to 
> figure out when to shard the index.

If you discover that your searches are primarily IO-bound, which is
often the case, and if you're still using spinning disks, I highly
recommend that you upgrade to SSDs. They are very cheap compared to
RAM, you don't need to change your code or workflow and they work
beautifully with Lucene/Solr: They gave us a 2-4 times speedup, compared
to 2 * 15,000 RPM hard disks in RAID 1. Compared to holding the index
fully in RAM (with a 14GB index) they gave us 80% of the performance on
a dual core machine; more CPU cores might benefit more from the RAM
solution.
