On Tue, 2013-04-16 at 12:22 +0200, Montu v Boda wrote:
> We have 5,060,000 documents indexed in Solr, and the index size is
> 400 GB.
> 
> Now, when we search for the keyword "test", it takes 1 minute to
> return the response for 10,000 rows.

Before you measured the search for "test", you had already run searches
for other keywords, right? The first search on a newly opened index is
notoriously slow.

> After firing the query, when we open the resource monitor, it shows
> that most of the cost is disk I/O.

Both searching and value retrieval (for the 10K rows) require a lot of
random access in Lucene/Solr and, I guess, in just about every other
comparable search engine.

I will bet a cake that your underlying storage is spinning disks. When
you perform a search for a keyword that has not been used before or not
in a while, the disk cache has little data for that search so there will
be a lot of random access to the underlying storage. Spinning disks are
really bad at this.
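
As a rough back-of-envelope sketch of why this matters: 1 minute for
10,000 rows works out to about 6 ms per row, which is in the range of a
single seek on a spinning disk. The model below assumes one uncached
random read per retrieved row, and the seek times are assumed ballpark
figures, not measurements from your system.

```python
# Rough model: every retrieved row costs a fixed number of uncached
# random reads. The latencies below are assumed ballpark values.
SEEK_MS = {"spinning disk": 10.0, "SSD": 0.1}


def retrieval_seconds(rows, seeks_per_row, seek_ms):
    """Return the time spent purely on random reads, in seconds."""
    return rows * seeks_per_row * seek_ms / 1000.0


for name, seek_ms in SEEK_MS.items():
    seconds = retrieval_seconds(10_000, 1, seek_ms)
    print(f"{name}: {seconds:.1f} s for 10,000 rows")
```

Real numbers depend on how much of the index is already in the disk
cache, so treat this only as an illustration of the order of magnitude.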

> Any help would be appreciated.

Short answer: Use an SSD.

Longer answer: You need to either lower the number of seeks or make them
faster (or both). In your case, you lower the number of seeks with
copious amounts of RAM for the disk cache and a lot of warming of your
searchers. You make the seeks faster by switching storage type.
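
A minimal sketch of warming from the outside, assuming a Solr core
reachable over HTTP at a URL such as
http://localhost:8983/solr/collection1 (the URL, the keyword list and
the helper names build_warm_url and warm are all hypothetical). It
simply fires a set of representative queries so that the relevant parts
of the index are pulled into the disk cache before real users arrive.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_warm_url(core_url, keyword, rows=10):
    """Build a Solr select URL for one warm-up query."""
    params = urlencode({"q": keyword, "rows": rows, "wt": "json"})
    return f"{core_url}/select?{params}"


def warm(core_url, keywords, rows=10):
    """Run each query once and discard the response.

    The point is only to make Solr and the OS read the index data,
    so the result itself is thrown away.
    """
    for keyword in keywords:
        with urlopen(build_warm_url(core_url, keyword, rows)) as response:
            response.read()
```

Something like warm("http://localhost:8983/solr/collection1", ["test",
"solr"]) after each index reopen would be one way to use it. Solr can
also run warming queries itself through the firstSearcher and
newSearcher listeners in solrconfig.xml, which is probably the better
place for this in a real setup.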

RAIDing spinning drives does not help much: its benefits are higher
bulk transfer rates and/or more concurrent requests, whereas you need
lower latency. You could buy faster spinning drives, but at current SSD
prices I would really advise that you go down that road instead.

Regards,
Toke Eskildsen, State and University Library, Denmark
