I see the CPU working very hard, and at the same time I see 2 MB/sec disk access for those 15 seconds. I am not running it this instant, but it seems to me that there were more CPU cycles available, so unless it's an issue of not being able to multithread it any further, I'd say it's more IO related.

I'm going to set up SolrCloud and shard across the 2 servers I have available for now. The setup we have while we're in a private beta period is not optimal, but maybe sharding will improve things (I've got 2 servers with 2x 4TB disks in RAID-0, shared with the webservers). I'll work towards some improved IO performance and maybe more shards and see how things go. I'll also be able to up the RAM in just a couple of weeks.

Are there any settings I should think of in terms of improving cache performance when I can give it, say, 10GB of RAM?

Thanks, this has been tremendously helpful.

David

-----Original Message-----
From: Tom Burton-West [mailto:tburt...@umich.edu]
Sent: Saturday, March 23, 2013 1:38 AM
To: solr-user@lucene.apache.org
Subject: Re: Slow queries for common terms

Hi David and Jan,

I wrote the blog post, and David, you are right, the problem we had was with phrase queries because our positions lists are so huge. Boolean queries don't need to read the positions lists.

I think you need to determine whether you are CPU bound or I/O bound. It is possible that you are I/O bound and reading the term frequency postings for 90 million docs is taking a long time. In that case, more memory in the machine (but not dedicated to Solr) might help because Solr relies on OS disk caching for caching the postings lists. You would still need to do some cache warming with your most common terms.

On the other hand, as Jan pointed out, you may be CPU bound because Solr doesn't have early termination and has to rank all 90 million docs in order to show the top 10 or 25.

Did you try the OR search to see if your CPU is at 100%?

Tom

On Fri, Mar 22, 2013 at 10:14 AM, Jan Høydahl <jan....@cominvent.com> wrote:

> Hi
>
> There might not be a final cure with more RAM if you are CPU bound.
> Scoring 90M docs is some work. Can you check what's going on during
> those 15 seconds? Is your CPU at 100%? Try an (foo OR bar OR baz) search
> which generates >100mill hits and see if that is slow too, even if you
> don't use frequent words.
>
> I'm sure you can find other frequent terms in your corpus which
> display similar behaviour, words which are even more frequent than
> "book". Are you using "AND" as default operator? You will benefit from
> limiting the number of results as much as possible.
>
> The real solution is to shard across N number of servers, until you
> reach the desired performance for the desired indexing/querying load.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
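
One possible way to run the CPU-bound versus I/O-bound check discussed in this thread is to time a broad OR query several times in a row while watching CPU and disk activity on the server. The following is only a rough sketch, not anything posted by the participants: the host, port, core name (`collection1`), field name (`text`) and the search terms are all assumptions, and the helper names `build_select_url` and `time_query` are made up for illustration.

```python
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, terms, field="text", rows=10):
    """Build a Solr /select URL that ORs the given terms together.

    `base_url` is the core URL (hypothetical here), `field` is an assumed
    field name; `q` and `rows` are standard Solr query parameters.
    """
    clause = " OR ".join(f"{field}:{term}" for term in terms)
    params = urllib.parse.urlencode({"q": f"({clause})", "rows": rows})
    return f"{base_url.rstrip('/')}/select?{params}"


def http_fetch(url):
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def time_query(url, fetch=http_fetch, repeats=3):
    """Run the same query `repeats` times; return wall-clock seconds per run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch(url)
        timings.append(time.perf_counter() - start)
    return timings


# Example usage (assumes a Solr instance at this hypothetical address):
# url = build_select_url("http://localhost:8983/solr/collection1",
#                        ["foo", "bar", "baz"])
# print(time_query(url))
```

If the first run is slow and the repeats are much faster, that would be consistent with the postings having been pulled into the OS disk cache, which points towards I/O. If every run is about equally slow while the CPU sits near 100%, scoring is the more likely bottleneck. Solr's own query result cache can also make repeats fast, so it may be worth varying the terms between runs. The same `time_query` loop could be pointed at the most common terms as a crude warming pass after a restart.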