Hi Ming,

which Solr version are you using? If you are on one of the latest versions (4.5 or above), try the new parameter facet.threads with a reasonable value (4 to 8 gave me a massive performance speedup when working with large facets, i.e. nTerms >> 10^7).
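
A minimal sketch of what the request could look like with facet.threads added to the parameters from your mail. The host, port, and collection name in the URL are placeholders, and q=*:* and rows=0 are my assumptions (you did not say which main query you use); only fq, facet, facet.field, and facet.limit come from your message.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace host, port, and collection with your own.
base_url = "http://localhost:8983/solr/collection1/select"

params = {
    "q": "*:*",               # assumption: match everything, restrict via fq
    "fq": "source:Video",     # filter from the original query
    "rows": 0,                # assumption: only facet counts are needed
    "facet": "true",
    "facet.field": "url",
    "facet.limit": 500,
    "facet.threads": 4,       # the new parameter; 4 to 8 worked for me
}

query_string = urlencode(params)
print(base_url + "?" + query_string)
```

As far as I understand the Solr reference, facet.threads loads the faceted fields in parallel, so with a single facet.field the gain is worth measuring on your data rather than assuming.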
-Sascha

Mingfeng Yang wrote:
> I have an index with 170M documents, and two of the fields for each
> doc are "source" and "url". I want to know the top 500 most
> frequent urls from the Video source.
>
> So I did a facet with
> "fq=source:Video&facet=true&facet.field=url&facet.limit=500", and
> the matching documents number about 9 million.
>
> The Solr cluster is hosted on two EC2 instances, each with 4 CPUs and
> 32G memory. 16G is allocated for the Java heap. 4 master shards on one
> machine, and 4 replicas on another machine. Connected together via
> ZooKeeper.
>
> Whenever I run the query above, the response takes too long
> and the client times out. Sometimes an impatient end user
> waits a few seconds for the results, kills the connection,
> and then issues the same query again and again.
> The server then has to deal with multiple such heavy
> queries simultaneously and becomes so busy that we get a "no server
> hosting shard" error, probably due to lost communication between the
> Solr node and ZooKeeper.
>
> Is there any way to deal with this problem?
>
> Thanks,
> Ming