Hi Yonik,

I have run the queries against a single-index Solr instance with only 16M documents. After adding facet.method=fc the results seemed to come back faster (first two queries below), but still not fast enough.
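As a rough way to read the stats below against your two questions, here is a small Python sketch that parses the item_shingleContent_trigram entry and derives approximate figures. It assumes that termInstances counts the total term occurrences across all documents and that nTerms is the number of unique terms in the field; the 16M document count is the approximate figure mentioned above, not an exact one. The name parse_entry is just an illustrative helper.

```python
import re

# Entry copied verbatim from the fieldValueCache stats in this message.
ENTRY = (
    "{field=shingleContent_trigram,memSize=1786355392,tindexSize=17977426,"
    "time=662387,phase1=654707,nTerms=53492050,bigTerms=38,"
    "termInstances=602090958,uses=397}"
)

# Assumption: approximate document count ("only 16M documents").
NUM_DOCS = 16_000_000


def parse_entry(entry):
    """Turn '{k=v,k=v,...}' into a dict, converting numeric values to int."""
    pairs = re.findall(r"(\w+)=([^,}]+)", entry)
    return {key: int(value) if value.isdigit() else value for key, value in pairs}


stats = parse_entry(ENTRY)

# Assumption: termInstances is the total number of (document, term) occurrences.
avg_tokens_per_doc = stats["termInstances"] / NUM_DOCS
mem_gib = stats["memSize"] / 2**30

print("unique terms (nTerms):", stats["nTerms"])
print("approx. tokens per document:", round(avg_tokens_per_doc, 1))
print("uninverted field size (GiB):", round(mem_gib, 2))
```

Under those assumptions this works out to roughly 38 tokens per document and about 1.66 GiB for the uninverted field.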
Here are the fieldValueCache stats:

(facet.limit=1000000&facet.mincount=5&facet.method=fc, 542094 hits, 1 min) --> smallest result set

name: fieldValueCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=10000, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
stats:
lookups : 400
hits : 396
hitratio : 0.99
inserts : 1
evictions : 0
size : 1
warmupTime : 0
cumulative_lookups : 400
cumulative_hits : 396
cumulative_hitratio : 0.99
cumulative_inserts : 1
cumulative_evictions : 0
item_shingleContent_trigram : {field=shingleContent_trigram,memSize=1786355392,tindexSize=17977426,time=662387,phase1=654707,nTerms=53492050,bigTerms=38,termInstances=602090958,uses=397}

(facet.limit=1000000&facet.mincount=5&facet.method=fc, 2837589 hits, 3 min 8 s) --> largest result set

name: fieldValueCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=10000, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
stats:
lookups : 401
hits : 397
hitratio : 0.99
inserts : 1
evictions : 0
size : 1
warmupTime : 0
cumulative_lookups : 401
cumulative_hits : 397
cumulative_hitratio : 0.99
cumulative_inserts : 1
cumulative_evictions : 0
item_shingleContent_trigram : {field=shingleContent_trigram,memSize=1786355392,tindexSize=17977426,time=662387,phase1=654707,nTerms=53492050,bigTerms=38,termInstances=602090958,uses=398}

On Wed, Mar 16, 2011 at 9:46 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Wed, Mar 16, 2011 at 8:05 AM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > Hello guys. We are using shard'ed solr 1.4 for heavy faceted search over the
> > trigrams field with about 1 million of entries in the result set and more
> > than 100 million of entries to facet on in the index. Currently the faceted
> > search is very slow, taking about 5 minutes per query. Would running on a
> > cloud with Hadoop make it faster (to seconds) as faceting seems to be a
> > natural map-reduce task?
>
> How many indexed tokens does each document have (for the field you are
> faceting on) on average?
> How many unique tokens are indexed in that field over the complete index?
>
> Or you could go to the admin/stats page and cut-n-paste the
> fieldValueCache entry after your faceting request - it should contain
> most of the info to further analyze this.
>
> -Yonik
> http://lucidimagination.com

--
Regards,
Dmitry Kan