Solr 1.4 has a new feature (https://issues.apache.org/jira/browse/SOLR-475) that speeds up faceting on fields with many terms by adding an UnInvertedField. Bobo uses a custom field cache as well. It may be useful to benchmark the three approaches (bitsets, SOLR-475, Bobo). Perhaps this would make a good wiki page explaining the differences between them?
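
To make the comparison a bit more concrete, here is a toy Python sketch of two of the counting strategies. It is not Solr's or Bobo's actual code: the function names and the tiny in-memory index are made up for illustration, and plain Python sets stand in for bitsets. The first function does one intersection per unique term in the field, so its cost grows with the number of terms (36M in the "author" case below). The second builds a doc-to-terms map once, roughly the idea behind an un-inverted field, and then only walks the hits.

```python
from collections import Counter

def facet_by_term_sets(term_to_docs, hits):
    """Bitset-style counting: intersect every term's doc set with the hits."""
    counts = {}
    for term, docs in term_to_docs.items():
        n = len(docs & hits)
        if n:
            counts[term] = n
    return counts

def uninvert(term_to_docs):
    """Build a doc -> terms map once (toy stand-in for an un-inverted field)."""
    doc_to_terms = {}
    for term, docs in term_to_docs.items():
        for doc in docs:
            doc_to_terms.setdefault(doc, []).append(term)
    return doc_to_terms

def facet_by_uninverted(doc_to_terms, hits):
    """Un-inverted counting: walk only the hits and bump a counter per term."""
    counts = Counter()
    for doc in hits:
        counts.update(doc_to_terms.get(doc, ()))
    return dict(counts)

def top_n(counts, n):
    """Return the n most frequent terms, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Hypothetical index: author name -> set of document ids.
index = {"alice": {1, 2, 5}, "bob": {2, 3}, "carol": {4}}
hits = {1, 2, 3}  # document ids matching the query

by_sets = facet_by_term_sets(index, hits)
by_uninverted = facet_by_uninverted(uninvert(index), hits)
```

Both strategies give the same counts; the difference is where the time goes. A real benchmark would need to run against the actual implementations with a realistic index, since memory layout and caching dominate at that scale.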

On Mon, Jul 13, 2009 at 9:49 AM, Bradford Stephens <bradfordsteph...@gmail.com> wrote:

> Thanks for this -- we're also trying out bobo-browse for Lucene, and
> early results look pretty enticing. They greatly sped up how fast you
> read in documents from disk, among other things:
> http://bobo-browse.wiki.sourceforge.net/
>
> On Sat, Jul 11, 2009 at 12:10 AM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> > On Sat, Jul 11, 2009 at 12:01 AM, Bradford Stephens <bradfordsteph...@gmail.com> wrote:
> >
> >> Does the facet aggregation take place on the Solr search server, or
> >> the Solr client?
> >>
> >> It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50
> >> million document index (about 36M unique values in the "author"
> >> field), a query that returns 131,000 hits takes about 20 seconds to
> >> calculate the top 50 authors. The query I'm running is this:
> >>
> >> http://dttest10:8983/solr/select/select?q=java&facet=true&facet.field=authorname
> >>
> >
> > Is the author field tokenized? Is it multi-valued? It is best to have
> > untokenized fields.
> >
> > Solr 1.4 has huge improvements in faceting performance so you can try that
> > and see if it helps. See Yonik's blog post about this -
> > http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
>
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science