[ https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi closed SOLR-711.
----------------------------

    Resolution: Fixed

Thanks Shalin for pointing to SOLR-475, which is a very advanced solution to the term counting approach.

> SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-711
>                 URL: https://issues.apache.org/jira/browse/SOLR-711
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Fuad Efendi
>             Fix For: 1.4
>
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>
> From [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:
> Scenario:
> - 10,000,000 documents in the index;
> - 5-10 terms per document;
> - 200,000 unique terms for a tokenized field.
> _Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._
> Not applicable if the size of the DocSet is close to the total number of unique tokens (200,000 in our scenario).
> See SimpleFacets.java:
> {code}
> public NamedList getFacetTermEnumCounts(
>   SolrIndexSearcher searcher,
>   DocSet docs, ...
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
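
For illustration only, here is a minimal sketch of the two counting strategies the issue description contrasts. It uses plain Java collections, not Solr or Lucene classes: the class FacetCountSketch and its methods invert, countByIntersection and countByTermVectors are hypothetical names, a Map of document id to term list stands in for term vectors, and a Map of term to document-id set stands in for the cached per-term filters. It also assumes each term is listed at most once per document.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of the two strategies; not Solr's SimpleFacets code.
class FacetCountSketch {

    // Build term -> documents (a stand-in for per-term cached filters)
    // from document -> terms (a stand-in for term vectors).
    static Map<String, Set<Integer>> invert(Map<Integer, List<String>> termVectors) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (Map.Entry<Integer, List<String>> e : termVectors.entrySet()) {
            for (String term : e.getValue()) {
                postings.computeIfAbsent(term, k -> new HashSet<>()).add(e.getKey());
            }
        }
        return postings;
    }

    // Strategy A: one intersection per unique term.
    // Work grows with the number of unique terms, whatever the DocSet size.
    static Map<String, Integer> countByIntersection(
            Map<String, Set<Integer>> postings, Set<Integer> docSet) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : postings.entrySet()) {
            int n = 0;
            for (Integer doc : docSet) {
                if (e.getValue().contains(doc)) {
                    n++;
                }
            }
            if (n > 0) {
                counts.put(e.getKey(), n);
            }
        }
        return counts;
    }

    // Strategy B: walk only the documents in the DocSet and tally their terms.
    // Work grows with DocSet size times terms per document.
    static Map<String, Integer> countByTermVectors(
            Map<Integer, List<String>> termVectors, Set<Integer> docSet) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Integer doc : docSet) {
            List<String> terms = termVectors.getOrDefault(doc, Collections.<String>emptyList());
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> termVectors = new HashMap<>();
        termVectors.put(1, Arrays.asList("solr", "facet"));
        termVectors.put(2, Arrays.asList("solr", "lucene"));
        termVectors.put(3, Arrays.asList("lucene"));

        Set<Integer> docSet = new HashSet<>(Arrays.asList(1, 2));

        // Both strategies print the same counts: {facet=1, lucene=1, solr=2}
        System.out.println(countByIntersection(invert(termVectors), docSet));
        System.out.println(countByTermVectors(termVectors, docSet));
    }
}
```

With the numbers from the scenario, strategy A would perform 200,000 intersections regardless of how small the DocSet is, while strategy B would visit only the documents in the DocSet and their 5-10 terms each. As the description notes, that advantage disappears once the DocSet size approaches the number of unique terms.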