[ 
https://issues.apache.org/jira/browse/SOLR-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toke Eskildsen updated SOLR-5894:
---------------------------------
    Description: 
Multiple performance enhancements to Solr String faceting.

* Sparse counters, replacing the constant time overhead of extracting the 
top-X terms with a time overhead linear in the result set size
* Counter re-use for reduced garbage collection and lower per-call overhead
* Optional counter packing, trading speed for space
* Improved distribution count logic, greatly improving the performance of 
distributed faceting
* In-segment threaded faceting
* Regexp-based white- and black-listing of facet terms
* Heuristic faceting for large result sets
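
To illustrate the first two points, here is a minimal sketch of the sparse 
counter idea. It is not the code from the patch: the class name SparseCounter 
and its methods are hypothetical, and it assumes facet terms are addressed by 
integer ordinals. A second array tracks which counters have been touched, so 
extracting the top-X terms and clearing the structure for re-use only visit 
the touched entries instead of every unique term in the field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a sparse facet counter; not the SOLR-5894 patch code. */
public class SparseCounter {
    private final int[] counts;   // one counter per term ordinal
    private final int[] touched;  // ordinals whose counter is currently non-zero
    private int touchedSize = 0;  // number of valid entries in touched

    public SparseCounter(int uniqueTerms) {
        counts = new int[uniqueTerms];
        touched = new int[uniqueTerms];
    }

    /** Increments the counter for an ordinal, tracking it on first use. */
    public void increment(int ordinal) {
        if (counts[ordinal]++ == 0) {
            touched[touchedSize++] = ordinal;
        }
    }

    /** Returns up to x {ordinal, count} pairs, highest count first. */
    public List<int[]> top(int x) {
        List<int[]> entries = new ArrayList<>();
        // Only touched counters are visited, never the full counts array.
        for (int i = 0; i < touchedSize; i++) {
            entries.add(new int[] {touched[i], counts[touched[i]]});
        }
        entries.sort(Comparator.comparingInt((int[] e) -> e[1]).reversed()
                .thenComparingInt(e -> e[0]));
        return new ArrayList<>(entries.subList(0, Math.min(x, entries.size())));
    }

    /** Resets only the touched counters so the instance can be re-used. */
    public void clear() {
        for (int i = 0; i < touchedSize; i++) {
            counts[touched[i]] = 0;
        }
        touchedSize = 0;
    }
}
```

The sketch always tracks touched ordinals. A real implementation would 
presumably need a fallback for result sets that touch most of the counters, 
where the tracking itself becomes the dominant cost.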

Currently implemented for Solr 4.10. Source, a detailed description and a 
directly usable WAR are available at http://tokee.github.io/lucene-solr/

This project has grown beyond a simple patch and will require a fair amount of 
co-operation with a committer to get into Solr. Splitting into smaller issues 
is a possibility.

  was:
Multiple performance enhancements to Solr String faceting.

* Sparse counters, switching the constant time overhead of extracting top-X 
terms with time linear to result set size
* Counter re-use for reduced garbage collection and lower per-call overhead
* Optional counter packing, trading speed for space
* Improved distribution count logic, greatly reducing the overhead of 
distributed faceting
* In-segment threaded faceting
* Regexp based white- and black-listing of facet terms
* Heuristic faceting for large result sets

Currently implemented for Solr 4.10 and available as source and directly usable 
WAR at http://tokee.github.io/lucene-solr/




> Speed up high-cardinality facets with sparse counters
> -----------------------------------------------------
>
>                 Key: SOLR-5894
>                 URL: https://issues.apache.org/jira/browse/SOLR-5894
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 4.7.1
>            Reporter: Toke Eskildsen
>            Priority: Minor
>              Labels: faceted-search, faceting, memory, performance
>         Attachments: SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> SOLR-5894_test.zip, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> author_7M_tags_1852_logged_queries_warmed.png, 
> sparse_2000000docs_fc_cutoff_20140403-145412.png, 
> sparse_5000000docs_20140331-151918_multi.png, 
> sparse_5000000docs_20140331-151918_single.png, 
> sparse_50510000docs_20140328-152807.png
>
>
> Multiple performance enhancements to Solr String faceting.
> * Sparse counters, replacing the constant time overhead of extracting the 
> top-X terms with a time overhead linear in the result set size
> * Counter re-use for reduced garbage collection and lower per-call overhead
> * Optional counter packing, trading speed for space
> * Improved distribution count logic, greatly improving the performance of 
> distributed faceting
> * In-segment threaded faceting
> * Regexp-based white- and black-listing of facet terms
> * Heuristic faceting for large result sets
> Currently implemented for Solr 4.10. Source, a detailed description and a 
> directly usable WAR are available at http://tokee.github.io/lucene-solr/
> This project has grown beyond a simple patch and will require a fair amount 
> of co-operation with a committer to get into Solr. Splitting into smaller 
> issues is a possibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

