Elran Dvir created SOLR-5972:
--------------------------------

             Summary: new statistics facet capabilities to StatsComponent facet 
- limit, sort and missing.
                 Key: SOLR-5972
                 URL: https://issues.apache.org/jira/browse/SOLR-5972
             Project: Solr
          Issue Type: New Feature
            Reporter: Elran Dvir


I thought it would be very useful to be able to limit and sort the 
StatsComponent facet response.
I chose to implement this in StatsComponent rather than in the Analytics 
component because Analytics doesn't support distributed queries yet. 

The default for limit is -1, which returns all facet values.
The default for sort is no sorting.
The default for missing is true.
So if you use the stats component exactly as before, the response is 
unchanged.
If you ask for sort or limit, the missing facet value will be last, as in 
regular faceting.
Supported sort types: min, max, sum and countdistinct for stats fields, and 
count and index for facet fields (all sort types are lowercase).
Sort directions asc and desc are supported.
Sorting by multiple fields is supported.
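
To make the rules above concrete, here is a minimal Python sketch of the described sort, limit and missing behaviour applied to already-computed per-facet-value stats. It is not the code from the patch: the function name sort_and_limit, the in-memory data model (None standing for the missing bucket) and the choice not to count the missing bucket against the limit are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: facet value -> stats of one stats field.
# The key None stands for the "missing" bucket (assumption, not from the patch).
Stats = Dict[str, float]


def sort_and_limit(
    buckets: Dict[Optional[str], Stats],
    sort: List[Tuple[str, str]],
    limit: int = -1,
    missing: bool = True,
) -> List[Tuple[Optional[str], Stats]]:
    """Apply the described defaults: limit=-1 (all), no sort, missing=true."""
    present = [(k, v) for k, v in buckets.items() if k is not None]

    # Multiple sort keys: apply them from least to most significant.
    # Python's sort is stable, so earlier keys in `sort` take precedence.
    for stat, direction in reversed(sort):
        present.sort(key=lambda kv: kv[1][stat], reverse=(direction == "desc"))

    # A negative limit means "return all facet values".
    if limit >= 0:
        present = present[:limit]

    # The missing bucket always comes last. Assumption: it is appended after
    # the limit is applied rather than counted against it.
    if missing and None in buckets:
        present.append((None, buckets[None]))
    return present


salaries = {
    "alice": {"sum": 300.0, "max": 200.0},
    "bob": {"sum": 500.0, "max": 200.0},
    None: {"sum": 50.0, "max": 50.0},
    "carol": {"sum": 100.0, "max": 100.0},
}
top_two = sort_and_limit(salaries, [("sum", "desc")], limit=2)
```

With an empty sort list and the default limit, the sketch returns the buckets in their original order, which mirrors the statement that existing requests see no change.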

Our example use case is employees' monthly salaries:

The following query returns the 10 most "expensive" employees:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum desc&f.employee_name.stats.facet.limit=10"
The following query returns the 10 least "expensive" employees:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum asc&f.employee_name.stats.facet.limit=10"
The following query returns the employee who received the highest salary ever:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary max desc&f.employee_name.stats.facet.limit=1"
The following query returns the employee who received the lowest salary ever:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary min asc&f.employee_name.stats.facet.limit=1"
The following query returns the first 10 employees in lexicographic order:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name index asc&f.employee_name.stats.facet.limit=10"
The following query returns the 10 employees who have worked for the longest 
period:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name count desc&f.employee_name.stats.facet.limit=10"
The following query returns the 10 employees whose salaries vary the most:
"q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary countdistinct desc&f.employee_name.stats.facet.limit=10"
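
Because the sort value contains spaces, the query strings above need URL encoding before they are sent over HTTP. The following Python sketch builds such a query string with the standard library. The helper name stats_facet_query is hypothetical, and the per-field sort and limit parameter names are the ones proposed in this issue, so they only work on a Solr build with the patch applied.

```python
from urllib.parse import urlencode


def stats_facet_query(stats_field, facet_field, sort=None, limit=None):
    """Build an encoded query string for a stats facet request.

    `sort` is a string such as "salary sum desc"; `limit` is an integer.
    Both are optional, matching the defaults described in this issue.
    """
    params = [
        ("q", "*:*"),
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
    ]
    # Per-field parameter prefix as used in the example queries of this issue.
    prefix = "f.{}.stats.facet.".format(facet_field)
    if sort is not None:
        params.append((prefix + "sort", sort))
    if limit is not None:
        params.append((prefix + "limit", limit))
    # urlencode escapes "*:*" and turns spaces in the sort value into "+".
    return urlencode(params)


# The 10 most "expensive" employees, ready to append to a select URL.
query_string = stats_facet_query(
    "salary", "employee_name", sort="salary sum desc", limit=10
)
```

The resulting string can be appended after the "?" of a select handler URL; the host, port and core name depend on the installation and are left out here.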

Attached a patch implementing this in StatsComponent.


