[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15692801#comment-15692801 ]

Lyubov Romanchuk commented on SOLR-5972:

Hi all,

Attached the patch for multivalued docValues fields.

Best regards,
Lyuba

> new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
>
>                 Key: SOLR-5972
>                 URL: https://issues.apache.org/jira/browse/SOLR-5972
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Elran Dvir
>         Attachments: SOLR-5972.patch, SOLR-5972.patch, SOLR-5972_multivalue_docvalue.patch
>
> I thought it would be very useful to enable limiting and sorting the StatsComponent facet response.
> I chose to implement it in StatsComponent rather than the Analytics component because Analytics doesn't support distributed queries yet.
> The default for limit is -1, which returns all facet values.
> The default for sort is no sorting.
> The default for missing is true.
> So if you use the stats component exactly as before, the response does not change.
> If you ask for sort or limit, the missing facet value will be last, as in regular faceting.
> Sort types supported: min, max, sum and countdistinct for stats fields, and count and index for facet fields (all sort types are lower case).
> Sort directions asc and desc are supported.
> Sorting by multiple fields is supported.
>
> Our example use case is employees' monthly salaries.
> The following query returns the 10 most "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 least "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum asc&f.employee_name.stats.facet.limit=10"
> The following query returns the employee that got the highest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary max desc&f.employee_name.stats.facet.limit=1"
> The following query returns the employee that got the lowest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary min asc&f.employee_name.stats.facet.limit=1"
> The following query returns the first 10 employees in lexicographic order:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name index asc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees that have worked for the longest period:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name count desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees whose salaries vary the most:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary countdistinct desc&f.employee_name.stats.facet.limit=10"
>
> Attached a patch implementing this in StatsComponent.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
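The example queries in the issue description contain spaces in the sort value, so they need URL encoding before being sent over HTTP. The following is a minimal sketch of building such a query string in Python. The helper name `stats_facet_query` is hypothetical, and the `f.<field>.stats.facet.sort` and `f.<field>.stats.facet.limit` parameters are the ones proposed by this patch, so they are only expected to work on a Solr build with the patch applied.

```python
from urllib.parse import urlencode, parse_qs


def stats_facet_query(stats_field, facet_field, sort, limit=-1):
    """Build a URL-encoded query string for a stats facet request.

    Hypothetical helper: the sort and limit parameter names follow the
    SOLR-5972 proposal and are not part of stock StatsComponent.
    limit=-1 mirrors the proposed default (return all facet values).
    """
    params = [
        ("q", "*:*"),
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
        (f"f.{facet_field}.stats.facet.sort", sort),
        (f"f.{facet_field}.stats.facet.limit", str(limit)),
    ]
    # urlencode escapes the spaces in the sort value ("salary sum desc").
    return urlencode(params)


if __name__ == "__main__":
    # The 10 most "expensive" employees, as in the first example query.
    print(stats_facet_query("salary", "employee_name", "salary sum desc", limit=10))
```

The resulting string would be appended to the select handler URL of whichever core or collection holds the salary documents.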
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303217#comment-14303217 ]

Elran Dvir commented on SOLR-5972:

Hi all,

This patch contains a new statistics result for a field: existInDoc. It returns the number of documents in which the field has a value (is not missing).
For multivalued fields, existInDoc is calculated inside the class UnInvertedField.
Since Solr 4.10 there has been a fix for the stats calculation of multivalued fields that have docValues. The class handling it is DocValuesStats.
I want to support the existInDoc calculation for multivalued docValues fields as well. How should I change DocValuesStats to support this?

Thanks.
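To make the intended meaning of existInDoc concrete, here is a small sketch in plain Python. It is not the UnInvertedField or DocValuesStats code; the function name `exist_in_doc` and the dict-based documents are assumptions made for illustration. The point it shows is that a multivalued field contributes one to the count per document, regardless of how many values that document holds.

```python
def exist_in_doc(docs, field):
    """Count the documents in which `field` has at least one value.

    Illustrative only: documents are modeled as dicts, and a multivalued
    field is modeled as a list of values.
    """
    count = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # field is missing in this document
        if isinstance(value, (list, tuple)):
            if value:  # multivalued: count the document once if non-empty
                count += 1
        else:
            count += 1  # single-valued field with a value
    return count


if __name__ == "__main__":
    docs = [
        {"id": 1, "salary": [3000, 3200]},  # two values, counted once
        {"id": 2, "salary": [4100]},
        {"id": 3},                          # salary missing
        {"id": 4, "salary": []},            # no values, treated as missing
    ]
    print(exist_in_doc(docs, "salary"))
```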
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091573#comment-14091573 ]

Hoss Man commented on SOLR-5972:

Elran: I appreciate that you've put a lot of work into trying to improve the {{stats.facet}} feature of StatsComponent, but personally I don't think it's really wise for us to be pursuing multiple divergent sets of "facet" code in Solr.

The existing "StatsComponent Faceting" code has always felt like a kludge to me, and it has never worked as well or gotten as much developer attention as the FacetComponent.

I think in the long run, implementing things like SOLR-6351 to let people _combine_ StatsComponent with FacetComponent, and deprecating {{stats.facet}} completely, will make a lot more sense and be a lot more powerful.
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083916#comment-14083916 ]

Elran Dvir commented on SOLR-5972:

I attached a newer patch that fixes the calculation of existInDoc for multivalued fields.