[ 
https://issues.apache.org/jira/browse/SOLR-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6349:
---------------------------
    Attachment: SOLR-6349.patch


Starting to get back into this; here's a quick checkpoint of some small progress.


Step #1: This new patch brings Xu's latest patch up to date with trunk using 
the minimal changes that seemed to work -- in particular: I haven't started 
really digging into the code changes other than getting things to compile & 
tests to pass.

Step #2...

My main focus for now is making sure the tests are rock solid & all inclusive 
so we can then iterate on the code changes (see early comments about my concerns 
with spreading the logic around).  Only 2 noticeable changes in this patch...

* Fixed FacetPivotSmallTest.testPivotFacetStatsUnsortedTagged
** was prematurely specifying 'mean=true' but then trying to assert that all 
stats were returned
** beefed this up to also assert that it got an expected number of stats - if 
we add more stats in the future, this will be a canary that the test needs 
to be updated to assert the correct values for these new stats.

* StatsComponentTest
** added more asserts to the 3 
testFieldStatisticsResults_TYPE_FieldAlwaysMissing to ensure expected values 
for all stats (when there is nothing to compute stats on)...{noformat}
// numerics & strings & dates
min=null
max=null
// just numerics
sum=0.0
sumOfSquares=0.0
stddev=0.0
mean=NaN
{noformat}
*** these are based on the current behavior of the code ... my initial gut 
reaction was that they should all be null, but a quick bit of research says 
that in maths the "empty sum" is defined as "0" -- if you start with that 
premise, then the values for the rest seem correct to me, but I'm definitely 
interested in knowing if there are contrary opinions (is NaN better?)
** included "expected number of stats" asserts in these tests as well - more 
canaries if/when future stats are added.
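
For anyone weighing in on the empty-set values above, here is a minimal sketch of how they fall out of plain floating-point arithmetic once the empty sum is taken to be 0. It is not the StatsComponent code: the class and method names are hypothetical, and the stddev formula (sample standard deviation, guarded to return 0.0 for fewer than two values) is an assumption made for illustration. min and max are left out because they are simply null when there is nothing to compare. The size check in main mirrors the "expected number of stats" canary idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Solr's StatsComponent code.
class EmptyStatsSketch {

    // Computes the numeric-only stats over the given values (possibly none).
    static Map<String, Double> numericStats(double... values) {
        double sum = 0.0;           // the "empty sum" is defined as 0
        double sumOfSquares = 0.0;  // likewise 0 when there are no values
        for (double v : values) {
            sum += v;
            sumOfSquares += v * v;
        }
        long count = values.length;

        // 0.0 / 0 evaluates to NaN in IEEE 754 arithmetic, so the mean of
        // nothing comes out as NaN rather than 0.0 or null.
        double mean = sum / count;

        // Assumption: sample standard deviation, guarded so that fewer than
        // two values yields 0.0 instead of dividing by zero.
        double stddev = 0.0;
        if (count > 1) {
            stddev = Math.sqrt((count * sumOfSquares - sum * sum)
                    / (count * (count - 1.0)));
        }

        Map<String, Double> stats = new LinkedHashMap<>();
        stats.put("sum", sum);
        stats.put("sumOfSquares", sumOfSquares);
        stats.put("stddev", stddev);
        stats.put("mean", mean);
        return stats;
    }

    public static void main(String[] args) {
        Map<String, Double> stats = numericStats(); // nothing to compute on

        // Canary: if a new stat is added to the map, this count check fails
        // and reminds us to assert the new stat's value as well.
        if (stats.size() != 4) {
            throw new AssertionError("unexpected number of stats: " + stats.size());
        }
        System.out.println(stats);
    }
}
```

If the premise were instead that stats over an empty set are undefined, every one of these would be NaN (or null), which is the alternative raised above.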


> LocalParams for enabling/disabling individual stats
> ---------------------------------------------------
>
>                 Key: SOLR-6349
>                 URL: https://issues.apache.org/jira/browse/SOLR-6349
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6349-tflobbe.patch, SOLR-6349-tflobbe.patch, 
> SOLR-6349-tflobbe.patch, SOLR-6349-xu.patch, SOLR-6349-xu.patch, 
> SOLR-6349-xu.patch, SOLR-6349-xu.patch, SOLR-6349.patch, 
> SOLR-6349___bad_idea_broken.patch
>
>
> Stats component currently computes all stats (except for one) every time 
> because they are relatively cheap, and in some cases dependent on each other 
> for distrib computation -- but if we start layering stats on other things it 
> becomes unnecessarily expensive to compute all the stats when they just want 
> the "sum" (and it will definitely become excessively verbose in the 
> responses).  
> The plan here is to use local params to make this configurable.  All of the 
> existing stat options could be modeled as a simple boolean param, but future 
> params (like percentiles) might take in a more complex param value...
> Example:
> {noformat}
> stats.field={!min=true max=true percentiles='99,99.999'}price
> stats.field={!mean=true}weight
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

