[
https://issues.apache.org/jira/browse/SOLR-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-6349:
---------------------------
Attachment: SOLR-6349.patch
Starting to get back into this; here's a quick checkpoint of some small progress.
Step #1: This new patch brings Xu's latest patch up to date with trunk using
the minimal changes that seemed to work -- in particular: I haven't started
really digging into the code changes other than getting things to compile &
tests to pass.
Step #2...
My main focus for now is making sure the tests are rock solid & all inclusive
so we can then iterate on the code changes (see earlier comments about my concerns
with spreading the logic around). Only 2 noticeable changes in this patch...
* Fixed FacetPivotSmallTest.testPivotFacetStatsUnsortedTagged
** was prematurely specifying 'mean=true' but then trying to assert that all
stats were returned
** beefed this up to also assert that it got an expected number of stats - if
we add more stats in the future, this will be a canary that the test needs
to be updated to assert the correct values for these new stats.
* StatsComponentTest
** added more asserts to the 3
testFieldStatisticsResults_TYPE_FieldAlwaysMissing tests to ensure expected
values for all stats (when there is nothing to compute stats on)...{noformat}
// numerics & strings & dates
min=null
max=null
// just numerics
sum=0.0
sumOfSquares=0.0
stddev=0.0
mean=NaN
{noformat}
*** these are based on the current behavior of the code ... my initial gut
reaction was that they should all be null, but a quick bit of research says
that in maths the "empty sum" is defined as "0" -- if you start with that
premise, then the values for the rest seem correct to me, but I'm definitely
interested in knowing if there are contrary opinions (is NaN better?)
** included "expected number of stats" asserts in these tests as well - more
canaries if/when future stats are added.
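To illustrate why the empty-set values listed above hang together, here is a minimal standalone sketch. It is not the StatsComponent code: the class and method names are hypothetical, and the stddev guard (returning 0.0 when there are fewer than two values) is an assumption made for illustration. It starts from the "empty sum is 0" premise and also shows the kind of "expected number of stats" canary check described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; NOT Solr's StatsComponent implementation. */
public class EmptyStatsSketch {

    /** Computes min/max/sum/sumOfSquares/stddev/mean over zero or more values. */
    public static Map<String, Double> numericStats(double[] values) {
        double sum = 0.0;           // the "empty sum" is defined as 0
        double sumOfSquares = 0.0;  // likewise an empty sum
        Double min = null;          // no values seen => no min/max
        Double max = null;
        for (double v : values) {
            sum += v;
            sumOfSquares += v * v;
            min = (min == null) ? v : Math.min(min, v);
            max = (max == null) ? v : Math.max(max, v);
        }
        long count = values.length;

        // With no values this is 0.0 / 0, which is NaN in double arithmetic.
        double mean = sum / count;

        // Assumption: sample standard deviation, guarded to 0.0 when count <= 1.
        double stddev = (count <= 1)
            ? 0.0
            : Math.sqrt((count * sumOfSquares - sum * sum) / (count * (count - 1.0)));

        Map<String, Double> stats = new LinkedHashMap<>();
        stats.put("min", min);
        stats.put("max", max);
        stats.put("sum", sum);
        stats.put("sumOfSquares", sumOfSquares);
        stats.put("stddev", stddev);
        stats.put("mean", mean);
        return stats;
    }

    public static void main(String[] args) {
        Map<String, Double> stats = numericStats(new double[0]);
        // Canary: if a new stat is added, this count check fails and the
        // per-stat expectations need to be extended.
        if (stats.size() != 6) {
            throw new AssertionError("unexpected number of stats: " + stats.size());
        }
        System.out.println(stats);
    }
}
```

Under these assumptions, only mean comes out as NaN, because it is the one value that divides by the (zero) count; whether null would be a better answer there is exactly the open question above.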
> LocalParams for enabling/disabling individual stats
> ---------------------------------------------------
>
> Key: SOLR-6349
> URL: https://issues.apache.org/jira/browse/SOLR-6349
> Project: Solr
> Issue Type: Sub-task
> Reporter: Hoss Man
> Attachments: SOLR-6349-tflobbe.patch, SOLR-6349-tflobbe.patch,
> SOLR-6349-tflobbe.patch, SOLR-6349-xu.patch, SOLR-6349-xu.patch,
> SOLR-6349-xu.patch, SOLR-6349-xu.patch, SOLR-6349.patch,
> SOLR-6349___bad_idea_broken.patch
>
>
> Stats component currently computes all stats (except for one) every time
> because they are relatively cheap, and in some cases dependent on each other
> for distrib computation -- but if we start layering stats on other things it
> becomes unnecessarily expensive to compute all the stats when they just want
> the "sum" (and it will definitely become excessively verbose in the
> responses).
> The plan here is to use local params to make this configurable. All of the
> existing stat options could be modeled as a simple boolean param, but future
> params (like percentiles) might take in a more complex param value...
> Example:
> {noformat}
> stats.field={!min=true max=true percentiles='99,99.999'}price
> stats.field={!mean=true}weight
> {noformat}