[ 
https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6351:
---------------------------
    Attachment: SOLR-6351.patch


bq. It looks like Vitaliy's new code doesn't account for stats returned by a 
shard in response to refinement requests.

I updated DistributedFacetPivotLongTailTest to check stats on a request where 
refinement was required for correct results and was able to reliably reproduce 
the problem.

Looking at the logs i realized that in the refinement requests "stats=false" 
was explicitly set, and traced this to an optimization in StatsComponent -- it 
was assuming that only the initial (PURPOSE_GET_TOP_IDS) shard request needed 
stats computed, so i modified that to recognize (PURPOSE_REFINE_PIVOT_FACETS) 
as another situation where we need to leave stats=true.
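
A minimal, self-contained sketch of the kind of purpose check described above. 
The flag values and the two helper methods are hypothetical stand-ins for 
illustration only; they are not the real ShardRequest constants or the actual 
StatsComponent code in the patch.

```java
public class StatsPurposeSketch {
    // Hypothetical bit flags standing in for the shard request purposes;
    // the real constant values in Solr may differ.
    static final int PURPOSE_GET_TOP_IDS = 0x1;
    static final int PURPOSE_GET_FIELDS = 0x2;
    static final int PURPOSE_REFINE_PIVOT_FACETS = 0x4;

    /** Old assumption: only the initial top-ids request needs stats computed. */
    static boolean leaveStatsEnabledOld(int purpose) {
        return (purpose & PURPOSE_GET_TOP_IDS) != 0;
    }

    /** Modified check: pivot refinement requests also leave stats=true. */
    static boolean leaveStatsEnabledNew(int purpose) {
        int needsStats = PURPOSE_GET_TOP_IDS | PURPOSE_REFINE_PIVOT_FACETS;
        return (purpose & needsStats) != 0;
    }

    public static void main(String[] args) {
        // A refinement request is forced to stats=false by the old check,
        // but keeps stats=true with the modified one.
        System.out.println(leaveStatsEnabledOld(PURPOSE_REFINE_PIVOT_FACETS));
        System.out.println(leaveStatsEnabledNew(PURPOSE_REFINE_PIVOT_FACETS));
        // Unrelated purposes are still suppressed either way.
        System.out.println(leaveStatsEnabledNew(PURPOSE_GET_FIELDS));
    }
}
```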

This gets the tests to pass (still hammering on TestCloudPivot, but so far 
looks good) but i'm not really liking this solution, for reasons i noted in a 
StatsComponent nocommit comment...

{noformat}
// nocommit: PURPOSE_REFINE_PIVOT_FACETS by itself shouldn't be enough for this...
//
// we need to check if the pivots actually have stats hanging off of them,
// if they don't then we still should suppress the stats param
// (no need to compute the top level stats over and over)
//
// actually ... even if we do have stats hanging off of pivots,
// we need to make stats component smart enough not to waste time re-computing
// top level stats on every refinement request.
//
// so maybe StatsComponent should be left alone, and FacetComponent's prepare method
// should be modified so that *if* isShard && there are pivot refine params && those
// pivots have stats, then set some variable so that stats logic happens even if stats=false?
{noformat}
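
The alternative floated in that comment boils down to a small decision rule. 
The sketch below is only an illustration under assumed names: the method and 
its boolean inputs are hypothetical, and nothing like it exists in the patch.

```java
public class PivotStatsDecisionSketch {
    /**
     * Hypothetical decision rule: run the stats logic if stats=true was
     * requested, or if this is a shard request carrying pivot refinement
     * params whose pivots actually have stats hanging off of them.
     */
    static boolean shouldRunStatsLogic(boolean statsParam,
                                       boolean isShard,
                                       boolean hasPivotRefineParams,
                                       boolean refinedPivotsHaveStats) {
        if (statsParam) {
            return true;
        }
        return isShard && hasPivotRefineParams && refinedPivotsHaveStats;
    }

    public static void main(String[] args) {
        // Refinement request with stats=false, but the pivots carry stats.
        System.out.println(shouldRunStatsLogic(false, true, true, true));
        // Refinement request whose pivots have no stats: nothing to compute.
        System.out.println(shouldRunStatsLogic(false, true, true, false));
    }
}
```

This only covers whether stats logic runs at all; it does not address the 
separate concern of re-computing top level stats on every refinement request.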

I also made a few other various changes as i was reviewing the test (noted 
below).

My plan is to move forward and continue reviewing more of the patch, starting 
with the other tests, and then dig into the code changes -- writing additional 
test cases if/when i notice things that look like they may not be adequately 
covered -- and then come back and revisit the question of the "stats=false" 
during refinement requests later (i'm certainly open to suggestions).

{panel:title=changes in this iteration of the patch}
* TestCloudPivots
** removed the bogus param "cleanup" vitaliy mentioned
** added some nocommits as reminders for the future
* DistributedFacetPivotLongTailTest
** refactored query + assertFieldStats into "doTestDeepPivotStats" 
*** only called once, and wasn't general in any way - only usable for checking 
one specific query
** renamed "foo_i" to "stat_i" so it's a bit more obvious why that field is 
there
** added stat_i to some long tail docs & updated the existing stats assertions
** modified existing query that required refinement to show stats aren't correct
*** "bbb0" on shard2 only gets included with refinement, but the "min" stat 
trivially demonstrates that the "-1" from shard2 isn't included 
* StatsComponent
** check for PURPOSE_REFINE_PIVOT_FACETS in modifyRequest.

*NOTE:* This patch is significantly smaller than the last one because i 
generated it using "svn diff -x --ignore-all-space" to suppress a bunch of small 
formatting changes from the previous patches.
{panel}


> Let Stats Hang off of Pivots (via 'tag')
> ----------------------------------------
>
>                 Key: SOLR-6351
>                 URL: https://issues.apache.org/jira/browse/SOLR-6351
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, 
> SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, 
> SOLR-6351.patch, SOLR-6351.patch
>
>
> The goal here is basically to flip the notion of "stats.facet" on its head, so 
> that instead of asking the stats component to also do some faceting 
> (something that's never worked well with the variety of field types and has 
> never worked in distributed mode) we instead ask the PivotFacet code to 
> compute some stats X for each leaf in a pivot.  We'll do this with the 
> existing {{stats.field}} params, but we'll leverage the {{tag}} local param 
> of the {{stats.field}} instances to be able to associate which stats we want 
> hanging off of which {{facet.pivot}}
> Example...
> {noformat}
> facet.pivot={!stats=s1}category,manufacturer
> stats.field={!key=avg_price tag=s1 mean=true}price
> stats.field={!tag=s1 min=true max=true}user_rating
> {noformat}
> ...with the request above, in addition to computing the min/max user_rating 
> and mean price (labeled "avg_price") over the entire result set, the 
> PivotFacet component will also include those stats for every node of the tree 
> it builds up when generating a pivot of the fields "category,manufacturer"



