[ 
https://issues.apache.org/jira/browse/LUCENE-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565586#comment-13565586
 ] 

Shai Erera commented on LUCENE-4715:
------------------------------------

First, look at the patch; there's a test for that :).

The way it works is that we now roll up counts per dimension. Say you index 
the following dimensions under the same CategoryListParams (CLP): A 
(NO_PARENTS), B (ALL_PARENTS) and C (ALL_BUT_DIMENSION). When you ask to 
aggregate all of them:

* StandardFacetsCollector does not work with NO_PARENTS (not sure if it throws 
a hard exception now, I'll check). So your only choice is 
CountingFacetsCollector.

* CountingFacetsCollector works as follows:
** It collects the matching documents in a FixedBitSet (one per segment).
** It then traverses the counting list and counts all the ordinals that it 
finds.
** Then when it computes the facet results, it goes per FacetRequest:
*** If the FR.categoryPath was indexed with NO_PARENTS ("A" in our case), it 
rolls up only that dimension's ordinals, without touching the rest of the huge 
counts[]. See Mike's test above; this generally improves the process by a bit.
*** Otherwise, no further rollup is needed. "B" will have a count too, while 
the count of "C" itself will be 0 and only its children will be counted.

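To make the per-request rollup concrete, here is a minimal sketch in plain 
Java. It does not use the Lucene facet API: the class RollupSketch, the Policy 
enum, the helper methods and the toy taxonomy (a parents[] array in which 
ordinal 0 is the root) are all hypothetical, and the real 
CountingFacetsCollector may well organize this differently. It only 
illustrates which ordinals each policy writes, and why only "A" needs a rollup 
at request time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names are Lucene APIs.
final class RollupSketch {
    enum Policy { NO_PARENTS, ALL_PARENTS, ALL_BUT_DIMENSION }

    static final int ROOT = 0;

    // Ordinals written for one category of one document under the given policy.
    static List<Integer> encode(int ord, Policy policy, int[] parents) {
        List<Integer> out = new ArrayList<>();
        out.add(ord);
        if (policy == Policy.NO_PARENTS) {
            return out; // only the category itself is written
        }
        for (int p = parents[ord]; p > ROOT; p = parents[p]) {
            boolean isDimension = parents[p] == ROOT;
            if (isDimension && policy == Policy.ALL_BUT_DIMENSION) {
                break; // skip the dimension ordinal
            }
            out.add(p);
        }
        return out;
    }

    // Counting pass: bump counts[] for every ordinal found in the matching docs.
    static int[] countAll(int[][] docs, Policy[] policies, int[] parents) {
        int[] counts = new int[parents.length];
        for (int[] doc : docs) {
            for (int dim = 0; dim < doc.length; dim++) {
                for (int ord : encode(doc[dim], policies[dim], parents)) {
                    counts[ord]++;
                }
            }
        }
        return counts;
    }

    // Rolls child counts up into 'ord', visiting only the subtree below it.
    // Relies on children having larger ordinals than their parents.
    static int rollup(int ord, int[] parents, int[] counts) {
        int sum = counts[ord];
        for (int child = ord + 1; child < parents.length; child++) {
            if (parents[child] == ord) {
                sum += rollup(child, parents, counts);
            }
        }
        counts[ord] = sum;
        return sum;
    }

    static int[] demo() {
        // Ordinals: 0=root, 1=A, 2=A/1, 3=A/1/a, 4=B, 5=B/1, 6=B/1/a,
        //           7=C, 8=C/1, 9=C/1/a
        int[] parents = {-1, 0, 1, 2, 0, 4, 5, 0, 7, 8};
        Policy[] policies = {
            Policy.NO_PARENTS, Policy.ALL_PARENTS, Policy.ALL_BUT_DIMENSION
        };
        // Each document has one category per dimension (A, B, C).
        int[][] docs = {{3, 6, 9}, {3, 6, 9}, {2, 5, 8}};
        int[] counts = countAll(docs, policies, parents);
        rollup(1, parents, counts); // only "A" (NO_PARENTS) needs a rollup
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

Under these assumptions, main prints [0, 3, 3, 2, 3, 3, 2, 0, 3, 2]: "A" 
(ordinal 1) gets its count of 3 only from the rollup, "B" (ordinal 4) already 
has 3 from the counting pass, and "C" (ordinal 7) stays at 0 while its child 
C/1 (ordinal 8) is counted.
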
Hope that explains it.
                
> Add OrdinalPolicy.ALL_BUT_DIMENSION
> -----------------------------------
>
>                 Key: LUCENE-4715
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4715
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-4715.patch
>
>
> With the move of OrdinalPolicy to CategoryListParams, 
> NonTopLevelOrdinalPolicy was nuked. It might be good to restore it, as 
> another enum value of OrdinalPolicy.
> It's the same as ALL_PARENTS, only it doesn't add the dimension ordinal, which 
> could save space as well as computation time. It's good for when you don't 
> care about the count of Date/, but only about its children counts.
