[ 
https://issues.apache.org/jira/browse/LUCENE-10325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17506640#comment-17506640
 ] 

Yuting Gan edited comment on LUCENE-10325 at 3/15/22, 12:22 AM:
----------------------------------------------------------------

Thanks [~gsmiller] for creating this issue!

I provided a default implementation of _getTopDims(int topNDims, int 
topNChildren)_ in the Facets class. It calls the existing 
_getAllDims(topNChildren)_ function and returns the _FacetResult_ for each of 
the requested _topNDims_, along with their _topNChildren_.
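
As a rough illustration of that default behavior, here is a minimal, self-contained sketch. It does not use Lucene: _DimResult_ and _DefaultTopDimsSketch_ are hypothetical stand-ins, and the input list is assumed to be already sorted by count in descending order, as if it came from _getAllDims(topNChildren)_.

```java
import java.util.List;

final class DefaultTopDimsSketch {

  /** Hypothetical stand-in for a per-dim facet result; not Lucene's FacetResult. */
  static final class DimResult {
    final String dim;
    final int count;
    final List<String> topChildren;

    DimResult(String dim, int count, List<String> topChildren) {
      this.dim = dim;
      this.count = count;
      this.topChildren = topChildren;
    }
  }

  /**
   * Keeps the first topNDims entries of a list that is assumed to be sorted by
   * count, descending. Every entry has already been fully built by the caller,
   * so this trims the output but saves no work.
   */
  static List<DimResult> getTopDims(List<DimResult> allDimsSorted, int topNDims) {
    if (topNDims <= 0) {
      throw new IllegalArgumentException("topNDims must be positive");
    }
    return allDimsSorted.subList(0, Math.min(topNDims, allDimsSorted.size()));
  }

  public static void main(String[] args) {
    List<DimResult> all = List.of(
        new DimResult("Author", 6, List.of("Bob", "Lisa")),
        new DimResult("Publish Date", 3, List.of("2010", "2012")),
        new DimResult("Genre", 1, List.of("Fiction")));
    for (DimResult r : getTopDims(all, 2)) {
      System.out.println(r.dim + " (" + r.count + "): " + r.topChildren);
    }
  }
}
```

The point of the sketch is that the trimming happens only after all dims have been resolved, which is the wasted work the issue describes.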

So far, I have only experimented with one overriding implementation of 
_getTopDims_, in _SortedSetDocValuesFacetCounts_, which aims to populate 
_dimCount_ more efficiently. It avoids resolving all child paths 
and creating a _FacetResult_ for every dim when calling _getTopDims_. 
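
The general idea can be sketched as a two-pass selection: first compute a cheap per-dim count, then resolve children only for the dims that make the cut. The code below is an illustration under stated assumptions, not the code in the PR. _LazyTopDimsSketch_ and its nested-map input are hypothetical, and summing child counts stands in for whatever aggregate the real implementation tracks per dim.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class LazyTopDimsSketch {

  /** Orders entries weakest first: lower count, then lexicographically larger label. */
  static final Comparator<Map.Entry<String, Integer>> WEAKEST_FIRST = (a, b) -> {
    int byCount = Integer.compare(a.getValue(), b.getValue());
    return byCount != 0 ? byCount : b.getKey().compareTo(a.getKey());
  };

  /** Returns the n strongest entries, strongest first, using a bounded min-heap. */
  static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
    PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(WEAKEST_FIRST);
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      heap.offer(e);
      if (heap.size() > n) {
        heap.poll(); // drop the current weakest entry
      }
    }
    List<Map.Entry<String, Integer>> best = new ArrayList<>(heap);
    best.sort(WEAKEST_FIRST.reversed());
    return best;
  }

  /** Maps each of the top dims to its top child labels, in descending count order. */
  static Map<String, List<String>> getTopDims(
      Map<String, Map<String, Integer>> countsByDim, int topNDims, int topNChildren) {
    // Pass 1: one aggregate count per dim (assumption: the sum of its child counts).
    Map<String, Integer> dimCount = new HashMap<>();
    for (Map.Entry<String, Map<String, Integer>> e : countsByDim.entrySet()) {
      int sum = 0;
      for (int c : e.getValue().values()) {
        sum += c;
      }
      dimCount.put(e.getKey(), sum);
    }
    // Pass 2: resolve children only for the dims that were selected.
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (Map.Entry<String, Integer> dim : topN(dimCount, topNDims)) {
      List<String> labels = new ArrayList<>();
      for (Map.Entry<String, Integer> child : topN(countsByDim.get(dim.getKey()), topNChildren)) {
        labels.add(child.getKey());
      }
      result.put(dim.getKey(), labels);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Integer>> counts = Map.of(
        "Author", Map.of("Bob", 3, "Lisa", 2, "Frank", 1),
        "Publish Date", Map.of("2010", 2, "2012", 1),
        "Genre", Map.of("Fiction", 1));
    System.out.println(getTopDims(counts, 2, 2));
  }
}
```

With 100 dims and _topNDims_ = 10, the second pass in this sketch runs for 10 dims rather than 100; the first pass still touches every dim, but only to produce a single number.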

I created #747 for this change and would appreciate any feedback. Since this 
change involves a lot of refactoring in SSDVFacetCounts, if it is worthwhile and 
the PR is approved, I can also extend it to ConcurrentSSDVFacetCounts and explore 
other possible optimizations in faceting. Thanks!


was (Author: yutinggan):
Thanks [~gsmiller] for creating issue!

I provided a default implementation of `getTopDims(int topNDims, int 
topNChildren)` in the Facets class 
that calls the existing `getAllDims(topNChildren)` function and returns 
`FacetResult` of the requested `topNDims` and their `topNChildren`.

Currently, I only experimented with one overridden implementation of 
`getTopDims` in `SortedSetDocValuesFacetCounts` that aims to provide a more 
optimal way of populating dimCount. It avoids resolving all child paths and 
creating all FacetResult for every dim when calling `getTopDims`. 

I created #747 for this change and will appreciate any feedback. Since this 
change has a lot of code refactoring in SSDVFacetCounts, if it is worth and the 
PR is approved, I can also expand it to `ConcurrentSSDVFacetCounts` and explore 
other possible optimized implementations in faceting.

> Add getTopDims functionality to Facets
> --------------------------------------
>
>                 Key: LUCENE-10325
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10325
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Greg Miller
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The current {{getAllDims}} functionality is really the only way for users to 
> determine the "top" dimensions in a faceting field (i.e., get the top dims by 
> count along with their top-n children), but it has the unfortunate 
> side-effect of resolving all child paths for every dim, even if the user 
> doesn't intend to use those dims. For example, if a match set contains docs 
> relating to 100 different dims (and various values under each), but the user 
> only wants the top 10 dims with their top 5 children, they can call 
> {{getAllDims(5)}} then just grab the first 10 results, but a lot of wasted work 
> has been done for the other 90 dims.
> It would be nice to implement something like {{getTopDims(int numDims, int 
> numChildren)}} that would only do the work necessary to resolve {{numDims}} 
> dims instead of all dims.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

