[ https://issues.apache.org/jira/browse/LUCENE-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576144#comment-13576144 ]
Shai Erera commented on LUCENE-4769:
------------------------------------
FacetsAggregator is an abstraction of the facets package that lets you compute
different functions on the aggregated ordinals. E.g. counting is equivalent to
#sum(1), while SumScoreFacetsAggregator does #sum(score) etc.
You're right that this could be implemented as a Codec, and then we wouldn't even
need to alert the user that if he uses that caching method, he should use
DiskValuesFormat. But it looks like an awkward decision to me. Usually, caching
does not force you to index stuff in a specific way. Rather, you decide at
runtime whether you want to cache the data or not. You can even choose to stop
using the cache while the app is running. Also, it's odd that if the app already
indexed documents with the default Codec, it won't be able to use this caching
method unless it reindexes, or until those segments are merged (b/c their
DVFormat will be different, and so the aggregator would need to fall back to a
different counting code path).
I dunno ... it's certainly doable, but it doesn't feel right to me.
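
For illustration, this is roughly what I mean by deciding at runtime. It is a sketch only: ToggleableOrdinalsSource and decodeFromIndex are hypothetical names, not part of the patch or of any Lucene API, and the decoder is just a stand-in for whatever reads the per-document ordinals from the index.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.IntFunction;

/** Hypothetical illustration only; none of these names are Lucene APIs. */
final class ToggleableOrdinalsSource {
    // Stand-in for decoding a document's ordinals from whatever the index stores.
    private final IntFunction<int[]> decodeFromIndex;
    private final int maxDoc;
    // null means "no cache": reads go straight to the decoder.
    private final AtomicReference<int[][]> cache = new AtomicReference<>();

    ToggleableOrdinalsSource(IntFunction<int[]> decodeFromIndex, int maxDoc) {
        this.decodeFromIndex = decodeFromIndex;
        this.maxDoc = maxDoc;
    }

    /** Decodes every document once and keeps the result in RAM. */
    void enableCache() {
        int[][] decoded = new int[maxDoc][];
        for (int doc = 0; doc < maxDoc; doc++) {
            decoded[doc] = decodeFromIndex.apply(doc);
        }
        cache.set(decoded);
    }

    /** Drops the cache; later reads decode from the index again. */
    void disableCache() {
        cache.set(null);
    }

    /** Returns the ordinals of a document, from the cache when one is enabled. */
    int[] ordinals(int doc) {
        int[][] cached = cache.get();
        return cached != null ? cached[doc] : decodeFromIndex.apply(doc);
    }
}
```

Nothing here depends on how the documents were indexed, so the app can turn the cache on or off without reindexing.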
> Add a CountingFacetsAggregator which reads ordinals from a cache
> ----------------------------------------------------------------
>
> Key: LUCENE-4769
> URL: https://issues.apache.org/jira/browse/LUCENE-4769
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/facet
> Reporter: Shai Erera
> Assignee: Shai Erera
> Attachments: LUCENE-4769.patch
>
>
> Mike wrote a prototype of a FacetsCollector which reads ordinals from a
> CachedInts structure on LUCENE-4609. I ported it to the new facets API, as a
> FacetsAggregator. I think we should offer users the means to use such a
> cache, even if it consumes more RAM. Mike's tests show that this cache consumed
> 2x more RAM than if the DocValues were loaded into memory in their raw form.
> Also, a PackedInts version of such a cache took almost the same amount of RAM
> as straight int[], but the gains were minor.
> I will post the patch shortly.