[
https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-4600:
---------------------------------------
Attachment: LUCENE-4600-cli.patch
bq. Also, you can obtain the right IntDecoder from the CLP for decoding the
ordinals. That would remove the hard dependency on VInt+gap, and allow e.g. to
use a PackedInts decoder.
I tried this: the attached patch changes CountingFacetsCollector to
use CategoryListIterator. Alas, those abstractions are apparently
costing us in this hotspot (unless I screwed something up in the
patch? E.g., that null I pass is kinda spooky!):
{noformat}
Task       QPS base  StdDev  QPS comp  StdDev  Pct diff
HighTerm       0.86  (4.7%)      0.56  (0.4%)  -34.4% ( -37% - -30%)
MedTerm        5.85  (1.0%)      5.04  (0.5%)  -13.9% ( -15% - -12%)
LowTerm       11.82  (0.6%)     11.02  (0.5%)   -6.8% (  -7% -  -5%)
{noformat}
base is the original CountingFacetsCollector and comp is the new one
using the CategoryListIterator API.
I think we should try to invoke specialized collectors when possible?
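To make the trade-off concrete, here is a minimal, self-contained sketch of the two styles being compared. It is not the code from the attached patch and uses no Lucene classes: VIntGapSketch, OrdinalSource, countInline and countViaSource are hypothetical names, and the byte layout (low 7 bits first, high bit as continuation flag) is an assumption that may differ from the facet module's actual encoding. It only illustrates why a hard-wired VInt+gap loop has less per-ordinal overhead than going through a decoder/iterator abstraction.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

final class VIntGapSketch {

    /** Hypothetical stand-in for an iterator/decoder abstraction; returns -1 when exhausted. */
    interface OrdinalSource {
        int next();
    }

    /** Encodes sorted, non-negative ordinals as gaps, each gap written as a VInt (assumed layout). */
    static byte[] encode(int[] sortedOrdinals) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrdinals) {
            int gap = ord - prev;
            prev = ord;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // 7 payload bits, continuation bit set
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    /** Specialized path: decode and count in one tight loop, no intermediate objects or calls. */
    static void countInline(byte[] buf, int[] counts) {
        int ord = 0, value = 0, shift = 0;
        for (byte b : buf) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) != 0) {
                shift += 7;          // more bytes belong to this gap
            } else {
                ord += value;        // undo the gap encoding
                counts[ord]++;
                value = 0;
                shift = 0;
            }
        }
    }

    /** Generic path: the same decoding hidden behind an interface, one call per ordinal. */
    static OrdinalSource decoder(byte[] buf) {
        return new OrdinalSource() {
            int pos = 0, ord = 0;

            public int next() {
                if (pos >= buf.length) return -1;
                int value = 0, shift = 0;
                byte b;
                do {
                    b = buf[pos++];
                    value |= (b & 0x7F) << shift;
                    shift += 7;
                } while ((b & 0x80) != 0);
                ord += value;
                return ord;
            }
        };
    }

    static void countViaSource(OrdinalSource src, int[] counts) {
        int ord;
        while ((ord = src.next()) != -1) counts[ord]++;
    }

    public static void main(String[] args) {
        byte[] buf = encode(new int[] {3, 3, 7, 200, 20000});
        int[] inline = new int[20001];
        int[] generic = new int[20001];
        countInline(buf, inline);
        countViaSource(decoder(buf), generic);
        System.out.println("paths agree: " + Arrays.equals(inline, generic));
    }
}
```

Both paths produce the same counts; the difference is only in how much indirection sits inside the per-ordinal loop, which is the kind of cost the benchmark above appears to be measuring.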
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
> Key: LUCENE-4600
> URL: https://issues.apache.org/jira/browse/LUCENE-4600
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
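The difference between the two approaches in the issue description can be sketched as follows. This is only an illustration under simplifying assumptions, not the facet module's code: CollectTimeAggregationSketch, twoPass and onePass are hypothetical names, per-document ordinals are held in a plain int[][], each hit is assumed to be collected exactly once, and scores are ignored.

```java
import java.util.Arrays;
import java.util.BitSet;

final class CollectTimeAggregationSketch {

    /** Two-pass style: remember hits in a bitset while collecting, aggregate afterwards. */
    static int[] twoPass(int[] hits, int[][] ordinalsByDoc, int numOrdinals) {
        BitSet matched = new BitSet(ordinalsByDoc.length);
        for (int doc : hits) {
            matched.set(doc);                      // pass 1: collection only records the hit
        }
        int[] counts = new int[numOrdinals];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            for (int ord : ordinalsByDoc[doc]) {
                counts[ord]++;                     // pass 2: aggregation over the stored hits
            }
        }
        return counts;
    }

    /** One-pass style: aggregate as each hit is collected, so no bitset is kept around. */
    static int[] onePass(int[] hits, int[][] ordinalsByDoc, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            for (int ord : ordinalsByDoc[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] ordinalsByDoc = { {0, 2}, {1}, {0, 1, 2}, {} };
        int[] hits = {0, 2, 3};
        System.out.println(Arrays.toString(twoPass(hits, ordinalsByDoc, 3)));
        System.out.println(Arrays.toString(onePass(hits, ordinalsByDoc, 3)));
    }
}
```

The one-pass variant needs no transient per-hit storage at all; if scores were being aggregated, it would likewise avoid the float[] mentioned above.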