[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558247#comment-13558247 ]
Michael McCandless commented on LUCENE-4600:
--------------------------------------------
Base = DecoderCountingFacetsCollector; comp=CountingFacetsCollector:
{noformat}
Task        QPS base  StdDev  QPS comp  StdDev  Pct diff
HighTerm       25.67  (1.6%)     30.45  (1.9%)   18.6% ( 14% -  22%)
LowTerm       145.87  (1.0%)    154.38  (0.8%)    5.8% (  4% -   7%)
MedTerm        44.45  (1.4%)     51.01  (1.5%)   14.8% ( 11% -  17%)
PKLookup      240.08  (0.9%)    239.94  (1.0%)   -0.1% ( -1% -   1%)
{noformat}
So it seems like the IntDecoder abstractions hurt ...
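For anyone checking the numbers: the "Pct diff" column appears to be the relative change of the comp QPS over the base QPS. A minimal sketch of that arithmetic follows. The class and method names are made up for illustration and are not part of the benchmark tool. The range in parentheses, which also takes the StdDev columns into account, is not reproduced here.

```java
// Hypothetical helper, not part of Lucene or the benchmark harness.
class PctDiff {
    /** Relative change of comp over base, in percent. */
    static double pctDiff(double baseQps, double compQps) {
        return 100.0 * (compQps - baseQps) / baseQps;
    }

    public static void main(String[] args) {
        // QPS values copied from the first table in this comment.
        String[] tasks = {"HighTerm", "LowTerm", "MedTerm", "PKLookup"};
        double[] base = {25.67, 145.87, 44.45, 240.08};
        double[] comp = {30.45, 154.38, 51.01, 239.94};
        for (int i = 0; i < tasks.length; i++) {
            System.out.printf("%-10s %6.1f%%%n", tasks[i], pctDiff(base[i], comp[i]));
        }
    }
}
```

Rounded to one decimal place, the results agree with the Pct diff column of the table above.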
Base = DecoderCountingFacetsCollector; comp=PostCollectionCountingFacetsCollector:
{noformat}
Task        QPS base  StdDev  QPS comp  StdDev  Pct diff
HighTerm       30.46  (0.8%)     30.16  (2.1%)   -1.0% ( -3% -   2%)
LowTerm       142.89  (0.5%)    153.94  (0.8%)    7.7% (  6% -   9%)
MedTerm        50.46  (0.8%)     50.65  (1.8%)    0.4% ( -2% -   2%)
PKLookup      238.65  (1.1%)    238.55  (0.9%)   -0.0% ( -2% -   2%)
{noformat}
This is very interesting! And good news for sampling?
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
> Key: LUCENE-4600
> URL: https://issues.apache.org/jira/browse/LUCENE-4600
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
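
To make the two strategies in the issue description concrete, here is a minimal, self-contained sketch. It does not use the Lucene facet API: the class names, the `DOC_ORDINALS` array (a stand-in for each document's category ordinals), and the `collect`/`getCounts` methods are all assumptions made for illustration. The first counter buffers hits in a bitset and aggregates in a second pass, and the second aggregates as each hit is collected.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical sketch; none of these types exist in Lucene.
class FacetCountingSketch {
    // Stand-in for per-document category ordinals (doc id -> ordinals).
    static final int[][] DOC_ORDINALS = { {0, 2}, {1}, {0, 1, 2}, {2} };
    static final int NUM_ORDINALS = 3;

    /** Two-pass: remember hits in a bitset, aggregate when counts are requested. */
    static final class PostCollectionCounter {
        private final BitSet hits = new BitSet(DOC_ORDINALS.length);

        void collect(int doc) {
            hits.set(doc);
        }

        int[] getCounts() {
            int[] counts = new int[NUM_ORDINALS];
            // Second pass over all buffered hits.
            for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
                for (int ord : DOC_ORDINALS[doc]) {
                    counts[ord]++;
                }
            }
            return counts;
        }
    }

    /** Single-pass: aggregate while collecting, so no per-hit buffer is kept. */
    static final class InlineCounter {
        private final int[] counts = new int[NUM_ORDINALS];

        void collect(int doc) {
            for (int ord : DOC_ORDINALS[doc]) {
                counts[ord]++;
            }
        }

        int[] getCounts() {
            return counts.clone();
        }
    }

    /** Feeds the same hits to both counters; returns {two-pass counts, single-pass counts}. */
    static int[][] run(int... hitDocs) {
        PostCollectionCounter twoPass = new PostCollectionCounter();
        InlineCounter inline = new InlineCounter();
        for (int doc : hitDocs) {
            twoPass.collect(doc);
            inline.collect(doc);
        }
        return new int[][] { twoPass.getCounts(), inline.getCounts() };
    }

    public static void main(String[] args) {
        int[][] result = run(0, 2, 3);
        System.out.println("two-pass:    " + Arrays.toString(result[0]));
        System.out.println("single-pass: " + Arrays.toString(result[1]));
    }
}
```

Both counters return the same counts for the same hits. The difference is only in what is held between collection and aggregation: the two-pass variant keeps the bitset (and, if scores were aggregated, would also keep a float[]), while the single-pass variant keeps only the counts array.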