[
https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558295#comment-13558295
]
Michael McCandless commented on LUCENE-4600:
--------------------------------------------
StandardFacetsCollector (base) vs DecoderCountingFacetsCollector (comp):
{noformat}
Task        QPS base  StdDev    QPS comp  StdDev    Pct diff
HighTerm       21.44  (1.4%)       25.71  (1.3%)    19.9% ( 16% -  22%)
LowTerm        99.73  (3.2%)      145.71  (1.2%)    46.1% ( 40% -  52%)
MedTerm        35.13  (1.6%)       44.46  (1.1%)    26.6% ( 23% -  29%)
PKLookup      241.15  (1.0%)      238.90  (1.0%)    -0.9% ( -2% -   1%)
{noformat}
StandardFacetsCollector (base) vs PostCollectionCountingFacetsCollector (comp):
{noformat}
Task        QPS base  StdDev    QPS comp  StdDev    Pct diff
HighTerm       21.26  (0.9%)       31.36  (1.4%)    47.5% ( 44% -  50%)
LowTerm        99.84  (3.2%)      159.17  (0.7%)    59.4% ( 53% -  65%)
MedTerm        34.91  (1.3%)       52.65  (1.2%)    50.8% ( 47% -  54%)
PKLookup      238.08  (1.3%)      238.26  (1.2%)     0.1% ( -2% -   2%)
{noformat}
StandardFacetsCollector (base) vs CountingFacetsCollector (comp):
{noformat}
Task        QPS base  StdDev    QPS comp  StdDev    Pct diff
HighTerm       21.35  (1.3%)       30.26  (2.9%)    41.7% ( 37% -  46%)
LowTerm       100.45  (4.0%)      153.26  (1.1%)    52.6% ( 45% -  60%)
MedTerm        35.02  (1.9%)       50.77  (2.0%)    45.0% ( 40% -  49%)
PKLookup      237.88  (2.4%)      239.34  (0.9%)     0.6% ( -2% -   4%)
{noformat}
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
> Key: LUCENE-4600
> URL: https://issues.apache.org/jira/browse/LUCENE-4600
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch,
> LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
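
To make the two strategies in the issue description concrete, here is a minimal Java sketch. It is not the facet module's code: the class name FacetCountingSketch, both method names, and the int[][] docOrdinals lookup (document id to category ordinals) are hypothetical stand-ins for illustration, and it assumes each hit is reported exactly once, as a collector would see it. It only contrasts buffering hits in a bit set and aggregating afterwards with counting as each hit arrives.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch; none of these names are Lucene APIs. */
final class FacetCountingSketch {

    /** Two-pass approach: record hits in a bit set, then aggregate afterwards. */
    static int[] countAfterCollection(int[] hits, int[][] docOrdinals, int numOrdinals) {
        // Pass 1: remember every matching document (the transient RAM cost).
        BitSet matched = new BitSet(docOrdinals.length);
        for (int doc : hits) {
            matched.set(doc);
        }
        // Pass 2: walk the recorded hits and count their category ordinals.
        int[] counts = new int[numOrdinals];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    /** Single-pass approach: count ordinals as each hit arrives, with no hit buffer. */
    static int[] countDuringCollection(int[] hits, int[][] docOrdinals, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: four documents, three category ordinals.
        int[][] docOrdinals = { {0, 2}, {1}, {0, 1}, {2} };
        int[] hits = {0, 2, 3};
        System.out.println(Arrays.toString(countAfterCollection(hits, docOrdinals, 3)));
        System.out.println(Arrays.toString(countDuringCollection(hits, docOrdinals, 3)));
    }
}
```

Both methods return the same counts for the same hits; the single-pass variant simply never allocates the bit set (or, if scores were aggregated, the float[]).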