[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558738#comment-13558738 ]
Michael McCandless commented on LUCENE-4600:
--------------------------------------------
ALL_PARENTS StandardFacetsCollector (base) vs CountingFacetsCollector (comp):
{noformat}
Task              QPS base  StdDev  QPS comp  StdDev  Pct diff
Respell              55.89  (3.2%)     55.13  (3.9%)   -1.4% ( -8% -  5%)
PKLookup            207.52  (1.6%)    206.95  (1.4%)   -0.3% ( -3% -  2%)
Wildcard             62.22  (3.2%)     62.94  (2.7%)    1.2% ( -4% -  7%)
IntNRQ               17.88  (5.2%)     18.16  (5.7%)    1.6% ( -8% - 13%)
Prefix3              45.56  (4.9%)     46.48  (4.1%)    2.0% ( -6% - 11%)
HighSloppyPhrase      0.80  (9.7%)      0.84  (8.5%)    4.9% (-12% - 25%)
HighPhrase           13.52  (7.7%)     15.09  (8.1%)   11.6% ( -3% - 29%)
LowSloppyPhrase      15.02  (3.9%)     17.15  (4.0%)   14.1% (  5% - 22%)
LowPhrase            14.14  (4.3%)     16.77  (4.9%)   18.6% (  8% - 29%)
MedSloppyPhrase      14.81  (2.6%)     18.33  (2.7%)   23.7% ( 17% - 29%)
Fuzzy2               27.57  (2.6%)     34.95  (3.1%)   26.8% ( 20% - 33%)
AndHighHigh           9.39  (1.6%)     11.92  (1.4%)   27.0% ( 23% - 30%)
MedTerm              14.63  (2.2%)     18.89  (1.7%)   29.1% ( 24% - 33%)
HighTerm              5.28  (1.8%)      7.02  (2.4%)   33.0% ( 28% - 37%)
Fuzzy1               20.79  (2.1%)     27.71  (2.8%)   33.3% ( 27% - 39%)
OrHighLow             4.82  (1.8%)      6.70  (2.6%)   39.1% ( 34% - 44%)
OrHighMed             4.74  (1.8%)      6.61  (3.0%)   39.4% ( 34% - 44%)
OrHighHigh            2.68  (1.8%)      3.77  (2.9%)   40.9% ( 35% - 46%)
MedPhrase            39.21  (3.6%)     55.35  (3.6%)   41.2% ( 32% - 50%)
AndHighMed           36.29  (3.5%)     51.92  (2.0%)   43.1% ( 36% - 50%)
LowTerm              27.96  (3.2%)     41.47  (2.2%)   48.3% ( 41% - 55%)
AndHighLow           64.36  (5.4%)    107.94  (5.7%)   67.7% ( 53% - 83%)
MedSpanNear          70.17  (6.1%)    123.23  (7.4%)   75.6% ( 58% - 94%)
LowSpanNear          70.35  (6.0%)    123.59  (7.1%)   75.7% ( 58% - 94%)
HighSpanNear         70.35  (6.1%)    123.69  (7.8%)   75.8% ( 58% - 95%)
{noformat}
These are nice gains!
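
For anyone reading the table: the "Pct diff" column appears to be the relative change of the comp QPS over the base QPS. Here is a minimal sketch of that arithmetic, assuming exactly that definition. PctDiff and pctDiff are hypothetical names for illustration, not part of the benchmark tooling, and the sketch does not try to reproduce the range in parentheses.

```java
import java.util.Locale;

// Hypothetical helper (not part of any benchmark tool): recomputes the
// "Pct diff" column from the rounded QPS values printed in the table.
class PctDiff {
    // Relative change of comp over base, in percent.
    static double pctDiff(double qpsBase, double qpsComp) {
        return 100.0 * (qpsComp - qpsBase) / qpsBase;
    }

    public static void main(String[] args) {
        // QPS base / QPS comp pairs copied from the table rows.
        double[][] rows = {
            {55.89, 55.13},   // Respell
            {64.36, 107.94},  // AndHighLow
            {0.80, 0.84},     // HighSloppyPhrase
        };
        for (double[] row : rows) {
            System.out.println(String.format(Locale.ROOT, "%.1f%%", pctDiff(row[0], row[1])));
        }
    }
}
```

Because the table prints QPS rounded to two decimals, recomputing from the printed values can be off by a tenth of a percent on the slowest tasks (HighSloppyPhrase is one such row).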
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
> Key: LUCENE-4600
> URL: https://issues.apache.org/jira/browse/LUCENE-4600
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Assignee: Shai Erera
> Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
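
To make the two strategies in the description concrete, here is a toy sketch in plain Java. It is not the Lucene facet API: FacetCountingSketch, DOC_ORDINALS and both method names are made up for illustration, and each document is assumed to carry a small array of category ordinals.

```java
import java.util.Arrays;
import java.util.BitSet;

// Toy model, not the Lucene facet API: "collecting" a hit just means being
// handed its doc ID, and every doc has a few hypothetical category ordinals.
class FacetCountingSketch {
    // Hypothetical per-document category ordinals (index = doc ID).
    static final int[][] DOC_ORDINALS = {
        {0, 2}, {1}, {0, 1}, {2}, {0}
    };
    static final int NUM_ORDINALS = 3;

    // Strategy 1: remember every hit in a bitset, then aggregate in a 2nd pass.
    static int[] countAfterCollection(int[] hits) {
        BitSet matched = new BitSet(DOC_ORDINALS.length);
        for (int doc : hits) {
            matched.set(doc); // first pass: only record the hit
        }
        int[] counts = new int[NUM_ORDINALS];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            for (int ord : DOC_ORDINALS[doc]) {
                counts[ord]++; // second pass: the actual aggregation
            }
        }
        return counts;
    }

    // Strategy 2: aggregate as each hit is collected, with no per-hit state.
    static int[] countDuringCollection(int[] hits) {
        int[] counts = new int[NUM_ORDINALS];
        for (int doc : hits) {
            for (int ord : DOC_ORDINALS[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] hits = {0, 2, 3};
        System.out.println(Arrays.toString(countAfterCollection(hits)));
        System.out.println(Arrays.toString(countDuringCollection(hits)));
    }
}
```

Both methods return the same counts; the second one simply never allocates the bitset (or, in the scoring case, a float[] of scores).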