[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558752#comment-13558752 ]
Michael McCandless commented on LUCENE-4600:
--------------------------------------------
NO_PARENTS CountingFacetsCollector vs itself (i.e. all differences are noise).
Use the absolute QPS to compare against the "QPS comp" column above, e.g. MedTerm
was 18.89 QPS above with ALL_PARENTS, and with NO_PARENTS MedTerm is 22.67-22.80 QPS:
{noformat}
Task              QPS base   StdDev  QPS comp   StdDev  Pct diff
AndHighLow           85.20   (5.0%)     83.74   (5.7%)  -1.7% (-11% -  9%)
LowSpanNear          95.25   (5.5%)     93.67   (6.8%)  -1.7% (-13% - 11%)
HighSpanNear         95.19   (5.4%)     93.80   (6.7%)  -1.5% (-12% - 11%)
MedSpanNear          94.97   (5.4%)     93.59   (6.8%)  -1.5% (-12% - 11%)
AndHighMed           45.68   (2.8%)     45.29   (2.9%)  -0.9% ( -6% -  4%)
OrHighLow             7.62   (2.2%)      7.55   (2.2%)  -0.8% ( -5% -  3%)
OrHighHigh            4.33   (2.2%)      4.29   (2.2%)  -0.8% ( -5% -  3%)
LowTerm              38.17   (2.0%)     37.90   (2.2%)  -0.7% ( -4% -  3%)
OrHighMed             7.54   (2.2%)      7.49   (2.1%)  -0.7% ( -4% -  3%)
Prefix3              45.95   (4.3%)     45.68   (4.4%)  -0.6% ( -8% -  8%)
MedTerm              22.80   (2.2%)     22.67   (2.1%)  -0.6% ( -4% -  3%)
Fuzzy1               26.16   (1.9%)     26.04   (2.0%)  -0.4% ( -4% -  3%)
IntNRQ               17.94   (6.1%)     17.86   (6.2%)  -0.4% (-11% - 12%)
AndHighHigh          12.33   (1.2%)     12.29   (1.3%)  -0.4% ( -2% -  2%)
Fuzzy2               32.00   (2.8%)     31.89   (3.0%)  -0.3% ( -5% -  5%)
MedPhrase            49.48   (3.9%)     49.32   (4.4%)  -0.3% ( -8% -  8%)
HighTerm              8.02   (2.1%)      8.00   (2.0%)  -0.2% ( -4% -  3%)
PKLookup            211.76   (1.4%)    211.32   (1.8%)  -0.2% ( -3% -  3%)
Wildcard             62.37   (2.3%)     62.28   (2.3%)  -0.1% ( -4% -  4%)
MedSloppyPhrase      17.49   (2.5%)     17.52   (2.7%)   0.2% ( -4% -  5%)
Respell              55.68   (5.0%)     55.85   (3.3%)   0.3% ( -7% -  9%)
LowSloppyPhrase      16.29   (4.7%)     16.43   (5.2%)   0.9% ( -8% - 11%)
LowPhrase            15.68   (5.3%)     15.81   (5.4%)   0.9% ( -9% - 12%)
HighPhrase           14.22   (8.7%)     14.45   (8.9%)   1.6% (-14% - 21%)
HighSloppyPhrase      0.83   (9.3%)      0.85  (11.9%)   2.1% (-17% - 25%)
{noformat}
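
For anyone who wants to redo the arithmetic, here is a small Python sketch. It computes the MedTerm speedup of NO_PARENTS over ALL_PARENTS from the three QPS figures quoted in this comment, and it rebuilds the "Pct diff" interval for a row of the table. The interval formula (one standard deviation each way, bounds truncated toward zero) is an assumption inferred from the numbers in the table, not taken from the benchmark tool, and the function names are made up for illustration.

```python
def pct_diff(base_qps, comp_qps):
    """Percentage change of comp relative to base."""
    return 100.0 * (comp_qps - base_qps) / base_qps


def pct_diff_range(base_qps, base_sd_pct, comp_qps, comp_sd_pct):
    """Worst-case and best-case percentage change, one StdDev each way.

    Assumption: bounds are truncated toward zero with int(); this matches
    the rows checked below but is inferred from the table itself.
    """
    base_lo = base_qps * (1 - base_sd_pct / 100.0)
    base_hi = base_qps * (1 + base_sd_pct / 100.0)
    comp_lo = comp_qps * (1 - comp_sd_pct / 100.0)
    comp_hi = comp_qps * (1 + comp_sd_pct / 100.0)
    worst = int(100.0 * (comp_lo - base_hi) / base_hi)
    best = int(100.0 * (comp_hi - base_lo) / base_lo)
    return worst, best


# MedTerm: ALL_PARENTS (18.89 QPS) vs the two NO_PARENTS runs in this table.
all_parents_qps = 18.89
speedups = [round(pct_diff(all_parents_qps, q), 1) for q in (22.67, 22.80)]

# AndHighLow row: base and comp run the same code, so the interval
# should straddle zero if the difference is only noise.
and_high_low = pct_diff_range(85.20, 5.0, 83.74, 5.7)
is_noise = and_high_low[0] < 0 < and_high_low[1]
```

With the figures above, MedTerm comes out roughly 20% faster with NO_PARENTS, while the AndHighLow interval spans zero, which is what "all differences are noise" means here. Because the table shows rounded QPS values, rows with very small QPS (e.g. HighSloppyPhrase) may not reproduce exactly.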
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
> Key: LUCENE-4600
> URL: https://issues.apache.org/jira/browse/LUCENE-4600
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Assignee: Shai Erera
> Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
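
The difference between the two approaches in the description can be sketched in a few lines of Python. This is only a toy model, not the Lucene facet API: `DOC_FACETS`, `two_pass` and `one_pass` are hypothetical names, and the per-document facet ordinals are invented data.

```python
from collections import Counter

# Hypothetical data: doc id -> facet ordinals attached to that document.
DOC_FACETS = {0: [1, 2], 1: [2], 2: [1, 3], 3: [3]}


def two_pass(hits, num_docs):
    """Gather all hits into a bitset first, then aggregate in a 2nd pass."""
    bits = bytearray(num_docs)  # transient buffer held until aggregation
    for doc in hits:
        bits[doc] = 1
    counts = Counter()
    for doc, matched in enumerate(bits):
        if matched:
            counts.update(DOC_FACETS[doc])
    return counts


def one_pass(hits):
    """Aggregate facet counts as each hit is collected; no hit buffer."""
    counts = Counter()
    for doc in hits:
        counts.update(DOC_FACETS[doc])
    return counts


hits = [0, 2, 3]
assert two_pass(hits, num_docs=4) == one_pass(hits)
```

Both functions return the same counts; the one-pass version just never holds the intermediate set of hits (or a parallel scores array) in memory.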