[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558752#comment-13558752 ]
Michael McCandless commented on LUCENE-4600:
--------------------------------------------

NO_PARENTS CountingFacetsCollector vs itself (ie all differences are noise). Use the absolute QPS to compare to the "QPS comp" column above, eg MedTerm was 18.89 QPS above with ALL_PARENTS and with NO_PARENTS MedTerm is 22.67-22.80 QPS:

{noformat}
Task              QPS base    StdDev  QPS comp    StdDev  Pct diff
AndHighLow           85.20    (5.0%)     83.74    (5.7%)   -1.7% ( -11% -   9%)
LowSpanNear          95.25    (5.5%)     93.67    (6.8%)   -1.7% ( -13% -  11%)
HighSpanNear         95.19    (5.4%)     93.80    (6.7%)   -1.5% ( -12% -  11%)
MedSpanNear          94.97    (5.4%)     93.59    (6.8%)   -1.5% ( -12% -  11%)
AndHighMed           45.68    (2.8%)     45.29    (2.9%)   -0.9% (  -6% -   4%)
OrHighLow             7.62    (2.2%)      7.55    (2.2%)   -0.8% (  -5% -   3%)
OrHighHigh            4.33    (2.2%)      4.29    (2.2%)   -0.8% (  -5% -   3%)
LowTerm              38.17    (2.0%)     37.90    (2.2%)   -0.7% (  -4% -   3%)
OrHighMed             7.54    (2.2%)      7.49    (2.1%)   -0.7% (  -4% -   3%)
Prefix3              45.95    (4.3%)     45.68    (4.4%)   -0.6% (  -8% -   8%)
MedTerm              22.80    (2.2%)     22.67    (2.1%)   -0.6% (  -4% -   3%)
Fuzzy1               26.16    (1.9%)     26.04    (2.0%)   -0.4% (  -4% -   3%)
IntNRQ               17.94    (6.1%)     17.86    (6.2%)   -0.4% ( -11% -  12%)
AndHighHigh          12.33    (1.2%)     12.29    (1.3%)   -0.4% (  -2% -   2%)
Fuzzy2               32.00    (2.8%)     31.89    (3.0%)   -0.3% (  -5% -   5%)
MedPhrase            49.48    (3.9%)     49.32    (4.4%)   -0.3% (  -8% -   8%)
HighTerm              8.02    (2.1%)      8.00    (2.0%)   -0.2% (  -4% -   3%)
PKLookup            211.76    (1.4%)    211.32    (1.8%)   -0.2% (  -3% -   3%)
Wildcard             62.37    (2.3%)     62.28    (2.3%)   -0.1% (  -4% -   4%)
MedSloppyPhrase      17.49    (2.5%)     17.52    (2.7%)    0.2% (  -4% -   5%)
Respell              55.68    (5.0%)     55.85    (3.3%)    0.3% (  -7% -   9%)
LowSloppyPhrase      16.29    (4.7%)     16.43    (5.2%)    0.9% (  -8% -  11%)
LowPhrase            15.68    (5.3%)     15.81    (5.4%)    0.9% (  -9% -  12%)
HighPhrase           14.22    (8.7%)     14.45    (8.9%)    1.6% ( -14% -  21%)
HighSloppyPhrase      0.83    (9.3%)      0.85   (11.9%)    2.1% ( -17% -  25%)
{noformat}

> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>            Assignee: Shai Erera
>         Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
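
The two strategies contrasted in the issue description can be sketched roughly as follows. This is plain Java, not the facet module's actual classes: the names `countTwoPass`, `countWhileCollecting` and `docOrdinals` are hypothetical, and the per-document category ordinals are held in an in-memory array purely for illustration. It also assumes each hit is collected exactly once.

```java
import java.util.Arrays;
import java.util.BitSet;

final class FacetAggregationSketch {

    // Two-pass approach: remember every hit in a bit set during collection,
    // then walk the bit set afterwards and count category ordinals.
    static int[] countTwoPass(int[] hits, int[][] docOrdinals, int numOrdinals) {
        BitSet matched = new BitSet(docOrdinals.length);
        for (int doc : hits) {
            matched.set(doc); // pass 1: only record that the doc matched
        }
        int[] counts = new int[numOrdinals];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++; // pass 2: the actual aggregation
            }
        }
        return counts;
    }

    // Single-pass approach: count ordinals as each hit arrives, so no
    // per-hit bit set (or score array) has to be kept around.
    static int[] countWhileCollecting(int[] hits, int[][] docOrdinals, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical data: 4 documents, 3 category ordinals.
        int[][] docOrdinals = {{0, 1}, {1}, {2}, {0, 2}};
        int[] hits = {0, 2, 3}; // doc ids matched by some query

        System.out.println(Arrays.toString(countTwoPass(hits, docOrdinals, 3)));
        System.out.println(Arrays.toString(countWhileCollecting(hits, docOrdinals, 3)));
    }
}
```

Both methods return the same counts; the difference is only in the transient memory held between collection and aggregation.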