[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558247#comment-13558247 ]
Michael McCandless commented on LUCENE-4600:
--------------------------------------------

Base = DecoderCountingFacetsCollector; comp=CountingFacetsCollector:

{noformat}
Task        QPS base   StdDev   QPS comp   StdDev   Pct diff
HighTerm       25.67   (1.6%)      30.45   (1.9%)   18.6% (  14% -  22%)
LowTerm       145.87   (1.0%)     154.38   (0.8%)    5.8% (   4% -   7%)
MedTerm        44.45   (1.4%)      51.01   (1.5%)   14.8% (  11% -  17%)
PKLookup      240.08   (0.9%)     239.94   (1.0%)   -0.1% (  -1% -   1%)
{noformat}

So it seems like the IntDecoder abstractions hurt ...

Base = DecoderCountingFacetsCollector; comp=PostCollectionCountingFacetsCollector:

{noformat}
Task        QPS base   StdDev   QPS comp   StdDev   Pct diff
HighTerm       30.46   (0.8%)      30.16   (2.1%)   -1.0% (  -3% -   2%)
LowTerm       142.89   (0.5%)     153.94   (0.8%)    7.7% (   6% -   9%)
MedTerm        50.46   (0.8%)      50.65   (1.8%)    0.4% (  -2% -   2%)
PKLookup      238.65   (1.1%)     238.55   (0.9%)   -0.0% (  -2% -   2%)
{noformat}

This is very interesting!  And good news for sampling?

> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).

--
This message is automatically generated by JIRA.
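To make the two strategies in the issue description concrete, here is a minimal sketch in plain Java. It is not the facet module's code and uses no Lucene API: the class FacetCountingSketch, both method names, and the docOrdinals array (category ordinals per document, already decoded) are hypothetical stand-ins chosen for illustration. It also assumes each hit is reported once, so both approaches produce the same counts.

```java
import java.util.Arrays;
import java.util.BitSet;

public class FacetCountingSketch {

    /** Two passes: record every hit in a bitset, then aggregate afterwards. */
    static int[] countPostCollection(int[] hits, int[][] docOrdinals, int numOrdinals) {
        // Pass 1 (collection): only remember which documents matched.
        BitSet matched = new BitSet(docOrdinals.length);
        for (int doc : hits) {
            matched.set(doc);
        }
        // Pass 2 (aggregation): walk the set bits and count each ordinal.
        int[] counts = new int[numOrdinals];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    /** One pass: aggregate as each hit arrives, with no bitset held in RAM. */
    static int[] countDuringCollection(int[] hits, int[][] docOrdinals, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical index: 4 documents, 3 category ordinals.
        int[][] docOrdinals = { {0, 1}, {1}, {2}, {0, 2} };
        int[] hits = {0, 1, 3}; // documents matching some query

        System.out.println(Arrays.toString(countPostCollection(hits, docOrdinals, 3)));
        System.out.println(Arrays.toString(countDuringCollection(hits, docOrdinals, 3)));
    }
}
```

Both calls print [2, 2, 1] for this input. The sketch only shows where the transient memory goes; it says nothing about which approach is faster, which is what the benchmark numbers above are measuring.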