[ https://issues.apache.org/jira/browse/LUCENE-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560476#comment-13560476 ]
Shai Erera commented on LUCENE-4709:
------------------------------------

BTW, some supporting evidence that we should nuke it comes from the following benchmark results (thanks Mike!). Base is trunk, comp is trunk + no residue computation:

{noformat}
Task              QPS base   StdDev  QPS comp   StdDev  Pct diff
Respell             111.64   (3.2%)    110.49   (3.2%)  -1.0% ( -7% -  5%)
OrHighHigh            4.33   (2.8%)      4.30   (3.0%)  -0.7% ( -6% -  5%)
HighSpanNear          2.98   (2.3%)      2.97   (2.0%)  -0.4% ( -4% -  3%)
HighSloppyPhrase      0.89   (8.9%)      0.89   (8.2%)  -0.3% (-15% - 18%)
HighTerm              7.95   (2.3%)      7.93   (2.4%)  -0.2% ( -4% -  4%)
OrHighLow             7.57   (2.2%)      7.55   (2.3%)  -0.2% ( -4% -  4%)
OrHighMed             7.51   (2.7%)      7.51   (2.8%)   0.1% ( -5% -  5%)
Wildcard             74.46   (3.6%)     74.54   (2.0%)   0.1% ( -5% -  5%)
PKLookup            247.56   (2.1%)    247.85   (2.8%)   0.1% ( -4% -  5%)
LowSpanNear           7.54   (4.6%)      7.59   (3.6%)   0.7% ( -7% -  9%)
AndHighHigh          12.56   (0.9%)     12.68   (1.0%)   0.9% ( -1% -  2%)
MedSpanNear          19.88   (1.5%)     20.08   (2.2%)   1.0% ( -2% -  4%)
MedSloppyPhrase      18.45   (2.1%)     18.64   (2.1%)   1.0% ( -3% -  5%)
LowSloppyPhrase      17.52   (3.7%)     17.71   (3.8%)   1.1% ( -6% -  8%)
Prefix3              45.70   (5.6%)     46.25   (2.7%)   1.2% ( -6% - 10%)
LowPhrase            16.86   (3.4%)     17.07   (3.1%)   1.2% ( -5% -  8%)
MedTerm              23.00   (1.4%)     23.33   (1.8%)   1.4% ( -1% -  4%)
IntNRQ               17.97   (7.8%)     18.26   (4.7%)   1.6% (-10% - 15%)
HighPhrase           15.71   (7.0%)     15.98   (5.2%)   1.7% ( -9% - 15%)
Fuzzy1               33.30   (1.8%)     33.90   (1.3%)   1.8% ( -1% -  5%)
Fuzzy2               41.46   (2.2%)     42.26   (2.0%)   1.9% ( -2% -  6%)
LowTerm              40.47   (1.1%)     41.45   (1.7%)   2.4% (  0% -  5%)
AndHighMed           49.38   (0.9%)     51.08   (1.3%)   3.4% (  1% -  5%)
MedPhrase            55.65   (2.7%)     57.79   (2.5%)   3.8% ( -1% -  9%)
AndHighLow           98.02   (1.5%)    104.36   (2.9%)   6.5% (  2% - 10%)
{noformat}

> Nuke FacetResultNode.residue
> ----------------------------
>
>                 Key: LUCENE-4709
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4709
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>
> The residue is the count of all categories that did not make it to the top K.
> But, this is a senseless statistic. Take for example the following case: two
> documents with categories [A/1, A/2, A/3] and [A/1, A/4, A/5]. If you ask for
> top-1 category of "A", you'll get A (count=2), A/1 (count=2), but A's residue
> will be 4!
> As a user, that number doesn't tell you anything, except maybe when you index
> only one category per document for a given dimension. But in that case, the
> residue is {{root.value - sum(topK.value)}}, which the application can
> compute if it needs to.
> In short, we're just wasting CPU cycles for that statistic, so I'm going to
> remove it.
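
For anyone who wants to check the arithmetic in the description, here is a minimal Java sketch. It is plain Java, not Lucene code: the class {{ResidueSketch}} and all of its methods are hypothetical names made up for illustration, and it assumes the residue is the sum of the counts of the children that fall outside the top K. It reproduces the two-document example (residue 4 while the root count is only 2) and shows that with one category per document the application can get the same number as {{root.value - sum(topK.value)}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in the Lucene facet module.
final class ResidueSketch {

    /** Per category path, the number of documents tagged with that path. */
    static Map<String, Integer> countChildren(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> doc : docs) {
            // A document counts at most once per category.
            for (String category : new HashSet<>(doc)) {
                counts.merge(category, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Number of documents with at least one category under the given dimension. */
    static int rootCount(List<List<String>> docs, String dimension) {
        int matching = 0;
        for (List<String> doc : docs) {
            for (String category : doc) {
                if (category.startsWith(dimension + "/")) {
                    matching++;
                    break;
                }
            }
        }
        return matching;
    }

    /** Sum of the k largest child counts. */
    static int sumTopK(Map<String, Integer> counts, int k) {
        List<Integer> values = new ArrayList<>(counts.values());
        values.sort(Collections.reverseOrder());
        int sum = 0;
        for (int i = 0; i < Math.min(k, values.size()); i++) {
            sum += values.get(i);
        }
        return sum;
    }

    /** Assumed residue definition: sum of the child counts outside the top k. */
    static int sumOutsideTopK(Map<String, Integer> counts, int k) {
        int total = 0;
        for (int value : counts.values()) {
            total += value;
        }
        return total - sumTopK(counts, k);
    }

    public static void main(String[] args) {
        // The multi-valued example from the issue description.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("A/1", "A/2", "A/3"),
                Arrays.asList("A/1", "A/4", "A/5"));
        Map<String, Integer> counts = countChildren(docs);
        int root = rootCount(docs, "A");
        System.out.println("root count        = " + root);                          // 2
        System.out.println("top-1 sum         = " + sumTopK(counts, 1));            // 2
        System.out.println("outside top-1     = " + sumOutsideTopK(counts, 1));     // 4
        System.out.println("root - top-1 sum  = " + (root - sumTopK(counts, 1)));   // 0
    }
}
```

In the multi-valued case the two numbers disagree (4 vs. 0), which is the point of the description: neither one says how many documents are missing from the top K. With a single category per document they coincide, so the application can derive the value from the root count and the returned top-K counts.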