[ https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890040#comment-13890040 ]
Lei Wang commented on LUCENE-5425:
----------------------------------

I tried with luceneutil, but ran into a problem: I cannot get the same numbers for two identical checkouts of trunk. Even when both sides are trunk, I get different numbers:

Report after iter 19:
Task               QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff
OrHighMed                 74.15  (7.1%)                    71.24  (8.3%)   -3.9% ( -18% - 12%)
LowTerm                  515.68 (15.1%)                   496.20 (12.3%)   -3.8% ( -27% - 27%)
OrNotHighLow              72.22  (8.2%)                    70.36  (7.6%)   -2.6% ( -17% - 14%)
OrNotHighMed              79.01  (7.3%)                    77.43  (8.4%)   -2.0% ( -16% - 14%)
OrHighNotHigh             38.66  (4.5%)                    37.90  (6.4%)   -2.0% ( -12% - 9%)
Respell                   51.21  (7.1%)                    50.23  (6.5%)   -1.9% ( -14% - 12%)
MedPhrase                 69.67  (7.5%)                    68.35  (7.4%)   -1.9% ( -15% - 14%)
OrHighLow                 67.24  (7.8%)                    66.00  (9.0%)   -1.8% ( -17% - 16%)
Fuzzy1                    27.37  (5.7%)                    26.96  (5.5%)   -1.5% ( -11% - 10%)
Fuzzy2                    37.21  (3.8%)                    36.71  (5.6%)   -1.3% ( -10% - 8%)
MedSloppyPhrase            9.94  (5.4%)                     9.83  (3.9%)   -1.1% ( -9% - 8%)
LowSpanNear                8.60  (3.9%)                     8.54  (3.8%)   -0.7% ( -8% - 7%)
AndHighHigh               40.23  (3.1%)                    40.03  (2.5%)   -0.5% ( -5% - 5%)
HighTerm                  76.07  (9.0%)                    75.96  (9.1%)   -0.2% ( -16% - 19%)
OrHighHigh                11.62  (3.0%)                    11.62  (4.8%)   -0.1% ( -7% - 7%)
IntNRQ                     9.51  (3.9%)                     9.51  (8.3%)    0.0% ( -11% - 12%)
HighPhrase                25.61  (7.0%)                    25.63  (7.7%)    0.1% ( -13% - 15%)
LowSloppyPhrase           30.21  (5.2%)                    30.24  (4.3%)    0.1% ( -8% - 10%)
PKLookup                 212.03  (9.0%)                   212.25 (11.5%)    0.1% ( -18% - 22%)
OrNotHighHigh             27.75  (3.5%)                    27.80  (6.5%)    0.2% ( -9% - 10%)
OrHighNotMed              58.14  (5.9%)                    58.27  (8.3%)    0.2% ( -13% - 15%)
MedSpanNear               22.73  (3.7%)                    22.80  (5.1%)    0.3% ( -8% - 9%)
Wildcard                  42.84  (5.0%)                    42.97  (5.4%)    0.3% ( -9% - 11%)
HighSloppyPhrase          23.99  (7.4%)                    24.08  (6.3%)    0.4% ( -12% - 15%)
AndHighLow               625.62  (6.6%)                   629.52 (10.5%)    0.6% ( -15% - 18%)
Prefix3                   77.68  (7.2%)                    78.17  (6.2%)    0.6% ( -11% - 15%)
LowPhrase                 14.58  (4.7%)                    14.77  (5.0%)    1.3% ( -8% - 11%)
HighSpanNear              11.84  (4.3%)                    11.99  (5.2%)    1.3% ( -7% - 11%)
OrHighNotLow              66.04  (8.4%)                    67.28  (9.2%)    1.9% ( -14% - 21%)
AndHighMed                66.55  (4.3%)                    67.91  (6.2%)    2.1% ( -8% - 13%)
MedTerm                  139.78  (9.5%)                   145.63 (10.3%)    4.2% ( -14% - 26%)

With the patch, the numbers also differ, but the differences are no bigger than in the trunk-vs-trunk run:

Report after iter 19:
Task               QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff
AndHighLow               730.30 (11.5%)                   700.95 (10.6%)   -4.0% ( -23% - 20%)
LowTerm                  520.94 (10.6%)                   504.25 (11.4%)   -3.2% ( -22% - 21%)
Fuzzy1                    57.55  (5.1%)                    56.26  (4.8%)   -2.2% ( -11% - 8%)
Respell                   35.85  (4.7%)                    35.18  (4.1%)   -1.9% ( -10% - 7%)
OrHighNotHigh             37.77  (7.3%)                    37.19  (5.9%)   -1.5% ( -13% - 12%)
HighSloppyPhrase          12.30  (7.5%)                    12.17  (7.7%)   -1.1% ( -15% - 15%)
HighPhrase                29.38  (5.2%)                    29.06  (4.3%)   -1.1% ( -10% - 8%)
OrNotHighMed              25.93  (6.2%)                    25.68  (5.5%)   -1.0% ( -11% - 11%)
OrNotHighHigh             19.72  (5.9%)                    19.53  (4.9%)   -0.9% ( -11% - 10%)
Fuzzy2                    11.30  (3.6%)                    11.24  (5.1%)   -0.6% ( -8% - 8%)
PKLookup                 218.16  (8.6%)                   217.53  (9.3%)   -0.3% ( -16% - 19%)
LowSloppyPhrase           43.09  (5.6%)                    43.00  (3.5%)   -0.2% ( -8% - 9%)
MedSpanNear               30.65  (4.4%)                    30.60  (3.2%)   -0.1% ( -7% - 7%)
MedSloppyPhrase           21.71  (5.7%)                    21.70  (3.8%)   -0.0% ( -8% - 9%)
Wildcard                  14.67  (3.3%)                    14.67  (2.6%)   -0.0% ( -5% - 6%)
HighSpanNear               0.64  (4.6%)                     0.64  (5.0%)    0.1% ( -9% - 10%)
LowPhrase                 21.05  (5.6%)                    21.09  (7.6%)    0.2% ( -12% - 14%)
AndHighMed               175.53  (7.2%)                   176.00  (8.2%)    0.3% ( -14% - 16%)
Prefix3                   31.24  (3.3%)                    31.37  (2.7%)    0.4% ( -5% - 6%)
OrNotHighLow              76.32  (6.3%)                    76.80  (7.7%)    0.6% ( -12% - 15%)
OrHighHigh                33.43  (6.4%)                    33.65  (7.6%)    0.7% ( -12% - 15%)
AndHighHigh               35.51  (3.1%)                    35.76  (3.1%)    0.7% ( -5% - 7%)
IntNRQ                     9.36  (4.4%)                     9.43  (3.7%)    0.7% ( -7% - 9%)
HighTerm                  90.42  (7.0%)                    91.40  (5.3%)    1.1% ( -10% - 14%)
OrHighLow                 71.32  (8.6%)                    72.13  (8.2%)    1.1% ( -14% - 19%)
LowSpanNear              107.82  (6.8%)                   109.19  (5.9%)    1.3% ( -10% - 14%)
OrHighMed                 45.43  (8.8%)                    46.09  (8.8%)    1.5% ( -14% - 20%)
MedTerm                  139.24  (7.0%)                   141.28  (8.4%)    1.5% ( -13% - 18%)
MedPhrase                 96.51  (5.0%)                    98.10  (6.8%)    1.6% ( -9% - 14%)
OrHighNotMed              50.88  (6.9%)                    52.13  (8.3%)    2.5% ( -11% - 18%)
OrHighNotLow              65.31  (8.9%)                    67.16  (8.8%)    2.8% ( -13% - 22%)

By the way, I copied the facet config from the nightly py script, and the index definition looks like:

index = comp.newIndex('trunk', WIKI_MEDIUM_10M, facets = (('Date',),), facetDVFormat='Direct')

> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>
>                 Key: LUCENE-5425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5425
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>   Affects Versions: 4.6
>            Reporter: John Wang
>         Attachments: facetscollector.patch, facetscollector.patch,
> fixbitset.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query.
> For large indexes where maxDoc is large, creating a bitset of maxDoc bits
> is expensive and creates a lot of garbage.
> The attached patch makes this allocation customizable while maintaining
> current behavior.
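The trunk-vs-trunk comparison in the comment above amounts to asking whether a reported difference is larger than the run-to-run noise. A minimal sketch of that check follows, using rows copied from the two reports. The function names are hypothetical, and the one-standard-deviation range used here is an assumption for illustration; it is not claimed to be the exact formula luceneutil uses for its "Pct diff" range.

```python
def pct_diff_range(base_qps, base_sd_pct, cand_qps, cand_sd_pct):
    """Return (pct_diff, worst, best) in percent, candidate vs. baseline.

    Assumption: worst/best are formed by shifting each side by one
    standard deviation in the unfavourable/favourable direction.
    """
    base_sd = base_qps * base_sd_pct / 100.0
    cand_sd = cand_qps * cand_sd_pct / 100.0
    pct_diff = 100.0 * (cand_qps - base_qps) / base_qps
    base_best, base_worst = base_qps + base_sd, base_qps - base_sd
    cand_best, cand_worst = cand_qps + cand_sd, cand_qps - cand_sd
    worst = 100.0 * (cand_worst - base_best) / base_best
    best = 100.0 * (cand_best - base_worst) / base_worst
    return pct_diff, worst, best


def within_noise(base_qps, base_sd_pct, cand_qps, cand_sd_pct):
    """True when the range straddles zero, i.e. no clear win or loss."""
    _, worst, best = pct_diff_range(base_qps, base_sd_pct, cand_qps, cand_sd_pct)
    return worst <= 0.0 <= best


if __name__ == "__main__":
    # (task, QPS baseline, StdDev %, QPS modified, StdDev %) from the reports.
    rows = [
        ("OrHighMed (trunk vs trunk)", 74.15, 7.1, 71.24, 8.3),
        ("MedTerm (trunk vs trunk)", 139.78, 9.5, 145.63, 10.3),
        ("OrHighNotLow (with patch)", 65.31, 8.9, 67.16, 8.8),
    ]
    for task, bq, bs, cq, cs in rows:
        diff, worst, best = pct_diff_range(bq, bs, cq, cs)
        verdict = "within noise" if within_noise(bq, bs, cq, cs) else "significant"
        print(f"{task}: {diff:+.1f}% ({worst:+.0f}% .. {best:+.0f}%) {verdict}")
```

Under this assumption, a row only counts as a real change when the whole range sits on one side of zero, which is the same reading the comment applies when it compares the patched run against the trunk-vs-trunk run.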