[ https://issues.apache.org/jira/browse/LUCENE-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830060#comment-13830060 ]
Michael McCandless commented on LUCENE-5316:
--------------------------------------------
OK, I ran the same perf tests with the last patch. The "sometimes all
0 facet counts" problem is fixed!
NO_PARENTS:
{noformat}
Task              QPS base  StdDev  QPS comp  StdDev  Pct diff
LowSloppyPhrase      99.22  (4.1%)     35.51  (1.9%)  -64.2% ( -67% - -60%)
MedSpanNear          96.92  (4.2%)     35.23  (2.0%)  -63.6% ( -67% - -59%)
AndHighLow           91.99  (3.5%)     34.57  (2.0%)  -62.4% ( -65% - -58%)
HighPhrase           90.68  (3.8%)     34.35  (2.0%)  -62.1% ( -65% - -58%)
HighSloppyPhrase     81.34  (2.9%)     32.74  (2.1%)  -59.7% ( -62% - -56%)
HighSpanNear         65.81  (2.9%)     30.03  (2.2%)  -54.4% ( -57% - -50%)
OrNotHighLow         63.44  (3.4%)     29.66  (2.0%)  -53.3% ( -56% - -49%)
MedPhrase            62.66  (3.2%)     29.30  (2.0%)  -53.2% ( -56% - -49%)
LowTerm              61.47  (3.8%)     29.02  (2.0%)  -52.8% ( -56% - -48%)
Fuzzy1               47.78  (3.3%)     25.66  (2.3%)  -46.3% ( -50% - -42%)
OrNotHighHigh        47.59  (3.8%)     25.73  (2.3%)  -45.9% ( -50% - -41%)
OrHighLow            43.78  (1.9%)     24.41  (2.1%)  -44.2% ( -47% - -41%)
AndHighMed           42.81  (2.1%)     24.04  (2.0%)  -43.9% ( -47% - -40%)
OrNotHighMed         38.92  (2.6%)     22.95  (2.0%)  -41.0% ( -44% - -37%)
Fuzzy2               38.27  (2.6%)     22.86  (2.2%)  -40.3% ( -43% - -36%)
MedTerm              31.78  (2.5%)     20.14  (2.1%)  -36.6% ( -40% - -32%)
AndHighHigh          27.50  (1.7%)     18.33  (1.9%)  -33.3% ( -36% - -30%)
Prefix3              25.26  (1.9%)     17.35  (1.7%)  -31.3% ( -34% - -28%)
OrHighNotMed         22.27  (1.4%)     16.04  (1.4%)  -28.0% ( -30% - -25%)
OrHighNotLow         18.01  (1.6%)     13.76  (1.5%)  -23.6% ( -26% - -20%)
OrHighMed            17.33  (2.1%)     13.26  (1.6%)  -23.5% ( -26% - -20%)
OrHighHigh           16.84  (1.9%)     13.05  (1.5%)  -22.5% ( -25% - -19%)
MedSloppyPhrase      15.54  (3.9%)     12.22  (3.4%)  -21.4% ( -27% - -14%)
HighTerm             15.87  (2.2%)     12.48  (1.7%)  -21.3% ( -24% - -17%)
LowPhrase            13.78  (1.6%)     11.03  (1.3%)  -20.0% ( -22% - -17%)
Wildcard             12.93  (1.9%)     10.65  (1.3%)  -17.7% ( -20% - -14%)
LowSpanNear          10.55  (2.0%)      8.92  (1.7%)  -15.5% ( -18% - -12%)
OrHighNotHigh         9.29  (1.4%)      8.16  (1.4%)  -12.2% ( -14% - -9%)
IntNRQ                5.19  (1.3%)      4.92  (1.9%)   -5.1% ( -8% - -1%)
Respell              85.48  (2.6%)     87.32  (2.6%)    2.2% ( -2% - 7%)
{noformat}
ALL_BUT_DIM:
{noformat}
Task              QPS base  StdDev  QPS comp  StdDev  Pct diff
Respell              89.86  (3.0%)     89.27  (2.2%)   -0.7% ( -5% - 4%)
LowSpanNear          12.01  (2.3%)     11.95  (2.1%)   -0.5% ( -4% - 3%)
Fuzzy1               88.54  (2.2%)     88.41  (1.5%)   -0.2% ( -3% - 3%)
Fuzzy2               62.54  (2.1%)     62.50  (2.3%)   -0.1% ( -4% - 4%)
OrHighNotHigh        10.10  (1.7%)     10.09  (1.9%)   -0.1% ( -3% - 3%)
OrNotHighHigh        85.35  (5.4%)     85.31  (5.4%)   -0.0% ( -10% - 11%)
OrHighNotLow         22.11  (1.4%)     22.10  (1.2%)   -0.0% ( -2% - 2%)
MedSloppyPhrase      18.87  (4.0%)     18.88  (4.7%)    0.1% ( -8% - 9%)
HighTerm             18.93  (1.5%)     18.97  (1.7%)    0.2% ( -2% - 3%)
OrHighMed            21.19  (1.6%)     21.26  (1.4%)    0.3% ( -2% - 3%)
LowPhrase            15.79  (4.4%)     15.85  (4.2%)    0.4% ( -7% - 9%)
AndHighHigh          38.40  (1.0%)     38.57  (1.2%)    0.4% ( -1% - 2%)
OrHighHigh           20.55  (1.4%)     20.64  (1.5%)    0.4% ( -2% - 3%)
OrHighNotMed         29.27  (1.4%)     29.40  (1.2%)    0.4% ( -2% - 3%)
AndHighMed           72.26  (1.1%)     72.60  (1.1%)    0.5% ( -1% - 2%)
Wildcard             14.92  (1.0%)     14.99  (1.3%)    0.5% ( -1% - 2%)
HighSpanNear        159.71  (3.5%)    160.74  (3.7%)    0.6% ( -6% - 8%)
IntNRQ                5.15  (1.4%)      5.18  (1.7%)    0.7% ( -2% - 3%)
Prefix3              33.93  (1.3%)     34.18  (1.8%)    0.7% ( -2% - 3%)
MedTerm              44.36  (1.7%)     44.69  (1.6%)    0.8% ( -2% - 4%)
OrNotHighMed         62.66  (2.4%)     63.18  (3.1%)    0.8% ( -4% - 6%)
OrHighLow            75.94  (1.4%)     76.65  (1.5%)    0.9% ( -1% - 3%)
OrNotHighLow        150.08  (4.7%)    151.62  (5.0%)    1.0% ( -8% - 11%)
MedPhrase           138.21  (3.7%)    139.67  (3.6%)    1.1% ( -6% - 8%)
LowTerm             140.27  (2.3%)    142.14  (2.6%)    1.3% ( -3% - 6%)
HighSloppyPhrase    283.76  (1.3%)    291.51  (1.8%)    2.7% ( 0% - 5%)
HighPhrase          455.49  (1.5%)    476.29  (3.1%)    4.6% ( 0% - 9%)
MedSpanNear         660.85  (2.0%)    693.91  (2.4%)    5.0% ( 0% - 9%)
AndHighLow          482.21  (2.1%)    511.77  (2.5%)    6.1% ( 1% - 11%)
LowSloppyPhrase     759.01  (1.9%)    816.01  (2.1%)    7.5% ( 3% - 11%)
{noformat}
> Taxonomy tree traversing improvement
> ------------------------------------
>
> Key: LUCENE-5316
> URL: https://issues.apache.org/jira/browse/LUCENE-5316
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Gilad Barkai
> Priority: Minor
> Attachments: LUCENE-5316.patch, LUCENE-5316.patch, LUCENE-5316.patch,
> LUCENE-5316.patch, LUCENE-5316.patch
>
>
> Taxonomy traversal is done today using the
> {{ParallelTaxonomyArrays}}: in particular, two taxonomy-size {{int}} arrays
> that hold, for each ordinal, its (array #1) youngest child and (array #2)
> older sibling.
> This is a compact way of holding the tree information in memory, but it's not
> perfect:
> * Large (8 bytes per ordinal in memory)
> * Exposes internal implementation
> * Using these arrays for tree traversal is not straightforward
> * Loses reference locality while traversing (the array is accessed only at
> increasing entries, but they may be distant from one another)
> * In NRT, a reopen always (not just in the worst case) costs O(Taxonomy-size)
> This issue is about making traversal easier, the code more readable,
> and opening it up for future improvements (i.e. memory footprint and NRT cost) -
> without changing any of the internals.
> A later issue (or issues) could be opened to address the gaps once this one is done.
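
To illustrate the two-array layout described in the quoted issue, here is a minimal, self-contained Java sketch. It does not use the Lucene API: the class name, method names, the -1 sentinel, and the sample taxonomy are all hypothetical, and building the arrays from a parent array is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Lucene code: a taxonomy tree stored as two
// taxonomy-size int arrays (2 x 4 bytes = 8 bytes per ordinal).
final class TaxonomyArraysSketch {
    // Assumed sentinel meaning "no such ordinal".
    static final int INVALID_ORDINAL = -1;

    // Builds {youngestChild, olderSibling} from a parent array in which
    // parents[ord] < ord for every non-root ordinal and ordinal 0 is the root.
    // The single pass over all ordinals is O(taxonomy size).
    static int[][] build(int[] parents) {
        int[] youngestChild = new int[parents.length];
        int[] olderSibling = new int[parents.length];
        Arrays.fill(youngestChild, INVALID_ORDINAL);
        Arrays.fill(olderSibling, INVALID_ORDINAL);
        for (int ord = 1; ord < parents.length; ord++) {
            int parent = parents[ord];
            // The previous youngest child of this parent becomes ord's older sibling.
            olderSibling[ord] = youngestChild[parent];
            youngestChild[parent] = ord;
        }
        return new int[][] {youngestChild, olderSibling};
    }

    // Lists the direct children of ord, youngest first, by starting at the
    // youngest child and following the older-sibling links.
    static List<Integer> childrenOf(int ord, int[] youngestChild, int[] olderSibling) {
        List<Integer> children = new ArrayList<>();
        for (int child = youngestChild[ord];
             child != INVALID_ORDINAL;
             child = olderSibling[child]) {
            children.add(child);
        }
        return children;
    }

    public static void main(String[] args) {
        // Sample tree: 0 is the root, 1 and 2 are its children,
        // 3 and 4 are children of 1, and 5 is a child of 2.
        int[] parents = {-1, 0, 0, 1, 1, 2};
        int[][] arrays = build(parents);
        for (int ord = 0; ord < parents.length; ord++) {
            System.out.println(ord + " -> " + childrenOf(ord, arrays[0], arrays[1]));
        }
    }
}
```

The traversal loop hops between array entries that may be far apart, which is the reference-locality concern raised in the description, and every caller has to know the sentinel and the sibling-chain convention, which is what "exposes internal implementation" refers to.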