[ https://issues.apache.org/jira/browse/LUCENE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430958#comment-13430958 ]
Han Jiang commented on LUCENE-4283:
-----------------------------------
Hmm, the improvement isn't that noisy
{noformat}
Task                QPS base  StdDev base  QPS comp  StdDev comp      Pct diff
AndHighHigh            83.84         5.07     88.64         2.41    -3% -  15%
AndHighLow           1716.87        62.53   1891.91        20.85     5% -  15%
AndHighMed            348.15        37.20    441.49        10.78    11% -  45%
Fuzzy1                 87.67         0.92     84.80         2.36    -6% -   0%
Fuzzy2                 32.84         0.37     31.41         1.06    -8% -   0%
HighPhrase             18.45         0.93     18.88         0.53    -5% -  10%
HighSloppyPhrase       22.16         0.76     21.55         0.57    -8% -   3%
HighSpanNear            3.07         0.11      3.09         0.04    -3% -   5%
HighTerm              181.58        18.26    171.10         6.44   -17% -   8%
IntNRQ                 48.39         1.47     49.28         0.88    -2% -   6%
LowPhrase              80.49         3.34     87.04         2.63     0% -  16%
LowSloppyPhrase        28.53         1.09     27.31         0.71   -10% -   2%
LowSpanNear            46.86         1.63     49.34         1.15     0% -  11%
LowTerm              1637.37        19.39   1608.23        16.93    -3% -   0%
MedPhrase              22.48         1.03     23.27         0.52    -3% -  10%
MedSloppyPhrase        15.46         0.52     15.00         0.37    -8% -   2%
MedSpanNear            37.09         1.21     37.80         0.69    -3% -   7%
MedTerm               587.20        44.40    560.78        19.09   -14% -   6%
OrHighHigh             62.10         0.88     62.95         1.05    -1% -   4%
OrHighLow             126.89         1.48    128.30         1.53    -1% -   3%
OrHighMed             124.20         1.18    125.34         1.23    -1% -   2%
PKLookup              213.54         3.75    211.98         0.37    -2% -   1%
Prefix3               106.76         2.31    107.79         0.84    -1% -   3%
Respell               100.12         1.00     96.48         2.58    -7% -   0%
Wildcard              149.61         3.53    150.29         0.88    -2% -   3%
{noformat}
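A note on reading the Pct diff column: the two percentages in each row look like a worst-case to best-case range built from the means and standard deviations, rather than a single point estimate. The message does not say how the column is computed, so the Python sketch below is an assumption inferred from the numbers, and the function name `pct_diff_range` is made up for illustration. With the result truncated toward zero, it does reproduce the rows in the table.

```python
def pct_diff_range(qps_base, std_base, qps_comp, std_comp):
    """Return (worst, best) percentage change of comp relative to base.

    Assumed rule (inferred from the table, not stated in the message):
    worst case pits comp one stddev low against base one stddev high,
    best case pits comp one stddev high against base one stddev low.
    Percentages are truncated toward zero, as int() does.
    """
    worst_base = qps_base + std_base
    worst = (qps_comp - std_comp - worst_base) / worst_base

    best_base = qps_base - std_base
    best = (qps_comp + std_comp - best_base) / best_base

    return int(100 * worst), int(100 * best)


if __name__ == "__main__":
    # Rows taken from the table: (task, QPS base, StdDev base, QPS comp, StdDev comp)
    rows = [
        ("AndHighHigh", 83.84, 5.07, 88.64, 2.41),
        ("AndHighMed", 348.15, 37.20, 441.49, 10.78),
        ("HighTerm", 181.58, 18.26, 171.10, 6.44),
    ]
    for task, qb, sb, qc, sc in rows:
        lo, hi = pct_diff_range(qb, sb, qc, sc)
        print(f"{task:<12} {lo}% - {hi}%")
```

Read this way, a range that straddles 0%, such as HighTerm (-17% to 8%), means the run cannot tell a gain from a loss for that task, while AndHighMed (11% to 45%) stays positive even in the worst case.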
> Support more frequent skip with Block Postings Format
> -----------------------------------------------------
>
> Key: LUCENE-4283
> URL: https://issues.apache.org/jira/browse/LUCENE-4283
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Han Jiang
> Priority: Minor
> Attachments: LUCENE-4283-buggy.patch, LUCENE-4283-buggy.patch,
> LUCENE-4283-codes-cleanup.patch, LUCENE-4283-record-next-skip.patch,
> LUCENE-4283-record-skip&inlining-scanning.patch, LUCENE-4283-slow.patch,
> LUCENE-4283-small-interval-fully.patch,
> LUCENE-4283-small-interval-partially.patch
>
>
> This change works on the new bulk branch.
> Currently, our BlockPostingsFormat only supports skipInterval==blockSize.
> Every time the skipper reaches the last level-0 skip point, we have to
> decode a whole block to read the doc/freq data. Also, a higher-level skip
> list is created only for terms with df>blockSize^k, which means that for
> most terms skipping is just a linear scan. If we increase the current
> blockSize for better bulk I/O performance, the current skip setting will
> become a bottleneck.
> For ForPF, the encoded block can easily be split if we set
> skipInterval=32*k.
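
To make the df>blockSize^k condition from the description concrete, here is a minimal Python sketch that finds the largest k satisfying it for a given document frequency. It takes the condition literally as written. The function name `largest_skip_k` and the example values (blockSize 128, interval 32) are illustrative assumptions, and applying the same rule with a smaller skipInterval in place of blockSize is also an assumption, not something the patches are confirmed to do.

```python
def largest_skip_k(doc_freq, interval):
    """Return the largest k >= 1 with doc_freq > interval ** k, or 0 if none.

    Literal reading of the "df > blockSize^k" condition in the issue text;
    how k maps onto actual skip-list level numbers is not specified there.
    """
    if interval < 2:
        raise ValueError("interval must be at least 2")
    k = 0
    threshold = interval  # interval ** 1
    while doc_freq > threshold:
        k += 1
        threshold *= interval  # now interval ** (k + 1)
    return k


if __name__ == "__main__":
    # Hypothetical document frequencies, compared at two interval sizes.
    for df in (100, 1000, 20000):
        print(df, largest_skip_k(df, 128), largest_skip_k(df, 32))
```

Under this reading, a term with df=100 gets k=0 at blockSize 128, so advancing through its postings is a plain linear scan, but k=1 once the interval drops to 32.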