[ https://issues.apache.org/jira/browse/LUCENE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499332 ]
robert engels commented on LUCENE-893:
--------------------------------------
Some food for thought:
A couple of runs of XBench on hardware that is radically different in terms of
raw performance show roughly a 3x to 4x improvement in uncached sequential read
throughput with 256K blocks compared to 4K blocks. For random reads the gain is
closer to 30x (about 20x for random writes).
The trick is determining how much sequential data should be read. That depends
on the locality of data for the current query as well as for future queries:
even if Lucene reads extra, unneeded data in this run, there is some chance
that the data will be needed by a later query, in which case it is already in
the cache.
It would seem that these numbers show the ideal solution would vary the buffer
size when the engine determines that it is going to read a lot of sequential
data (e.g. a wide open range query), and use smaller buffer sizes when it
expects only a few results.
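A rough sketch of what such a size-selection heuristic could look like. This is
not Lucene code: the class name, the method, and the idea of passing in an
expected byte count are all assumptions made for illustration, and the 4K/256K
bounds are simply the two block sizes from the benchmark.

```java
/** Hypothetical heuristic (not part of Lucene): choose a read buffer size. */
final class BufferSizeHeuristic {
    static final int MIN_BUFFER = 4 * 1024;    // small block size from the benchmark
    static final int MAX_BUFFER = 256 * 1024;  // large block size from the benchmark

    /**
     * Returns the smallest power-of-two size in [MIN_BUFFER, MAX_BUFFER] that
     * covers expectedBytes, so short lookups keep a small buffer and long
     * sequential scans get a large one.
     */
    static int chooseBufferSize(long expectedBytes) {
        if (expectedBytes < 0) {
            throw new IllegalArgumentException("expectedBytes must be >= 0");
        }
        int size = MIN_BUFFER;
        while (size < MAX_BUFFER && size < expectedBytes) {
            size <<= 1; // double until the estimate fits or the cap is reached
        }
        return size;
    }

    public static void main(String[] args) {
        // Assumed estimates: a term lookup, a mid-sized scan, a wide open range query.
        long[] estimates = {100L, 50_000L, 10_000_000L};
        for (long bytes : estimates) {
            System.out.println(bytes + " bytes expected -> buffer of "
                    + chooseBufferSize(bytes) + " bytes");
        }
    }
}
```

The hard part is the estimate itself, which the engine would have to derive
from the query (e.g. from term statistics); the sketch only shows the mapping
from an estimate to a buffer size.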
This might push Lucene down the path where the index is optimized so that
common query terms are always put in a separate segment/index, providing a
high degree of locality to optimize the reading. Maybe there is some academic
research in this area?
Run 1:

Disk Test                 81.23
    Sequential            81.55
        Uncached Write    80.69    33.63 MB/sec [4K blocks]
        Uncached Write    80.94    33.15 MB/sec [256K blocks]
        Uncached Read     77.68    12.30 MB/sec [4K blocks]
        Uncached Read     87.48    35.35 MB/sec [256K blocks]
    Random                80.92
        Uncached Write    62.67     0.94 MB/sec [4K blocks]
        Uncached Write    89.93    20.28 MB/sec [256K blocks]
        Uncached Read     89.01     0.59 MB/sec [4K blocks]
        Uncached Read     89.93    18.51 MB/sec [256K blocks]

Run 2:

Disk Test                 48.34
    Sequential            47.83
        Uncached Write    39.10    16.30 MB/sec [4K blocks]
        Uncached Write    59.73    24.46 MB/sec [256K blocks]
        Uncached Read     38.72     6.13 MB/sec [4K blocks]
        Uncached Read     64.56    26.08 MB/sec [256K blocks]
    Random                48.87
        Uncached Write    35.51     0.53 MB/sec [4K blocks]
        Uncached Write    46.00    10.37 MB/sec [256K blocks]
        Uncached Read     66.61     0.44 MB/sec [4K blocks]
        Uncached Read     59.06    12.15 MB/sec [256K blocks]

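
For reference, the read speedups can be recomputed directly from the MB/sec
figures in the first and second result blocks above. The small class below is
only an illustration; its name and layout are made up, and the numbers are
copied from the uncached read rows.

```java
import java.util.Locale;

/** Recomputes 256K-vs-4K read throughput ratios from the XBench figures. */
final class BlockSizeSpeedup {

    /** Ratio of 256K-block throughput to 4K-block throughput (both in MB/sec). */
    static double speedup(double mbPerSec4k, double mbPerSec256k) {
        if (mbPerSec4k <= 0) {
            throw new IllegalArgumentException("4K throughput must be positive");
        }
        return mbPerSec256k / mbPerSec4k;
    }

    public static void main(String[] args) {
        String[] labels = {
            "first run, sequential read", "first run, random read",
            "second run, sequential read", "second run, random read"
        };
        // Each row: {4K MB/sec, 256K MB/sec}, taken from the uncached read lines.
        double[][] mbPerSec = {
            {12.30, 35.35}, {0.59, 18.51},
            {6.13, 26.08}, {0.44, 12.15}
        };
        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%-28s %.1fx%n",
                    labels[i], speedup(mbPerSec[i][0], mbPerSec[i][1]));
        }
    }
}
```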
> Increase buffer sizes used during searching
> -------------------------------------------
>
> Key: LUCENE-893
> URL: https://issues.apache.org/jira/browse/LUCENE-893
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Store
> Affects Versions: 2.1
> Reporter: Michael McCandless
>
> Spinoff of LUCENE-888.
> In LUCENE-888 we increased buffer sizes that impact indexing and found
> substantial (10-18%) overall performance gains.
> It's very likely that we can also gain some performance for searching
> by increasing the read buffers in BufferedIndexInput used by
> searching.
> We need to test performance impact to verify and then pick a good
> overall default buffer size, also being careful not to add too much
> overall HEAP RAM usage because a potentially very large number of
> BufferedIndexInput instances are created during searching
> (# segments X # index files per segment).
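
To put the heap concern in numbers, here is a back-of-the-envelope sketch of
the (# segments X # index files per segment) estimate from the description.
The class is hypothetical, and the segment and file counts in main are made-up
inputs, not measurements. It assumes one buffer per open index file and ignores
any additional cloned inputs, so treat the result as a lower bound.

```java
import java.util.Locale;

/** Hypothetical estimate of heap used by search-time read buffers. */
final class SearchBufferHeapEstimate {

    /** Bytes of buffer heap, assuming one buffer per index file per segment. */
    static long estimateBytes(int segments, int filesPerSegment, int bufferSizeBytes) {
        if (segments < 0 || filesPerSegment < 0 || bufferSizeBytes < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Widen to long before multiplying to avoid int overflow.
        return (long) segments * filesPerSegment * bufferSizeBytes;
    }

    public static void main(String[] args) {
        int segments = 20;        // assumed value for illustration
        int filesPerSegment = 8;  // assumed value for illustration
        int[] bufferSizes = {4 * 1024, 256 * 1024};
        for (int size : bufferSizes) {
            System.out.printf(Locale.ROOT, "%d-byte buffers: %,d bytes of heap%n",
                    size, estimateBytes(segments, filesPerSegment, size));
        }
    }
}
```

With these assumed inputs, going from 4K to 256K buffers raises the estimate
from 655,360 bytes to 41,943,040 bytes (40 MB), which is why a single large
default for every input is risky.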