On Fri, 2009-12-11 at 14:53 +0100, Michael McCandless wrote:
> How long does Lucene take to build the ords for the toplevel reader?
>
> You should be able to just time FieldCache.getStringIndex(topLevelReader).
>
> I think your 8.5 seconds for first Lucene search was with the
> StringIndex computed per segment?
Cold disk-cache (directly after reboot):
[2009-12-14 14:44:10,914] Requesting StringIndex for field sort_title
[2009-12-14 14:44:20,326] Got StringIndex of length 2916008 in 9 seconds, 412 ms
Warm disk-cache (3 minutes after first test):
[2009-12-14 14:44:10,914] Requesting StringIndex for field sort_title
[2009-12-14 14:44:20,326] Got StringIndex of length 2916008 in 8 seconds, 414 ms
The response time for the first sorted search was about 8.5 seconds, but
that was after 6 non-sorted searches that did not explicitly use the
field cache, so some amount of warm-up had already been performed.
Caveat: I must stress that this is very much ad hoc testing.
----------------- FieldCache test code
// Meant for testing
private FieldCache.StringIndex getStringIndex(
        IndexReader reader, String field) {
    log.info("Requesting StringIndex for field " + field);
    Profiler profiler = new Profiler();
    FieldCache.StringIndex stringIndex;
    try {
        stringIndex = FieldCache.DEFAULT.getStringIndex(reader, field);
    } catch (IOException e) {
        log.error("Could not retrieve StringIndex", e);
        return null;
    }
    log.info("Got StringIndex of length " + stringIndex.order.length
             + " in " + profiler.getSpendTime());
    return stringIndex;
}
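The Profiler class used above is a local helper and is not shown here. For anyone who wants to repeat the measurement without it, the following is a minimal stand-in sketch. It assumes that Profiler does nothing more than measure wall-clock time and print it in the "N seconds, M ms" form seen in the log lines. The names TimedCall, Timed, time and format are hypothetical and not part of Lucene or of the original test code.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the Profiler helper: times a single call
// using System.nanoTime and formats the result like the log output.
public class TimedCall {

    /** Holds the value returned by the timed call and the elapsed time. */
    public static final class Timed<T> {
        public final T value;
        public final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the call once and measures the wall-clock time it took. */
    public static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    /** Formats milliseconds as "N seconds, M ms" (assumed Profiler format). */
    public static String format(long millis) {
        return (millis / 1000) + " seconds, " + (millis % 1000) + " ms";
    }

    public static void main(String[] args) {
        // Placeholder workload. In the real test the lambda would call the
        // getStringIndex wrapper above, which already handles IOException.
        Timed<int[]> timed = time(() -> new int[2916008]);
        System.out.println("Got array of length " + timed.value.length
                + " in " + format(timed.elapsedMillis));
    }
}
```

A single run like this includes class loading and JIT effects, so it is only suited to the same kind of ad hoc timing as the numbers above.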
----------------- Lucene 2.4 index
ls -l index/sb/20091201-115941/lucene/
-rw-rw-r-- 1 summatst summatst 12840211452 Dec 2 11:21 _0.cfx
-rw-rw-r-- 1 summatst summatst 361027455 Dec 2 11:19 _32.cfs
-rw-rw-r-- 1 summatst summatst 373374178 Dec 2 11:19 _65.cfs
-rw-rw-r-- 1 summatst summatst 438076782 Dec 2 11:21 _98.cfs
-rw-rw-r-- 1 summatst summatst 463141239 Dec 2 11:19 _cb.cfs
-rw-rw-r-- 1 summatst summatst 1862427706 Dec 2 11:19 _rm.cfs
-rw-rw-r-- 1 summatst summatst 203 Dec 2 11:21 segments_3
-rw-rw-r-- 1 summatst summatst 20 Dec 2 11:18 segments.gen
-----------------
Regards,
Toke Eskildsen