OK, here are some more details about my tests:
*MultiReader creation*
IndexReader subReader;
List<IndexReader> subReaders = new ArrayList<IndexReader>();
for (Directory dir : this.directories) {
    try {
        // open each sub-index read-only
        subReader = IndexReader.open(dir, true);
        subReaders.add(subReader);
    } catch (IOException e) {
        // ... error handling ...
    }
}
this.reader = new MultiReader(subReaders.toArray(new IndexReader[] {}));
(where *this.directories* is a List<Directory> containing all my index
directories).
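The searches themselves go through a single IndexSearcher built on top of
this reader; simplified, it looks more or less like the sketch below (the
query and the number of hits are just placeholders, not my exact code):

IndexSearcher searcher = new IndexSearcher(this.reader);
// example query only
Query query = new TermQuery(new Term("text", "lucene"));
Filter filter = new RangeFilter("date_doc",
        "20090101000000", "20090131235959", true, true);
TopDocs hits = searcher.search(query, filter, 100);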
*RangeFilter test*
@Test
public void testRangeFilter() throws IOException, ParseException {
    IndexManager im = SearchObjectsFactory.getIndexManager();
    IndexReader reader = im.getReader();
    long timer;
    DocIdSet docIdSet;
    Filter filter;

    logger.info("Num docs: " + reader.numDocs());

    logger.info("Before creating filter...");
    timer = System.currentTimeMillis();
    filter = new RangeFilter("date_doc", "20081001000000",
            "20090131235959", true, true);
    logger.info("After creating filter... " + (System.currentTimeMillis() - timer));

    // getDocIdSet() is called three times on the same filter to compare
    // the cost of the first "cold" call with the following ones
    logger.info("Before reading idSet...");
    timer = System.currentTimeMillis();
    docIdSet = filter.getDocIdSet(reader);
    logger.info("After reading idSet... " + ((OpenBitSet) docIdSet).cardinality()
            + " " + (System.currentTimeMillis() - timer));

    logger.info("Before reading idSet...");
    timer = System.currentTimeMillis();
    docIdSet = filter.getDocIdSet(reader);
    logger.info("After reading idSet... " + ((OpenBitSet) docIdSet).cardinality()
            + " " + (System.currentTimeMillis() - timer));

    logger.info("Before reading idSet...");
    timer = System.currentTimeMillis();
    docIdSet = filter.getDocIdSet(reader);
    logger.info("After reading idSet... " + ((OpenBitSet) docIdSet).cardinality()
            + " " + (System.currentTimeMillis() - timer));
}
*Test results* (Num docs = 2,940,738)
The three times on each line are for the 1st, 2nd and 3rd call to getDocIdSet().

*1 Original index (12 collections * 6 months = 72 indexes)*
1a Range [20090101000000 - 20090131235959] --> 379,560 docs
   2,274 ms / 1,477 ms / 1,283 ms
1b Range [20081201000000 - 20090131235959] --> 974,754 docs
   4,489 ms / 3,333 ms / 3,390 ms
1c Range [20081001000000 - 20090131235959] --> 2,197,590 docs
   8,482 ms / 7,471 ms / 7,424 ms

*2 Consolidated index (1 index)*
2a Range [20090101000000 - 20090131235959] --> 379,560 docs
   492 ms / 116 ms / 83 ms
2b Range [20081201000000 - 20090131235959] --> 974,754 docs
   640 ms / 159 ms / 138 ms
2c Range [20081001000000 - 20090131235959] --> 2,197,590 docs
   817 ms / 322 ms / 295 ms
The field on which I am applying the RangeFilter is a date field and it has
299,622 unique terms.
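In case it is useful, that count can be obtained by walking the term enum of
the field, along these lines (a rough sketch using the TermEnum API, not the
exact code I used):

// count the unique terms of the "date_doc" field
TermEnum termEnum = reader.terms(new Term("date_doc", ""));
int uniqueTerms = 0;
try {
    while (termEnum.term() != null
            && "date_doc".equals(termEnum.term().field())) {
        uniqueTerms++;
        if (!termEnum.next()) {
            break;
        }
    }
} finally {
    termEnum.close();
}
logger.info("Unique terms for date_doc: " + uniqueTerms);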
Thanks,
Raf
On Fri, Apr 10, 2009 at 7:54 PM, Michael McCandless <
[email protected]> wrote:
> <cut>
> Hmmm, interesting!
>
> Can you provide more details about your tests? EG the code fragment
> showing your query, the creation of the MultiReader, how you run the
> search, etc.?
>
> Is the field that you're applying the RangeFilter on highly unique or
> rather redundant?
>
> Mike
>