[ http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12451730 ] wolfgang hoschek commented on LUCENE-550: -----------------------------------------
What's the benchmark configuration? For example, is throughput bounded by indexing or querying? Measuring N queries against a single preindexed document vs. 1 precompiled query against N documents? See the line boolean measureIndexing = false; // toggle this to measure query performance in my driver. If measuring indexing, what kind of analyzer / token filter chain is used? If measuring queries, what kind of query types are in the mix, with which relative frequencies? You may want to experiment with modifying/commenting/uncommenting various parts of the driver setup, for any given target scenario. Would it be possible to post the benchmark code, test data, queries for analysis? > InstanciatedIndex - faster but memory consuming index > ----------------------------------------------------- > > Key: LUCENE-550 > URL: http://issues.apache.org/jira/browse/LUCENE-550 > Project: Lucene - Java > Issue Type: New Feature > Components: Store > Affects Versions: 1.9 > Reporter: Karl Wettin > Attachments: class_diagram.png, class_diagram.png, > instanciated_20060527.tar, InstanciatedIndexTermEnum.java, > lucene.1.9-karl1.jpg, lucene2-karl_20060722.tar.gz, > lucene2-karl_20060723.tar.gz > > > After fixing the bugs, it's now 4.5 -> 5 times the speed. This is true for > both at index and query time. Sorry if I got your hopes up too much. There > are still things to be done though. Might not have time to do anything with > this until next month, so here is the code if anyone wants a peek. > Not good enough for Jira yet, but if someone wants to fool around with it, > here it is. The implementation passes a TermEnum -> TermDocs -> Fields -> > TermVector comparation against the same data in a Directory. > When it comes to features, offsets don't exists and positions are stored ugly > and has bugs. > You might notice that norms are float[] and not byte[]. That is me who > refactored it to see if it would do any good. Bit shifting don't take many > ticks, so I might just revert that. > I belive the code is quite self explaining. > InstanciatedIndex ii = .. > ii.new InstanciatedIndexReader(); > ii.addDocument(s).. replace IndexWriter for now. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]