[ http://issues.apache.org/jira/browse/LUCENE-550?page=all ]
Karl Wettin updated LUCENE-550:
-------------------------------
Attachment: src_20060509.tar.gz
Some new statistics.
* A corpus of 500 documents, 1-5K text per document.
* Ran 150 000 term and boolean queries.
* Retrieved the top hits (fewer than 100) from each result.
The query alone is about 5x faster,
and about 9x faster if you include the hits collection.
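For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a harness that times the query alone and the query plus hit collection separately. It is not the code used for the numbers above. The class name, the `Function<String, int[]>` searcher (a hypothetical stand-in that returns matching document ids, best first) and the method names are all assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Function;

public class QueryTimingSketch {

    /**
     * Runs every query once and returns the elapsed time in nanoseconds.
     * The searcher is a hypothetical stand-in: it maps a query string to
     * the matching document ids, best first.
     * If collectHits is true, up to maxHits ids are also read per result.
     */
    public static long timeNanos(List<String> queries,
                                 Function<String, int[]> searcher,
                                 boolean collectHits, int maxHits) {
        long sink = 0; // consumed below so the JIT cannot discard the work
        long start = System.nanoTime();
        for (String query : queries) {
            int[] docs = searcher.apply(query);
            sink += docs.length;
            if (collectHits) {
                int n = Math.min(maxHits, docs.length);
                for (int i = 0; i < n; i++) {
                    sink += docs[i];
                }
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never true, keeps sink alive
        }
        return elapsed;
    }

    /** Speedup factor of the candidate relative to the baseline. */
    public static double speedup(long baselineNanos, long candidateNanos) {
        return (double) baselineNanos / candidateNanos;
    }
}
```

Calling `timeNanos` once with `collectHits` set to false and once with it set to true, for each index implementation, gives the two ratios to compare. A real measurement should also warm up the JVM before timing.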
I believe that span queries will be about 10x-20x faster, as skipTo() is
heavily optimized. There is a bug in my term position code, so I have not
been able to measure it for real yet.
I hope to have that working, and an updated class diagram for you, soon.
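To illustrate why skipTo() can be cheap when the postings are held in memory, here is a minimal sketch that keeps one term's document ids in a sorted int[] and skips with a binary search. This is an assumption about how such a skip might be done, not the code in the attachment. The class name is hypothetical, and the semantics follow the usual TermDocs contract: move to the first entry beyond the current one whose document number is greater than or equal to the target.

```java
import java.util.Arrays;

/** Iterates over one term's postings held as a sorted array of document ids. */
public class ArrayTermDocsSketch {
    private final int[] docs; // sorted ascending, no duplicates
    private int pos = -1;     // index of the current entry, -1 before the first call

    public ArrayTermDocsSketch(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    /** Moves to the next entry; returns false when the postings are exhausted. */
    public boolean next() {
        if (pos < docs.length) {
            pos++;
        }
        return pos < docs.length;
    }

    /** Moves to the first entry beyond the current one with id >= target. */
    public boolean skipTo(int target) {
        int from = pos + 1;
        if (from >= docs.length) {
            pos = docs.length;
            return false;
        }
        int i = Arrays.binarySearch(docs, from, docs.length, target);
        pos = (i >= 0) ? i : -i - 1; // a miss encodes the insertion point
        return pos < docs.length;
    }

    /** Current document id; only valid after next() or skipTo() returned true. */
    public int doc() {
        return docs[pos];
    }

    public static void main(String[] args) {
        ArrayTermDocsSketch td = new ArrayTermDocsSketch(new int[] {2, 5, 9, 14});
        if (td.skipTo(6)) {
            System.out.println(td.doc()); // prints 9
        }
    }
}
```

The skip costs O(log n) array probes with no I/O and no decoding of compressed deltas, which is the kind of saving an in-memory index can offer over a Directory-based one.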
> InstanciatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
> Key: LUCENE-550
> URL: http://issues.apache.org/jira/browse/LUCENE-550
> Project: Lucene - Java
> Type: New Feature
> Components: Store
> Versions: 1.9
> Reporter: Karl Wettin
> Attachments: Document.java, InstanciatedIndex.java, Term.java,
> class_diagram.png, class_diagram.png, src.tar.gz, src_20060509.tar.gz
>
> After fixing the bugs, it is now 4.5 to 5 times the speed. This is true at
> both index and query time. Sorry if I got your hopes up too much. There
> are still things to be done, though. I might not have time to do anything with
> this until next month, so here is the code if anyone wants a peek.
> Not good enough for Jira yet, but if someone wants to fool around with it,
> here it is. The implementation passes a TermEnum -> TermDocs -> Fields ->
> TermVector comparison against the same data in a Directory.
> When it comes to features, offsets don't exist yet, and positions are stored
> in an ugly way and have bugs.
> You might notice that norms are float[] and not byte[]. I refactored that
> to see if it would do any good. Bit shifting doesn't take many
> ticks, so I might just revert that.
> I believe the code is quite self-explanatory.
> InstanciatedIndex ii = ..
> ii.new InstanciatedIndexReader();
> ii.addDocument(s).. // replaces IndexWriter for now
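On the float[] versus byte[] norms question in the description: a byte[] needs a quarter of the memory of a float[], and decoding a byte norm is only a few shifts, or a single lookup in a 256-entry table. The sketch below shows both, using a one-byte float layout with 3 mantissa bits and 5 exponent bits. The class name and the exact constants are assumptions for illustration, not taken from the attached source.

```java
public class NormDecodeSketch {

    /** All 256 possible byte norms decoded once, so a lookup is one array index. */
    static final float[] NORM_TABLE = new float[256];

    static {
        for (int i = 0; i < 256; i++) {
            NORM_TABLE[i] = byteToFloat((byte) i);
        }
    }

    /** Decodes a byte with 5 exponent bits (high) and 3 mantissa bits (low). */
    static float byteToFloat(byte b) {
        if (b == 0) {
            return 0.0f; // zero is a special case
        }
        int mantissa = b & 7;
        int exponent = (b >> 3) & 31;
        // Assumed bias: exponent 15 corresponds to values in [0.5, 2).
        int bits = ((exponent + (63 - 15)) << 24) | (mantissa << 21);
        return Float.intBitsToFloat(bits);
    }

    /** Table-based decode, as it would be used per hit at query time. */
    static float decodeNorm(byte b) {
        return NORM_TABLE[b & 0xff];
    }

    public static void main(String[] args) {
        System.out.println(decodeNorm((byte) 124)); // prints 1.0
    }
}
```

With the table in place, the per-hit cost of byte[] norms is one masked array index, which supports the idea that reverting to byte[] would cost little CPU.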