Re: [Performance] Streaming main memory indexing of single strings

Wolfgang Hoschek Wed, 20 Apr 2005 11:26:57 -0700

On Apr 20, 2005, at 9:22 AM, Erik Hatcher wrote:

On Apr 20, 2005, at 12:11 PM, Wolfgang Hoschek wrote:
By the way, by now I have a version against 1.4.3 that is 10-100 times faster (i.e. 30000 - 200000 index+query steps/sec) than the simplistic RAMDirectory approach, depending on the nature of the input data and query. From some preliminary testing it returns exactly what RAMDirectory returns.
Awesome. Using the basic StringIndexReader I sent?


Yep, it's loosely based on the empty skeleton you sent.

I've been fiddling with it a bit more to get other query types. I'll add it to the contrib area when it's a bit more robust.

Perhaps we could merge up once I'm ready and put that into the contrib area? My version now supports tokenization with any analyzer and it supports any arbitrary Lucene query. I might make the API for adding terms a little more general, perhaps allowing arbitrary Document objects if that's what other folks really need...
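
For readers following the thread, here is a minimal sketch of what indexing a single string entirely in main memory can look like. It is not the code discussed in this message and does not use any Lucene API: the class name SingleStringIndexSketch, its methods, and the crude lowercase-and-split tokenizer (standing in for a real analyzer, and treating non-ASCII letters as separators) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: a tiny in-memory index over one string.
final class SingleStringIndexSketch {
    // Maps each term to the token positions at which it occurs.
    private final Map<String, List<Integer>> positions = new HashMap<>();

    SingleStringIndexSketch(String text) {
        // Stand-in for an analyzer: lowercase, then split on non-word characters.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("\\W+");
        int pos = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() yields a leading empty token if the text starts with a separator
            }
            positions.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            pos++;
        }
    }

    // Equivalent in spirit to a single-term query: does the term occur at all?
    boolean containsTerm(String term) {
        return positions.containsKey(term.toLowerCase(Locale.ROOT));
    }

    // Number of occurrences of the term in the indexed string.
    int termFrequency(String term) {
        List<Integer> hits = positions.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? 0 : hits.size();
    }

    public static void main(String[] args) {
        SingleStringIndexSketch index =
                new SingleStringIndexSketch("The quick brown fox jumps over the lazy dog");
        System.out.println(index.containsTerm("fox"));
        System.out.println(index.termFrequency("the"));
    }
}
```

Because the whole structure is a hash map built in one pass over the tokens, nothing is written to a directory abstraction, and the index can be thrown away after a single query.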

As an aside, is there any work going on to potentially support prefix (and infix) wildcard queries à la "*fish"?
WildcardQuery supports wildcard characters anywhere in the string. QueryParser itself restricts expressions that have leading wildcards from being accepted.

Any particular reason for this restriction? Is this simply a current parser limitation or something inherent?

QueryParser supports wildcard characters in the middle of strings no problem though. Are you seeing otherwise?


I meant an infix query such as "*fish*".
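
To make the patterns in question concrete, here is a small plain-Java sketch of wildcard matching against a single term, where '*' matches any run of characters (including none) and '?' matches exactly one. It is not Lucene's WildcardQuery implementation; the class name WildcardSketch and its matches method are hypothetical and exist only for this illustration.

```java
// Hypothetical illustration only: glob-style matching of one pattern against one term.
final class WildcardSketch {

    // Returns true if the whole term matches the pattern.
    static boolean matches(String pattern, String term) {
        int p = 0;          // current index into pattern
        int t = 0;          // current index into term
        int starP = -1;     // index of the most recent '*' in pattern, or -1
        int starT = -1;     // term index that '*' is currently assumed to extend to

        while (t < term.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                // Remember the star and first try matching it against nothing.
                starP = p;
                starT = t;
                p++;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == term.charAt(t))) {
                // Single-character match.
                p++;
                t++;
            } else if (starP >= 0) {
                // Mismatch: let the last '*' absorb one more character and retry.
                starT++;
                t = starT;
                p = starP + 1;
            } else {
                return false;
            }
        }
        // Any trailing '*' in the pattern can match the empty string.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    public static void main(String[] args) {
        System.out.println(matches("*fish", "swordfish"));     // leading wildcard
        System.out.println(matches("*fish*", "selfishness"));  // infix pattern
        System.out.println(matches("fish*", "swordfish"));     // trailing wildcard
    }
}
```

In this sketch a leading or infix pattern costs no more than a trailing one, because only one term is examined; whether the same holds across a full term dictionary is the open question above.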

Wolfgang.


-----------------------------------------------------------------------
Wolfgang Hoschek                  |   email: [EMAIL PROTECTED]
Distributed Systems Department    |   phone: (415)-533-7610
Berkeley Laboratory               |   http://dsd.lbl.gov/~hoschek/
-----------------------------------------------------------------------


