On Monday 03 February 2003 07:19, Terry Steichen wrote:
> I believe that the tokenizer treats a dash as a token separator. Hence,
> the only way, as I recall, to eliminate this behavior is to modify
> QueryParser.jj so it doesn't do this. However, doing this can cause some
> other problems, like hyphenated words at a line break and the like.
It might be enough to just replace the analyzer passed in to QueryParser?
That works if QueryParser itself only handles the modifiers outside terms
and passes the terms themselves to the analyzer. I think this is the case:
QueryParser does call the analyzer in a couple of places, and one word may
actually expand to a phrase or vice versa.

Still, using a hyphen as a separator shouldn't necessarily cause big
problems as long as the indexer does the same. A query for "2 - 5" would
become a phrase query for "2 5", which is still reasonably specific (and
should match the content).

On the other hand, the simple analyzer and the standard analyzer have quite
different tokenization rules, so it's important to make sure the same
analyzer is used for both indexing and searching; a mismatch can easily
prevent matches.

-+ Tatu +-
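
P.S. A rough sketch of the analyzer-mismatch point, in plain Python rather
than Lucene code. The two tokenizer functions below are hypothetical
stand-ins (one splits on every non-alphanumeric character, the other only
on whitespace); they are not Lucene's SimpleAnalyzer or StandardAnalyzer,
and phrase_matches is only a toy substitute for a phrase query.

```python
import re


def split_on_non_alnum(text):
    """Hypothetical analyzer: lowercase, then treat every character that is
    not a digit or an ASCII letter (hyphens included) as a separator."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def split_on_whitespace(text):
    """Hypothetical analyzer: lowercase, then split on whitespace only,
    so a hyphen survives as a token (or as part of one)."""
    return text.lower().split()


def phrase_matches(indexed_tokens, query_tokens):
    """Toy phrase query: True if query_tokens appear as a contiguous run."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        indexed_tokens[i:i + n] == query_tokens
        for i in range(len(indexed_tokens) - n + 1)
    )


doc = "Rated for ages 2 - 5 only"
query = "2 - 5"

# Same analyzer at index time and query time: the hyphen disappears on both
# sides, and the query becomes the phrase ["2", "5"].
same = phrase_matches(split_on_non_alnum(doc), split_on_non_alnum(query))

# Different analyzers: the query keeps "-" as a token that was never indexed.
mixed = phrase_matches(split_on_non_alnum(doc), split_on_whitespace(query))

print(same, mixed)  # prints: True False
```

With the same tokenizer on both sides the hyphen is harmless; with
different ones the query carries a "-" token that the index never saw, so
nothing matches.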