I think the KeywordAnalyzer bit is maybe a red herring. The problem seems to be that you can't use * within double quotes. I made some changes to my data and index to remove the space character.

You can't use a wildcard within double quotes. The Lucene query syntax grammar does not look for such things. The KeywordAnalyzer is not a red herring though...it will not work as expected. Like I said, the QueryParser breaks up a query where it sees spaces BEFORE sending it to an analyzer. This is why Erik thinks that KeywordAnalyzer is the same as WhitespaceAnalyzer...it is not...KeywordAnalyzer does not break up words on spaces...it just looks like it does when you use QueryParser, because QueryParser breaks the text up on spaces before even using an Analyzer. When you use KeywordAnalyzer for indexing it will not break on spaces...when you use it with the QueryParser, the text WILL be broken on spaces.
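To make the order of operations concrete, here is a toy model in plain Java. It does not use Lucene at all, and ParserSplitModel and chunks are names I made up for illustration. It only mimics the splitting step described above (split on spaces, keep quoted text together) and ignores field names, operators and escaping, so treat it as a sketch rather than what QueryParser actually does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, NOT Lucene code: shows that unquoted text is
// split on whitespace before any analyzer could see it, while quoted
// text is handed over as one chunk.
final class ParserSplitModel {

    private ParserSplitModel() {}

    // Returns the chunks that would each be passed to the analyzer.
    static List<String> chunks(String query) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '"') {
                // Toggle quoted mode; the quote itself is not part of the chunk.
                inQuotes = !inQuotes;
            } else if (Character.isWhitespace(c) && !inQuotes) {
                // Whitespace outside quotes ends the current chunk.
                if (current.length() > 0) {
                    out.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }
}
```

With this model, chunks("puid one two") gives three chunks, so even an analyzer that never splits on spaces only ever sees one word at a time. chunks("\"puid one\" two") gives two chunks, the first being the whole quoted text.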
If I feed 54:puid* to my code it generates a PrefixQuery and works as required:
Search Query Is54:puid*
Parsed Search Query Is54:puid*of type:class org.apache.lucene.search.PrefixQuery

but with the quotes (which I would need if my value contained spaces) I only get a TermQuery (which doesn't handle wildcards):
Search Query Is54:"puid*"
Parsed Search Query Is54:puid*of type:class org.apache.lucene.search.TermQuery
I explained this, though apparently not well. QueryParser feeds chunks of text to the analyzer. If there are spaces, each chunk is fed to the analyzer one at a time and a TermQuery is generated for each chunk. If you have something in quotes, the whole piece is fed to the analyzer...if the analyzer generates more than one token, a PhraseQuery is generated...if only ONE token is generated (as is the case when you feed anything within quotes to a KeywordAnalyzer), then a TermQuery is generated. The code does not look for any wildcards at that point. Look at the QueryParser code, getFieldQuery in particular. One option you have is to override getFieldQuery and, before generating a TermQuery, scan for wildcards; if you find one, return a WildcardQuery instead of a TermQuery.
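A minimal sketch of the wildcard scan, in plain Java with no Lucene dependency so it can be tried on its own. QuotedTextClassifier, Kind and classify are hypothetical names, not Lucene API. I am also assuming that a backslash escapes the following character and that a single trailing * should be treated as a prefix search; adjust both to taste.

```java
// Hypothetical helper, NOT part of Lucene: decides which kind of query a
// chunk of quoted text should become once a single-token analyzer (such
// as KeywordAnalyzer) has handed it back unchanged.
final class QuotedTextClassifier {

    enum Kind { TERM, PREFIX, WILDCARD }

    private QuotedTextClassifier() {}

    static Kind classify(String text) {
        int firstWildcard = -1;
        int wildcardCount = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\') {
                // Assumption: backslash escapes the next character, so skip it.
                i++;
                continue;
            }
            if (c == '*' || c == '?') {
                if (firstWildcard < 0) {
                    firstWildcard = i;
                }
                wildcardCount++;
            }
        }
        if (wildcardCount == 0) {
            return Kind.TERM;
        }
        // Exactly one wildcard, it is '*', it is the last character, and
        // there is at least one character in front of it.
        boolean singleTrailingStar = wildcardCount == 1
                && firstWildcard == text.length() - 1
                && text.charAt(firstWildcard) == '*'
                && text.length() > 1;
        return singleTrailingStar ? Kind.PREFIX : Kind.WILDCARD;
    }
}
```

In a QueryParser subclass, the overridden method would call something like classify(queryText) and build a PrefixQuery or WildcardQuery from a Term for the PREFIX and WILDCARD cases, and fall back to the superclass for TERM. I have not compiled that part against any particular Lucene version, so check the method signature in the version you are using.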

I realize I have a weak grasp of my own language and can be a lot less than clear...but the information you need has certainly been given in my emails. Perhaps someone else could be a little clearer than I am able to be.

- Mark

So why is this?

thanks Paul

