Hello,

This thread started on lucene-user, but I'm copying lucene-dev as well. Sorry about the duplicates.

--- Stefan Bergstrand <[EMAIL PROTECTED]> wrote:
> Doug Cutting <[EMAIL PROTECTED]> writes:
>
> Just noticed this problem in my program.
>
> It seems as if the analyzer passed to QueryParser.parse() is never
> passed to PrefixQuery (which is what my test case is parsed to).
>
> A quick look in QueryParser.jj confirms this:
>
>   q = new PrefixQuery(new Term(field, term.image.substring
>                                (0, term.image.length()-1)));

I thought that queries such as 'rou?d' are treated as wildcard queries by QueryParser.jj, not prefix queries, no?

In the default token definitions in QueryParser.jj I see this:

  | <PREFIXTERM: <_TERM_START_CHAR> (<_TERM_CHAR>)* "*" >
  | <WILDTERM: <_TERM_START_CHAR> (<_TERM_CHAR> | ( [ "*", "?" ] ))* >

Then further down in QueryParser.jj we have this:

  if (wildcard)
    q = new WildcardQuery(new Term(field, term.image));

So a WildcardQuery is being constructed, not a PrefixQuery, I think.

What I don't understand is why the definition of _TERM_START_CHAR looks like this:

  | <#_TERM_START_CHAR: ~[ " ", "\t", "+", "-", "!", "(", ")", ":", "^",
                           "[", "]", "\"", "{", "}", "~", "*" ] >

Maybe the name is misleading, but it seems that _TERM_START_CHAR is the set of characters that a TERM can start with, because later in QueryParser.jj TERM is defined as:

  | <TERM: <_TERM_START_CHAR> (<_TERM_CHAR>)* >

and _TERM_CHAR has this definition:

  | <#_TERM_CHAR: <_TERM_START_CHAR> >

So how can we have a "*" in _TERM_START_CHAR when terms are not allowed to start with a "*"? And if we do have "*", how come we do not have "?" as well?

Can somebody correct me wherever I have made false statements, assumptions, or conclusions?

Thanks,
Otis

> > > From: Howk, Michael [mailto:[EMAIL PROTECTED]]
> > >
> > > Also, Lucene returns the parsed version of each of our searches. When we
> > > search by rou*d, Lucene parses it as rou*d (which is what we would expect).
> > > But when we search by rou?d, Lucene parses it as "rou d". It seems to wrap
> > > the term in quotes and replace the question mark with a space. Any ideas? Or
> > > can someone give us an idea of how to understand WildcardQuery or
> > > WildcardTermEnum?
> >
> > It sounds like the problem is in the query parser. Brian?
> >
> > Doug
>
> --
> Stefan Bergstrand
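
To make the token question concrete, here is a rough sketch that models the three token definitions quoted in this message with java.util.regex and classifies a few inputs. It is only an approximation and not the generated parser: the class name TokenSketch and the classify() helper are made up for illustration, the rule "longest match wins, ties go to the earlier definition" is my understanding of how JavaCC picks a token, and the declaration order TERM, PREFIXTERM, WILDTERM is an assumption, since the message only shows that PREFIXTERM precedes WILDTERM. The sketch reads the leading "~" in _TERM_START_CHAR as set negation, so the listed characters (including "*") are the ones excluded, while "?" is not excluded.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Rough regex model of the quoted token definitions; not JavaCC itself. */
class TokenSketch {
    // _TERM_START_CHAR is a negated set (~[...]): any character EXCEPT these.
    // "*" is in the excluded list, "?" is not.
    static final String START = "[^ \\t+\\-!():^\\[\\]\"{}~*]";
    // _TERM_CHAR is quoted as being identical to _TERM_START_CHAR.
    static final String CHAR = START;

    // ASSUMPTION: TERM is declared before PREFIXTERM and WILDTERM.
    static final Map<String, Pattern> TOKENS = new LinkedHashMap<>();
    static {
        TOKENS.put("TERM", Pattern.compile(START + CHAR + "*"));
        TOKENS.put("PREFIXTERM", Pattern.compile(START + CHAR + "*\\*"));
        TOKENS.put("WILDTERM", Pattern.compile(START + "(?:" + CHAR + "|[*?])*"));
    }

    /** Longest match at the start of the input wins; ties go to the earlier token. */
    static String classify(String input) {
        String best = "NONE";
        int bestLen = 0;
        for (Map.Entry<String, Pattern> e : TOKENS.entrySet()) {
            Matcher m = e.getValue().matcher(input);
            // Strictly greater, so an equally long later token does not override.
            if (m.lookingAt() && m.end() > bestLen) {
                best = e.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (String q : new String[] {"rou?d", "rou*d", "rou*", "*round"}) {
            System.out.println(q + " -> " + classify(q));
        }
    }
}
```

Under these assumptions, rou?d comes out as a plain TERM (because "?" is an ordinary term character and TERM ties with WILDTERM on length), while rou*d comes out as WILDTERM. If the real grammar behaves the same way, that would be consistent with Michael's report that rou?d is handed to the analyzer and returned as "rou d", but I have not verified this against the generated parser.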