I bet it's that underscore in m_sz. Different analyzers do different things with different punctuation characters. I can never remember which does exactly what - it'll be in the javadocs or Lucene In Action or somewhere on the web.
You can check what exactly has been indexed by using Luke - always a good idea. In this case you probably want an analyzer that just downcases the tag names. Easy enough to build your own if there isn't an existing one that meets your needs. -- Ian. On Mon, May 24, 2010 at 1:14 PM, Shlomy Reinstein <srein...@gmail.com> wrote: > Hi, > > Thanks. By "strict prefix", I meant a prefix of the name > (case-insensitive). What you suggest ("tagname: updateC*") was the > first thing I tried, but it happens to work only partially. In my > case, I have a lot of names beginning with "m_sz<something>", e.g. > "m_szComment", "m_szName". Trying a query like "tagname: m_szC*" will > not find the "m_szComment" value. I just wrote the wrong example here. > > Shlomy > > On Mon, May 24, 2010 at 11:32 AM, Ian Lea <ian....@gmail.com> wrote: >> StandardAnalyzer should work fine, mark the field as indexed, no need >> to store it unless you want to retrieve it for display. >> >> Query via QueryParser using "tagname: updateC*" or programatically via >> PrefixQuery. >> >> >> Although I'm not sure exactly what you mean by "strict prefix". If >> you mean that the prefix should match exactly on case, punctuation >> etc. maybe use KeywordAnalyzer instead. >> >> >> >> -- >> Ian. >> >> >> On Sun, May 23, 2010 at 3:03 PM, Shlomy Reinstein <srein...@gmail.com> wrote: >>> Hi, >>> >>> I have a Lucene index that contains source code tags (a tag can be any >>> named source code element - function, class, variable). Each document >>> contains a field with the tag name and some additional information. >>> I'd like to be able to perform strict prefix queries. E.g. if I have a >>> tag named "updateChildren", I'd like to be able to write a query like >>> "updateC*" and get the list of tags that have this prefix >>> (updateChildren being one of them). >>> >>> Is there a way to do this? Please suggest how I should store the field >>> for this purpose - which analyzer to use, whether to keep it stored, >>> etc. >>> >>> Thanks, >>> Shlomy >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>> >>> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org