Hi Colin,

Is it possible that you are using an analyzer that breaks words on
non-letters, for instance SimpleAnalyzer? If so, the doc text:
   pagefile.sys
is indexed as two words:
  pagefile sys
At search time, the query text:
  pagefile.sys
is also tokenized by the query parser into a two-word query:
  pagefile sys
but the query text:
  pagefile.sys*
is not analyzed (by design) and matches only words that start with:
  pagefile.sys
But there are no such words in the index, because at indexing time the
text was broken into words on non-letters.
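
To make the mismatch concrete, here is a minimal plain-Java sketch. It
does not use Lucene at all: LetterTokenDemo, tokenize and prefixMatches
are hypothetical names that only imitate letter-based tokenization and
an unanalyzed prefix lookup, so treat it as an illustration of the idea
rather than of Lucene's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of letter-based tokenization; not Lucene's own code.
class LetterTokenDemo {

    // Split text into lowercase runs of letters, dropping everything else.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // The raw prefix is compared against the indexed terms without analysis.
    static boolean prefixMatches(List<String> indexedTerms, String rawPrefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(rawPrefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("pagefile.sys");
        System.out.println(indexed);                                 // [pagefile, sys]
        System.out.println(prefixMatches(indexed, "pagefile.sys"));  // false
        System.out.println(prefixMatches(indexed, "pagefile"));      // true
    }
}
```

The prefix "pagefile.sys" finds nothing because no indexed term contains
a period, while the prefix "pagefile" still matches.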

Hopefully this gets you started... If this is the reason, you may want to
use a different analyzer (See Wiki page "AnalysisParalysis").
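
As a rough sketch of what a different analyzer could buy you, the
plain-Java snippet below splits on whitespace only, so the file name
stays a single term. Again this is not Lucene code: WholeTermDemo,
tokenize and wildcardMatches are made-up names, and the wildcard
matching is a simple regex stand-in for a term-level wildcard query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Plain-Java imitation of whitespace-only tokenization; not Lucene's own code.
class WholeTermDemo {

    // Split on whitespace only, so "pagefile.sys" survives as one term.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Match a pattern against whole terms; '*' stands for any run of characters.
    static boolean wildcardMatches(List<String> terms, String pattern) {
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i])); // everything else is literal
        }
        Pattern compiled = Pattern.compile(regex.toString());
        for (String term : terms) {
            if (compiled.matcher(term).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("pagefile.sys");
        System.out.println(wildcardMatches(indexed, "pagefile.*"));   // true
        System.out.println(wildcardMatches(indexed, "pagefile*sys")); // true
        System.out.println(wildcardMatches(indexed, "pagefile"));     // false
    }
}
```

Note the trade-off in the last line: with the name kept whole, a plain
search for "pagefile" no longer matches, so which tokenization is right
depends on the searches you need to support.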

Otherwise, make sure you use the same analyzer at indexing and at search
time, and see the Lucene FAQ entry "Why am I getting no hits / incorrect hits?".

If all this still fails, try posting a simple code snippet here showing how
you index and how you search.

Regards,
Doron

"McGuigan, Colin" <[EMAIL PROTECTED]> wrote on 09/03/2007 13:10:49:

> (Lucene 1.9.1)
>
>
>
> I have a "filename" field in Lucene that holds a value, like this:
> pagefile.sys
>
>
>
> If I run searches through QueryParser, and I do a search for:
>
>
>
> pagefile.sys
>
> pagefile
>
> pagefile.
>
>
>
> This all works because it goes through getFieldQuery, which tokenizes
> the string and generates a PhraseQuery out of it.
>
>
>
> But if I search for this:
>
>
>
> pagefile.*
>
>
>
> It doesn't work, because it goes through PrefixQuery, and PrefixQuery
> looks for terms that start with "pagefile.", but no terms will start
> with "pagefile.", because periods are not tokenized.  Similarly,
> searching for:
>
>
>
> pagefile*sys
>
>
>
> Doesn't work, because it goes through WildcardQuery, and WildcardQuery
> is set up to only match terms as well, and no term starts with
> "pagefile" and ends with "sys".
>
>
>
> I've done a lot of googling on this, but I can't find a good answer for
> what I should do.  I'm playing around with removing QueryParser entirely
> and generating a MultiPhraseQuery, but want to make sure I'm not
> reinventing an already invented wheel.
>
>
>
> --Colin McGuigan
>


