Hi,

my guess would also be that the StandardAnalyzer lowercases your terms while
you have indexed them as they are without lowercasing.
One idea would be to use the PerFieldAnalyzerWrapper and map a
KeywordAnalyzer (which basically doesn't tokenise your stream at all) to any
fields you want not analyzed.

Remember that when indexing you don't need to specify
Field.Index.NOT_ANALYZED to those fields anymore and that you need to
specify the same analyzer to the QueryParser when searching.


savvas.

2009/12/14 Michel Nadeau <aka...@gmail.com>

> Hi !
>
> My Lucene 3.0.0 index contains a field "DOMAIN" that contains an Internet
> domain name - like
>
> * www.DomainName.com
> * www.domainname.com
> * www.DomainName.com/path/to/document/doc.html?a=2
>
> This field is indexed like this -
>
> doc.add(new Field("DOMAIN", sValue, Field.Store.YES,
> Field.Index.NOT_ANALYZED));
>
> When I search in this field, my search query looks like this:
>
> DOMAIN:www.DomainName*
>
> My problem is that it seems it never returns domains with uppercase
> letters.
>
> For example, I display all documents (using ConstantScoreQuery), and see
> this domain name: www.BidClerk.com
> ...So I know it's there - and so I search for: DOMAIN:www.BidC* - well it
> will *never* be found !
>
> But whatever all-lowecase domain will be found, all the time.
>
> My guess is that the problem is the analyzer I'm using - a StandadAnalyzer:
>
> QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content", new
> StandardAnalyzer(Version.LUCENE_CURRENT));
> q = parser.parse(QUERY);
>
> So here are my questions:
> * Should I use a KeywordAnalyzer instead?
> * If I have domains like WWW.ASK.COM, www.ask.com, www.Ask.com,
> WwW.AsK.CoM- and I search for "DOMAIN:
> www.ask.com" ; will they all be found whatever the case?
>
> Thanks!
>
> - Mike
> aka...@gmail.com
>

Reply via email to