Re: Case insensitive StringField?

Jack Krupansky Tue, 21 May 2013 20:07:56 -0700

Yes, it is. It always will. But... you can escape the spaces with a backslash:


Query q = qp.parse("new\\ york");
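When the query text comes from user input rather than a literal, the escaping can be done programmatically. The following is a minimal sketch in plain Java, not part of Lucene: the class name WhitespaceEscaper and the method escapeWhitespace are hypothetical helpers. It assumes the classic QueryParser treats a backslash-escaped whitespace character as part of the term, as the example above suggests.

```java
// Hypothetical helper (not a Lucene class): prefixes every whitespace
// character with a backslash so the query parser does not split on it.
final class WhitespaceEscaper {

    static String escapeWhitespace(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (Character.isWhitespace(c)) {
                sb.append('\\'); // a single backslash in the resulting string
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The result would then be passed to qp.parse(...).
        System.out.println(escapeWhitespace("new york")); // prints new\ york
    }
}
```

This sketch handles whitespace only. If the input may also contain query syntax characters such as ':' or '*', applying QueryParser.escape(...) first and this helper second is one way to cover both.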


-- Jack Krupansky

-----Original Message-----
From: Shahak Nagiel

Sent: Tuesday, May 21, 2013 10:09 PM
To: [email protected]
Subject: Re: Case insensitive StringField?

Jack / Michael: Thanks, but the query parser still seems to be tokenizing the query?


public class StringPhraseAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Emit the whole field value as a single token.
        Tokenizer tok = new KeywordTokenizer(reader);
        // Lower-case the token, then trim surrounding whitespace.
        TokenFilter filter = new LowerCaseFilter(Version.LUCENE_41, tok);
        filter = new TrimFilter(filter, true);
        return new TokenStreamComponents(tok, filter);
    }
}

...

Analyzer analyzer = new StringPhraseAnalyzer();

// using this analyzer, add document to index with city TextField (value "NEW YORK")


QueryParser qp = new QueryParser(Version.LUCENE_41, "city", analyzer);

Query q = qp.parse("new york");
System.out.println ("Query: " + q);


results in...
Query: city:new city:york    // I expected "city:new york"

...and no matches. Is a QueryParser the wrong way to generate the query for this type of analyzer?

Thanks again!

________________________________
From: Jack Krupansky <[email protected]>
To: [email protected]
Sent: Tuesday, May 21, 2013 10:22 AM
Subject: Re: Case insensitive StringField?

To be clear, analysis is not supported on StringField (or any non-tokenized
field). But the good news is that by using the keyword tokenizer
(KeywordTokenizer) on a TextField, you can get the same effect.

That will preserve the entire input as a single token. You may want to
include filters to trim exterior white space and normalize interior white
space.
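To make the intended effect concrete, here is a rough plain-Java illustration of the normalization described above: trim exterior white space, collapse interior white space, and lower-case. It is not Lucene's filter API; the class name KeyNormalizer and the method normalize are hypothetical, and in an actual index the same steps would be carried out by filters in the analyzer chain.

```java
// Hypothetical illustration (not a Lucene class) of what the single
// indexed token should look like after analysis.
final class KeyNormalizer {

    static String normalize(String raw) {
        return raw.trim()                               // drop exterior white space
                  .replaceAll("\\s+", " ")              // collapse interior runs to one space
                  .toLowerCase(java.util.Locale.ROOT);  // locale-independent lower-casing
    }

    public static void main(String[] args) {
        System.out.println(normalize("  NEW   York ")); // prints new york
    }
}
```

With this normalization applied on both the index side and the query side, "NEW YORK", "new  york" and " New York " would all map to the same single token.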

-- Jack Krupansky

-----Original Message-----
From: Shahak Nagiel

Sent: Tuesday, May 21, 2013 10:06 AM
To: [email protected]
Subject: Case insensitive StringField?

It appears that StringField instances are treated as literals, even though
my analyzer lower-cases (on both write and read sides).  So, for example, I
can match with a term query (e.g. "NEW YORK"), but only if the case matches.
If I use a QueryParser (or MultiFieldQueryParser), it never works because
these query values are lowercased and don't match.

I've found that using a TextField instead works, presumably because it's
tokenized and processed correctly by the write analyzer.  However, I would
prefer that queries match against the entire/exact phrase ("NEW YORK"),
rather than among the tokens ("NEW" or "YORK").

What's the solution here?

Thanks in advance.

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]

For additional commands, e-mail: [email protected]


