You can apply the lower case filter to the whitespace or other analyzer and use that as the analyzer.

-- Jack Krupansky
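
A minimal plain-Java sketch of the idea above, not Lucene code: split on whitespace only, then lower-case every token, and apply the same normalization to the indexed text and to the search term. The class and method names (`LowercaseWhitespaceDemo`, `tokenize`, `matches`) are made up for illustration. In Lucene itself this would presumably correspond to an analyzer that wraps a `WhitespaceTokenizer` in a `LowerCaseFilter`, used both for indexing and in the `QueryParser`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LowercaseWhitespaceDemo {

    // Split on runs of whitespace only, then lower-case each token.
    // Punctuation such as '/' and '.' is left untouched.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // A single search term matches when, after the SAME normalization,
    // it equals one of the indexed tokens.
    static boolean matches(String indexedText, String queryTerm) {
        List<String> query = tokenize(queryTerm);
        return query.size() == 1 && tokenize(indexedText).contains(query.get(0));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("JavaOne 2012/0.124.323")); // [javaone, 2012/0.124.323]
        System.out.println(matches("JavaOne", "javaone"));      // true
        System.out.println(matches("JavaOne", "javaOne"));      // true
    }
}
```

The point of the sketch is that lower-casing only at index time is not enough: the search side has to go through the same steps, which is what using one analyzer for both gives you.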

-----Original Message----- From: Jochen Hebbrecht
Sent: Monday, October 01, 2012 10:34 AM
To: java-user@lucene.apache.org
Subject: Re: Searching for a search string containing a literal slash doesn't work with QueryParser

Hi Jack,

I tried analyzing through WhitespaceAnalyzer. Now I can search on my query
string AND I can find my document! Great!
But all my searches are now case sensitive. So when I index a field as
"JavaOne", I have to search for exactly "JavaOne" and not
"javaone" or "javaOne".

What is the proper way to solve this? Converting all characters with
toLowerCase() when indexing them?

Jochen


2012/10/1 Jack Krupansky <j...@basetechnology.com>

That's "The escape merely..."

-- Jack Krupansky

-----Original Message----- From: Jack Krupansky
Sent: Monday, October 01, 2012 9:58 AM
To: java-user@lucene.apache.org
Subject: Re: Searching for a search string containing a literal slash
doesn't work with QueryParser


The scape merely assures that the slash will not be parsed as query syntax
and will be passed directly to the analyzer, but the standard analyzer will
in fact always remove it. Maybe you want the white space analyzer or keyword
analyzer (no characters removed).

-- Jack Krupansky
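
To illustrate why whitespace-only tokenization helps here, the following plain-Java sketch (not Lucene code; `SlashTokenDemo`, `whitespaceTokens` and `prefixMatches` are hypothetical names) tokenizes the TEXT values from the original question on whitespace alone. The slash stays inside the token, so both the full value and a prefix such as "2012/0." can still be matched.

```java
import java.util.ArrayList;
import java.util.List;

class SlashTokenDemo {

    // Whitespace-only tokenization: '/' and '.' remain part of the token.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // Tokens starting with the given prefix, which is what a
    // trailing-wildcard search such as 2012/0.* asks for.
    static List<String> prefixMatches(List<String> tokens, String prefix) {
        List<String> hits = new ArrayList<String>();
        for (String token : tokens) {
            if (token.startsWith(prefix)) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> tokens =
                whitespaceTokens("Vervaldag 17/07/12 09/07/12 2012/0.124.323 Kapitaals");
        System.out.println(tokens.contains("2012/0.124.323")); // true
        System.out.println(prefixMatches(tokens, "2012/0."));  // [2012/0.124.323]
    }
}
```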

-----Original Message----- From: Jochen Hebbrecht
Sent: Monday, October 01, 2012 8:59 AM
To: java-user@lucene.apache.org
Subject: Searching for a search string containing a literal slash doesn't
work with QueryParser

Hi,

I'm currently trying to search on the following search string in my Lucene
index: "2012/0.124.323".
The Java code I use to search ('value' is my search string):

----
QueryParser queryParser = new QueryParser(Version.LUCENE_36, field,
        new StandardAnalyzer(Version.LUCENE_36));
queryParser.setAllowLeadingWildcard(true);
return queryParser.parse(value);
----

This returns the query: "2012" "0.124.323". QueryParser is replacing
the forward slash with a space.
I tried escaping the "/" with a backslash "\", but this doesn't work
either.

Some context that may help to fully understand my scenario: I have the
following XML to import:

---
...
<TEXT l="963" t="826" r="1391" b="870">Vervaldag </TEXT>
<TEXT l="963" t="826" r="1391" b="870">17/07/12</TEXT>
<TEXT l="2100" t="833" r="2275" b="871">09/07/12</TEXT>
<TEXT l="42" t="871" r="338" b="907">2012/0.124.323</TEXT>
<TEXT l="1478" t="938" r="1673" b="978">Kapitaals</TEXT>
...
---

I get all TEXT values with an XPath expression and I index them as:

---
XPathExpression expr = xpath.compile("//TEXT");
Object result = expr.evaluate(document, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
   doc.add(new org.apache.lucene.document.Field("IMAGE",
         nodes.item(i).getFirstChild().getNodeValue(), Store.NO,
         Index.ANALYZED));
}
---

I'm using the StandardAnalyzer.

What is the best way to solve my issue? Do I need to switch to a different
Analyzer? Do I have to use something other than QueryParser? ...
I also want to support searching on 2012/0.*, so I cannot use only
TermQuery ...

Kind regards,
Jochen


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org





