[
https://issues.apache.org/jira/browse/LUCENE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475513
]
Dilip Nimkar commented on LUCENE-800:
-------------------------------------
In my test code, I took care of the difference between \ as the Java escape
character and \ as the Lucene escape character.
    System.out.println(new QueryParser("_default_", analyzer).parse(
        "item:\\\\")); // note the 4 backslashes
should print item:\\ on the console, but it prints item:\ instead.
The same is true of the second string in the test code.
In general, the boolean test
    str.equals(new QueryParser("_default_", analyzer).parse(str).toString())
should always evaluate to true if the analyzer does not change the string.
But in our case it evaluates to false.
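
The round-trip test above can be sketched as a small harness that needs no
Lucene classes. This is only an illustration: RoundTripCheck and its failures
method are hypothetical names, and the parseAndPrint argument stands in for
"parse the string, then call toString() on the resulting Query".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Lucene.
final class RoundTripCheck {

    // Returns every input for which parseAndPrint(input) differs from input.
    static List<String> failures(List<String> inputs,
                                 UnaryOperator<String> parseAndPrint) {
        List<String> failed = new ArrayList<>();
        for (String in : inputs) {
            if (!in.equals(parseAndPrint.apply(in))) {
                failed.add(in);
            }
        }
        return failed;
    }
}
```

In a real test the second argument would wrap the QueryParser call, with a
ParseException treated as a failure as well.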
The behavior I have consistently found is this: "Whenever and wherever a Java
String contains an unbroken sequence of N escaped backslashes (that is, N
pairs of unescaped backslashes, totalling 2N backslashes) where N >= 2, the
parse() method creates a Query that has only N-1 escaped backslashes in the
corresponding place." If you have 20 escaped backslashes in a Java string, the
Lucene query will end up with 19.
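
To make the two escaping layers concrete, here is a minimal sketch that does
not use Lucene at all. All names in it are hypothetical. countBackslashes
shows how many backslash characters a Java string really holds at run time;
unescapePairwise models an unescaper in which each backslash makes the next
character literal; unescapeDropFirst is one assumed rule that would reproduce
the N to N-1 pattern described above. Whether the parser actually works this
way is an assumption, not something this report establishes.

```java
// Illustrative model only; these helpers are hypothetical and are not Lucene code.
final class BackslashLayers {

    // Counts the backslash characters actually held by the string at run time.
    static int countBackslashes(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\\') {
                n++;
            }
        }
        return n;
    }

    // Pairwise unescaping: a backslash makes the following character literal,
    // so a run of 2k backslashes yields k literal backslashes.
    static String unescapePairwise(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                out.append(s.charAt(++i)); // keep the escaped character
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Assumed faulty rule: drop a backslash only when the character before it
    // is not a backslash, so a run of N backslashes keeps N-1 of them.
    static String unescapeDropFirst(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean prevIsBackslash = i > 0 && s.charAt(i - 1) == '\\';
            if (c != '\\' || prevIsBackslash) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String q = "item:\\\\"; // 4 backslashes in the source, 2 in the string
        System.out.println(countBackslashes(q));
        System.out.println(unescapePairwise(q));
        System.out.println(unescapeDropFirst(q));
    }
}
```

Under this model the two rules give the same result for a run of two
backslashes and only diverge for longer runs: a run of 20 becomes 10 under the
pairwise rule and 19 under the drop-first rule.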
Thank you very much for your time, attention, and efforts.
Thanks.
> Incorrect parsing by QueryParser.parse() when it encounters backslashes
> (always eats one backslash.)
> ----------------------------------------------------------------------------------------------------
>
> Key: LUCENE-800
> URL: https://issues.apache.org/jira/browse/LUCENE-800
> Project: Lucene - Java
> Issue Type: Bug
> Components: QueryParser
> Reporter: Dilip Nimkar
> Assigned To: Michael Busch
>
> Test code and output follow. Tested with Lucene 1.9 only. Affects those
> who index or search for Lucene's reserved characters.
> Description: When an input search string has a sequence of N (java-escaped)
> backslashes, where N >= 2, the QueryParser will produce a query in which that
> sequence has N-1 backslashes.
> TEST CODE:
> Analyzer analyzer = new WhitespaceAnalyzer();
> String[] queryStrs = {"item:\\\\",
>                       "item:\\\\*",
>                       "(item:\\\\ item:ABCD\\\\))",
>                       "(item:\\\\ item:ABCD\\\\)"};
> for (String queryStr : queryStrs) {
>     System.out.println("--------------------------------------");
>     System.out.println("String queryStr = " + queryStr);
>     Query luceneQuery = null;
>     try {
>         luceneQuery = new QueryParser("_default_", analyzer).parse(queryStr);
>         System.out.println("luceneQuery.toString() = " +
>                            luceneQuery.toString());
>     } catch (Exception e) {
>         System.out.println(e.getClass().toString());
>     }
> }
> OUTPUT (with remarks in comment notation):
> --------------------------------------
> String queryStr = item:\\
> luceneQuery.toString() = item:\ //One backslash has disappeared.
> Searcher will fail on this query.
> --------------------------------------
> String queryStr = item:\\*
> luceneQuery.toString() = item:\* //One backslash has disappeared.
> This query will search for something unintended.
> --------------------------------------
> String queryStr = (item:\\ item:ABCD\\))
> luceneQuery.toString() = item:\ item:ABCD\) //This should have thrown a
> ParseException because of an unescaped ')'. It did not.
> --------------------------------------
> String queryStr = (item:\\ item:ABCD\\)
> class org.apache.lucene.queryParser.ParseException //...and this one
> should not have, but it did.