[ https://issues.apache.org/jira/browse/LUCENE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475513 ]
Dilip Nimkar commented on LUCENE-800:
-------------------------------------

In my test code, I took care of the difference between \ as the Java escape character and \ as the Lucene escape character.

    System.out.println(new QueryParser("_default_", analyzer).parse("item:\\\\")); // note the 4 backslashes

should print on the console

    item:\\

But it is printing

    item:\

The same is the case with the second string in the test code. In general, the boolean test

    str.equals(new QueryParser("_default_", analyzer).parse(str).toString())

should always evaluate to true if the analyzer is not changing the string. But in our case it is evaluating to false.

The behavior I have consistently found is this:

"Whenever and wherever a Java String contains an unbroken sequence of N escaped backslashes (that is, N pairs of unescaped backslashes, totalling 2N backslashes) where N >= 2, the parse() method creates a Query that has only N-1 escaped backslashes in the corresponding place."

If you have 20 escaped backslashes in a Java string, the Lucene query will end up with 19.

Thank you very much for your time, attention and efforts.

Thanks.

> Incorrect parsing by QueryParser.parse() when it encounters backslashes (always eats one backslash.)
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-800
>                 URL: https://issues.apache.org/jira/browse/LUCENE-800
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: QueryParser
>            Reporter: Dilip Nimkar
>         Assigned To: Michael Busch
>
> Test code and output follow. Tested Lucene 1.9 version only. Affects those who would index/search for Lucene's reserved characters.
> Description: When an input search string has a sequence of N (java-escaped) backslashes, where N >= 2, the QueryParser will produce a query in which that sequence has N-1 backslashes.
> TEST CODE:
> Analyzer analyzer = new WhitespaceAnalyzer();
> String[] queryStrs = {"item:\\\\",
>                       "item:\\\\*",
>                       "(item:\\\\ item:ABCD\\\\))",
>                       "(item:\\\\ item:ABCD\\\\)"};
> for (String queryStr : queryStrs) {
>     System.out.println("--------------------------------------");
>     System.out.println("String queryStr = " + queryStr);
>     Query luceneQuery = null;
>     try {
>         luceneQuery = new QueryParser("_default_", analyzer).parse(queryStr);
>         System.out.println("luceneQuery.toString() = " + luceneQuery.toString());
>     } catch (Exception e) {
>         System.out.println(e.getClass().toString());
>     }
> }
> OUTPUT (with remarks in comment notation:)
> --------------------------------------
> String queryStr = item:\\
> luceneQuery.toString() = item:\ //One backslash has disappeared. Searcher will fail on this query.
> --------------------------------------
> String queryStr = item:\\*
> luceneQuery.toString() = item:\* //One backslash has disappeared. This query will search for something unintended.
> --------------------------------------
> String queryStr = (item:\\ item:ABCD\\))
> luceneQuery.toString() = item:\ item:ABCD\) //This should have thrown a ParseException because of an unescaped ')'. It did not.
> --------------------------------------
> String queryStr = (item:\\ item:ABCD\\)
> class org.apache.lucene.queryParser.ParseException //...and this one should not have, but it did.
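
For readers who want to check the backslash counting without Lucene on the classpath, here is a minimal sketch in plain Java. It shows only two things: how many backslashes a Java string literal actually holds at runtime, and what the N to N-1 pattern described in this report looks like when applied to such a string. The class name BackslashLayers and both helper methods are hypothetical names made up for this illustration. modelReportedOutput is a stand-in that mimics the observed output; it is not Lucene's code and does not call QueryParser.

```java
// Illustration only: no Lucene classes are used here.
class BackslashLayers {

    // Counts the backslash characters actually present in a runtime string.
    static int countBackslashes(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\\') {
                count++;
            }
        }
        return count;
    }

    // Hypothetical stand-in for the behavior described in the report:
    // every unbroken run of N >= 2 backslashes comes back with N-1.
    // Runs of a single backslash are left untouched, because the rule
    // quoted in the report only covers N >= 2.
    static String modelReportedOutput(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) == '\\') {
                int start = i;
                while (i < s.length() && s.charAt(i) == '\\') {
                    i++;
                }
                int run = i - start;
                int keep = (run >= 2) ? run - 1 : run;
                for (int k = 0; k < keep; k++) {
                    out.append('\\');
                }
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Four backslashes in the source file, two in the runtime string.
        String queryStr = "item:\\\\";
        System.out.println(queryStr);                      // item:\\
        System.out.println(countBackslashes(queryStr));    // 2
        System.out.println(modelReportedOutput(queryStr)); // item:\

        // Twenty runtime backslashes in, nineteen out, as in the report.
        StringBuilder twenty = new StringBuilder();
        for (int k = 0; k < 20; k++) {
            twenty.append('\\');
        }
        String modeled = modelReportedOutput(twenty.toString());
        System.out.println(countBackslashes(modeled));     // 19
    }
}
```

Comparing the output of modelReportedOutput with what the real parser prints for the same input is one way to confirm whether a given Lucene version still shows the pattern.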