[ 
https://issues.apache.org/jira/browse/LUCENE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501072#comment-17501072
 ] 

Holger Rehn commented on LUCENE-10430:
--------------------------------------

Thanks for the feedback! But why do I need to escape double quotes? A double 
quote isn't a regex metacharacter and has no special meaning in regular 
expressions, so it should be treated as a literal, right? For example,
{code:java}
Pattern.compile( "\"" ).matcher( "\"" ).matches(){code}
simply returns true, as expected. By the way, are you sure that escaping 
double quotes really works? I seem to remember having tried that already 
without getting the expected result, but I'm not sure.
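
For reference, here is a minimal sketch of the escaping approach discussed above. It assumes that RegExp accepts a backslash-escaped double quote as a literal, which is exactly the point still in question, so treat it as something to try rather than a confirmed fix. The class QuoteEscapeSketch and the helper escapeDoubleQuotes are hypothetical names, not part of Lucene; only java.util.regex is used so that the snippet runs without Lucene on the classpath.

```java
import java.util.regex.Pattern;

class QuoteEscapeSketch {

    // Hypothetical helper (not a Lucene API): prefixes every double quote that
    // is not already backslash-escaped with a backslash.
    static String escapeDoubleQuotes(String regex) {
        StringBuilder out = new StringBuilder(regex.length() + 4);
        boolean escaped = false; // true if the previous char was an unescaped backslash
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '"' && !escaped) {
                out.append('\\');
            }
            out.append(c);
            escaped = (c == '\\') && !escaped;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // java.util.regex treats a bare double quote as a literal character.
        System.out.println(Pattern.compile("a\"b").matcher("a\"b").matches());

        // Assumption: this escaped form is what would be passed to
        // new RegExp(...) or new RegexpQuery(new Term(...)).
        System.out.println(escapeDoubleQuotes("a\"b"));
    }
}
```

The helper also escapes quotes inside character classes, where the issue description says they already work; whether RegExp tolerates the extra backslash there is likewise an assumption.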

> Literal double quotes cause exception in class RegExp
> -----------------------------------------------------
>
>                 Key: LUCENE-10430
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10430
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 9.0
>            Reporter: Holger Rehn
>            Priority: Major
>
> Class org.apache.lucene.util.automaton.RegExp fails to parse valid regular 
> expressions that contain double quotes (except in character classes). This of 
> course also affects the corresponding RegexpQuery instances.
> Example: 
> {code:java}
> Query  q = new RegexpQuery( new Term( "field", "a\"b" ) );
> RegExp r = new RegExp( "a\"b" );{code}
> Both fail with:
> {code:java}
> java.lang.IllegalArgumentException: expected '"' at position 3
>     at org.apache.lucene.util.automaton.RegExp.parseSimpleExp(RegExp.java:1299)
>     at org.apache.lucene.util.automaton.RegExp.parseCharClassExp(RegExp.java:1229)
>     at org.apache.lucene.util.automaton.RegExp.parseComplExp(RegExp.java:1218)
>     at org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:1192)
>     at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1185)
>     at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1187)
>     at org.apache.lucene.util.automaton.RegExp.parseInterExp(RegExp.java:1179)
>     at org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:1173)
>     at org.apache.lucene.util.automaton.RegExp.<init>(RegExp.java:496)
>     ...{code}
> As a workaround we currently replace all double quotes with a dot.
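
The dot workaround mentioned in the issue description can be sketched as follows. This is only an illustration: DotWorkaroundSketch and replaceQuotesWithDot are hypothetical names, and java.util.regex stands in for Lucene's RegExp on the assumption that "." means "any single character" in both for this input. The sketch also shows the cost of the workaround: the loosened pattern accepts strings that contain no double quote at all.

```java
import java.util.regex.Pattern;

class DotWorkaroundSketch {

    // Hypothetical helper mirroring the workaround: every double quote in the
    // expression is replaced by '.', which matches any single character.
    static String replaceQuotesWithDot(String regex) {
        return regex.replace('"', '.');
    }

    public static void main(String[] args) {
        String loosened = replaceQuotesWithDot("a\"b"); // becomes a.b
        Pattern p = Pattern.compile(loosened);

        // The intended input still matches.
        System.out.println(p.matcher("a\"b").matches());

        // The dot also accepts any other single character in that position.
        System.out.println(p.matcher("axb").matches());
    }
}
```

Note that a blanket replacement also rewrites quotes inside character classes, where a dot is a literal dot rather than a wildcard, so such expressions would change meaning.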



