[ 
https://issues.apache.org/jira/browse/LUCENE-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904395#comment-16904395
 ] 

Chongchen Chen commented on LUCENE-8572:
----------------------------------------

I find the code is related to 
[LUCENE-4199|https://issues.apache.org/jira/browse/LUCENE-4199]. Maybe we 
should implement it like:

{code:java}
public static final CharSequence escapeWhiteChar(CharSequence str,
    Locale locale) {
  ...

  for (int i = 0; i < escapableWhiteChars.length; i++) {
    buffer = buffer.toString().replace(escapableWhiteChars[i], "\\");
    buffer = buffer.toString()
        .replace(escapableWhiteChars[i].toLowerCase(locale), "\\");
  }
  return buffer;
}
{code}
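
For comparison, here is a minimal sketch of a variant that never lower-cases the input, so the offsets always refer to the original string. It is an illustration only, not the code in Lucene: the class name EscapeSketch and the ESCAPABLE array are hypothetical stand-ins for EscapeQuerySyntaxImpl and its escapableWhiteChars, whose exact contents are assumed here. Note that String.replace(x, "\\") substitutes the backslash for the character; the sketch prepends the backslash instead, on the assumption that escaping rather than replacing is what is wanted.

```java
final class EscapeSketch {
    // Hypothetical stand-in for escapableWhiteChars; the real array's
    // contents are an assumption here.
    static final String[] ESCAPABLE = {" ", "\t", "\n", "\r", "\f", "\b", "\u3000"};

    /** Prepends a backslash to every escapable whitespace character. */
    static CharSequence escapeWhiteChar(CharSequence str) {
        if (str == null || str.length() == 0) {
            return str;
        }
        String result = str.toString();
        for (String ch : ESCAPABLE) {
            // Whitespace has no upper/lower-case forms, so no toLowerCase
            // call is needed and the string length is only changed by the
            // backslashes inserted here.
            result = result.replace(ch, "\\" + ch);
        }
        return result;
    }

    public static void main(String[] args) {
        // U+0130 is left untouched; only the space gains a backslash.
        System.out.println(escapeWhiteChar("\u0130 x"));
    }
}
```

Because the input is never case-converted, a character such as U+0130 cannot shift the positions that later replacements rely on.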



> StringIndexOutOfBoundsException in parser/EscapeQuerySyntaxImpl.java
> --------------------------------------------------------------------
>
>                 Key: LUCENE-8572
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8572
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 6.3
>            Reporter: Octavian Mocanu
>            Priority: Major
>
> With "lucene-queryparser-6.3.0", specifically in
> "org/apache/lucene/queryparser/flexible/standard/parser/EscapeQuerySyntaxImpl.java"
>  
> when escaping strings that contain extended Unicode characters under a locale 
> that differs from the one the string's characters belong to, escaping fails 
> with a "java.lang.StringIndexOutOfBoundsException".
>  
> The reason is that the comparison first converts all of the characters of the 
> string to lower case. After that conversion the original string is no longer 
> the same length as the transformed one (it is shorter), so that executing
>  
> org/apache/lucene/queryparser/flexible/standard/parser/EscapeQuerySyntaxImpl.java:89
> fails with a java.lang.StringIndexOutOfBoundsException.
> I wonder whether the conversion to lower case is really needed when 
> handling the escape chars, since skipping it would avoid the error.
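
The length change described in the report can be reproduced in isolation. The sketch below is not taken from the issue: the class name LowerCaseLengthDemo and the choice of U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) as the test character are assumptions made for illustration. Outside Turkish-style locales, Java lower-cases that character to "i" followed by U+0307, so the lower-cased string is one UTF-16 unit longer than the original and offsets computed on it can run past the end of the original.

```java
import java.util.Locale;

final class LowerCaseLengthDemo {
    /** Number of UTF-16 code units that lower-casing adds to s. */
    static int lengthDelta(String s, Locale locale) {
        return s.toLowerCase(locale).length() - s.length();
    }

    public static void main(String[] args) {
        String s = "\u0130"; // LATIN CAPITAL LETTER I WITH DOT ABOVE

        // English rules: U+0130 becomes 'i' + U+0307, one unit longer.
        System.out.println(lengthDelta(s, Locale.ENGLISH)); // prints 1

        // Turkish rules: U+0130 becomes plain 'i', same length.
        System.out.println(lengthDelta(s, Locale.forLanguageTag("tr"))); // prints 0
    }
}
```

Any code that finds a match position in the lower-cased copy and then indexes into the original string is exposed to this mismatch.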



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
