[ 
https://issues.apache.org/jira/browse/LANG-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904594#comment-15904594
 ] 

ASF GitHub Bot commented on LANG-1300:
--------------------------------------

Github user dmjones500 commented on the issue:

    https://github.com/apache/commons-lang/pull/251
  
    We need to agree on the desired behaviour of the method. Based on the 
signature, I think we can adjust it to support supplementary characters without 
violating the implied contract of the method. The issue for me is about indexes.
    
    Based on the JavaDoc description, I suggest we return the code unit index. 
This is because the input is a `CharSequence` and the JavaDoc states:
    
    > Finds the first index within a `CharSequence`, handling null.
    
    Also, a `CharSequence` only has methods which operate on code unit indexes. 
So returning a code unit index would be helpful for further operations on the 
sequence.
    
    What does everyone else think? We should obviously update the JavaDocs to 
make it crystal clear what we decide, either way.


> Clarify or improve behaviour of int-based methods in StringUtils
> ----------------------------------------------------------------
>
>                 Key: LANG-1300
>                 URL: https://issues.apache.org/jira/browse/LANG-1300
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.*
>    Affects Versions: 3.5
>            Reporter: Duncan Jones
>            Priority: Minor
>             Fix For: Discussion
>
>
> The following methods use an {{int}} to represent a search character:
> {code:java}
> boolean contains(final CharSequence seq, final int searchChar)
> int indexOf(final CharSequence seq, final int searchChar)
> int indexOf(final CharSequence seq, final int searchChar, final int startPos)
> int lastIndexOf(final CharSequence seq, final int searchChar)
> int lastIndexOf(final CharSequence seq, final int searchChar, final int 
> startPos)
> {code}
> When I see an {{int}} representing a character, I tend to assume the method 
> can handle supplementary characters. However, the current behaviour of these 
> methods depends upon whether the {{CharSequence}} is a {{String}} or not.
> {code:java}
> StringBuilder builder = new StringBuilder();
> builder.appendCodePoint(0x2070E);
> System.out.println(StringUtils.lastIndexOf(builder, 0x2070E)); // -1
> System.out.println(StringUtils.lastIndexOf(builder.toString(), 0x2070E)); // 0
> {code}
> The Javadoc for these methods are ambiguous on this point, stating:
> {quote}
> This method uses {{String.lastIndexOf(int)}} if possible.
> {quote}
> I think we should consider updating the {{CharSequenceUtils}} methods used by 
> this class to convert all {{CharSequence}} parameters to strings, enabling 
> full code point support. The docs could be updated to make this crystal clear.
> There is a question of whether this breaks backwards compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to