IBue commented on code in PR #1327:
URL: https://github.com/apache/commons-lang/pull/1327#discussion_r1873889888
##########
src/main/java/org/apache/commons/lang3/StringUtils.java:
##########
@@ -2888,17 +2876,20 @@ public static int indexOfAnyBut(final CharSequence seq, final CharSequence searc
if (isEmpty(seq) || isEmpty(searchChars)) {
return INDEX_NOT_FOUND;
}
- final int strLen = seq.length();
- for (int i = 0; i < strLen; i++) {
- final char ch = seq.charAt(i);
- final boolean chFound = CharSequenceUtils.indexOf(searchChars, ch, 0) >= 0;
- if (i + 1 < strLen && Character.isHighSurrogate(ch)) {
- final char ch2 = seq.charAt(i + 1);
- if (chFound && CharSequenceUtils.indexOf(searchChars, ch2, 0) < 0) {
- return i;
- }
- } else if (!chFound) {
- return i;
Review Comment:
<ins>note on performance:</ins>
Boxing is mainly responsible for the increased memory footprint here, which
is about +3GB for a 100M input sequence.
However, the current implementation already takes roughly two orders of
magnitude (10²) more runtime than the proposed one (minutes!) for a 10M input
sequence with the result index at half of the input size.
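
To make the trade-off concrete, here is a minimal sketch of the kind of set-based lookup I mean. It is an illustration only, not the code in this PR: the class name `IndexOfAnyButSketch` is hypothetical, and it assumes the search code points are collected into a `Set<Integer>`. The removed loop calls `CharSequenceUtils.indexOf(searchChars, ch, 0)` for every char of `seq`, which is a linear scan of `searchChars` each time. A hash set replaces that scan with a constant-time lookup on average, at the price of one boxed `Integer` per distinct search code point.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch, not the PR's implementation.
public class IndexOfAnyButSketch {

    static final int INDEX_NOT_FOUND = -1;

    static int indexOfAnyBut(final CharSequence seq, final CharSequence searchChars) {
        if (seq == null || seq.length() == 0 || searchChars == null || searchChars.length() == 0) {
            return INDEX_NOT_FOUND;
        }
        // Boxing cost: every distinct search code point becomes an Integer in the set.
        final Set<Integer> searchCodePoints =
                searchChars.codePoints().boxed().collect(Collectors.toSet());
        for (int i = 0; i < seq.length(); ) {
            // Reads a full code point, so surrogate pairs are handled as one unit.
            final int codePoint = Character.codePointAt(seq, i);
            if (!searchCodePoints.contains(codePoint)) {
                return i;
            }
            i += Character.charCount(codePoint);
        }
        return INDEX_NOT_FOUND;
    }

    public static void main(final String[] args) {
        System.out.println(indexOfAnyBut("zzabyycdxx", "za")); // prints 3
        System.out.println(indexOfAnyBut("aba", "ab"));        // prints -1
    }
}
```

If the memory overhead matters more than the speedup, a primitive structure (for example a sorted `int[]` with binary search) would avoid the boxing, but that is a separate design choice.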