alhudz opened a new pull request, #1722:
URL: https://github.com/apache/commons-lang/pull/1722
Repro: `WordUtils.initials("Ben 😀mile Lee")` where the second word begins
with U+1F600.
Cause: the loop copies only the first `char` after a delimiter and skips the
rest of the word. When a word starts with a supplementary code point, that
`char` is the high surrogate; the low surrogate is dropped, leaving a lone
surrogate in the result (`B` + U+D83D + `L`).
Fix: when the initial is a high surrogate followed by a low surrogate, copy
both, and size the buffer to the input length so two-`char` initials cannot
overflow it. BMP input is unchanged.
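
For illustration only, here is a minimal sketch of the approach described in the fix. It is not the code in this PR: the class name `InitialsSketch` is hypothetical, and the sketch assumes whitespace is the only delimiter, whereas `WordUtils.initials` also accepts custom delimiters.

```java
// Hypothetical stand-in for the patched loop; not the actual WordUtils source.
final class InitialsSketch {

    private InitialsSketch() {
    }

    /** Returns the first code point of each whitespace-delimited word. */
    static String initials(final String str) {
        if (str == null || str.isEmpty()) {
            return str;
        }
        final int strLen = str.length();
        // Sized to the input length: an initial may occupy two chars,
        // and every copied char maps to a distinct input char.
        final char[] buf = new char[strLen];
        int count = 0;
        boolean lastWasGap = true;
        for (int i = 0; i < strLen; i++) {
            final char ch = str.charAt(i);
            if (Character.isWhitespace(ch)) {
                lastWasGap = true;
            } else if (lastWasGap) {
                buf[count++] = ch;
                // Keep the low surrogate together with its high half.
                if (Character.isHighSurrogate(ch) && i + 1 < strLen
                        && Character.isLowSurrogate(str.charAt(i + 1))) {
                    buf[count++] = str.charAt(++i);
                }
                lastWasGap = false;
            }
            // Any other char belongs to the rest of a word and is skipped.
        }
        return new String(buf, 0, count);
    }
}
```

With this sketch, `InitialsSketch.initials("Ben \uD83D\uDE00mile Lee")` yields `B`, U+1F600, `L` as a well-formed string, and an all-BMP input such as `"Ben John Lee"` still yields `BJL`.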