alhudz opened a new pull request, #1731:
URL: https://github.com/apache/commons-lang/pull/1731

   Repro: `WordUtils.wrap("a😀😀😀😀", 4, "\n", true)`, four `U+1F600` after a 
leading `a`.
   Cause: with `wrapLongWords` set, a word longer than the column is 
hard-broken at the fixed char offset `wrapLength + offset`, and the newline 
string is inserted there. When that offset lands between the high and low 
surrogate of a supplementary code point, the pair is split: a wrap that should 
be lossless emits a lone high surrogate at the end of one line and a lone low 
surrogate at the start of the next.
   Fix: nudge the break one char forward when it would land inside a pair, 
keeping the whole code point on the current line. BMP input and the 
delimiter-based wrap paths are unaffected.
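
   The adjustment described under "Fix" can be sketched roughly as follows. 
This is an illustration only, not the code in the PR: the class 
`SurrogateSafeBreak` and the helper `adjustBreak` are hypothetical names, and 
the break index is assumed to be the `wrapLength + offset` value mentioned 
above.

```java
final class SurrogateSafeBreak {

    /**
     * Hypothetical helper (not taken from the PR): returns a break index
     * that never falls between a high and a low surrogate. If the proposed
     * index would split a pair, it is moved one char forward so the whole
     * code point stays on the current line.
     */
    static int adjustBreak(final CharSequence str, final int index) {
        if (index > 0 && index < str.length()
                && Character.isHighSurrogate(str.charAt(index - 1))
                && Character.isLowSurrogate(str.charAt(index))) {
            return index + 1;
        }
        return index;
    }

    public static void main(final String[] args) {
        // "a" followed by four U+1F600, written as explicit surrogate pairs.
        final String input = "a\uD83D\uDE00\uD83D\uDE00\uD83D\uDE00\uD83D\uDE00";
        final int wrapLength = 4;
        final int offset = 0;

        // Naive hard break at wrapLength + offset versus the adjusted one.
        final int naive = wrapLength + offset;
        final int safe = adjustBreak(input, naive);

        System.out.println("naive break index: " + naive);
        System.out.println("adjusted break index: " + safe);
        System.out.println("first line: " + input.substring(offset, safe));
    }
}
```

   For the repro string the naive index 4 sits between the two chars of the 
second `U+1F600`, so the helper moves it to 5. Indices that already fall on a 
code point boundary, and any index in BMP-only input, are returned unchanged.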


