The multiple redundant bounds checks bother me, but I don't know how to fix them without abandoning a bit of modularization.

    if (cp >= 0) {
        if (COMPACT_STRINGS && cp < 0x100) return latin1encode()
        if (cp < MIN_SURROGATE || (cp > MAX_SURROGATE && cp < MIN_SUPPLEMENTARY_CODEPOINT)) return bmpencode()
        if (cp >= MIN_SUPPLEMENTARY_CODEPOINT && cp <= MAX_CODEPOINT) return suppEncode()
    }
    throw ...

+    static byte[] toBytes(int cp) {
+        if (Character.isBmpCodePoint(cp)) {
+            return toBytes((char)cp);
+        } else {
+            byte[] result = new byte[4];
+            putChar(result, 0, Character.highSurrogate(cp));
+            putChar(result, 1, Character.lowSurrogate(cp));
+            return result;
+        }
+    }

We should continue to assume that supplementary characters are very rare, even in this context, so the supplementary character code could be moved to a separate cold method.
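
To make the cold-method suggestion concrete, here is a minimal sketch of how the split might look. It is only an illustration: the class name `CodePointEncoder`, the method name `toBytesSupplementary`, and the local `putChar` (a big-endian stand-in for the JDK-internal helper, which is not accessible outside `java.lang`) are all hypothetical. Like the patch, it assumes the caller has already validated `cp`.

```java
final class CodePointEncoder {
    // Hypothetical stand-in for the internal putChar helper: writes one
    // UTF-16 code unit at char index `index`. Big-endian is an assumption
    // made here for determinism, not a claim about the real helper.
    static void putChar(byte[] val, int index, int c) {
        val[index * 2] = (byte) (c >> 8);
        val[index * 2 + 1] = (byte) c;
    }

    static byte[] toBytes(char c) {
        byte[] result = new byte[2];
        putChar(result, 0, c);
        return result;
    }

    // Hot path: stays small so it is a good inlining candidate.
    static byte[] toBytes(int cp) {
        if (Character.isBmpCodePoint(cp)) {
            return toBytes((char) cp);
        }
        return toBytesSupplementary(cp);
    }

    // Cold path: supplementary code points are assumed to be rare.
    // No validation is done here; cp is assumed to be a valid code point.
    private static byte[] toBytesSupplementary(int cp) {
        byte[] result = new byte[4];
        putChar(result, 0, Character.highSurrogate(cp));
        putChar(result, 1, Character.lowSurrogate(cp));
        return result;
    }
}
```

The behavior is the same as in the patch; the only change is that the surrogate-pair encoding no longer contributes to the size of the common-case method.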