[GitHub] [commons-text] aherbert commented on a diff in pull request #430: TEXT-227: removed unnecessary IF statement

via GitHub Sun, 16 Jul 2023 04:59:36 -0700


aherbert commented on code in PR #430:
URL: https://github.com/apache/commons-text/pull/430#discussion_r1264670367



##########
src/main/java/org/apache/commons/text/AlphabetConverter.java:
##########
@@ -87,9 +87,6 @@ public final class AlphabetConverter {
      * @see 
"http://www.oracle.com/us/technologies/java/supplementary-142654.html";
      */
     private static String codePointToString(final int i) {
-        if (Character.charCount(i) == 1) {

Review Comment:
   Looking at the JDK source for `Character.charCount(int)`, it checks the int 
has no bits set in the upper 16-bits and returns 1 or 2. This check is also 
done in `Character.toChars(int)`. If a single char is found that method will 
immediately return an array with 1 character to use for the String constructor. 
However in recent JDKs the `String.valueOf(char)` constructor has optimisations 
for compact strings by storing LATIN1 charsets using 1 byte per character. 
These look more involved that just using a `char[]` array of size 1.
   
   So the current code could save 1 byte of storage on modern JDK 
implementations. I would guess it would be slower due to additional checks for 
LATIN1 chars but that would have to be verified with a speed test.
   
   Note that the `String.valueOf(char)` method may be internally optimised to 
intern LATIN1 single character strings for reuse. You could cache the ASCII 
alphabet for example. I cannot see this in my JDK source but it would be an 
option for a JDK to implement.
   
   Either variation of this method therefore has possible minor advantages. I 
would be reluctant to change this without a JMH test of performance across the 
LTS JDK releases for the two variations. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@commons.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [commons-text] aherbert commented on a diff in pull request #430: TEXT-227: removed unnecessary IF statement

Reply via email to