[ 
https://issues.apache.org/jira/browse/GEODE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809206#comment-16809206
 ] 

Darrel Schneider commented on GEODE-6579:
-----------------------------------------

As of Java 9, using reflection to directly set the char[] is a non-starter. JDK 
9 changed the "value" field on String from a char[] to a byte[]. By default 
(compact strings enabled), a String whose characters all fit in Latin-1 stores 
each character as a single byte.
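
A minimal sketch for checking this on a given runtime. It only inspects the 
declared type of the private "value" field and does not try to read or write 
it. The class and method names are made up for illustration, and the field 
itself is an OpenJDK implementation detail rather than a documented API.

```java
import java.lang.reflect.Field;

// Hypothetical helper, not part of Geode.
final class StringValueFieldType {

    // Returns the declared type of String's private "value" field.
    // Only field metadata is inspected, so setAccessible is not needed.
    static Class<?> valueFieldType() {
        try {
            Field value = String.class.getDeclaredField("value");
            return value.getType();
        } catch (NoSuchFieldException e) {
            // "value" is an implementation detail and may not exist on every JVM.
            throw new IllegalStateException("String has no 'value' field", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("String.value is declared as: "
            + valueFieldType().getSimpleName());
    }
}
```

On an OpenJDK 9 or later runtime this should report byte[]; on JDK 8 it should 
report char[].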

We also discussed using the package-private String(char[], boolean) 
constructor, which is called by newStringUnsafe(char[]). In fact, using 
"newStringUnsafe" would have been the way to do this optimization on JDK 8. 
But in JDK 9 that method ends up copying the char[] into a byte[].

Our old code that used the deprecated String(byte[], int) constructor is 
probably the fastest way to create a String instance in JDK 9, since it 
produces half as much garbage as calling String(char[], boolean): the 
temporary byte[] holds one byte per character, while a temporary char[] holds 
two.
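
For reference, a minimal sketch of that byte[]-based approach. It assumes the 
deprecated constructor in question is the String(byte[] ascii, int hibyte) 
overload with hibyte set to 0. The class and method names are hypothetical and 
are not the actual Geode code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Geode.
final class AsciiStringReader {

    // Reads 'length' single-byte characters from 'in' and builds a String.
    // Assumes the caller already knows the data is ASCII.
    @SuppressWarnings("deprecation")
    static String readAsciiString(DataInput in, int length) throws IOException {
        byte[] buf = new byte[length]; // short-lived temporary, garbage afterwards
        in.readFully(buf);
        // Deprecated constructor: each byte becomes the low 8 bits of a
        // character and hibyte (0 here) supplies the high 8 bits.
        return new String(buf, 0);
    }

    // Convenience wrapper for trying the method on an in-memory array.
    static String decode(byte[] bytes) {
        try {
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
            return readAsciiString(in, bytes.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, AsciiStringReader.decode(new byte[] {'G', 'e', 'o', 'd', 'e'}) 
yields the String "Geode".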

Given how much the internal implementation of String changed from JDK 8 to 
JDK 9, I think the suggested optimization is a bad idea.

We have also considered avoiding garbage creation by keeping a long-lived 
byte[] that we reuse each time. Given that the temporary byte[] we allocate 
today has a very short life, and will always be less than 65k in size, I'm not 
convinced that we should try to avoid this garbage creation.
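
To make the reuse alternative concrete, here is a rough sketch. It assumes one 
buffer per thread held in a ThreadLocal and uses the deprecated 
String(byte[] ascii, int hibyte, int offset, int count) overload. Those 
choices and all of the names are assumptions for illustration, not something 
that was implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Geode.
final class ReusableBufferReader {

    // Assumed upper bound, matching the "less than 65k" limit mentioned above.
    private static final int MAX_LENGTH = 0xFFFF;

    // One long-lived buffer per thread so concurrent readers do not share it.
    private static final ThreadLocal<byte[]> BUFFER =
        ThreadLocal.withInitial(() -> new byte[MAX_LENGTH]);

    @SuppressWarnings("deprecation")
    static String readAsciiString(DataInput in, int length) throws IOException {
        if (length < 0 || length > MAX_LENGTH) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        byte[] buf = BUFFER.get();
        in.readFully(buf, 0, length);
        // The constructor copies the first 'length' bytes out of the shared
        // buffer, so the buffer can be reused for the next read.
        return new String(buf, 0, 0, length);
    }

    // Convenience wrapper for trying the method on an in-memory array.
    static String decode(byte[] bytes) {
        try {
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
            return readAsciiString(in, bytes.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is that each thread permanently holds a buffer of about 64 KB in 
exchange for not allocating a short-lived array on every read.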

> Creating a String during deserialization could be optimized
> -----------------------------------------------------------
>
>                 Key: GEODE-6579
>                 URL: https://issues.apache.org/jira/browse/GEODE-6579
>             Project: Geode
>          Issue Type: Improvement
>          Components: serialization
>            Reporter: Darrel Schneider
>            Assignee: Darrel Schneider
>            Priority: Major
>              Labels: optimization
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> When creating a string during deserialization from data that we know is in 
> the ASCII character set (each character can be represented by one byte) we 
> currently read all the bytes into a temporary byte array and then create a 
> String instance by giving it that byte array. The String constructor has to 
> create its own char array and then copy all the bytes into it. After that the 
> byte array is garbage.
> We could instead directly create a char array, fill it by reading each byte 
> from the DataInput into it, and then use reflection to directly set this 
> char array as the value field of the String instance we just created (as an 
> empty String). This prevents an extra copy of the data and reduces garbage 
> creation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
