[ 
https://issues.apache.org/jira/browse/AVRO-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846558#comment-13846558
 ] 

Rob Turner commented on AVRO-1411:
----------------------------------

This is exactly the same as AVRO-1348; I have added a patch there.

> org.apache.avro.util.Utf8 performance improvement by remove private Charset 
> in class
> ------------------------------------------------------------------------------------
>
>                 Key: AVRO-1411
>                 URL: https://issues.apache.org/jira/browse/AVRO-1411
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.5
>            Reporter: Tie Liu
>            Priority: Minor
>
> The org.apache.avro.util.Utf8 class has a private static field defined as:
>   private static final Charset UTF8 = Charset.forName("UTF-8");
> It is used as:
>   public static final byte[] getBytesFor(String str) {
>     return str.getBytes(UTF8);
>   }
> I guess the intention of creating this object is to avoid repeated object
> creation, but when we dug into the String.getBytes code, we found that when
> it is called with a Charset, it actually creates a new StringEncoder in
> java.lang.StringCoding:
>     static byte[] encode(Charset cs, char[] ca, int off, int len) {
>       StringEncoder se = new StringEncoder(cs, cs.name());
>       char[] c = Arrays.copyOf(ca, ca.length);
>       return se.encode(c, off, len);
>     }
> If we instead call it with the string literal "UTF-8", it reuses the
> thread-local StringEncoder.
> We tried overriding this class to pass the string literal and confirmed that
> those short-lived StringEncoder objects are no longer created. We would like
> Apache to fix this so that we no longer need to override the class.
>  
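
For reference, the two call forms discussed in the description can be put side
by side. This is only a minimal sketch: the class name Utf8EncodeSketch and the
method names viaCharset and viaCharsetName are hypothetical and are not part of
Avro. It checks that both forms produce the same bytes. It does not measure
allocations, which depend on the JDK version in use and would need a profiler
to confirm.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class; not part of Avro.
public class Utf8EncodeSketch {
    // Same shape as the private field quoted in the description.
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Form quoted from Utf8.getBytesFor: pass the Charset object.
    public static byte[] viaCharset(String str) {
        return str.getBytes(UTF8);
    }

    // Form proposed in the description: pass the charset name.
    public static byte[] viaCharsetName(String str) {
        try {
            return str.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform,
            // so this branch should be unreachable.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9";
        byte[] a = viaCharset(sample);
        byte[] b = viaCharsetName(sample);
        // Both forms should yield identical UTF-8 bytes.
        System.out.println(Arrays.equals(a, b));
    }
}
```

The name-based overload declares a checked UnsupportedEncodingException, which
is why the sketch wraps it. Any change along these lines would need to handle
that exception in the same way.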



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
