[ 
https://issues.apache.org/jira/browse/AVRO-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121643#comment-13121643
 ] 

Milind Bhandarkar commented on AVRO-911:
----------------------------------------

Todd, indeed, GC kicking in for a lot of objects will slow tasks somewhat. Where 
I came from, the basic data type being fed as a value was not a simple int or 
text. It was a complex structure that had a map, a list of maps, and a map of 
maps. Value object reuse in the reducer meant that only the top-level object 
was reused. Everything underneath had to be reallocated anyway.

Plus, a number of people who cloned their objects were doing it incorrectly.
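
As an illustration of how such a clone can go wrong, here is a minimal sketch. 
The Value class, readFields(), shallowCopy(), deepCopy() and collect() are all 
hypothetical names made up for this example; they are not the Hadoop or Avro 
API. The sketch assumes a framework that refills one reused instance in place 
for every record.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReuseCloneSketch {
    /** Hypothetical stand-in for a complex value handed to a reducer. */
    static final class Value {
        Map<String, Integer> counts = new HashMap<>();

        // Simulates the framework refilling the same instance for the next record.
        void readFields(String key, int n) {
            counts.clear();
            counts.put(key, n);
        }

        // Incorrect clone: new top-level object, but the nested map is shared.
        Value shallowCopy() {
            Value v = new Value();
            v.counts = this.counts;
            return v;
        }

        // Correct clone: the nested map is copied as well.
        Value deepCopy() {
            Value v = new Value();
            v.counts = new HashMap<>(this.counts);
            return v;
        }
    }

    // Keeps a copy of every record while the same Value instance is reused.
    static List<Value> collect(boolean deep) {
        Value reused = new Value();
        List<Value> kept = new ArrayList<>();
        String[] keys = {"a", "b"};
        for (int i = 0; i < keys.length; i++) {
            reused.readFields(keys[i], i + 1);
            kept.add(deep ? reused.deepCopy() : reused.shallowCopy());
        }
        return kept;
    }

    public static void main(String[] args) {
        // The first kept record was silently overwritten by the second one.
        System.out.println(collect(false).get(0).counts); // prints {b=2}
        // With a deep copy the first record is preserved.
        System.out.println(collect(true).get(0).counts);  // prints {a=1}
    }
}
```

Note that the correct version allocates a new nested map per record, so 
reusing the top-level object saves very little once values are retained.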

Given that complexity, and since, as you correctly point out in your blog, 
allocating a bigger heap can keep GC from triggering during the lifetime of 
the task, reusing the objects was not a worthwhile optimization.
                
> remove object reuse from Java APIs
> ----------------------------------
>
>                 Key: AVRO-911
>                 URL: https://issues.apache.org/jira/browse/AVRO-911
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.6.0
>
>         Attachments: perf-reuse.patch
>
>
> Avro's Java APIs were designed to permit object reuse when reading, on the 
> assumption that this would provide performance advantages.  In particular, the old 
> parameter in DatumReader<T>.read(T old, Decoder), the Utf8 class, and the 
> GenericArray.peek() method were all designed for this purpose.  But I am 
> unable to see significant performance improvements when objects are reused.  
> I tried modifying Perf.java's GenericTest to reuse records, and its 
> StringTest to not reuse Utf8 instances and, in both cases, performance is not 
> substantially altered.
> If we were to remove these then issues such as AVRO-803 would disappear.  
> Always using java.lang.String instead of Utf8 would remove a lot of user 
> confusion. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
