[ 
https://issues.apache.org/jira/browse/AVRO-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315537#comment-17315537
 ] 

ASF subversion and git services commented on AVRO-3094:
-------------------------------------------------------

Commit 89efb8dd7bef28945d0e9598d913aa9ec843593c in avro's branch 
refs/heads/master from Radai Rosenblatt
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=89efb8d ]

AVRO-3094: improve performance of SpecificData.getForClass(), especially around 
old generated specific record classes (#1172)

Avro 1.9+ attempts to access the static field MODEL$ on Avro-generated classes.
Record classes generated by older Avro (for example 1.7) do not have this field.
This causes modern Avro to catch an exception on every record being deserialized
for such classes, which can cause a ~x3 slowdown
(see profiler results in the Jira ticket).

This fix uses ClassValue to cache the SpecificData instance to use per class.
This results in a ~x5 speedup for the happy path (classes that have MODEL$) and
~x50 speedup for classes that do not have MODEL$.

Co-authored-by: radai <ra...@fractal.lan>

> performance regression in SpecificData.getForClass() when run with code 
> generated by older avro
> -----------------------------------------------------------------------------------------------
>
>                 Key: AVRO-3094
>                 URL: https://issues.apache.org/jira/browse/AVRO-3094
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.2, 1.10.2
>            Reporter: radai rosenblatt
>            Assignee: Radai Rosenblatt
>            Priority: Major
>         Attachments: model$.png
>
>
> starting with 1.9, avro SpecificData supports per-class MODEL$ definitions, 
> and looks for them on specific classes:
> {code:java}
> public static <T> SpecificData getForClass(Class<T> c) {
>   if (SpecificRecordBase.class.isAssignableFrom(c)) {
>     final Field specificDataField;
>     try {
>       specificDataField = c.getDeclaredField("MODEL$");
>       specificDataField.setAccessible(true);
>       return (SpecificData) specificDataField.get(null);
>     } catch (NoSuchFieldException e) {
>       // Return default instance
>       return SpecificData.get();    // <======= EXPENSIVE
>     } catch (IllegalAccessException e) {
>       throw new AvroRuntimeException(e);
>     }
>   }
>   return SpecificData.get();
> } {code}
> when this is run against specific record classes generated by older avro, which
> do not have the field MODEL$, this results in a serious performance degradation.
> we've measured the impact on user code to be x3 slower in one case. here's a
> flame graph:
>   !model$.png!
> under java 7+ it should be entirely possible to cache the existence (or 
> lack thereof) of MODEL$ using 
> [ClassValue|https://docs.oracle.com/javase/7/docs/api/java/lang/ClassValue.html],
> which would also speed this up when operating on classes generated by more 
> modern avro, since it would avoid repeated reflection
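
The caching idea proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the actual commit: `ModelCache`, `Model`, `NewRecord`, `OldRecord` and the `LOOKUPS` counter are hypothetical stand-ins (with `Model` playing the role of SpecificData), so that the example runs without Avro on the classpath. The reflective lookup, including the exception for classes without MODEL$, is paid once per class; later calls are served from the ClassValue cache.

```java
import java.lang.reflect.Field;
import java.util.concurrent.atomic.AtomicInteger;

class ModelCache {
    /** Hypothetical stand-in for SpecificData. */
    static final class Model {
        final String name;
        Model(String name) { this.name = name; }
    }

    /** Stand-in for the default instance returned by SpecificData.get(). */
    static final Model DEFAULT = new Model("default");

    /** Counts reflective lookups, only to make the caching observable. */
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    /** Mimics a class generated by newer Avro: declares a static MODEL$ field. */
    static class NewRecord {
        private static final Model MODEL$ = new Model("custom");
    }

    /** Mimics a class generated by older Avro: no MODEL$ field. */
    static class OldRecord { }

    // ClassValue lazily computes one value per class and caches it.
    private static final ClassValue<Model> CACHE = new ClassValue<Model>() {
        @Override
        protected Model computeValue(Class<?> type) {
            LOOKUPS.incrementAndGet();
            try {
                Field field = type.getDeclaredField("MODEL$");
                field.setAccessible(true);
                return (Model) field.get(null); // static field, so no receiver
            } catch (NoSuchFieldException e) {
                // Older generated class: fall back to the default model.
                return DEFAULT;
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    };

    /** Cached equivalent of the reflective lookup quoted in the issue. */
    static Model getForClass(Class<?> c) {
        return CACHE.get(c);
    }
}
```

Calling `ModelCache.getForClass(ModelCache.OldRecord.class)` repeatedly triggers the NoSuchFieldException only on the first call. ClassValue also avoids pinning classes in memory the way a plain static map keyed by Class could, which matters in environments that unload classloaders.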



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
