[ 
https://issues.apache.org/jira/browse/AVRO-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678879#comment-16678879
 ] 

ASF GitHub Bot commented on AVRO-2247:
--------------------------------------

unchuckable commented on issue #354: AVRO-2247 - improved java reading 
performance with new reader
URL: https://github.com/apache/avro/pull/354#issuecomment-436800295
 
 
   @Fokko - Actually, the reason was twofold. For one, I was looking at
Raymie's code generation for AVRO-2090 and was considering working up a
concept for on-the-fly bytecode generation for deserialization. Coming up
with something that creates an execution plan was kind of the natural first
step for that. I'd really like to extend it so that the ExecutionSteps
generate inlined bytecode on the fly at a later point, so the JVM can
optimize even more. On the other hand, I tried to understand the
ResolvingGrammarGenerator and had a hard time with it, so I tried to build
something that felt easier to me, and was kind of surprised by the results.
I'm well aware it would be preferable to improve on what's already there, but
I felt that the one-stage "execution plan" approach was too different from
the two-stage "DatumReader and ResolvingDecoder" approach. I'm happy, though,
even if this PR only serves as inspiration for other changes, and am willing
to assist in getting things done another way, too.
   
   @cutting - The ExecutionSteps are created in 
`FastReader.initializeRecordReader(...)`.
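
To make the one-stage "execution plan" idea concrete, here is a minimal
sketch. It is not the code from the PR: the names `Step`, `buildPlan` and
`read` are hypothetical, records are modelled as plain maps, and the input is
a simple iterator of already-decoded values instead of an Avro Decoder. The
only point is that field resolution and defaults are worked out once, and
reading is then a flat loop over precomputed steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExecutionPlanSketch {

    /** One precomputed action; a hypothetical stand-in for an "ExecutionStep". */
    interface Step {
        void execute(Iterator<Object> in, Map<String, Object> record);
    }

    /**
     * Resolves the writer's fields against the reader's fields once.
     * readerDefaults maps every reader field name to its default value.
     */
    static List<Step> buildPlan(List<String> writerFields, Map<String, Object> readerDefaults) {
        List<Step> plan = new ArrayList<>();
        for (String name : writerFields) {
            if (readerDefaults.containsKey(name)) {
                // Field known to the reader: read the value and store it.
                plan.add((in, record) -> record.put(name, in.next()));
            } else {
                // Writer-only field: consume the value and drop it.
                plan.add((in, record) -> in.next());
            }
        }
        for (Map.Entry<String, Object> entry : readerDefaults.entrySet()) {
            if (!writerFields.contains(entry.getKey())) {
                // Reader-only field: the default is captured at plan time,
                // so nothing has to be looked up while reading.
                String name = entry.getKey();
                Object value = entry.getValue();
                plan.add((in, record) -> record.put(name, value));
            }
        }
        return plan;
    }

    /** Reads one record by running the precomputed steps in order. */
    static Map<String, Object> read(List<Step> plan, Iterator<Object> in) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (Step step : plan) {
            step.execute(in, record);
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> readerDefaults = new LinkedHashMap<>();
        readerDefaults.put("id", 0);
        readerDefaults.put("name", "");
        readerDefaults.put("country", "unknown");

        // The plan is built once and can be reused for every record.
        List<Step> plan = buildPlan(Arrays.asList("id", "legacy", "name"), readerDefaults);

        Iterator<Object> input = Arrays.<Object>asList(7, "ignored", "Ada").iterator();
        System.out.println(read(plan, input)); // prints {id=7, name=Ada, country=unknown}
    }
}
```

Because each step is a small self-contained action, the same list could in
principle be turned into generated bytecode later, which is the direction
mentioned above.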



> Improve Java reading performance with a new reader
> --------------------------------------------------
>
>                 Key: AVRO-2247
>                 URL: https://issues.apache.org/jira/browse/AVRO-2247
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Martin Jubelgas
>            Priority: Major
>             Fix For: 1.9.0
>
>         Attachments: Perf-Comparison.md
>
>
> Complementary to AVRO-2090, I have been working on decoding of Avro objects 
> in Java and am suggesting a new implementation of a DatumReader that improves 
> read performance for both generic and specific records by approximately 20% 
> (and even more in cases of nested objects with defaults, a case I encounter a 
> lot in practical use).
> The key concept is to create a detailed execution plan once per DatumReader.
> This execution plan contains all required defaulting/lookup values, so they
> need not be looked up during object traversal while reading.
> The reader implementation can be enabled and disabled per GenericData 
> instance. The system default is set via the system property
> "org.apache.avro.fastread" (defaults to "false").
> Attached is a performance comparison of the existing implementation with the
> proposed one. I will open a pull request with the respective code in a bit
> (not including interoperability with the optimizations of AVRO-2090 yet).
> Please let me know whether you think this is worth pursuing further.
>  
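
As a rough illustration of the enable/disable switch described above, the
following sketch reads a JVM-wide default from the
"org.apache.avro.fastread" property and lets each instance override it.
`DataModel` is a hypothetical stand-in for GenericData, and the accessor names
are assumptions made for this example, not a confirmed API.

```java
public class FastReadFlagSketch {

    /** Property name taken from the issue description. */
    static final String FAST_READ_PROPERTY = "org.apache.avro.fastread";

    /** JVM-wide default: true only if the property is set to "true"; unset means false. */
    static boolean systemDefault() {
        return Boolean.parseBoolean(System.getProperty(FAST_READ_PROPERTY, "false"));
    }

    /** Hypothetical stand-in for a GenericData instance carrying its own switch. */
    static class DataModel {
        // Each instance starts from the system default and can be changed independently.
        private boolean fastReaderEnabled = systemDefault();

        boolean isFastReaderEnabled() {
            return fastReaderEnabled;
        }

        void setFastReaderEnabled(boolean enabled) {
            this.fastReaderEnabled = enabled;
        }
    }

    public static void main(String[] args) {
        DataModel model = new DataModel();
        System.out.println("system default: " + systemDefault());
        model.setFastReaderEnabled(true); // per-instance override
        System.out.println("this instance:  " + model.isFastReaderEnabled());
    }
}
```

With this shape, starting the JVM with `-Dorg.apache.avro.fastread=true`
changes the default for all new instances, while a single instance can still
opt in or out on its own.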



