[ https://issues.apache.org/jira/browse/AVRO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801028#action_12801028 ]

Scott Carey commented on AVRO-327:
----------------------------------

Some bigger context:

InputStream and OutputStream are slow and should be avoided as much as possible when the chunks copied to or from them average less than a couple hundred bytes. The performance difference isn't small: it's a factor of 2.5 on the previous test for things that aren't fully optimized yet, including the readDouble() that reads 8 bytes at a time.

This has a large effect on how much other improvements pay off. An improvement that currently helps by 10% will help by 25% after this change.

If one wants a single thread to decode/encode at gigabit ethernet wire speed, avoiding InputStream.read() and OutputStream.write(int b) is mandatory -- even if you use a BufferedInputStream.

This is not just for decoder/encoder, but also for various other places where the assumed "pass data around" method is via InputStream and OutputStream. ByteBuffer, byte[], and Channel are good options for various use cases, and they perform much better for small reads/writes than an equivalent Input/Output stream. There will be more to change than just BinaryDecoder eventually, and a holistic approach is better than a patchwork one.

To address Thiru's concerns, I think that it can be made even simpler:

{code}
void f(InputStream in) {
  BinaryDecoder bin = new BinaryDecoder(in);
  AvroObject o = readAvro(bin);
  NonAvroObject no = readNonAvro(bin.inputStream());
}
{code}

BinaryDecoder can construct a specialized InputStream inner class on demand (and cache it). The contract would be that once an InputStream is given to a decoder, it should not be accessed directly -- no different from what happens when you wrap an input stream with a buffered input stream.

Alternatively, Decoder could implement BufferedInputStream itself -- but that would force that on all implementations.

If two readers need to readahead-buffer on the same data, a different API will be needed (ByteBuffers? something else? read methods that don't advance the position?).

I can produce a patch next week for review.

> Performance improvements to BinaryDecoder.readLong()
> ----------------------------------------------------
>
>                 Key: AVRO-327
>                 URL: https://issues.apache.org/jira/browse/AVRO-327
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Thiruvalluvan M. G.
>            Assignee: Thiruvalluvan M. G.
>         Attachments: AVRO-327.patch
>
>
> AVRO-315 proposed performance improvements to readLong(), readFloat() and
> readDouble(). readLong() did not improve performance well for all. Scott
> proposed a better method (but requires a change in semantics and API). We'll
> carry on the discussion on that proposal here. AVRO-315 will be committed
> with changes for readFloat() and readDouble().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
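
For illustration only, here is a rough sketch of the inputStream() idea from the comment above. It is not the AVRO-327 patch and not Avro's actual BinaryDecoder: the class name BufferingDecoderSketch, the 8 KB buffer size, and all method bodies are assumptions. The sketch reads the wrapped stream in large chunks, decodes a zig-zag varint long directly from its byte[] buffer, and hands out a cached InputStream view over the same buffer, so non-Avro data that follows the Avro data is not lost to read-ahead.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; names and buffer size are assumptions, not Avro's API.
class BufferingDecoderSketch {
    private final InputStream source;
    private final byte[] buf = new byte[8192]; // assumed read-ahead buffer size
    private int pos = 0;      // index of the next unread byte in buf
    private int limit = 0;    // number of valid bytes in buf
    private InputStream view; // created on demand, then cached

    BufferingDecoderSketch(InputStream source) {
        this.source = source;
    }

    // Refill the buffer with one bulk read; returns false at end of stream.
    private boolean fill() throws IOException {
        int n = source.read(buf, 0, buf.length);
        if (n <= 0) {
            return false;
        }
        pos = 0;
        limit = n;
        return true;
    }

    // Decode a zig-zag, variable-length long straight from the buffer,
    // touching the underlying stream only when the buffer runs dry.
    long readLong() throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            if (pos == limit && !fill()) {
                throw new EOFException();
            }
            int b = buf[pos++] & 0xff;
            raw |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
            if (shift > 63) {
                throw new IOException("Invalid long encoding");
            }
        }
        return (raw >>> 1) ^ -(raw & 1); // undo zig-zag
    }

    // InputStream view over the decoder's own buffer. It drains bytes that
    // were already read ahead before pulling more from the wrapped stream.
    InputStream inputStream() {
        if (view == null) {
            view = new InputStream() {
                @Override
                public int read() throws IOException {
                    if (pos == limit && !fill()) {
                        return -1;
                    }
                    return buf[pos++] & 0xff;
                }

                @Override
                public int read(byte[] dst, int off, int len) throws IOException {
                    if (len == 0) {
                        return 0;
                    }
                    if (pos == limit && !fill()) {
                        return -1;
                    }
                    int n = Math.min(len, limit - pos);
                    System.arraycopy(buf, pos, dst, off, n);
                    pos += n;
                    return n;
                }
            };
        }
        return view;
    }

    public static void main(String[] args) throws IOException {
        // Two encoded longs followed by two bytes of non-Avro data.
        byte[] data = {0x02, (byte) 0x80, 0x01, 'h', 'i'};
        BufferingDecoderSketch d =
            new BufferingDecoderSketch(new ByteArrayInputStream(data));
        System.out.println(d.readLong());
        System.out.println(d.readLong());
        InputStream rest = d.inputStream();
        System.out.println((char) rest.read() + "" + (char) rest.read());
    }
}
```

The view shares pos and limit with the decoder, which is why the wrapped stream must not be read directly once it has been handed to the decoder.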