Hi all,

We're building an application that needs a high-throughput Kafka consumer in C++, with Avro as the wire format.
Currently memoryInputStream() allocates a new MemoryInputStream2 on the heap on every call. For context, we're decoding in a tight loop, so that adds a new/delete per message for an object that holds just three scalars. To avoid that, we wrote an internal InputStream that reuses its buffer via a reset() method (a simplified sketch is at the end of this mail), and I thought it might be worth hoisting something like it into the Avro C++ implementation itself.

I benchmarked the difference on my local machine and the impact was pretty significant, although it's a local, hacky benchmark, so take it with a grain of salt. Full decode of a 3-field record with binaryDecoder, release build:

- Stock memoryInputStream(): 111 ns/decode (~9M decodes/sec)
- Reusable stream with reset(): 74 ns/decode (~13.5M decodes/sec)

Happy to put together a patch if there's interest. Any thoughts on the right approach, or on whether this is something that would be desired?

Thanks,
Robbie
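P.S. For reference, here is a simplified sketch of roughly what we use internally. The class name, the reset() signature, and the decodeOne() helper are ours (hypothetical), not anything in the Avro codebase; the sketch just implements the existing avro::InputStream interface over a caller-owned buffer so the stream object can be reused across messages.

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>

    #include "avro/Decoder.hh"
    #include "avro/Specific.hh"
    #include "avro/Stream.hh"

    // Single-buffer InputStream that can be re-pointed at a new message
    // without a fresh heap allocation. The buffer is owned by the caller
    // and must outlive the decode of that message.
    class ReusableMemoryInputStream : public avro::InputStream {
    public:
        ReusableMemoryInputStream() : data_(nullptr), size_(0), pos_(0) {}

        // Point the stream at the next message's bytes; no allocation.
        void reset(const uint8_t* data, size_t size) {
            data_ = data;
            size_ = size;
            pos_ = 0;
        }

        // Hand back whatever is left of the buffer in one chunk.
        bool next(const uint8_t** data, size_t* len) override {
            if (pos_ == size_) {
                return false;
            }
            *data = data_ + pos_;
            *len = size_ - pos_;
            pos_ = size_;
            return true;
        }

        void backup(size_t len) override { pos_ -= len; }

        void skip(size_t len) override {
            pos_ += std::min(len, size_ - pos_);
        }

        size_t byteCount() const override { return pos_; }

    private:
        const uint8_t* data_;
        size_t size_;
        size_t pos_;
    };

    // Decode-loop sketch: RecordT stands in for the generated 3-field struct.
    // The stream and decoder are constructed once and reused per message.
    template <typename RecordT>
    void decodeOne(ReusableMemoryInputStream& in, avro::Decoder& d,
                   const uint8_t* bytes, size_t len, RecordT& out) {
        in.reset(bytes, len);
        d.init(in);
        avro::decode(d, out);
    }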
