[ https://issues.apache.org/jira/browse/KAFKA-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739077#comment-16739077 ]
douyu commented on KAFKA-5761:
------------------------------

+1

> Serializer API should support ByteBuffer
> ----------------------------------------
>
>                 Key: KAFKA-5761
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5761
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients
>    Affects Versions: 0.11.0.0
>            Reporter: Bhaskar Gollapudi
>            Priority: Major
>              Labels: features, performance
>
> Consider the Serializer. Its main method is:
> byte[] serialize(String topic, T data);
> Producer applications create an implementation that takes an instance
> (of T) and converts it to a byte[]. This byte array is allocated anew for
> each message. The byte array is then handed over to the Kafka Producer API
> internals, which write the bytes to a buffer or network socket. When the
> next message arrives, the serializer, instead of creating a new byte[],
> should try to reuse the existing byte[] for the new message. This requires
> two things:
> 1. Handing off the bytes to the buffer/socket and reusing the byte[] must
> happen on the same thread.
> 2. There must be a way to mark the end of the available bytes in the
> byte[].
> The first is reasonably simple to understand. If it does not hold, and
> there is no other synchronization, the byte[] gets corrupted and so does
> the message written to the buffer/socket. However, this requirement is
> easy to meet for a producer application, because it controls the threads
> on which the serializer is invoked.
> The second is where the problem lies with the current API. It gives no way
> to hand over a variable number of bytes from a container: the number of
> bytes is always the byte[]'s full length. This forces the producer to
> 1. either create a new byte[] for a message that is bigger than the
> previous one,
> OR
> 2. decide on a maximum size and use padding.
> Both are cumbersome and error prone, and may waste network bandwidth.
> Instead, if there were a Serializer with this method:
> ByteBuffer serialize(String topic, T data);
> clients could implement a reusable byte container and avoid an
> allocation for each message.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
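
For illustration, the ByteBuffer-returning method proposed in the quoted description could be used roughly as sketched below. This is only a sketch under stated assumptions: the ByteBufferSerializer interface and the ReusableStringSerializer class are hypothetical names made up for this example and are not part of the Kafka client API. The buffer's limit marks the end of the valid bytes (requirement 2), and the single shared buffer is only safe if serialization and hand-off happen on the same thread (requirement 1).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ReusableBufferDemo {

    // Hypothetical interface mirroring the method proposed in the issue.
    // It is NOT part of the Kafka client API.
    interface ByteBufferSerializer<T> {
        ByteBuffer serialize(String topic, T data);
    }

    // Not thread-safe: every call overwrites and returns the same buffer,
    // so the caller must finish with it before serializing the next message.
    static final class ReusableStringSerializer implements ByteBufferSerializer<String> {
        private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        private ByteBuffer buffer;

        ReusableStringSerializer(int initialCapacity) {
            buffer = ByteBuffer.allocate(initialCapacity);
        }

        @Override
        public ByteBuffer serialize(String topic, String data) {
            // Worst-case encoded size; grow only when the message cannot fit.
            int worstCase = (int) Math.ceil(data.length() * (double) encoder.maxBytesPerChar());
            if (buffer.capacity() < worstCase) {
                buffer = ByteBuffer.allocate(worstCase);
            }
            buffer.clear();                 // position = 0, limit = capacity
            encoder.reset();
            encoder.encode(CharBuffer.wrap(data), buffer, true);
            encoder.flush(buffer);
            buffer.flip();                  // limit = end of valid bytes, position = 0
            return buffer;
        }
    }

    public static void main(String[] args) {
        ReusableStringSerializer serializer = new ReusableStringSerializer(64);

        ByteBuffer first = serializer.serialize("demo-topic", "hello world");
        System.out.println(first.remaining());   // prints 11

        ByteBuffer second = serializer.serialize("demo-topic", "hi");
        System.out.println(second.remaining());  // prints 2, no padding needed
        System.out.println(first == second);     // prints true, same buffer reused
    }
}
```

A shorter message simply yields a smaller limit on the same buffer, so neither a fresh array nor padding to a maximum size is needed. A larger message triggers one reallocation, after which the bigger buffer is reused.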