[I] [Java][Format] Should Encoder work with ByteBuffer (via MemoryBuffer)? [fury]

via GitHub Tue, 21 Jan 2025 12:45:42 -0800


stevenschlansker opened a new issue, #2019:
URL: https://github.com/apache/fury/issues/2019


   ### Feature Request
   
   We are evaluating integrating the Fury row format into a Kafka application. Kafka's `Deserializer` interface provides the signature
   ```
   T Deserializer.deserialize(String topic, Headers headers, ByteBuffer data)
   ```
   To use this with either of
   ```
   T Encoder.decode(byte[])
   T RowEncoder.fromRow(BinaryRow row)
   ```
   some slightly awkward adaptation is necessary.
   
   In the first case, the adaptation is simple and handled automatically by a default method, but it costs an unnecessary copy of the buffer into a `byte[]`.
   In the second case, I was able to adapt it by using the existing implementation as an example, but it feels like something the framework should provide out of the box:
   
   ```
       public kafka.Deserializer<Rec> deserializer() {
           final var schema = rowEncoder.schema();
           final var schemaHash = DataTypes.computeSchemaHash(schema);
           return new kafka.Deserializer<Rec>() {
               @Override
               public Rec deserialize(final String topic, final byte[] data) {
                   // This case is easy
                   return rowEncoder.decode(data);
               }
   
               @Override
               public Rec deserialize(final String topic, final Headers headers, final ByteBuffer data) {
                   // This case... not so much
                   final MemoryBuffer buffer = MemoryUtils.wrap(data);
                   final long peerSchemaHash = buffer.readInt64();
                   if (peerSchemaHash != schemaHash) {
                       throw new ClassNotCompatibleException(
                               String.format(
                                       "Schema is not consistent, encoder schema is %s. "
                                               + "self/peer schema hash are %s/%s. "
                                               + "Please check writer schema.",
                                               schema, schemaHash, peerSchemaHash));
                   }
                   final BinaryRow row = new BinaryRow(schema);
                   row.pointTo(buffer, buffer.readerIndex(), buffer.size());
                   return rowEncoder.fromRow(row);
               }
           };
       }
   ```
   
   ### Is your feature request related to a problem? Please describe
   
   For `decode`, a seemingly unnecessary `byte[]` copy could be avoided.
   For `fromRow`, some repeated low-level code could be hidden inside the Fury framework.
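
To make the copy concrete, here is a minimal sketch using only `java.nio`, with no Fury types involved. The class and method names (`ByteBufferCopySketch`, `copyRemaining`, `peekLeadingLong`, `payloadView`) are hypothetical and exist only for this illustration, and the little-endian layout of the leading 8-byte hash is an assumption. It contrasts the copy that a `byte[]`-only API forces on a `ByteBuffer` caller with reading the header and payload through views of the same memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; none of these names are Fury or Kafka APIs.
final class ByteBufferCopySketch {

    // Copying path: a byte[]-only decode API forces the caller to allocate
    // and fill a new array, even when the data already sits in a buffer.
    public static byte[] copyRemaining(ByteBuffer data) {
        byte[] bytes = new byte[data.remaining()];
        data.duplicate().get(bytes); // duplicate() leaves the caller's position untouched
        return bytes;
    }

    // View path: read the leading 8-byte value in place.
    // ASSUMPTION: the header is little-endian.
    public static long peekLeadingLong(ByteBuffer data) {
        return data.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // View path: expose everything after the 8-byte header without copying.
    public static ByteBuffer payloadView(ByteBuffer data) {
        ByteBuffer view = data.duplicate();
        view.position(view.position() + Long.BYTES);
        return view.slice(); // shares memory with 'data'
    }

    public static void main(String[] args) {
        // Build a direct buffer holding an 8-byte header and a 4-byte payload.
        ByteBuffer data = ByteBuffer.allocateDirect(12).order(ByteOrder.LITTLE_ENDIAN);
        data.putLong(0x1122334455667788L).putInt(42);
        data.flip();

        System.out.println("copied bytes: " + copyRemaining(data).length);
        System.out.println("header: " + Long.toHexString(peekLeadingLong(data)));
        ByteBuffer payload = payloadView(data).order(ByteOrder.LITTLE_ENDIAN);
        System.out.println("payload int: " + payload.getInt());
    }
}
```

A `MemoryBuffer`-based overload would presumably let the framework follow the view path internally, so callers would not need to repeat this kind of header handling themselves.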
   
   ### Describe the solution you'd like
   
   Add new methods:
   ```
   Encoder.decode(MemoryBuffer buf)
   RowEncoder.fromRow(MemoryBuffer buf)
   ```
   
   It may also be worth examining whether adding
   ```
   Encoder.encodeTo(T obj, MemoryBuffer buf)
   ```
   would avoid an intermediate `byte[]` on the encoding side too.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   Thank you for your consideration! We hope to adopt Fury into our application 
if the prototyping works out.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

