GitHub user gudladona added a comment to the discussion: Native Protobuf Record
Payload
Thanks for the feedback. I think I need to familiarize myself with the
RecordMerger. Historically, the (Avro) data itself is stored in the Payload as
bytes. See the base class constructor below:
```
/**
 * Instantiate {@link BaseAvroPayload}.
 *
 * @param record      Generic record for the payload.
 * @param orderingVal {@link Comparable} to be used in pre combine.
 */
public BaseAvroPayload(GenericRecord record, Comparable orderingVal) {
  this.recordBytes = record != null ? HoodieAvroUtils.avroToBytes(record) : new byte[0];
  this.orderingVal = orderingVal;
  this.isDeletedRecord = record == null || isDeleteRecord(record);
  if (orderingVal == null) {
    throw new HoodieException("Ordering value is null for record: " + record);
  }
}
```
Here we store the Avro record as bytes anyway, and `getInsertValue` later uses
them to create a GenericRecord. My thinking is that we could accomplish the same
with proto bytes and a schema. That would avoid the current implementation's
chain of conversions: proto to Avro, then Avro to recordBytes, and then back to
a GenericRecord. We would need only one pass of proto-to-Avro conversion or, if
a proto Parquet writer is used, we could simply pass the proto bytes as the
source along with the schema.
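
A minimal sketch of the idea, using only the Java standard library so it runs
on its own. Everything here is hypothetical and not part of Hudi:
`RawBytesPayload`, its `decoder` parameter, and `rawBytes()` are illustrative
names. The decoder stands in for whatever single proto-to-Avro conversion (or
proto Parquet writer hand-off) would be used in practice.

```java
import java.util.Optional;
import java.util.function.Function;

/**
 * Hypothetical payload that keeps the serialized source bytes as-is and
 * decodes them only when a record is requested. Not a Hudi class.
 */
final class RawBytesPayload<T> {
  private final byte[] recordBytes;
  private final Comparable<?> orderingVal;
  // Assumption: stands in for the single bytes-to-record conversion.
  private final Function<byte[], T> decoder;

  RawBytesPayload(byte[] sourceBytes, Comparable<?> orderingVal, Function<byte[], T> decoder) {
    if (orderingVal == null) {
      throw new IllegalArgumentException("Ordering value is null");
    }
    if (decoder == null) {
      throw new IllegalArgumentException("Decoder is null");
    }
    // Mirror the base class: a missing record becomes an empty byte array.
    this.recordBytes = sourceBytes != null ? sourceBytes.clone() : new byte[0];
    this.orderingVal = orderingVal;
    this.decoder = decoder;
  }

  /** Decodes the stored bytes in one pass; empty when there is no record. */
  Optional<T> getInsertValue() {
    if (recordBytes.length == 0) {
      return Optional.empty();
    }
    return Optional.of(decoder.apply(recordBytes));
  }

  /** Returns a copy of the source bytes, e.g. for a writer that accepts them directly. */
  byte[] rawBytes() {
    return recordBytes.clone();
  }

  Comparable<?> getOrderingVal() {
    return orderingVal;
  }
}
```

With this shape, the source bytes are stored once at construction and converted
at most once per `getInsertValue` call, and a writer that understands the
source format could read `rawBytes()` without any conversion.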
GitHub link:
https://github.com/apache/hudi/discussions/13867#discussioncomment-14362625