stevenschlansker opened a new pull request, #2414: URL: https://github.com/apache/fory/pull/2414
## What does this PR do?

Introduce an alternate "compact" row encoding that makes better use of knowledge of fixed-size types and sacrifices alignment to save space.

Introduce a new Builder pattern to avoid an explosion of `Encoders` static methods as more features are added to the row format.

Optimizations include:

* struct stores fixed-size fields (e.g. Int128, FixedSizeBinary) inline in the fixed-data area without offset + size
* struct of all fixed-size fields is itself considered fixed-size when stored in another struct or array
* struct skips the null bitmap if all fields are non-nullable
* struct sorts fields by fixed size for best-effort (but not guaranteed) alignment
* struct can use less than 8 bytes for small data (int, short, etc.)
* struct null bitmap is stored at the end of the struct to borrow alignment padding if possible
* array stores fixed-size fields inline in the fixed-data area without offset + size
* array header uses 4 bytes for size (since Collection and array are only int-sized) and leaves the remaining 4 bytes for the start of the null bitmap

Fixups include:

* toString better handles varbinary / fixed-binary (hex dump of first 256 bytes)
* start making Javadoc for row format

TODO:

* [ ] figure out how top-level array and map encoders fit into the new builder class

Not compatible with the existing row format.

## Related issues

Fixes #2337

## Does this PR introduce any user-facing change?

New API for the new Compact codec. Existing codec unchanged.
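
As a rough illustration of the struct optimizations listed in the description, the following Java sketch computes a field order and a fixed-data area size under those rules. It is not the code from this PR: the `Field` class, the method names, the 8-byte offset + size slot for variable-size fields, the widest-first sort order, and the one-bit-per-field bitmap size are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: these names are hypothetical and are not the PR's API.
class CompactLayoutSketch {

  /** Hypothetical field descriptor; fixedWidth < 0 marks a variable-size field. */
  static final class Field {
    final String name;
    final int fixedWidth;
    final boolean nullable;

    Field(String name, int fixedWidth, boolean nullable) {
      this.name = name;
      this.fixedWidth = fixedWidth;
      this.nullable = nullable;
    }

    boolean isFixed() {
      return fixedWidth >= 0;
    }

    // Assumption: a variable-size field keeps an 8-byte offset + size slot.
    int slotWidth() {
      return isFixed() ? fixedWidth : 8;
    }
  }

  /** Sorts widest slot first (assumed order); the stable sort keeps ties in declared order. */
  static List<Field> order(List<Field> fields) {
    List<Field> sorted = new ArrayList<>(fields);
    sorted.sort((a, b) -> Integer.compare(b.slotWidth(), a.slotWidth()));
    return sorted;
  }

  /** The null bitmap is skipped entirely when no field is nullable. */
  static int bitmapBytes(List<Field> fields) {
    boolean anyNullable = false;
    for (Field f : fields) {
      anyNullable |= f.nullable;
    }
    // Assumption: one bit per field, rounded up to whole bytes.
    return anyNullable ? (fields.size() + 7) / 8 : 0;
  }

  /** Fixed-data area: inline slots first, then the null bitmap (if any) at the end. */
  static int fixedAreaSize(List<Field> fields) {
    int size = 0;
    for (Field f : fields) {
      size += f.slotWidth();
    }
    return size + bitmapBytes(fields);
  }

  /** A struct whose fields are all fixed-size is itself treated as fixed-size. */
  static boolean isFixedSize(List<Field> fields) {
    for (Field f : fields) {
      if (!f.isFixed()) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Field> fields = Arrays.asList(
        new Field("flag", 2, false),
        new Field("id", 8, false),
        new Field("count", 4, false));
    for (Field f : order(fields)) {
      System.out.println(f.name + " width=" + f.slotWidth());
    }
    System.out.println("fixed area bytes: " + fixedAreaSize(fields)); // 14
    System.out.println("fixed-size struct: " + isFixedSize(fields)); // true
  }
}
```

With three non-nullable fields of 2, 8, and 4 bytes, the sketch yields a 14-byte fixed area with no bitmap. Adding one nullable variable-size field would add its assumed 8-byte slot plus a single bitmap byte, and the struct would no longer count as fixed-size.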
