ArnavBalyan opened a new pull request, #704:
URL: https://github.com/apache/arrow-go/pull/704

   cc @julienledem @alamb 
   
   ### Rationale for this change
    - Implements ALP (Adaptive Lossless floating-Point) encoding for float and 
double columns, following Prateek's spec. Related to 
https://github.com/apache/parquet-format/pull/548.
    - ALP converts floating-point values to integers via decimal scaling, then 
applies frame-of-reference (FOR) encoding and bit packing. Values that do not 
round-trip exactly are stored as exceptions.
    - The encoder is incremental and flushes each vector as values arrive. The 
decoder is lazy: vectors are decoded on demand via an offset array.
   
   ### What changes are included in this PR?
    - Wired the ALP implementation through encoder.go and decoder.go, based on 
the original 
[spec](https://docs.google.com/document/d/1xz2cudDpN2Y1ImFcTXh15s-3fPtD_aWt/edit).
 
    - Registered a new encoding as Encodings.ALP.
    - Added unit tests and cross-compatibility tests.
   
   ### Are these changes tested?
    - Yes. New unit tests were added, and all of them pass along with the 
existing tests.
    - Cross-compatibility tests were added: an externally encoded file can be 
supplied via an environment variable, which triggers arrow-go ALP decoding of 
that file.
   
   ### Are there any user-facing changes?
    - Yes. A new encoding, Encodings.ALP, is available for float and double 
columns.

