ianmcook commented on code in PR #35:
URL: https://github.com/apache/arrow-experiments/pull/35#discussion_r1843912421

##########
http/get_compressed/README.md:
##########
@@ -20,3 +20,160 @@
 # HTTP GET Arrow Data: Compression Examples

 This directory contains examples of HTTP servers/clients that transmit/receive data in the Arrow IPC streaming format and use compression (in various ways) to reduce the size of the transmitted data.
+
+Since we re-use the [Arrow IPC format][ipc] for transferring Arrow data over
+HTTP and both Arrow IPC and HTTP standards support compression on their own,
+there are at least two approaches to this problem:
+
+1. Compressed HTTP responses carrying Arrow IPC streams with uncompressed
+   array buffers.
+2. Uncompressed HTTP responses carrying Arrow IPC streams with compressed
+   array buffers.
+
+Applying both IPC buffer and HTTP compression to the same data is not
+recommended. The extra CPU overhead of decompressing the data twice is
+not worth any possible gains that double compression might bring. If
+compression ratios are unambiguously more important than reducing CPU
+overhead, then a different compression algorithm that optimizes for that can
+be chosen.
+
+This table shows the support for different compression algorithms in HTTP and
+Arrow IPC:
+
+| Format             | HTTP Support    | IPC Support     |
+| ------------------ | --------------- | --------------- |
+| gzip (GZip)        | X               |                 |
+| deflate (DEFLATE)  | X               |                 |
+| br (Brotli)        | X[^2]           |                 |
+| zstd (Zstandard)   | X[^2]           | X               |
+| lz4 (LZ4)          |                 | X               |

Review Comment:
```suggestion
| Codec      | Identifier  | HTTP Support  | IPC Support  |
|----------- | ----------- | ------------- | ------------ |
| GZip       | `gzip`      | X             |              |
| DEFLATE    | `deflate`   | X             |              |
| Brotli     | `br`        | X[^2]         |              |
| Zstandard  | `zstd`      | X[^2]         | X[^3]        |
| LZ4        | `lz4`       |               | X[^3]        |
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
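
For readers following the thread, the support table in the suggestion and the README's advice against double compression can be expressed as a small lookup. This is only a minimal sketch, not code from the PR: `SUPPORT`, `codecs_for`, and `choose_layer` are hypothetical names, and the qualifications behind footnotes `[^2]` and `[^3]` are not modeled because their text is not part of this hunk.

```python
# Codec identifiers from the suggested table, mapped to the layers
# ("http" and/or "ipc") that the table marks with an X.
# Footnote qualifications ([^2], [^3]) are deliberately not modeled here.
SUPPORT = {
    "gzip": {"http"},
    "deflate": {"http"},
    "br": {"http"},
    "zstd": {"http", "ipc"},
    "lz4": {"ipc"},
}


def codecs_for(layer):
    """Return the sorted codec identifiers the table lists for `layer`."""
    return sorted(name for name, layers in SUPPORT.items() if layer in layers)


def choose_layer(http_codec=None, ipc_codec=None):
    """Hypothetical helper: validate a compression plan against the table.

    Returns a (layer, codec) tuple, or (None, None) when no compression
    is requested. Raises ValueError if a codec is not listed for its
    layer, or if both layers are compressed at once, which the README
    text advises against.
    """
    if http_codec is not None and "http" not in SUPPORT.get(http_codec, set()):
        raise ValueError(f"{http_codec!r} is not listed as an HTTP codec")
    if ipc_codec is not None and "ipc" not in SUPPORT.get(ipc_codec, set()):
        raise ValueError(f"{ipc_codec!r} is not listed as an IPC codec")
    if http_codec is not None and ipc_codec is not None:
        raise ValueError("compress at the HTTP layer or the IPC layer, not both")
    if http_codec is not None:
        return ("http", http_codec)
    if ipc_codec is not None:
        return ("ipc", ipc_codec)
    return (None, None)
```

Keeping the identifier (`gzip`, `br`, ...) as the lookup key mirrors the separate Identifier column proposed in the suggestion, since the identifier is the value a client or server would actually exchange.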
