kylebarron edited a comment on issue #180: URL: https://github.com/apache/arrow-rs/issues/180#issuecomment-1058827130
Another update on the compression codecs:

- [x] Uncompressed
- [x] Snappy: works out of the box with the `snap` feature
- [x] Gzip: works out of the box with the `flate2` feature
- [x] Brotli: works out of the box with the `brotli` feature. (I thought this had failed for me before; maybe it was [updating my llvm/clang](https://github.com/kylebarron/parquet-wasm/pull/2) while debugging ZSTD that made this work.)
- [x] ZSTD: works with [this unreleased commit](https://github.com/gyscos/zstd-rs/commit/d6bfa32d09b8e4ef747c9b57109974c270ffab72) on `zstd-rs`, merged in January 2022, a day after the `0.10.0` release was cut 🥲. Tested and working when pointing at the latest `zstd-rs` master with `default-features = false`. Hopefully they make a new release soon. (Should work as of #1414.)
- [ ] LZ4: The currently used [`lz4-rs`](https://github.com/10XGenomics/lz4-rs) hasn't had a release since June 8, 2020. It accepted a PR from July 2020 that [purported to add support for `wasm32-unknown-unknown`](https://github.com/10XGenomics/lz4-rs/pull/11), but I pulled their repo and couldn't get it to build for that target. [`lz4-flex`](https://github.com/PSeitz/lz4_flex) successfully compiles to WASM, but its API is slightly different. I [tried](https://github.com/kylebarron/arrow-rs/pull/3/files#diff-73978efa44253b6c1cafc48e0fd042b761ebfff35cb32c9f53717d1641dab0fe) to port `parquet/src/compression.rs` to the `lz4-flex` API (see the sketch at the end of this comment). It compiles fine, but I get panics when testing the WASM with an LZ4 Parquet file in the browser, and I don't really know what I'm doing wrong 😅. Switching to `lz4-flex` seems very achievable by someone who knows Rust better than I do 😄.
- [ ] LZO: I see references to LZO compression in `parquet/src/basic.rs`, but it doesn't seem to be implemented in `parquet/src/compression.rs`. I hadn't heard of LZO before, and according to Wes, [LZO isn't really used anymore](https://github.com/apache/arrow/issues/2209#issuecomment-402859258), so I don't see myself working on this.

On the Arrow IPC files being malformed: I switched from `arrow::ipc::writer::StreamWriter` to `arrow::ipc::writer::FileWriter` (a minimal sketch of that is also at the end of this comment). Now all of the Arrow files generated by `parquet-wasm` from my Parquet test files are readable in _Python_ with `pa.ipc.open_file`, so presumably the JS errors coming from `arrow.tableFromIPC(tableBytes)` are issues in the JS library's IPC parser. (There appear to be known issues with IPC support in JS: [ARROW-15642](https://issues.apache.org/jira/browse/ARROW-15642), [ARROW-13818](https://issues.apache.org/jira/browse/ARROW-13818), [ARROW-8674](https://issues.apache.org/jira/browse/ARROW-8674).)
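For reference, here's a minimal sketch of the compress/decompress pair an `lz4-flex`-backed codec could use, written against the `lz4_flex::frame` API. The function names and signatures are my own approximation of what `parquet/src/compression.rs` needs (not the crate's actual `Codec` trait), and it only demonstrates the frame API itself, so it may not match the exact framing the existing LZ4 codec expects:

```rust
use std::io::{self, Read, Write};

use lz4_flex::frame::{FrameDecoder, FrameEncoder};

// Compress `input` into `output` using the LZ4 frame format.
fn lz4_compress(input: &[u8], output: &mut Vec<u8>) -> io::Result<()> {
    let mut encoder = FrameEncoder::new(output);
    encoder.write_all(input)?;
    // `finish` flushes the frame trailer; map its error type back into io::Error.
    encoder
        .finish()
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e.to_string()))?;
    Ok(())
}

// Decompress an LZ4 frame from `input`, appending to `output`; returns bytes written.
fn lz4_decompress(input: &[u8], output: &mut Vec<u8>) -> io::Result<usize> {
    let mut decoder = FrameDecoder::new(input);
    decoder.read_to_end(output)
}
```

(This assumes the `frame` module of `lz4_flex` is available; it's feature-gated in some versions.)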
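And for the IPC piece, a minimal Rust-side sketch of the `FileWriter` switch described above, i.e. writing the IPC *file* format (with footer) so the output is readable by `pa.ipc.open_file`. The helper name and toy schema here are just for illustration:

```rust
use std::sync::Arc;

use arrow::array::{ArrayRef, Int32Array};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::error::Result;
use arrow::ipc::writer::FileWriter;
use arrow::record_batch::RecordBatch;

// Serialize a RecordBatch as Arrow IPC *file* format bytes,
// rather than the stream format that `StreamWriter` produces.
fn batch_to_ipc_file_bytes(batch: &RecordBatch) -> Result<Vec<u8>> {
    let mut buf = Vec::new();
    {
        let mut writer = FileWriter::try_new(&mut buf, batch.schema().as_ref())?;
        writer.write(batch)?;
        writer.finish()?; // writes the footer that `pa.ipc.open_file` expects
    }
    Ok(buf)
}

fn main() -> Result<()> {
    let schema = Arc::new(Schema::new(vec![Field::new("x", DataType::Int32, false)]));
    let batch = RecordBatch::try_new(
        schema,
        vec![Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef],
    )?;
    let bytes = batch_to_ipc_file_bytes(&batch)?;
    println!("wrote {} bytes of Arrow IPC file data", bytes.len());
    Ok(())
}
```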