[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-10-23 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1288388631 I have changed the interop test to follow the approach used for encryption: it downloads test files from the parquet-testing repo and verifies the data decompressed and decoded from them.

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-10-18 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1282105001 > Yep, we test encryption interop using binary files in the parquet-testing repo. @wgtmac Please have a look at this code:

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-10-17 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1281706700 > Looks good. The only thing is we checked in binary files directly. It would be hard to maintain in the future. Can you generate the parquet file using the parquetwriter?

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-09-29 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1262462916 The interop test has been added. Please take a look again. Thanks! @shangxinli @pitrou -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-09-28 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1261165283 In terms of the interop test, I plan to simply run the new ScanCommand (introduced by https://github.com/apache/parquet-mr/pull/998) of parquet-cli on the two lz4_raw parquet files in

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-09-28 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1261133491 @shangxinli Thanks for the review. I haven't been able to run the benchmark since the Hadoop Lz4Codec is implemented via JNI and I cannot simply run it on my Mac M1 laptop.

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-09-27 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1259431976 > @wgtmac Did you try to read an actual file produced by Parquet C++? > > Note you can find such files in https://github.com/apache/parquet-testing/ Yes, I have tried that.

[GitHub] [parquet-mr] wgtmac commented on pull request #1000: PARQUET-2196: Support LZ4_RAW codec

2022-09-27 Thread GitBox
wgtmac commented on PR #1000: URL: https://github.com/apache/parquet-mr/pull/1000#issuecomment-1259233215 @pitrou @shangxinli Can you please take a look? Thanks in advance!