Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-28 Thread via GitHub


alamb commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2761670007

   🎉  -- thank you so much @psvri  for sticking with this -- the crate is 
better all around because of it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-28 Thread via GitHub


mbrobbel merged PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-27 Thread via GitHub


mbrobbel commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2758985152

   @psvri if you merge/rebase `main` (to include #7336) this should be good to 
go.





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-13 Thread via GitHub


alamb commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2722518262

   > The MSRV action is failing since I haven't updated the Rust version from 
1.70 to 1.75 in my branch.
   > 
   > Would it be possible to update the MSRV in the next major version?
   
   I think we should
   
   I think we are blocked on someone writing down a policy.
   - #181 





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-13 Thread via GitHub


psvri commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2720981944

   The MSRV action is failing since I haven't updated the Rust version from 
1.70 to 1.75 in my branch.
   
   Would it be possible to update the MSRV in the next major version?





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-13 Thread via GitHub


psvri commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2719606613

   Hello @alamb 
   
   I got these numbers by running gzip benchmarks from this file 
https://github.com/apache/arrow-rs/blob/main/parquet/benches/compression.rs





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-12 Thread via GitHub


alamb commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2719077139

   Hi @psvri  -- thank you for this contribution
   
   I noticed that this PR reports benchmark numbers. Which benchmarks were these? 
Are they from zlib-rs? 
   
   Did you run any benchmarks for the parquet crate itself (as in from the 
https://github.com/apache/arrow-rs/tree/main/parquet/benches directory)?
   
   ```
   Benchmarking compress GZIP(GzipLevel(6)) - alphanumeric: Collecting 100 samples in estimated 5.0406 s (200 iterations)
   compress GZIP(GzipLevel(6)) - alphanumeric
           time:   [24.395 ms 24.934 ms 25.612 ms]
           change: [-33.807% -31.734% -29.276%] (p = 0.00 < 0.05)
           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     1 (1.00%) high mild
     3 (3.00%) high severe
   ```
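
For readers unfamiliar with this output format: the `change` line is a relative change in time against the saved baseline, so a negative value means the new run is faster. A minimal sketch of that arithmetic follows, assuming the change is (new - old) / old. The function names are hypothetical and not part of the parquet crate or the benchmark harness, and the implied baseline is only approximate because the reported change is itself an estimate.

```rust
/// Speedup factor implied by a relative change in time, given in percent
/// (negative means faster). For example, -50.0 corresponds to a 2x speedup.
/// Hypothetical helper for illustration only.
fn speedup_factor(change_percent: f64) -> f64 {
    1.0 / (1.0 + change_percent / 100.0)
}

/// Approximate baseline time implied by the new time and the relative change.
/// Hypothetical helper for illustration only.
fn implied_baseline_ms(new_ms: f64, change_percent: f64) -> f64 {
    new_ms * speedup_factor(change_percent)
}

fn main() {
    // Point estimates copied from the benchmark output above.
    let new_ms = 24.934;
    let change_percent = -31.734;

    println!("speedup: {:.2}x", speedup_factor(change_percent));
    println!(
        "implied baseline: {:.1} ms",
        implied_baseline_ms(new_ms, change_percent)
    );
}
```

With the point estimates above, this works out to roughly a 1.46x speedup and an implied baseline of about 36.5 ms for the alphanumeric gzip level 6 case.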





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-12 Thread via GitHub


alamb commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2719071233

   I merged this PR up from main to rerun the tests, as I think the failing CI 
check was resolved





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-03-07 Thread via GitHub


alamb commented on PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#issuecomment-2706255823

   It seems like this would be another good reason to
   - #181  
   
   🤔 
   
   





Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-02-26 Thread via GitHub


psvri commented on code in PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#discussion_r1972003732


##
parquet/Cargo.toml:
##
@@ -50,7 +50,7 @@ bytes = { version = "1.1", default-features = false, features = ["std"] }
 thrift = { version = "0.17", default-features = false }
 snap = { version = "1.0", default-features = false, optional = true }
 brotli = { version = "7.0", default-features = false, features = ["std"], optional = true }
-flate2 = { version = "1.0", default-features = false, features = ["rust_backend"], optional = true }
+flate2 = { version = "1.1", default-features = false, features = ["zlib-rs"], optional = true }

Review Comment:
   Yes, it's written in pure Rust. 
   
   In my fork, the wasm32 pipeline is not failing: 
https://github.com/psvri/arrow-rs/actions/runs/13547936797/job/37864105483






Re: [PR] Improve parquet gzip compression performance using zlib-rs [arrow-rs]

2025-02-26 Thread via GitHub


kylebarron commented on code in PR #7200:
URL: https://github.com/apache/arrow-rs/pull/7200#discussion_r1971995097


##
parquet/Cargo.toml:
##
@@ -50,7 +50,7 @@ bytes = { version = "1.1", default-features = false, features = ["std"] }
 thrift = { version = "0.17", default-features = false }
 snap = { version = "1.0", default-features = false, optional = true }
 brotli = { version = "7.0", default-features = false, features = ["std"], optional = true }
-flate2 = { version = "1.0", default-features = false, features = ["rust_backend"], optional = true }
+flate2 = { version = "1.1", default-features = false, features = ["zlib-rs"], optional = true }

Review Comment:
   Is this pure-rust? Does this compile for `wasm32-unknown-unknown`?


