tustvold commented on code in PR #2943:
URL: https://github.com/apache/arrow-rs/pull/2943#discussion_r1007271432


##########
parquet/src/compression.rs:
##########
@@ -325,6 +327,65 @@ mod zstd_codec {
 #[cfg(any(feature = "zstd", test))]
 pub use zstd_codec::*;
 
+#[cfg(any(feature = "lz4", test))]
+mod lz4_raw_codec {
+    use std::io::{Read, Write};
+
+    use crate::compression::Codec;
+    use crate::errors::Result;
+
+    /// Codec for LZ4 Raw compression algorithm.
+    pub struct LZ4RawCodec {}
+
+    impl LZ4RawCodec {
+        /// Creates new LZ4 Raw compression codec.
+        pub(crate) fn new() -> Self {
+            Self {}
+        }
+    }
+
+    // Compute max LZ4 uncompress size.
+    // Check https://stackoverflow.com/questions/25740471/lz4-library-decompressed-data-upper-bound-size-estimation
+    fn max_uncompressed_size(compressed_size: usize) -> usize {
+        (compressed_size << 8) - compressed_size - 2526
+    }
+
+    impl Codec for LZ4RawCodec {
+        fn decompress(
+            &mut self,
+            input_buf: &[u8],
+            output_buf: &mut Vec<u8>,
+        ) -> Result<usize> {
+            let offset = output_buf.len();
+            let required_len = max_uncompressed_size(input_buf.len());

Review Comment:
   Longer term, it would be nice to plumb the decompressed size down to the 
codec, since we already know it from the page header.
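
   A rough sketch of what that could look like, not a proposal for the exact 
API: the helper names `required_len` and `reserve_output` and the 
`uncompressed_size` parameter are hypothetical and do not exist in this PR. 
The idea is that the caller passes the size recorded in the page header (the 
Parquet `PageHeader` carries `uncompressed_page_size`) and the codec only falls 
back to the worst-case estimate when no size is available. The saturating 
arithmetic in the fallback is also an assumption added here so that very small 
inputs cannot underflow.

```rust
/// Worst-case estimate quoted in the diff, i.e. 255 * n - 2526.
/// Assumption: saturating arithmetic replaces the plain subtraction so that
/// small `compressed_size` values yield 0 instead of underflowing.
fn max_uncompressed_size(compressed_size: usize) -> usize {
    compressed_size.saturating_mul(255).saturating_sub(2526)
}

/// Hypothetical helper: number of output bytes to reserve.
/// `uncompressed_size` stands in for the value read from the page header;
/// when it is `None` we fall back to the worst-case estimate.
fn required_len(input_len: usize, uncompressed_size: Option<usize>) -> usize {
    uncompressed_size.unwrap_or_else(|| max_uncompressed_size(input_len))
}

/// Hypothetical helper: grow `output_buf` so a decompressor can write after
/// the existing contents. Returns the offset at which writing should start.
fn reserve_output(
    output_buf: &mut Vec<u8>,
    input_len: usize,
    uncompressed_size: Option<usize>,
) -> usize {
    let offset = output_buf.len();
    output_buf.resize(offset + required_len(input_len, uncompressed_size), 0);
    offset
}
```

   With a known size the buffer grows by exactly that many bytes, instead of 
roughly 255x the compressed length, which matters for large pages.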


