nathanb9 commented on code in PR #9697:
URL: https://github.com/apache/arrow-rs/pull/9697#discussion_r3075611132


##########
parquet/src/arrow/push_decoder/mod.rs:
##########
@@ -366,13 +369,31 @@ impl ParquetPushDecoder {
         self.state.buffered_bytes()
     }
 
-    /// Clear any staged byte ranges currently buffered for future decode work.
+    /// Release all staged byte ranges currently buffered for future decode
+    /// work.
     ///
-    /// This clears byte ranges still owned by the decoder's internal
+    /// This releases byte ranges still owned by the decoder's internal
     /// `PushBuffers`. It does not affect any data that has already been handed
     /// off to an active [`ParquetRecordBatchReader`].
+    pub fn release_all(&mut self) {
+        self.state.release_all();
+    }
+
+    /// Use [`Self::release_all`] instead.
+    #[deprecated(since = "58.1.0", note = "Use `release_all` instead")]
     pub fn clear_all_ranges(&mut self) {

Review Comment:
   my first commit here deprecated so soon 😆 😢 lol



##########
parquet/src/file/metadata/push_decoder.rs:
##########
@@ -361,6 +360,7 @@ impl ParquetMetaDataPushDecoder {
     /// Try to decode the metadata from the pushed data, returning the
     /// decoded metadata or an error if not enough data is available.
     pub fn try_decode(&mut self) -> Result<DecodeResult<ParquetMetaData>> {
+        self.buffers.ensure_sorted();

Review Comment:
   Don't need this because you later call the push buffers's `get_bytes` which 
also ensures of the order. But since you've made it short circuit anyway maybe 
its worth leaving for easier readability



##########
parquet/src/util/push_buffers.rs:
##########
@@ -82,83 +100,166 @@ impl PushBuffers {
             file_len,
             ranges: Vec::new(),
             buffers: Vec::new(),
+            #[cfg(feature = "arrow")]
+            watermark: 0,
+            sorted: true,
         }
     }
 
-    /// Push all the ranges and buffers
+    /// Restore the sort invariant on `ranges`/`buffers`.
+    ///
+    /// Because IO completions are expected to generally arrive in-order,
+    /// `push_range` appends without sorting. We instead delay sorting until
+    /// conumption to amortize its cost, if necessary.

Review Comment:
   ```suggestion
       /// consumption to amortize its cost, if necessary.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to