alamb commented on code in PR #2994:
URL: https://github.com/apache/arrow-rs/pull/2994#discussion_r1010593727


##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -117,6 +117,39 @@ impl RowSelection {
         Self { selectors }
     }
 
+    /// Creates a [`RowSelection`] from a slice of uncombined `RowSelector`:
+    /// Like [skip(5),skip(5),read(10)].
+    /// After combine will return [skip(10),read(10)]
+    /// # Note
+    /// If directly use uncombined `RowSelection` with offset_index in parquet
+    /// will panic.
+    fn from_selectors_and_combine(selectors: &[RowSelector]) -> Self {
+        if selectors.len() < 2 {
+            return Self {
+                selectors: Vec::from(selectors),
+            };
+        }
+        let first = selectors.first().unwrap();
+        let mut sum_rows = first.row_count;
+        let mut selected = first.skip;

Review Comment:
   ```suggestion
           let mut skip = first.skip;
   ```
   
   Maybe this code would be easier to read if the variable had the same name as
   the field it is initialized from 🤔
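   
   For illustration, here is a rough sketch of how the loop might read with that rename. The hunk above only shows the first few lines of the function, so everything after the three `let` bindings is an assumption, and `RowSelector` and `combine` here are stand-ins rather than the crate's actual definitions. Zero-length selectors get no special handling in this sketch.

   ```rust
   // Stand-in for the crate's `RowSelector`, reduced to the two fields the diff uses.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   struct RowSelector {
       row_count: usize,
       skip: bool,
   }

   // Hypothetical free-function version of the combine loop; the body past the
   // first three `let` lines is assumed, not taken from the PR.
   fn combine(selectors: &[RowSelector]) -> Vec<RowSelector> {
       if selectors.len() < 2 {
           return selectors.to_vec();
       }
       let first = selectors.first().unwrap();
       let mut sum_rows = first.row_count;
       let mut skip = first.skip; // same name as the field it tracks
       let mut combined = Vec::new();

       for s in &selectors[1..] {
           if s.skip == skip {
               // Same kind as the current run: extend it.
               sum_rows += s.row_count;
           } else {
               // Kind changed: flush the finished run and start a new one.
               combined.push(RowSelector { row_count: sum_rows, skip });
               sum_rows = s.row_count;
               skip = s.skip;
           }
       }
       // Flush the final run.
       combined.push(RowSelector { row_count: sum_rows, skip });
       combined
   }
   ```

   A small side benefit of the rename: the struct can be built with field-init shorthand (`RowSelector { row_count: sum_rows, skip }`).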



##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -117,6 +117,39 @@ impl RowSelection {
         Self { selectors }
     }
 
+    /// Creates a [`RowSelection`] from a slice of uncombined `RowSelector`:
+    /// Like [skip(5),skip(5),read(10)].
+    /// After combine will return [skip(10),read(10)]
+    /// # Note
+    /// If directly use uncombined `RowSelection` with offset_index in parquet
+    /// will panic.

Review Comment:
   ```suggestion
       /// # Note
       /// [`RowSelection`] must be combined prior to use within offset_index,
       /// or else the code will panic.
   ```



##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -117,6 +117,39 @@ impl RowSelection {
         Self { selectors }
     }
 
+    /// Creates a [`RowSelection`] from a slice of uncombined `RowSelector`:

Review Comment:
   👍  thank you for the comments



##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -470,6 +512,56 @@ mod tests {
         );
     }
 
+    #[test]
+    fn test_combine() {
+        let a = vec![
+            RowSelector::skip(3),
+            RowSelector::skip(3),
+            RowSelector::select(10),
+            RowSelector::skip(4),
+        ];
+
+        let b = vec![
+            RowSelector::skip(3),
+            RowSelector::skip(3),
+            RowSelector::select(10),
+            RowSelector::skip(4),
+            RowSelector::skip(0),
+        ];
+
+        let c = vec![
+            RowSelector::skip(2),
+            RowSelector::skip(4),
+            RowSelector::select(3),
+            RowSelector::select(3),
+            RowSelector::select(4),
+            RowSelector::skip(3),
+            RowSelector::skip(1),
+            RowSelector::skip(0),
+        ];
+
+        let expected = RowSelection::from(vec![
+            RowSelector::skip(6),
+            RowSelector::select(10),
+            RowSelector::skip(4),
+        ]);
+
+        assert_eq!(RowSelection::from_selectors_and_combine(&a), expected);
+        assert_eq!(RowSelection::from_selectors_and_combine(&b), expected);
+        assert_eq!(RowSelection::from_selectors_and_combine(&c), expected);
+    }
+
+    #[test]
+    fn test_from_one_and_empty() {
+        let a = vec![RowSelector::select(10)];
+        let selection1 = RowSelection::from(a.clone());
+        assert_eq!(selection1.selectors, a);

Review Comment:
   I also recommend tests with exactly two selectors, to cover all the edge cases:
   
   ```rust
   vec![
     RowSelector::select(10), 
     RowSelector::select(5)
   ];
   ```
   
   ```rust
   vec![
     RowSelector::select(10), 
     RowSelector::skip(5)
   ];
   ```
   
   ```rust
   vec![
     RowSelector::skip(10), 
     RowSelector::select(5)
   ];
   ```
   
   ```rust
   vec![
     RowSelector::skip(10), 
     RowSelector::skip(5)
   ];
   ```
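   
   Put together, those four cases might look something like the sketch below. It uses stand-in types so it compiles on its own; `combine` is a hypothetical substitute for `from_selectors_and_combine`, and the expected values assume that adjacent selectors of the same kind are merged while a select/skip pair is left untouched.

   ```rust
   // Stand-in for the crate's `RowSelector`; fields mirror those used in the diff.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   struct RowSelector {
       row_count: usize,
       skip: bool,
   }

   impl RowSelector {
       fn select(row_count: usize) -> Self {
           Self { row_count, skip: false }
       }

       fn skip(row_count: usize) -> Self {
           Self { row_count, skip: true }
       }
   }

   // Hypothetical stand-in for `from_selectors_and_combine`: merges adjacent
   // selectors of the same kind and drops zero-length ones.
   fn combine(selectors: &[RowSelector]) -> Vec<RowSelector> {
       let mut combined: Vec<RowSelector> = Vec::with_capacity(selectors.len());
       for s in selectors.iter().filter(|s| s.row_count != 0) {
           if let Some(last) = combined.last_mut() {
               if last.skip == s.skip {
                   last.row_count += s.row_count;
                   continue;
               }
           }
           combined.push(*s);
       }
       combined
   }

   #[test]
   fn test_combine_two_selectors() {
       // select + select merges into one select
       let a = [RowSelector::select(10), RowSelector::select(5)];
       assert_eq!(combine(&a), vec![RowSelector::select(15)]);

       // select + skip stays as-is
       let b = [RowSelector::select(10), RowSelector::skip(5)];
       assert_eq!(combine(&b), b.to_vec());

       // skip + select stays as-is
       let c = [RowSelector::skip(10), RowSelector::select(5)];
       assert_eq!(combine(&c), c.to_vec());

       // skip + skip merges into one skip
       let d = [RowSelector::skip(10), RowSelector::skip(5)];
       assert_eq!(combine(&d), vec![RowSelector::skip(15)]);
   }
   ```

   In the PR itself the assertions would go through `RowSelection::from_selectors_and_combine` and compare against a `RowSelection`, as `test_combine` already does.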



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
