alamb commented on code in PR #1552:
URL: https://github.com/apache/arrow-rs/pull/1552#discussion_r848632777


##########
arrow/src/record_batch.rs:
##########
@@ -41,6 +41,7 @@ use crate::error::{ArrowError, Result};
 pub struct RecordBatch {
     schema: SchemaRef,
     columns: Vec<Arc<dyn Array>>,
+    row_count: usize,

Review Comment:
   maybe leave a comment here about why `row_count` is added?



##########
arrow/src/record_batch.rs:
##########
@@ -128,7 +125,15 @@ impl RecordBatch {
         }
 
         // check that all columns have the same row count
-        let row_count = columns[0].data().len();
+        let row_count = options
+            .row_count
+            .or_else(|| columns.first().map(|col| col.len()))
+            .ok_or_else(|| {
+                ArrowError::InvalidArgumentError(
+                    "must either specify a row count or at least one 
column".to_string(),
+                )
+            })?;
+
         if columns.iter().any(|c| c.len() != row_count) {
             return Err(ArrowError::InvalidArgumentError(
                 "all columns in a record batch must have the same 
length".to_string(),

Review Comment:
   This error will now also happen when the specified row count doesn't match 
the array counts (so the wording might be good to update?)



##########
arrow/src/record_batch.rs:
##########
@@ -243,7 +248,7 @@ impl RecordBatch {
     /// # }
     /// ```
     pub fn num_rows(&self) -> usize {
-        self.columns[0].data().len()
+        self.row_count

Review Comment:
   Maybe we could avoid adding `row_count` with something like
   
   ```rust
   self
     .columns
     .get(0)
     .map(|col| col.data.len())
     .unwrap_or(0)
   ```
   ?



##########
arrow/src/record_batch.rs:
##########
@@ -402,15 +402,20 @@ impl RecordBatch {
 
 /// Options that control the behaviour used when creating a [`RecordBatch`].
 #[derive(Debug)]
+#[non_exhaustive]

Review Comment:
   Or maybe it is time to make these fields non `pub` and add 
`with_match_field_name(&mut self, val: bool)` type accessors?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to