alamb commented on code in PR #1552:
URL: https://github.com/apache/arrow-rs/pull/1552#discussion_r848632777
##########
arrow/src/record_batch.rs:
##########
@@ -41,6 +41,7 @@ use crate::error::{ArrowError, Result};
pub struct RecordBatch {
schema: SchemaRef,
columns: Vec<Arc<dyn Array>>,
+ row_count: usize,
Review Comment:
maybe leave a comment here about why `row_count` is added?
##########
arrow/src/record_batch.rs:
##########
@@ -128,7 +125,15 @@ impl RecordBatch {
}
// check that all columns have the same row count
- let row_count = columns[0].data().len();
+ let row_count = options
+ .row_count
+ .or_else(|| columns.first().map(|col| col.len()))
+ .ok_or_else(|| {
+ ArrowError::InvalidArgumentError(
+ "must either specify a row count or at least one
column".to_string(),
+ )
+ })?;
+
if columns.iter().any(|c| c.len() != row_count) {
return Err(ArrowError::InvalidArgumentError(
"all columns in a record batch must have the same
length".to_string(),
Review Comment:
This error will now also happen when the specified row count doesn't match
the array counts (so the wording might be good to update?)
##########
arrow/src/record_batch.rs:
##########
@@ -243,7 +248,7 @@ impl RecordBatch {
/// # }
/// ```
pub fn num_rows(&self) -> usize {
- self.columns[0].data().len()
+ self.row_count
Review Comment:
Maybe we could avoid adding `row_count` with something like
```rust
self
.columns
.get(0)
.map(|col| col.data.len())
.unwrap_or(0)
```
?
##########
arrow/src/record_batch.rs:
##########
@@ -402,15 +402,20 @@ impl RecordBatch {
/// Options that control the behaviour used when creating a [`RecordBatch`].
#[derive(Debug)]
+#[non_exhaustive]
Review Comment:
Or maybe it is time to make these fields non `pub` and add
`with_match_field_name(&mut self, val: bool)` type accessors?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]