alamb commented on code in PR #9369:
URL: https://github.com/apache/arrow-rs/pull/9369#discussion_r2795202413
##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -1395,43 +1393,31 @@ impl ParquetRecordBatchReader {
continue;
}
- let mask = mask_cursor.mask_values_for(&mask_chunk)?;
-
let read =
self.array_reader.read_records(mask_chunk.chunk_rows)?;
if read == 0 {
return Err(general_err!(
"reached end of column while expecting {} rows",
mask_chunk.chunk_rows
));
}
- if read != mask_chunk.chunk_rows {
- return Err(general_err!(
- "insufficient rows read from array reader - expected {}, got {}",
- mask_chunk.chunk_rows,
- read
- ));
- }
let array = self.array_reader.consume_batch()?;
- // The column reader exposes the projection as a struct array; convert this
- // into a record batch before applying the boolean filter mask.
let struct_array = array.as_struct_opt().ok_or_else(|| {
ArrowError::ParquetError(
"Struct array reader should return struct array".to_string(),
)
})?;
+ // Key Change: partial read → emit immediately, no mask
Review Comment:
I think after we merge this PR, the "Key Change" label in this comment will be
confusing, since the code will no longer be part of a change.