friendlymatthew commented on code in PR #20854:
URL: https://github.com/apache/datafusion/pull/20854#discussion_r2914274361
##########
datafusion/datasource-parquet/src/row_filter.rs:
##########
@@ -251,15 +251,26 @@ impl FilterCandidateBuilder {
return Ok(None);
};
+ let schema_descr = metadata.file_metadata().schema_descr();
let root_indices: Vec<_> =
required_columns.required_columns.into_iter().collect();
- let leaf_indices = leaf_indices_for_roots(
- &root_indices,
- metadata.file_metadata().schema_descr(),
+ let mut leaf_indices = leaf_indices_for_roots(&root_indices, schema_descr);
+
+ let struct_leaf_indices = resolve_struct_field_leaves(
+ &required_columns.struct_field_accesses,
+ &self.file_schema,
+ schema_descr,
);
+ leaf_indices.extend_from_slice(&struct_leaf_indices);
+ leaf_indices.sort_unstable();
Review Comment:
To my knowledge, no.
`leaf_indices` is only used to build a `ProjectionMask::leaves`
https://github.com/apache/datafusion/blob/5af7361a987f3441aa9718c35ef5381f480a9c94/datafusion/datasource-parquet/src/row_filter.rs#L144-L149
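[`ProjectionMask` does not care about
order](https://arrow.apache.org/rust/src/parquet/arrow/mod.rs.html#291-302). It
allocates a boolean mask as `vec![false; num_columns]` and then sets
`mask[leaf_idx] = true` for each leaf index, so the resulting mask is the same
regardless of the order (or duplication) of the indices.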
The other call sites that use `leaf_indices` don't depend on order either.
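
A minimal sketch of why order is irrelevant here, assuming the mask is built the way described above. `leaves_mask` is a hypothetical stand-in written for illustration, not the parquet crate's actual code:

```rust
/// Hypothetical stand-in for the mask construction described in this comment.
/// It is NOT the parquet crate's implementation, only the same idea:
/// start with all-false and flip the slot for every requested leaf index.
/// Panics if an index is >= `num_columns` (plain slice indexing).
fn leaves_mask(num_columns: usize, leaf_indices: &[usize]) -> Vec<bool> {
    let mut mask = vec![false; num_columns];
    for &leaf_idx in leaf_indices {
        // Setting a slot to true is idempotent and independent of the
        // other slots, so neither ordering nor duplicates change the result.
        mask[leaf_idx] = true;
    }
    mask
}
```

Under that assumption, `leaves_mask(5, &[3, 0, 3, 1])` and `leaves_mask(5, &[0, 1, 3])` produce the same mask.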
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]