jp0317 commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1265621368
##########
cpp/src/parquet/file_reader.cc:
##########
@@ -66,7 +66,8 @@ static constexpr int64_t kMaxDictHeaderSize = 100;
RowGroupReader::RowGroupReader(std::unique_ptr<Contents> contents)
: contents_(std::move(contents)) {}
-std::shared_ptr<ColumnReader> RowGroupReader::Column(int i) {
+std::shared_ptr<ColumnReader> RowGroupReader::Column(
+ int i, std::optional<ReaderProperties> prop) {
Review Comment:
Thanks for the comment! I feel that would require more changes on the
`properties_` related logic. IIUC the `properties_` is passed to file reader
during its creation and is copied to each row group reader. In the context of
column chunk specific buffer size, a row group reader does not need a
`properties_` that stores `ColumnReaderProperties` for an irrelevant row
group. Besides, maintaining such a map might not be cheap when there're
numerous columns (in another PR i'm changing a map to bit vector). Another
way, which i feel is less cleaner, is to provide new apis allowing users to
update the `properties_`, such that the `properties_` contains the
`ColumnReaderProperties` only for the current row group reader being created.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]