jp0317 commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1265621368


##########
cpp/src/parquet/file_reader.cc:
##########
@@ -66,7 +66,8 @@ static constexpr int64_t kMaxDictHeaderSize = 100;
 RowGroupReader::RowGroupReader(std::unique_ptr<Contents> contents)
     : contents_(std::move(contents)) {}
 
-std::shared_ptr<ColumnReader> RowGroupReader::Column(int i) {
+std::shared_ptr<ColumnReader> RowGroupReader::Column(
+    int i, std::optional<ReaderProperties> prop) {

Review Comment:
   Thanks for the comment! I feel that would require more changes on the 
`properties_` related logic.  IIUC the `properties_` is passed to file reader 
during its creation and is copied to each row group reader.  In the context of 
column chunk specific buffer size,  a row group reader does not need a 
`properties_` that stores  `ColumnReaderProperties`  for an irrelevant row 
group. Besides, maintaining such a map might not be cheap when there're 
numerous columns (in another PR i'm changing a map to bit vector).  Another 
way, which i feel is less cleaner, is to provide new apis allowing users to 
update the `properties_`, such that the `properties_` contains the 
`ColumnReaderProperties`  only for the current row group reader being created. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to