jp0317 commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1264734613
##########
cpp/src/parquet/file_reader.h:
##########
@@ -44,7 +44,8 @@ class PARQUET_EXPORT RowGroupReader {
// An implementation of the Contents class is defined in the .cc file
struct Contents {
virtual ~Contents() {}
- virtual std::unique_ptr<PageReader> GetColumnPageReader(int i) = 0;
+ virtual std::unique_ptr<PageReader> GetColumnPageReader(
Review Comment:
Thanks for the review. Regarding `GetColumnChunkRange`, is there any concern
with exposing it? IIUC users can currently rely only on `total_compressed_size`,
which reveals no offset information and may not reflect the actual chunk size.
For the `ColumnReaderProperties`, given that the reader APIs are all index
based, maybe we can just use the column index (as mapleFU suggested) without
involving column paths, especially a map keyed on path strings? Initially I was
trying to avoid keeping such a map in `ReaderProperties`, and more importantly,
I feel it makes sense to implement this customized buffer size as "column chunk
specific": different column chunks from the same column can have different
buffer sizes.
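
To make the "column chunk specific" idea concrete, here is a rough sketch of an index-keyed lookup. `ChunkBufferSizes` and its methods are hypothetical names made up for illustration; they are not part of the existing Parquet C++ API or of this PR. The only point is that a (row group index, column index) pair is enough to pick a buffer size, with no path strings involved.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical helper for illustration only; not an existing Parquet C++ class.
// Buffer sizes are keyed by (row group index, column index), so two chunks of
// the same column can use different sizes.
class ChunkBufferSizes {
 public:
  explicit ChunkBufferSizes(int64_t default_size) : default_size_(default_size) {
    assert(default_size > 0);
  }

  // Override the buffer size for a single column chunk.
  void Set(int row_group, int column, int64_t buffer_size) {
    assert(buffer_size > 0);
    overrides_[{row_group, column}] = buffer_size;
  }

  // Return the override for this chunk if one was set, else the default.
  int64_t Get(int row_group, int column) const {
    auto it = overrides_.find({row_group, column});
    return it == overrides_.end() ? default_size_ : it->second;
  }

 private:
  int64_t default_size_;
  std::map<std::pair<int, int>, int64_t> overrides_;
};
```

With something like this, a caller could set a larger buffer for column 2 of row group 0 only, and every other chunk (including column 2 in other row groups) would fall back to the default.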