westonpace commented on code in PR #13830:
URL: https://github.com/apache/arrow/pull/13830#discussion_r955398578


##########
cpp/src/arrow/dataset/file_base.h:
##########
@@ -196,6 +200,12 @@ class ARROW_DS_EXPORT FileFragment : public Fragment,
 
   const FileSource& source() const { return source_; }
   const std::shared_ptr<FileFormat>& format() const { return format_; }
+  const int64_t start_byte() const { return start_byte_; }
+  const int64_t end_byte() const { return end_byte_; }
+  void set_bounds(int64_t start, int64_t end) {

Review Comment:
   I don't think this is specific to CSV either.  ARROW-17159 at least seems to 
suggest that it is desired for Parquet, but @zhztheplayer might be in a better 
position to justify why byte bounds are wanted there rather than row group 
indices.  I think the goal, though, is to have a format-agnostic ability to 
slice fragments.
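
   As a rough illustration of what format-agnostic slicing could look like 
(this is only a sketch, not the code in this PR): `ByteRange`, `Unit` and 
`SelectUnits` are hypothetical names, and the rule that a unit belongs to the 
slice containing its first byte is an assumption, not something the PR or 
ARROW-17159 specifies.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical half-open byte range [start, end) attached to a fragment.
struct ByteRange {
  int64_t start;
  int64_t end;
};

// Hypothetical independently readable unit of a file (for example a Parquet
// row group or a CSV block), identified only by where it begins.
struct Unit {
  int64_t offset;  // byte offset of the unit's first byte
};

// Returns the indices of the units whose first byte lies inside `range`.
// Assumption: each unit is owned by exactly one slice, the one that contains
// its starting offset, so adjacent slices never read the same unit twice.
std::vector<int> SelectUnits(const std::vector<Unit>& units,
                             const ByteRange& range) {
  std::vector<int> selected;
  for (int i = 0; i < static_cast<int>(units.size()); ++i) {
    if (units[i].offset >= range.start && units[i].offset < range.end) {
      selected.push_back(i);
    }
  }
  return selected;
}
```

   Under that assumption the same byte bounds could be applied to any format 
that can report where its units start, without the caller needing to know 
about row group indices.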



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
