[
https://issues.apache.org/jira/browse/PARQUET-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632431#comment-17632431
]
ASF GitHub Bot commented on PARQUET-2213:
-----------------------------------------
wgtmac commented on code in PR #1010:
URL: https://github.com/apache/parquet-mr/pull/1010#discussion_r1020335511
##########
parquet-common/src/main/java/org/apache/parquet/io/InputFile.java:
##########
@@ -41,4 +41,16 @@ public interface InputFile {
*/
SeekableInputStream newStream() throws IOException;
+ /**
+ * Open a new {@link SeekableInputStream} for the underlying data file,
+ * in the range of '[offset, offset + length)'
+ *
+ * @param offset the offset in the file to read from
+ * @param length the total number of bytes to read
+ * @return a new {@link SeekableInputStream} to read the file
+ * @throws IOException if the stream cannot be opened
+ */
+ default SeekableInputStream newStream(long offset, long length) throws IOException {
Review Comment:
If we need to read multiple parts of a Parquet file (e.g. different row
groups, the page index, the footer), should we call this method once for
each individual part?
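To make the question concrete, here is a minimal sketch of what one call per
range could look like from the caller's side. It does not use parquet-mr:
`RangedInput`, `ByteArrayInput`, `readRanges`, and the byte layout are all
hypothetical stand-ins, a plain `InputStream` replaces `SeekableInputStream`,
and the actual behavior of the method in this PR may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only: none of these types come from parquet-mr.
class RangeReadSketch {

    /** Hypothetical stand-in for an input file that opens a stream over [offset, offset + length). */
    interface RangedInput {
        InputStream newStream(long offset, long length) throws IOException;
    }

    /** In-memory implementation, used here so the sketch runs without a real file. */
    static final class ByteArrayInput implements RangedInput {
        private final byte[] data;

        ByteArrayInput(byte[] data) {
            this.data = data;
        }

        @Override
        public InputStream newStream(long offset, long length) throws IOException {
            // Written to avoid overflow in offset + length.
            if (offset < 0 || length < 0 || offset > data.length - length) {
                throw new IOException("range out of bounds");
            }
            return new ByteArrayInputStream(data, (int) offset, (int) length);
        }
    }

    /** Opens one stream per {offset, length} pair and reads that range fully. */
    static List<byte[]> readRanges(RangedInput file, long[][] ranges) throws IOException {
        List<byte[]> parts = new ArrayList<>();
        for (long[] range : ranges) {
            try (InputStream in = file.newStream(range[0], range[1])) {
                parts.add(in.readAllBytes());
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Made-up layout: 4-byte header, two 9-byte "row groups", 4-byte footer.
        byte[] bytes = "HEADrowgroup1rowgroup2FOOT".getBytes(StandardCharsets.US_ASCII);
        long[][] ranges = {{4, 9}, {22, 4}};
        for (byte[] part : readRanges(new ByteArrayInput(bytes), ranges)) {
            System.out.println(new String(part, StandardCharsets.US_ASCII));
        }
    }
}
```

With this shape, reading N disjoint parts costs N stream opens, which is the
trade-off the question is pointing at.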
> Add an alternative InputFile.newStream that allow an input range
> ----------------------------------------------------------------
>
> Key: PARQUET-2213
> URL: https://issues.apache.org/jira/browse/PARQUET-2213
> Project: Parquet
> Issue Type: Improvement
> Reporter: Chao Sun
> Priority: Minor
>